EducAItion: Can GPT help revive an underutilised art by effectively marking and providing feedback on oral examinations in STEM?

Finally posting my MEng thesis…

ABSTRACT

With Large Language Models (LLMs) such as ChatGPT becoming widely available – and popular amongst university students – current approaches to instruction are being questioned and assessments redesigned. Driving this are concerns about balancing authenticity and rigour while still teaching students to use these tools, especially in STEM courses. Reviving the oral examination can address those concerns, and this paper explores the use of the gpt-4-turbo-preview model (hereafter GPT) via API, as well as OpenAI's web-based ChatGPT, as an assessor for oral examinations in STEM. The research is structured around four trials: evaluating GPT's accuracy and consistency when marking exam transcripts, comparing its variance to that of human examiners, gauging student experience with the oral exam format, and testing its robustness against bad actors and transcript modifications. Findings indicate that GPT can mark transcripts effectively, with less variance than is seen between different human examiners, and can generate personalised feedback which participants find useful. While GPT shows promise, it can be influenced by the professor's language in transcripts: modifications emulating a "cruel professor" significantly affect GPT's marking (p = 0.003). The quality of feedback also varies; the most constructive feedback comes from a transcript pre-processed to remove over-enthusiastic phrases from the professor such as "perfect" or "very very good". Non-verbal communication and native-language considerations aside, this research indicates GPT's potential as a consistent and efficient assessor for oral examinations in STEM.


Read the full paper here.
