Ask and distract
Data-driven methods for the automatic generation of multiple-choice reading comprehension questions from Swedish texts
Time: Tue 2023-10-17 14.00
Subject area: Speech and Music Communication
Doctoral student: Dmytro Kalpakchi, Tal, musik och hörsel, TMH
Opponent: Professor Joakim Nivre, Uppsala University, Uppsala, Sweden
Supervisor: Associate Professor Johan Boye, Tal, musik och hörsel, TMH
Multiple-choice questions (MCQs) are widely used for summative assessment in many different subjects. Tasks in this format are particularly appealing because they can be graded swiftly and automatically. However, the process of creating MCQs is far from swift or automatic, and it requires considerable expertise both in the specific subject and in test construction.
This thesis explores methods for automatic MCQ generation, aimed at assessing the reading comprehension abilities of second-language learners of Swedish. We lay the foundations for MCQ generation research for Swedish by collecting two datasets of reading comprehension MCQs, and by designing and developing methods for generating whole MCQs or their parts. Another important contribution is a set of methods, designed and applied in practice, for the automatic and human evaluation of the generated MCQs.
The best method available as of June 2023 for generating MCQs for assessing reading comprehension in Swedish is ChatGPT, although only around 60% of its generated MCQs were judged acceptable. However, ChatGPT is neither open-source nor free. The best open-source and free-to-use method is the fine-tuned version of SweCTRL-Mini, a foundational model developed as a part of this thesis. Nevertheless, all explored methods are still far from being useful in practice, but the reported results provide a good starting point for future research.