PhD Research

My PhD research focused on automatically generating questions from text for educational purposes. The first question generator I built (see QG System 1 below) started as a class project for two classes I was taking at the time. I created three question generation systems prior to MARGE, the system on which my PhD dissertation is focused.


MARGE

MARGE (Module to Automatically Read, Generate, Evaluate) is the automatic question generation system that was the focus of my PhD dissertation. MARGE generates questions from both individual sentences and the passage as a whole, and is the first QG system to successfully generate meaningful questions from textual units larger than a sentence. The passage-level components will be presented in April at the CICLing conference in Budapest, Hungary, and a system description paper is forthcoming in IEEE Transactions on Learning Technologies.

QG System 3: Infusing NLU into NLG

A fair criticism of most work in automatic question generation is that these systems merely manipulate the syntax of the source text according to grammar rules, with no concern for sentence meaning. In this work, I take steps toward infusing Natural Language Understanding (NLU) into the Natural Language Generation (NLG) process by analyzing the central semantic content of each independent clause in a sentence. Sentence structure scaffolds semantics in surprisingly predictable ways, particularly in expository text such as textbooks. Identifying the semantic aim of a sentence allows the question generator to match a template that homes in on that point, rather than generating every trivial question that is syntactically possible. In June 2016, I presented this work at the Intelligent Tutoring Systems conference in Zagreb, Croatia. The 2016 ITS conference had a 15% acceptance rate for long papers, and I was honored that my paper was nominated for the best paper award.
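The aim-to-template idea can be sketched roughly as follows. This is an illustrative toy, not MARGE's actual implementation: the aim labels, the templates, and the keyword-based classifier are all hypothetical stand-ins for the real clause-level semantic analysis.

```python
# Hypothetical sketch: map a clause's central semantic aim to one
# question template, instead of firing every syntactic rule.
TEMPLATES = {
    "definition":   "What is {topic}?",
    "cause_effect": "What causes {topic}?",
    "process":      "How does {topic} occur?",
}

def classify_aim(clause: str) -> str:
    """Toy aim classifier; a real system would inspect the clause's
    syntax and semantics rather than surface keywords."""
    lowered = clause.lower()
    if " is " in lowered or " are " in lowered:
        return "definition"
    if "because" in lowered or "causes" in lowered:
        return "cause_effect"
    return "process"

def generate_question(clause: str, topic: str) -> str:
    """Generate one targeted question from the clause's semantic aim."""
    return TEMPLATES[classify_aim(clause)].format(topic=topic)
```

The point of the design is selectivity: one clause yields one question aimed at its central content, rather than a question per syntactically movable constituent.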
Next, I added an automated evaluation feature based on TextRank to assess the meaningfulness and importance of generated questions. This work, along with additional evaluations, was presented in September 2016 at the International Natural Language Generation conference in Edinburgh.
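A minimal TextRank-style scorer can be written in pure Python. This is a generic sketch of the underlying algorithm (word-overlap similarity plus PageRank power iteration), assuming nothing about how the dissertation's evaluation feature tuned it:

```python
def overlap(a: str, b: str) -> float:
    """Word-overlap similarity between two text units."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / (len(wa) + len(wb))

def textrank(units, damping=0.85, iterations=50):
    """Score text units by graph centrality via PageRank power iteration.
    Higher scores suggest more central (hence more important) content."""
    n = len(units)
    w = [[overlap(s, t) for t in units] for s in units]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if i == j or w[j][i] == 0.0:
                    continue
                out_total = sum(w[j][k] for k in range(n) if k != j)
                if out_total:
                    rank += w[j][i] / out_total * scores[j]
            new.append((1 - damping) / n + damping * rank)
        scores = new
    return scores
```

Ranking a question (or the sentence it came from) by its centrality in this graph gives an unsupervised proxy for importance: questions built on well-connected content score higher than those built on peripheral sentences.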

QG System 2: Using Multiple Parsers

My prior work on QG from semantic role labels (see below) demonstrated a 44% reduction in error rate compared to two state-of-the-art question generators. Seeking further improvements, I explored combining the SRL information with phrases extracted via a dependency parse to create a "Question Generation" tree structure from which source content can easily be extracted for generation. This approach demonstrated a 17% reduction in error rate compared to my prior work, and generated 21% more semantically oriented rather than fact-based questions. I presented this work at the 2015 Artificial Intelligence in Education conference in Madrid.
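The general shape of such a tree can be sketched as a node that pairs a dependency-parse phrase with an SRL role. The field names and the lookup helper here are hypothetical illustrations, not the dissertation's exact data structure:

```python
from dataclasses import dataclass, field

@dataclass
class QGNode:
    """One node of an illustrative Question Generation tree: a phrase
    from the dependency parse, annotated with its SRL role (if any)."""
    phrase: str
    role: str = ""                 # e.g. "ARG0", "V", "ARGM-TMP"
    children: list = field(default_factory=list)

    def find_role(self, role: str):
        """Return the first descendant carrying the given SRL role."""
        if self.role == role:
            return self
        for child in self.children:
            hit = child.find_role(role)
            if hit is not None:
                return hit
        return None

# Example: "Plants produce oxygen during photosynthesis."
root = QGNode("produce", role="V", children=[
    QGNode("Plants", role="ARG0"),
    QGNode("oxygen", role="ARG1"),
    QGNode("during photosynthesis", role="ARGM-TMP"),
])
```

With both annotation layers on one structure, a generator can ask for "the ARG0 phrase" directly, instead of reconciling two separate parser outputs at generation time.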

QG System 1: Using Semantic Role Labels

I extracted text from middle- and high-school science textbooks, parsed it with SENNA, and looked for semantic role label patterns from which I could generate questions. A key component of the project was filtering to prevent low-quality questions from being generated. I coded this system in Python, and presented it at ACL 2014 in Baltimore.
