HCII PhD Thesis Defense: Kelly Rivers

Kelly Rivers

Thursday, July 13, 2017 - 12:00pm
Gates Hillman Center 6115

"Automated Data-Driven Hint Generation for Learning Programming"

Thesis Committee
Ken Koedinger (Chair, HCII/Psychology)
Brad Myers (HCII/ISR)
Vincent Aleven (HCII)
Sharon Carver (Psychology)
Tiffany Barnes (CS, North Carolina State University)
Feedback is an essential component of the learning process, but in fields like computer science, where class sizes are increasing rapidly, it can be difficult to provide feedback to students at scale. Intelligent tutoring systems can provide personalized feedback to students automatically, but they can take large amounts of time and expert knowledge to build, especially when determining how to give students hints. Data-driven approaches can provide personalized next-step hints automatically and at scale by mining previous students’ solutions.
I have created ITAP, the Intelligent Teaching Assistant for Programming, which automatically generates next-step hints for students in basic Python programming assignments. ITAP is composed of three stages: canonicalization, where a student's code is transformed to an abstracted representation; path construction, where the closest correct state is identified and a series of edits leading to that goal state is generated; and reification, where the edits are transformed back into the student's original context. With these techniques, ITAP can generate next-step hints for 100% of student submissions, and can even chain these hints together to generate a worked example. Early analysis showed that hints could be used in practice problems in a real classroom environment, but also demonstrated that students' relationships with hints and help-seeking were complex and required deeper investigation.
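The three stages described above can be illustrated with a small toy pipeline. This is only a sketch under strong simplifying assumptions and is not ITAP's actual implementation: canonicalization here merely renames variables to v0, v1, ..., path construction uses a line-based diff from Python's difflib in place of whatever edit representation ITAP uses, and reification simply maps canonical names back to the student's names. All function names (canonicalize, reify, next_step_hint) are hypothetical.

```python
import ast
import builtins
import difflib
import re


class _Renamer(ast.NodeTransformer):
    """Toy canonicalizer: rename variables to v0, v1, ... by first appearance."""

    def __init__(self):
        self.mapping = {}   # original name -> canonical name
        self.keep = set()   # function names are left unchanged

    def _canon(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        self.keep.add(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._canon(node.arg)
        return node

    def visit_Name(self, node):
        # Leave builtins (len, sum, ...) and function names untouched.
        if hasattr(builtins, node.id) or node.id in self.keep:
            return node
        node.id = self._canon(node.id)
        return node


def canonicalize(source):
    """Return (canonical source text, name mapping). Needs Python 3.9+ for ast.unparse."""
    renamer = _Renamer()
    tree = renamer.visit(ast.parse(source))
    return ast.unparse(tree), renamer.mapping


def reify(line, mapping):
    """Map canonical names in a line back to the student's own names."""
    inverse = {canon: orig for orig, canon in mapping.items()}
    return re.sub(r"\bv\d+\b", lambda m: inverse.get(m.group(0), m.group(0)), line)


def next_step_hint(student_source, correct_sources):
    """Return the first edit toward the closest correct solution, or None."""
    student, mapping = canonicalize(student_source)
    goals = [canonicalize(src)[0] for src in correct_sources]
    # Path construction (simplified): pick the most similar correct state.
    goal = max(goals, key=lambda g: difflib.SequenceMatcher(None, student, g).ratio())
    s_lines, g_lines = student.splitlines(), goal.splitlines()
    matcher = difflib.SequenceMatcher(None, s_lines, g_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            return {
                "action": tag,
                "old": [reify(line, mapping) for line in s_lines[i1:i2]],
                "new": [reify(line, mapping) for line in g_lines[j1:j2]],
            }
    return None


STUDENT = "def add_three(a, b, c):\n    total = a + b\n    return total\n"
CORRECT = "def add_three(x, y, z):\n    result = x + y + z\n    return result\n"
hint = next_step_hint(STUDENT, [CORRECT])
# hint == {'action': 'replace', 'old': ['    total = a + b'],
#          'new': ['    total = a + b + c']}
```

Although the student and the reference solution use different variable names, both canonicalize to the same skeleton, so the hint can be phrased in the student's own names. Repeatedly applying each hint and asking for the next one would chain the steps into something like a worked example.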
In my thesis work, I surveyed and interviewed students about their experience with help-seeking and using feedback, and found that students wanted more detail in hints than was initially provided. To determine how hints should be structured, I ran a usability study with programmers at varying levels of knowledge, where I found that more novice students needed much higher levels of content and detail in hints than is traditionally given. I also found that examples were commonly used in the learning process, and could serve an integral role in providing feedback. I then ran a randomized controlled trial to determine the effect of next-step hints on learning and time-on-task in a practice session, and found that having hints available resulted in students spending 13.7% less time during practice while achieving the same learning results as the control group. Finally, I used the data collected during these experiments to measure ITAP’s performance over time, and found that generated hints came closer to optimal as data was added to the system.
My dissertation has contributed to the fields of computer science education, learning science, human-computer interaction, and data-driven tutoring. For computer science education, I have created ITAP, which can serve as a practice resource for future programming students during the learning process. In the learning sciences, I have replicated the expertise effect by finding that more expert programmers want less detail in hints than novice programmers; this finding is especially important as it implies that programming teachers may provide novices with less assistance than they need. I have contributed to the literature on human-computer interaction by identifying multiple possible representations of hint messages, and by analyzing how users react to and learn from these different formats during program debugging. Finally, I have contributed to the new field of data-driven tutoring by establishing that it is possible to always provide students with next-step hints, even without a starting dataset beyond the instructor solution, and by demonstrating that those hints can be improved automatically over time.
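The last contribution, hints that are always available and that improve as data accumulates, can be illustrated with a toy solution pool. This is a hedged sketch rather than the method used in the thesis: the SolutionPool class, the similarity function, and the line-based difflib similarity measure are all assumptions made for illustration. The pool is seeded with the instructor solution alone, so a goal state always exists, and each new correct submission can only bring the closest goal nearer to a given student attempt.

```python
import difflib


def similarity(a, b):
    """Line-based similarity in [0, 1]; a stand-in for a real code distance."""
    return difflib.SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


class SolutionPool:
    """Correct solutions to hint toward, seeded with the instructor solution."""

    def __init__(self, instructor_solution):
        self.solutions = [instructor_solution]  # never empty

    def add_if_correct(self, submission, passes_tests):
        # Only submissions that pass the assignment's tests become goals.
        if passes_tests and submission not in self.solutions:
            self.solutions.append(submission)

    def closest(self, attempt):
        """Return the stored correct solution most similar to the attempt."""
        return max(self.solutions, key=lambda s: similarity(attempt, s))


INSTRUCTOR = (
    "def total(xs):\n    t = 0\n    for x in xs:\n        t += x\n    return t\n"
)
PEER = "def total(xs):\n    return sum(xs)\n"
ATTEMPT = "def total(xs):\n    return sum(x)\n"  # incorrect: wrong variable

pool = SolutionPool(INSTRUCTOR)
before = pool.closest(ATTEMPT)   # only the instructor solution is available
pool.add_if_correct(PEER, passes_tests=True)
after = pool.closest(ATTEMPT)    # the peer's solution is now the nearer goal
```

In this example the attempt shares one of five lines with the instructor solution but one of two lines with the peer's solution, so once the peer's correct submission is added, the hint target becomes a one-line fix instead of a rewrite into a loop.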
Queenie Kravitz