
Paige Rodeghero - Learning Programmer Behavior to Improve Automatic Documentation Generation Algorithms


11/30/2017 at 3:30PM


11/30/2017 at 4:45PM


140 DeBartolo



Patrick Flynn


Email: flynn@nd.edu
Phone: 574-631-8803
Website: http://www.nd.edu/~flynn
Office: 384A Fitzpatrick Hall


College of Engineering Duda Family Professor of Engineering
Computer Vision, Biometrics, Pattern Recognition, Computer Graphics and Scientific Visualization, Mobile Application Development

Programmers spend a large portion of their time reading and navigating source code in order to comprehend it. However, studies of program comprehension consistently find that programmers prefer to focus on small sections of code during software maintenance, and "try to avoid" comprehending everything. Programmers depend on documentation to quickly understand source code. Source code documentation explains, in plain English text, how the source code works. It summarizes source code, explains the behavior of a section of code, shows relationships to other code, etc. Specifically, a "source code summary" is a small amount of text (typically 1-3 sentences) explaining what the source code does or how it can be used.

Unfortunately, source code summaries are time-consuming to write. Because software is constantly changing, the documentation also needs to be updated frequently. However, as Fluri et al. describe, the documentation is often left in its original state due to time constraints. The result is that documentation may be incorrect and is often misleading. Recently, efforts to automatically generate documentation have proliferated. The long-term goal is to reduce the manual effort of writing source code summaries and to generate summaries from source code with little to no effort.

Recent research has targeted the problem of automatic source code summarization. Generally speaking, these summarization tools work by analyzing the source code, determining the important terms, and creating summaries based on those terms. The current tools aimed at performing this task have been shown to be effective under specific conditions, but are unable to produce summaries of human-level quality. Current tools fall short of human-written summaries because there is a knowledge gap in the literature: the research community does not know precisely what a summary should include. Summaries should reflect what programmers need, and we do not know exactly what programmers need.
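As a rough illustration of the pipeline described above (analyze the source code, determine the important terms, create a summary from those terms), here is a minimal Python sketch. It is not any particular published tool: the stop-word list, the frequency-based ranking, the sentence template, and all function names are assumptions chosen only to make the three steps concrete.

```python
import re

# Hypothetical stop list: language keywords that say little about intent.
STOP_WORDS = {"public", "private", "static", "void", "int", "double",
              "string", "return", "if", "else", "for", "while", "new"}

def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    words = []
    for part in identifier.split("_"):
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", part))
    return [w.lower() for w in words]

def important_terms(source, top_n=3):
    """Rank words found in identifiers by raw frequency (an assumed heuristic)."""
    counts = {}
    for identifier in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source):
        for word in split_identifier(identifier):
            if word not in STOP_WORDS and len(word) > 1:
                counts[word] = counts.get(word, 0) + 1
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]

def summarize(source, top_n=3):
    """Fill a fixed sentence template with the top-ranked terms."""
    return "This code involves: " + ", ".join(important_terms(source, top_n)) + "."

EXAMPLE = """
double getAccountBalance(Account account) {
    double balance = account.balance;
    return balance;
}
"""

if __name__ == "__main__":
    print(summarize(EXAMPLE))
```

A keyword list like this also shows the limitation the abstract points to: the sketch can surface frequent terms, but nothing in it knows which information a programmer actually needs from a summary.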

My research seeks to close this gap in the current literature. My strategy is to study how programmers write documentation and then write algorithms that mimic their process. Specifically, I will 1) observe programmers reading source code and write an algorithm that extracts the same information they looked at, 2) observe programmers in meetings and mimic their behavior in creating user stories, and 3) observe programmers' communication in industry to categorize their behavior. My research assists code summarization research by providing a guide to the way programmers work and read source code.

Seminar Speaker:

Paige Rodeghero


University of Notre Dame

Paige Rodeghero is a fifth-year PhD candidate at the University of Notre Dame. She works in the field of software engineering and specializes in human-centric software engineering research. She publishes at venues such as the ACM/IEEE International Conference on Software Engineering (ICSE) and IEEE Transactions on Software Engineering (TSE). She won the ACM Distinguished Paper Award for her work at ICSE 2014. Ms. Rodeghero is currently on the job market.