Oral Candidacy - Peter Ivie

Start: 3/28/2017 at 11:30AM
End: 3/28/2017 at 2:00PM
Location: 117 I Cushing Hall
Peter Ivie

Proposal Defense

March 28, 2017          11:30 am          117 I Cushing

Adviser:  Dr. Douglas Thain

Committee Members:

Dr. Scott Emrich          Dr. Kevin Lannon          Dr. Gregory Madey


Enabling and Ensuring Reproducible Computational Science with Prune


Computing as a whole suffers from a crisis of reproducibility. Programs executed in one context are astonishingly hard to reproduce in another, resulting in wasted effort and a general distrust of computational results. Part of the problem is that every program has implicit dependencies on data and on its execution environment, which the end user rarely understands. In addition, a program must be organized so that a scientist can understand it and explore its parameter space. To address these problems, we present Prune, the Preserving Run Environment. In Prune, every task to be executed is wrapped in a functional interface and coupled with a strictly defined environment. The task is then executed by Prune rather than by the user to ensure reproducibility. As a scientific workflow evolves in Prune, it creates a growing but immutable tree of derived data. The provenance of every item in the system can be precisely described, which facilitates sharing and modification among collaborating researchers and allows limited storage space to be managed efficiently. We present the user interface and the initial prototype of Prune, and demonstrate its application to matching records and comparing surnames in U.S. Censuses. With replicability demonstrated, we propose to evaluate reproducibility by quantifying the effectiveness of Prune when synchronizing a workflow between collaborators.
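The ideas in the abstract (tasks wrapped in a functional interface, a declared environment, an immutable tree of derived data, and traceable provenance) can be illustrated with a small Python sketch. This is not Prune's interface or implementation: the names `Store`, `content_id`, `run`, and `lineage` are hypothetical and chosen only for illustration, and the environment here is merely recorded alongside the task, whereas a real system would also have to enforce it.

```python
import hashlib
import json


def content_id(payload: bytes) -> str:
    """Name a piece of data by the SHA-256 hash of its bytes."""
    return hashlib.sha256(payload).hexdigest()


class Store:
    """Hypothetical append-only store: items are added, never modified."""

    def __init__(self):
        self.data = {}        # content id -> bytes
        self.provenance = {}  # derived item id -> task that produced it
        self.cache = {}       # task id -> output id

    def put(self, payload: bytes) -> str:
        """Add raw input data and return its content id."""
        cid = content_id(payload)
        self.data.setdefault(cid, payload)
        return cid

    def run(self, func, input_ids, env):
        """Execute func on stored inputs under a declared environment.

        The task is identified by the function name, input ids, and
        environment (an assumption of this sketch), so repeating an
        identical task reuses the stored result instead of re-running.
        """
        task = {
            "function": func.__name__,
            "inputs": list(input_ids),
            "environment": env,
        }
        task_id = content_id(json.dumps(task, sort_keys=True).encode())
        if task_id in self.cache:
            return self.cache[task_id]
        output = func(*[self.data[i] for i in input_ids])
        out_id = content_id(output)
        if out_id not in self.data:
            # Record provenance only for genuinely new items, so the
            # provenance graph stays acyclic.
            self.data[out_id] = output
            self.provenance[out_id] = task
        self.cache[task_id] = out_id
        return out_id

    def lineage(self, cid):
        """Walk back from an item to the raw inputs it derives from."""
        task = self.provenance.get(cid)
        if task is None:
            return {"id": cid, "source": "raw input"}
        return {
            "id": cid,
            "function": task["function"],
            "environment": task["environment"],
            "inputs": [self.lineage(i) for i in task["inputs"]],
        }


def normalize(raw: bytes) -> bytes:
    """Toy task: trim whitespace and uppercase a surname."""
    return raw.strip().upper()


store = Store()
raw_id = store.put(b"  ivie\n")
env = {"python": "3", "os": "linux"}  # placeholder environment description
clean_id = store.run(normalize, [raw_id], env)
```

Because every item is named by its content and every derived item records the task that produced it, a collaborator holding the same raw input and the same task description can recompute `clean_id` and compare it directly. Derived data could also be deleted to reclaim space and regenerated later from its lineage.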

