PhD Defense - Saurabh Nagrecha

Start: 5/5/2017 at 2:00PM
End: 5/5/2017 at 5:55PM
Location: 384 Nieuwland Hall
Attendees: Faculty and students are welcome to attend the presentation portion of the defense. Light refreshments will be served.

Saurabh Nagrecha

Dissertation Defense

May 5, 2017          2:00 pm          384E Nieuwland 

Advisor: Prof. Nitesh V. Chawla

Committee Members:

Dr. Sidney K. D'Mello         Dr. Reid A. Johnson         Dr. Tim Weninger



Operationalizing Imbalanced Class Problems in Data Science


The increasing diversity of data sources has propelled data science into an equally diverse set of application domains. Across these domains, a fundamentally common task is that of classification. A ubiquitous challenge in classification is that of imbalanced class distributions. This problem occurs when one class is underrepresented in the data, which makes modeling and successful implementation challenging. While class imbalance occurs across a wide variety of domains, predictive pipelines require a bespoke approach for each domain. The Actionable Knowledge Discovery (AKD) paradigm formalizes domain-actionability by structuring data and metrics to cater to the domain. However, it does not account for the role of domain knowledge in influencing problem and pipeline design. In this dissertation, we extend AKD and explore how imbalanced class problems can be transformed from academic proofs of concept to operationally viable solutions for their intended domain. We demonstrate the role of domain-driven problem and pipeline design across the diverse domains of cost-sensitive classification, auto insurance, online video content, Massive Open Online Courses (MOOCs), and network science in the form of deployed solutions. As a result, we show how the domain influences which questions we ask of the data and how we should interpret them.