
Anuj Karpatne - Theory-guided Data Science: A New Paradigm for Advancing Scientific Discovery in the Big Data Era


2/9/2017 at 3:30PM


2/9/2017 at 5:00PM


129 DeBartolo



Nitesh Chawla


Email: nchawla@nd.edu
Phone: 574-631-1090
Website: http://www.nd.edu/~nchawla/
Office: 384 Nieuwland Science Hall


College of Engineering Frank M. Freimann Professor
Dr. Chawla's research interests are broadly in the areas of Big Data: data science, machine learning, network science, and their applications to social networks, healthcare informatics/analytics, and climate/environmental sciences. He directs the Notre Dame Interdisciplinary Center for Network ...

Data science methods, which have found tremendous success in the commercial arena, are increasingly recognized for their potential to advance scientific discovery. To capture this excitement, some have even referred to the rise of data science in scientific disciplines as "the end of theory," the idea being that, in domains where data are available in sufficiently large quantities, one can discard scientific theories and rely completely on the knowledge contained in data. Unfortunately, this "black-box" application of data science has met with limited success in scientific domains, where complex physical phenomena are insufficiently represented by the scarce data samples available.
This talk will introduce a novel paradigm for advancing scientific discovery that uses the unique capability of data science methods to automatically learn patterns and models from large data, without ignoring the treasure of accumulated scientific knowledge. This theory-guided data science paradigm seeks to integrate scientific consistency as a critical component of model performance, along with the twin pillars of training accuracy and model complexity that are at the heart of state-of-the-art machine learning frameworks. The talk will describe several strategies for integrating scientific consistency into conventional learning frameworks, using illustrative examples of emerging applications from a diverse range of scientific domains such as materials science, hydrology, climate science, turbulence modeling, biomedical engineering, neuroscience, and biomarker discovery. It will conclude with a detailed case study on mapping the dynamics of freshwater bodies at a global scale using data from Earth-observing satellites.
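As a rough illustration of the idea in the abstract, the sketch below adds a scientific-consistency term to the usual combination of training error and model complexity. It is not the speaker's method: the function name, the weighting parameters lam_complexity and lam_physics, and the physical rule itself (predictions ordered along some input, such as depth, must never decrease) are all assumptions chosen for illustration.

```python
import numpy as np


def theory_guided_loss(y_true, y_pred, weights,
                       lam_complexity=0.01, lam_physics=1.0):
    """Training error + model complexity + scientific-consistency penalty.

    All names and the physical rule here are hypothetical placeholders.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Pillar 1: training accuracy, measured as mean squared error.
    training_error = np.mean((y_true - y_pred) ** 2)

    # Pillar 2: model complexity, measured as the squared L2 norm of the weights.
    complexity = np.sum(weights ** 2)

    # Added term: scientific consistency. Assumed rule: predictions, taken in
    # the order given, must be non-decreasing. Each drop between neighbours
    # counts as a violation, weighted by its size.
    violations = np.maximum(0.0, y_pred[:-1] - y_pred[1:])
    inconsistency = np.mean(violations)

    return (training_error
            + lam_complexity * complexity
            + lam_physics * inconsistency)


# Two candidate predictions for the same targets.
targets = [1.0, 2.0, 3.0]
consistent = theory_guided_loss(targets, [1.0, 2.0, 3.0], weights=[0.0])
inconsistent = theory_guided_loss(targets, [1.0, 3.0, 2.0], weights=[0.0])
```

With lam_physics set to 0 the function reduces to an ordinary regularized loss. Raising it makes predictions that break the assumed rule more costly, even when their training error is otherwise acceptable.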

Seminar Speaker:

Anuj Karpatne


University of Minnesota

Anuj Karpatne is a PhD candidate at the University of Minnesota, where he works with his advisor Prof. Vipin Kumar on an NSF Expeditions in Computing project, "Understanding Climate Change: A Data-Driven Approach." Anuj's research focuses on addressing some of the pertinent challenges in analyzing complex physical data in interdisciplinary problems. His research has resulted in a system to monitor the dynamics of surface water bodies on a global scale, which was highlighted in a recent NSF news story. This system is enabling a number of environmental studies on the impact of climate change and/or human actions on water availability. Anuj's system will also be key to providing information about surface area changes in lakes around the world (considered one of the 50 essential climate variables) to inform the climate change mitigation and adaptation efforts of the United Nations Framework Convention on Climate Change (UNFCCC). Anuj has received the Doctoral Dissertation Fellowship and the Informatics Institute Fellowship at the University of Minnesota. He is also a co-author of the second edition of the leading textbook "Introduction to Data Mining." Before joining the University of Minnesota, Anuj received his bachelor's and master's degrees from the Indian Institute of Technology Delhi.