As AI becomes more capable of following instructions and conducting analyses, I believe that scientists will increasingly play the role of selector and evaluator.
In this talk, I will share our recent advances in AI-enabled hypothesis generation and research evaluation. Rather than treating AI hallucinations as obstacles to eliminate, we leverage data and literature to steer AI creativity toward generating effective hypotheses.

I will also introduce HypoBench, a dedicated benchmark for evaluating hypothesis generation, which reveals substantial room for improvement in current AI models. Finally, I will present ongoing work that formalizes the evaluation of research outcomes beyond the paper itself and uses AI to conduct robust evaluation of research evaluation, with a case study on mechanistic interpretability.

Chenhao Tan,
University of Chicago
Chenhao Tan is an Associate Professor of Computer Science and Data Science at the University of Chicago, where he directs the Chicago Human+AI Lab. He earned his PhD in Computer Science from Cornell University and dual bachelor's degrees in computer science and economics from Tsinghua University. His research focuses on human-centered AI, communication and intelligence, AI and scientific discovery, and AI alignment. His work has been covered by major news outlets, including the New York Times and the Washington Post. He has received a Sloan Research Fellowship, an NSF CAREER award, an NSF CRII award, a Google Research Scholar award, research awards from Amazon, IBM, JP Morgan, and Salesforce, a Facebook Fellowship, and a Yahoo! Key Scientific Challenges award.