AI for Scientists: Accelerating Discovery Through Knowledge, Data, and Learning
Author: Sun, Jennifer Jianing
Year: 2024
Degree: Dissertation (Ph.D.)
Advisors: Perona, Pietro; Yue, Yisong
Committee Members: Bouman, Katherine L.; Perona, Pietro; Yue, Yisong; Chaudhuri, Swarat; Kennedy, Ann
Option: Computing and Mathematical Sciences
DOI: 10.7907/d6y8-4590
Abstract
With rapidly growing amounts of experimental data, machine learning is increasingly crucial for automating scientific data analysis. However, many real-world workflows demand expert-in-the-loop attention and require models that interface not only with data, but also with experts and domain knowledge. My research develops full-stack solutions that enable scientists to scalably extract insights from diverse and messy experimental data with minimal supervision. My approaches learn from both data and expert knowledge, while exploiting the right level of domain knowledge for generalization. This thesis presents progress towards developing automated scientist-in-the-loop solutions, including methods that automatically discover meaningful structure from data, such as self-supervised keypoints from videos of diverse behaving organisms. It then discusses methods that use these interpretable structures to inject domain knowledge into the learning process, such as guiding representation learning using symbolic programs of behavioral features computed from keypoints. This work is the result of close collaborations with domain experts, such as behavioral neuroscientists, to identify bottlenecks and integrate these methods into real-world workflows. My aim is to enable AI that collaborates with scientists to accelerate the scientific process.
Files
- Jennifer_2023_Caltech_Thesis.pdf (application/pdf)