Understanding Kinase-Substrate Interaction with Deep Learning and High-Throughput Scanning

Author: Yu, Changhua

Year: 2026

Degree: Dissertation (Ph.D.)

Advisor: Van Valen, David A.

Committee Members: Sternberg, Paul W.; Yue, Yisong; Mayo, Stephen L.; Van Valen, David A.

Option: Bioengineering

DOI: 10.7907/2m7a-nx91

Abstract

Catalyzed by more than 500 human kinases, protein phosphorylation contributes to almost every aspect of cellular signaling. Despite advances over the past decades, a profound gap persists between the scale of the kinase signaling network and our ability to characterize it: over 95% of human phosphosites lack an assigned kinase, approximately one-third of the human kinome remains functionally understudied, and the mechanisms by which kinase domains select substrates remain under-explored. Consequently, the druggable landscape for targeting phosphorylation rewiring events in disease contexts remains limited.

This thesis addresses these challenges through developments spanning machine learning, high-throughput interactome screening, and deep mutational scanning. We develop KINBERT, a transformer-based protein language model that jointly encodes paired kinase domain and substrate peptide sequences, and demonstrate its utility through disease variant interpretation and identification of host kinase targets during viral infection. We develop PhosphoPCA, a barcoded yeast-based assay that links kinase–substrate phosphorylation to cell growth for high-throughput pooled profiling of kinase specificity, and apply it to identify novel substrates for understudied kinases and to engineer validated live-cell kinase biosensors. Finally, we leverage saturation mutagenesis to enable deep mutational scanning of kinase domains, in which variant fitness is resolved into measurable components arising from protein stability, phosphorylation activity, and substrate selectivity.

Together, the thesis builds an integrated framework for systematically decoding the kinase–substrate interaction space, providing a diverse set of novel technologies for illuminating the dark kinome, interpreting pathogenic phosphoSNVs, uncovering the effects of kinase variants, and enabling generalizable engineering of kinase biosensors.