A Platform for High Throughput Discovery of Sequence Defined Affinity Reagents
Author: Chen, Linlin
Year: 2026
Degree: Dissertation (Ph.D.)
Advisor: Guttman, Mitchell
Committee Members: Elowitz, Michael B.; Mayo, Stephen L.; Gradinaru, Viviana; Guttman, Mitchell
Option: Bioengineering
DOI: 10.7907/xrr0-ck24
Abstract
Protein-level measurement of biological systems remains constrained by the availability of specific, validated affinity reagents. While next-generation sequencing has enabled transcriptome-wide measurement at single-cell and spatial resolution, analogous protein measurement is limited by the scarcity of high-quality binders against most of the proteome. Conventional antibody generation is a one-target-at-a-time process in which specificity must be assessed after selection, reproducibility depends on biological source material that is finite and sequence-undefined, and effort scales linearly with target number.
This thesis describes a platform for high-throughput nanobody discovery that addresses these limitations through two innovations within a single in vitro workflow. The first is pooled ribosome display selection: a synthetic nanobody library is panned against many proteins simultaneously in a single reaction, eliminating the need for sequential per-target campaigns. The second is split-pool DNA barcoding for target deconvolution: protein identity is assigned to each enriched nanobody by combinatorial barcode intersection without physical separation, so that experimental complexity grows only logarithmically with target number. Critically, because enrichment is quantified across all targets simultaneously, specificity emerges as a comparative property of the data: a nanobody is called as a specific binder not by passing a threshold in isolation, but because it enriches preferentially against one target relative to the others in the pool. Selected nanobodies are sequence-defined and therefore renewable and reproducible by construction.
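The idea of deconvolution by combinatorial barcode intersection can be illustrated with a minimal sketch. It is not the pipeline used in the thesis: the encoding (each target placed in exactly one pool per round), the function names, and the parameters pools_per_round and rounds are all assumptions made for illustration. Under these assumptions, the set of pools in which a nanobody is enriched, one per round, intersects to a single target, and the number of rounds needed grows as the logarithm of the number of targets.

```python
from itertools import product


def rounds_needed(n_targets, pools_per_round):
    """Smallest number of rounds giving at least n_targets distinct codes."""
    rounds, capacity = 0, 1
    while capacity < n_targets:
        capacity *= pools_per_round
        rounds += 1
    return rounds


def assign_codes(targets, pools_per_round, rounds):
    """Give each target a unique code: one pool index per round (hypothetical scheme)."""
    codes = product(range(pools_per_round), repeat=rounds)
    assignment = dict(zip(targets, codes))
    if len(assignment) < len(targets):
        raise ValueError("not enough barcode combinations for all targets")
    return assignment


def pool_members(assignment):
    """Map each (round, pool) pair to the set of targets placed in that pool."""
    members = {}
    for target, code in assignment.items():
        for round_index, pool in enumerate(code):
            members.setdefault((round_index, pool), set()).add(target)
    return members


def deconvolve(enriched_pools, members, all_targets):
    """Intersect the target sets of every pool in which a nanobody is enriched."""
    candidates = set(all_targets)
    for key in enriched_pools:
        candidates &= members.get(key, set())
    return candidates


# Illustrative panel of 120 placeholder target names with two pools per round.
targets = [f"T{i}" for i in range(120)]
n_rounds = rounds_needed(len(targets), pools_per_round=2)
assignment = assign_codes(targets, pools_per_round=2, rounds=n_rounds)
members = pool_members(assignment)

# A nanobody enriched in exactly the pools that contained T37 resolves to T37.
observed = list(enumerate(assignment["T37"]))
print(n_rounds, deconvolve(observed, members, targets))
```

In this toy setting, 120 targets need 7 binary rounds rather than 120 separate selections. Real data would also require handling of noise and of nanobodies enriched in more than one pool per round, which the sketch leaves out.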
The platform was applied to a panel of human cell-surface extracellular domain proteins. Controlled experiments established that pooled selection correctly enriches and deconvolves specific binders and that a combined enrichment-specificity scoring framework distinguishes genuine hits from promiscuous sequences. De novo binder discovery was demonstrated across panels of 3, 24, and 120 proteins, with called hits validated by a multiplexed quantitative binding assay, flow cytometry on cells expressing the native membrane-anchored protein, and immunoprecipitation from cell lysate. CDR sequence convergence analysis provided independent evidence of affinity-driven selection, with multiple distinct convergent clusters per target indicating sampling of different epitopes. Split-pool barcoding produced results concordant with physical-split deconvolution and enabled scale-up to 120 proteins.
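The abstract does not define the enrichment-specificity score, so the following is only a sketch of how a comparative hit call of this kind could look. The log2 enrichment values, the margin over the runner-up target as a specificity measure, the function name call_hit, and both thresholds are assumptions chosen for illustration and are not taken from the thesis.

```python
def call_hit(enrichment_by_target, min_enrichment=2.0, min_margin=1.0):
    """Call a nanobody a specific binder for its best target (hypothetical rule).

    enrichment_by_target maps target name to log2 enrichment of the nanobody
    in that target's selection relative to the input library.
    Returns (best_target, best_enrichment, margin, is_hit).
    """
    if len(enrichment_by_target) < 2:
        raise ValueError("comparative scoring needs at least two targets")
    ranked = sorted(enrichment_by_target.items(), key=lambda kv: kv[1], reverse=True)
    best_target, best = ranked[0]
    runner_up = ranked[1][1]
    # Specificity is measured against the other targets in the same pool,
    # not against an absolute threshold alone.
    margin = best - runner_up
    is_hit = best >= min_enrichment and margin >= min_margin
    return best_target, best, margin, is_hit


# Made-up values: one target-selective profile and one promiscuous profile.
specific = {"A": 5.0, "B": 0.2, "C": -0.1}
promiscuous = {"A": 5.0, "B": 4.9, "C": 4.7}
print(call_hit(specific))
print(call_hit(promiscuous))
```

Both example profiles are strongly enriched against target A, but only the first is called a hit, because the second is enriched almost equally against every target in the pool.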
The platform demonstrates that throughput and specificity in binder discovery need not trade off and provides a practical path toward systematic reagent generation against large protein panels. These reagents are directly applicable to the interaction mapping assays developed in this laboratory, to sequencing-based protein measurement platforms currently bottlenecked by binder availability, and to the systematic characterization of the cell-surface proteome more broadly.