Foundations and Applications of Single-Cell RNA Sequencing

Author: Booeshaghi, Ali Sina

Year: 2022

Degree: Dissertation (Ph.D.)

Advisor: Pachter, Lior S.

Committee Members: Greer, Julia R.; Colonius, Tim; Melsted, Páll; Pachter, Lior S.

Option: Mechanical Engineering

DOI: 10.7907/ptbp-a779

Abstract

Single-cell RNA sequencing is an experimental technique for studying cellular gene expression, and it poses a multitude of engineering challenges. These challenges transcend the boundaries of traditional academic disciplines, and the field of mechanical engineering, which aims to address roadblocks in critical technologies towards engineering our environment, is central to this endeavor.

This thesis addresses three engineering challenges that must be met in order to realize the goal of bringing single-cell RNA sequencing to the clinic. The first is scalable cellular isolation and sampling. Chapter 2 describes the poseidon and colosseum instruments, which enable massive-scale single-cell isolation and collection. Each has novel design elements that reduce cost and enable modularity while achieving accuracy similar to that of expensive commercial alternatives.

The second challenge is the rapid preprocessing of single-cell RNA-sequencing (scRNAseq) data. Chapter 3 describes the kallisto | bustools command-line tools that make scalable scRNAseq analysis fast and efficient. These tools implement novel algorithms for sequence read alignment, barcode error correction, and molecular counting that help resolve ambiguities in sequence mapping.

The third challenge is refining gene expression data to the isoform level. This refinement is crucial for understanding transcriptional regulation and the effects of alternative splicing in biological processes. Towards that end, I have extended the kallisto | bustools workflow to process full-length scRNAseq data, taking advantage of the expectation-maximization algorithm to disambiguate sequence alignments. Chapter 4 describes how I used these tools to assemble the first-ever spatially resolved single-cell isoform atlas, covering a region of great interest to the neuroscience community (the mouse primary motor cortex), with data generated by three RNA-sequencing assays.
