Stories in Single Cell RNA Sequencing

Author: da Veiga Beltrame, Eduardo

Year: 2022

Degree: Dissertation (Ph.D.)

Advisor: Sternberg, Paul W.

Committee Members: Thomson, Matthew; Pachter, Lior S.; Van Valen, David A.; Sternberg, Paul W.

Option: Bioengineering

DOI: 10.7907/4kgh-8420

Abstract

This thesis describes the projects I have worked on since starting the Caltech bioengineering program in fall 2017. All of them concern single cell RNA sequencing (scRNA-seq) and span the experimental and computational realms.

Chapter 1 is an introduction that explains the essential concepts and is meant to be readable by a wide audience. Each of the other chapters succinctly describes a separate project and begins with links to the related preprint, published paper, or code repositories.

Chapter 2 describes the scVI generative model for scRNA-seq data and the scvi-tools framework, which forms the basis of many of my computational projects.

Chapter 3 describes an open source, 3D printable syringe pump system developed to facilitate many kinds of experiments, in particular droplet-based scRNA-seq.

Chapter 4 describes a new way of fabricating hydrogel beads with unique DNA barcodes that are used for scRNA-seq experiments.

Chapter 5 describes a database, which I helped create, listing most published scRNA-seq studies; it provides a useful overview of the state of the field.

Chapter 6 describes the kallisto bus workflow, which is used for pre-processing scRNA-seq data and goes from FASTQ files to a gene count matrix very efficiently.

Chapter 7 describes a new way of using scVI to quantify, for a given scRNA-seq dataset, the trade-off in data quality between surveying more cells and sequencing more reads per cell.

Chapter 8 describes tools developed for WormBase users to leverage C. elegans scRNA-seq data; these tools can also be deployed with any other scRNA-seq dataset.

Chapter 9 describes a remarkably successful offshoot of the development of these tools: a simple scVI-based analysis and visualization strategy for finding candidate marker genes using C. elegans scRNA-seq data, which was experimentally validated by members of the Sternberg lab.

Files