Using single-cell gene expression data to derive putative transcription factor interactions linked to neuronal phenotype
Howard Hughes Medical Institute, Janelia Research Campus, Ashburn, VA 20147
Recent advances in single-cell RNA-sequencing technology have resulted in the identification of numerous transcriptomic cell types in multiple organs, including a variety of substructures in the brain. The majority of these efforts have focused on classifying cells into types, often distinguishable by one or several marker genes. Because these marker genes are selected as a readout based mainly on the specificity of their expression patterns, they are often not causal genes responsible for maintaining the identity of a given cell type, but rather reflect an individual neuronal phenotype or may even have unknown function in the brain. This suggests an alternative analysis of higher-depth single-cell RNA-seq data, namely the use of regularized statistical models to generate putative transcription factor codes resulting in expression of genes related to “downstream” neuronal phenotypes such as electrophysiology and neurotransmitter signaling. Whereas this general strategy has been applied to population and tissue data, the ability to resolve expression patterns in a single cell provides higher-confidence predictions. This presentation highlights a series of transcription factor predictions, derived from multiple single-cell RNA-seq data sets, and shows the use of this data modality to build a transcription factor code on a gene-by-gene basis. Finally, given that these are gene interactions predicted solely from co-expression data, it is important to assess whether these predictions are consistent with recent epigenetic data obtained from neuronal samples. By showing that a subset of the predictions is indeed supported by ChIP-seq and ATAC-seq data, this analysis generates a series of novel testable hypotheses about transcription factor regulation of genes responsible for cell type-specific phenotypes.
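As an illustration of what a regularized model for one target gene might look like, the sketch below fits an L1-penalized (lasso) regression of a single phenotype-related gene on transcription factor expression across cells. The abstract does not state which regularized model was used, so the lasso, the penalty value `alpha`, the coordinate-descent solver, the synthetic expression matrix, and the `TF_0` ... `TF_19` names are all assumptions made for this example, not the author's method or data.

```python
import numpy as np


def soft_threshold(value, penalty):
    """Shrink a scalar toward zero by `penalty` (the lasso proximal step)."""
    return np.sign(value) * max(abs(value) - penalty, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n_cells, n_tfs = X.shape
    w = np.zeros(n_tfs)
    col_sq = (X ** 2).sum(axis=0) / n_cells  # per-TF mean squared expression
    residual = y - X @ w
    for _ in range(n_iter):
        for j in range(n_tfs):
            if col_sq[j] == 0.0:
                continue  # TF not expressed in any cell: leave its weight at zero
            residual += X[:, j] * w[j]          # remove TF j's current contribution
            rho = X[:, j] @ residual / n_cells  # correlation of TF j with what is left
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * w[j]          # add the updated contribution back
    return w


# Synthetic stand-in for a cells x TFs expression matrix (hypothetical data).
rng = np.random.default_rng(0)
n_cells, n_tfs = 300, 20
tf_names = [f"TF_{j}" for j in range(n_tfs)]  # hypothetical TF labels
X = rng.normal(size=(n_cells, n_tfs))
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize each TF across cells

# Assumed ground truth: the target gene depends on two TFs only.
true_w = np.zeros(n_tfs)
true_w[2] = 1.5    # activator
true_w[7] = -1.0   # repressor
y = X @ true_w + 0.1 * rng.normal(size=n_cells)
y = y - y.mean()   # center the target gene's expression

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [tf_names[j] for j in range(n_tfs) if abs(weights[j]) > 1e-8]
print("Putative TF code for the target gene:", selected)
```

Repeating such a fit for each phenotype-related gene would give one sparse set of candidate transcription factors per gene, which is one way to read "a transcription factor code on a gene-by-gene basis". In practice the penalty would be chosen by cross-validation. As the abstract notes, the selected factors reflect co-expression only, so they remain hypotheses until checked against independent evidence such as ChIP-seq or ATAC-seq data.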