Unsupervised classification in single-cell transcriptome analysis
Xintian You¹, Tim OF Conrad¹
¹Zuse Institute Berlin, Takustr. 7, Berlin 14195, Germany
The development of single-cell RNA-seq techniques allows gene expression to be measured at the single-cell level, which enables unbiased stratification of cells that otherwise appear homogeneous. Owing to the limited starting material, single-cell RNA-seq data are plagued by relatively high noise, which makes it challenging to distinguish the biological variance of interest from technical variance. Although current protocols do not yet permit quantification of the absolute number of RNA molecules per cell, several supporting techniques have been developed, including adding ERCC spike-ins and unique molecular identifiers (UMIs) to the library and/or using smFISH for correction. To extract biologically meaningful information, we developed a computational pipeline that performs unsupervised classification using Singular Value Decomposition in conjunction with Gene Set Enrichment Analysis and Gene Ontology Enrichment Analysis. We increase the signal-to-noise ratio by using permutation tests to discard components of low biological importance, which simultaneously reduces the dimensionality. The final classification is visualised by hierarchical clustering and t-Distributed Stochastic Neighbour Embedding. Using published datasets, we showed that our method achieves classification results very similar to the published ones without post hoc parameter optimisation. Moreover, our method allows direct identification and removal of “unwanted” effects such as the cell cycle.
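The component-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the permutation scheme, so the sketch assumes a common one (shuffling each gene's values across cells independently and comparing the fraction of variance explained by each singular value against that null). The function name `significant_components`, its parameters, and the synthetic data are all hypothetical; the enrichment analyses, hierarchical clustering and t-SNE steps are not shown.

```python
import numpy as np

def significant_components(X, n_perm=50, alpha=0.05, seed=0):
    """Select leading SVD components by a permutation test (illustrative).

    X is assumed to be a genes x cells matrix of (log-)expression values.
    Returns the number of retained components, per-component p-values,
    and the cell scores on the retained components (cells x keep).
    """
    rng = np.random.default_rng(seed)
    # Centre each gene across cells so the SVD captures variation, not means.
    Xc = X - X.mean(axis=1, keepdims=True)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    observed = s**2 / np.sum(s**2)  # fraction of variance per component

    null = np.empty((n_perm, s.size))
    for p in range(n_perm):
        # Assumed null model: shuffle every gene independently across cells,
        # which destroys gene-gene covariance but keeps each gene's values.
        Xp = rng.permuted(Xc, axis=1)
        sp = np.linalg.svd(Xp, compute_uv=False)
        null[p] = sp**2 / np.sum(sp**2)

    # Empirical p-value: how often the null explains at least as much variance.
    pvals = (1 + np.sum(null >= observed, axis=0)) / (n_perm + 1)

    # Keep leading components up to the first non-significant one.
    keep = 0
    while keep < s.size and pvals[keep] < alpha:
        keep += 1

    scores = Vt[:keep].T * s[:keep]  # reduced representation of the cells
    return keep, pvals, scores

# Synthetic example (made-up data): 200 genes, two groups of 30 cells each,
# with 50 genes shifted upwards in the second group.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 60))
X[:50, 30:] += 3.0
keep, pvals, scores = significant_components(X)
```

The retained scores form a lower-dimensional, denoised representation of the cells that could then be passed to a clustering or embedding step. Components associated with an unwanted effect could in principle be dropped at this stage as well, although how the authors identify such components is not described here.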