Single Cell Biology | EK26


DCLEAR: Reconstructing Single Cell Lineage Trees from CRISPR recorders by Distance-based Methods


Mar 17, 2021 12:00am ‐ Mar 17, 2021 12:00am

Description

Il-Youp Kwak (1), Wuming Gong (2)

(1) Department of Applied Statistics, College of Business & Economics, Chung-Ang University, 84, Heukseok-ro, Dongjak-gu, Seoul, Republic of Korea
(2) Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN 55114, USA

Background. A fundamental challenge in biology is reconstructing the developmental trajectories of cells as they divide and progress through different stages. Recent advances in CRISPR-based molecular tools, such as intMEMOIR and scGESTALT, have produced a new generation of techniques that enable the reconstruction of cell lineages of complex organisms at single-cell resolution. However, computationally inferring lineage trees from the noisy experimental readout at the single-cell level remains a significant challenge. The recent Allen Institute lineage reconstruction DREAM challenge was the first attempt to rigorously examine the performance and robustness of lineage reconstruction algorithms using benchmark experimental and in silico data.

Results. We developed two distance-based lineage reconstruction methods, weighted Hamming distance (WHD) and k-mer replacement distance (KRD), for this DREAM challenge and won two sub-challenges for reconstructing C. elegans and M. musculus lineage trees of 1,000 and 10,000 cells, respectively. The WHD method used the information content to weigh each possible mutated state in the character array, and the state weights were optimized by Bayesian hyperparameter optimization using the solution trees in the training datasets. The distance matrix between all cells was then built using this weighted Hamming distance. The KRD method used the prominence of mutations in the character arrays to estimate the summary statistics that were used for the generation of the tree to be reconstructed. These estimated parameters, combined with pre-defined parameters such as the number of cell divisions, were then used to simulate multiple lineage trees starting from the unmutated root. Different possibilities for the k-mer replacement distances were estimated from this simulation process, which generates pseudo-trees similar to the real trees. The cell distances were evaluated by aggregating the replacement distances of individual k-mers. We also systematically compared the performance of WHD and KRD with existing methods such as FastTree2, Cassiopeia and Hamming distance, and demonstrated their superior performance in recovering a variety of lineage structures.

Conclusion. We have shown that WHD and KRD are two novel methods for estimating cell distances from CRISPR/Cas9-enabled recorders, and that they outperform existing methods by several metrics and under a wide variety of parameter regimes. Our new algorithms should enable accurate large-scale lineage tracing efforts. The WHD and KRD methods were implemented as an R package, DCLEAR.
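The weighted Hamming distance idea from the Results can be illustrated with a minimal Python sketch. This is not the DCLEAR implementation: it assumes each cell is a string with one character per recorder site, that a state's weight is its information content (negative log frequency in the data), and that a mismatch contributes the product of the two state weights. The function names are hypothetical, and the Bayesian optimization of the weights mentioned in the abstract is omitted.

```python
import math
from collections import Counter


def information_weights(cells):
    """Assumed weighting: information content, -log(frequency), of each state."""
    counts = Counter(state for cell in cells for state in cell)
    total = sum(counts.values())
    return {state: -math.log(n / total) for state, n in counts.items()}


def weighted_hamming(a, b, weights):
    """Sum the product of the two state weights over sites where a and b differ."""
    if len(a) != len(b):
        raise ValueError("character arrays must have equal length")
    return sum(weights[x] * weights[y] for x, y in zip(a, b) if x != y)


def distance_matrix(cells, weights):
    """Symmetric matrix of pairwise weighted Hamming distances between cells."""
    n = len(cells)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = weighted_hamming(cells[i], cells[j], weights)
    return dist


# Toy character arrays (hypothetical data): "0" is unmutated, "1"/"2" are mutated states.
cells = ["0012", "0112", "2010"]
weights = information_weights(cells)
dist = distance_matrix(cells, weights)
```

A matrix like `dist` could then be passed to any distance-based tree-building method. Rare states receive larger weights under this assumption, so disagreements involving them push two cells further apart than disagreements involving common states.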

Speaker(s):

  • Wuming Gong, PhD, Lillehei Heart Institute, University of Minnesota
