Recent developments for analyzing droplet-based single-cell transcriptomic data
Single-cell transcriptome sequencing (scRNA-seq) has become a revolutionary tool for studying cellular and molecular processes. Newly developed droplet-based technologies enable efficient parallel processing of thousands of single cells, with direct counting of transcript copies using unique molecular identifiers (UMIs). Despite these rapid technological advances, statistical methods and computational tools for analyzing droplet-based single-cell transcriptomic data are still lacking. One important task in the analysis of scRNA-seq data is identifying cell subtypes in heterogeneous tissues. In this talk, I will describe novel statistical methods for clustering population-based scRNA-seq data. Our approach explicitly models UMI count data, characterizes variation across cell clusters via a Dirichlet mixture prior, and uses a Bayesian hierarchical model to account for heterogeneity among multiple individuals. In both simulation studies and real data analysis, our proposed method outperforms existing methods in clustering accuracy and stability, and thus will facilitate and accelerate novel biological discoveries.
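As a rough illustration of what modeling UMI counts with a Dirichlet prior per cluster can look like, the sketch below scores one cell's count vector under a Dirichlet-multinomial likelihood and assigns it to the best-scoring cluster. This is an assumption about the general model family, not the speaker's implementation: the function names, the toy concentration parameters, and the hard-assignment rule are all hypothetical, and the Bayesian hierarchical layer across individuals is not covered.

```python
from math import lgamma


def dirichlet_multinomial_loglik(counts, alpha):
    """Log-likelihood of one cell's UMI counts under a Dirichlet-multinomial.

    The multinomial coefficient is omitted because it depends only on the
    counts, so it is the same for every cluster and does not affect ranking.
    """
    total_counts = sum(counts)
    total_alpha = sum(alpha)
    loglik = lgamma(total_alpha) - lgamma(total_alpha + total_counts)
    for x, a in zip(counts, alpha):
        loglik += lgamma(x + a) - lgamma(a)
    return loglik


def assign_cluster(counts, cluster_alphas):
    """Return the index of the cluster whose Dirichlet parameters fit best.

    Hypothetical hard assignment; a full mixture model would also weight
    each cluster by its mixing proportion and iterate (e.g. with EM).
    """
    scores = [dirichlet_multinomial_loglik(counts, a) for a in cluster_alphas]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example with three genes and two clusters (made-up parameters).
cluster_alphas = [
    [8.0, 1.0, 1.0],  # cluster 0: gene 0 dominates
    [1.0, 1.0, 8.0],  # cluster 1: gene 2 dominates
]
cell_a = [20, 2, 1]
cell_b = [1, 3, 25]
labels = [assign_cluster(c, cluster_alphas) for c in (cell_a, cell_b)]
```

In this toy setup each cell is assigned to the cluster whose concentration parameters favor the gene it expresses most. Estimating those parameters from data, rather than fixing them by hand, is the part a real method has to solve.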