Novel methods to analyze compositionality in single-cell transcriptomics data
Tanya Karagiannis and Eric Reed
Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, MA
Data Intensive Study Center, Tufts University, Medford, MA
Over the past decade, advances in high-throughput biomolecular measurement technologies have been highlighted by the ability to generate comprehensive biological profiles at the level of individual cells. Of these technologies, single-cell transcriptomics, which profiles the expression of gene transcripts, has been the most widely used due to its cost-effectiveness and its power to distinguish cell types at fine resolution. Accordingly, investigators have unprecedented opportunities to molecularly resolve the landscape of cells that compose individual samples, and to gain biological insight into diseases and other conditions by characterizing compositional diversity across samples. To this end, advances in single-cell technologies have coincided with the development of novel analytical methods suited to studying these data at a variety of resolutions, e.g., cell-to-cell or sample-to-sample. In this session, we will discuss the development of two such methods and their application to investigating cell type compositional changes in peripheral blood mononuclear cells (PBMCs) related to aging and extreme old age. The first method is an entropy-based “diversity statistic” for detecting global cell type compositional changes across multiple samples and biological conditions. This method allows for comparison across different cell type resolutions and for conditional analyses within “cell compartments”, defined as groups of cell types with similar molecular profiles. To delineate these cell compartments, we use the second method, a data-driven approach that performs nested sub-grouping of cell types, estimated via repeated perturbation of the data.
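To make the idea of an entropy-based diversity statistic concrete, the following is a minimal sketch, not the statistic presented in the session. It assumes that diversity is measured as the Shannon entropy of a sample's cell type proportions, divided by the logarithm of the number of cell types so that values fall between 0 and 1. The function names `normalized_shannon_entropy` and `sample_diversity`, and the normalization itself, are illustrative assumptions.

```python
import math
from collections import Counter


def normalized_shannon_entropy(counts):
    """Shannon entropy of the proportions implied by `counts`, scaled to [0, 1].

    Assumption: scaling by log(number of cell types) is one common choice;
    the abstract does not specify how the statistic is normalized.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one cell")
    k = len(counts)
    if k < 2:
        return 0.0  # a single category carries no compositional diversity
    # Zero counts are skipped because p * log(p) tends to 0 as p tends to 0.
    entropy = sum(-(c / total) * math.log(c / total) for c in counts if c > 0)
    return entropy / math.log(k)


def sample_diversity(cell_labels, cell_types):
    """Diversity of one sample, given one cell type label per cell.

    `cell_types` fixes the set of types considered, so samples are compared
    on the same scale even when a type is absent from a sample.
    """
    tally = Counter(cell_labels)
    return normalized_shannon_entropy([tally.get(t, 0) for t in cell_types])


# Hypothetical labels for two samples (illustrative only, not real data).
types = ["T cell", "B cell", "NK cell", "Monocyte"]
balanced = ["T cell"] * 10 + ["B cell"] * 10 + ["NK cell"] * 10 + ["Monocyte"] * 10
skewed = ["T cell"] * 37 + ["B cell"] * 1 + ["NK cell"] * 1 + ["Monocyte"] * 1

print(sample_diversity(balanced, types))
print(sample_diversity(skewed, types))
```

Under these assumptions, a sample with evenly represented cell types scores near 1 and a sample dominated by one cell type scores near 0. A conditional analysis within a cell compartment could be sketched by passing only that compartment's cell types as `cell_types` and only the cells belonging to it as `cell_labels`.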