Preprint
Summary: Mitochondrial transcript abundance is a standard quality control metric in single-cell RNA sequencing, but fixed percentage thresholds fail to account for the substantial variation in mitochondrial content across cell types and tissues, risking both retention of compromised cells and exclusion of transcriptionally active viable cell populations. We present MitoChontrol, a cell-type-aware probabilistic framework for mitochondrial quality control that models the mitochondrial transcript fraction within transcriptionally coherent clusters as a Gaussian mixture distribution. Compromised-cell components are identified from the upper tail of each cluster-specific distribution, and filtering thresholds are defined as the point at which the posterior probability of cellular compromise exceeds a user-defined confidence value. Applied to controlled perturbation experiments and a pancreatic ductal adenocarcinoma single-cell dataset, MitoChontrol selectively removes transcriptionally compromised cells while preserving biologically elevated but viable populations, outperforming fixed-threshold and outlier-based approaches.
Availability and Implementation: MitoChontrol is implemented in Python and integrates directly with AnnData-based workflows. It is freely available under the GNU General Public License v3 (GPL-3.0) at: https://github.com/uttamLab/MitoChontrol (DOI: https://doi.org/10.5281/zenodo.19423054)
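The thresholding idea in the summary above can be illustrated with a minimal, standard-library-only sketch. It is not the MitoChontrol implementation or its API: the function names (`fit_two_gaussians`, `posterior_compromised`, `mito_threshold`), the restriction to two mixture components, the EM initialisation, and the simulated cluster are all assumptions made for illustration. The sketch fits a one-dimensional Gaussian mixture to the mitochondrial fractions of a single cluster, treats the higher-mean component as the compromised one, and returns the smallest fraction at which its posterior probability reaches the chosen confidence.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200, min_sd=1e-3):
    """Fit a two-component 1D Gaussian mixture by EM (hypothetical helper).

    Returns (weights, means, sds), with component 0 the lower-mean one.
    """
    xs = sorted(values)
    n = len(xs)
    # Assumed initialisation: one component in the bulk, one in the upper tail.
    means = [xs[n // 4], xs[(9 * n) // 10]]
    spread = max(statistics.pstdev(xs), min_sd)
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each cell.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), min_sd)
    order = sorted(range(2), key=lambda k: means[k])
    return (
        [weights[k] for k in order],
        [means[k] for k in order],
        [sds[k] for k in order],
    )


def posterior_compromised(x, params):
    """Posterior probability that x belongs to the higher-mean component."""
    weights, means, sds = params
    p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
    total = sum(p)
    return p[1] / total if total > 0 else 0.0


def mito_threshold(values, params, confidence=0.95, n_grid=1000):
    """Smallest fraction above the lower mean whose posterior reaches confidence.

    Returns None if the posterior never reaches the confidence on the grid.
    """
    lo, hi = params[1][0], max(values)
    for i in range(n_grid + 1):
        x = lo + (hi - lo) * i / n_grid
        if posterior_compromised(x, params) >= confidence:
            return x
    return None


# Simulated mitochondrial fractions for one cluster (illustrative values only).
random.seed(0)
cluster = [random.gauss(0.05, 0.01) for _ in range(300)]
cluster += [random.gauss(0.25, 0.04) for _ in range(60)]

params = fit_two_gaussians(cluster)
cutoff = mito_threshold(cluster, params, confidence=0.95)
kept = [x for x in cluster if cutoff is None or x < cutoff]
print("cutoff:", cutoff, "kept:", len(kept), "of", len(cluster))
```

In a cell-type-aware workflow this fit would be repeated per cluster, so each cluster receives its own cutoff rather than a single global percentage. Raising `confidence` makes the filter more conservative about discarding cells.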
Drug treatment can control HIV-1 replication, but it cannot cure infection. This is because of a long-lived population of quiescent infected cells, known as the latent reservoir (LR), that can restart active replication even after decades of successful drug treatment. Many cells in the LR belong to highly expanded clones, but the processes underlying the clonal structure of the LR are unclear. Understanding the dynamics of the LR and the keys to its persistence is critical for developing an HIV-1 cure. Here we develop a quantitative model of LR dynamics that fits available patient data over time scales spanning from days to decades. We show that the interplay between antigenic stimulation and clonal heterogeneity shapes the dynamics of the LR. In particular, we find that large clones play a central role in long-term persistence, even though they rarely reactivate. Our results could inform the development of HIV-1 cure strategies.
We present an easy-to-use, nonlinear filter for effective background identification in fluorescence microscopy images with dense and low-contrast foreground. The pixel-wise filtering is based on comparing each pixel's intensity with the mean intensity of the pixels in its local neighborhood. The pixel is given a background or foreground label depending on whether its intensity is less than or greater than the mean, respectively. Multiple labels are generated for the same pixel by recomputing the local mean over neighborhoods of varying size. These labels are accumulated to decide the final pixel label. We demonstrate that the performance of our filter compares favorably with state-of-the-art image processing, machine learning, and deep learning methods. We present three use cases that demonstrate its effectiveness, and also show how it can be used in multiplexed fluorescence imaging contexts and as a denoising step in image segmentation. A fast implementation of the filter is available in Python 3 on GitHub.
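The labelling scheme described above can be sketched in a few lines of plain Python. This is an illustrative reading of the abstract, not the published implementation: the function names (`local_mean`, `background_mask`), the square window that includes the pixel itself, the border clipping, the treatment of ties as foreground, and the majority-vote rule for accumulating labels are all assumptions, since the abstract does not specify them.

```python
def local_mean(image, row, col, radius):
    """Mean intensity in the square window of the given radius around a pixel.

    The window includes the pixel itself and is clipped at the image border
    (both are assumptions of this sketch).
    """
    n_rows, n_cols = len(image), len(image[0])
    total, count = 0.0, 0
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            total += image[r][c]
            count += 1
    return total / count


def background_mask(image, radii=(1, 2, 3)):
    """Return a 2D list of booleans, True where a pixel is labelled background.

    Each neighborhood size casts one vote: background if the pixel intensity
    is strictly below the local mean. A pixel is finally labelled background
    when more than half of the votes say so (assumed accumulation rule).
    """
    mask = []
    for row in range(len(image)):
        mask_row = []
        for col in range(len(image[0])):
            votes = sum(
                image[row][col] < local_mean(image, row, col, radius)
                for radius in radii
            )
            mask_row.append(2 * votes > len(radii))
        mask.append(mask_row)
    return mask


# Toy image: a bright 3x3 object on a dark 7x7 field (illustrative values).
image = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        image[r][c] = 10

mask = background_mask(image)
for mask_row in mask:
    print("".join("." if is_bg else "#" for is_bg in mask_row))
```

The nested loops keep the sketch dependency-free but scale poorly with image and window size; a practical implementation would compute the local means with an array library or a summed-area table.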
The diverse T cell receptor (TCR) repertoire confers the ability to recognize an almost unlimited array of antigens. Characterization of antigen specificity of tumor-infiltrating lymphocytes (TILs) is key for understanding antitumor immunity and for guiding the development of effective immunotherapies. Here, we report a large-scale comprehensive examination of the TCR landscape of TILs across the spectrum of pediatric brain tumors, the leading cause of cancer-related mortality in children. We show that a T cell clonality index can inform patient prognosis, where higher clonality is associated with more favorable outcomes. Moreover, assessment of TCR similarity groups revealed patient clusters with defined human leukocyte antigen associations. Computational analysis of these clusters identified putative tumor antigens and peptides as targets for antitumor T cell immunity, which were functionally validated by T cell stimulation assays in vitro. Together, this study presents a framework for tumor antigen prediction based on in situ and in silico TIL TCR analyses. We propose that TCR-based investigations should inform tumor classification and precision immunotherapy development.
Multiplexed imaging technologies have made it possible to interrogate complex tumor microenvironments at sub-cellular resolution within their native spatial context. However, proper quantification of this complexity requires the ability to easily and accurately segment cells into their sub-cellular compartments. Within the supervised learning paradigm, deep-learning-based segmentation methods demonstrating human-level performance have emerged. However, limited work has been done in developing such generalist methods within the label-free unsupervised context. Here we present an unsupervised segmentation (UNSEG) method that achieves deep-learning-level performance without requiring any training data. UNSEG leverages a Bayesian-like framework and the specificity of nucleus and cell membrane markers to construct an a posteriori probability estimate of each pixel belonging to the nucleus, cell membrane, or background. It uses this estimate to segment each cell into its nuclear and cell-membrane compartments. We show that UNSEG is more internally consistent and better at generalizing to the complexity of tissue morphology than current deep learning methods. This allows UNSEG to unambiguously identify the cytoplasmic compartment of a cell, which we employ to demonstrate its use in an exemplar biological scenario. Within the UNSEG framework, we also introduce a new perturbed watershed algorithm that stably and automatically segments a cluster of cell nuclei into individual nuclei, improving on the accuracy of classical watershed. Perturbed watershed can also be used as a standalone algorithm that researchers can incorporate within their supervised or unsupervised learning approaches to extend classical watershed, particularly in the multiplexed imaging context. Finally, as part of developing UNSEG, we have generated a high-quality annotated gastrointestinal tissue (GIT) dataset, which we anticipate will be useful for the broader research community.
We demonstrate the efficacy of UNSEG on the GIT dataset, on publicly available datasets, and on a range of practical scenarios. In these contexts, we also discuss the possibility of bias inherent in quantifying segmentation accuracy with the F1 score. Segmentation, despite its long antecedents, remains a challenging problem, particularly in the context of tissue samples. UNSEG, an easy-to-use algorithm, provides an unsupervised approach to overcome this bottleneck and, as we discuss, can help improve deep-learning-based segmentation methods by providing a bridge between unsupervised and supervised learning paradigms. (GitHub)