Publications

2025

Basher, Abdur Rahman M A, Caleb Hallinan, and Kwonmoo Lee. 2025. “Heterogeneity-Preserving Discriminative Feature Selection for Disease-Specific Subtype Discovery.” Nature Communications 16 (1): 3593. https://doi.org/10.1038/s41467-025-58718-1.

Disease-specific subtype identification can deepen our understanding of disease progression and pave the way for personalized therapies, given the complexity of disease heterogeneity. Large-scale transcriptomic, proteomic, and imaging datasets create opportunities for discovering subtypes but also pose challenges due to their high dimensionality. To mitigate this, many feature selection methods focus on selecting features that distinguish known diseases or cell states, yet often miss features that preserve heterogeneity and reveal new subtypes. To overcome this gap, we develop Preserving Heterogeneity (PHet), a statistical methodology that employs iterative subsampling and differential analysis of interquartile range, in conjunction with Fisher's method, to identify a small set of features that enhance subtype clustering quality. Here, we show that this method can maintain sample heterogeneity while distinguishing known disease/cell states, with a tendency to outperform previous differential expression and outlier-based methods, indicating its potential to advance our understanding of disease mechanisms and cell differentiation.
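As an illustration of one ingredient named in the abstract above, the following is a minimal sketch of Fisher's method for combining independent p-values, such as those that could arise from repeated subsampling rounds. It is not the PHet implementation; the function name `fisher_combine` and the input format are assumptions made for this example. It relies on the closed form of the chi-square survival function for an even number of degrees of freedom, so only the standard library is needed.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (hypothetical helper).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k = len(p_values).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2

    # Chi-square survival function for 2k degrees of freedom:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Example: p-values for one feature from three hypothetical subsampling rounds.
    print(fisher_combine([0.04, 0.10, 0.03]))
```

With a single p-value the combined value equals the input, which is a quick sanity check on the formula.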

Song, Tzu-Hsi, Leonardo Clemente, Xiang Pan, Junbong Jang, Mauricio Santillana, and Kwonmoo Lee. 2025. “Fine-Grained Forecasting of COVID-19 Trends at the County Level in the United States.” NPJ Digital Medicine 8 (1): 204. https://doi.org/10.1038/s41746-025-01606-1 (co-last authors: KL, MS).

The novel coronavirus (COVID-19) pandemic has had a devastating global impact, profoundly affecting daily life, healthcare systems, and public health infrastructure. Despite the availability of treatments and vaccines, hospitalizations and deaths continue. Real-time surveillance of infection trends supports resource allocation and mitigation strategies, but reliable forecasting remains a challenge. While deep learning has advanced time-series forecasting, its effectiveness relies on large datasets, a significant obstacle given the pandemic's evolving nature. Most models use national or state-level data, limiting both dataset size and the granularity of insights. To address this, we propose the Fine-Grained Infection Forecast Network (FIGI-Net), a stacked bidirectional LSTM structure designed to leverage county-level data to produce daily forecasts up to two weeks in advance. FIGI-Net outperforms existing models, accurately predicting sudden changes such as new outbreaks or peaks, a capability many state-of-the-art models lack. This approach could enhance public health responses and outbreak preparedness.

Busatto, Sara, Tzu-Hsi Song, Hyung Joon Kim, Caleb Hallinan, Michael N Lombardo, Anat O Stemmer-Rachamimov, Kwonmoo Lee, and Marsha A Moses. 2025. “Breast Cancer-Derived Extracellular Vesicles Modulate the Cytoplasmic and Cytoskeletal Dynamics of Blood-Brain Barrier Endothelial Cells.” Journal of Extracellular Vesicles 14 (1): e70038. https://doi.org/10.1002/jev2.70038.

Extracellular vesicles (EVs) from brain-seeking breast cancer cells (Br-EVs) breach the blood-brain barrier (BBB) via transcytosis and promote brain metastasis. Here, we defined the mechanisms by which Br-EVs modulate brain endothelial cell (BEC) dynamics to facilitate their BBB transcytosis. BECs treated with Br-EVs show significant downregulation of Rab11fip2, known to promote vesicle recycling to the plasma membrane, and significant upregulation of Rab11fip3 and Rab11fip5, which support structural stability of the endosomal compartment and facilitate vesicle recycling and transcytosis, respectively. Using machine learning and quantitative global proteomics, we identified novel Br-EV-induced changes in BEC morphology, motility, and proteome that correlate with decreased BEC cytoplasm and cytoskeletal organization and dynamics. These results define early steps leading to breast-to-brain metastasis and identify molecules that could serve as targets for therapeutic strategies for brain metastasis.

2024

Wang, Chuangqi, Hee June Choi, Lucy Woodbury, and Kwonmoo Lee. 2024. “Interpretable Fine-Grained Phenotypes of Subcellular Dynamics via Unsupervised Deep Learning.” Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 11 (41): e2403547. https://doi.org/10.1002/advs.202403547.

Uncovering fine-grained phenotypes of live cell dynamics is pivotal for a comprehensive understanding of the heterogeneity in healthy and diseased biological processes. However, this endeavor poses significant technical challenges for unsupervised machine learning, requiring the extraction of features that not only faithfully preserve this heterogeneity but also effectively discriminate between established biological states, all while remaining interpretable. To tackle these challenges, a self-training deep learning framework designed for fine-grained and interpretable phenotyping is presented. This framework incorporates an unsupervised teacher model with interpretable features to facilitate feature learning in a student deep neural network (DNN). Significantly, an autoencoder-based regularizer is designed to encourage the student DNN to maximize the heterogeneity associated with molecular perturbations. This method enables the acquisition of features with enhanced discriminatory power, while simultaneously preserving the heterogeneity associated with molecular perturbations. This study successfully delineated fine-grained phenotypes within the heterogeneous protrusion dynamics of migrating epithelial cells, revealing specific responses to pharmacological perturbations. Remarkably, this framework adeptly captured a concise set of highly interpretable features uniquely linked to these fine-grained phenotypes, each corresponding to specific temporal intervals crucial for their manifestation. This unique capability establishes it as a valuable tool for investigating diverse cellular dynamics and their heterogeneity.

Moon, D., S. Kim, C. Wang, K. Lee, and J. Doh. 2024. “Deep Learning-Based Automated Analysis of NK Cell Cytotoxicity in Single Cancer Cell Arrays.” BioChip Journal 18. https://doi.org/10.1007/s13206-024-00158-y (co-last authors: JD and KL).

The cytotoxicity assay of immune cells based on live cell imaging offers comprehensive information at the single-cell level, but data acquisition and analysis are labor-intensive. To overcome this limitation, we previously developed single cancer cell arrays that immobilize cancer cells in microwells as single cell arrays, thus allowing high-throughput data acquisition. In this study, we utilize deep learning to automatically analyze NK cell cytotoxicity in the context of single cancer cell arrays. Defined cancer cell position and the separation of NK cells and cancer cells along distinct optical planes facilitate segmentation and classification by deep learning. Various deep learning models are evaluated to determine the most appropriate model. The results of the deep learning-based automated data analysis are consistent with those of the previous manual analysis. The integration of the microwell platform and deep learning would present new opportunities for the analysis of cell–cell interactions.

Song, Tzu-Hsi, Mengzhi Cao, Jouha Min, and Hyungsoon Im and. 2024. “Interpretable Deep Learning for Breast Cancer Cell Phenotyping Using Diffraction Images from Lens-Free Digital In-Line Holography”. BioRxiv. https://doi.org/10.1101/2021.05.29.446284.

Lens-free digital in-line holography (LDIH) offers a wide field of view at micrometer-scale resolution, surpassing the capabilities of lens-based microscopes, making it a promising diagnostic tool for high-throughput cellular analysis. However, the complex nature of holograms renders them challenging for human interpretation, necessitating time-consuming computational processing to reconstruct object images. To address this, we present HoloNet, a novel deep learning architecture specifically designed for direct analysis of holographic images from LDIH in cellular phenotyping. HoloNet extracts both global features from diffraction patterns and local features from convolutional layers, achieving superior performance and interpretability compared to other deep learning methods. By leveraging raw holograms of breast cancer cells stained with well-known markers ER/PR and HER2, HoloNet demonstrates its effectiveness in classifying breast cancer cell types and quantifying molecular marker intensities. Furthermore, we introduce the feature-fusion HoloNet model, which extracts diffraction features associated with breast cancer cell types and their marker intensities. This hologram embedding approach allows for the identification of previously unknown subtypes of breast cancer cells, facilitating a comprehensive analysis of cell phenotype heterogeneity, leading to precise breast cancer diagnosis.

In this paper, a new biological modeling approach is proposed for predicting complex heterogeneous subcellular behaviors. Cell protrusion, which initiates cell migration, has a significant amount of subcellular heterogeneity at micrometer length and minute time scales. It is driven by actin polymerization, e.g., pushing the plasma membrane forward, and then regulated by a multitude of actin regulators. While mathematical modeling is central to system-level understanding of cell protrusion, most of the modeling is based on the ensemble average of actin regulator dynamics at the cellular or population levels, which prevents it from capturing heterogeneous cellular activities. With these in mind, a systematic modeling framework is proposed in this paper for predicting velocities of heterogeneous protrusion of migrating cells driven by multiple molecular mechanisms. The modeling framework is developed through the integration of multiple AutoRegressive eXogenous (ARX) models employing probability density input variables. Unlike conventional ARX models, it provides an effective framework for modeling heterogeneous subcellular behaviors with complex nonlinearities and uncertainties of dynamic systems. To train and validate the proposed model, numerous subcellular time series are extracted from time-lapse movies of migrating PtK1 cells acquired with a spinning disk confocal microscope. The current edge velocities and the fluorescence intensities of mDia1 and actin at the leading edge are used as inputs, while the future cell edge velocities are selected as the output. It is demonstrated that the proposed approach is highly effective in predicting the future trends of heterogeneous cell protrusion. In particular, by capturing the multiple activities in the dataset, it is expected to improve the understanding of the molecular mechanisms underlying cellular and subcellular heterogeneity.
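For readers unfamiliar with ARX models, the sketch below shows a one-step prediction with a single conventional ARX model, where the next output is a weighted sum of past outputs and past exogenous inputs. This is only the textbook baseline that the abstract contrasts against, not the multiple-ARX framework with probability density inputs described above. The function name `arx_predict` and the coefficient values in the example are hypothetical.

```python
def arx_predict(y_hist, u_hist, a, b):
    """One-step ARX prediction (hypothetical helper, conventional single model).

    y_hat(t) = sum_i a[i] * y(t-1-i) + sum_j b[j] * u(t-1-j)

    y_hist: past outputs, most recent last (e.g. edge velocities)
    u_hist: past exogenous inputs, most recent last (e.g. an intensity signal)
    a, b:   autoregressive and exogenous coefficients, lag 1 first
    """
    if len(y_hist) < len(a) or len(u_hist) < len(b):
        raise ValueError("history is shorter than the model order")
    ar_part = sum(a[i] * y_hist[-1 - i] for i in range(len(a)))
    ex_part = sum(b[j] * u_hist[-1 - j] for j in range(len(b)))
    return ar_part + ex_part


if __name__ == "__main__":
    # Illustrative numbers only; real coefficients would be fitted to data.
    print(arx_predict([1.0, 2.0], [0.5, 1.0], a=[0.5, 0.25], b=[2.0]))
```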

2023

Jang, Junbong, Young H Kim, Brian Westgate, Yang Zong, Caleb Hallinan, Ali Akalin, and Kwonmoo Lee. 2023. “Screening Adequacy of Unstained Thyroid Fine Needle Aspiration Samples Using a Deep Learning-Based Classifier.” Scientific Reports 13 (1): 13525. https://doi.org/10.1038/s41598-023-40652-1 (co-last authors: KL, YHK, AA).

Fine needle aspiration (FNA) biopsy of thyroid nodules is a safe, cost-effective, and accurate diagnostic method for detecting thyroid cancer. However, about 10% of initial FNA biopsy samples from patients are non-diagnostic and require repeated FNA, which delays the diagnosis and appropriate care. On-site evaluation of the FNA sample can be performed to filter out non-diagnostic FNA samples. Unfortunately, it involves a time-consuming staining process, and a cytopathologist has to be present at the time of FNA. To bypass the staining process and expert interpretation of FNA specimens at the clinic, we developed a deep learning-based ensemble model termed FNA-Net that allows in situ screening of the adequacy of unstained thyroid FNA samples smeared on a glass slide, which can decrease the non-diagnostic rate in thyroid FNA. FNA-Net combines two deep learning models, a patch-based whole slide image classifier and Faster R-CNN, to detect follicular clusters with high precision. Then, FNA-Net classifies a sample slide as non-diagnostic if the total number of detected follicular clusters is less than a predetermined threshold. With bootstrapped sampling, FNA-Net achieved a 0.81 F1 score and 0.84 AUC in the precision-recall curve for detecting non-diagnostic slides with fewer than six follicular clusters. We expect that FNA-Net can dramatically reduce the diagnostic cost associated with FNA biopsy and improve the quality of patient care.
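The final decision rule described in the abstract above, counting detected follicular clusters and comparing against a threshold, can be sketched as follows. This covers only the thresholding step and the F1 metric, not the patch classifier or Faster R-CNN detector. The function names are hypothetical, and the default threshold of six is taken from the abstract's description of non-diagnostic slides.

```python
def is_non_diagnostic(num_follicular_clusters, threshold=6):
    """Flag a slide as non-diagnostic when too few clusters were detected.

    Hypothetical helper: the cluster count is assumed to come from an
    upstream detector; threshold=6 follows the abstract's description.
    """
    if num_follicular_clusters < 0:
        raise ValueError("cluster count cannot be negative")
    return num_follicular_clusters < threshold


def f1_score(tp, fp, fn):
    """F1 score (harmonic mean of precision and recall) from raw counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative cluster counts for four made-up slides.
    for count in [0, 5, 6, 12]:
        print(count, is_non_diagnostic(count))
```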

Biber, John C, Andra Sullivan, Joseph A Brazzo, Yuna Heo, Bat-Ider Tumenbayar, Amanda Krajnik, Kerry E Poppenberg, et al. 2023. “Survivin As a Mediator of Stiffness-Induced Cell Cycle Progression and Proliferation of Vascular Smooth Muscle Cells.” APL Bioengineering 7 (4): 046108. https://doi.org/10.1063/5.0150532.

Stiffened arteries are a pathology of atherosclerosis, hypertension, and coronary artery disease and a key risk factor for cardiovascular disease events. The increased stiffness of arteries triggers a phenotypic switch, hypermigration, and hyperproliferation of vascular smooth muscle cells (VSMCs), leading to neointimal hyperplasia and accelerated neointima formation. However, the mechanism underlying this trigger remains unknown. Our analyses of whole-transcriptome microarray data from mouse VSMCs cultured on stiff hydrogels simulating arterial pathology identified 623 genes that were significantly and differentially expressed (360 upregulated and 263 downregulated) relative to expression in VSMCs cultured on soft hydrogels. Functional enrichment and gene network analyses revealed that these stiffness-sensitive genes are linked to cell cycle progression and proliferation. Importantly, we found that survivin, an inhibitor of apoptosis protein, mediates stiffness-dependent cell cycle progression and proliferation as determined by gene network and pathway analyses, RT-qPCR, immunoblotting, and cell proliferation assays. Furthermore, we found that inhibition of cell cycle progression did not reduce survivin expression, suggesting that survivin functions as an upstream regulator of cell cycle progression and proliferation in response to ECM stiffness. Mechanistically, we found that the stiffness signal is mechanotransduced via the FAK-E2F1 signaling axis to regulate survivin expression, establishing a regulatory pathway for how the stiffness of the cellular microenvironment affects VSMC behaviors. Overall, our findings indicate that survivin is necessary for VSMC cycling and proliferation and plays a role in regulating stiffness-responsive phenotypes.

Krajnik, Amanda, Erik Nimmer, Joseph A Brazzo, John C Biber, Rhonda Drewes, Bat-Ider Tumenbayar, Andra Sullivan, et al. 2023. “Survivin Regulates Intracellular Stiffness and Extracellular Matrix Production in Vascular Smooth Muscle Cells.” APL Bioengineering 7 (4): 046104. https://doi.org/10.1063/5.0157549.

Vascular dysfunction is a common cause of cardiovascular diseases characterized by the narrowing and stiffening of arteries, such as atherosclerosis, restenosis, and hypertension. Arterial narrowing results from the aberrant proliferation of vascular smooth muscle cells (VSMCs) and their increased synthesis and deposition of extracellular matrix (ECM) proteins. These, in turn, are modulated by arterial stiffness, but the mechanism for this is not fully understood. We found that survivin is an important regulator of stiffness-mediated ECM synthesis and intracellular stiffness in VSMCs. Whole-transcriptome analysis and cell culture experiments showed that survivin expression is upregulated in injured femoral arteries in mice and in human VSMCs cultured on stiff fibronectin-coated hydrogels. Suppressed expression of survivin in human VSMCs significantly decreased the stiffness-mediated expression of ECM components related to arterial stiffening, such as collagen-I, fibronectin, and lysyl oxidase. By contrast, expression of these ECM proteins was rescued by ectopic expression of survivin in human VSMCs cultured on soft hydrogels. Interestingly, atomic force microscopy analysis showed that suppressed or ectopic expression of survivin decreases or increases intracellular stiffness, respectively. Furthermore, we observed that inhibiting Rac and Rho reduces survivin expression, elucidating a mechanical pathway connecting intracellular tension, mediated by Rac and Rho, to survivin induction. Finally, we found that survivin inhibition decreases FAK phosphorylation, indicating that survivin-dependent intracellular tension feeds back to maintain signaling through FAK. These findings suggest a novel mechanism by which survivin potentially modulates arterial stiffness.