Publications

Submitted

Song, Tzu-Hsi, Leonardo Clemente, Xiang Pan, and . Submitted. “Fine-Grained Forecasting of COVID-19 Trends at the County Level in the United States”. MedRxiv (In Revision) (Submitted). https://doi.org/10.1101/2024.01.13.24301248.

The coronavirus (COVID-19) pandemic has profoundly impacted daily life, society, healthcare systems, and global health policies. It has resulted in more than one hundred million infections and the loss of many lives. Although treatment for the coronavirus is now available, effective forecasting of COVID-19 infection is of the utmost importance in helping public health officials make critical decisions. However, forecasting COVID-19 trends through time-series analysis poses significant challenges due to the data’s inherently dynamic, transient, and noise-prone nature. In this study, we developed the Fine-Grained Infection Forecast Network (FIGI-Net) model, which provides accurate forecasts of COVID-19 trends up to two weeks in advance. FIGI-Net addresses the current limitations in COVID-19 forecasting by leveraging fine-grained county-level data and a stacked bidirectional LSTM structure. We employ a pre-trained model to capture essential global infection patterns. These pre-trained parameters are then transferred to train localized sub-models for county clusters exhibiting comparable infection dynamics. The model handles the sudden changes and rapid fluctuations that are frequently observed across times and locations in county-level data, ultimately improving the accuracy of COVID-19 infection forecasting at the county, state, and national levels. The FIGI-Net model demonstrated significant improvement over other deep learning-based models and state-of-the-art COVID-19 forecasting models across various standard evaluation metrics. Notably, the FIGI-Net model excels at forecasting the direction of infection trends, especially during the initial phases of different COVID-19 outbreak waves.
Our study underscores the effectiveness and superiority of our time-series deep learning-based methods in addressing dynamic and sudden changes in infection numbers over short time periods. These capabilities facilitate efficient public health management and the early implementation of COVID-19 transmission prevention measures.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by NIH, United States (Grant Number: R35GM133725).
The data used in this study are publicly available and consist of daily cumulative COVID-19 infection and death cases reported for U.S. counties. The dataset was obtained from the Johns Hopkins Center for Systems Science and Engineering (CSSE) Coronavirus Resource Center and spans January 21st, 2020, to April 16th, 2022 [14]. It can be accessed directly from the Johns Hopkins CSSE repository (https://github.com/CSSEGISandData/COVID-19). Researchers interested in using the data for further analysis can refer to the original source for detailed documentation on data collection methods and definitions. For additional information or inquiries about the dataset, please visit the website or contact the Johns Hopkins CSSE Coronavirus Resource Center.

Basher, Abdur Rahman M. A., Caleb Hallinan, and Kwonmoo Lee. Submitted. “Heterogeneity-Preserving Discriminative Feature Selection for Disease-Specific Subtype Discovery”. BioRxiv (In Final Revision) (Submitted). https://doi.org/10.1101/2023.05.14.540686v3.

The discovery of subtypes is pivotal for disease diagnosis and targeted therapy, considering the diverse responses of different cells or patients to specific treatments. Exploring the heterogeneity within disease or cell states provides insights into disease progression mechanisms and cell differentiation. The advent of high-throughput technologies has enabled the generation and analysis of various molecular data types, such as single-cell RNA-seq, proteomic, and imaging datasets, at large scales. While presenting opportunities for subtype discovery, these datasets pose challenges in finding relevant signatures due to their high dimensionality. Feature selection, a crucial step in the analysis pipeline, involves choosing signatures that reduce the feature size for more efficient downstream computational analysis. Numerous existing methods focus on selecting signatures that differentiate known diseases or cell states, yet they often fall short in identifying features that preserve heterogeneity and reveal subtypes. To identify features that can capture the diversity within each class while also maintaining the discrimination of known disease states, we employed deep metric learning-based feature embedding to conduct a detailed exploration of the statistical properties of features essential in preserving heterogeneity. Our analysis revealed that features with a significant difference in interquartile range (IQR) between classes possess crucial subtype information. Guided by this insight, we developed a robust statistical method, termed PHet (Preserving Heterogeneity) that performs iterative subsampling differential analysis of IQR and Fisher’s method between classes, identifying a minimal set of heterogeneity-preserving discriminative features to optimize subtype clustering quality. 
Validation using public single-cell RNA-seq and microarray datasets showcased PHet’s effectiveness in preserving sample heterogeneity while maintaining discrimination of known disease/cell states, surpassing the performance of previous outlier-based methods. Furthermore, analysis of a single-cell RNA-seq dataset from mouse tracheal epithelial cells revealed, through PHet-based features, the presence of two distinct basal cell subtypes undergoing differentiation toward a luminal secretory phenotype. Notably, one of these subtypes exhibited high expression of BPIFA1. Interestingly, previous studies have linked BPIFA1 secretion to the emergence of secretory cells during mucociliary differentiation of airway epithelial cells. PHet successfully pinpointed the basal cell subtype associated with this phenomenon, a distinction that pre-annotated markers and dispersion-based features failed to make due to their admixed feature expression profiles. These findings underscore the potential of our method to deepen our understanding of the mechanisms underlying diseases and cell differentiation and contribute significantly to personalized medicine.
Competing Interest Statement: The authors have declared no competing interest.
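
The observation above, that features whose interquartile range (IQR) differs strongly between classes carry subtype information, can be illustrated with a minimal sketch. This is not the PHet implementation: it omits the iterative subsampling and Fisher's method described in the abstract, and the function names and toy data are hypothetical. It only scores each feature by the absolute difference in IQR between two classes.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def iqr_difference_scores(samples, labels):
    """Score each feature by |IQR(class 0) - IQR(class 1)|.

    samples: list of rows, one row of feature values per sample.
    labels:  list of 0/1 class labels, one per sample.
    (Hypothetical helper for illustration; not part of PHet.)
    """
    n_features = len(samples[0])
    scores = []
    for j in range(n_features):
        class0 = [row[j] for row, lab in zip(samples, labels) if lab == 0]
        class1 = [row[j] for row, lab in zip(samples, labels) if lab == 1]
        scores.append(abs(iqr(class0) - iqr(class1)))
    return scores


# Toy data: feature 0 is shifted between classes (same spread),
# feature 1 has a wider spread in class 1.
control = [(i, 10 + i) for i in range(8)]
case = [(i + 5, 10 + 3 * i) for i in range(8)]
samples = control + case
labels = [0] * len(control) + [1] * len(case)

scores = iqr_difference_scores(samples, labels)
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In this toy example the mean-shifted feature scores near zero while the feature with a class-specific spread ranks first, which is the kind of signal a purely mean-based differential test would tend to miss.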

Song, Tzu-Hsi, Mengzhi Cao, Jouha Min, and Hyungsoon Im and. Submitted. “Interpretable Deep Learning for Breast Cancer Cell Phenotyping Using Diffraction Images from Lens-Free Digital In-Line Holography”. BioRxiv, Submitted. https://doi.org/10.1101/2021.05.29.446284.

Lens-free digital in-line holography (LDIH) offers a wide field of view at micrometer-scale resolution, surpassing the capabilities of lens-based microscopes, making it a promising diagnostic tool for high-throughput cellular analysis. However, the complex nature of holograms renders them challenging for human interpretation, necessitating time-consuming computational processing to reconstruct object images. To address this, we present HoloNet, a novel deep learning architecture specifically designed for direct analysis of holographic images from LDIH in cellular phenotyping. HoloNet extracts both global features from diffraction patterns and local features from convolutional layers, achieving superior performance and interpretability compared to other deep learning methods. By leveraging raw holograms of breast cancer cells stained with well-known markers ER/PR and HER2, HoloNet demonstrates its effectiveness in classifying breast cancer cell types and quantifying molecular marker intensities. Furthermore, we introduce the feature-fusion HoloNet model, which extracts diffraction features associated with breast cancer cell types and their marker intensities. This hologram embedding approach allows for the identification of previously unknown subtypes of breast cancer cells, facilitating a comprehensive analysis of cell phenotype heterogeneity, leading to precise breast cancer diagnosis.
Competing Interest Statement: The authors have declared no competing interest.

Kim, Y., T. Song, H. J. Choi, and K. Lee. Submitted. “Probing Cellular Heterogeneity through Fuzzy Time Series Forecasting Models of Leading Edge Dynamics”. Submitted.

In this paper, a new biological modeling approach is proposed for predicting complex heterogeneous subcellular behaviors. Cell protrusion, which initiates cell migration, exhibits substantial subcellular heterogeneity at micrometer length scales and minute time scales. It is driven by actin polymerization, which pushes the plasma membrane forward, and is regulated by a multitude of actin regulators. While mathematical modeling is central to a system-level understanding of cell protrusion, most models are based on the ensemble average of actin regulator dynamics at the cellular or population level, which prevents them from capturing heterogeneous cellular activities. With this in mind, a systematic modeling framework is proposed for predicting the velocities of heterogeneous protrusion of migrating cells driven by multiple molecular mechanisms. The framework is developed by integrating multiple AutoRegressive eXogenous (ARX) models that employ probability density input variables. Unlike conventional ARX models, it provides an effective framework for modeling heterogeneous subcellular behaviors with the complex nonlinearities and uncertainties of dynamic systems. To train and validate the proposed model, numerous subcellular time series are extracted from time-lapse movies of migrating PtK1 cells acquired with a spinning disk confocal microscope. The current edge velocities and the fluorescence intensities of mDia1 and actin at the leading edge are used as inputs, while the future cell edge velocities are selected as the output. It is demonstrated that the proposed approach is highly effective in predicting the future trends of heterogeneous cell protrusion. In particular, by capturing the multiple activities present in the dataset, it is expected to improve the understanding of the molecular mechanisms underlying cellular and subcellular heterogeneity.
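
For readers unfamiliar with ARX models, the following is a minimal sketch of a conventional single ARX one-step-ahead prediction, in which the next output is a weighted sum of past outputs and past exogenous inputs. It does not reproduce the paper's framework (the integration of multiple ARX models with probability density inputs); the function name, coefficient values, and sign convention are assumptions chosen for illustration.

```python
def arx_predict(y_hist, u_hist, a, b):
    """One-step-ahead prediction of a conventional ARX model.

    Convention assumed here:
        y[t] = sum_i a[i] * y[t-1-i] + sum_j b[j] * u[t-1-j]

    y_hist: past outputs, most recent last (e.g. edge velocities).
    u_hist: past exogenous inputs, most recent last (e.g. a marker intensity).
    a, b:   coefficient lists; a[0] and b[0] apply to the most recent values.
    """
    if len(y_hist) < len(a) or len(u_hist) < len(b):
        raise ValueError("not enough history for the given model orders")
    autoregressive = sum(ai * y_hist[-1 - i] for i, ai in enumerate(a))
    exogenous = sum(bj * u_hist[-1 - j] for j, bj in enumerate(b))
    return autoregressive + exogenous


# Illustrative values only; these are not fitted coefficients from the paper.
prediction = arx_predict(
    y_hist=[1.0, 2.0],   # two past outputs
    u_hist=[0.5, 1.0],   # two past inputs (only the latest is used, since len(b) == 1)
    a=[0.5, 0.25],
    b=[2.0],
)
```

In practice the coefficients would be estimated from data, for example by least squares over the extracted time series.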

2025

Busatto, Sara, Tzu-Hsi Song, Hyung Joon Kim, Caleb Hallinan, Michael N Lombardo, Anat O Stemmer-Rachamimov, Kwonmoo Lee, and Marsha A Moses. 2025. “Breast Cancer-Derived Extracellular Vesicles Modulate the Cytoplasmic and Cytoskeletal Dynamics of Blood-Brain Barrier Endothelial Cells”. Journal of Extracellular Vesicles 14 (1): e70038. https://doi.org/10.1002/jev2.70038.

Extracellular vesicles (EVs) from brain-seeking breast cancer cells (Br-EVs) breach the blood-brain barrier (BBB) via transcytosis and promote brain metastasis. Here, we defined the mechanisms by which Br-EVs modulate brain endothelial cell (BEC) dynamics to facilitate their BBB transcytosis. BECs treated with Br-EVs show significant downregulation of Rab11fip2, known to promote vesicle recycling to the plasma membrane, and significant upregulation of Rab11fip3 and Rab11fip5, which support structural stability of the endosomal compartment and facilitate vesicle recycling and transcytosis, respectively. Using machine learning and quantitative global proteomics, we identified novel Br-EV-induced changes in BEC morphology, motility, and proteome that correlate with decreased BEC cytoplasm and cytoskeletal organization and dynamics. These results define early steps leading to breast-to-brain metastasis and identify molecules that could serve as targets for therapeutic strategies for brain metastasis.

2024

Wang, Chuangqi, Hee June Choi, Lucy Woodbury, and Kwonmoo Lee. 2024. “Interpretable Fine-Grained Phenotypes of Subcellular Dynamics via Unsupervised Deep Learning”. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 11 (41): e2403547. https://doi.org/10.1002/advs.202403547.

Uncovering fine-grained phenotypes of live cell dynamics is pivotal for a comprehensive understanding of the heterogeneity in healthy and diseased biological processes. However, this endeavor poses significant technical challenges for unsupervised machine learning, requiring the extraction of features that not only faithfully preserve this heterogeneity but also effectively discriminate between established biological states, all while remaining interpretable. To tackle these challenges, a self-training deep learning framework designed for fine-grained and interpretable phenotyping is presented. This framework incorporates an unsupervised teacher model with interpretable features to facilitate feature learning in a student deep neural network (DNN). Significantly, an autoencoder-based regularizer is designed to encourage the student DNN to maximize the heterogeneity associated with molecular perturbations. This method enables the acquisition of features with enhanced discriminatory power, while simultaneously preserving the heterogeneity associated with molecular perturbations. This study successfully delineated fine-grained phenotypes within the heterogeneous protrusion dynamics of migrating epithelial cells, revealing specific responses to pharmacological perturbations. Remarkably, this framework adeptly captured a concise set of highly interpretable features uniquely linked to these fine-grained phenotypes, each corresponding to specific temporal intervals crucial for their manifestation. This unique capability establishes it as a valuable tool for investigating diverse cellular dynamics and their heterogeneity.

Moon, D., S. Kim, C. Wang, K. Lee, and J. Doh. 2024. “Deep Learning-Based Automated Analysis of NK Cell Cytotoxicity in Single Cancer Cell Arrays”. BioChip Journal 18. https://doi.org/10.1007/s13206-024-00158-y (co-last authors: JD and KL).

The cytotoxicity assay of immune cells based on live cell imaging offers comprehensive information at the single-cell level, but the data acquisition and analysis are labor-intensive. To overcome this limitation, we previously developed single cancer cell arrays that immobilize cancer cells in microwells as single cell arrays, thus allowing high-throughput data acquisition. In this study, we utilize deep learning to automatically analyze NK cell cytotoxicity in the context of single cancer cell arrays. The defined cancer cell positions and the separation of NK cells and cancer cells along distinct optical planes facilitate segmentation and classification by deep learning. Various deep learning models are evaluated to determine the most appropriate model. The results of the deep learning-based automated data analysis are consistent with those of the previous manual analysis. The integration of the microwell platform and deep learning would present new opportunities for the analysis of cell–cell interactions.

2023

Jang, Junbong, Young H Kim, Brian Westgate, Yang Zong, Caleb Hallinan, Ali Akalin, and Kwonmoo Lee. 2023. “Screening Adequacy of Unstained Thyroid Fine Needle Aspiration Samples Using a Deep Learning-Based Classifier”. Scientific Reports 13 (1): 13525. https://doi.org/10.1038/s41598-023-40652-1 (co-last authors: KL, YHK, AA).

Fine needle aspiration (FNA) biopsy of thyroid nodules is a safe, cost-effective, and accurate diagnostic method for detecting thyroid cancer. However, about 10% of initial FNA biopsy samples from patients are non-diagnostic and require repeated FNA, which delays the diagnosis and appropriate care. On-site evaluation of the FNA sample can be performed to filter out non-diagnostic FNA samples. Unfortunately, it involves a time-consuming staining process, and a cytopathologist has to be present at the time of FNA. To bypass the staining process and expert interpretation of FNA specimens at the clinic, we developed a deep learning-based ensemble model termed FNA-Net that allows in situ screening of the adequacy of unstained thyroid FNA samples smeared on a glass slide, which can decrease the non-diagnostic rate in thyroid FNA. FNA-Net combines two deep learning models, a patch-based whole slide image classifier and Faster R-CNN, to detect follicular clusters with high precision. FNA-Net then classifies a sample slide as non-diagnostic if the total number of detected follicular clusters is less than a predetermined threshold. With bootstrapped sampling, FNA-Net achieved a 0.81 F1 score and 0.84 AUC in the precision-recall curve for detecting non-diagnostic slides, defined as those with fewer than six follicular clusters. We expect that FNA-Net can dramatically reduce the diagnostic cost associated with FNA biopsy and improve the quality of patient care.
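
The slide-level decision described above (a slide is non-diagnostic when fewer follicular clusters are detected than a threshold) can be sketched as a simple counting rule. This is only an illustration of that final step, not FNA-Net itself: the detector is replaced by a list of confidence scores, and the function name and the `score_cutoff` parameter are assumptions. The default threshold of six follows the abstract.

```python
def is_non_diagnostic(detection_scores, min_clusters=6, score_cutoff=0.5):
    """Flag a slide as non-diagnostic when too few clusters are detected.

    detection_scores: confidence scores of candidate follicular clusters,
                      standing in for the output of an object detector.
    min_clusters:     slides with fewer detections than this are flagged
                      (six, per the abstract).
    score_cutoff:     hypothetical confidence cutoff for counting a detection;
                      not taken from the paper.
    """
    n_detected = sum(1 for score in detection_scores if score >= score_cutoff)
    return n_detected < min_clusters


# Illustrative scores only.
sparse_slide = [0.9, 0.8, 0.3, 0.2]                   # 2 confident detections
dense_slide = [0.9, 0.85, 0.8, 0.7, 0.65, 0.6, 0.55]  # 7 confident detections

sparse_flag = is_non_diagnostic(sparse_slide)
dense_flag = is_non_diagnostic(dense_slide)
```

Keeping the rule as a plain count makes the adequacy threshold easy to adjust separately from the detection models.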

Biber, John C, Andra Sullivan, Joseph A Brazzo, Yuna Heo, Bat-Ider Tumenbayar, Amanda Krajnik, Kerry E Poppenberg, et al. 2023. “Survivin As a Mediator of Stiffness-Induced Cell Cycle Progression and Proliferation of Vascular Smooth Muscle Cells”. APL Bioengineering 7 (4): 046108. https://doi.org/10.1063/5.0150532.

Stiffened arteries are a pathology of atherosclerosis, hypertension, and coronary artery disease and a key risk factor for cardiovascular disease events. The increased stiffness of arteries triggers a phenotypic switch, hypermigration, and hyperproliferation of vascular smooth muscle cells (VSMCs), leading to neointimal hyperplasia and accelerated neointima formation. However, the mechanism underlying this trigger remains unknown. Our analyses of whole-transcriptome microarray data from mouse VSMCs cultured on stiff hydrogels simulating arterial pathology identified 623 genes that were significantly and differentially expressed (360 upregulated and 263 downregulated) relative to expression in VSMCs cultured on soft hydrogels. Functional enrichment and gene network analyses revealed that these stiffness-sensitive genes are linked to cell cycle progression and proliferation. Importantly, we found that survivin, an inhibitor of apoptosis protein, mediates stiffness-dependent cell cycle progression and proliferation as determined by gene network and pathway analyses, RT-qPCR, immunoblotting, and cell proliferation assays. Furthermore, we found that inhibition of cell cycle progression did not reduce survivin expression, suggesting that survivin functions as an upstream regulator of cell cycle progression and proliferation in response to ECM stiffness. Mechanistically, we found that the stiffness signal is mechanotransduced via the FAK-E2F1 signaling axis to regulate survivin expression, establishing a regulatory pathway for how the stiffness of the cellular microenvironment affects VSMC behaviors. Overall, our findings indicate that survivin is necessary for VSMC cycling and proliferation and plays a role in regulating stiffness-responsive phenotypes.

Krajnik, Amanda, Erik Nimmer, Joseph A Brazzo, John C Biber, Rhonda Drewes, Bat-Ider Tumenbayar, Andra Sullivan, et al. 2023. “Survivin Regulates Intracellular Stiffness and Extracellular Matrix Production in Vascular Smooth Muscle Cells”. APL Bioengineering 7 (4): 046104. https://doi.org/10.1063/5.0157549.

Vascular dysfunction is a common cause of cardiovascular diseases characterized by the narrowing and stiffening of arteries, such as atherosclerosis, restenosis, and hypertension. Arterial narrowing results from the aberrant proliferation of vascular smooth muscle cells (VSMCs) and their increased synthesis and deposition of extracellular matrix (ECM) proteins. These, in turn, are modulated by arterial stiffness, but the mechanism for this is not fully understood. We found that survivin is an important regulator of stiffness-mediated ECM synthesis and intracellular stiffness in VSMCs. Whole-transcriptome analysis and cell culture experiments showed that survivin expression is upregulated in injured femoral arteries in mice and in human VSMCs cultured on stiff fibronectin-coated hydrogels. Suppressed expression of survivin in human VSMCs significantly decreased the stiffness-mediated expression of ECM components related to arterial stiffening, such as collagen-I, fibronectin, and lysyl oxidase. By contrast, expression of these ECM proteins was rescued by ectopic expression of survivin in human VSMCs cultured on soft hydrogels. Interestingly, atomic force microscopy analysis showed that suppressed or ectopic expression of survivin decreases or increases intracellular stiffness, respectively. Furthermore, we observed that inhibiting Rac and Rho reduces survivin expression, elucidating a mechanical pathway connecting intracellular tension, mediated by Rac and Rho, to survivin induction. Finally, we found that survivin inhibition decreases FAK phosphorylation, indicating that survivin-dependent intracellular tension feeds back to maintain signaling through FAK. These findings suggest a novel mechanism by which survivin potentially modulates arterial stiffness.