The coronavirus (COVID-19) pandemic has profoundly impacted daily life, society, healthcare systems, and global health policies. It has infected more than one hundred million people and, unfortunately, claimed many lives. Although treatment for the coronavirus is now available, effective forecasting of COVID-19 infection is of the utmost importance to aid public health officials in making critical decisions. However, forecasting COVID-19 trends through time-series analysis poses significant challenges due to the data’s inherently dynamic, transient, and noise-prone nature. In this study, we developed the Fine-Grained Infection Forecast Network (FIGI-Net) model, which provides accurate forecasts of COVID-19 trends up to two weeks in advance. FIGI-Net addresses current limitations in COVID-19 forecasting by leveraging fine-grained county-level data and a stacked bidirectional LSTM structure. We employ a pre-trained model to capture essential global infection patterns. The pre-trained parameters are then transferred to train localized sub-models for clusters of counties exhibiting comparable infection dynamics. The model handles the sudden changes and rapid fluctuations frequently observed across times and locations in county-level data, ultimately improving the accuracy of COVID-19 infection forecasting at the county, state, and national levels. The FIGI-Net model demonstrated significant improvement over other deep learning-based models and state-of-the-art COVID-19 forecasting models across various standard evaluation metrics. Notably, the FIGI-Net model excels at forecasting the direction of infection trends, especially during the initial phases of different COVID-19 outbreak waves.
Our study underscores the effectiveness and superiority of our time-series deep learning-based methods in addressing dynamic and sudden changes in infection numbers over short time periods. These capabilities facilitate efficient public health management and the early implementation of COVID-19 transmission prevention measures.
Competing Interest Statement: The authors have declared no competing interest.
Funding Statement: This work was supported by NIH, United States (Grant Number: R35GM133725).
Data Availability: The data used in this study are publicly available and consist of daily cumulative COVID-19 infection and death counts reported for U.S. counties.
The dataset was obtained from the Johns Hopkins Center for Systems Science and Engineering (CSSE) Coronavirus Resource Center and spans January 21st, 2020, to April 16th, 2022 [14]. It can be accessed directly from the Johns Hopkins CSSE repository (https://github.com/CSSEGISandData/COVID-19). Researchers interested in using the data for further analysis can refer to the original source for detailed documentation on data collection methods and definitions. For additional information or inquiries about the dataset, please visit the website or contact the Johns Hopkins CSSE Coronavirus Resource Center.
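Because the source data are cumulative counts while the forecasts target infection trends up to two weeks ahead, a typical first step is to turn each county series into daily increments and then cut it into input/target windows. The following is a minimal sketch of that preparation, not the FIGI-Net code: the function names, the toy series, the window lengths, and the choice to clip negative corrections to zero are all assumptions made for illustration.

```python
from typing import List, Tuple


def daily_new_cases(cumulative: List[int]) -> List[int]:
    """Convert cumulative counts into daily increments.

    Assumption: downward revisions in the cumulative series are clipped
    to zero rather than kept as negative daily counts.
    """
    return [max(later - earlier, 0)
            for earlier, later in zip(cumulative, cumulative[1:])]


def make_windows(series: List[int], history: int,
                 horizon: int) -> List[Tuple[List[int], List[int]]]:
    """Pair each `history`-day input window with the next `horizon` days."""
    samples = []
    for start in range(len(series) - history - horizon + 1):
        inputs = series[start:start + history]
        targets = series[start + history:start + history + horizon]
        samples.append((inputs, targets))
    return samples


# Toy cumulative series for one hypothetical county (note the dip at day 6).
cumulative = [0, 2, 5, 5, 9, 14, 13, 20]
daily = daily_new_cases(cumulative)

# Short windows keep the example readable; a two-week forecast would use
# horizon=14 on a much longer series.
windows = make_windows(daily, history=3, horizon=2)
for inputs, targets in windows:
    print(inputs, "->", targets)
```

Each `(inputs, targets)` pair could then serve as one training sample for a sequence model such as the stacked bidirectional LSTM described above.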
Publications
Submitted
The discovery of subtypes is pivotal for disease diagnosis and targeted therapy, considering the diverse responses of different cells or patients to specific treatments. Exploring the heterogeneity within disease or cell states provides insights into disease progression mechanisms and cell differentiation. The advent of high-throughput technologies has enabled the generation and analysis of various molecular data types, such as single-cell RNA-seq, proteomic, and imaging datasets, at large scales. While presenting opportunities for subtype discovery, these datasets pose challenges in finding relevant signatures due to their high dimensionality. Feature selection, a crucial step in the analysis pipeline, involves choosing signatures that reduce the number of features for more efficient downstream computational analysis. Numerous existing methods focus on selecting signatures that differentiate known diseases or cell states, yet they often fall short in identifying features that preserve heterogeneity and reveal subtypes. To identify features that capture the diversity within each class while also maintaining the discrimination of known disease states, we employed deep metric learning-based feature embedding to conduct a detailed exploration of the statistical properties of features essential in preserving heterogeneity. Our analysis revealed that features with a significant difference in interquartile range (IQR) between classes possess crucial subtype information. Guided by this insight, we developed a robust statistical method, termed PHet (Preserving Heterogeneity), that performs iterative subsampling differential analysis of IQR and Fisher’s method between classes, identifying a minimal set of heterogeneity-preserving discriminative features to optimize subtype clustering quality.
Validation using public single-cell RNA-seq and microarray datasets showcased PHet’s effectiveness in preserving sample heterogeneity while maintaining discrimination of known disease/cell states, surpassing the performance of previous outlier-based methods. Furthermore, analysis of a single-cell RNA-seq dataset from mouse tracheal epithelial cells revealed, through PHet-based features, the presence of two distinct basal cell subtypes undergoing differentiation toward a luminal secretory phenotype. Notably, one of these subtypes exhibited high expression of BPIFA1. Interestingly, previous studies have linked BPIFA1 secretion to the emergence of secretory cells during mucociliary differentiation of airway epithelial cells. PHet successfully pinpointed the basal cell subtype associated with this phenomenon, a distinction that pre-annotated markers and dispersion-based features failed to make due to their admixed feature expression profiles. These findings underscore the potential of our method to deepen our understanding of the mechanisms underlying diseases and cell differentiation and contribute significantly to personalized medicine.
Competing Interest Statement: The authors have declared no competing interest.
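The two statistical ingredients named in this abstract, a between-class difference in IQR and Fisher's method for combining p-values, can be illustrated with a short sketch. This is not the PHet implementation: it omits the iterative subsampling and does not say how the per-subsample p-values are obtained. The function names and toy values are assumptions chosen for illustration.

```python
import math
from statistics import quantiles
from typing import Sequence


def iqr(values: Sequence[float]) -> float:
    """Interquartile range (Q3 - Q1) using inclusive quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def delta_iqr(class_a: Sequence[float], class_b: Sequence[float]) -> float:
    """Absolute difference in IQR of one feature between two classes."""
    return abs(iqr(class_a) - iqr(class_b))


def fisher_combined_pvalue(pvalues: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom; for even degrees of freedom its survival
    function has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Toy feature: both classes have mean 5, but the case class is far more
# spread out, so a mean-based test would miss it while the IQR gap does not.
control = [4, 5, 5, 6, 5]
case = [1, 3, 5, 7, 9]
print(delta_iqr(control, case))

# Hypothetical p-values from repeated subsamples, combined into one value.
print(fisher_combined_pvalue([0.04, 0.10, 0.03]))
```

The toy feature shows why spread can carry subtype information that a difference in means does not: a subpopulation with distinct expression widens one class's distribution without necessarily shifting its center.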
2025
Extracellular vesicles (EVs) from brain-seeking breast cancer cells (Br-EVs) breach the blood-brain barrier (BBB) via transcytosis and promote brain metastasis. Here, we defined the mechanisms by which Br-EVs modulate brain endothelial cell (BEC) dynamics to facilitate their BBB transcytosis. BECs treated with Br-EVs show significant downregulation of Rab11fip2, known to promote vesicle recycling to the plasma membrane, and significant upregulation of Rab11fip3 and Rab11fip5, which support structural stability of the endosomal compartment and facilitate vesicle recycling and transcytosis, respectively. Using machine learning and quantitative global proteomics, we identified novel Br-EV-induced changes in BEC morphology, motility, and proteome that correlate with decreased BEC cytoplasm and cytoskeletal organization and dynamics. These results define early steps leading to breast-to-brain metastasis and identify molecules that could serve as targets for therapeutic strategies for brain metastasis.
2024
Uncovering fine-grained phenotypes of live cell dynamics is pivotal for a comprehensive understanding of the heterogeneity in healthy and diseased biological processes. However, this endeavor poses significant technical challenges for unsupervised machine learning, requiring the extraction of features that not only faithfully preserve this heterogeneity but also effectively discriminate between established biological states, all while remaining interpretable. To tackle these challenges, a self-training deep learning framework designed for fine-grained and interpretable phenotyping is presented. This framework incorporates an unsupervised teacher model with interpretable features to facilitate feature learning in a student deep neural network (DNN). Significantly, an autoencoder-based regularizer is designed to encourage the student DNN to maximize the heterogeneity associated with molecular perturbations. This method enables the acquisition of features with enhanced discriminatory power, while simultaneously preserving the heterogeneity associated with molecular perturbations. This study successfully delineated fine-grained phenotypes within the heterogeneous protrusion dynamics of migrating epithelial cells, revealing specific responses to pharmacological perturbations. Remarkably, this framework adeptly captured a concise set of highly interpretable features uniquely linked to these fine-grained phenotypes, each corresponding to specific temporal intervals crucial for their manifestation. This unique capability establishes it as a valuable tool for investigating diverse cellular dynamics and their heterogeneity.
The cytotoxicity assay of immune cells based on live cell imaging offers comprehensive information at the single-cell level, but data acquisition and analysis are labor-intensive. To overcome this limitation, we previously developed single cancer cell arrays that immobilize cancer cells in microwells, thus allowing high-throughput data acquisition. In this study, we utilize deep learning to automatically analyze NK cell cytotoxicity in the context of single cancer cell arrays. Defined cancer cell positions and the separation of NK cells and cancer cells along distinct optical planes facilitate segmentation and classification by deep learning. Various deep learning models are evaluated to determine the most appropriate model. The results of the deep learning-based automated data analysis are consistent with those of the previous manual analysis. The integration of the microwell platform and deep learning could present new opportunities for the analysis of cell–cell interactions.
2023
Fine needle aspiration (FNA) biopsy of thyroid nodules is a safe, cost-effective, and accurate diagnostic method for detecting thyroid cancer. However, about 10% of initial FNA biopsy samples from patients are non-diagnostic and require repeated FNA, which delays the diagnosis and appropriate care. On-site evaluation of the FNA sample can be performed to filter out non-diagnostic FNA samples. Unfortunately, it involves a time-consuming staining process, and a cytopathologist has to be present at the time of FNA. To bypass the staining process and expert interpretation of FNA specimens at the clinic, we developed a deep learning-based ensemble model, termed FNA-Net, that allows in situ screening of the adequacy of unstained thyroid FNA samples smeared on a glass slide, which can decrease the non-diagnostic rate in thyroid FNA. FNA-Net combines two deep learning models, a patch-based whole slide image classifier and Faster R-CNN, to detect follicular clusters with high precision. FNA-Net then classifies a sample slide as non-diagnostic if the total number of detected follicular clusters is less than a predetermined threshold. With bootstrapped sampling, FNA-Net achieved an F1 score of 0.81 and an area under the precision-recall curve of 0.84 for detecting non-diagnostic slides, defined as those with fewer than six follicular clusters. We expect that FNA-Net can dramatically reduce the diagnostic cost associated with FNA biopsy and improve the quality of patient care.
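The slide-level decision described above reduces to a count threshold applied to the detector output, scored against ground truth with the F1 metric. The sketch below illustrates only that final step. The detected counts and labels are made-up toy values, the function names are hypothetical, and the detection models themselves are not reproduced here.

```python
from typing import List


def is_non_diagnostic(cluster_count: int, threshold: int = 6) -> bool:
    """Flag a slide as non-diagnostic when fewer than `threshold`
    follicular clusters were detected (six, per the abstract)."""
    return cluster_count < threshold


def f1_score(y_true: List[bool], y_pred: List[bool]) -> float:
    """F1 for the positive class (here: non-diagnostic slides)."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: detected cluster counts per slide and hypothetical
# ground-truth labels (True = non-diagnostic).
detected_counts = [0, 3, 6, 12, 5, 8]
truth = [True, True, False, False, False, True]

predictions = [is_non_diagnostic(count) for count in detected_counts]
print(predictions)
print(round(f1_score(truth, predictions), 3))
```

In this toy data the detector undercounts one adequate slide and overcounts one inadequate slide, which shows how detection errors near the threshold translate directly into slide-level misclassifications.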
Stiffened arteries are a pathology of atherosclerosis, hypertension, and coronary artery disease and a key risk factor for cardiovascular disease events. The increased stiffness of arteries triggers a phenotypic switch, hypermigration, and hyperproliferation of vascular smooth muscle cells (VSMCs), leading to neointimal hyperplasia and accelerated neointima formation. However, the mechanism underlying this trigger remains unknown. Our analyses of whole-transcriptome microarray data from mouse VSMCs cultured on stiff hydrogels simulating arterial pathology identified 623 genes that were significantly and differentially expressed (360 upregulated and 263 downregulated) relative to expression in VSMCs cultured on soft hydrogels. Functional enrichment and gene network analyses revealed that these stiffness-sensitive genes are linked to cell cycle progression and proliferation. Importantly, we found that survivin, an inhibitor of apoptosis protein, mediates stiffness-dependent cell cycle progression and proliferation as determined by gene network and pathway analyses, RT-qPCR, immunoblotting, and cell proliferation assays. Furthermore, we found that inhibition of cell cycle progression did not reduce survivin expression, suggesting that survivin functions as an upstream regulator of cell cycle progression and proliferation in response to ECM stiffness. Mechanistically, we found that the stiffness signal is mechanotransduced via the FAK-E2F1 signaling axis to regulate survivin expression, establishing a regulatory pathway for how the stiffness of the cellular microenvironment affects VSMC behaviors. Overall, our findings indicate that survivin is necessary for VSMC cycling and proliferation and plays a role in regulating stiffness-responsive phenotypes.
Vascular dysfunction is a common cause of cardiovascular diseases characterized by the narrowing and stiffening of arteries, such as atherosclerosis, restenosis, and hypertension. Arterial narrowing results from the aberrant proliferation of vascular smooth muscle cells (VSMCs) and their increased synthesis and deposition of extracellular matrix (ECM) proteins. These, in turn, are modulated by arterial stiffness, but the mechanism for this is not fully understood. We found that survivin is an important regulator of stiffness-mediated ECM synthesis and intracellular stiffness in VSMCs. Whole-transcriptome analysis and cell culture experiments showed that survivin expression is upregulated in injured femoral arteries in mice and in human VSMCs cultured on stiff fibronectin-coated hydrogels. Suppressed expression of survivin in human VSMCs significantly decreased the stiffness-mediated expression of ECM components related to arterial stiffening, such as collagen-I, fibronectin, and lysyl oxidase. By contrast, expression of these ECM proteins was rescued by ectopic expression of survivin in human VSMCs cultured on soft hydrogels. Interestingly, atomic force microscopy analysis showed that suppressed or ectopic expression of survivin decreases or increases intracellular stiffness, respectively. Furthermore, we observed that inhibiting Rac and Rho reduces survivin expression, elucidating a mechanical pathway connecting intracellular tension, mediated by Rac and Rho, to survivin induction. Finally, we found that survivin inhibition decreases FAK phosphorylation, indicating that survivin-dependent intracellular tension feeds back to maintain signaling through FAK. These findings suggest a novel mechanism by which survivin potentially modulates arterial stiffness.