Publications

2022

Kim, Junho, August Yue Huang, Shelby L Johnson, Jenny Lai, Laura Isacco, Ailsa M Jeffries, Michael B Miller, Michael A Lodato, Christopher A Walsh, and Eunjung Alice Lee. 2022. “Prevalence and Mechanisms of Somatic Deletions in Single Human Neurons During Normal Aging and in DNA Repair Disorders.” Nature Communications 13 (1): 5918. https://doi.org/10.1038/s41467-022-33642-w.

Replication errors and various genotoxins cause DNA double-strand breaks (DSBs) where error-prone repair creates genomic mutations, most frequently focal deletions, and defective repair may lead to neurodegeneration. Despite its pathophysiological importance, the extent to which faulty DSB repair alters the genome, and the mechanisms by which mutations arise, have not been systematically examined, reflecting ineffective methods. Here, we develop PhaseDel, a computational method to detect focal deletions and characterize underlying mechanisms in single-cell whole genome sequences (scWGS). We analyzed high-coverage scWGS of 107 single neurons from 18 neurotypical individuals of various ages, and found that somatic deletions increased with age and in highly expressed genes in human brain. Our analysis of 50 single neurons from DNA repair-deficient diseases with progressive neurodegeneration (Cockayne syndrome, Xeroderma pigmentosum, and Ataxia telangiectasia) reveals elevated somatic deletions compared to age-matched controls. Distinctive mechanistic signatures and transcriptional associations suggest roles for somatic deletions in neurodegeneration.

Choudhury, Sangita, August Yue Huang, Junho Kim, Zinan Zhou, Katherine Morillo, Eduardo A Maury, Jessica W Tsai, et al. 2022. “Somatic Mutations in Single Human Cardiomyocytes Reveal Age-Associated DNA Damage and Widespread Oxidative Genotoxicity.” Nature Aging 2 (8): 714-25. https://doi.org/10.1038/s43587-022-00261-5.

The accumulation of somatic DNA mutations over time is a hallmark of aging in many dividing and nondividing cells but has not been studied in postmitotic human cardiomyocytes. Using single-cell whole-genome sequencing, we identified and characterized the landscape of somatic single-nucleotide variants (sSNVs) in 56 single cardiomyocytes from 12 individuals (aged from 0.4 to 82 years). Cardiomyocyte sSNVs accumulate with age at rates that are faster than in many dividing cell types and nondividing neurons. Cardiomyocyte sSNVs show distinctive mutational signatures that implicate failed nucleotide excision repair and base excision repair of oxidative DNA damage, and defective mismatch repair. Since age-accumulated sSNVs create many damaging mutations that disrupt gene functions, polyploidization in cardiomyocytes may provide a mechanism of genetic compensation to minimize the complete knockout of essential genes during aging. Age-related accumulation of cardiac mutations provides a paradigm to understand the influence of aging on cardiac dysfunction.

Bourseguin, Julie, Wen Cheng, Emily Talbot, Liana Hardy, Jenny Lai, Ailsa M Jeffries, Michael A Lodato, Eunjung Alice Lee, and Svetlana V Khoronenkova. 2022. “Persistent DNA Damage Associated With ATM Kinase Deficiency Promotes Microglial Dysfunction.” Nucleic Acids Research 50 (5): 2700-2718. https://doi.org/10.1093/nar/gkac104.

The autosomal recessive genome instability disorder Ataxia-telangiectasia, caused by mutations in ATM kinase, is characterized by the progressive loss of cerebellar neurons. We find that DNA damage associated with ATM loss results in dysfunctional behaviour of human microglia, immune cells of the central nervous system. Microglial dysfunction is mediated by the pro-inflammatory RELB/p52 non-canonical NF-κB transcriptional pathway and leads to excessive phagocytic clearance of neuronal material. Activation of the RELB/p52 pathway in ATM-deficient microglia is driven by persistent DNA damage and is dependent on the NIK kinase. Activation of non-canonical NF-κB signalling is also observed in cerebellar microglia of individuals with Ataxia-telangiectasia. These results provide insights into the underlying mechanisms of aberrant microglial behaviour in ATM deficiency, potentially contributing to neurodegeneration in Ataxia-telangiectasia.

Zhao, Boxun, Jill A Madden, Jasmine Lin, Gerard T Berry, Monica H Wojcik, Xuefang Zhao, Harrison Brand, Michael Talkowski, Eunjung Alice Lee, and Pankaj B Agrawal. 2022. “A Neurodevelopmental Disorder Caused by a Novel de Novo SVA Insertion in Exon 13 of the SRCAP Gene.” European Journal of Human Genetics 30 (9): 1083-87. https://doi.org/10.1038/s41431-022-01137-3.

Pathogenic variants in the SRCAP (SNF2-related CREBBP activator protein) gene, which encodes a chromatin-remodeling ATPase, cause neurodevelopmental disorders including Floating Harbor syndrome (FLHS). Here, we report the discovery of a de novo transposon insertion in SRCAP exon 13 from trio genome sequencing in a 28-year-old female with failure to thrive, developmental delay, mood disorder and seizure disorder. The insertion was a full-length (~2.8 kb), antisense-oriented SVA insertion relative to the SRCAP transcript, bearing a 5' transduction and hallmarks of target-primed reverse transcription. The 20-bp 5' transduction allowed us to trace the source SVA element to an intron of a long non-coding RNA on chromosome 12, which is highly expressed in testis. RNA sequencing and qRT-PCR confirmed significant depletion of SRCAP expression and low-level exon skipping in the proband. This case highlights a novel disease-causing structural variant and the importance of transposon analysis in a clinical diagnostic setting.

Ganz, Javier, Eduardo A Maury, Basheer Becerra, Sara Bizzotto, Ryan N Doan, Connor J Kenny, Taehwan Shin, et al. 2022. “Rates and Patterns of Clonal Oncogenic Mutations in the Normal Human Brain.” Cancer Discovery 12 (1): 172-85. https://doi.org/10.1158/2159-8290.CD-21-0245.

Although oncogenic mutations have been found in nondiseased, proliferative nonneural tissues, their prevalence in the human brain is unknown. Targeted sequencing of genes implicated in brain tumors in 418 samples derived from 110 individuals of varying ages, without tumor diagnoses, detected oncogenic somatic single-nucleotide variants (sSNV) in 5.4% of the brains, including IDH1 R132H. These mutations were largely present in subcortical white matter and enriched in glial cells and, surprisingly, were less common in older individuals. A depletion of high-allele frequency sSNVs representing macroscopic clones with age was replicated by analysis of bulk RNA sequencing data from 1,816 nondiseased brain samples ranging from fetal to old age. We also describe large clonal copy number variants and that sSNVs show mutational signatures resembling those found in gliomas, suggesting that mutational processes of the normal brain drive early glial oncogenesis. This study helps understand the origin and early evolution of brain tumors. SIGNIFICANCE: In the nondiseased brain, clonal oncogenic mutations are enriched in white matter and are less common in older individuals. We revealed early steps in acquiring oncogenic variants, which are essential to understanding brain tumor origins and building new mutational baselines for diagnostics. This article is highlighted in the In This Issue feature, p. 1.

2021

Huang, August Yue, and Eunjung Alice Lee. 2021. “Identification of Somatic Mutations From Bulk and Single-Cell Sequencing Data.” Frontiers in Aging 2: 800380. https://doi.org/10.3389/fragi.2021.800380.

Somatic mutations are DNA variants that occur after the fertilization of zygotes and accumulate during the developmental and aging processes in the human lifespan. Somatic mutations have long been known to cause cancer, and more recently have been implicated in a variety of non-cancer diseases. The patterns of somatic mutations, or mutational signatures, also shed light on the underlying mechanisms of the mutational process. Advances in next-generation sequencing over the decades have enabled genome-wide profiling of DNA variants in a high-throughput manner; however, unlike germline mutations, somatic mutations are carried only by a subset of the cell population. Thus, sensitive bioinformatic methods are required to distinguish mutant alleles from sequencing and base calling errors in bulk tissue samples. An alternative way to study somatic mutations, especially those present in an extremely small number of cells or even in a single cell, is to sequence single-cell genomes after whole-genome amplification (WGA); however, it is critical and technically challenging to exclude numerous technical artifacts arising during error-prone and uneven genome amplification in current WGA methods. To address these challenges, multiple bioinformatic tools have been developed. In this review, we summarize the latest progress in methods for identification of somatic mutations and the challenges that remain to be addressed in the future.

Borges-Monroy, Rebeca, Chong Chu, Caroline Dias, Jaejoon Choi, Soohyun Lee, Yue Gao, Taehwan Shin, Peter J Park, Christopher A Walsh, and Eunjung Alice Lee. 2021. “Whole-Genome Analysis Reveals the Contribution of Non-Coding de Novo Transposon Insertions to Autism Spectrum Disorder.” Mobile DNA 12 (1): 28. https://doi.org/10.1186/s13100-021-00256-w.

BACKGROUND: Retrotransposons have been implicated as causes of Mendelian disease, but their role in autism spectrum disorder (ASD) has not been systematically defined, because they are only called with adequate sensitivity from whole genome sequencing (WGS) data and a large enough cohort for this analysis has only recently become available.

RESULTS: We analyzed WGS data from a cohort of 2288 ASD families from the Simons Simplex Collection by establishing a scalable computational pipeline for retrotransposon insertion detection. We report 86,154 polymorphic retrotransposon insertions, including >60% not previously reported, and 158 de novo retrotransposition events. The overall burden of de novo events was similar between ASD individuals and unaffected siblings, with 1 de novo insertion per 29, 117, and 206 births for Alu, L1, and SVA respectively, and 1 de novo insertion per 21 births total. However, ASD cases showed more de novo L1 insertions than expected in ASD genes. Additionally, we observed exonic insertions in loss-of-function intolerant genes, including a likely pathogenic exonic insertion in CSDE1, only in ASD individuals.

CONCLUSIONS: These findings suggest a modest, but important, impact of intronic and exonic retrotransposon insertions in ASD, show the importance of WGS for their analysis, and highlight the utility of specific bioinformatic tools for high-throughput detection of retrotransposon insertions.

Bim, Larissa Valdemarin, Thaise Nayane Ribeiro Carneiro, Vanessa Candiotti Buzatto, Gabriel Avelar Colozza-Gama, Fernanda C Koyama, Debora Mota Dias Thomaz, Ana Carolina de Jesus Paniza, Eunjung Alice Lee, Pedro Alexandre Favoretto Galante, and Janete Maria Cerutti. 2021. “Molecular Signature Expands the Landscape of Driver Negative Thyroid Cancers.” Cancers 13 (20). https://doi.org/10.3390/cancers13205184.

Thyroid cancer is the most common endocrine malignancy. However, the cytological diagnosis of follicular thyroid carcinoma (FTC), Hürthle cell carcinoma (HCC), and follicular variant of papillary thyroid carcinoma (FVPTC) and their benign counterparts is a challenge for preoperative diagnosis. Nearly 20-30% of biopsied thyroid nodules are classified as having indeterminate risk of malignancy and incur costs to the health care system. Based on that, 120 patients were screened for the main driver mutations previously described in thyroid cancer. Subsequently, 14 mutation-negative cases that are the main source of diagnostic errors (FTC, HCC, or FVPTC) underwent RNA-Sequencing analysis. Somatic variants in candidate driver genes (ECD, NUP98, LRP1B, NCOR1, ATM, SOS1, and SPOP) and fusions were described. NCOR1 and SPOP variants underwent validation. Moreover, expression profiling of driver-negative samples was compared to 16 BRAF V600E, RAS, or PAX8-PPARg positive samples. Negative samples were separated in two clusters, following the expression pattern of the RAS/PAX8-PPARg or BRAF V600E positive samples. Both negative groups showed distinct BRS, ERK, and TDS scores, tumor mutation burden, signaling pathways and immune cell profile. Altogether, here we report novel gene variants and describe cancer-related pathways that might impact preoperative diagnosis and provide insights into thyroid tumor biology.

Wang, Yilan, Boxun Zhao, Jaejoon Choi, and Eunjung Alice Lee. 2021. “Genomic Approaches to Trace the History of Human Brain Evolution With an Emerging Opportunity for Transposon Profiling of Ancient Humans.” Mobile DNA 12 (1): 22. https://doi.org/10.1186/s13100-021-00250-2.

Transposable elements (TEs) significantly contribute to shaping the diversity of the human genome, and lines of evidence suggest TEs as one of the driving forces of human brain evolution. Existing computational approaches, including cross-species comparative genomics and population genetic modeling, can be adapted for the study of the role of TEs in evolution. In particular, diverse ancient and archaic human genome sequences are increasingly available, allowing reconstruction of past human migration events and holding the promise of identifying and tracking TEs among other evolutionarily important genetic variants at an unprecedented spatiotemporal resolution. However, highly degraded short DNA templates and other unique challenges presented by ancient human DNA call for major changes in current experimental and computational procedures to enable the identification of evolutionarily important TEs. Ancient human genomes are valuable resources for investigating TEs in the evolutionary context, and efforts to explore ancient human genomes will potentially provide a novel perspective on the genetic mechanism of human brain evolution and inspire a variety of technological and methodological advances. In this review, we summarize computational and experimental approaches that can be adapted to identify and validate evolutionarily important TEs, especially for human brain evolution. We also highlight strategies that leverage ancient genomic data and discuss unique challenges in ancient transposon genomics.

Chu, Chong, Rebeca Borges-Monroy, Vinayak V Viswanadham, Soohyun Lee, Heng Li, Eunjung Alice Lee, and Peter J Park. 2021. “Comprehensive Identification of Transposable Element Insertions Using Multiple Sequencing Technologies.” Nature Communications 12 (1): 3836. https://doi.org/10.1038/s41467-021-24041-8.

Transposable elements (TEs) help shape the structure and function of the human genome. When inserted into some locations, TEs may disrupt gene regulation and cause diseases. Here, we present xTea (x-Transposable element analyzer), a tool for identifying TE insertions in whole-genome sequencing data. Whereas existing methods are mostly designed for short-read data, xTea can be applied to both short-read and long-read data. Our analysis shows that xTea outperforms other short read-based methods for both germline and somatic TE insertion discovery. With long-read data, we created a catalogue of polymorphic insertions with full assembly and annotation of insertional sequences for various types of retroelements, including pseudogenes and endogenous retroviruses. Notably, we find that individual genomes have an average of nine groups of full-length L1s in centromeres, suggesting that centromeres and other highly repetitive regions such as telomeres are a significant yet unexplored source of active L1s. xTea is available at https://github.com/parklab/xTea .