Abstract
BACKGROUND: Congenital heart disease (CHD) is an important cause of childhood mortality as well as morbidity in children and adults. While genetic risk contributes to the majority of CHD, most individuals with CHD do not have an identified genetic diagnosis. Short tandem repeat (TR) elements are composed of repeated motifs of 2-6 base pairs and are highly polymorphic in length between individuals. These regions have been difficult to study with short-read sequencing, and they have not been studied at a large scale in the context of CHD. New software and sequencing platforms now allow more accurate TR element genotyping. Therefore, we aimed to identify TR element variants that could impact the expression of known CHD genes.
RESULTS: We identified de novo and inherited TR element variants near known CHD genes in participants with CHD (n = 1,899) in the Pediatric Cardiac Genomics Consortium cohort as well as unaffected participants (n = 1,932) from the Simons Foundation Autism Research Initiative using short-read sequencing followed by variant calling with the GangSTR pipeline. Comparison with long-read sequencing confirmed proband genotypes for 75% (91/120) of the TR element variants identified using short-read sequencing. A total of 114 TR element regions had 3 or more de novo TR element variants, compared to an expectation of 74 TR element regions (1.54-fold enrichment, p < 1.5E-5). The CHD genes CACNA1C and EVC2 had the strongest enrichment of TR element variants in the CHD cohort, as determined by a higher frequency of nearby de novo TR length variants in the CHD cohort than in the non-CHD cohort. Within CHD trios, there was over-transmission of a TR element variant near TAB2.
CONCLUSIONS: In a targeted analysis of de novo and transmitted TR element variants in a large cohort of CHD probands, each individual had 1 de novo TR element variant near a CHD gene, and participants with CHD demonstrated clustering of variants within TR element regions. Long-read sequencing confirmed the majority of TR element variants identified using the GangSTR pipeline. De novo variants in known CHD genes were enriched in participants with CHD, with specific enrichment in TR elements near CACNA1C, EVC2, and TAB2 in the CHD cohort. Many individual TR element variants were in known regulatory regions, but further work is needed to determine their functional impact.