Publications by Year: 2017

Lee, Sejoon, Soohyun Lee, Scott Ouellette, Woong-Yang Park, Eunjung A Lee, and Peter J Park. 2017. “NGSCheckMate: Software for Validating Sample Identity in Next-Generation Sequencing Studies Within and across Data Types.” Nucleic Acids Research 45 (11): e103. https://doi.org/10.1093/nar/gkx193.

In many next-generation sequencing (NGS) studies, multiple samples or data types are profiled for each individual. An important quality control (QC) step in these studies is to ensure that datasets from the same subject are properly paired. Given the heterogeneity of data types, file types and sequencing depths in a multi-dimensional study, a robust program that provides a standardized metric for genotype comparisons would be useful. Here, we describe NGSCheckMate, a user-friendly software package for verifying sample identities from FASTQ, BAM or VCF files. This tool uses a model-based method to compare allele read fractions at known single-nucleotide polymorphisms, considering depth-dependent behavior of similarity metrics for identical and unrelated samples. Our evaluation shows that NGSCheckMate is effective for a variety of data types, including exome sequencing, whole-genome sequencing, RNA-seq, ChIP-seq, targeted sequencing and single-cell whole-genome sequencing, with a minimal requirement for sequencing depth (>0.5X). An alignment-free module can be run directly on FASTQ files for a quick initial check. We recommend using this software as a QC step in NGS studies.

AVAILABILITY: https://github.com/parklab/NGSCheckMate.
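
The core idea in the abstract above, comparing allele read fractions at known SNPs between two samples, can be illustrated with a minimal sketch. This is not NGSCheckMate's code or its depth-dependent model: it only computes a plain Pearson correlation of allele fractions, and the function names, SNP identifiers and read counts are all hypothetical.

```python
from math import sqrt


def allele_fraction(ref_reads, alt_reads):
    """Return the fraction of reads supporting the alternate allele,
    or None when the site has no coverage."""
    depth = ref_reads + alt_reads
    return alt_reads / depth if depth > 0 else None


def vaf_correlation(sample_a, sample_b):
    """Pearson correlation of allele fractions over SNPs covered in both samples.

    Each sample is a dict mapping a SNP id to a (ref_reads, alt_reads) tuple.
    Unlike the model described in the abstract, this sketch ignores how
    sequencing depth affects the expected correlation.
    """
    pairs = []
    for snp, counts_a in sample_a.items():
        if snp not in sample_b:
            continue
        frac_a = allele_fraction(*counts_a)
        frac_b = allele_fraction(*sample_b[snp])
        if frac_a is not None and frac_b is not None:
            pairs.append((frac_a, frac_b))

    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two SNPs covered in both samples")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("allele fractions are constant in one sample")
    return cov / sqrt(var_a * var_b)


# Hypothetical read counts (ref, alt) at four made-up SNP sites.
tumor = {"rs1": (10, 0), "rs2": (5, 5), "rs3": (0, 12), "rs4": (8, 0)}
normal = {"rs1": (20, 1), "rs2": (9, 11), "rs3": (1, 30), "rs4": (15, 0)}
other = {"rs1": (0, 14), "rs2": (12, 0), "rs3": (7, 6), "rs4": (0, 9)}

same_subject = vaf_correlation(tumor, normal)
different_subject = vaf_correlation(tumor, other)
```

In this toy data, the two samples with matching genotypes correlate strongly and the unrelated one does not. A real check would need many more sites and a depth-aware threshold, which is the part the published tool models.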

Park, Jisook, Eunjung Lee, Kyoung-Jin Park, Hyung-Doo Park, Jong-Won Kim, Hye In Woo, Kwang Hyuck Lee, et al. 2017. “Large-Scale Clinical Validation of Biomarkers for Pancreatic Cancer Using a Mass Spectrometry-Based Proteomics Approach.” Oncotarget 8 (26): 42761-71. https://doi.org/10.18632/oncotarget.17463.

We performed an integrated analysis of proteomic and transcriptomic datasets to develop potential diagnostic markers for early pancreatic cancer. In the discovery phase, a multiple reaction monitoring assay of 90 proteins identified by either gene expression analysis or global serum proteome profiling was established and applied to 182 clinical specimens. Nine proteins (P < 0.05) were selected for the independent validation phase and quantified using stable isotope dilution-multiple reaction monitoring-mass spectrometry in 456 specimens. Of these, four proteins (apolipoprotein A-IV, apolipoprotein CIII, insulin-like growth factor binding protein 2 and tissue inhibitor of metalloproteinase 1) were significantly altered in pancreatic cancer in both the discovery and validation phases (P < 0.01). Moreover, a panel including carbohydrate antigen 19-9, apolipoprotein A-IV and tissue inhibitor of metalloproteinase 1 showed better performance for distinguishing early pancreatic cancer from pancreatitis (area under the curve = 0.934, 86% sensitivity at fixed 90% specificity) than carbohydrate antigen 19-9 alone (71% sensitivity). Overall, we present a panel of robust biomarkers for early pancreatic cancer diagnosis through bioinformatics analysis that combined transcriptomic and proteomic data as well as rigorous validation on a large number of independent clinical samples.

Zhang, Yiqun, Patrick Kwok-Shing Ng, Melanie Kucherlapati, Fengju Chen, Yuexin Liu, Yiu Huen Tsang, Guillermo de Velasco, et al. 2017. “A Pan-Cancer Proteogenomic Atlas of PI3K/AKT/MTOR Pathway Alterations.” Cancer Cell 31 (6): 820-832.e3. https://doi.org/10.1016/j.ccell.2017.04.013.

Molecular alterations involving the PI3K/AKT/mTOR pathway (including mutation, copy number, protein, or RNA) were examined across 11,219 human cancers representing 32 major types. Within specific mutated genes, frequency, mutation hotspot residues, in silico predictions, and functional assays were all informative in distinguishing the subset of genetic variants more likely to have functional relevance. Multiple oncogenic pathways including PI3K/AKT/mTOR converged on similar sets of downstream transcriptional targets. In addition to mutation, structural variations and partial copy losses involving PTEN and STK11 showed evidence for having functional relevance. A substantial fraction of cancers showed high mTOR pathway activity without an associated canonical genetic or genomic alteration, including cancers harboring IDH1 or VHL mutations, suggesting multiple mechanisms for pathway activation.

Yu, Vionnie W C, Rushdia Z Yusuf, Toshihiko Oki, Juwell Wu, Borja Saez, Xin Wang, Colleen Cook, et al. 2017. “Epigenetic Memory Underlies Cell-Autonomous Heterogeneous Behavior of Hematopoietic Stem Cells.” Cell 168 (5): 944-45. https://doi.org/10.1016/j.cell.2017.02.010.

Stem cells determine homeostasis and repair of many tissues and are increasingly recognized as functionally heterogeneous. To define the extent of—and molecular basis for—heterogeneity, we overlaid functional, transcriptional, and epigenetic attributes of hematopoietic stem cells (HSCs) at a clonal level using endogenous fluorescent tagging. Endogenous HSC had clone-specific functional attributes in vivo. The intra-clonal behaviors were highly stereotypic, conserved under the stress of transplantation, inflammation, and genotoxic injury, and associated with distinctive transcriptional, DNA methylation, and chromatin accessibility patterns. Further, HSC function corresponded to epigenetic configuration but not always to transcriptional state. Therefore, hematopoiesis under homeostatic and stress conditions represents the integrated action of highly heterogeneous clones of HSC with epigenetically scripted behaviors. This high degree of epigenetically driven cell autonomy among HSCs implies that refinement of the concepts of stem cell plasticity and of the stem cell niche is warranted.