Publications by authors named "Laura Elnitski"

68 Publications

Differential gene expression identifies a transcriptional regulatory network involving ER-alpha and PITX1 in invasive epithelial ovarian cancer.

BMC Cancer 2021 Jul 3;21(1):768. Epub 2021 Jul 3.

Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.

Background: The heterogeneous subtypes and stages of epithelial ovarian cancer (EOC) differ in their biological features, invasiveness, and response to chemotherapy, but the transcriptional regulators causing their differences remain nebulous.

Methods: In this study, we compared high-grade serous ovarian cancers (HGSOCs) to low malignant potential or serous borderline tumors (SBTs). Our aim was to discover new regulatory factors causing distinct biological properties of HGSOCs and SBTs.

Results: In a discovery dataset, we identified 11 differentially expressed genes (DEGs) between SBTs and HGSOCs. Their expression correctly classified 95% of 267 validation samples. Two of the DEGs, TMEM30B and TSPAN1, were significantly associated with worse overall survival in patients with HGSOC. We also identified 17 DEGs that distinguished stage II vs. III HGSOC. In these two DEG promoter sets, we identified significant enrichment of predicted transcription factor binding sites, including those of RARA, FOXF1, BHLHE41, and PITX1. Using published ChIP-seq data acquired from multiple non-ovarian cell types, we showed additional regulatory factors, including AP2-gamma/TFAP2C, FOXA1, and BHLHE40, bound at the majority of DEG promoters. Several of the factors are known to cooperate with and predict the presence of nuclear hormone receptor estrogen receptor alpha (ER-alpha). We experimentally confirmed ER-alpha and PITX1 presence at the DEGs by performing ChIP-seq analysis using the ovarian cancer cell line PEO4. Finally, RNA-seq analysis identified recurrent gene fusion events in our EOC tumor set. Some of these fusions were significantly associated with survival in HGSOC patients; however, the fusion genes are not regulated by the transcription factors identified for the DEGs.

Conclusions: These data implicate an estrogen-responsive regulatory network in the differential gene expression between ovarian cancer subtypes and stages, which includes PITX1. Importantly, the transcription factors associated with our DEG promoters are known to form the MegaTrans complex in breast cancer. This is the first study to implicate the MegaTrans complex in contributing to the distinct biological trajectories of malignant and indolent ovarian cancer subtypes.

Source
DOI: http://dx.doi.org/10.1186/s12885-021-08276-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8254236
July 2021

Assessing ZNF154 methylation in patient plasma as a multicancer marker in liquid biopsies from colon, liver, ovarian and pancreatic cancer patients.

Sci Rep 2021 Jan 8;11(1):221. Epub 2021 Jan 8.

Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.

One epigenetic hallmark of many cancer types is differential DNA methylation occurring at multiple loci compared to normal tissue. Detection and assessment of the methylation state at a specific locus could be an effective cancer diagnostic. We assessed the effectiveness of hypermethylation at the CpG island of ZNF154, a previously reported multi-cancer specific signature for use in a blood-based cancer detection assay. To predict its effectiveness, we compared methylation levels of 3698 primary tumors encompassing 11 solid cancers, 724 controls, 2711 peripheral blood cell samples, and 350 noncancer disease tissues from publicly available methylation array datasets. We performed a single-molecule high-resolution DNA melt analysis on 71 plasma samples from cancer patients and 20 noncancer individuals to assess ZNF154 methylation as a candidate diagnostic metric in liquid biopsy and compared results to KRAS mutation frequency in the case of pancreatic carcinoma. We documented ZNF154 hypermethylation in early stage tumors, which did not increase in most noncancer disease or with respect to age or sex in peripheral blood cells, suggesting it is a promising target in liquid biopsy. ZNF154 cfDNA methylation discriminated cases from healthy donor plasma samples in minimal plasma volumes and outperformed KRAS mutation frequency in pancreatic cancer.

Source
DOI: http://dx.doi.org/10.1038/s41598-020-80345-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7794477
January 2021

Leveraging locus-specific epigenetic heterogeneity to improve the performance of blood-based DNA methylation biomarkers.

Clin Epigenetics 2020 Oct 20;12(1):154. Epub 2020 Oct 20.

Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.

Background: Variation in intercellular methylation patterns can complicate the use of methylation biomarkers for clinical diagnostic applications such as blood-based cancer testing. Here, we describe development and validation of a methylation density binary classification method called EpiClass (available for download at https://github.com/Elnitskilab/EpiClass ) that can be used to predict and optimize the performance of methylation biomarkers, particularly in challenging, heterogeneous samples such as liquid biopsies. This approach is based upon leveraging statistical differences in single-molecule sample methylation density distributions to identify ideal thresholds for sample classification.

Results: We developed and tested the classifier using reduced representation bisulfite sequencing (RRBS) data derived from ovarian carcinoma tissue DNA and controls. We used these data to perform in silico simulations using methylation density profiles from individual epiallelic copies of ZNF154, a genomic locus known to be recurrently methylated in numerous cancer types. From these profiles, we predicted the performance of the classifier in liquid biopsies for the detection of epithelial ovarian carcinomas (EOC). In silico analysis indicated that EpiClass could be leveraged to better identify cancer-positive liquid biopsy samples by implementing precise thresholds with respect to methylation density profiles derived from circulating cell-free DNA (cfDNA) analysis. These predictions were confirmed experimentally using DREAMing to perform digital methylation density analysis on a cohort of low volume (1-ml) plasma samples obtained from 26 EOC-positive and 41 cancer-free women. EpiClass performance was then validated in an independent cohort of 24 plasma specimens, derived from a longitudinal study of 8 EOC-positive women, and 12 plasma specimens derived from 12 healthy women, respectively, attaining a sensitivity/specificity of 91.7%/100.0%. Direct comparison of CA-125 measurements with EpiClass demonstrated that EpiClass was able to better identify EOC-positive women than standard CA-125 assessment. Finally, we used independent whole genome bisulfite sequencing (WGBS) datasets to demonstrate that EpiClass can also identify other cancer types as well or better than alternative methylation-based classifiers.

Conclusions: Our results indicate that assessment of intramolecular methylation density distributions calculated from cfDNA facilitates the use of methylation biomarkers for diagnostic applications. Furthermore, we demonstrated that EpiClass analysis of ZNF154 methylation was able to outperform CA-125 in the detection of etiologically diverse ovarian carcinomas, indicating broad utility of ZNF154 for use as a biomarker of ovarian cancer.
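
The thresholding idea described in this abstract can be illustrated with a minimal Python sketch. It assumes each sequenced DNA fragment has already been reduced to a methylation density (the fraction of its CpGs that are methylated). The function names, the two cutoffs, and the toy data are hypothetical and are not taken from the EpiClass package itself.

```python
from typing import Sequence, Tuple


def epiallele_fraction(densities: Sequence[float], density_cutoff: float) -> float:
    """Fraction of fragments whose methylation density meets the cutoff."""
    if not densities:
        return 0.0
    return sum(d >= density_cutoff for d in densities) / len(densities)


def classify_sample(densities: Sequence[float],
                    density_cutoff: float,
                    fraction_cutoff: float) -> bool:
    """Call a sample positive when enough fragments are densely methylated."""
    return epiallele_fraction(densities, density_cutoff) > fraction_cutoff


def sensitivity_specificity(cases: Sequence[Sequence[float]],
                            controls: Sequence[Sequence[float]],
                            density_cutoff: float,
                            fraction_cutoff: float) -> Tuple[float, float]:
    """Evaluate one pair of cutoffs on labeled case and control samples."""
    true_pos = sum(classify_sample(s, density_cutoff, fraction_cutoff) for s in cases)
    true_neg = sum(not classify_sample(s, density_cutoff, fraction_cutoff) for s in controls)
    return true_pos / len(cases), true_neg / len(controls)


# Toy data (hypothetical): one list of per-fragment densities per sample.
cases = [[0.9, 0.8, 0.1, 0.0], [1.0, 0.0, 0.0, 0.0]]
controls = [[0.1, 0.0, 0.2, 0.0], [0.6, 0.0, 0.0, 0.0]]
strict = sensitivity_specificity(cases, controls, density_cutoff=0.75, fraction_cutoff=0.1)
loose = sensitivity_specificity(cases, controls, density_cutoff=0.5, fraction_cutoff=0.1)
```

In this toy example the stricter density cutoff excludes the moderately methylated control fragment, which is the kind of trade-off a threshold search would explore.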

Source
DOI: http://dx.doi.org/10.1186/s13148-020-00939-w
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7574234
October 2020

DNA methylation profiles unique to Kalahari KhoeSan individuals.

Epigenetics 2021 May 6;16(5):537-553. Epub 2020 Sep 6.

Genomic Functional Analysis Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

Genomes of KhoeSan individuals of the Kalahari Desert provide the greatest understanding of single nucleotide diversity in the human genome. Compared with individuals in industrialized environments, the KhoeSan have a unique foraging and hunting lifestyle. Given these dramatic environmental differences, and the responsiveness of the methylome to environmental exposures of many types, we hypothesized that DNA methylation patterns would differ between KhoeSan and neighbouring agropastoral and/or industrial Bantu. We analysed Illumina HumanMethylation 450 k array data generated from blood samples from 38 KhoeSan and 42 Bantu, and 6 Europeans. After removing CpG positions associated with annotated and novel polymorphisms and controlling for white blood cell composition, sex, age and technical variation we identified 816 differentially methylated CpG loci, out of which 133 had an absolute beta-value difference of at least 0.05. Notably /, which plays a role in zinc transport, was one of the most differentially methylated loci. Although the chronological ages of the KhoeSan are not formally recorded, we compared historically estimated ages to methylation-based calculations. This study demonstrates that the epigenetic profile of KhoeSan individuals reveals differences from other populations, and along with extensive genetic diversity, this community brings increased accessibility and understanding to the diversity of the human genome.

Source
DOI: http://dx.doi.org/10.1080/15592294.2020.1809852
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8078743
May 2021

MethylToSNP: identifying SNPs in Illumina DNA methylation array data.

Epigenetics Chromatin 2019 Dec 20;12(1):79. Epub 2019 Dec 20.

Genomic Functional Analysis Section, Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 49 Convent Dr., Bethesda, MD, 20892, USA.

Background: Current array-based methods for the measurement of DNA methylation rely on the process of sodium bisulfite conversion to differentiate between methylated and unmethylated cytosine bases in DNA. In the absence of genotype data this process can lead to ambiguity in data interpretation when a sample has polymorphisms at a methylation probe site. A common way to minimize this problem is to exclude such potentially problematic sites, with some methods removing as much as 60% of array probes from consideration before data analysis.

Results: Here, we present an algorithm implemented in an R Bioconductor package, MethylToSNP, which detects a characteristic data pattern to infer sites likely to be confounded by polymorphisms. Additionally, the tool provides a stringent reliability score to allow thresholding on SNP predictions. We calibrated parameters and thresholds used by the algorithm on simulated and real methylation data sets. We illustrate findings using methylation data from YRI (Yoruba in Ibadan, Nigeria), CEPH (European descent) and KhoeSan (southern African) populations. Our polymorphism predictions made using MethylToSNP have been validated through SNP databases and bisulfite and genomic sequencing.

Conclusions: The benefits of this method are threefold. First, it prevents extensive data loss by considering only SNPs specific to the individuals in the study. Second, it offers the possibility to identify new polymorphisms in samples for which there is little known about the genetic landscape. Third, it identifies variants as they exist in functional regions of a genome, such as in CTCF (transcriptional repressor) sites and enhancers, that may be common alleles or personal mutations with potential to deleteriously affect genomic regulatory activities. We demonstrate that MethylToSNP is applicable to the Illumina 450K and Illumina 850K EPIC array data and is also backwards compatible to the 27K methylation arrays. Going forward, this kind of nuanced approach can increase the amount of information derived from precious data sets by considering samples of the project individually to enable more informed decisions about data cleaning.

Source
DOI: http://dx.doi.org/10.1186/s13072-019-0321-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6923858
December 2019

Aberrant DNA methylation defines isoform usage in cancer, with functional implications.

PLoS Comput Biol 2019 Jul 22;15(7):e1007095. Epub 2019 Jul 22.

Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America.

Alternative transcript isoforms are common in tumors and act as potential drivers of cancer. Mechanisms determining altered isoform expression include somatic mutations in splice regulatory sites or altered splicing factors. However, since DNA methylation is known to regulate transcriptional isoform activity in normal cells, we predicted the highly dysregulated patterns of DNA methylation present in cancer also affect isoform activity. We analyzed DNA methylation and RNA-seq isoform data from 18 human cancer types and found frequent correlations specifically within 11 cancer types. Examining the top 25% of variable methylation sites revealed that the location of the methylated CpG site in a gene determined which isoform was used. In addition, the correlated methylation-isoform patterns classified tumors into known subtypes and predicted distinct protein functions between tumor subtypes. Finally, methylation-correlated isoforms were enriched for oncogenes, tumor suppressors, and cancer-related pathways. These findings provide new insights into the functional impact of dysregulated DNA methylation in cancer and highlight the relationship between the epigenome and transcriptome.

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1007095
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6675117
July 2019

CAGI experiments: Modeling sequence variant impact on gene splicing using predictions from computational tools.

Hum Mutat 2019 Sep 27;40(9):1252-1260. Epub 2019 Jun 27.

Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland.

Improving predictions of phenotypic consequences for genomic variants is part of ongoing efforts in the scientific community to gain meaningful insights into genomic function. Within the framework of the critical assessment of genome interpretation experiments, we participated in the Vex-seq challenge, which required predicting the change in the percent spliced in measure (ΔΨ) for 58 exons caused by more than 1,000 genomic variants. Experimentally determined through the Vex-seq assay, the Ψ quantifies the fraction of reads that include an exon of interest. Predicting the change in Ψ associated with specific genomic variants implies determining the sequence changes relevant for splicing regulators, such as splicing enhancers and silencers. Here we took advantage of two computational tools, SplicePort and SPANR, that incorporate relevant sequence features in their models of splice sites and exon-inclusion level, respectively. Specifically, we used the SplicePort and SPANR outputs to build mathematical models of the experimental data obtained for the variants in the training set, which we then used to predict the ΔΨ associated with the mutations in the test set. We show that the sequence changes captured by these computational tools provide a reasonable foundation for modeling the impact on splicing associated with genomic variants.
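
As a small illustration of the quantity being predicted, the following Python sketch computes Ψ as the percentage of reads that include an exon, and ΔΨ as the difference between a variant and its reference. The function names and the read counts are hypothetical, and the sketch does not reproduce the Vex-seq processing pipeline or the SplicePort and SPANR models.

```python
from typing import Tuple


def psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percent spliced in: share of reads that include the exon, on a 0-100 scale."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover the exon")
    return 100.0 * inclusion_reads / total


def delta_psi(reference_counts: Tuple[int, int], variant_counts: Tuple[int, int]) -> float:
    """Change in PSI caused by a variant; each argument is (inclusion, exclusion)."""
    return psi(*variant_counts) - psi(*reference_counts)


# Hypothetical counts: the variant lowers exon inclusion from 80% to 50%.
change = delta_psi(reference_counts=(80, 20), variant_counts=(50, 50))
```

A negative ΔΨ, as in this example, corresponds to a variant that reduces exon inclusion.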

Source
DOI: http://dx.doi.org/10.1002/humu.23782
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6744343
September 2019

Identification of human silencers by correlating cross-tissue epigenetic profiles and gene expression.

Genome Res 2019 Apr 18;29(4):657-667. Epub 2019 Mar 18.

Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20892, USA.

Compared to enhancers, silencers are notably difficult to identify and validate experimentally. In search of human silencers, we utilized H3K27me3-DNase I hypersensitive site (DHS) peaks with tissue specificity negatively correlated with the expression of nearby genes across 25 diverse cell lines. These regions are predicted to be silencers since they are physically linked, using Hi-C loops, or associated, using expression quantitative trait loci (eQTL) results, with a decrease in gene expression much more frequently than general H3K27me3-DHSs. Also, these regions are enriched for the binding sites of transcriptional repressors (such as CTCF, MECOM, SMAD4, and SNAI3) and depleted of the binding sites of transcriptional activators. Using sequence signatures of these regions, we constructed a computational model and predicted approximately 10,000 additional silencers per cell line and demonstrated that the majority of genes linked to these silencers are expressed at a decreased level. Furthermore, single nucleotide polymorphisms (SNPs) in predicted silencers are significantly associated with disease phenotypes. Finally, our results show that silencers commonly interact with enhancers to affect the transcriptional dynamics of tissue-specific genes and to facilitate fine-tuning of transcription in the human genome.

Source
DOI: http://dx.doi.org/10.1101/gr.247007.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6442386
April 2019

The hypothesis of ultraconserved enhancer dispensability overturned.

Genome Biol 2018 May 8;19(1):57. Epub 2018 May 8.

Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20892, USA.

Two recent studies explore how redundant enhancers in mice really are.

Source
DOI: http://dx.doi.org/10.1186/s13059-018-1433-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5938802
May 2018

Significant associations between driver gene mutations and DNA methylation alterations across many cancer types.

PLoS Comput Biol 2017 Nov 10;13(11):e1005840. Epub 2017 Nov 10.

Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States.

Recent evidence shows that mutations in several driver genes can cause aberrant methylation patterns, a hallmark of cancer. In light of these findings, we hypothesized that the landscapes of tumor genomes and epigenomes are tightly interconnected. We measured this relationship using principal component analyses and methylation-mutation associations applied at the nucleotide level and with respect to genome-wide trends. We found that a few mutated driver genes were associated with genome-wide patterns of aberrant hypomethylation or CpG island hypermethylation in specific cancer types. In addition, we identified associations between 737 mutated driver genes and site-specific methylation changes. Moreover, using these mutation-methylation associations, we were able to distinguish between two uterine and two thyroid cancer subtypes. The driver gene mutation-associated methylation differences between the thyroid cancer subtypes were linked to differential gene expression in JAK-STAT signaling, NADPH oxidation, and other cancer-related pathways. These results establish that driver gene mutations are associated with methylation alterations capable of shaping regulatory network functions. In addition, the methodology presented here can be used to subdivide tumors into more homogeneous subsets corresponding to underlying molecular characteristics, which could improve treatment efficacy.

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1005840
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5709060
November 2017

SigSeeker: a peak-calling ensemble approach for constructing epigenetic signatures.

Bioinformatics 2017 Sep;33(17):2615-2621

Genetics and Molecular Biology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.

Motivation: Epigenetic data are invaluable when determining the regulatory programs governing a cell. Based on use of next-generation sequencing data for characterizing epigenetic marks and transcription factor binding, numerous peak-calling approaches have been developed to determine sites of genomic significance in these data. Such analyses can produce a large number of false positive predictions, suggesting that sites supported by multiple algorithms provide a stronger foundation for inferring and characterizing regulatory programs associated with the epigenetic data. Few methodologies integrate epigenetic based predictions of multiple approaches when combining profiles generated by different tools.

Results: The SigSeeker peak-calling ensemble uses multiple tools to identify peaks, and with user-defined thresholds for peak overlap and signal strength it retains only those peaks that are concordant across multiple tools. Peaks predicted to be co-localized by only a very small number of tools, discovered to be only marginally overlapping, or found to represent significant outliers to the approximation model are removed from the results, providing concise and high quality epigenetic datasets. SigSeeker has been validated using established benchmarks for transcription factor binding and histone modification ChIP-Seq data. These comparisons indicate that the quality of our ensemble technique exceeds that of single tool approaches, enhances existing peak-calling ensembles, and results in epigenetic profiles of higher confidence.

Availability And Implementation: http://sigseeker.org.

Supplementary Information: Supplementary data are available at Bioinformatics online.
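
The concordance filter described in the Results can be sketched in Python as follows. This is a simplified illustration under stated assumptions: peaks are half-open (start, end) intervals of positive length on a single chromosome, overlap is measured relative to the shorter peak, and signal strength and the outlier model are ignored. The function names, tool names, and thresholds are hypothetical and are not the SigSeeker interface.

```python
from typing import Dict, List, Tuple

Peak = Tuple[int, int]  # half-open interval (start, end) with end > start


def overlap_fraction(a: Peak, b: Peak) -> float:
    """Shared length divided by the length of the shorter peak."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return 0.0
    return shared / min(a[1] - a[0], b[1] - b[0])


def concordant_peaks(calls: Dict[str, List[Peak]],
                     min_tools: int,
                     min_overlap: float) -> List[Peak]:
    """Keep peaks that at least `min_tools` tools support with enough overlap."""
    kept = set()
    for tool, peaks in calls.items():
        for peak in peaks:
            support = 1  # the tool that called this peak
            for other, other_peaks in calls.items():
                if other == tool:
                    continue
                if any(overlap_fraction(peak, q) >= min_overlap for q in other_peaks):
                    support += 1
            if support >= min_tools:
                kept.add(peak)
    return sorted(kept)


# Hypothetical calls from three peak callers.
calls = {
    "toolA": [(100, 200), (500, 600)],
    "toolB": [(120, 210), (900, 1000)],
    "toolC": [(150, 250), (590, 700)],
}
consensus = concordant_peaks(calls, min_tools=3, min_overlap=0.5)
```

Here only the cluster of peaks near position 100 to 250 is supported by all three tools; the marginally overlapping and single-tool peaks are dropped.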

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btx276
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860059
September 2017

CpG island methylator phenotype in adenocarcinomas from the digestive tract: Methods, conclusions, and controversies.

World J Gastrointest Oncol 2017 Mar;9(3):105-120

Francisco Sánchez-Vega, Valer Gotea, Yun-Ching Chen, Laura Elnitski, Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, MD 20852, United States.

Over the last two decades, cancer-related alterations in DNA methylation that regulate transcription have been reported for a variety of tumors of the gastrointestinal tract. Due to its relevance for translational research, great emphasis has been placed on the analysis and molecular characterization of the CpG island methylator phenotype (CIMP), defined as widespread hypermethylation of CpG islands in clinically distinct subsets of cancer patients. Here, we present an overview of previous work in this field and also explore some open questions using cross-platform data for esophageal, gastric, and colorectal adenocarcinomas from The Cancer Genome Atlas. We provide a data-driven, pan-gastrointestinal stratification of individual samples based on CIMP status and we investigate correlations with oncogenic alterations, including somatic mutations and epigenetic silencing of tumor suppressor genes. Besides known events in CIMP such as mutation, CDKN2A silencing or MLH1 inactivation, we discuss the potential role of emerging actors such as Wnt pathway deregulation through truncating mutations in RNF43 and epigenetic silencing of WIF1. Our results highlight the existence of molecular similarities that are superimposed over a larger backbone of tissue-specific features and can be exploited to reduce heterogeneity of response in clinical trials.

Source
DOI: http://dx.doi.org/10.4251/wjgo.v9.i3.105
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5348626
March 2017

The Emergence of Pan-Cancer CIMP and Its Elusive Interpretation.

Biomolecules 2016 Nov 22;6(4). Epub 2016 Nov 22.

National Human Genome Research Institute, Rockville, MD 20852, USA.

Epigenetic dysregulation is recognized as a hallmark of cancer. In the last 16 years, a CpG island methylator phenotype (CIMP) has been documented in tumors originating from different tissues. However, a looming question in the field is whether or not CIMP is a pan-cancer phenomenon or a tissue-specific event. Here, we give a synopsis of the history of CIMP and describe the pattern of DNA methylation that defines the CIMP phenotype in different cancer types. We highlight new conceptual approaches of classifying tumors based on CIMP in a cancer type-agnostic way that reveal the presence of distinct CIMP tumors in a multitude of The Cancer Genome Atlas (TCGA) datasets, suggesting that this phenotype may transcend tissue-type specificity. Lastly, we show evidence supporting the clinical relevance of CIMP-positive tumors and suggest that a common CIMP etiology may define new mechanistic targets in cancer treatment.

Source
DOI: http://dx.doi.org/10.3390/biom6040045
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5197955
November 2016

A Case of IL-7R Deficiency Caused by a Novel Synonymous Mutation and Implications for Mutation Screening in SCID Diagnosis.

Front Immunol 2016 Oct 27;7:443. Epub 2016 Oct 27.

Servicio de Inmunología, Hospital Universitario 12 de Octubre, Madrid, Spain; Instituto de Investigación I+12, Madrid, Spain.

Reported synonymous substitutions are generally non-pathogenic, and rare pathogenic synonymous variants may be disregarded unless there is a high index of suspicion. In a case of IL7 receptor deficiency severe combined immunodeficiency (SCID), the relevance of a non-reported synonymous variant was only suspected through the use of additional computational tools, which focused on the impact of mutations on gene splicing. The pathogenic nature of the variant was confirmed using experimental validation of the effect on mRNA splicing and IL7 pathway function. This case reinforces the need to use additional experimental methods to establish the functional impact of specific mutations, in particular for cases such as SCID where prompt diagnosis can greatly affect treatment and survival.

Source
DOI: http://dx.doi.org/10.3389/fimmu.2016.00443
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5081475
October 2016

A Systems Biology Comparison of Ovarian Cancers Implicates Putative Somatic Driver Mutations through Protein-Protein Interaction Models.

PLoS One 2016 Oct 27;11(10):e0163353. Epub 2016 Oct 27.

National Human Genome Research Institute, National Institutes of Health, Rockville, MD, 20852, United States of America.

Ovarian carcinomas can be aggressive with a high mortality rate (e.g., high-grade serous ovarian carcinomas, or HGSOCs), or indolent with much better long-term outcomes (e.g., low-malignant-potential, or LMP, serous ovarian carcinomas). By comparing LMP and HGSOC tumors, we can gain insight into the mechanisms underlying malignant progression in ovarian cancer. However, previous studies of the two subtypes have been focused on gene expression analysis. Here, we applied a systems biology approach, integrating gene expression profiles derived from two independent data sets containing both LMP and HGSOC tumors with protein-protein interaction data. Genes and related networks implicated by both data sets involved both known and novel disease mechanisms and highlighted the different roles of BRCA1 and CREBBP in the two tumor types. In addition, the incorporation of somatic mutation data revealed that amplification of PAK4 is associated with poor survival in patients with HGSOC. Thus, perturbations in protein interaction networks demonstrate differential trafficking of network information between malignant and benign ovarian cancers. The novel network-based molecular signatures identified here may be used to identify new targets for intervention and to improve the treatment of invasive ovarian cancer as well as early diagnosis.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0163353
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5082879
June 2017

Robust Detection of DNA Hypermethylation of ZNF154 as a Pan-Cancer Locus with in Silico Modeling for Blood-Based Diagnostic Development.

J Mol Diagn 2016 Mar 5;18(2):283-98. Epub 2016 Feb 5.

Translational and Functional Genomics Branch, National Human Genome Research Institute, Rockville, Maryland.

Sites that display recurrent, aberrant DNA methylation in cancer represent potential biomarkers for screening and diagnostics. Previously, we identified hypermethylation at the ZNF154 CpG island in 15 solid epithelial tumor types from 13 different organs. In this study, we measure the magnitude and pattern of differential methylation of this region across colon, lung, breast, stomach, and endometrial tumor samples using next-generation bisulfite amplicon sequencing. We found that all tumor types and subtypes are hypermethylated at this locus compared with normal tissue. To evaluate this site as a possible pan-cancer marker, we compare the ability of several sequence analysis methods to distinguish the five tumor types (184 tumor samples) from normal tissue samples (n = 34). The classification performance for the strongest method, measured by the area under the receiver operating characteristic curve (AUC), is 0.96, close to the perfect value of 1. Furthermore, in a computational simulation of circulating tumor DNA, we were able to detect limited amounts of tumor DNA diluted with normal DNA: 1% tumor DNA in 99% normal DNA yields AUCs of up to 0.79. Our findings suggest that hypermethylation of the ZNF154 CpG island is a relevant biomarker for identifying solid tumor DNA and may have utility as a generalizable biomarker for circulating tumor DNA.
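
For readers unfamiliar with the AUC metric quoted above, here is a minimal Python sketch of how it can be computed from per-sample scores. It uses the standard pairwise definition (the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, counting ties as one half). The function name and the scores are hypothetical and do not come from the study.

```python
from typing import Sequence


def auc(tumor_scores: Sequence[float], normal_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of the two groups."""
    if not tumor_scores or not normal_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for t in tumor_scores:
        for n in normal_scores:
            if t > n:
                wins += 1.0      # tumor sample correctly ranked above normal
            elif t == n:
                wins += 0.5      # ties count as half
    return wins / (len(tumor_scores) * len(normal_scores))


# Hypothetical methylation scores for three tumor and three normal samples.
example_auc = auc([0.9, 0.8, 0.4], [0.1, 0.3, 0.5])
```

A value of 1 means every tumor sample outscores every normal sample, while 0.5 corresponds to no discrimination at all.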

Source
DOI: http://dx.doi.org/10.1016/j.jmoldx.2015.11.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4816708
March 2016

Discovering Gene Regulatory Elements Using Coverage-Based Heuristics.

IEEE/ACM Trans Comput Biol Bioinform 2018 Jul-Aug;15(4):1290-1300. Epub 2015 Oct 30.

Data mining algorithms and sequencing methods (such as RNA-seq and ChIP-seq) are being combined to discover genomic regulatory motifs that relate to a variety of phenotypes. However, motif discovery algorithms often produce very long lists of putative transcription factor binding sites, hindering the discovery of phenotype-related regulatory elements by making it difficult to select a manageable set of candidate motifs for experimental validation. To address this issue, the authors introduce the motif selection problem and provide coverage-based search heuristics for its solution. Analysis of 203 ChIP-seq experiments from the Encyclopedia of DNA Elements (ENCODE) project shows that the proposed algorithms produce motifs with high sensitivity and specificity and reveals new insights about the regulatory code of the human genome. The greedy algorithm performs the best, selecting a median of two motifs per ChIP-seq transcription factor group while achieving a median sensitivity of 77 percent.
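
A coverage-based greedy heuristic of the general kind mentioned in this abstract can be sketched in Python as below. This is the textbook greedy set-cover strategy, offered as an illustration only; it is not the authors' published algorithm, and the function name, the `target_coverage` parameter, and the toy motif hits are hypothetical.

```python
from typing import Dict, List, Set


def greedy_motif_selection(sequences: Set[str],
                           motif_hits: Dict[str, Set[str]],
                           target_coverage: float = 1.0) -> List[str]:
    """Repeatedly pick the motif that covers the most still-uncovered sequences."""
    uncovered = set(sequences)
    needed = target_coverage * len(sequences)
    selected: List[str] = []
    while len(sequences) - len(uncovered) < needed:
        best = max(motif_hits, key=lambda m: len(motif_hits[m] & uncovered), default=None)
        if best is None or not (motif_hits[best] & uncovered):
            break  # no remaining motif adds coverage
        selected.append(best)
        uncovered -= motif_hits[best]
    return selected


# Hypothetical example: which sequences (e.g. ChIP-seq peaks) each motif occurs in.
sequences = {"s1", "s2", "s3", "s4", "s5"}
motif_hits = {
    "m1": {"s1", "s2", "s3"},
    "m2": {"s3", "s4"},
    "m3": {"s4", "s5"},
    "m4": {"s1"},
}
chosen = greedy_motif_selection(sequences, motif_hits)
```

In this toy case two motifs suffice to cover all five sequences, which mirrors the goal of reducing a long candidate list to a small set worth validating experimentally.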

Source
DOI: http://dx.doi.org/10.1109/TCBB.2015.2496261
May 2019

The functional relevance of somatic synonymous mutations in melanoma and other cancers.

Pigment Cell Melanoma Res 2015 Nov;28(6):673-84

Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel.

Recent technological advances in sequencing have flooded the field of cancer research with knowledge about somatic mutations for many different cancer types. Most cancer genomics studies focus on mutations that alter the amino acid sequence, ignoring the potential impact of synonymous mutations. However, accumulating experimental evidence has demonstrated clear consequences for gene function, leading to a widespread recognition of the functional role of synonymous mutations and their causal connection to various diseases. Here, we review the evidence supporting the direct impact of synonymous mutations on gene function via gene splicing; mRNA stability, folding, and translation; protein folding; and miRNA-based regulation of expression. These results highlight the functional contribution of synonymous mutations to oncogenesis and the need to further investigate their detection and prioritization for experimental assessment.

Source
DOI: http://dx.doi.org/10.1111/pcmr.12413
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4834044
November 2015

Pan-cancer stratification of solid human epithelial tumors and cancer cell lines reveals commonalities and tissue-specific features of the CpG island methylator phenotype.

Epigenetics Chromatin 2015 17;8:14. Epub 2015 Apr 17.

Translational and Functional Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA.

Background: The term CpG island methylator phenotype (CIMP) has been used to describe widespread DNA hypermethylation at CpG-rich genomic regions affecting clinically distinct subsets of cancer patients. Even though there have been numerous studies of CIMP in individual cancer types, a uniform analysis across tissues is still lacking.

Results: We analyze genome-wide patterns of CpG island hypermethylation in 5,253 solid epithelial tumors from 15 cancer types from TCGA and 23 cancer cell lines from ENCODE. We identify differentially methylated loci that define CIMP+ and CIMP- samples, and we use unsupervised clustering to provide a robust molecular stratification of tumor methylomes for 12 cancer types and all cancer cell lines. With a minimal set of 89 discriminative loci, we demonstrate accurate pan-cancer separation of the 12 CIMP+/- subpopulations, based on their average levels of methylation. Tumor samples in different CIMP subclasses show distinctive correlations with gene expression profiles and recurrence of somatic mutations, copy number variations, and epigenetic silencing. Enrichment analyses indicate shared canonical pathways and upstream regulators for CIMP-targeted regions across cancer types. Furthermore, genomic alterations showing consistent associations with CIMP+/- status include genes involved in DNA repair, chromatin remodeling genes, and several histone methyltransferases. Associations of CIMP status with specific clinical features, including overall survival in several cancer types, highlight the importance of the CIMP+/- designation for individual tumor evaluation and personalized medicine.

Conclusions: We present a comprehensive computational study of CIMP that reveals pan-cancer commonalities and tissue-specific differences underlying concurrent hypermethylation of CpG islands across tumors. Our stratification of solid tumors and cancer cell lines based on CIMP status is data-driven and agnostic to tumor type by design, which protects against known biases that have hindered classic methods previously used to define CIMP. The results that we provide can be used to refine existing molecular subtypes of cancer into more homogeneously behaving subgroups, potentially leading to more uniform responses in clinical trials.

Source
DOI: http://dx.doi.org/10.1186/s13072-015-0007-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4424513
May 2015

Orthology-driven mapping of bidirectional promoters in human and mouse genomes.

BMC Bioinformatics 2014 16;15 Suppl 17:S1. Epub 2014 Dec 16.

Background: The presence of bidirectional promoters in all vertebrate species suggests that the promoters may be maintained in orthologous positions. Therefore, a comprehensive orthologous mapping of this type of promoter across species can facilitate elucidation of the regulatory mechanisms controlling bidirectional gene expression. However, the lack of annotation for many transcribed regions in the genome can impact the orthology designation of these promoters. Human and mouse are among the genomes that have been relatively well annotated. Thus, we used them as models to study the orthologous patterns of bidirectional promoters.

Results: We developed a method to annotate these regulatory regions by confirming the orthology of the genes found on each side of the promoters. In this manuscript we report the cross-species comparisons between human and mouse genomes, where the bidirectional promoter sets regulating UCSC Known Genes and spliced EST annotations were mapped from human to mouse and vice versa. We validate hundreds of orthologous bidirectional promoters through the presence of orthologous flanking gene annotations in the second species. We also show that regulatory activity of these orthologous promoters confers similar gene expression profiles in 21 tissues of human and mouse. In particular, more than one third of human bidirectional promoters annotated from spliced EST annotations regulate ncRNA, of which over 90% are lncRNAs.

Conclusions: Although evolutionary conservation shows a weaker signature in promoters than in coding regions, our technique of mapping orthologous genes shows that most bidirectional promoter arrangements are conserved across human and mouse genomes, suggesting a critical function. In addition, the similar expression patterns of the orthologous gene sets indicate that the regulatory mechanisms remain largely conserved as well.

Source
DOI: http://dx.doi.org/10.1186/1471-2105-15-S17-S1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4304189
May 2015

Defining functional DNA elements in the human genome.

Proc Natl Acad Sci U S A 2014 Apr 21;111(17):6131-8. Epub 2014 Apr 21.

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139.

With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.

Source
DOI: http://dx.doi.org/10.1073/pnas.1318948111
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4035993
April 2014

Ascertaining regions affected by GC-biased gene conversion through weak-to-strong mutational hotspots.

Genomics 2014 May-Jun;103(5-6):349-56. Epub 2014 Apr 13.

National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

A major objective for evolutionary biology is to identify regions affected by positive selection. High dN/dS values for proteins and accelerated lineage-specific substitution rates for non-coding regions are considered classic signatures of positive selection. However, these could also be the result of non-adaptive phenomena, such as GC-biased gene conversion (gBGC), which favors the fixation of strong (C/G) over weak (A/T) nucleotides. Recent estimates indicate that gBGC affected up to 20% of regions with signatures of positive selection. Here we evaluate the impact of gBGC through its molecular signature of weak-to-strong mutational hotspots. We implemented specific modifications to the test proposed by Tang and Lewontin (1999) for identifying regions of differential variability and applied it to regions previously investigated for the influence of gBGC. While we found significant agreement with previous reports, our results suggest a smaller influence of gBGC than previously estimated, warranting further development of methods for its detection.
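
The weak-to-strong signature mentioned in this abstract can be made concrete with a small sketch that classifies substitutions between an ancestral and a derived sequence. It only tallies substitution classes and does not implement the modified Tang and Lewontin test; the function names and the example sequences are hypothetical.

```python
from collections import Counter

WEAK = {"A", "T"}    # bases forming two hydrogen bonds
STRONG = {"C", "G"}  # bases forming three hydrogen bonds


def classify_substitution(ancestral: str, derived: str) -> str:
    """Label one aligned site as 'none', 'WS', 'SW', or 'other'."""
    a, d = ancestral.upper(), derived.upper()
    if a == d:
        return "none"
    if a in WEAK and d in STRONG:
        return "WS"  # weak-to-strong, the class favored by gBGC
    if a in STRONG and d in WEAK:
        return "SW"  # strong-to-weak
    return "other"   # W->W, S->S, or non-ACGT characters


def count_substitutions(ancestral_seq: str, derived_seq: str) -> Counter:
    """Tally substitution classes over two aligned, equal-length sequences."""
    if len(ancestral_seq) != len(derived_seq):
        raise ValueError("sequences must be aligned to the same length")
    return Counter(classify_substitution(a, d)
                   for a, d in zip(ancestral_seq, derived_seq))


# Hypothetical aligned sequences for illustration only.
print(count_substitutions("ATGCA", "GTGTA"))
```

A region with an excess of `WS` over `SW` changes, clustered in a short window, is the kind of pattern a hotspot test would then evaluate statistically.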

Source
DOI: http://dx.doi.org/10.1016/j.ygeno.2014.04.001
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4527313
January 2015

Computational analysis reveals a correlation of exon-skipping events with splicing, transcription and epigenetic factors.

Nucleic Acids Res 2014 Mar 24;42(5):2856-69. Epub 2013 Dec 24.

Department of Molecular Medicine, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA; Department of Molecular and Cellular Biochemistry and the Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA; Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA; Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Rockville, MD 20852, USA; and Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA.

Alternative splicing (AS), in higher eukaryotes, is one of the mechanisms of post-transcriptional regulation that generate multiple transcripts from the same gene. One particular mode of AS is the skipping event where an exon may be alternatively excluded or constitutively included in the resulting mature mRNA. Both transcript isoforms from this skipping event site, i.e. in which the exon is either included (inclusion isoform) or excluded (skipping isoform), are typically present in one cell, and maintain a subtle balance that is vital to cellular function and dynamics. However, how the prevailing conditions dictate which isoform is expressed and what biological factors might influence the regulation of this process remain areas requiring further exploration. In this study, we have developed a novel computational method, graph-based exon-skipping scanner (GESS), for de novo detection of skipping event sites from raw RNA-seq reads without prior knowledge of gene annotations, as well as for determining the dominant isoform generated from such sites. We have applied our method to publicly available RNA-seq data in GM12878 and K562 cells from the ENCODE consortium and experimentally validated several skipping site predictions by RT-PCR. Furthermore, we integrated other sequencing-based genomic data to investigate the impact of splicing activities, transcription factors (TFs) and epigenetic histone modifications on splicing outcomes. Our computational analysis found that splice sites within the skipping-isoform-dominated group (SIDG) tended to exhibit weaker MaxEntScan-calculated splice site strength around middle, 'skipping', exons compared to those in the inclusion-isoform-dominated group (IIDG). We further showed the positional preference pattern of splicing factors, characterized by enrichment in the intronic splice sites immediately bordering middle exons. 
Finally, our analysis suggested that different epigenetic factors may introduce a variable obstacle in the process of exon-intron boundary establishment leading to skipping events.

Source
DOI: http://dx.doi.org/10.1093/nar/gkt1338
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3950716
March 2014

Recurrent patterns of DNA methylation in the ZNF154, CASP8, and VHL promoters across a wide spectrum of human solid epithelial tumors and cancer cell lines.

Epigenetics 2013 Dec 22;8(12):1355-72. Epub 2013 Oct 22.

Genome Technology Branch; National Human Genome Research Institute; National Institutes of Health; Bethesda, MD USA.

The study of aberrant DNA methylation in cancer holds the key to the discovery of novel biological markers for diagnostics and can help to delineate important mechanisms of disease. We have identified 12 loci that are differentially methylated in serous ovarian cancers and endometrioid ovarian and endometrial cancers with respect to normal control samples. The strongest signal showed hypermethylation in tumors at a CpG island within the ZNF154 promoter. We show that hypermethylation of this locus is recurrent across solid human epithelial tumor samples for 15 of 16 distinct cancer types from TCGA. Furthermore, ZNF154 hypermethylation is strikingly present across a diverse panel of ENCODE cell lines, but only in those derived from tumor cells. By extending our analysis from the Illumina 27K Infinium platform to the 450K platform, to sequencing of PCR amplicons from bisulfite treated DNA, we demonstrate that hypermethylation extends across the breadth of the ZNF154 CpG island. We have also identified recurrent hypomethylation in two genomic regions associated with CASP8 and VHL. These three genes exhibit significant negative correlation between methylation and gene expression across many cancer types, as well as patterns of DNaseI hypersensitivity and histone marks that reflect different chromatin accessibility in cancer vs. normal cell lines. Our findings emphasize hypermethylation of ZNF154 as a biological marker of relevance for tumor identification. Epigenetic modifications affecting the promoters of ZNF154, CASP8, and VHL are shared across a vast array of tumor types and may therefore be important for understanding the genomic landscape of cancer.

Source
DOI: http://dx.doi.org/10.4161/epi.26701
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3933495
December 2013

Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma.

Proc Natl Acad Sci U S A 2013 Aug 30;110(33):13481-6. Epub 2013 Jul 30.

National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.

Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683-691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This mutation led to increased BCL2L12 mRNA and protein levels because of differential targeting of WT and mutant BCL2L12 by hsa-miR-671-5p. Protein made from mutant BCL2L12 transcript bound p53, inhibited UV-induced apoptosis more efficiently than WT BCL2L12, and reduced endogenous p53 target gene transcription. This report shows selection of a recurrent somatic synonymous mutation in cancer. Our data indicate that silent alterations have a role to play in human cancer, emphasizing the importance of their investigation in future cancer genome studies.

Source
DOI: http://dx.doi.org/10.1073/pnas.1304227110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3746936
August 2013

Bidirectional promoters as important drivers for the emergence of species-specific transcripts.

PLoS One 2013 27;8(2):e57323. Epub 2013 Feb 27.

DIR/GTB Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America.

The diversification of gene functions has been largely attributed to the process of gene duplication. Novel examples of genes originating from previously untranscribed regions have been recently described without regard to a unifying functional mechanism for their emergence. Here we propose a model mechanism that could generate a large number of lineage-specific novel transcripts in vertebrates through the activation of bidirectional transcription from unidirectional promoters. We examined this model in silico using human transcriptomic and genomic data and identified evidence consistent with the emergence of more than 1,000 primate-specific transcripts. These are transcripts with low coding potential and virtually no functional annotation. They initiate at less than 1 kb upstream of an oppositely transcribed conserved protein coding gene, in agreement with the generally accepted definition of bidirectional promoters. We found that the genomic regions upstream of ancestral promoters, where the novel transcripts in our dataset reside, are characterized by preferential accumulation of transposable elements. This enhances the sequence diversity of regions located upstream of ancestral promoters, further highlighting their evolutionary importance for the emergence of transcriptional novelties. By applying a newly developed test for positive selection to transposable element-derived fragments in our set of novel transcripts, we found evidence of adaptive evolution in the human lineage in nearly 3% of the novel transcripts in our dataset. These findings indicate that at least some novel transcripts could become functionally relevant, and thus highlight the evolutionary importance of promoters, through their capacity for bidirectional transcription, for the emergence of novel genes.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0057323
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3583895
September 2013

Unique alterations of an ultraconserved non-coding element in the 3'UTR of ZIC2 in holoprosencephaly.

PLoS One 2012 31;7(7):e39026. Epub 2012 Jul 31.

Medical Genetics Branch, National Human Genome Research Institute (NHGRI), National Institutes of Health, Bethesda, Maryland, United States of America.

Coding region alterations of ZIC2 are the second most common type of mutation in holoprosencephaly (HPE). Here we use several complementary bioinformatic approaches to identify ultraconserved cis-regulatory sequences potentially driving the expression of human ZIC2. We demonstrate that an 804 bp element in the 3' untranslated region (3'UTR) is highly conserved across the evolutionary history of vertebrates from fish to humans. Furthermore, we show that while genetic variation of this element is unexpectedly common among holoprosencephaly subjects (6/528 or >1%), it is not present in control individuals. Two of six proband-unique variants are de novo, supporting their pathogenic involvement in HPE outcomes. These findings support a general recommendation that the identification and analysis of key ultraconserved elements should be incorporated into the genetic risk assessment of holoprosencephaly cases.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0039026
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3409191
April 2013

Functional analysis of synonymous substitutions predicted to affect splicing of the CFTR gene.

J Cyst Fibros 2012 Dec 14;11(6):511-7. Epub 2012 May 14.

DIR/GTB Genomic Functional Analysis Section, National Human Genome Research Institute, NIH Rockville, MD 20852, USA.

Background: Cystic fibrosis is caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene. Over 1800 CFTR mutations have been reported, and about 12% of mutations are believed to impair pre-mRNA splicing. Given that several synthetic, non-splice-junction synonymous substitutions have been reported to alter splicing in CFTR, we predicted that naturally occurring synonymous substitutions may be erroneously classified as functionally neutral.

Methods: Computational tools were used to predict the effect of synonymous substitutions on CFTR pre-mRNA splicing. The functional consequences of selected substitutions were evaluated using a minigene splicing assay.

Results: Two synonymous mutations were shown to have a dramatic effect on CFTR pre-mRNA splicing, and consequently could alter protein integrity and phenotypic outcome.

Conclusions: Traditional methods of mutation analysis overlook splicing defects that occur at internal positions in coding exons, especially synonymous substitutions. We show that bioinformatics tools and minigene splicing assays are a potent combination to prioritize and identify mutations that cause aberrant CFTR pre-mRNA splicing.

Source
DOI: http://dx.doi.org/10.1016/j.jcf.2012.04.009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3440543
December 2012

Differential analysis of ovarian and endometrial cancers identifies a methylator phenotype.

PLoS One 2012 5;7(3):e32941. Epub 2012 Mar 5.

DIR/GTB Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America.

Despite improved outcomes in the past 30 years, less than half of all women diagnosed with epithelial ovarian cancer live five years beyond their diagnosis. Although typically treated as a single disease, epithelial ovarian cancer includes several distinct histological subtypes, such as papillary serous and endometrioid carcinomas. To address whether the morphological differences seen in these carcinomas represent distinct characteristics at the molecular level, we analyzed DNA methylation patterns in 11 papillary serous tumors, 9 endometrioid ovarian tumors, 4 normal fallopian tube samples and 6 normal endometrial tissues, plus 8 normal fallopian tube and 4 serous samples from TCGA. For comparison within the endometrioid subtype we added 6 primary uterine endometrioid tumors and 5 endometrioid metastases from uterus to ovary. Data were obtained from 27,578 CpG dinucleotides occurring in or near promoter regions of 14,495 genes. We identified 36 locations with significant increases or decreases in methylation in comparisons of serous tumors and normal fallopian tube samples. Moreover, unsupervised clustering techniques applied to all samples showed three major profiles comprising mostly normal samples, serous tumors, and endometrioid tumors including ovarian, uterine and metastatic origins. The clustering analysis identified 60 differentially methylated sites between the serous group and the normal group. An unrelated set of 25 serous tumors validated the reproducibility of the methylation patterns. In contrast, >1,000 genes were differentially methylated between endometrioid tumors and normal samples. This finding is consistent with a generalized regulatory disruption caused by a methylator phenotype. Through DNA methylation analyses we have identified genes with known roles in ovarian carcinoma etiology, whereas pathway analyses provided biological insight into the role of novel genes.
Our finding of differences between serous and endometrioid ovarian tumors indicates that intervention strategies could be developed to specifically address subtypes of epithelial ovarian cancer.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0032941
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3293923
July 2012
-->