Publications by authors named "Michiel van Galen"

12 Publications

Disease variants alter transcription factor levels and methylation of their binding sites.

Nat Genet 2017 Jan 5;49(1):131-138. Epub 2016 Dec 5.

Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands.

Most disease-associated genetic variants are noncoding, making it challenging to design experiments to understand their functional consequences. Identification of expression quantitative trait loci (eQTLs) has been a powerful approach to infer the downstream effects of disease-associated variants, but most of these variants remain unexplained. The analysis of DNA methylation, a key component of the epigenome, offers highly complementary data on the regulatory potential of genomic regions. Here we show that disease-associated variants have widespread effects on DNA methylation in trans that likely reflect differential occupancy of trans binding sites by cis-regulated transcription factors. Using multiple omics data sets from 3,841 Dutch individuals, we identified 1,907 established trait-associated SNPs that affect the methylation levels of 10,141 different CpG sites in trans (false discovery rate (FDR) < 0.05). These included SNPs that affect both the expression of a nearby transcription factor (such as NFKB1, CTCF and NKX2-3) and methylation of its respective binding site across the genome. Trans methylation QTLs effectively expose the downstream effects of disease-associated variants.

Source
DOI: http://dx.doi.org/10.1038/ng.3721
January 2017

Identification of context-dependent expression quantitative trait loci in whole blood.

Nat Genet 2017 Jan 5;49(1):139-145. Epub 2016 Dec 5.

University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, the Netherlands.

Genetic risk factors often localize to noncoding regions of the genome with unknown effects on disease etiology. Expression quantitative trait loci (eQTLs) help to explain the regulatory mechanisms underlying these genetic associations. Knowledge of the context that determines the nature and strength of eQTLs may help identify cell types relevant to pathophysiology and the regulatory networks underlying disease. Here we generated peripheral blood RNA-seq data from 2,116 unrelated individuals and systematically identified context-dependent eQTLs using a hypothesis-free strategy that does not require previous knowledge of the identity of the modifiers. Of the 23,060 significant cis-regulated genes (false discovery rate (FDR) ≤ 0.05), 2,743 (12%) showed context-dependent eQTL effects. The majority of these effects were influenced by cell type composition. A set of 145 cis-eQTLs depended on type I interferon signaling. Others were modulated by specific transcription factors binding to the eQTL SNPs.

Source
DOI: http://dx.doi.org/10.1038/ng.3737
January 2017

Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms.

Genome Biol 2016 Sep 22;17(1):191. Epub 2016 Sep 22.

Molecular Epidemiology section, Leiden University Medical Center, Einthovenweg 20, 2333 ZC, Leiden, The Netherlands.

Background: Epigenetic change is a hallmark of ageing but its link to ageing mechanisms in humans remains poorly understood. While DNA methylation at many CpG sites closely tracks chronological age, DNA methylation changes relevant to biological age are expected to gradually dissociate from chronological age, mirroring the increased heterogeneity in health status at older ages.

Results: Here, we report on the large-scale identification of 6366 age-related variably methylated positions (aVMPs) identified in 3295 whole blood DNA methylation profiles, 2044 of which have a matching RNA-seq gene expression profile. aVMPs are enriched at polycomb repressed regions and, accordingly, methylation at those positions is associated with the expression of genes encoding components of polycomb repressive complex 2 (PRC2) in trans. Further analysis revealed trans-associations for 1816 aVMPs with an additional 854 genes. These trans-associated aVMPs are characterized by either an age-related gain of methylation at CpG islands marked by PRC2 or a loss of methylation at enhancers. This distinct pattern extends to other tissues and multiple cancer types. Finally, genes associated with aVMPs in trans whose expression is variably upregulated with age (733 genes) play a key role in DNA repair and apoptosis, whereas downregulated aVMP-associated genes (121 genes) are mapped to defined pathways in cellular metabolism.

Conclusions: Our results link age-related changes in DNA methylation to fundamental mechanisms that are thought to drive human ageing.

Source
DOI: http://dx.doi.org/10.1186/s13059-016-1053-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5032245
September 2016

Blood lipids influence DNA methylation in circulating cells.

Genome Biol 2016 Jun 27;17(1):138. Epub 2016 Jun 27.

Molecular Epidemiology section, Leiden University Medical Center, Einthovenweg 20, Leiden, The Netherlands.

Background: Cells can be primed by external stimuli to obtain a long-term epigenetic memory. We hypothesize that long-term exposure to elevated blood lipids can prime circulating immune cells through changes in DNA methylation, a process that may contribute to the development of atherosclerosis. To interrogate the causal relationship between triglyceride, low-density lipoprotein (LDL) cholesterol, and high-density lipoprotein (HDL) cholesterol levels and genome-wide DNA methylation while excluding confounding and pleiotropy, we perform a stepwise Mendelian randomization analysis in whole blood of 3296 individuals.

Results: This analysis shows that differential methylation is the consequence of inter-individual variation in blood lipid levels and not vice versa. Specifically, we observe an effect of triglycerides on DNA methylation at three CpGs, of LDL cholesterol at one CpG, and of HDL cholesterol at two CpGs using multivariable Mendelian randomization. Using RNA-seq data available for a large subset of individuals (N = 2044), DNA methylation of these six CpGs is associated with the expression of CPT1A and SREBF1 (for triglycerides), DHCR24 (for LDL cholesterol) and ABCG1 (for HDL cholesterol), which are all key regulators of lipid metabolism.

Conclusions: Our analysis suggests a role for epigenetic priming in end-product feedback control of lipid metabolism and highlights Mendelian randomization as an effective tool to infer causal relationships in integrative genomics data.

Source
DOI: http://dx.doi.org/10.1186/s13059-016-1000-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4922056
June 2016

Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients.

PLoS One 2016 Jun 14;11(6):e0157381. Epub 2016 Jun 14.

Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.

Background and Aims: Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history.

Methods: Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants.

Results: Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1).

Conclusions: This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0157381
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4907507
July 2017

Determining the quality and complexity of next-generation sequencing data without a reference genome.

Genome Biol 2014;15(12):555

We describe an open-source kPAL package that facilitates an alignment-free assessment of the quality and comparability of sequencing datasets by analyzing k-mer frequencies. We show that kPAL can detect technical artefacts such as high duplication rates, library chimeras, contamination and differences in library preparation protocols. kPAL also successfully captures the complexity and diversity of microbiomes and provides a powerful means to study changes in microbial communities. Together, these features make kPAL an attractive and broadly applicable tool to determine the quality and comparability of sequence libraries even in the absence of a reference sequence. kPAL is freely available at https://github.com/LUMC/kPAL.
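As an illustration of the k-mer frequency analysis described in this abstract, the following minimal Python sketch counts k-mers in two read sets and compares their normalized profiles. It does not use kPAL's own interface: the function names `kmer_profile`, `normalized` and `profile_distance`, and the choice of Euclidean distance, are assumptions made for this example only.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(sequences, k):
    """Count every length-k window made only of A, C, G and T."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def normalized(counts, k):
    """Turn raw counts into frequencies over all 4**k possible k-mers."""
    total = sum(counts.values())
    freqs = {}
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        freqs[kmer] = counts[kmer] / total if total else 0.0
    return freqs


def profile_distance(counts_a, counts_b, k):
    """Euclidean distance between two normalized k-mer profiles
    (hypothetical choice of metric, not necessarily the one kPAL uses)."""
    fa = normalized(counts_a, k)
    fb = normalized(counts_b, k)
    return math.sqrt(sum((fa[kmer] - fb[kmer]) ** 2 for kmer in fa))
```

Two libraries with similar composition should give a distance close to zero, whereas contamination or a different library preparation would be expected to shift the profile and increase the distance. No reference genome is needed at any step, which is the point the abstract makes.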

Source
DOI: http://dx.doi.org/10.1186/s13059-014-0555-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4298064
August 2015

Exome sequencing of germline DNA from non-BRCA1/2 familial breast cancer cases selected on the basis of aCGH tumor profiling.

PLoS One 2013 Jan 31;8(1):e55734. Epub 2013 Jan 31.

Department of Human Genetics, Leiden University Medical Centre, Leiden, The Netherlands.

The bulk of familial breast cancer risk (∼70%) cannot be explained by mutations in the known predisposition genes, primarily BRCA1 and BRCA2. Underlying genetic heterogeneity in these cases is the probable explanation for the failure of all attempts to identify further high-risk alleles. While exome sequencing of non-BRCA1/2 breast cancer cases is a promising strategy to detect new high-risk genes, rational approaches to the rigorous pre-selection of cases are needed to reduce heterogeneity. We selected six families in which the tumours of multiple cases showed a specific genomic profile on array comparative genomic hybridization (aCGH). Linkage analysis in these families revealed a region on chromosome 4 with a LOD score of 2.49 under homogeneity. We then analysed the germline DNA of two patients from each family using exome sequencing. Initially focusing on the linkage region, no potentially pathogenic variants could be identified in more than one family. Variants outside the linkage region were then analysed, and we detected multiple possibly pathogenic variants in genes that encode DNA integrity maintenance proteins. However, further analysis led to the rejection of all variants due to poor co-segregation or a relatively high allele frequency in a control population. We concluded that using CGH results to focus on a sub-set of families for sequencing analysis did not enable us to identify a common genetic change responsible for the aggregation of breast cancer in these families. Our data also support the emerging view that non-BRCA1/2 hereditary breast cancer families have a very heterogeneous genetic basis.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0055734
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3561352
July 2013

Directed adenovirus evolution using engineered mutator viral polymerases.

Nucleic Acids Res 2011 Mar 7;39(5):e30. Epub 2010 Dec 7.

Department of Molecular Cell Biology, Leiden University Medical Center, Leiden, 2300 RC, The Netherlands.

Adenoviruses (Ads) are the most frequently used viruses for oncolytic and gene therapy purposes. Most Ad-based vectors have been generated through rational design. Although this led to significant vector improvements, it is often hampered by an insufficient understanding of Ad's intricate functions and interactions. Here, to evade this issue, we adopted a novel, mutator Ad polymerase-based, 'accelerated-evolution' approach that can serve as a general method to generate or optimize adenoviral vectors. First, we site-specifically substituted Ad polymerase residues located in either the nucleotide binding pocket or the exonuclease domain. This yielded several polymerase mutants that, while fully supportive of viral replication, increased Ad's intrinsic mutation rate. Mutator activities of these mutants were revealed by performing deep sequencing on pools of replicated viruses. The strongest identified mutators carried replacements of residues implicated in ssDNA binding at the exonuclease active site. Next, we exploited these mutators to generate the genetic diversity required for directed Ad evolution. Using this new forward genetics approach, we isolated viral mutants with improved cytolytic activity. These mutants revealed a common mutation in a splice acceptor site preceding the gene for the adenovirus death protein (ADP). Accordingly, the isolated viruses showed high and untimely expression of ADP, correlating with a severe deregulation of E3 transcript splicing.

Source
DOI: http://dx.doi.org/10.1093/nar/gkq1258
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3061072
March 2011

Experiences with array-based sequence capture; toward clinical applications.

Eur J Hum Genet 2011 Jan 24;19(1):50-5. Epub 2010 Nov 24.

Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.

Although sequencing of a human genome gradually becomes an option, zooming in on the region of interest remains attractive and cost saving. We performed array-based sequence capture using 385K Roche NimbleGen, Inc. arrays to zoom in on the protein-coding and immediate intron-flanking sequences of 112 genes, potentially involved in mental retardation and congenital malformation. Captured material was sequenced using Illumina technology. A data analysis pipeline was built that detects sequence variants, positions them in relation to the gene, checks for presence in databases (eg, dbSNP, the single-nucleotide polymorphism database) and predicts the potential consequences at the level of RNA splicing and protein translation. In the samples analyzed, all known variants were reliably detected, including pathogenic variants from control cases and SNPs derived from array experiments. Although overall coverage varied considerably, it was reproducible per region and facilitated the detection of large deletions and duplications (copy number variations), including a partial deletion in the B3GALTL gene from a patient sample. For ultimate diagnostic application, overall results need to be improved. Future arrays should contain probes from both DNA strands, and to obtain a more even coverage, one could add fewer probes from densely and more probes from sparsely covered regions.

Source
DOI: http://dx.doi.org/10.1038/ejhg.2010.145
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3039511
January 2011

Genome-wide assessment of differential roles for p300 and CBP in transcription regulation.

Nucleic Acids Res 2010 Sep 30;38(16):5396-408. Epub 2010 Apr 30.

Department of Molecular Cell Biology, Leiden University Medical Center, Postzone S4-0P, PO Box 9600, 2300 RC Leiden, The Netherlands.

Despite high levels of homology, transcription coactivators p300 and CREB binding protein (CBP) are both indispensable during embryogenesis. They are largely known to regulate the same genes. To identify genes preferentially regulated by p300 or CBP, we performed an extensive genome-wide survey using ChIP-seq on cell-cycle synchronized cells. We found that 57% of the tags were within genes or proximal promoters, with an overall preference for binding to transcription start and end sites. The heterogeneous binding patterns possibly reflect the divergent roles of CBP and p300 in transcriptional regulation. Most of the 16 103 genes were bound by both CBP and p300. However, after stimulation 89 and 1944 genes were preferentially bound by CBP or p300, respectively. Target genes were found to be primarily involved in the regulation of metabolic and developmental processes, and transcription, with CBP showing a stronger preference than p300 for genes active in negative regulation of transcription. Analysis of transcription factor binding sites suggests that CBP and p300 have many partners in common, but AP-1 and Serum Response Factor (SRF) appear to be more prominent in CBP-specific sequences, whereas AP-2 and SP1 are enriched in p300-specific targets. Taken together, our findings further elucidate the distinct roles of coactivators p300 and CBP in transcriptional regulation.

Source
DOI: http://dx.doi.org/10.1093/nar/gkq184
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2938195
September 2010

Deep sequencing to reveal new variants in pooled DNA samples.

Hum Mutat 2009 Dec;30(12):1703-12

Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.

We evaluated massive parallel sequencing and long-range PCR (LRP) for rare variant detection and allele frequency estimation in pooled DNA samples. Exons 2 to 16 of the MUTYH gene were analyzed in breast cancer patients with Illumina's (Solexa) technology. From a pool of 287 genomic DNA samples we generated a single LRP product, while the same LRP was performed on 88 individual samples and the resulting products then pooled. Concentrations of constituent samples were measured with fluorimetry for genomic DNA and high-resolution melting curve analysis (HR-MCA) for LRP products. Illumina sequencing results were compared to Sanger sequencing data of individual samples. Correlation between allele frequencies detected by both methods was poor in the first pool, presumably because the genomic samples amplified unequally in the LRP, due to DNA quality variability. In contrast, allele frequencies correlated well in the second pool, in which all expected alleles at a frequency of 1% and higher were reliably detected, plus the majority of singletons (0.6% allele frequency). We describe custom bioinformatics and statistics to optimize detection of rare variants and to estimate required sequencing depth. Our results provide directions for designing high-throughput analyses of candidate genes.
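The singleton frequency quoted in this abstract follows from the pool size: one variant allele among 88 diploid individuals is 1/176, or about 0.6%. The sketch below works through that arithmetic and estimates how many reads are needed to see such an allele at least a given number of times. It is not the custom statistics the authors describe; it assumes simple binomial sampling of reads with no sequencing error and equal representation of every sample, and the function names are hypothetical.

```python
from math import comb


def singleton_frequency(n_individuals, ploidy=2):
    """Frequency of an allele carried once in a pool of individuals."""
    return 1.0 / (ploidy * n_individuals)


def prob_at_least(depth, freq, min_reads):
    """P(at least min_reads variant reads) when each of `depth` reads
    independently carries the variant with probability `freq`."""
    p_fewer = sum(
        comb(depth, i) * freq ** i * (1.0 - freq) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


def required_depth(freq, min_reads, power=0.95, max_depth=100_000):
    """Smallest read depth that reaches the requested detection probability
    under the binomial assumption above."""
    for depth in range(max(min_reads, 1), max_depth + 1):
        if prob_at_least(depth, freq, min_reads) >= power:
            return depth
    raise ValueError("power not reached within max_depth")


# Example: depth needed to see a singleton from 88 pooled individuals
# in at least 5 reads with 95% probability.
depth_for_singleton = required_depth(singleton_frequency(88), min_reads=5)
```

Requiring several supporting reads rather than one is a common way to separate true rare variants from sequencing errors; a real design would also need an explicit error model, which this sketch leaves out.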

Source
DOI: http://dx.doi.org/10.1002/humu.21122
December 2009

CORE_TF: a user-friendly interface to identify evolutionary conserved transcription factor binding sites in sets of co-regulated genes.

BMC Bioinformatics 2008 Nov 26;9:495. Epub 2008 Nov 26.

The Center for Human and Clinical Genetics, Leiden University Medical Center, Postzone S4-0P, PO Box 9600, 2300 RC Leiden, The Netherlands.

Background: The identification of transcription factor binding sites is difficult since they are only a small number of nucleotides in size, resulting in large numbers of false positives and false negatives in current approaches. Computational methods to reduce false positives are to look for over-representation of transcription factor binding sites in a set of similarly regulated promoters or to look for conservation in orthologous promoter alignments.

Results: We have developed a novel tool, "CORE_TF" (Conserved and Over-REpresented Transcription Factor binding sites) that identifies common transcription factor binding sites in promoters of co-regulated genes. To improve upon existing binding site predictions, the tool searches for position weight matrices from the TRANSFAC® database that are over-represented in an experimental set compared to a random set of promoters and identifies cross-species conservation of the predicted transcription factor binding sites. The algorithm has been evaluated with expression and chromatin-immunoprecipitation on microarray data. We also implement and demonstrate the importance of matching the random set of promoters to the experimental promoters by GC content, which is a unique feature of our tool.

Conclusion: The program CORE_TF is accessible in a user-friendly web interface at http://www.LGTC.nl/CORE_TF. It provides a table of over-represented transcription factor binding sites in the promoters of the user's input genes and a graphical view of evolutionarily conserved transcription factor binding sites. In our test data sets it successfully predicts target transcription factors and their binding sites.
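The over-representation step described in this abstract can be illustrated with a one-sided hypergeometric test that asks whether a binding-site matrix hits more experimental promoters than expected given its hit rate in the random set. The abstract does not state which statistic CORE_TF uses, so the test below, along with the function names, is an assumption chosen for illustration and uses only the Python standard library.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def overrepresentation_p(exp_hits, exp_total, bg_hits, bg_total):
    """One-sided p-value that a matrix hits the experimental promoters
    more often than the random (background) promoters.

    exp_hits / exp_total: promoters with a hit / all promoters, experimental set
    bg_hits / bg_total:   the same counts for the random set
    """
    if not (0 <= exp_hits <= exp_total and 0 <= bg_hits <= bg_total):
        raise ValueError("hit counts must lie between 0 and the set size")
    return hypergeom_upper_tail(
        exp_hits,
        exp_total + bg_total,   # all promoters considered
        exp_hits + bg_hits,     # all promoters with a hit
        exp_total,              # promoters drawn into the experimental set
    )
```

For example, `overrepresentation_p(8, 10, 2, 10)` compares a matrix found in 8 of 10 experimental promoters with 2 of 10 random promoters. The comparison is only meaningful if the random set resembles the experimental set in base composition, which is why the abstract stresses matching the two sets by GC content.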

Source
DOI: http://dx.doi.org/10.1186/1471-2105-9-495
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2613159
November 2008