Publications by authors named "Daria V Zhernakova"

28 Publications

The Effect of Phenotype and Genotype on the Plasma Proteome in Patients with Inflammatory Bowel Disease.

J Crohns Colitis 2021 Sep 7. Epub 2021 Sep 7.

Department of Gastroenterology and Hepatology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.

Background and Aims: Protein profiling for diagnostic and therapeutic purposes is underexplored in patients with inflammatory bowel disease (IBD). This study analysed the association between phenotype, genotype and the plasma proteome in IBD.

Methods: Ninety-two (92) inflammation-related proteins were quantified in plasma of 1,028 patients with IBD (567 Crohn's disease [CD]; 461 ulcerative colitis [UC]) and 148 healthy individuals to assess protein-phenotype associations. Corresponding whole-exome sequencing and global screening array data of 919 patients with IBD were included to analyse the effect of genetics on protein levels (protein quantitative trait loci (pQTL) analysis). Intestinal mucosal RNA sequencing and fecal metagenomic data were used for complementary analyses.

Results: Thirty-two (32) proteins were differentially abundant between IBD and healthy individuals, of which 22 were independent of active inflammation. Sixty-nine (69) proteins were associated with 15 demographic and clinical factors. Fibroblast growth factor-19 levels were decreased in CD patients with ileal disease or a history of ileocecal resection. Thirteen novel cis-pQTLs were identified and 10 were replicated from previous studies. One trans-pQTL of the fucosyltransferase 2 (FUT2) gene (rs602662) and two independent cis-pQTLs of C-C motif chemokine 25 (CCL25) affected plasma CCL25 levels. Intestinal gene expression data revealed an overlapping cis-expression QTL (eQTL) variant (rs3745387) of the CCL25 gene. The FUT2 rs602662 trans-pQTL was associated with reduced abundances of fecal butyrate-producing bacteria.

Conclusions: This study shows that genotype and multiple disease phenotypes strongly associate with the plasma inflammatory proteome in IBD and identifies disease-associated pathways that may help to improve disease management in the future.

Source
http://dx.doi.org/10.1093/ecco-jcc/jjab157 (DOI)
September 2021

Draft de novo genome assembly of the elusive jaguarundi, Puma yagouaroundi.

J Hered 2021 Jun 19. Epub 2021 Jun 19.

ITMO University, Computer Technologies Laboratory, St. Petersburg, Russia.

The Puma lineage within the family Felidae consists of three species that last shared a common ancestor around 4.9 million years ago. Whole-genome sequences of two species from the lineage were previously reported: the cheetah (Acinonyx jubatus) and the mountain lion (Puma concolor). The present report describes a whole-genome assembly of the remaining species, the jaguarundi (Puma yagouaroundi). We sequenced the genome of a male jaguarundi with 10X Genomics linked reads and assembled the whole-genome sequence. The assembled genome contains a series of scaffolds that reach the length of chromosome arms and is similar in scaffold contiguity to the genome assemblies of cheetah and puma, with a contig N50 = 100.2 kbp and a scaffold N50 = 49.27 Mbp. We assessed the assembled sequence of the jaguarundi genome using BUSCO, aligned reads of the sequenced individual and another published female jaguarundi to the assembled genome, annotated protein-coding genes, repeats, genomic variants and their effects with respect to the protein-coding genes, and analyzed differences of the two jaguarundis from the reference mitochondrial genome. The jaguarundi genome assembly and its annotation were compared in quality, variants and features to the previously reported genome assemblies of puma and cheetah. Computational analyses used in the study were implemented in a transparent and reproducible way to allow their further reuse and modification.
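
For readers unfamiliar with the contiguity statistic quoted in the abstract above: N50 is the length L such that sequences of length L or longer together cover at least half of the total assembly. The following is a minimal Python sketch, assuming a plain list of contig or scaffold lengths. The function name `n50` is illustrative and is not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (not from the paper): walk the lengths from
    longest to shortest and return the first length at which the
    running total reaches at least half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    # Only reached when no lengths were supplied.
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # Toy scaffold lengths, not data from the jaguarundi assembly.
    print(n50([10, 9, 8, 7, 6, 5, 4, 3, 2]))
```

Applied separately to contig lengths and to scaffold lengths, the same function would give the two kinds of N50 values reported for an assembly.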

Source
http://dx.doi.org/10.1093/jhered/esab036 (DOI)
June 2021

Mendelian randomisation identifies alternative splicing of the FAS death receptor as a mediator of severe COVID-19.

medRxiv 2021 Apr 7. Epub 2021 Apr 7.

Severe COVID-19 is characterised by immunopathology and epithelial injury. Proteomic studies have identified circulating proteins that are biomarkers of severe COVID-19, but cannot distinguish correlation from causation. To address this, we performed Mendelian randomisation (MR) to identify proteins that mediate severe COVID-19. Using protein quantitative trait loci (pQTL) data from the SCALLOP consortium, involving meta-analysis of up to 26,494 individuals, and COVID-19 genome-wide association data from the Host Genetics Initiative, we performed MR for 157 COVID-19 severity protein biomarkers. We identified significant MR results for five proteins: FAS, TNFRSF10A, CCL2, EPHB4 and LGALS9. Further evaluation of these candidates using sensitivity analyses and colocalization testing provided strong evidence to implicate the apoptosis-associated cytokine receptor FAS as a causal mediator of severe COVID-19. This effect was specific to severe disease. Using RNA-seq data from 4,778 individuals, we demonstrate that the pQTL at the locus results from genetically influenced alternative splicing causing skipping of exon 6. We show that the risk allele for very severe COVID-19 increases the proportion of transcripts lacking exon 6, and thereby increases soluble FAS. Soluble FAS acts as a decoy receptor for FAS-ligand, inhibiting apoptosis induced through membrane-bound FAS. In summary, we demonstrate a novel genetic mechanism that contributes to risk of severe COVID-19, highlighting a pathway that may be a promising therapeutic target.
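
The abstract does not state which MR estimator was used. As a rough illustration only, two standard summary-statistic estimators are the Wald ratio for a single variant and the fixed-effect inverse-variance-weighted (IVW) combination across variants. The sketch below assumes independent variants and first-order standard errors that ignore uncertainty in the variant-exposure effect. All function and parameter names are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-variant MR estimate: outcome effect divided by exposure effect.

    The standard error is a first-order approximation that treats the
    variant-exposure effect (e.g. a pQTL effect) as known without error.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate.

    Assumes the variants are independent (not in linkage disequilibrium).
    """
    numerator = sum(
        bx * by / (sy * sy)
        for bx, by, sy in zip(betas_exposure, betas_outcome, ses_outcome)
    )
    denominator = sum(
        bx * bx / (sy * sy)
        for bx, sy in zip(betas_exposure, ses_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


if __name__ == "__main__":
    # Toy numbers, not values from the study.
    print(wald_ratio(0.5, 0.1, 0.02))
    print(ivw([0.5, 0.4], [0.1, 0.09], [0.02, 0.03]))
```

Estimates of this kind are usually followed by sensitivity analyses and colocalization testing, as the abstract describes, because a significant ratio alone does not rule out pleiotropy or confounding by linkage.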

Source
http://dx.doi.org/10.1101/2021.04.01.21254789 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8043484 (PMC)
April 2021

Large-scale association analyses identify host factors influencing human gut microbiome composition.

Nat Genet 2021 02 18;53(2):156-165. Epub 2021 Jan 18.

Department of Twin Research & Genetic Epidemiology, King's College London, London, UK.

To study the effect of host genetics on gut microbiome composition, the MiBioGen consortium curated and analyzed genome-wide genotypes and 16S fecal microbiome data from 18,340 individuals (24 cohorts). Microbial composition showed high variability across cohorts: only 9 of 410 genera were detected in more than 95% of samples. A genome-wide association study of host genetic variation regarding microbial taxa identified 31 loci affecting the microbiome at a genome-wide significant (P < 5 × 10) threshold. One locus, the lactase (LCT) gene locus, reached study-wide significance (genome-wide association study signal: P = 1.28 × 10), and it showed an age-dependent association with Bifidobacterium abundance. Other associations were suggestive (1.95 × 10 < P < 5 × 10) but enriched for taxa showing high heritability and for genes expressed in the intestine and brain. A phenome-wide association study and Mendelian randomization identified enrichment of microbiome trait loci in the metabolic, nutrition and environment domains and suggested the microbiome might have causal effects in ulcerative colitis and rheumatoid arthritis.

Source
http://dx.doi.org/10.1038/s41588-020-00763-1 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8515199 (PMC)
February 2021

Genomic and drug target evaluation of 90 cardiovascular proteins in 30,931 individuals.

Nat Metab 2020 10 16;2(10):1135-1148. Epub 2020 Oct 16.

SCALLOP consortium.

Circulating proteins are vital in human health and disease and are frequently used as biomarkers for clinical decision-making or as targets for pharmacological intervention. Here, we map and replicate protein quantitative trait loci (pQTL) for 90 cardiovascular proteins in over 30,000 individuals, resulting in 451 pQTLs for 85 proteins. For each protein, we further perform pathway mapping to obtain trans-pQTL gene and regulatory designations. We substantiate these regulatory findings with orthogonal evidence for trans-pQTLs using mouse knockdown experiments (ABCA1 and TRIB1) and clinical trial results (chemokine receptors CCR2 and CCR5), with consistent regulation. Finally, we evaluate known drug targets, and suggest new target candidates or repositioning opportunities using Mendelian randomization. This identifies 11 proteins with causal evidence of involvement in human disease that have not previously been targeted, including EGF, IL-16, PAPPA, SPON1, F3, ADM, CASP-8, CHI3L1, CXCL16, GDF15 and MMP-12. Taken together, these findings demonstrate the utility of large-scale mapping of the genetics of the proteome and provide a resource for future precision studies of circulating proteins in human health.

Source
http://dx.doi.org/10.1038/s42255-020-00287-2 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7611474 (PMC)
October 2020

Genetic and Microbial Associations to Plasma and Fecal Bile Acids in Obesity Relate to Plasma Lipids and Liver Fat Content.

Cell Rep 2020 10;33(1):108212

Department of Pediatrics, University of Groningen, University Medical Center Groningen, Groningen 9713AV, the Netherlands; Department of Laboratory Medicine, University of Groningen, University Medical Center Groningen, Groningen 9713AV, the Netherlands.

Bile acids (BAs) are implicated in the etiology of obesity-related conditions such as non-alcoholic fatty liver disease. Differently structured BA species display variable signaling activities via farnesoid X receptor (FXR) and Takeda G protein-coupled BA receptor 1 (TGR5). This study profiles plasma and fecal BAs and plasma 7α-hydroxy-4-cholesten-3-one (C4) in 297 persons with obesity, identifies underlying genetic and microbial determinants, and establishes BA correlations with liver fat and plasma lipid parameters. We identify 27 genetic associations (p < 5 × 10) and 439 microbial correlations (FDR < 0.05) for 50 BA entities. Additionally, we report 111 correlations between BA and 88 lipid parameters (FDR < 0.05), mainly for C4 reflecting hepatic BA synthesis. Inter-individual variability in the plasma BA profile does not reflect hepatic BA synthetic pathways, but rather transport and metabolism within the enterohepatic circulation. Our study reveals genetic and microbial determinants of BAs in obesity and their relationship to disease-relevant lipid parameters that are important for the design of personalized therapies targeting BA-signaling pathways.

Source
http://dx.doi.org/10.1016/j.celrep.2020.108212 (DOI)
October 2020

New Gene Variants Associated with the Risk of Chronic HBV Infection.

Virol Sin 2020 Aug 15;35(4):378-387. Epub 2020 Apr 15.

Department of Infectious Diseases, Peking University First Hospital, Beijing, 100034, China.

Some patients with chronic hepatitis B virus (HBV) infection fail to clear HBV even though they persistently continue to produce antibodies to HBV. Here we performed a two-stage genome-wide association study in a cohort of Chinese patients designed to discover single nucleotide variants that associate with HBV infection and clearance of HBV. The first stage involved genome-wide exome sequencing of 101 cases (HBsAg plus anti-HBs positive) compared with 102 control patients (anti-HBs positive, HBsAg negative). Over 80% of individual sequences displayed 20× sequence coverage. Reads containing adapters, more than 10% uncertain bases or more than 50% low-quality base calls were filtered out, and the remaining reads were compared to the human reference genome hg19. In the second stage, 579 chronic HBV infected cases and 439 HBV clearance controls were sequenced with selected genes from the first stage. Although there were no significant associated gene variants in the first stage, two significant gene associations were discovered when the two stages were assessed in a combined analysis. One association showed that the rs506121-T allele [within the dedicator of cytokinesis 8 (DOCK8) gene] was higher in the chronic HBV infection group than in the clearance group (P = 0.002, OR = 0.77, 95% CI [0.65, 0.91]). The second association involved the rs2071676-A allele within the carbonic anhydrase (CA9) gene, which was significantly elevated in the chronic HBV infection group compared to the clearance group (P = 0.0003, OR = 1.35, 95% CI [1.15, 1.58]). Upon replication, these associations would suggest DOCK8 and CA9 as potential genetic risk factors in the persistence of HBV infection.

Source
http://dx.doi.org/10.1007/s12250-020-00200-x (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7462954 (PMC)
August 2020

Genome-wide sequence analyses of ethnic populations across Russia.

Genomics 2020 01 19;112(1):442-458. Epub 2019 Mar 19.

Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russian Federation.

The Russian Federation is the largest and one of the most ethnically diverse countries in the world; however, no centralized reference database of genetic variation exists to date. Such data are crucial for medical genetics and essential for studying population history. The Genome Russia Project aims at filling this gap by performing whole genome sequencing and analysis of peoples of the Russian Federation. Here we report the characterization of genome-wide variation of 264 healthy adults, including 60 newly sequenced samples. People of Russia carry known and novel genetic variants of adaptive, clinical and functional consequence that in many cases show allele frequency divergence from neighboring populations. Population genetics analyses revealed six phylogeographic partitions among indigenous ethnicities corresponding to their geographic locales. This study presents a characterization of population-specific genomic variation in Russia with results important for medical genetics and for understanding the dynamic population history of the world's largest country.

Source
http://dx.doi.org/10.1016/j.ygeno.2019.03.007 (DOI)
January 2020

Author Correction: Individual variations in cardiovascular-disease-related protein levels are driven by genetics and gut microbiome.

Nat Genet 2018 12;50(12):1752

Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.

In the version of this paper originally published, there was a typographical error. In the Discussion, the sentence "In line with this, Ep-CAM-deficient mice exhibited increased intestinal permeability and decreased ion transport, which may contribute to CVD susceptibility risk" originally read iron instead of ion transport. This error has been corrected in the HTML, PDF and print versions of the article.

Source
http://dx.doi.org/10.1038/s41588-018-0275-9 (DOI)
December 2018

Individual variations in cardiovascular-disease-related protein levels are driven by genetics and gut microbiome.

Nat Genet 2018 11 24;50(11):1524-1532. Epub 2018 Sep 24.

Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.

Despite a growing body of evidence, the role of the gut microbiome in cardiovascular diseases is still unclear. Here, we present a systems-genome-wide and metagenome-wide association study on plasma concentrations of 92 cardiovascular-disease-related proteins in the population cohort LifeLines-DEEP. We identified genetic components for 73 proteins and microbial associations for 41 proteins, of which 31 were associated to both. The genetic and microbial factors identified mostly exert additive effects and collectively explain up to 76.6% of inter-individual variation (17.5% on average). Genetics contribute most to concentrations of immune-related proteins, while the gut microbiome contributes most to proteins involved in metabolism and intestinal health. We found several host-microbe interactions that impact proteins involved in epithelial function, lipid metabolism, and central nervous system function. This study provides important evidence for a joint genetic and microbial effect in cardiovascular disease and provides directions for future applications in personalized medicine.

Source
http://dx.doi.org/10.1038/s41588-018-0224-7 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6241851 (PMC)
November 2018

Analytical "bake-off" of whole genome sequencing quality for the Genome Russia project using a small cohort for autoimmune hepatitis.

PLoS One 2018 11;13(7):e0200423. Epub 2018 Jul 11.

Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russian Federation.

A comparative analysis of whole genome sequencing (WGS) and genotype calling was initiated for ten human genome samples sequenced by St. Petersburg State University Peterhof Sequencing Center and by three commercial sequencing centers outside of Russia. The sequence quality and the efficiency of DNA variant and genotype calling were compared with each other and with DNA microarrays for each of ten study subjects. We assessed calling of SNPs, indels, copy number variation, and the speed of WGS throughput promised. Twenty separate QC analyses showed high similarities among the sequence quality and called genotypes. The ten genomes tested by the centers included eight American patients afflicted with autoimmune hepatitis (AIH), plus one case's unaffected parents, in a prelude to discovering genetic influences in this rare disease of unknown etiology. The detailed internal replication and parallel analyses allowed the observation of two of eight AIH cases carrying a rare allele genotype for a previously described AIH-associated gene (FTCD), plus multiple occurrences of known HLA-DRB1 alleles associated with AIH (HLA-DRB1-03:01:01, 13:01:01 and 7:01:01). We also list putative SNVs in other genes that may influence AIH.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0200423 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6040705 (PMC)
January 2019

A SNP panel for identification of DNA and RNA specimens.

BMC Genomics 2018 01 25;19(1):90. Epub 2018 Jan 25.

Department of Human Genetics, Leiden University Medical Center, Postzone S4-P, PO Box 9600, 2300 RC, Leiden, The Netherlands.

Background: SNP panels that uniquely identify an individual are useful for genetic and forensic research. Previously recommended SNP panels are based on DNA profiles and mostly contain intragenic SNPs. With the increasing interest in RNA expression profiles, we aimed to establish a SNP panel for both DNA and RNA-based genotyping.

Results: To determine a small set of SNPs with maximally discriminative power, genotype calls were obtained from DNA and blood-derived RNA sequencing data belonging to healthy, geographically dispersed, Dutch individuals. SNPs were selected based on criteria such as genotype call rate, minor allele frequency, Hardy-Weinberg equilibrium and linkage disequilibrium. A panel of 50 SNPs was sufficient to identify an individual uniquely: the probability of identity was 6.9 × 10 when assuming no family relations and 1.2 × 10 when accounting for the presence of full sibs. The ability of the SNP panel to uniquely identify individuals on DNA and RNA level was validated in an independent population dataset. The panel is applicable to individuals of European descent, with slightly lower power in non-Europeans. Whereas most of the genes containing the 50 SNPs are expressed in various tissues, our SNP panel needs optimization for tissues other than blood.

Conclusions: This first DNA/RNA SNP panel will be useful to identify sample mix-ups in biomedical research and for assigning DNA and RNA stains at crime scenes to unique individuals.
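
The abstract does not give the formula behind its probability-of-identity figures. As a hedged illustration, the sketch below uses the commonly cited expressions for a biallelic SNP under Hardy-Weinberg equilibrium, for unrelated individuals and for full sibs, and multiplies across loci on the assumption that the SNPs are independent. The function names and the example allele frequencies are hypothetical and are not taken from the paper.

```python
import math


def pi_unrelated(p):
    """Probability that two unrelated individuals share a genotype at one
    biallelic SNP with allele frequencies p and 1 - p, assuming
    Hardy-Weinberg equilibrium: p^4 + (2pq)^2 + q^4.
    """
    q = 1.0 - p
    return p**4 + (2.0 * p * q) ** 2 + q**4


def pi_sibs(p):
    """Probability that two full sibs share a genotype at one biallelic SNP:
    0.25 + 0.5*S2 + 0.5*S2^2 - 0.25*S4, where S2 and S4 are the sums of
    squared and fourth-power allele frequencies.
    """
    q = 1.0 - p
    s2 = p**2 + q**2
    s4 = p**4 + q**4
    return 0.25 + 0.5 * s2 + 0.5 * s2**2 - 0.25 * s4


def panel_probability(freqs, per_locus):
    """Multiply per-locus probabilities, assuming independent loci."""
    return math.prod(per_locus(p) for p in freqs)


if __name__ == "__main__":
    # Hypothetical panel of 50 SNPs, each with allele frequency 0.5.
    freqs = [0.5] * 50
    print(panel_probability(freqs, pi_unrelated))
    print(panel_probability(freqs, pi_sibs))
```

The per-locus probability is lowest when both alleles are equally common, which is one reason selection criteria such as minor allele frequency and linkage disequilibrium matter when choosing a small panel.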

Source
http://dx.doi.org/10.1186/s12864-018-4482-7 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5785835 (PMC)
January 2018

Disease variants alter transcription factor levels and methylation of their binding sites.

Nat Genet 2017 01 5;49(1):131-138. Epub 2016 Dec 5.

Department of Epidemiology, ErasmusMC, Rotterdam, the Netherlands.

Most disease-associated genetic variants are noncoding, making it challenging to design experiments to understand their functional consequences. Identification of expression quantitative trait loci (eQTLs) has been a powerful approach to infer the downstream effects of disease-associated variants, but most of these variants remain unexplained. The analysis of DNA methylation, a key component of the epigenome, offers highly complementary data on the regulatory potential of genomic regions. Here we show that disease-associated variants have widespread effects on DNA methylation in trans that likely reflect differential occupancy of trans binding sites by cis-regulated transcription factors. Using multiple omics data sets from 3,841 Dutch individuals, we identified 1,907 established trait-associated SNPs that affect the methylation levels of 10,141 different CpG sites in trans (false discovery rate (FDR) < 0.05). These included SNPs that affect both the expression of a nearby transcription factor (such as NFKB1, CTCF and NKX2-3) and methylation of its respective binding site across the genome. Trans methylation QTLs effectively expose the downstream effects of disease-associated variants.

Source
http://dx.doi.org/10.1038/ng.3721 (DOI)
January 2017

Identification of context-dependent expression quantitative trait loci in whole blood.

Nat Genet 2017 01 5;49(1):139-145. Epub 2016 Dec 5.

Molecular Epidemiology Section, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, the Netherlands.

Genetic risk factors often localize to noncoding regions of the genome with unknown effects on disease etiology. Expression quantitative trait loci (eQTLs) help to explain the regulatory mechanisms underlying these genetic associations. Knowledge of the context that determines the nature and strength of eQTLs may help identify cell types relevant to pathophysiology and the regulatory networks underlying disease. Here we generated peripheral blood RNA-seq data from 2,116 unrelated individuals and systematically identified context-dependent eQTLs using a hypothesis-free strategy that does not require previous knowledge of the identity of the modifiers. Of the 23,060 significant cis-regulated genes (false discovery rate (FDR) ≤ 0.05), 2,743 (12%) showed context-dependent eQTL effects. The majority of these effects were influenced by cell type composition. A set of 145 cis-eQTLs depended on type I interferon signaling. Others were modulated by specific transcription factors binding to the eQTL SNPs.

Source
http://dx.doi.org/10.1038/ng.3737 (DOI)
January 2017

Genome-wide analysis identifies 12 loci influencing human reproductive behavior.

Nat Genet 2016 12 31;48(12):1462-1472. Epub 2016 Oct 31.

Department of Internal Medicine, Erasmus Medical Center, Rotterdam, the Netherlands.

The genetic architecture of human reproductive behavior, specifically age at first birth (AFB) and number of children ever born (NEB), has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified, and the underlying mechanisms of AFB and NEB are poorly understood. We report a large genome-wide association study of both sexes including 251,151 individuals for AFB and 343,072 individuals for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study and 4 additional loci associated in a gene-based effort. These loci harbor genes that are likely to have a role, either directly or by affecting non-local gene expression, in human reproduction and infertility, thereby increasing understanding of these complex traits.

Source
http://dx.doi.org/10.1038/ng.3698 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5695684 (PMC)
December 2016

The effect of host genetics on the gut microbiome.

Nat Genet 2016 11 3;48(11):1407-1412. Epub 2016 Oct 3.

University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, the Netherlands.

The gut microbiome is affected by multiple factors, including genetics. In this study, we assessed the influence of host genetics on microbial species, pathways and gene ontology categories, on the basis of metagenomic sequencing in 1,514 subjects. In a genome-wide analysis, we identified associations of 9 loci with microbial taxonomies and 33 loci with microbial pathways and gene ontology terms at P < 5 × 10. Additionally, in a targeted analysis of regions involved in complex diseases, innate and adaptive immunity, or food preferences, 32 loci were identified at the suggestive level of P < 5 × 10. Most of our reported associations are new, including genome-wide significance for the C-type lectin molecules CLEC4F-CD207 at 2p13.3 and CLEC4A-FAM90A1 at 12p13. We also identified association of a functional LCT SNP with the Bifidobacterium genus (P = 3.45 × 10) and provide evidence of a gene-diet interaction in the regulation of Bifidobacterium abundance. Our results demonstrate the importance of understanding host-microbe interactions to gain better insight into human health.

Source
http://dx.doi.org/10.1038/ng.3663 (DOI)
November 2016

Age-related accrual of methylomic variability is linked to fundamental ageing mechanisms.

Genome Biol 2016 Sep 22;17(1):191. Epub 2016 Sep 22.

Department of Internal Medicine and School for Cardiovascular Diseases (CARIM), Maastricht University Medical Center, Universiteitssingel 50, 6229 ER, Maastricht, The Netherlands.

Background: Epigenetic change is a hallmark of ageing but its link to ageing mechanisms in humans remains poorly understood. While DNA methylation at many CpG sites closely tracks chronological age, DNA methylation changes relevant to biological age are expected to gradually dissociate from chronological age, mirroring the increased heterogeneity in health status at older ages.

Results: Here, we report on the large-scale identification of 6366 age-related variably methylated positions (aVMPs) identified in 3295 whole blood DNA methylation profiles, 2044 of which have a matching RNA-seq gene expression profile. aVMPs are enriched at polycomb repressed regions and, accordingly, methylation at those positions is associated with the expression of genes encoding components of polycomb repressive complex 2 (PRC2) in trans. Further analysis revealed trans-associations for 1816 aVMPs with an additional 854 genes. These trans-associated aVMPs are characterized by either an age-related gain of methylation at CpG islands marked by PRC2 or a loss of methylation at enhancers. This distinct pattern extends to other tissues and multiple cancer types. Finally, genes associated with aVMPs in trans whose expression is variably upregulated with age (733 genes) play a key role in DNA repair and apoptosis, whereas downregulated aVMP-associated genes (121 genes) are mapped to defined pathways in cellular metabolism.

Conclusions: Our results link age-related changes in DNA methylation to fundamental mechanisms that are thought to drive human ageing.

Source
http://dx.doi.org/10.1186/s13059-016-1053-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5032245 (PMC)
September 2016

Blood lipids influence DNA methylation in circulating cells.

Genome Biol 2016 06 27;17(1):138. Epub 2016 Jun 27.

Department of Genetics, University of Groningen, University Medical Centre Groningen, Broerstraat 5, Groningen, The Netherlands.

Background: Cells can be primed by external stimuli to obtain a long-term epigenetic memory. We hypothesize that long-term exposure to elevated blood lipids can prime circulating immune cells through changes in DNA methylation, a process that may contribute to the development of atherosclerosis. To interrogate the causal relationship between triglyceride, low-density lipoprotein (LDL) cholesterol, and high-density lipoprotein (HDL) cholesterol levels and genome-wide DNA methylation while excluding confounding and pleiotropy, we perform a stepwise Mendelian randomization analysis in whole blood of 3296 individuals.

Results: This analysis shows that differential methylation is the consequence of inter-individual variation in blood lipid levels and not vice versa. Specifically, we observe an effect of triglycerides on DNA methylation at three CpGs, of LDL cholesterol at one CpG, and of HDL cholesterol at two CpGs using multivariable Mendelian randomization. Using RNA-seq data available for a large subset of individuals (N = 2044), DNA methylation of these six CpGs is associated with the expression of CPT1A and SREBF1 (for triglycerides), DHCR24 (for LDL cholesterol) and ABCG1 (for HDL cholesterol), which are all key regulators of lipid metabolism.

Conclusions: Our analysis suggests a role for epigenetic priming in end-product feedback control of lipid metabolism and highlights Mendelian randomization as an effective tool to infer causal relationships in integrative genomics data.

Source
http://dx.doi.org/10.1186/s13059-016-1000-6 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4922056 (PMC)
June 2016

Refined mapping of autoimmune disease associated genetic variants with gene expression suggests an important role for non-coding RNAs.

J Autoimmun 2016 Apr 18;68:62-74. Epub 2016 Feb 18.

University of Groningen, University Medical Centre Groningen, Department of Genetics, Groningen, 9700 RB, The Netherlands.

Genome-wide association and fine-mapping studies in 14 autoimmune diseases (AID) have implicated more than 250 loci in one or more of these diseases. As more than 90% of AID-associated SNPs are intergenic or intronic, pinpointing the causal genes is challenging. We performed a systematic analysis to link 460 SNPs that are associated with 14 AID to causal genes using transcriptomic data from 629 blood samples. We were able to link 71 (39%) of the AID-SNPs to two or more nearby genes, providing evidence that multiple causal genes exist for some of the AID loci. While 54 of the AID loci are shared by one or more AID, 17% of them do not share candidate causal genes. In addition to finding novel genes such as ULK3, we also implicate novel disease mechanisms and pathways such as autophagy in celiac disease pathogenesis. Furthermore, 42 of the AID SNPs specifically affected the expression of 53 non-coding RNA genes. To further understand how the non-coding genome contributes to AID, the SNPs were linked to functional regulatory elements, which suggests a model where AID genes are regulated by a network of chromatin looping and non-coding RNA interactions. The looping model also explains how a causal candidate gene is not necessarily the gene closest to the AID SNP, which was the case in nearly 50% of cases.

Source
http://dx.doi.org/10.1016/j.jaut.2016.01.002 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5391837 (PMC)
April 2016

Functional implications of disease-specific variants in loci jointly associated with coeliac disease and rheumatoid arthritis.

Hum Mol Genet 2016 Jan 5;25(1):180-90. Epub 2015 Nov 5.

Department of Genetics, University Medical Centre Groningen, University of Groningen, Groningen, The Netherlands.

Hundreds of genomic loci have been associated with a significant number of immune-mediated diseases, and a large proportion of these associated loci are shared among traits. Both the molecular mechanisms by which these loci confer disease susceptibility and the extent to which shared loci are implicated in a common pathogenesis are unknown. We therefore sought to dissect the functional components at loci shared between two autoimmune diseases: coeliac disease (CeD) and rheumatoid arthritis (RA). We used a cohort of 12 381 CeD cases and 7827 controls, and another cohort of 13 819 RA cases and 12 897 controls, all genotyped with the Immunochip platform. In the joint analysis, we replicated 19 previously identified loci shared by CeD and RA and discovered five new non-HLA loci shared by CeD and RA. Our fine-mapping results indicate that in nine of 24 shared loci the associated variants are distinct in the two diseases. Using cell-type-specific histone markers, we observed that loci which pointed to the same variants in both diseases were enriched for marks of promoters active in CD14+ and CD34+ immune cells (P < 0.001), while loci pointing to distinct variants in one of the two diseases showed enrichment for marks of more specialized cell types, like CD4+ regulatory T cells in CeD (P < 0.0001) compared with Th17 and CD15+ in RA (P = 0.0029).

Source
DOI: http://dx.doi.org/10.1093/hmg/ddv455
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4690494
January 2016

An integrative systems genetics approach reveals potential causal genes and pathways related to obesity.

Genome Med 2015 Oct 20;7:105. Epub 2015 Oct 20.

Department of Veterinary Clinical and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 7, 1870, Frederiksberg C, Denmark.

Background: Obesity is a multi-factorial health problem in which genetic factors play an important role. Limited results have been obtained in single-gene studies using either genomic or transcriptomic data. RNA sequencing technology has shown its potential in gaining accurate knowledge about the transcriptome, and may reveal novel genes affecting complex diseases. Integration of genomic and transcriptomic variation (expression quantitative trait loci [eQTL] mapping) has identified causal variants that affect complex diseases. We integrated transcriptomic data from adipose tissue and genomic data from a porcine model to investigate the mechanisms involved in obesity using a systems genetics approach.

Methods: Using a selective gene expression profiling approach, we selected 36 animals based on a previously created genomic Obesity Index for RNA sequencing of subcutaneous adipose tissue. Differential expression analysis was performed using the Obesity Index as a continuous variable in a linear model. eQTL mapping was then performed to integrate 60 K porcine SNP chip data with the RNA sequencing data. Results were restricted based on genome-wide significant single nucleotide polymorphisms, detected differentially expressed genes, and previously detected co-expressed gene modules. Further data integration was performed by detecting co-expression patterns among eQTLs and integration with protein data.

Results: Differential expression analysis of RNA sequencing data revealed 458 differentially expressed genes. The eQTL mapping resulted in 987 cis-eQTLs and 73 trans-eQTLs (false discovery rate < 0.05), of which the cis-eQTLs were associated with metabolic pathways. We reduced the eQTL search space by focusing on differentially expressed and co-expressed genes and disease-associated single nucleotide polymorphisms to detect obesity-related genes and pathways. Building a co-expression network using eQTLs resulted in the detection of a module strongly associated with lipid pathways. Furthermore, we detected several obesity candidate genes, for example, ENPP1, CTSL, and ABHD12B.

Conclusions: To our knowledge, this is the first study to perform an integrated genomics and transcriptomics (eQTL) study using, and modeling, genomic and subcutaneous adipose tissue RNA sequencing data on obesity in a porcine model. We detected several pathways and potential causal genes for obesity. Further validation and investigation may reveal their exact function and association with obesity.

Source
DOI: http://dx.doi.org/10.1186/s13073-015-0229-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4617184
October 2015

Calling genotypes from public RNA-sequencing data enables identification of genetic variants that affect gene-expression levels.

Genome Med 2015 27;7(1):30. Epub 2015 Mar 27.

University of Groningen, University Medical Center Groningen, Department of Genetics, 9700 RB Groningen, The Netherlands.

Background: RNA-sequencing (RNA-seq) is a powerful technique for the identification of genetic variants that affect gene-expression levels, either through expression quantitative trait locus (eQTL) mapping or through allele-specific expression (ASE) analysis. Given increasing numbers of RNA-seq samples in the public domain, we here studied to what extent eQTLs and ASE effects can be identified when using public RNA-seq data while deriving the genotypes from the RNA-sequencing reads themselves.

Methods: We downloaded the raw reads for all available human RNA-seq datasets. Using these reads we performed gene expression quantification. All samples were jointly normalized and subjected to a strict quality control. We also derived genotypes using the RNA-seq reads and used imputation to infer non-coding variants. This allowed us to perform eQTL mapping and ASE analyses jointly on all samples that passed quality control. Our results were validated using samples for which DNA-seq genotypes were available.

Results: 4,978 public human RNA-seq runs, representing many different tissues and cell types, passed quality control. Even though these data originated from many different laboratories, samples reflecting the same cell type clustered together, suggesting that technical biases due to different sequencing protocols are limited. In a joint analysis on the 1,262 samples with high-quality genotypes, we identified cis-eQTL effects for 8,034 unique genes (at a false discovery rate ≤0.05). eQTL mapping on individual tissues revealed that a limited number of samples already suffices to identify tissue-specific eQTLs for known disease-associated genetic variants. Additionally, we observed strong ASE effects for 34 rare pathogenic variants, corroborating previously observed effects on the corresponding protein levels.

Conclusions: By deriving and imputing genotypes from RNA-seq data, it is possible to identify both eQTLs and ASE effects. Given the exponential growth of the number of publicly available RNA-seq samples, we expect this approach will become especially relevant for studying the effects of tissue-specific and rare pathogenic genetic variants to aid clinical interpretation of exome and genome sequencing.

Source
DOI: http://dx.doi.org/10.1186/s13073-015-0152-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4423486
May 2015

Expression profiles of long non-coding RNAs located in autoimmune disease-associated regions reveal immune cell-type specificity.

Genome Med 2014 28;6(10):88. Epub 2014 Oct 28.

Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.

Background: Although genome-wide association studies (GWAS) have identified hundreds of variants associated with a risk for autoimmune and immune-related disorders (AID), our understanding of the disease mechanisms is still limited. In particular, more than 90% of the risk variants lie in non-coding regions, and almost 10% of these map to long non-coding RNA transcripts (lncRNAs). lncRNAs are known to show more cell-type specificity than protein-coding genes.

Methods: We aimed to characterize lncRNAs and protein-coding genes located in loci associated with nine AIDs which have been well-defined by Immunochip analysis and by transcriptome analysis across seven populations of peripheral blood leukocytes (granulocytes, monocytes, natural killer (NK) cells, B cells, memory T cells, naive CD4(+) and naive CD8(+) T cells) and four populations of cord blood-derived T-helper cells (precursor, primary, and polarized (Th1, Th2) T-helper cells).

Results: We show that lncRNAs mapping to loci shared between AID are significantly enriched in immune cell types compared to lncRNAs from the whole genome (α <0.005). We were not able to prioritize single cell types relevant for specific diseases, but we observed five different cell types enriched (α <0.005) in five AID (NK cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, and psoriasis; memory T and CD8(+) T cells in juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis; Th0 and Th2 cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis). Furthermore, we show that co-expression analyses of lncRNAs and protein-coding genes can predict the signaling pathways in which these AID-associated lncRNAs are involved.

Conclusions: The observed enrichment of lncRNA transcripts in AID loci implies lncRNAs play an important role in AID etiology and suggests that lncRNA genes should be studied in more detail to interpret GWAS findings correctly. The co-expression results strongly support a model in which the lncRNA and protein-coding genes function together in the same pathways.

Source
DOI: http://dx.doi.org/10.1186/s13073-014-0088-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4240855
November 2014

Identification of co-expression gene networks, regulatory genes and pathways for obesity based on adipose tissue RNA Sequencing in a porcine model.

BMC Med Genomics 2014 Sep 30;7:57. Epub 2014 Sep 30.

Department of Veterinary Clinical and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 7, 1870, Frederiksberg, Denmark.

Background: Obesity is a complex metabolic condition in strong association with various diseases, like type 2 diabetes, resulting in major public health and economic implications. Obesity is the result of environmental and genetic factors and their interactions, including genome-wide genetic interactions. Identification of co-expressed and regulatory genes in RNA extracted from relevant tissues representing lean and obese individuals provides an entry point for the identification of genes and pathways of importance to the development of obesity. The pig, an omnivorous animal, is an excellent model for human obesity, offering the possibility to study in-depth organ-level transcriptomic regulations of obesity, unfeasible in humans. Our aim was to reveal adipose tissue co-expression networks, pathways and transcriptional regulations of obesity using RNA Sequencing based systems biology approaches in a porcine model.

Methods: We selected 36 animals for RNA Sequencing from a previously created F2 pig population representing three extreme groups based on their predicted genetic risks for obesity. We applied Weighted Gene Co-expression Network Analysis (WGCNA) to detect clusters of highly co-expressed genes (modules). Additionally, regulator genes were detected using Lemon-Tree algorithms.

Results: WGCNA revealed five modules which were strongly correlated with at least one obesity-related phenotype (correlations ranging from -0.54 to 0.72, P < 0.001). Functional annotation identified pathways that shed light on the association between obesity and other diseases, like osteoporosis (osteoclast differentiation, P = 1.4E-7), and immune-related complications (e.g. Natural killer cell mediated cytotoxicity, P = 3.8E-5; B cell receptor signaling pathway, P = 7.2E-5). Lemon-Tree identified three potential regulator genes, using confidence scores, for the WGCNA module which was associated with osteoclast differentiation: CCR1, MSR1 and SI1 (probability scores respectively 95.30, 62.28, and 34.58). Moreover, detection of differentially connected genes identified various genes previously found to be associated with obesity in humans and rodents, e.g. CSF1R and MARC2.

Conclusions: To our knowledge, this is the first study to apply systems biology approaches using porcine adipose tissue RNA-Sequencing data in a genetically characterized porcine model for obesity. We revealed complex networks, pathways, candidate and regulatory genes related to obesity, confirming the complexity of obesity and its association with immune-related disorders and osteoporosis.

Source
DOI: http://dx.doi.org/10.1186/1755-8794-7-57
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4183073
September 2014

Systematic annotation of celiac disease loci refines pathological pathways and suggests a genetic explanation for increased interferon-gamma levels.

Hum Mol Genet 2015 Jan 4;24(2):397-409. Epub 2014 Sep 4.

Department of Genetics

Although genome-wide association studies and fine mapping have identified 39 non-HLA loci associated with celiac disease (CD), it is difficult to pinpoint the functional variants and susceptibility genes in these loci. We applied integrative approaches to annotate and prioritize functional single nucleotide polymorphisms (SNPs), genes and pathways affected in CD. CD-associated SNPs were intersected with regulatory elements categorized by the ENCODE project to prioritize functional variants, while results from cis-expression quantitative trait loci (eQTL) mapping in 1469 blood samples were combined with co-expression analyses to prioritize causative genes. To identify the key cell types involved in CD, we performed pathway analysis on RNA-sequencing data from different immune cell populations and on publicly available expression data on non-immune tissues. We discovered that CD SNPs are significantly enriched in B-cell-specific enhancer regions, suggesting that, besides T-cell processes, B-cell responses play a major role in CD. By combining eQTL and co-expression analyses, we prioritized 43 susceptibility genes in 36 loci. Pathway and tissue-specific expression analyses on these genes suggested enrichment of CD genes in the Th1, Th2 and Th17 pathways, but also predicted a role for four genes in the intestinal barrier function. We also discovered an intricate transcriptional connectivity between CD susceptibility genes and interferon-γ, a key effector in CD, despite the absence of CD-associated SNPs in the IFNG locus. Using systems biology, we prioritized the CD-associated functional SNPs and genes. By highlighting a role for B cells in CD, which classically has been described as a T-cell-driven disease, we offer new insights into the mechanisms and pathways underlying CD.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddu453
January 2015

Systematic identification of trans eQTLs as putative drivers of known disease associations.

Nat Genet 2013 Oct 8;45(10):1238-1243. Epub 2013 Sep 8.

Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, D-17487 Greifswald, Germany.

Identifying the downstream effects of disease-associated SNPs is challenging. To help overcome this problem, we performed expression quantitative trait locus (eQTL) meta-analysis in non-transformed peripheral blood samples from 5,311 individuals with replication in 2,775 individuals. We identified and replicated trans eQTLs for 233 SNPs (reflecting 103 independent loci) that were previously associated with complex traits at genome-wide significance. Some of these SNPs affect multiple genes in trans that are known to be altered in individuals with disease: rs4917014, previously associated with systemic lupus erythematosus (SLE), altered gene expression of C1QB and five type I interferon response genes, both hallmarks of SLE. DeepSAGE RNA sequencing showed that rs4917014 strongly alters the 3' UTR levels of IKZF1 in cis, and chromatin immunoprecipitation and sequencing analysis of the trans-regulated genes implicated IKZF1 as the causal gene. Variants associated with cholesterol metabolism and type 1 diabetes showed similar phenomena, indicating that large-scale eQTL mapping provides insight into the downstream effects of many trait-associated variants.

Source
DOI: http://dx.doi.org/10.1038/ng.2756
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3991562
October 2013

DeepSAGE reveals genetic variants associated with alternative polyadenylation and expression of coding and non-coding transcripts.

PLoS Genet 2013 Jun 20;9(6):e1003594. Epub 2013 Jun 20.

University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands.

Many disease-associated variants affect gene expression levels (expression quantitative trait loci, eQTLs) and expression profiling using next generation sequencing (NGS) technology is a powerful way to detect these eQTLs. We analyzed 94 total blood samples from healthy volunteers with DeepSAGE to gain specific insight into how genetic variants affect the expression of genes and lengths of 3'-untranslated regions (3'-UTRs). We detected previously unknown cis-eQTL effects for GWAS hits in disease- and physiology-associated traits. Apart from cis-eQTLs that are typically easily identifiable using microarrays or RNA-sequencing, DeepSAGE also revealed many cis-eQTLs for antisense and other non-coding transcripts, often in genomic regions containing retrotransposon-derived elements. We also identified and confirmed SNPs that affect the usage of alternative polyadenylation sites, thereby potentially influencing the stability of messenger RNAs (mRNA). We then combined the power of RNA-sequencing with DeepSAGE by performing a meta-analysis of three datasets, leading to the identification of many more cis-eQTLs. Our results indicate that DeepSAGE data is useful for eQTL mapping of known and unknown transcripts, and for identifying SNPs that affect alternative polyadenylation. Because of the inherent differences between DeepSAGE and RNA-sequencing, our complementary, integrative approach leads to greater insight into the molecular consequences of many disease-associated variants.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1003594
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3688553
June 2013

Human disease-associated genetic variation impacts large intergenic non-coding RNA expression.

PLoS Genet 2013 17;9(1):e1003201. Epub 2013 Jan 17.

Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.

Recently it has become clear that only a small percentage (7%) of disease-associated single nucleotide polymorphisms (SNPs) are located in protein-coding regions, while the remaining 93% are located in gene regulatory regions or in intergenic regions. Thus, the understanding of how genetic variations control the expression of non-coding RNAs (in a tissue-dependent manner) has far-reaching implications. We tested the association of SNPs with expression levels (eQTLs) of large intergenic non-coding RNAs (lincRNAs), using genome-wide gene expression and genotype data from five different tissues. We identified 112 cis-regulated lincRNAs, of which 45% could be replicated in an independent dataset. We observed that 75% of the SNPs affecting lincRNA expression (lincRNA cis-eQTLs) were specific to lincRNA alone and did not affect the expression of neighboring protein-coding genes. We show that this specific genotype-lincRNA expression correlation is tissue-dependent and that many of these lincRNA cis-eQTL SNPs are also associated with complex traits and diseases.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1003201
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3547830
May 2013