Publications by authors named "Eric R Gamazon"

127 Publications

Contextualizing genetic risk score for disease screening and rare variant discovery.

Nat Commun 2021 07 20;12(1):4418. Epub 2021 Jul 20.

Vanderbilt Genetics Institute, Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.

Studies of the genetic basis of complex traits have demonstrated a substantial role for common, small-effect variant polygenic burden (PB) as well as large-effect variants (LEV, primarily rare). We identify sufficient conditions in which GWAS-derived PB may be used for well-powered rare pathogenic variant discovery or as a sample prioritization tool for whole-genome or exome sequencing. Through extensive simulations of genetic architectures and generative models of disease liability with parameters informed by empirical data, we quantify the power to detect, among cases, a lower PB in LEV carriers than in non-carriers. Furthermore, we uncover clinically useful conditions wherein the risk derived from the PB is comparable to the LEV-derived risk. The resulting summary-statistics-based methodology (with publicly available software, PB-LEV-SCAN) makes predictions on PB-based LEV screening for 36 complex traits, which we confirm in several disease datasets with available LEV information in the UK Biobank, with important implications on clinical decision-making.
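
The central comparison in this abstract (among cases, LEV carriers tend to have a lower PB than non-carriers) can be illustrated with a toy liability-threshold simulation. This is a minimal sketch and not the PB-LEV-SCAN software: the function name, the additive liability model, and every parameter value (PB variance, carrier frequency, LEV effect size, prevalence) are assumptions chosen only for illustration.

```python
import random
import statistics


def simulate_case_pb(n=100_000, pb_var=0.3, carrier_freq=0.01,
                     lev_effect=2.0, prevalence=0.05, seed=0):
    """Toy liability-threshold model (all parameters are hypothetical).

    liability = polygenic burden + environment + LEV effect (carriers only).
    The top `prevalence` fraction of liabilities are labelled cases.
    Returns (mean PB of carrier cases, mean PB of non-carrier cases).
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        pb = rng.gauss(0.0, pb_var ** 0.5)           # polygenic burden
        env = rng.gauss(0.0, (1.0 - pb_var) ** 0.5)  # non-genetic component
        carrier = rng.random() < carrier_freq        # carries a large-effect variant
        liability = pb + env + (lev_effect if carrier else 0.0)
        people.append((liability, pb, carrier))

    # Cases are the individuals with the highest liability.
    people.sort(key=lambda person: person[0], reverse=True)
    cases = people[: int(prevalence * n)]

    carrier_pb = [pb for _, pb, is_carrier in cases if is_carrier]
    noncarrier_pb = [pb for _, pb, is_carrier in cases if not is_carrier]
    return statistics.mean(carrier_pb), statistics.mean(noncarrier_pb)


carrier_mean, noncarrier_mean = simulate_case_pb()
print(f"mean PB, carrier cases:     {carrier_mean:.3f}")
print(f"mean PB, non-carrier cases: {noncarrier_mean:.3f}")
```

In this toy setting, carriers need less polygenic burden to cross the case threshold, so their mean PB among cases is lower. How large that gap is depends entirely on the assumed parameters.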

Source
DOI: http://dx.doi.org/10.1038/s41467-021-24387-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8292385
July 2021

Multilayer modelling of the human transcriptome and biological mechanisms of complex diseases and traits.

NPJ Syst Biol Appl 2021 05 27;7(1):24. Epub 2021 May 27.

Clare Hall, University of Cambridge, Cambridge, UK.

Here, we performed a comprehensive intra-tissue and inter-tissue multilayer network analysis of the human transcriptome. We generated an atlas of communities in gene co-expression networks in 49 tissues (GTEx v8), evaluated their tissue specificity, and investigated their methodological implications. UMAP embeddings of gene expression from the communities (representing nearly 18% of all genes) robustly identified biologically-meaningful clusters. Notably, new gene expression data can be embedded into our algorithmically derived models to accelerate discoveries in high-dimensional molecular datasets and downstream diagnostic or prognostic applications. We demonstrate the generalisability of our approach through systematic testing in external genomic and transcriptomic datasets. Methodologically, prioritisation of the communities in a transcriptome-wide association study of the biomarker C-reactive protein (CRP) in 361,194 individuals in the UK Biobank identified genetically-determined expression changes associated with CRP and led to considerably improved performance. Furthermore, a deep learning framework applied to the communities in nearly 11,000 tumors profiled by The Cancer Genome Atlas across 33 different cancer types learned biologically-meaningful latent spaces, representing metastasis (p < 2.2 × 10) and stemness (p < 2.2 × 10). Our study provides a rich genomic resource to catalyse research into inter-tissue regulatory mechanisms, and their downstream consequences on human disease.

Source
DOI: http://dx.doi.org/10.1038/s41540-021-00186-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8160250
May 2021

Revisiting Some Useful Statistical Guidelines in Circulation Research in Response to a Changing Landscape.

Circ Res 2021 May 25;128(11):1724-1727. Epub 2021 May 25.

Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN (J.B.).

Source
DOI: http://dx.doi.org/10.1161/CIRCRESAHA.120.317360
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8202354
May 2021

Multi-omic analysis elucidates the genetic basis of hydrocephalus.

Cell Rep 2021 May;35(5):109085

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Data Science Institute, Vanderbilt University, Nashville, TN 37232, USA; Clare Hall, University of Cambridge, Cambridge CB3 9AL, UK; MRC Epidemiology Unit, University of Cambridge, Cambridge CB3 9AL, UK.

We conducted PrediXcan analysis of hydrocephalus risk in ten neurological tissues and whole blood. Decreased expression of MAEL in the brain was significantly associated (Bonferroni-adjusted p < 0.05) with hydrocephalus. PrediXcan analysis of brain imaging and genomics data in the independent UK Biobank (N = 8,428) revealed that MAEL expression in the frontal cortex is associated with white matter and total brain volumes. Among the top differentially expressed genes in brain, we observed a significant enrichment for gene-level associations with these structural phenotypes, suggesting an effect on disease risk through regulation of brain structure and integrity. We found additional support for these genes through analysis of the choroid plexus transcriptome of a murine model of hydrocephalus. Finally, differential protein expression analysis in patient cerebrospinal fluid recapitulated disease-associated expression changes in neurological tissues, but not in whole blood. Our findings provide convergent evidence highlighting the importance of tissue-specific pathways and mechanisms in the pathophysiology of hydrocephalus.
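
The Bonferroni adjustment mentioned in this abstract can be sketched in a few lines. This is a generic illustration only: the gene labels and p-values below are hypothetical placeholders, and the number of tests in the actual study is not stated in the abstract.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical gene-level p-values, for illustration only.
hypothetical_p = {"GENE_A": 1e-6, "GENE_B": 0.004, "GENE_C": 0.2}

adjusted = dict(zip(hypothetical_p, bonferroni_adjust(list(hypothetical_p.values()))))
significant = [gene for gene, p in adjusted.items() if p < 0.05]
print(significant)
```

The adjustment controls the family-wise error rate at the cost of power, which is why the total number of gene-tissue tests matters when reading a "Bonferroni-adjusted p < 0.05" claim.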

Source
DOI: http://dx.doi.org/10.1016/j.celrep.2021.109085
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8124085
May 2021

Deep Learning Enables Fast and Accurate Imputation of Gene Expression.

Front Genet 2021 13;12:624128. Epub 2021 Apr 13.

Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom.

A question of fundamental biological significance is to what extent the expression of a subset of genes can be used to recover the full transcriptome, with important implications for biological discovery and clinical application. To address this challenge, we propose two novel deep learning methods, PMI and GAIN-GTEx, for gene expression imputation. In order to increase the applicability of our approach, we leverage data from GTEx v8, a reference resource that has generated a comprehensive collection of transcriptomes from a diverse set of human tissues. We show that our approaches compare favorably to several standard and state-of-the-art imputation methods in terms of predictive performance and runtime in two case studies and two imputation scenarios. In comparison conducted on the protein-coding genes, PMI attains the highest performance in inductive imputation whereas GAIN-GTEx outperforms the other methods in in-place imputation. Furthermore, our results indicate strong generalization on RNA-Seq data from 3 cancer types across varying levels of missingness. Our work can facilitate a cost-effective integration of large-scale RNA biorepositories into genomic studies of disease, with high applicability across diverse tissue types.

Source
DOI: http://dx.doi.org/10.3389/fgene.2021.624128
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8076954
April 2021

E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics.

Bioinformatics 2021 Feb 24. Epub 2021 Feb 24.

Translational Neurogenomics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia.

Motivation: Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with human traits and diseases, but the exact causal genes are largely unknown. Common genetic risk variants are enriched in non-protein-coding regions of the genome and often affect gene expression (expression quantitative trait loci, eQTL) in a tissue-specific manner. To address this challenge, we developed a methodological framework, E-MAGMA, which converts genome-wide association summary statistics into gene-level statistics by assigning risk variants to their putative genes based on tissue-specific eQTL information.

Results: We compared E-MAGMA to three eQTL-informed gene-based approaches using simulated phenotype data. Phenotypes were simulated based on eQTL reference data using GCTA for all genes with at least one eQTL on chromosome 1. We performed 10 simulations per gene. The eQTL-h2 (i.e., the proportion of variation explained by the eQTLs) was set at 1%, 2%, and 5%. We found E-MAGMA outperforms other gene-based approaches across a range of simulated parameters (e.g. the number of identified causal genes). When applied to genome-wide association summary statistics for five neuropsychiatric disorders, E-MAGMA identified more putative candidate causal genes compared to other eQTL-based approaches. By integrating tissue-specific eQTL information, these results show E-MAGMA will help to identify novel candidate causal genes from genome-wide association summary statistics and thereby improve the understanding of the biological basis of complex disorders.
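
To make the eQTL-h2 setting concrete, the sketch below simulates a phenotype in which a gene's eQTLs explain a chosen share of the variance. It is an illustration in plain Python and not the GCTA procedure used in the paper: the genotype generator, the allele frequency, the normally distributed effect sizes, and all function names are assumptions.

```python
import random
import statistics


def simulate_genotypes(n_individuals, n_eqtls, maf, rng):
    """Hypothetical allele dosages (0, 1 or 2) at each eQTL of one gene."""
    return [
        [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_eqtls)]
        for _ in range(n_individuals)
    ]


def simulate_phenotype(genotypes, h2, rng):
    """Return (phenotype, genetic component) with var(genetic)/var(phenotype) near h2."""
    n_eqtls = len(genotypes[0])
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_eqtls)]
    genetic = [sum(b * x for b, x in zip(effects, row)) for row in genotypes]
    var_g = statistics.pvariance(genetic)
    # Choose the residual variance so that var_g / (var_g + var_e) equals h2.
    var_e = var_g * (1.0 - h2) / h2
    phenotype = [g + rng.gauss(0.0, var_e ** 0.5) for g in genetic]
    return phenotype, genetic


rng = random.Random(1)
genotypes = simulate_genotypes(n_individuals=5000, n_eqtls=3, maf=0.3, rng=rng)
for h2 in (0.01, 0.02, 0.05):  # the eQTL-h2 values quoted in the abstract
    phenotype, genetic = simulate_phenotype(genotypes, h2, rng)
    realized = statistics.pvariance(genetic) / statistics.pvariance(phenotype)
    print(f"target h2={h2:.2f}, realized h2={realized:.3f}")
```

The realized proportion fluctuates around the target because the noise is sampled, which is one reason a simulation study repeats the draw several times per gene.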

Availability: A tutorial and input files are made available in a github repository: https://github.com/eskederks/eMAGMA-tutorial.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btab115
February 2021

Modeling mutational effects on biochemical phenotypes using convolutional neural networks: application to SARS-CoV-2.

bioRxiv 2021 Jan 28. Epub 2021 Jan 28.

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.

Biochemical phenotypes are major indexes for protein structure and function characterization. They are determined, at least in part, by the intrinsic physicochemical properties of amino acids and may be reflected in the protein three-dimensional structure. Modeling mutational effects on biochemical phenotypes is a critical step for understanding protein function and disease mechanism as well as enabling drug discovery. Deep Mutational Scanning (DMS) experiments have been performed on SARS-CoV-2's spike receptor binding domain and the human ACE2 zinc-binding peptidase domain - both central players in viral infection and evolution and antibody evasion - quantifying how mutations impact binding affinity and protein expression. Here, we modeled biochemical phenotypes from massively parallel assays, using convolutional neural networks trained on protein sequence mutations in the virus and human host. We found that neural networks are significantly predictive of binding affinity, protein expression, and antibody escape, learning complex interactions and higher-order features that are difficult to capture with conventional methods from structural biology. Integrating the intrinsic physicochemical properties of amino acids, including hydrophobicity, solvent-accessible surface area, and long-range non-bonded energy per atom, significantly improved prediction (empirical p<0.01) though there was such a strong dependence on the sequence data alone to yield reasonably good prediction. We observed concordance of the DMS data and our neural network predictions with an independent study on intermolecular interactions from molecular dynamics (multiple 500 ns or 1 μs all-atom) simulations of the spike protein-ACE2 interface, with critical implications for the use of deep learning to dissect molecular mechanisms. 
The mutation- or genetically-determined component of a biochemical phenotype estimated from the neural networks has improved causal inference properties relative to the original phenotype and can facilitate crucial insights into disease pathophysiology and therapeutic design.

Source
DOI: http://dx.doi.org/10.1101/2021.01.28.428521
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7852230
January 2021

Exploiting the GTEx resources to decipher the mechanisms at GWAS loci.

Genome Biol 2021 01 26;22(1):49. Epub 2021 Jan 26.

Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA.

The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02252-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7836161
January 2021

An integrative systems-based analysis of substance use: eQTL-informed gene-based tests, gene networks, and biological mechanisms.

Am J Med Genet B Neuropsychiatr Genet 2021 04 23;186(3):162-172. Epub 2020 Dec 23.

Translational Neurogenomics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia.

Genome-wide association studies have identified multiple genetic risk factors underlying susceptibility to substance use; however, the functional genes and biological mechanisms remain poorly understood. The discovery and characterization of risk genes can be facilitated by the integration of genome-wide association data and gene expression data across biologically relevant tissues and/or cell types to identify genes whose expression is altered by DNA sequence variation (expression quantitative trait loci; eQTLs). The integration of gene expression data can be extended to the study of genetic co-expression, under the biologically valid assumption that genes form co-expression networks to influence the manifestation of a disease or trait. Here, we integrate genome-wide association data with gene expression data from 13 brain tissues to identify candidate risk genes for 8 substance use phenotypes. We then test for the enrichment of candidate risk genes within tissue-specific gene co-expression networks to identify modules (or groups) of functionally related genes whose dysregulation is associated with variation in substance use. We identified eight gene modules in brain that were enriched with gene-based association signals for substance use phenotypes. For example, a single module of 40 co-expressed genes was enriched with gene-based associations for drinks per week and biological pathways involved in GABA synthesis, release, reuptake and degradation. Our study demonstrates the utility of eQTL and gene co-expression analysis to uncover novel biological mechanisms for substance use traits.

Source
DOI: http://dx.doi.org/10.1002/ajmg.b.32829
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137546
April 2021

Genetic architecture of host proteins involved in SARS-CoV-2 infection.

Nat Commun 2020 12 16;11(1):6397. Epub 2020 Dec 16.

MRC Epidemiology Unit, University of Cambridge, Cambridge, UK.

Understanding the genetic architecture of host proteins interacting with SARS-CoV-2 or mediating the maladaptive host response to COVID-19 can help to identify new or repurpose existing drugs targeting those proteins. We present a genetic discovery study of 179 such host proteins among 10,708 individuals using an aptamer-based technique. We identify 220 host DNA sequence variants acting in cis (MAF 0.01-49.9%) and explaining 0.3-70.9% of the variance of 97 of these proteins, including 45 with no previously known protein quantitative trait loci (pQTL) and 38 encoding current drug targets. Systematic characterization of pQTLs across the phenome identified protein-drug-disease links and evidence that putative viral interaction partners such as MARK3 affect immune response. Our results accelerate the evaluation and prioritization of new drug development programmes and repurposing of trials to prevent, treat or reduce adverse outcomes. Rapid sharing and detailed interrogation of results is facilitated through an interactive webserver ( https://omicscience.org/apps/covidpgwas/ ).

Source
DOI: http://dx.doi.org/10.1038/s41467-020-19996-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7744536
December 2020

A unified framework for joint-tissue transcriptome-wide association and Mendelian randomization analysis.

Nat Genet 2020 11 5;52(11):1239-1246. Epub 2020 Oct 5.

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.

Here, we present a joint-tissue imputation (JTI) approach and a Mendelian randomization framework for causal inference, MR-JTI. JTI borrows information across transcriptomes of different tissues, leveraging shared genetic regulation, to improve prediction performance in a tissue-dependent manner. Notably, JTI includes the single-tissue imputation method PrediXcan as a special case and outperforms other single-tissue approaches (the Bayesian sparse linear mixed model and Dirichlet process regression). MR-JTI models variant-level heterogeneity (primarily due to horizontal pleiotropy, addressing a major challenge of transcriptome-wide association study interpretation) and performs causal inference with type I error control. We make explicit the connection between the genetic architecture of gene expression and of complex traits and the suitability of Mendelian randomization as a causal inference strategy for transcriptome-wide association studies. We provide a resource of imputation models generated from GTEx and PsychENCODE panels. Analysis of biobanks and meta-analysis data, and extensive simulations show substantially improved statistical power, replication and causal mapping rate for JTI relative to existing approaches.

Source
DOI: http://dx.doi.org/10.1038/s41588-020-0706-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7606598
November 2020

The impact of sex on gene expression across human tissues.

Science 2020 09;369(6509)

Department of Statistics, University of Chicago, Chicago, IL, USA.

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.

Source
DOI: http://dx.doi.org/10.1126/science.aba3066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8136152
September 2020

A Transcriptome-Wide Association Study Identifies Candidate Susceptibility Genes for Pancreatic Cancer Risk.

Cancer Res 2020 10 9;80(20):4346-4354. Epub 2020 Sep 9.

Division of Cancer Epidemiology, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii.

Pancreatic cancer is among the most well-characterized cancer types, yet a large proportion of the heritability of pancreatic cancer risk remains unclear. Here, we performed a large transcriptome-wide association study to systematically investigate associations between genetically predicted gene expression in normal pancreas tissue and pancreatic cancer risk. Using data from 305 subjects of mostly European descent in the Genotype-Tissue Expression Project, we built comprehensive genetic models to predict normal pancreas tissue gene expression, modifying the UTMOST (unified test for molecular signatures). These prediction models were applied to the genetic data of 8,275 pancreatic cancer cases and 6,723 controls of European ancestry. Thirteen genes showed an association of genetically predicted expression with pancreatic cancer risk at an FDR ≤ 0.05, including seven previously reported genes (, and ) and six novel genes not yet reported for pancreatic cancer risk [6q27: OR (95% confidence interval (CI), 1.54 (1.25-1.89); 13q12.13: OR (95% CI), 0.78 (0.70-0.88); 14q24.3: OR (95% CI), 1.35 (1.17-1.56); 17q12: OR (95% CI), 6.49 (2.96-14.27); 17q21.1: OR (95% CI), 1.94 (1.45-2.58); and 20p13: OR (95% CI): 1.41 (1.20-1.66)]. The associations for 10 of these genes (, and ) remained statistically significant even after adjusting for risk SNPs identified in previous genome-wide association study. Collectively, this analysis identified novel candidate susceptibility genes for pancreatic cancer that warrant further investigation. SIGNIFICANCE: A transcriptome-wide association analysis identified seven previously reported and six novel candidate susceptibility genes for pancreatic cancer risk.

Source
DOI: http://dx.doi.org/10.1158/0008-5472.CAN-20-1353
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7572664
October 2020

Optimizing Genetic Analyses of Serum Lipids in Longitudinal Data.

Circ Res 2020 10 2;127(10):1337-1339. Epub 2020 Sep 2.

Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN (H.-H.C., L.E.P., E.R.G., Q.S.W., J.E.B.).

Source
DOI: http://dx.doi.org/10.1161/CIRCRESAHA.120.317569
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7581558
October 2020

The regulatory genome constrains protein sequence evolution: implications for the search for disease-associated genes.

PeerJ 2020 21;8:e9554. Epub 2020 Jul 21.

Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, United States of America.

The development of explanatory models of protein sequence evolution has broad implications for our understanding of cellular biology, population history, and disease etiology. Here we analyze the GTEx transcriptome resource to quantify the effect of the transcriptome on protein sequence evolution in a multi-tissue framework. We find substantial variation among the central nervous system tissues in the effect of expression variance on evolutionary rate, with highly variable genes in the cortex showing significantly greater purifying selection than highly variable genes in subcortical regions (Mann-Whitney U  = 1.4 × 10). The remaining tissues cluster in observed expression correlation with evolutionary rate, enabling evolutionary analysis of genes in diverse physiological systems, including digestive, reproductive, and immune systems. Importantly, the tissue in which a gene attains its maximum expression variance significantly varies ( = 5.55 × 10) with evolutionary rate, suggesting a tissue-anchored model of protein sequence evolution. Using a large-scale reference resource, we show that the tissue-anchored model provides a transcriptome-based approach to predicting the primary affected tissue of developmental disorders. Using gradient boosted regression trees to model evolutionary rate under a range of model parameters, selected features explain up to 62% of the variation in evolutionary rate and provide additional support for the tissue model. Finally, we investigate several methodological implications, including the importance of evolutionary-rate-aware gene expression imputation models using genetic data for improved search for disease-associated genes in transcriptome-wide association studies. Collectively, this study presents a comprehensive transcriptome-based analysis of a range of factors that may constrain molecular evolution and proposes a novel framework for the study of gene function and disease mechanism.

Source
DOI: http://dx.doi.org/10.7717/peerj.9554
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380284
July 2020

Metabolic coessentiality mapping identifies C12orf49 as a regulator of SREBP processing and cholesterol metabolism.

Nat Metab 2020 06 1;2(6):487-498. Epub 2020 Jun 1.

Laboratory of Metabolic Regulation and Genetics, The Rockefeller University, New York, NY, USA.

Coessentiality mapping has been useful to systematically cluster genes into biological pathways and identify gene functions. Here, using the debiased sparse partial correlation (DSPC) method, we construct a functional coessentiality map for cellular metabolic processes across human cancer cell lines. This analysis reveals 35 modules associated with known metabolic pathways and further assigns metabolic functions to unknown genes. In particular, we identify C12orf49 as an essential regulator of cholesterol and fatty acid metabolism in mammalian cells. Mechanistically, C12orf49 localizes to the Golgi, binds membrane-bound transcription factor peptidase, site 1 (MBTPS1, site 1 protease) and is necessary for the cleavage of its substrates, including sterol regulatory element binding protein (SREBP) transcription factors. This function depends on the evolutionarily conserved uncharacterized domain (DUF2054) and promotes cell proliferation under cholesterol depletion. Notably, c12orf49 depletion in zebrafish blocks dietary lipid clearance in vivo, mimicking the phenotype of mbtps1 mutants. Finally, in an electronic health record (EHR)-linked DNA biobank, C12orf49 is associated with hyperlipidaemia through phenome analysis. Altogether, our findings reveal a conserved role for C12orf49 in cholesterol and lipid homeostasis and provide a platform to identify unknown components of other metabolic pathways.

Source
DOI: http://dx.doi.org/10.1038/s42255-020-0206-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7384252
June 2020

Genetic architecture of host proteins interacting with SARS-CoV-2.

bioRxiv 2020 Jul 1. Epub 2020 Jul 1.

MRC Epidemiology Unit, University of Cambridge, Cambridge, UK.

Strategies to develop therapeutics for SARS-CoV-2 infection may be informed by experimental identification of viral-host protein interactions in cellular assays and measurement of host response proteins in COVID-19 patients. Identification of genetic variants that influence the level or activity of these proteins in the host could enable rapid 'in silico' assessment in human genetic studies of their causal relevance as molecular targets for new or repurposed drugs to treat COVID-19. We integrated large-scale genomic and aptamer-based plasma proteomic data from 10,708 individuals to characterize the genetic architecture of 179 host proteins reported to interact with SARS-CoV-2 proteins or to participate in the host response to COVID-19. We identified 220 host DNA sequence variants acting in cis (MAF 0.01-49.9%) and explaining 0.3-70.9% of the variance of 97 of these proteins, including 45 with no previously known protein quantitative trait loci (pQTL) and 38 encoding current drug targets. Systematic characterization of pQTLs across the phenome identified protein-drug-disease links and evidence that putative viral interaction partners such as MARK3 affect immune response, and established the first link between a recently reported variant for respiratory failure of COVID-19 patients at the locus and hypercoagulation, i.e. maladaptive host response. Our results accelerate the evaluation and prioritization of new drug development programmes and repurposing of trials to prevent, treat or reduce adverse outcomes. Rapid sharing and dynamic and detailed interrogation of results is facilitated through an interactive webserver ( https://omicscience.org/apps/covidpgwas/ ).

Source
DOI: http://dx.doi.org/10.1101/2020.07.01.182709
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7337378
July 2020

An analysis of genetically regulated gene expression across multiple tissues implicates novel gene candidates in Alzheimer's disease.

Alzheimers Res Ther 2020 04 16;12(1):43. Epub 2020 Apr 16.

Translational Neurogenomics Laboratory, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, Brisbane, QLD, 4006, Australia.

Introduction: Genome-wide association studies (GWAS) have successfully identified multiple independent genetic loci that harbour variants associated with Alzheimer's disease, but the exact causal genes and biological pathways are largely unknown.

Methods: To prioritise likely causal genes associated with Alzheimer's disease, we used S-PrediXcan to integrate expression quantitative trait loci (eQTL) from the Genotype-Tissue Expression (GTEx) study and CommonMind Consortium (CMC) with Alzheimer's disease GWAS summary statistics. We meta-analysed the GTEx results using S-MultiXcan, prioritised disease-implicated loci using a computational fine-mapping approach, and performed a biological pathway analysis on the gene-based results.

Results: We identified 126 tissue-specific gene-based associations across 48 GTEx tissues, targeting 50 unique genes. Meta-analysis of the tissue-specific associations identified 73 genes whose expression was associated with Alzheimer's disease. Additional analyses in the dorsolateral prefrontal cortex from the CMC identified 12 significant associations, 8 of which also had a significant association in GTEx tissues. Fine-mapping of causal gene sets prioritised gene candidates in 10 Alzheimer's disease loci with strong evidence for causality. Biological pathway analyses of the meta-analysed GTEx data and CMC data identified a significant enrichment of Alzheimer's disease association signals in plasma lipoprotein clearance, in addition to multiple immune-related pathways.

Conclusions: Gene expression data from brain and peripheral tissues can improve power to detect regulatory variation underlying Alzheimer's disease. However, the associations in peripheral tissues may reflect tissue-shared regulatory variation for a gene. Therefore, future functional studies should be performed to validate the biological meaning of these associations and whether they represent new pathogenic tissues.

Source
DOI: http://dx.doi.org/10.1186/s13195-020-00611-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7164172
April 2020

Electronic health record phenotypes associated with genetically regulated expression of CFTR and application to cystic fibrosis.

Genet Med 2020 07 16;22(7):1191-1200. Epub 2020 Apr 16.

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.

Purpose: The increasing use of electronic health records (EHRs) and biobanks offers unique opportunities to study Mendelian diseases. We described a novel approach to summarize clinical manifestations from patient EHRs into phenotypic evidence for cystic fibrosis (CF) with potential to alert unrecognized patients of the disease.

Methods: We estimated genetically predicted expression (GReX) of cystic fibrosis transmembrane conductance regulator (CFTR) and tested for association with clinical diagnoses in the Vanderbilt University biobank (N = 9142 persons of European descent with 71 cases of CF). The top associated EHR phenotypes were assessed in combination as a phenotype risk score (PheRS) for discriminating CF case status in an additional 2.8 million patients from Vanderbilt University Medical Center (VUMC) and 125,305 adult patients including 25,314 CF cases from MarketScan, an independent external cohort.

Results: GReX of CFTR was associated with EHR phenotypes consistent with CF. PheRS constructed using the EHR phenotypes and weights discovered by the genetic associations improved discriminative power for CF over the initially proposed PheRS in both VUMC and MarketScan.

Conclusion: Our study demonstrates the power of EHRs for clinical description of CF and the benefits of using a genetics-informed weighting scheme in construction of a phenotype risk score. This research may find broad applications for phenomic studies of Mendelian disease genes.

Source
DOI: http://dx.doi.org/10.1038/s41436-020-0786-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7781195
July 2020

Genomic Variants of Cytarabine Sensitivity Associated with Treatment-Related Mortality in Pediatric AML: A Report from the Children's Oncology Group.

Clin Cancer Res 2020 06 2;26(12):2891-2897. Epub 2020 Mar 2.

Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio.

Purpose: Cytarabine is an effective treatment for AML with associated toxicities including treatment-related mortality (TRM). The purpose is to determine the clinical relevance of SNPs identified through the use of HapMap lymphoblastoid cell-based models in predicting cytarabine response and toxicity in AML.

Experimental Design: We tested the clinical significance of SNPs associated with cytarabine sensitivity in children with AML treated on Children's Oncology Group regimens (CCG 2941/2961). Endpoints included overall survival (OS), event-free survival (EFS), and TRM. Patients who received bone marrow transplant were excluded. We tested 124 SNPs associated with cytarabine sensitivity in HapMap cell lines in 348 children to determine whether any were associated with treatment outcomes. In addition, we tested five SNPs previously associated with TRM in children with AML in our independent dataset of 385 children.

Results: Homozygous variant genotypes of rs2025501 and rs6661575 had increased cellular sensitivity to cytarabine and were associated with increased TRM. TRM was particularly increased in children with variant genotype randomized to high-dose cytarabine (rs2025501: P = 0.0024 and rs6661575: P = 0.0188). In analysis of previously reported SNPs, only the variant genotype rs17202778 C/C was significantly associated with TRM (P < 0.0001).

Conclusions: We report clinical importance of two SNPs not previously associated with cytarabine toxicity. Moreover, we confirm that SNP rs17202778 significantly impacts TRM in pediatric AML. Cytarabine sensitivity genotypes may predict TRM and could be used to stratify to standard versus high-dose cytarabine regimens, warranting further study in prospective AML trials.

Source
DOI: http://dx.doi.org/10.1158/1078-0432.CCR-19-3117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7722896
June 2020

Phenome-based approach identifies RIC1-linked Mendelian syndrome through zebrafish models, biobank associations and clinical studies.

Nat Med 2020 01 13;26(1):98-109. Epub 2020 Jan 13.

Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.

Discovery of genotype-phenotype relationships remains a major challenge in clinical medicine. Here, we combined three sources of phenotypic data to uncover a new mechanism for rare and common diseases resulting from collagen secretion deficits. Using a zebrafish genetic screen, we identified the ric1 gene as being essential for skeletal biology. Using a gene-based phenome-wide association study (PheWAS) in the EHR-linked BioVU biobank, we show that reduced genetically determined expression of RIC1 is associated with musculoskeletal and dental conditions. Whole-exome sequencing identified individuals homozygous-by-descent for a rare variant in RIC1 and, through a guided clinical re-evaluation, it was discovered that they share signs with the BioVU-associated phenome. We named this new Mendelian syndrome CATIFA (cleft lip, cataract, tooth abnormality, intellectual disability, facial dysmorphism, attention-deficit hyperactivity disorder) and revealed further disease mechanisms. This gene-based, PheWAS-guided approach can accelerate the discovery of clinically relevant disease phenome and associated biological mechanisms.

Source
DOI: http://dx.doi.org/10.1038/s41591-019-0705-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7147997
January 2020

Transcriptome-wide association analysis offers novel opportunities for clinical translation of genetic discoveries on mental disorders.

World Psychiatry 2020 Feb;19(1):113-114

Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA.


Source
DOI: http://dx.doi.org/10.1002/wps.20702
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6953546
February 2020

Post-GWAS analysis of six substance use traits improves the identification and functional interpretation of genetic risk loci.

Drug Alcohol Depend 2020 01 4;206:107703. Epub 2019 Nov 4.

Department of Psychiatry, Amsterdam UMC, Amsterdam Neuroscience, University of Amsterdam, Meibergdreef 9, Amsterdam, the Netherlands; QIMR Berghofer, Translational Neurogenomics group, Brisbane, Australia.

Background: Little is known about the functional mechanisms through which genetic loci associated with substance use traits exert their effect. This study aims to identify and functionally annotate loci associated with substance use traits based on their role in genetic regulation of gene expression.

Methods: We evaluated expression Quantitative Trait Loci (eQTLs) from 13 brain regions and whole blood of the Genotype-Tissue Expression (GTEx) database, and from whole blood of the Depression Genes and Networks (DGN) database. The role of single eQTLs was examined for six substance use traits: alcohol consumption (N = 537,349), cigarettes per day (CPD; N = 263,954), former vs. current smoker (N = 312,821), age of smoking initiation (N = 262,990), ever smoker (N = 632,802), and cocaine dependence (N = 4,769). Subsequently, we conducted a gene level analysis of gene expression on these substance use traits using S-PrediXcan.

Results: Using an FDR-adjusted p-value <0.05 we found 2,976 novel candidate genetic loci for substance use traits, and identified genes and tissues through which these loci potentially exert their effects. Using S-PrediXcan, we identified significantly associated genes for all substance traits.

Discussion: Annotating genes based on transcriptomic regulation improves the identification and functional characterization of candidate loci and genes for substance use traits.

Source
DOI: http://dx.doi.org/10.1016/j.drugalcdep.2019.107703
January 2020

Inferred divergent gene regulation in archaic hominins reveals potential phenotypic differences.

Nat Ecol Evol 2019 11 7;3(11):1598-1606. Epub 2019 Oct 7.

Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA.

Sequencing DNA derived from archaic bones has enabled genetic comparison of Neanderthals and anatomically modern humans (AMHs), and revealed that they interbred. However, interpreting what genetic differences imply about their phenotypic differences remains challenging. Here, we introduce an approach for identifying divergent gene regulation between archaic hominins, such as Neanderthals, and AMH sequences, and find 766 genes that are likely to have been divergently regulated (DR) by Neanderthal haplotypes that do not remain in AMHs. DR genes include many involved in phenotypes known to differ between Neanderthals and AMHs, such as the structure of the rib cage and supraorbital ridge development. They are also enriched for genes associated with spontaneous abortion, polycystic ovary syndrome, myocardial infarction and melanoma. Phenotypes associated with modern human variation in these genes' regulation in ~23,000 biobank patients further support their involvement in immune and cardiovascular phenotypes. Comparing DR genes between two Neanderthals and a Denisovan revealed divergence in the immune system and in genes associated with skeletal and dental morphology that are consistent with the archaeological record. These results establish differences in gene regulatory architecture between AMHs and archaic hominins, and provide an avenue for exploring phenotypic differences between archaic groups from genomic information alone.

Source
DOI: http://dx.doi.org/10.1038/s41559-019-0996-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7046098
November 2019

A gene co-expression network-based analysis of multiple brain tissues reveals novel genes and molecular pathways underlying major depression.

PLoS Genet 2019 07 15;15(7):e1008245. Epub 2019 Jul 15.

Translational Neurogenomics Laboratory, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia.

Major depression is a common and severe psychiatric disorder with a highly polygenic genetic architecture. Genome-wide association studies have successfully identified multiple independent genetic loci that harbour variants associated with major depression, but the exact causal genes and biological mechanisms are largely unknown. Tissue-specific network approaches may identify molecular mechanisms underlying major depression and provide a biological substrate for integrative analyses. We provide a framework for the identification of individual risk genes and gene co-expression networks using genome-wide association summary statistics and gene expression information across multiple human brain tissues and whole blood. We developed a novel gene-based method called eMAGMA that leverages tissue-specific eQTL information to identify 99 biologically plausible risk genes associated with major depression, of which 58 are novel. Among these novel associations is Complement Factor 4A (C4A), recently implicated in schizophrenia through its role in synaptic pruning during postnatal development. Major depression risk genes were enriched in gene co-expression modules in multiple brain tissues and the implicated gene modules contained genes involved in synaptic signalling, neuronal development, and cell transport pathways. Modules enriched with major depression signals were strongly preserved across brain tissues, but were weakly preserved in whole blood, highlighting the importance of using disease-relevant tissues in genetic studies of psychiatric traits. We identified tissue-specific genes and gene co-expression networks associated with major depression. Our novel analytical framework can be used to gain fundamental insights into the functioning of the nervous system in major depression and other brain-related traits.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1008245
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6658115
July 2019

DNA methylation profiles are associated with complex regional pain syndrome after traumatic injury.

Pain 2019 10;160(10):2328-2337

Department of Anesthesiology, Vanderbilt University Medical Center, Nashville, TN, United States. Mr. Shaw is now with Department of Anesthesiology and Pain Medicine, University of Alberta, Edmonton, AB, Canada.

Factors contributing to development of complex regional pain syndrome (CRPS) are not fully understood. This study examined possible epigenetic mechanisms that may contribute to CRPS after traumatic injury. DNA methylation profiles were compared between individuals developing CRPS (n = 9) and those developing non-CRPS neuropathic pain (n = 38) after undergoing amputation following military trauma. Linear Models for Microarray (LIMMA) analyses revealed 48 differentially methylated cytosine-phosphate-guanine dinucleotide (CpG) sites between groups (unadjusted P's < 0.005), with the top gene COL11A1 meeting Bonferroni-adjusted P < 0.05. The second largest differential methylation was observed for the HLA-DRB6 gene, an immune-related gene linked previously to CRPS in a small gene expression study. For all but 7 of the significant CpG sites, the CRPS group was hypomethylated. Numerous functional Gene Ontology-Biological Process categories were significantly enriched (false discovery rate-adjusted q value <0.15), including multiple immune-related categories (eg, activation of immune response, immune system development, regulation of immune system processes, and antigen processing and presentation). Differentially methylated genes were more highly connected in human protein-protein networks than expected by chance (P < 0.05), supporting the biological relevance of the findings. Results were validated in an independent sample linking a DNA biobank with electronic health records (n = 126 CRPS phenotype, n = 19,768 non-CRPS chronic pain phenotype). Analyses using PrediXcan methodology indicated differences in the genetically determined component of gene expression in 7 of 48 genes identified in methylation analyses (P's < 0.02). Results suggest that immune- and inflammatory-related factors might confer risk of developing CRPS after traumatic injury. Validation findings demonstrate the potential of using electronic health records linked to DNA for genomic studies of CRPS.

Source
DOI: http://dx.doi.org/10.1097/j.pain.0000000000001624
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7473388
October 2019

On Using Local Ancestry to Characterize the Genetic Architecture of Human Traits: Genetic Regulation of Gene Expression in Multiethnic or Admixed Populations.

Am J Hum Genet 2019 06 16;104(6):1097-1115. Epub 2019 May 16.

Division of Genetic Medicine and Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Clare Hall, University of Cambridge, Cambridge CB3 9AL, UK.

Understanding the nature of the genetic regulation of gene expression promises to advance our understanding of the genetic basis of disease. However, the methodological impact of the use of local ancestry on high-dimensional omics analyses, including, most prominently, expression quantitative trait loci (eQTL) mapping and trait heritability estimation, in admixed populations remains critically underexplored. Here, we develop a statistical framework that characterizes the relationships among the determinants of the genetic architecture of an important class of molecular traits. We provide a computationally efficient approach to local ancestry analysis in eQTL mapping while increasing control of type I and type II error over traditional approaches. Applying our method to National Institute of General Medical Sciences (NIGMS) and Genotype-Tissue Expression (GTEx) datasets, we show that the use of local ancestry can improve eQTL mapping in admixed and multiethnic populations, respectively. We estimate the trait variance explained by ancestry by using local admixture relatedness between individuals. By using simulations of diverse genetic architectures and degrees of confounding, we show improved accuracy in estimating heritability when accounting for local ancestry similarity. Furthermore, we characterize the sparse versus polygenic components of gene expression in admixed individuals. Our study has important methodological implications for genetic analysis of omics traits across a range of genomic contexts, from a single variant to a prioritized region to the entire genome. Our findings highlight the importance of using local ancestry to better characterize the heritability of complex traits and to more accurately map genetic associations.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2019.04.009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6562007
June 2019

Publisher Correction: Gene expression imputation across multiple brain regions provides insights into schizophrenia risk.

Nat Genet 2019 Jun;51(6):1068

Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA.

In the HTML version of the article originally published, the author group 'The Schizophrenia Working Group of the Psychiatric Genomics Consortium' was displayed incorrectly. The error has been corrected in the HTML version of the article.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0435-6
June 2019

Multi-tissue transcriptome analyses identify genetic mechanisms underlying neuropsychiatric traits.

Nat Genet 2019 06 13;51(6):933-940. Epub 2019 May 13.

Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands.

The genetic architecture of psychiatric disorders is characterized by a large number of small-effect variants located primarily in non-coding regions, suggesting that the underlying causal effects may influence disease risk by modulating gene expression. We provide comprehensive analyses using transcriptome data from an unprecedented collection of tissues to gain pathophysiological insights into the role of the brain, neuroendocrine factors (adrenal gland) and gastrointestinal systems (colon) in psychiatric disorders. In each tissue, we perform PrediXcan analysis and identify trait-associated genes for schizophrenia (n associations = 499; n unique genes = 275), bipolar disorder (n associations = 17; n unique genes = 13), attention deficit hyperactivity disorder (n associations = 19; n unique genes = 12) and broad depression (n associations = 41; n unique genes = 31). Importantly, both PrediXcan and summary-data-based Mendelian randomization/heterogeneity in dependent instruments analyses suggest potentially causal genes in non-brain tissues, showing the utility of these tissues for mapping psychiatric disease genetic predisposition. Our analyses further highlight the importance of joint tissue approaches as 76% of the genes were detected only in difficult-to-acquire tissues.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0409-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6590703
June 2019