Publications by authors named "Alexis Battle"

44 Publications

DNA methylation signatures reveal that distinct combinations of transcription factors specify human immune cell epigenetic identity.

Immunity 2021 11 26;54(11):2465-2480.e5. Epub 2021 Oct 26.

Laboratory of Molecular Biology and Immunology, National Institute on Aging, Baltimore, MD, USA.

Epigenetic reprogramming underlies specification of immune cell lineages, but patterns that uniquely define immune cell types and the mechanisms by which they are established remain unclear. Here, we identified lineage-specific DNA methylation signatures of six immune cell types from human peripheral blood and determined their relationship to other epigenetic and transcriptomic patterns. Sites of lineage-specific hypomethylation were associated with distinct combinations of transcription factors in each cell type. By contrast, sites of lineage-specific hypermethylation were restricted mostly to adaptive immune cells. PU.1 binding sites were associated with lineage-specific hypo- and hypermethylation in different cell types, suggesting that it regulates DNA methylation in a context-dependent manner. These observations indicate that innate and adaptive immune lineages are specified by distinct epigenetic mechanisms via combinatorial and context-dependent use of key transcription factors. The cell-specific epigenomics and transcriptional patterns identified serve as a foundation for future studies on immune dysregulation in diseases and aging.

Source
DOI: http://dx.doi.org/10.1016/j.immuni.2021.10.001
November 2021

Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression.

Nat Genet 2021 09 2;53(9):1300-1310. Epub 2021 Sep 2.

Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland.

Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.

Source
DOI: http://dx.doi.org/10.1038/s41588-021-00913-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8432599
September 2021

Coexpression network architecture reveals the brain-wide and multiregional basis of disease susceptibility.

Nat Neurosci 2021 09 22;24(9):1313-1323. Epub 2021 Jul 22.

Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.

Gene networks have yielded numerous neurobiological insights, yet an integrated view across brain regions is lacking. We leverage RNA sequencing in 864 samples representing 12 brain regions to robustly identify 12 brain-wide, 50 cross-regional and 114 region-specific coexpression modules. Nearly 40% of genes fall into brain-wide modules, while 25% comprise region-specific modules reflecting regional biology, such as oxytocin signaling in the hypothalamus, or addiction pathways in the nucleus accumbens. Schizophrenia and autism genetic risk are enriched in brain-wide and multiregional modules, indicative of broad impact; these modules implicate neuronal proliferation and activity-dependent processes, including endocytosis and splicing, in disease pathophysiology. We find that cell-type-specific long noncoding RNA and gene isoforms contribute substantially to regional synaptic diversity and that constrained, mutation-intolerant genes are primarily enriched in neurons. We leverage these data using an omnigenic-inspired network framework to characterize how coexpression and gene regulatory networks reflect neuropsychiatric disease risk, supporting polygenic models.

Source
DOI: http://dx.doi.org/10.1038/s41593-021-00887-5
September 2021

Population-scale tissue transcriptomics maps long non-coding RNAs to complex disease.

Cell 2021 05 16;184(10):2633-2648.e19. Epub 2021 Apr 16.

Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Pathology, Stanford University, Stanford, CA 94305, USA.

Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations to body mass index.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2021.03.050
May 2021

In vivo CD8+ T cell CRISPR screening reveals control by Fli1 in infection and cancer.

Cell 2021 03 25;184(5):1262-1280.e22. Epub 2021 Feb 25.

Institute for Immunology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Cancer Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Improving effector activity of antigen-specific T cells is a major goal in cancer immunotherapy. Despite the identification of several effector T cell (TEFF)-driving transcription factors (TFs), the transcriptional coordination of TEFF biology remains poorly understood. We developed an in vivo T cell CRISPR screening platform and identified a key mechanism restraining TEFF biology through the ETS family TF, Fli1. Genetic deletion of Fli1 enhanced TEFF responses without compromising memory or exhaustion precursors. Fli1 restrained TEFF lineage differentiation by binding to cis-regulatory elements of effector-associated genes. Loss of Fli1 increased chromatin accessibility at ETS:RUNX motifs, allowing more efficient Runx3-driven TEFF biology. CD8+ T cells lacking Fli1 provided substantially better protection against multiple infections and tumors. These data indicate that Fli1 safeguards the developing CD8+ T cell transcriptional landscape from excessive ETS:RUNX-driven TEFF cell differentiation. Moreover, genetic deletion of Fli1 improves TEFF differentiation and protective immunity in infections and cancer.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2021.02.019
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8054351
March 2021

ACE inhibition and cardiometabolic risk factors, lung ACE2 and TMPRSS2 gene expression, and plasma ACE2 levels: a Mendelian randomization study.

R Soc Open Sci 2020 Nov 18;7(11):200958. Epub 2020 Nov 18.

Computer Science Department and Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA.

Angiotensin-converting enzyme 2 (ACE2) and serine protease TMPRSS2 have been implicated in cell entry for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus responsible for coronavirus disease 2019 (COVID-19). The expression of ACE2 and TMPRSS2 in the lung epithelium might have implications for the risk of SARS-CoV-2 infection and severity of COVID-19. We use human genetic variants that proxy angiotensin-converting enzyme (ACE) inhibitor drug effects and cardiovascular risk factors to investigate whether these exposures affect lung ACE2 and TMPRSS2 gene expression and circulating ACE2 levels. We observed no consistent evidence of an association of genetically predicted serum ACE levels with any of our outcomes. There was weak evidence for an association of genetically predicted serum ACE levels with gene expression in the Lung eQTL Consortium (P = 0.014), but this finding did not replicate. There was evidence of a positive association of genetic liability to type 2 diabetes mellitus with lung gene expression in the Genotype-Tissue Expression (GTEx) study (P = 4 × 10) and with circulating plasma ACE2 levels in the INTERVAL study (P = 0.03), but not with lung expression in the Lung eQTL Consortium study (P = 0.68). There were no associations of genetically proxied liability to the other cardiometabolic traits with any outcome. This study does not provide consistent evidence to support an effect of serum ACE levels (as a proxy for ACE inhibitors) or cardiometabolic risk factors on lung ACE2 and TMPRSS2 expression or plasma ACE2 levels.

Source
DOI: http://dx.doi.org/10.1098/rsos.200958
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7735342
November 2020

Linear and Nonlinear Mendelian Randomization Analyses of the Association Between Diastolic Blood Pressure and Cardiovascular Events: The J-Curve Revisited.

Circulation 2021 Mar 30;143(9):895-906. Epub 2020 Nov 30.

Department of Medicine, Division of Cardiology (M.A., W.S.P., J.W.M.), Johns Hopkins University, Baltimore, MD.

Background: Recent clinical guidelines support intensive blood pressure treatment targets. However, observational data suggest that excessive diastolic blood pressure (DBP) lowering might increase the risk of myocardial infarction (MI), reflecting a J- or U-shaped relationship.

Methods: We analyzed 47 407 participants from 5 cohorts (median age, 60 years). First, to corroborate previous observational analyses, we used traditional statistical methods to test the shape of association between DBP and cardiovascular disease (CVD). Second, we created polygenic risk scores of DBP and systolic blood pressure and generated linear Mendelian randomization (MR) estimates for the effect of DBP on CVD. Third, using novel nonlinear MR approaches, we evaluated for nonlinearity in the genetic relationship between DBP and CVD events. Comprehensive MR interrogation of DBP required us to also model systolic blood pressure, given that the 2 are strongly correlated.

Results: Traditional observational analysis of our cohorts suggested a J-shaped association between DBP and MI. By contrast, linear MR analyses demonstrated an adverse effect of increasing DBP increments on CVD outcomes, including MI (MI hazard ratio, 1.07 per unit mm Hg increase in DBP; P<0.001). Furthermore, nonlinear MR analyses found no evidence for a J-shaped relationship; instead confirming that MI risk decreases consistently per unit decrease in DBP, even among individuals with low values of baseline DBP.

Conclusions: In this analysis of the genetic effect of DBP, we found no evidence for a nonlinear J- or U-shaped relationship between DBP and adverse CVD outcomes; including MI.

Source
DOI: http://dx.doi.org/10.1161/CIRCULATIONAHA.120.049819
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7920937
March 2021

Transcriptional profile of platelets and iPSC-derived megakaryocytes from whole-genome and RNA sequencing.

Blood 2021 02;137(7):959-968

The GeneSTAR Research Program.

Genome-wide association studies have identified common variants associated with platelet-related phenotypes, but because these variants are largely intronic or intergenic, their link to platelet biology is unclear. In 290 normal subjects from the GeneSTAR Research Study (110 African Americans [AAs] and 180 European Americans [EAs]), we generated whole-genome sequence data from whole blood and RNA sequence data from extracted nonribosomal RNA from 185 induced pluripotent stem cell-derived megakaryocyte (MK) cell lines (platelet precursor cells) and 290 blood platelet samples from these subjects. Using eigenMT software to select the peak single-nucleotide polymorphism (SNP) for each expressed gene, and meta-analyzing the results of AAs and EAs, we identify (q-value < 0.05) 946 cis-expression quantitative trait loci (eQTLs) in derived MKs and 1830 cis-eQTLs in blood platelets. Among the 57 eQTLs shared between the 2 tissues, the estimated directions of effect are very consistent (98.2% concordance). A high proportion of detected cis-eQTLs (74.9% in MKs and 84.3% in platelets) are unique to MKs and platelets compared with peak-associated SNP-expressed gene pairs of 48 other tissue types that are reported in version V7 of the Genotype-Tissue Expression Project. The locations of our identified eQTLs are significantly enriched for overlap with several annotation tracks highlighting genomic regions with specific functionality in MKs, including MK-specific DNAse hotspots, H3K27-acetylation marks, H3K4-methylation marks, enhancers, and superenhancers. These results offer insights into the regulatory signature of MKs and platelets, with significant overlap in genes expressed, eQTLs detected, and enrichment within known superenhancers relevant to platelet biology.

Source
DOI: http://dx.doi.org/10.1182/blood.2020006115
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7918180
February 2021

Transcriptomic signatures across human tissues identify functional rare genetic variation.

Science 2020 09 10;369(6509). Epub 2020 Sep 10.

University of Mississippi Medical Center, Jackson, MS, USA.

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.

Source
DOI: http://dx.doi.org/10.1126/science.aaz5900
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7646251
September 2020

The impact of sex on gene expression across human tissues.

Science 2020 09;369(6509)

Department of Statistics, University of Chicago, Chicago, IL, USA.

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.

Source
DOI: http://dx.doi.org/10.1126/science.aba3066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8136152
September 2020

Where Are the Disease-Associated eQTLs?

Trends Genet 2021 02 7;37(2):109-124. Epub 2020 Sep 7.

Department of Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA.

Most disease-associated variants, although located in putatively regulatory regions, do not have detectable effects on gene expression. One explanation could be that we have not examined gene expression in the cell types or conditions that are most relevant for disease. Even large-scale efforts to study gene expression across tissues are limited to human samples obtained opportunistically or postmortem, mostly from adults. In this review we evaluate recent findings and suggest an alternative strategy, drawing on the dynamic and highly context-specific nature of gene regulation. We discuss new technologies that can extend the standard regulatory mapping framework to more diverse, disease-relevant cell types and states.

Source
DOI: http://dx.doi.org/10.1016/j.tig.2020.08.009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8162831
February 2021

sn-spMF: matrix factorization informs tissue-specific genetic regulation of gene expression.

Genome Biol 2020 09 11;21(1):235. Epub 2020 Sep 11.

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, 21218, MD, USA.

Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues from the Genotype-Tissue Expression (GTEx) project. The learned factors reflect tissues with known biological similarity and identify transcription factors that may mediate tissue-specific effects. sn-spMF, available at https://github.com/heyuan7676/ts_eQTLs , can be applied to learn biologically interpretable patterns of eQTL tissue-specificity and generate testable mechanistic hypotheses.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02129-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7488540
September 2020

GBAT: a gene-based association test for robust detection of trans-gene regulation.

Genome Biol 2020 08 24;21(1):211. Epub 2020 Aug 24.

Departments of Neurology and Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA.

The observation that disease-associated genetic variants typically reside outside of exons has inspired widespread investigation into the genetic basis of transcriptional regulation. While associations between the mRNA abundance of a gene and its proximal SNPs (cis-eQTLs) are now readily identified, identification of high-quality distal associations (trans-eQTLs) has been limited by a heavy multiple testing burden and the proneness to false-positive signals. To address these issues, we develop GBAT, a powerful gene-based pipeline that allows robust detection of high-quality trans-gene regulation signal.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02120-1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7444084
August 2020

Genome-wide association and multi-omic analyses reveal ACTN2 as a gene linked to heart failure.

Nat Commun 2020 02 28;11(1):1122. Epub 2020 Feb 28.

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.

Heart failure is a major public health problem affecting over 23 million people worldwide. In this study, we present the results of a large scale meta-analysis of heart failure GWAS and replication in a comparable sized cohort to identify one known and two novel loci associated with heart failure. Heart failure sub-phenotyping shows that a new locus in chromosome 1 is associated with left ventricular adverse remodeling and clinical heart failure, in response to different initial cardiac muscle insults. Functional characterization and fine-mapping of that locus reveal a putative causal variant in a cardiac muscle specific regulatory region activated during cardiomyocyte differentiation that binds to the ACTN2 gene, a crucial structural protein inside the cardiac sarcolemma (Hi-C interaction p-value = 0.00002). Genome-editing in human embryonic stem cell-derived cardiomyocytes confirms the influence of the identified regulatory region in the expression of ACTN2. Our findings extend our understanding of biological mechanisms underlying heart failure.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-14843-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7048760
February 2020

Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts.

Nat Med 2019 06 3;25(6):911-919. Epub 2019 Jun 3.

Department of Computer Science, Stanford University, Stanford, CA, USA.

It is estimated that 350 million individuals worldwide suffer from rare diseases, which are predominantly caused by mutation in a single gene. The current molecular diagnostic rate is estimated at 50%, with whole-exome sequencing (WES) among the most successful approaches. For patients in whom WES is uninformative, RNA sequencing (RNA-seq) has shown diagnostic utility in specific tissues and diseases. This includes muscle biopsies from patients with undiagnosed rare muscle disorders, and cultured fibroblasts from patients with mitochondrial disorders. However, for many individuals, biopsies are not performed for clinical care, and tissues are difficult to access. We sought to assess the utility of RNA-seq from blood as a diagnostic tool for rare diseases of different pathophysiologies. We generated whole-blood RNA-seq from 94 individuals with undiagnosed rare diseases spanning 16 diverse disease categories. We developed a robust approach to compare data from these individuals with large sets of RNA-seq data for controls (n = 1,594 unrelated controls and n = 49 family members) and demonstrated the impacts of expression, splicing, gene and variant filtering strategies on disease gene identification. Across our cohort, we observed that RNA-seq yields a 7.5% diagnostic rate, and an additional 16.7% with improved candidate gene resolution.

Source
DOI: http://dx.doi.org/10.1038/s41591-019-0457-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6634302
June 2019

Addressing confounding artifacts in reconstruction of gene co-expression networks.

Genome Biol 2019 05 16;20(1):94. Epub 2019 May 16.

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Gene co-expression networks capture biological relationships between genes and are important tools in predicting gene function and understanding disease mechanisms. We show that technical and biological artifacts in gene expression data confound commonly used network reconstruction algorithms. We demonstrate theoretically, in simulation, and empirically, that principal component correction of gene expression measurements prior to network inference can reduce false discoveries. Using data from the GTEx project in multiple tissues, we show that this approach reduces false discoveries beyond correcting only for known confounders.
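
The principal component correction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the simulated data, and the choice of removing a single component are all assumptions made for illustration. The sketch removes the top principal components from a samples-by-genes matrix and compares the average absolute gene-gene correlation before and after.

```python
import numpy as np

def remove_top_pcs(expr, n_pcs):
    """Regress the top n_pcs principal components out of a samples x genes matrix."""
    centered = expr - expr.mean(axis=0)
    # SVD of the centered matrix: columns of u are orthonormal sample-level PCs.
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    top = u[:, :n_pcs]
    # Residuals after projecting every gene onto the top components.
    return centered - top @ (top.T @ centered)

def coexpression(expr):
    """Gene-by-gene Pearson correlation matrix."""
    return np.corrcoef(expr, rowvar=False)

# Simulated data (hypothetical): independent genes plus one shared artifact.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 30
artifact = rng.normal(size=(n_samples, 1))
loadings = rng.normal(size=(1, n_genes))
expr = 3.0 * artifact @ loadings + rng.normal(size=(n_samples, n_genes))

off_diagonal = ~np.eye(n_genes, dtype=bool)
mean_abs_raw = np.abs(coexpression(expr)[off_diagonal]).mean()
residual = remove_top_pcs(expr, n_pcs=1)
mean_abs_corrected = np.abs(coexpression(residual)[off_diagonal]).mean()
```

In this simulation the genes are independent apart from the artifact, so any correlation that survives correction is noise; in real data the number of components to remove is a modelling choice that the sketch does not address.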

Source
DOI: http://dx.doi.org/10.1186/s13059-019-1700-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6521369
May 2019

Author Correction: Retinal transcriptome and eQTL analyses identify genes associated with age-related macular degeneration.

Nat Genet 2019 Jun;51(6):1067

Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, Bethesda, MD, USA.

In the version of this article initially published, in Supplementary Data 5, the logFC, FC, P value and adjusted P value for advanced AMD versus control (DE 4/1) without age correction did not correspond to the correct gene IDs. The errors have been corrected in the HTML version of the article.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0430-y
June 2019

False positives in trans-eQTL and co-expression analyses arising from RNA-sequencing alignment errors.

F1000Res 2018 28;7:1860. Epub 2018 Nov 28.

Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, 21218, USA.

Sequence similarity among distinct genomic regions can lead to errors in alignment of short reads from next-generation sequencing. While this is well known, the downstream consequences of misalignment have not been fully characterized. We assessed the potential for incorrect alignment of RNA-sequencing reads to cause false positives in both gene expression quantitative trait locus (eQTL) and co-expression analyses. Trans-eQTLs identified from human RNA-sequencing studies appeared to be particularly affected by this phenomenon, even when only uniquely aligned reads are considered. Over 75% of trans-eQTLs using a standard pipeline occurred between regions of sequence similarity and therefore could be due to alignment errors. Further, associations due to mapping errors are likely to misleadingly replicate between studies. To help address this problem, we quantified the potential for "cross-mapping" to occur between every pair of annotated genes in the human genome. Such cross-mapping data can be used to filter or flag potential false positives in both trans-eQTL and co-expression analyses. Such filtering substantially alters the detection of significant associations and can have an impact on the assessment of false discovery rate, functional enrichment, and replication for RNA-sequencing association studies.
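
The filtering step mentioned in this abstract can be sketched as follows. This is an illustrative assumption rather than the published pipeline: it supposes each association has already been reduced to a pair of gene identifiers (for a trans-eQTL, a gene near the variant and the target gene), treats cross-mapping as symmetric, and uses placeholder gene names and a hypothetical function name.

```python
def split_by_cross_mapping(associations, cross_mappable):
    """Separate gene-pair associations into kept and flagged lists.

    associations: iterable of (gene_1, gene_2) tuples.
    cross_mappable: iterable of gene pairs with cross-mapping potential;
    order within a pair is ignored (a simplifying assumption).
    """
    pairs = {frozenset(pair) for pair in cross_mappable}
    kept, flagged = [], []
    for gene_1, gene_2 in associations:
        if frozenset((gene_1, gene_2)) in pairs:
            flagged.append((gene_1, gene_2))
        else:
            kept.append((gene_1, gene_2))
    return kept, flagged

# Placeholder identifiers, not real genes.
cross_mappable = [("GENE_A", "GENE_B")]
associations = [("GENE_B", "GENE_A"), ("GENE_A", "GENE_C")]
kept, flagged = split_by_cross_mapping(associations, cross_mappable)
```

Returning the flagged pairs instead of discarding them keeps both options named in the abstract open: the caller can filter them out or simply annotate them.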

Source
DOI: http://dx.doi.org/10.12688/f1000research.17145.2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6305209
November 2018

Imputed gene associations identify replicable trans-acting genes enriched in transcription pathways and complex traits.

Genet Epidemiol 2019 09 4;43(6):596-608. Epub 2019 Apr 4.

Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois.

Regulation of gene expression is an important mechanism through which genetic variation can affect complex traits. A substantial portion of gene expression variation can be explained by both local (cis) and distal (trans) genetic variation. Much progress has been made in uncovering cis-acting expression quantitative trait loci (cis-eQTL), but trans-eQTL have been more difficult to identify and replicate. Here we take advantage of our ability to predict the cis component of gene expression coupled with gene mapping methods such as PrediXcan to identify high confidence candidate trans-acting genes and their targets. That is, we correlate the cis component of gene expression with observed expression of genes in different chromosomes. Leveraging the shared cis-acting regulation across tissues, we combine the evidence of association across all available Genotype-Tissue Expression Project tissues and find 2,356 trans-acting/target gene pairs with high mappability scores. Reassuringly, trans-acting genes are enriched in transcription and nucleic acid binding pathways and target genes are enriched in known transcription factor binding sites. Interestingly, trans-acting genes are more significantly associated with selected complex traits and diseases than target or background genes, consistent with percolating trans effects. Our scripts and summary statistics are publicly available for future studies of trans-acting gene regulation.

Source
DOI: http://dx.doi.org/10.1002/gepi.22205
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6687523
September 2019

Retinal transcriptome and eQTL analyses identify genes associated with age-related macular degeneration.

Nat Genet 2019 04 11;51(4):606-610. Epub 2019 Feb 11.

Neurobiology-Neurodegeneration & Repair Laboratory, National Eye Institute, National Institutes of Health, Bethesda, MD, USA.

Genome-wide association studies (GWAS) have identified genetic variants at 34 loci contributing to age-related macular degeneration (AMD). We generated transcriptional profiles of postmortem retinas from 453 controls and cases at distinct stages of AMD and integrated retinal transcriptomes, covering 13,662 protein-coding and 1,462 noncoding genes, with genotypes at more than 9 million common SNPs for expression quantitative trait loci (eQTL) analysis of a tissue not included in Genotype-Tissue Expression (GTEx) and other large datasets. Cis-eQTL analysis identified 10,474 genes under genetic regulation, including 4,541 eQTLs detected only in the retina. Integrated analysis of AMD-GWAS with eQTLs ascertained likely target genes at six reported loci. Using transcriptome-wide association analysis (TWAS), we identified three additional genes, RLBP1, HIC1 and PARP12, after Bonferroni correction. Our studies expand the genetic landscape of AMD and establish the Eye Genotype Expression (EyeGEx) database as a resource for post-GWAS interpretation of multifactorial ocular traits.

Source
DOI: http://dx.doi.org/10.1038/s41588-019-0351-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6441365
April 2019

Genetic effects on gene expression across human tissues.

Nature 2017 10;550(7675):204-213

Department of Genetics, Stanford University, Stanford, California 94305, USA.

Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression levels across individuals and diverse tissues of the human body, many of which are not easily accessible. Here we describe genetic effects on gene expression levels across 44 human tissues. We find that local genetic variation affects gene expression levels for the majority of genes, and we further identify inter-chromosomal genetic effects for 93 genes and 112 loci. On the basis of the identified genetic effects, we characterize patterns of tissue specificity, compare local and distal effects, and evaluate the functional properties of the genetic effects. We also demonstrate that multi-tissue, multi-individual data can be used to identify genes and pathways affected by human disease-associated variation, enabling a mechanistic interpretation of gene regulation and the genetic basis of disease.

Source
http://dx.doi.org/10.1038/nature24277 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5776756 (PMC)
October 2017

The impact of rare variation on gene expression across tissues.

Nature 2017 10;550(7675):239-243

Department of Pathology, Stanford University, Stanford, California 94305, USA.

Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
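
The idea of a multi-tissue expression outlier can be sketched as follows. This is a minimal illustration, not the published pipeline: it assumes outliers are called from the median of per-tissue z-scores with an absolute cutoff of 2, and the function names, the cutoff, and the toy values are all hypothetical.

```python
from statistics import mean, median, stdev


def zscores(values_by_individual):
    """Standardize one gene's expression across individuals in one tissue."""
    vals = list(values_by_individual.values())
    if len(vals) < 2:
        return {}
    mu, sd = mean(vals), stdev(vals)
    if sd == 0:
        return {}  # no variation, so no z-scores can be computed
    return {ind: (v - mu) / sd for ind, v in values_by_individual.items()}


def multi_tissue_outliers(expr_by_tissue, threshold=2.0):
    """Flag individuals whose median z-score across tissues is extreme.

    expr_by_tissue: dict tissue -> dict individual -> expression of one gene.
    threshold: absolute median z-score cutoff (an assumption of this sketch).
    """
    per_individual = {}
    for values in expr_by_tissue.values():
        for ind, z in zscores(values).items():
            per_individual.setdefault(ind, []).append(z)
    medians = {ind: median(zs) for ind, zs in per_individual.items()}
    return {ind: m for ind, m in medians.items() if abs(m) >= threshold}


# Toy data: ten individuals, with "ind9" strongly underexpressed in both tissues.
base = [10, 11, 9, 10, 11, 9, 10, 11, 9, 1]
toy = {
    "tissue_a": {f"ind{i}": float(v) for i, v in enumerate(base)},
    "tissue_b": {f"ind{i}": 2.0 * v for i, v in enumerate(base)},
}
print(multi_tissue_outliers(toy))
```

Negative median z-scores correspond to underexpression outliers and positive ones to overexpression outliers.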

Source
http://dx.doi.org/10.1038/nature24267 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5877409 (PMC)
October 2017

Co-expression networks reveal the tissue-specific regulation of transcription and splicing.

Genome Res 2017 11 11;27(11):1843-1858. Epub 2017 Oct 11.

Gene co-expression networks capture biologically important patterns in gene expression data, enabling functional analyses of genes, discovery of biomarkers, and interpretation of genetic variants. Most network analyses to date have been limited to assessing correlation between total gene expression levels in a single tissue or small sets of tissues. Here, we built networks that additionally capture the regulation of relative isoform abundance and splicing, along with tissue-specific connections unique to each of a diverse set of tissues. We used the Genotype-Tissue Expression (GTEx) project v6 RNA sequencing data across 50 tissues and 449 individuals. First, we developed a framework called Transcriptome-Wide Networks (TWNs) for combining total expression and relative isoform levels into a single sparse network, capturing the interplay between the regulation of splicing and transcription. We built TWNs for 16 tissues and found that hubs in these networks were strongly enriched for splicing and RNA binding genes, demonstrating their utility in unraveling regulation of splicing in the human transcriptome. Next, we used a Bayesian biclustering model that identifies network edges unique to a single tissue to reconstruct Tissue-Specific Networks (TSNs) for 26 distinct tissues and 10 groups of related tissues. Finally, we found genetic variants associated with pairs of adjacent nodes in our networks, supporting the estimated network structures and identifying 20 genetic variants with distant regulatory impact on transcription and splicing. Our networks provide an improved understanding of the complex relationships of the human transcriptome across tissues.

Source
http://dx.doi.org/10.1101/gr.216721.116 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5668942 (PMC)
November 2017

Incorporation of Biological Knowledge Into the Study of Gene-Environment Interactions.

Am J Epidemiol 2017 Oct;186(7):771-777

A growing knowledge base of genetic and environmental information has greatly enabled the study of disease risk factors. However, the computational complexity and statistical burden of testing all variants by all environments has required novel study designs and hypothesis-driven approaches. We discuss how incorporating biological knowledge from model organisms, functional genomics, and integrative approaches can empower the discovery of novel gene-environment interactions and discuss specific methodological considerations with each approach. We consider specific examples where the application of these approaches has uncovered effects of gene-environment interactions relevant to drug response and immunity, and we highlight how such improvements enable a greater understanding of the pathogenesis of disease and the realization of precision medicine.

Source
http://dx.doi.org/10.1093/aje/kwx229 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860556 (PMC)
October 2017

FIRE: functional inference of genetic variants that regulate gene expression.

Bioinformatics 2017 Dec;33(24):3895-3901

Department of Health Research & Policy.

Motivation: Interpreting genetic variation in noncoding regions of the genome is an important challenge for personal genome analysis. One mechanism by which noncoding single nucleotide variants (SNVs) influence downstream phenotypes is through the regulation of gene expression. Methods to predict whether or not individual SNVs are likely to regulate gene expression would aid interpretation of variants of unknown significance identified in whole-genome sequencing studies.

Results: We developed FIRE (Functional Inference of Regulators of Expression), a tool to score both noncoding and coding SNVs based on their potential to regulate the expression levels of nearby genes. FIRE consists of 23 random forests trained to recognize SNVs in cis-expression quantitative trait loci (cis-eQTLs) using a set of 92 genomic annotations as predictive features. FIRE scores discriminate cis-eQTL SNVs from non-eQTL SNVs in the training set with a cross-validated area under the receiver operating characteristic curve (AUC) of 0.807, and discriminate cis-eQTL SNVs shared across six populations of different ancestry from non-eQTL SNVs with an AUC of 0.939. FIRE scores are also predictive of cis-eQTL SNVs across a variety of tissue types.

Availability And Implementation: FIRE scores for genome-wide SNVs in hg19/GRCh37 are available for download at https://sites.google.com/site/fireregulatoryvariation/.

Contact: [email protected]

Supplementary Information: Supplementary data are available at Bioinformatics online.
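
The AUC values quoted in the Results can be read as the probability that a randomly chosen cis-eQTL SNV scores higher than a randomly chosen non-eQTL SNV. A minimal sketch of that pairwise definition follows; the function name and the example scores are hypothetical and are not FIRE outputs.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    example has the higher score; ties contribute one half.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, for illustration only.
eqtl_scores = [0.9, 0.8, 0.4]
non_eqtl_scores = [0.7, 0.3, 0.2]
print(roc_auc(eqtl_scores, non_eqtl_scores))
```

The quadratic loop is chosen for clarity; a rank-based computation gives the same value more efficiently on large score sets.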

Source
http://dx.doi.org/10.1093/bioinformatics/btx534 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860093 (PMC)
December 2017

Identifying global expression patterns and key regulators in epithelial to mesenchymal transition through multi-study integration.

BMC Cancer 2017 Jun 26;17(1):447. Epub 2017 Jun 26.

Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA.

Background: Epithelial to mesenchymal transition (EMT) is the process by which stationary epithelial cells transdifferentiate to mesenchymal cells with increased motility. EMT is integral in early stages of development and wound healing. Studies have shown that EMT could be a critical early event in tumor metastasis that is involved in acquisition of migratory and invasive properties in multiple carcinomas.

Methods: In this study, we used 15 published gene expression microarray datasets from Gene Expression Omnibus (GEO) that represent 12 cell lines from 6 cancer types across 95 observations (45 unique samples and 50 replicates) with different modes of induction of EMT or the reverse transition, mesenchymal to epithelial transition (MET). We integrated multiple gene expression datasets while considering study differences, batch effects, and noise in gene expression measurements. A universal differential EMT gene list was obtained by normalizing and correcting the data using four approaches, computing differential expression from each, and identifying a consensus ranking. We confirmed our discovery of novel EMT genes at mRNA and protein levels in an in vitro EMT model of prostate cancer (PC3 epi, EMT and Taxol-resistant cell lines). We validated our discovery of C1orf116 as a novel EMT regulator by siRNA knockdown of C1orf116 in PC3 epithelial cells.

Results: Among differentially expressed genes, we found known epithelial and mesenchymal marker genes such as CDH1 and ZEB1. Additionally, we discovered genes known in a subset of carcinomas that were unknown in prostate cancer. This included epithelial specific LSR and S100A14 and mesenchymal specific DPYSL3. Furthermore, we also discovered novel EMT genes including a poorly-characterized gene C1orf116. We show that decreased expression of C1orf116 is associated with poor prognosis in lung and prostate cancer patients. We demonstrate that knockdown of C1orf116 expression induced expression of mesenchymal genes in epithelial prostate cancer cell line PC3-epi cells, suggesting it as a candidate driver of the epithelial phenotype.

Conclusions: This comprehensive approach of statistical analysis and functional validation identified global expression patterns in EMT and candidate regulatory genes, thereby both extending current knowledge and identifying novel drivers of EMT.
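
The Methods mention deriving a consensus ranking from four normalization approaches without stating how the rankings were combined. One simple possibility, shown here purely as an assumption, is to order genes by their mean rank across the approaches; the function name and gene labels below are hypothetical.

```python
def consensus_ranking(rankings):
    """Combine several rankings of the same genes by mean rank position.

    rankings: list of lists, each ordering the same genes from most to
    least differentially expressed. Ties in mean rank are broken by name.
    """
    if not rankings:
        return []
    genes = set(rankings[0])
    for r in rankings:
        if set(r) != genes or len(r) != len(genes):
            raise ValueError("every ranking must contain the same genes once")
    mean_rank = {
        g: sum(r.index(g) for r in rankings) / len(rankings) for g in genes
    }
    return sorted(genes, key=lambda g: (mean_rank[g], g))


# Four hypothetical rankings, one per normalization approach.
rankings = [
    ["gene_a", "gene_b", "gene_c", "gene_d"],
    ["gene_b", "gene_a", "gene_c", "gene_d"],
    ["gene_a", "gene_c", "gene_b", "gene_d"],
    ["gene_a", "gene_b", "gene_d", "gene_c"],
]
print(consensus_ranking(rankings))
```

Mean rank is only one of several rank-aggregation schemes; median rank or rank products would follow the same structure.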

Source
http://dx.doi.org/10.1186/s12885-017-3413-3 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5485747 (PMC)
June 2017

Allele-specific expression reveals interactions between genetic variation and environment.

Nat Methods 2017 Jul 22;14(7):699-702. Epub 2017 May 22.

Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA.

Identifying interactions between genetics and the environment (GxE) remains challenging. We have developed EAGLE, a hierarchical Bayesian model for identifying GxE interactions based on associations between environmental variables and allele-specific expression. Combining whole-blood RNA-seq with extensive environmental annotations collected from 922 human individuals, we identified 35 GxE interactions, compared with only four using standard GxE interaction testing. EAGLE provides new opportunities for researchers to identify GxE interactions using functional genomic data.

Source
http://dx.doi.org/10.1038/nmeth.4298 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5501199 (PMC)
July 2017

Population- and individual-specific regulatory variation in Sardinia.

Nat Genet 2017 May 10;49(5):700-707. Epub 2017 Apr 10.

Istituto di Ricerca Genetica e Biomedica (IRGB), CNR, Monserrato, Italy.

Genetic studies of complex traits have mainly identified associations with noncoding variants. To further determine the contribution of regulatory variation, we combined whole-genome and transcriptome data for 624 individuals from Sardinia to identify common and rare variants that influence gene expression and splicing. We identified 21,183 expression quantitative trait loci (eQTLs) and 6,768 splicing quantitative trait loci (sQTLs), including 619 new QTLs. We identified high-frequency QTLs and found evidence of selection near genes involved in malarial resistance and increased multiple sclerosis risk, reflecting the epidemiological history of Sardinia. Using family relationships, we identified 809 segregating expression outliers (median z score of 2.97), averaging 13.3 genes per individual. Outlier genes were enriched for proximal rare variants, providing a new approach to study large-effect regulatory variants and their relevance to traits. Our results provide insight into the effects of regulatory variants and their relationship to population history and individual genetic risk.

Source
http://dx.doi.org/10.1038/ng.3840 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411016 (PMC)
May 2017

The impact of structural variation on human gene expression.

Nat Genet 2017 May 3;49(5):692-699. Epub 2017 Apr 3.

McDonnell Genome Institute, Washington University School of Medicine, St. Louis, Missouri, USA.

Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs, a substantially higher fraction than prior estimates, and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.

Source
http://dx.doi.org/10.1038/ng.3834 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5406250 (PMC)
May 2017

Genetic variation in MHC proteins is associated with T cell receptor expression biases.

Nat Genet 2016 09 1;48(9):995-1002. Epub 2016 Aug 1.

Department of Genetics, Stanford University, Stanford, California, USA.

In each individual, a highly diverse T cell receptor (TCR) repertoire interacts with peptides presented by major histocompatibility complex (MHC) molecules. Despite extensive research, it remains controversial whether germline-encoded TCR-MHC contacts promote TCR-MHC specificity and, if so, whether differences exist in TCR V gene compatibilities with different MHC alleles. We applied expression quantitative trait locus (eQTL) mapping to test for associations between genetic variation and TCR V gene usage in a large human cohort. We report strong trans associations between variation in the MHC locus and TCR V gene usage. Fine-mapping of the association signals identifies specific amino acids from MHC genes that bias V gene usage, many of which contact or are spatially proximal to the TCR or peptide in the TCR-peptide-MHC complex. Hence, these MHC variants, several of which are linked to autoimmune diseases, can directly affect TCR-MHC interaction. These results provide the first examples of trans-QTL effects mediated by protein-protein interactions and are consistent with intrinsic TCR-MHC specificity.

Source
http://dx.doi.org/10.1038/ng.3625 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5010864 (PMC)
September 2016