Publications by authors named "Kristin G Ardlie"

59 Publications

Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs.

Nat Commun 2021 Jun 7;12(1):3394. Epub 2021 Jun 7.

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods to assess variants' effect on gene expression in native chromatin context via direct perturbation are low-throughput. Existing high-throughput computational predictors thus have lacked large gold standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
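
As a rough illustration of how a functional score can act as a prior in statistical fine-mapping, the sketch below combines per-variant prior weights with per-variant Bayes factors. It assumes exactly one causal variant per locus, which is a simplification chosen for brevity. The function name `posterior_inclusion_probabilities` and all numbers are hypothetical, and this is not the procedure or data used in the paper.

```python
import numpy as np

def posterior_inclusion_probabilities(log_bayes_factors, prior_weights):
    """Posterior probability that each variant is the single causal one.

    Assumes exactly one causal variant in the locus (an illustrative
    simplification) and strictly positive prior weights.
    """
    log_bf = np.asarray(log_bayes_factors, dtype=float)
    prior = np.asarray(prior_weights, dtype=float)
    prior = prior / prior.sum()          # normalize the prior to sum to 1
    log_post = log_bf + np.log(prior)    # unnormalized log posterior
    log_post -= log_post.max()           # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical locus: three variants in tight LD with similar association evidence.
log_bf = [10.0, 10.0, 9.5]
uniform = posterior_inclusion_probabilities(log_bf, [1.0, 1.0, 1.0])
informed = posterior_inclusion_probabilities(log_bf, [0.01, 0.20, 0.01])
print(uniform.round(3))
print(informed.round(3))
```

With a uniform prior the first two variants are indistinguishable. A prior that favors the second variant concentrates the posterior on it, which is the general way an informative prior can resolve additional putative causal variants.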

Source
DOI: http://dx.doi.org/10.1038/s41467-021-23134-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8184741
June 2021

Population-scale tissue transcriptomics maps long non-coding RNAs to complex disease.

Cell 2021 May 16;184(10):2633-2648.e19. Epub 2021 Apr 16.

Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Pathology, Stanford University, Stanford, CA 94305, USA.

Long non-coding RNA (lncRNA) genes have well-established and important impacts on molecular and cellular functions. However, among the thousands of lncRNA genes, it is still a major challenge to identify the subset with disease or trait relevance. To systematically characterize these lncRNA genes, we used Genotype-Tissue Expression (GTEx) project v8 genetic and multi-tissue transcriptomic data to profile the expression, genetic regulation, cellular contexts, and trait associations of 14,100 lncRNA genes across 49 tissues for 101 distinct complex genetic traits. Using these approaches, we identified 1,432 lncRNA gene-trait associations, 800 of which were not explained by stronger effects of neighboring protein-coding genes. This included associations between lncRNA quantitative trait loci and inflammatory bowel disease, type 1 and type 2 diabetes, and coronary artery disease, as well as rare variant associations with body mass index.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2021.03.050
May 2021

RNA-SeQC 2: Efficient RNA-seq quality control and quantification for large cohorts.

Bioinformatics 2021 Mar 2. Epub 2021 Mar 2.

Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Summary: Post-sequencing quality control is a crucial component of RNA sequencing (RNA-seq) data generation and analysis, as sample quality can be affected by sample storage, extraction, and sequencing protocols. RNA-seq is increasingly applied to cohorts ranging from hundreds to tens of thousands of samples in size, but existing tools do not readily scale to these sizes, and were not designed for a wide range of sample types and qualities. Here, we describe RNA-SeQC 2, an efficient reimplementation of RNA-SeQC (DeLuca et al., 2012) that adds multiple metrics designed to characterize sample quality across a wide range of RNA-seq protocols.

Availability and Implementation: The command-line tool, documentation, and C++ source code are available at the GitHub repository https://github.com/getzlab/rnaseqc.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btab135
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479667
March 2021

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Nature 2021 Feb 10;590(7845):290-299. Epub 2021 Feb 10.

The Broad Institute of MIT and Harvard, Cambridge, MA, USA.

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
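
To put the quoted percentages in absolute terms, the short calculation below converts them into approximate counts. It treats "more than 400 million" as exactly 400 million, so the counts are lower-bound approximations. It also estimates how many allele copies a variant at 0.01% frequency would have in a diploid panel of 53,831 individuals. That last figure is an illustrative assumption about a panel of this size, not a number reported in the abstract.

```python
# Figures quoted in the abstract; "more than 400 million" is treated as exactly
# 400 million here, so every derived count is a lower-bound approximation.
n_variants = 400_000_000
n_samples = 53_831

rare = 0.97 * n_variants         # variants with frequency below 1%
singletons = 0.46 * n_variants   # variants present in only one individual

# Expected allele copies at a 0.01% frequency in a diploid panel of this size
# (two chromosomes per individual; autosomes assumed).
n_chromosomes = 2 * n_samples
copies_at_floor = 0.0001 * n_chromosomes

print(f"rare variants: ~{rare / 1e6:.0f} million")
print(f"singletons:    ~{singletons / 1e6:.0f} million")
print(f"allele copies at 0.01% frequency: ~{copies_at_floor:.1f}")
```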

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03205-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875770
February 2021

Whole genome sequence analysis of pulmonary function and COPD in 19,996 multi-ethnic participants.

Nat Commun 2020 Oct 14;11(1):5182. Epub 2020 Oct 14.

The Institute for Translational Genomics and Population Sciences, The Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, 90502, USA.

Chronic obstructive pulmonary disease (COPD), diagnosed by reduced lung function, is a leading cause of morbidity and mortality. We performed whole genome sequence (WGS) analysis of lung function and COPD in a multi-ethnic sample of 11,497 participants from population- and family-based studies, and 8499 individuals from COPD-enriched studies in the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program. We identify at genome-wide significance 10 known GWAS loci and 22 distinct, previously unreported loci, including two common variant signals from stratified analysis of African Americans. Four novel common variants within the regions of PIAS1, RGN (two variants) and FTO show evidence of replication in the UK Biobank (European ancestry n ~ 320,000), while colocalization analyses leveraging multi-omic data from GTEx and TOPMed identify potential molecular mechanisms underlying four of the 22 novel loci. Our study demonstrates the value of performing WGS analyses and multi-omic follow-up in cohorts of diverse ancestry.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-18334-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7598941
October 2020

Fine-mapping and QTL tissue-sharing information improves the reliability of causal gene identification.

Genet Epidemiol 2020 Sep 10. Epub 2020 Sep 10.

Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois.

The integration of transcriptomic studies and genome-wide association studies (GWAS) via imputed expression has seen extensive application in recent years, enabling the functional characterization and causal gene prioritization of GWAS loci. However, the techniques for imputing transcriptomic traits from DNA variation remain underdeveloped. Furthermore, associations found when linking eQTL studies to complex traits through methods like PrediXcan can lead to false positives due to linkage disequilibrium between distinct causal variants. Therefore, the best prediction performance models may not necessarily lead to more reliable causal gene discovery. With the goal of improving discoveries without increasing false positives, we develop and compare multiple transcriptomic imputation approaches using the most recent GTEx release of expression and splicing data on 17,382 RNA-sequencing samples from 948 post-mortem donors in 54 tissues. We find that informing prediction models with posterior causal probability from fine-mapping (dap-g) and borrowing information across tissues (mashr) can lead to better performance in terms of number and proportion of significant associations that are colocalized and the proportion of silver standard genes identified as indicated by precision-recall and receiver operating characteristic curves. All prediction models are made publicly available at predictdb.org.

Source
DOI: http://dx.doi.org/10.1002/gepi.22346
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693040
September 2020

Cell type-specific genetic regulation of gene expression across human tissues.

Science 2020 Sep;369(6509)

Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain.

The Genotype-Tissue Expression (GTEx) project has identified expression and splicing quantitative trait loci in cis (QTLs) for the majority of genes across a wide range of human tissues. However, the functional characterization of these QTLs has been limited by the heterogeneous cellular composition of GTEx tissue samples. We mapped interactions between computational estimates of cell type abundance and genotype to identify cell type-interaction QTLs for seven cell types and show that cell type-interaction expression QTLs (eQTLs) provide finer resolution to tissue specificity than bulk tissue cis-eQTLs. Analyses of genetic associations with 87 complex traits show a contribution from cell type-interaction QTLs and enable the discovery of hundreds of previously unidentified colocalized loci that are masked in bulk tissue.
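
One common way to test for an interaction between genotype and cell type abundance is a linear model with a genotype-by-abundance product term. The sketch below fits such a model to simulated data by ordinary least squares. The model form, variable names, effect sizes, and simulated data are all illustrative assumptions; the abstract does not specify the exact model or covariates used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

genotype = rng.integers(0, 3, size=n).astype(float)  # alt-allele dosage 0/1/2
abundance = rng.uniform(0.0, 1.0, size=n)             # estimated cell type fraction

# Simulated truth: the genotype effect on expression grows with abundance.
true_interaction = 0.8
expression = (0.1 * genotype + 0.5 * abundance
              + true_interaction * genotype * abundance
              + rng.normal(0.0, 0.5, size=n))

# Design matrix: intercept, genotype, abundance, genotype x abundance.
X = np.column_stack([np.ones(n), genotype, abundance, genotype * abundance])
beta, *_ = np.linalg.lstsq(X, expression, rcond=None)

# t-statistic for the interaction coefficient from ordinary least squares.
residuals = expression - X @ beta
sigma2 = residuals @ residuals / (n - X.shape[1])
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_interaction = beta[3] / se[3]
print(f"interaction estimate = {beta[3]:.2f}, t = {t_interaction:.1f}")
```

A large interaction t-statistic indicates that the genotype effect depends on how much of the cell type is present in the sample, which is the kind of signal a bulk-tissue eQTL test averaged over samples can miss.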

Source
DOI: http://dx.doi.org/10.1126/science.aaz8528
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8051643
September 2020

Determinants of telomere length across human tissues.

Science 2020 Sep;369(6509)

Department of Public Health Sciences, University of Chicago, Chicago, IL, USA.

Telomere shortening is a hallmark of aging. Telomere length (TL) in blood cells has been studied extensively as a biomarker of human aging and disease; however, little is known regarding variability in TL in nonblood, disease-relevant tissue types. Here, we characterize variability in TLs from 6391 tissue samples, representing >20 tissue types and 952 individuals from the Genotype-Tissue Expression (GTEx) project. We describe differences across tissue types, positive correlation among tissue types, and associations with age and ancestry. We show that genetic variation affects TL in multiple tissue types and that TL may mediate the effect of age on gene expression. Our results provide the foundational knowledge regarding TL in healthy tissues that is needed to interpret epidemiological studies of TL and human health.

Source
DOI: http://dx.doi.org/10.1126/science.aaz6876
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8108546
September 2020

Transcriptomic signatures across human tissues identify functional rare genetic variation.

Science 2020 Sep 10;369(6509). Epub 2020 Sep 10.

University of Mississippi Medical Center, Jackson, MS, USA.

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.

Source
DOI: http://dx.doi.org/10.1126/science.aaz5900
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7646251
September 2020

The impact of sex on gene expression across human tissues.

Science 2020 Sep;369(6509)

Department of Statistics, University of Chicago, Chicago, IL, USA.

Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.

Source
DOI: http://dx.doi.org/10.1126/science.aba3066
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8136152
September 2020

Impact of admixture and ancestry on eQTL analysis and GWAS colocalization in GTEx.

Genome Biol 2020 Sep 11;21(1):233. Epub 2020 Sep 11.

Department of Genetics, Stanford University, Stanford, CA, USA.

Background: Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization.

Results: Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in seven tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe 31 loci (0.02%) where a significant colocalization is called only with one eQTL ancestry adjustment method. Notably, both adjustments produce similar numbers of significant colocalizations within each of two different colocalization methods, COLOC and FINEMAP. Finally, we identify a small subset of eQTL-associated variants highly correlated with local ancestry, providing a resource to enhance functional follow-up.

Conclusions: We provide a local ancestry map for admixed individuals in the GTEx v8 release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of the results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02113-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7488497
September 2020

A vast resource of allelic expression data spanning human tissues.

Genome Biol 2020 Sep 11;21(1):234. Epub 2020 Sep 11.

New York Genome Center, New York, NY, USA.

Allele expression (AE) analysis robustly measures cis-regulatory effects. Here, we present and demonstrate the utility of a vast AE resource generated from the GTEx v8 release, containing 15,253 samples spanning 54 human tissues for a total of 431 million measurements of AE at the SNP level and 153 million measurements at the haplotype level. In addition, we develop an extension of our tool phASER that allows effect sizes of cis-regulatory variants to be estimated using haplotype-level AE data. This AE resource is the largest to date, and we are able to make haplotype-level data publicly available. We anticipate that the availability of this resource will enable future studies of regulatory variation across human tissues.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02122-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7488534
September 2020

sn-spMF: matrix factorization informs tissue-specific genetic regulation of gene expression.

Genome Biol 2020 Sep 11;21(1):235. Epub 2020 Sep 11.

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.

Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues from the Genotype-Tissue Expression (GTEx) project. The learned factors reflect tissues with known biological similarity and identify transcription factors that may mediate tissue-specific effects. sn-spMF, available at https://github.com/heyuan7676/ts_eQTLs, can be applied to learn biologically interpretable patterns of eQTL tissue-specificity and generate testable mechanistic hypotheses.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02129-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7488540
September 2020

Transcriptional and Cellular Diversity of the Human Heart.

Circulation 2020 Aug 14;142(5):466-482. Epub 2020 May 14.

Precision Cardiology Laboratory (N.R.T., M.C., S.J.F., A.W.H., A.-D.A., C.N.H., A.A., I.P., C.R., S.H.C., M.B., C.M.S., P.T.E.), Cambridge, MA.

Background: The human heart requires a complex ensemble of specialized cell types to perform its essential function. A greater knowledge of the intricate cellular milieu of the heart is critical to increase our understanding of cardiac homeostasis and pathology. As recent advances in low-input RNA sequencing have allowed definitions of cellular transcriptomes at single-cell resolution at scale, we have applied these approaches to assess the cellular and transcriptional diversity of the nonfailing human heart.

Methods: Microfluidic encapsulation and barcoding were used to perform single nuclear RNA sequencing with samples from 7 human donors, selected for their absence of overt cardiac disease. Individual nuclear transcriptomes were then clustered based on transcriptional profiles of highly variable genes. These clusters were used as the basis for between-chamber and between-sex differential gene expression analyses and intersection with genetic and pharmacologic data.

Results: We sequenced the transcriptomes of 287,269 single cardiac nuclei, revealing 9 major cell types and 20 subclusters of cell types within the human heart. Cellular subclasses include 2 distinct groups of resident macrophages, 4 endothelial subtypes, and 2 fibroblast subsets. Comparisons of cellular transcriptomes by cardiac chamber or sex reveal diversity not only in cardiomyocyte transcriptional programs but also in subtypes involved in extracellular matrix remodeling and vascularization. Using genetic association data, we identified strong enrichment for the role of cell subtypes in cardiac traits and diseases. Intersection of our data set with genes on cardiac clinical testing panels and the druggable genome reveals striking patterns of cellular specificity.

Conclusions: Using large-scale single nuclei RNA sequencing, we defined the transcriptional and cellular diversity in the normal human heart. Our identification of discrete cell subtypes and differentially expressed genes within the heart will ultimately facilitate the development of new therapeutics for cardiovascular diseases.

Source
DOI: http://dx.doi.org/10.1161/CIRCULATIONAHA.119.045401
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7666104
August 2020

RNA sequence analysis reveals macroscopic somatic clonal expansion across normal tissues.

Science 2019 Jun;364(6444)

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

How somatic mutations accumulate in normal cells is poorly understood. A comprehensive analysis of RNA sequencing data from ~6700 samples across 29 normal tissues revealed multiple somatic variants, demonstrating that macroscopic clones can be found in many normal tissues. We found that sun-exposed skin, esophagus, and lung have a higher mutation burden than other tested tissues, which suggests that environmental factors can promote somatic mosaicism. Mutation burden was associated with both age and tissue-specific cell proliferation rate, highlighting that mutations accumulate over both time and number of cell divisions. Finally, normal tissues were found to harbor mutations in known cancer genes and hotspots. This study provides a broad view of macroscopic clonal expansion in human tissues, thus serving as a foundation for associating clonal expansion with environmental factors, aging, and risk of disease.

Source
DOI: http://dx.doi.org/10.1126/science.aaw0726
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7350423
June 2019

Using an atlas of gene regulation across 44 human tissues to inform complex disease- and trait-associated variation.

Nat Genet 2018 Jul 28;50(7):956-967. Epub 2018 Jun 28.

The Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA.

We apply integrative approaches to expression quantitative trait loci (eQTLs) from 44 tissues from the Genotype-Tissue Expression project and genome-wide association study data. About 60% of known trait-associated loci are in linkage disequilibrium with a cis-eQTL, over half of which were not found in previous large-scale whole blood studies. Applying polygenic analyses to metabolic, cardiovascular, anthropometric, autoimmune, and neurodegenerative traits, we find that eQTLs are significantly enriched for trait associations in relevant pathogenic tissues and explain a substantial proportion of the heritability (40-80%). For most traits, tissue-shared eQTLs underlie a greater proportion of trait associations, although tissue-specific eQTLs have a greater contribution to some traits, such as blood pressure. By integrating information from biological pathways with eQTL target genes and applying a gene-based approach, we validate previously implicated causal genes and pathways, and propose new variant and gene associations for several complex traits, which we replicate in the UK Biobank and BioVU.

Source
DOI: http://dx.doi.org/10.1038/s41588-018-0154-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6248311
July 2018

The effects of death and post-mortem cold ischemia on human tissue transcriptomes.

Nat Commun 2018 Feb 13;9(1):490. Epub 2018 Feb 13.

Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, Barcelona, E-08003, Catalonia, Spain.

Post-mortem tissue samples are a key resource for investigating patterns of gene expression. However, the processes triggered by death and the post-mortem interval (PMI) can significantly alter physiologically normal RNA levels. We investigate the impact of PMI on gene expression using data from multiple tissues of post-mortem donors obtained from the GTEx project. We find that many genes change expression over relatively short PMIs in a tissue-specific manner, but this potentially confounding effect in a biological analysis can be minimized by taking into account appropriate covariates. By comparing ante- and post-mortem blood samples, we identify the cascade of transcriptional events triggered by death of the organism. These events do not appear to simply reflect stochastic variation resulting from mRNA degradation, but active and ongoing regulation of transcription. Finally, we develop a model to predict the time since death from the analysis of the transcriptome of a few readily accessible tissues.

Source
DOI: http://dx.doi.org/10.1038/s41467-017-02772-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5811508
February 2018

Landscape of X chromosome inactivation across human tissues.

Nature 2017 Oct;550(7675):244-248

Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA.

X chromosome inactivation (XCI) silences transcription from one of the two X chromosomes in female mammalian cells to balance expression dosage between XX females and XY males. XCI is, however, incomplete in humans: up to one-third of X-chromosomal genes are expressed from both the active and inactive X chromosomes (Xa and Xi, respectively) in female cells, with the degree of 'escape' from inactivation varying between genes and individuals. The extent to which XCI is shared between cells and tissues remains poorly characterized, as does the degree to which incomplete XCI manifests as detectable sex differences in gene expression and phenotypic traits. Here we describe a systematic survey of XCI, integrating over 5,500 transcriptomes from 449 individuals spanning 29 tissues from GTEx (v6p release) and 940 single-cell transcriptomes, combined with genomic sequence data. We show that XCI at 683 X-chromosomal genes is generally uniform across human tissues, but identify examples of heterogeneity between tissues, individuals and cells. We show that incomplete XCI affects at least 23% of X-chromosomal genes, identify seven genes that escape XCI with support from multiple lines of evidence and demonstrate that escape from XCI results in sex biases in gene expression, establishing incomplete XCI as a mechanism that is likely to introduce phenotypic diversity. Overall, this updated catalogue of XCI across human tissues helps to increase our understanding of the extent and impact of the incompleteness in the maintenance of XCI.

Source
DOI: http://dx.doi.org/10.1038/nature24265
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5685192
October 2017

Data Resources for Human Functional Genomics.

Curr Opin Syst Biol 2017 Feb 22;1:75-79. Epub 2017 Feb 22.

Center for Genomic Regulation (CRG), Barcelona, Catalonia, Spain.

Source
DOI: http://dx.doi.org/10.1016/j.coisb.2016.12.019
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5625631
February 2017

Genetic analysis in UK Biobank links insulin resistance and transendothelial migration pathways to coronary artery disease.

Nat Genet 2017 Sep 17;49(9):1392-1397. Epub 2017 Jul 17.

Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, USA.

UK Biobank is among the world's largest repositories for phenotypic and genotypic information in individuals of European ancestry. We performed a genome-wide association study in UK Biobank testing ∼9 million DNA sequence variants for association with coronary artery disease (4,831 cases and 115,455 controls) and carried out meta-analysis with previously published results. We identified 15 new loci, bringing the total number of loci associated with coronary artery disease to 95 at the time of analysis. Phenome-wide association scanning showed that CCDC92 likely affects coronary artery disease through insulin resistance pathways, whereas experimental analysis suggests that ARHGEF26 influences the transendothelial migration of leukocytes.

Source
DOI: http://dx.doi.org/10.1038/ng.3914
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5577383
September 2017

Sharing and Specificity of Co-expression Networks across 35 Human Tissues.

PLoS Comput Biol 2015 May 13;11(5):e1004220. Epub 2015 May 13.

Department of Computer Science, Stanford University, Stanford, California, United States of America.

To understand the regulation of tissue-specific gene expression, the GTEx Consortium generated RNA-seq expression data for more than thirty distinct human tissues. This data provides an opportunity for deriving shared and tissue-specific gene regulatory networks on the basis of co-expression between genes. However, only a small number of samples are available for a majority of the tissues, and therefore statistical inference of networks in this setting is highly underpowered. To address this problem, we infer tissue-specific gene co-expression networks for 35 tissues in the GTEx dataset using a novel algorithm, GNAT, that uses a hierarchy of tissues to share data between related tissues. We show that this transfer learning approach increases the accuracy with which networks are learned. Analysis of these networks reveals that tissue-specific transcription factors are hubs that preferentially connect to genes with tissue-specific functions. Additionally, we observe that genes with tissue-specific functions lie at the peripheries of our networks. We identify numerous modules enriched for Gene Ontology functions, and show that modules conserved across tissues are especially likely to have functions common to all tissues, while modules that are upregulated in a particular tissue are often instrumental to tissue-specific function. Finally, we provide a web tool, available at mostafavilab.stat.ubc.ca/GNAT, which allows exploration of gene function and regulation in a tissue-specific manner.

Source
DOI: http://dx.doi.org/10.1371/journal.pcbi.1004220
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4430528
May 2015

Human genomics. The human transcriptome across tissues and individuals.

Science 2015 May;348(6235):660-5

Center for Genomic Regulation (CRG), Barcelona, Catalonia, Spain. Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain. Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), Barcelona, Catalonia, Spain. Joint CRG-Barcelona Super Computing Center (BSC)-Institut de Recerca Biomedica (IRB) Program in Computational Biology, Barcelona, Catalonia, Spain.

Transcriptional regulation and posttranscriptional processing underlie many cellular and organismal phenotypes. We used RNA sequence data generated by the Genotype-Tissue Expression (GTEx) project to investigate the patterns of transcriptome variation across individuals and tissues. Tissues exhibit characteristic transcriptional signatures that show stability in postmortem samples. These signatures are dominated by a relatively small number of genes (most clearly seen in blood), though few of these genes are exclusive to a particular tissue, and the signatures vary more across tissues than across individuals. Genes exhibiting high interindividual expression variation include disease candidates associated with sex, ethnicity, and age. Primary transcription is the major driver of cellular specificity, with splicing playing mostly a complementary role, except in the brain, which exhibits a more divergent splicing program. Variation in splicing, despite its stochasticity, may in contrast play a comparatively greater role in defining individual phenotypes.

Source
DOI: http://dx.doi.org/10.1126/science.aaa0355
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4547472
May 2015

Sequence analysis of mutations and translocations across breast cancer subtypes.

Nature 2012 Jun 20;486(7403):405-9. Epub 2012 Jun 20.

The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA.

Breast carcinoma is the leading cause of cancer-related mortality in women worldwide, with an estimated 1.38 million new cases and 458,000 deaths in 2008 alone. This malignancy represents a heterogeneous group of tumours with characteristic molecular features, prognosis and responses to available therapy. Recurrent somatic alterations in breast cancer have been described, including mutations and copy number alterations, notably ERBB2 amplifications, the first successful therapy target defined by a genomic aberration. Previous DNA sequencing studies of breast cancer genomes have revealed additional candidate mutations and gene rearrangements. Here we report the whole-exome sequences of DNA from 103 human breast cancers of diverse subtypes from patients in Mexico and Vietnam compared to matched-normal DNA, together with whole-genome sequences of 22 breast cancer/normal pairs. Beyond confirming recurrent somatic mutations in PIK3CA, TP53, AKT1, GATA3 and MAP3K1, we discovered recurrent mutations in the CBFB transcription factor gene and deletions of its partner RUNX1. Furthermore, we have identified a recurrent MAGI3-AKT3 fusion enriched in triple-negative breast cancer lacking oestrogen and progesterone receptors and ERBB2 expression. The MAGI3-AKT3 fusion leads to constitutive activation of AKT kinase, which is abolished by treatment with an ATP-competitive AKT small-molecule inhibitor.

Source
DOI: http://dx.doi.org/10.1038/nature11154
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4148686
June 2012

Initial genome sequencing and analysis of multiple myeloma.

Nature 2011 Mar;471(7339):467-72

The Eli and Edythe L. Broad Institute, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA.

Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumour genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the data set. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signalling was indicated by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge.

Source
DOI: http://dx.doi.org/10.1038/nature09837
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3560292
March 2011

Hundreds of variants clustered in genomic loci and biological pathways affect human height.

Nature 2010 Oct 29;467(7317):832-8. Epub 2010 Sep 29.

Genetics of Complex Traits, Peninsula College of Medicine and Dentistry, University of Exeter, Exeter EX1 2LU, UK.

Most common human traits and diseases have a polygenic pattern of inheritance: DNA sequence variants at many genetic loci influence the phenotype. Genome-wide association (GWA) studies have identified more than 600 variants associated with human traits, but these typically explain small fractions of phenotypic variation, raising questions about the use of further studies. Here, using 183,727 individuals, we show that hundreds of genetic variants, in at least 180 loci, influence adult height, a highly heritable and classic polygenic trait. The large number of loci reveals patterns with important implications for genetic studies of common human diseases and traits. First, the 180 loci are not random, but instead are enriched for genes that are connected in biological pathways (P = 0.016) and that underlie skeletal growth defects (P < 0.001). Second, the likely causal gene is often located near the most strongly associated variant: in 13 of 21 loci containing a known skeletal growth gene, that gene was closest to the associated variant. Third, at least 19 loci have multiple independently associated variants, suggesting that allelic heterogeneity is a frequent feature of polygenic traits, that comprehensive explorations of already-discovered loci should discover additional variants and that an appreciable fraction of associated loci may have been identified. Fourth, associated variants are enriched for likely functional effects on genes, being over-represented among variants that alter amino-acid structure of proteins and expression levels of nearby genes. Our data explain approximately 10% of the phenotypic variation in height, and we estimate that unidentified common variants of similar effect sizes would increase this figure to approximately 16% of phenotypic variation (approximately 20% of heritable variation). 
Although additional approaches are needed to dissect the genetic architecture of polygenic human traits fully, our findings indicate that GWA studies can identify large numbers of loci that implicate biologically relevant genes and pathways.
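
The variance figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the "implied heritability" is simply the ratio of the two projected percentages given above, not a value stated in the abstract.

```python
# Figures quoted in the abstract above (as fractions of variance).
explained_now = 0.10        # phenotypic variance explained by identified variants
explained_projected = 0.16  # projected share of phenotypic variance
share_of_heritable = 0.20   # the same projection, as a share of heritable variance

# If 16% of phenotypic variance corresponds to 20% of heritable variance,
# the heritability implied by those two numbers is their ratio.
implied_heritability = explained_projected / share_of_heritable

# Share of heritable variance covered by the variants already identified,
# under that implied heritability.
current_share_of_heritable = explained_now / implied_heritability

print(f"implied heritability: {implied_heritability:.2f}")
print(f"current share of heritable variation: {current_share_of_heritable:.3f}")
```

Under these assumptions the implied heritability is about 0.8, so the variants already identified would account for roughly one eighth of the heritable variation in height.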

Source
DOI: http://dx.doi.org/10.1038/nature09410
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2955183
October 2010

Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci.

Nat Genet 2010 Jun 9;42(6):508-14. Epub 2010 May 9.

Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Boston, Massachusetts, USA.

To identify new genetic risk factors for rheumatoid arthritis, we conducted a genome-wide association study meta-analysis of 5,539 autoantibody-positive individuals with rheumatoid arthritis (cases) and 20,169 controls of European descent, followed by replication in an independent set of 6,768 rheumatoid arthritis cases and 8,806 controls. Of 34 SNPs selected for replication, 7 new rheumatoid arthritis risk alleles were identified at genome-wide significance (P < 5 × 10⁻⁸) in an analysis of all 41,282 samples. The associated SNPs are near genes of known immune function, including IL6ST, SPRED2, RBPJ, CCR6, IRF5 and PXK. We also refined associations at two established rheumatoid arthritis risk loci (IL2RA and CCL21) and confirmed the association at AFF3. These new associations bring the total number of confirmed rheumatoid arthritis risk loci to 31 among individuals of European ancestry. An additional 11 SNPs replicated at P < 0.05, many of which are validated autoimmune risk alleles, suggesting that most represent genuine rheumatoid arthritis risk alleles.
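
As a quick consistency check, the total of 41,282 samples can be recovered from the four cohort sizes quoted in the abstract. This is a minimal sketch with hypothetical variable names; it uses only numbers given above.

```python
# Cohort sizes as quoted in the abstract.
discovery_cases = 5_539
discovery_controls = 20_169
replication_cases = 6_768
replication_controls = 8_806

# Combined sample size across discovery and replication stages.
total_samples = (
    discovery_cases
    + discovery_controls
    + replication_cases
    + replication_controls
)

print(f"total samples: {total_samples:,}")
```

The sum matches the 41,282 samples reported for the combined analysis.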

Source
DOI: http://dx.doi.org/10.1038/ng.582
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4243840
June 2010

Genetic variants at CD28, PRDM1 and CD2/CD58 are associated with rheumatoid arthritis risk.

Nat Genet 2009 Dec 8;41(12):1313-8. Epub 2009 Nov 8.

Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital, Boston, Massachusetts, USA.

To discover new rheumatoid arthritis (RA) risk loci, we systematically examined 370 SNPs from 179 independent loci with P < 0.001 in a published meta-analysis of RA genome-wide association studies (GWAS) of 3,393 cases and 12,462 controls. We used Gene Relationships Across Implicated Loci (GRAIL), a computational method that applies statistical text mining to PubMed abstracts, to score these 179 loci for functional relationships to genes in 16 established RA disease loci. We identified 22 loci with a significant degree of functional connectivity. We genotyped 22 representative SNPs in an independent set of 7,957 cases and 11,958 matched controls. Three were convincingly validated: CD2-CD58 (rs11586238, P = 1 × 10⁻⁶ replication, P = 1 × 10⁻⁹ overall), CD28 (rs1980422, P = 5 × 10⁻⁶ replication, P = 1 × 10⁻⁹ overall) and PRDM1 (rs548234, P = 1 × 10⁻⁵ replication, P = 2 × 10⁻⁸ overall). An additional four were replicated (P < 0.0023): TAGAP (rs394581, P = 0.0002 replication, P = 4 × 10⁻⁷ overall), PTPRC (rs10919563, P = 0.0003 replication, P = 7 × 10⁻⁷ overall), TRAF6-RAG1 (rs540386, P = 0.0008 replication, P = 4 × 10⁻⁶ overall) and FCGR2A (rs12746613, P = 0.0022 replication, P = 2 × 10⁻⁵ overall). Many of these loci are also associated with other immunologic diseases.
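
The abstract does not say how the replication threshold of P < 0.0023 was chosen. One reading, offered here purely as an assumption, is a Bonferroni correction of a 0.05 significance level across the 22 SNPs genotyped in replication. The sketch below computes that threshold and compares it with the replication P values listed above; the variable names are hypothetical.

```python
ALPHA = 0.05        # assumed family-wise significance level (not stated in the abstract)
N_SNPS_TESTED = 22  # representative SNPs genotyped in the replication set

# Bonferroni-corrected per-SNP threshold under the assumption above.
bonferroni_threshold = ALPHA / N_SNPS_TESTED

# Replication P values as reported in the abstract.
replication_p = {
    "CD2-CD58": 1e-6,
    "CD28": 5e-6,
    "PRDM1": 1e-5,
    "TAGAP": 0.0002,
    "PTPRC": 0.0003,
    "TRAF6-RAG1": 0.0008,
    "FCGR2A": 0.0022,
}

# Loci whose replication P value falls below the corrected threshold.
passing = [locus for locus, p in replication_p.items() if p < bonferroni_threshold]

print(f"threshold = {bonferroni_threshold:.4f}")
print(f"{len(passing)} of {len(replication_p)} reported loci pass: {passing}")
```

Under this assumption the computed threshold rounds to the 0.0023 quoted in the abstract, and all seven reported loci fall below it.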

Source
DOI: http://dx.doi.org/10.1038/ng.479
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3142887
December 2009

Evaluation of the 8q24 prostate cancer risk locus and MYC expression.

Cancer Res 2009 Jul 23;69(13):5568-74. Epub 2009 Jun 23.

Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA.

Polymorphisms at 8q24 are robustly associated with prostate cancer risk. The risk variants are located in non-protein-coding regions and their mechanism has not been fully elucidated. To further dissect the function of this locus, we tested two hypotheses: (a) unannotated microRNAs (miRNA) are transcribed in the region, and (b) this region is a cis-acting enhancer. Using next generation sequencing, 8q24 risk regions were interrogated for known and novel miRNAs in histologically normal radical prostatectomy tissue. We also evaluated the association between the risk variants and transcript levels of multiple genes, focusing on the proto-oncogene MYC. RNA expression was measured in histologically normal and tumor tissue from 280 prostatectomy specimens (from 234 European American and 46 African American patients), and paired germline DNA from each individual was genotyped for six 8q24 risk single nucleotide polymorphisms. No evidence was found for significant miRNA transcription within 8q24 prostate cancer risk loci. Likewise, no convincing association between RNA expression and risk allele status was detected in either histologically normal or tumor tissue. To our knowledge, this is one of the first and largest studies to directly assess miRNA in this region and to systematically measure MYC expression levels in prostate tissue in relation to inherited risk variants. These data will help to direct the future study of this risk locus.

Source
DOI: http://dx.doi.org/10.1158/0008-5472.CAN-09-0387
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2884104
July 2009

Common variants at CD40 and other loci confer risk of rheumatoid arthritis.

Nat Genet 2008 Oct 14;40(10):1216-23. Epub 2008 Sep 14.

Program in Medical and Population Genetics, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142, USA.

To identify rheumatoid arthritis risk loci in European populations, we conducted a meta-analysis of two published genome-wide association (GWA) studies totaling 3,393 cases and 12,462 controls. We genotyped 31 top-ranked SNPs not previously associated with rheumatoid arthritis in an independent replication of 3,929 autoantibody-positive rheumatoid arthritis cases and 5,807 matched controls from eight separate collections. We identified a common variant at the CD40 gene locus (rs4810485, P = 0.0032 replication, P = 8.2 × 10⁻⁹ overall, OR = 0.87). Along with other associations near TRAF1 (refs. 2,3) and TNFAIP3 (refs. 4,5), this implies a central role for the CD40 signaling pathway in rheumatoid arthritis pathogenesis. We also identified association at the CCL21 gene locus (rs2812378, P = 0.00097 replication, P = 2.8 × 10⁻⁷ overall), a gene involved in lymphocyte trafficking. Finally, we identified evidence of association at four additional gene loci: MMEL1-TNFRSF14 (rs3890745, P = 0.0035 replication, P = 1.1 × 10⁻⁷ overall), CDK6 (rs42041, P = 0.010 replication, P = 4.0 × 10⁻⁶ overall), PRKCQ (rs4750316, P = 0.0078 replication, P = 4.4 × 10⁻⁶ overall), and KIF5A-PIP4K2C (rs1678542, P = 0.0026 replication, P = 8.8 × 10⁻⁸ overall).

Source
DOI: http://dx.doi.org/10.1038/ng.233
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2757650
October 2008
-->