Publications by authors named "Klaudia Walter"

49 Publications

Whole-genome sequencing in diverse subjects identifies genetic correlates of leukocyte traits: The NHLBI TOPMed program.

Am J Hum Genet 2021 Oct 27;108(10):1836-1851. Epub 2021 Sep 27.

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.

Many common and rare variants associated with hematologic traits have been discovered through imputation on large-scale reference panels. However, the majority of genome-wide association studies (GWASs) have been conducted in Europeans, and determining causal variants has proved challenging. We performed a GWAS of total leukocyte, neutrophil, lymphocyte, monocyte, eosinophil, and basophil counts generated from 109,563,748 variants in the autosomes and the X chromosome in the Trans-Omics for Precision Medicine (TOPMed) program, which included data from 61,802 individuals of diverse ancestry. We discovered and replicated 7 leukocyte trait associations, including (1) the association between a chromosome X, pseudo-autosomal region (PAR), noncoding variant located between cytokine receptor genes (CSF2RA and CRLF2) and lower eosinophil count; and (2) associations between single variants found predominantly among African Americans at the S1PR3 (9q22.1) and HBB (11p15.4) loci and monocyte and lymphocyte counts, respectively. We further provide evidence indicating that the newly discovered eosinophil-lowering chromosome X PAR variant might be associated with reduced susceptibility to common allergic diseases such as atopic dermatitis and asthma. Additionally, we found a burden of very rare FLT3 (13q12.2) variants associated with monocyte counts. Together, these results emphasize the utility of whole-genome sequencing in diverse samples in identifying associations missed by European-ancestry-driven GWASs.

Source
http://dx.doi.org/10.1016/j.ajhg.2021.08.007
October 2021

Mitochondrial DNA variants modulate N-formylmethionine, proteostasis and risk of late-onset human diseases.

Nat Med 2021 09 23;27(9):1564-1575. Epub 2021 Aug 23.

Human Genetics Department, Wellcome Sanger Institute (WT), Hinxton, UK.

Mitochondrial DNA (mtDNA) variants influence the risk of late-onset human diseases, but the reasons for this are poorly understood. Undertaking a hypothesis-free analysis of 5,689 blood-derived biomarkers with mtDNA variants in 16,220 healthy donors, here we show that variants defining mtDNA haplogroups Uk and H4 modulate the level of circulating N-formylmethionine (fMet), which initiates mitochondrial protein translation. In human cytoplasmic hybrid (cybrid) lines, fMet modulated both mitochondrial and cytosolic proteins on multiple levels, through transcription, post-translational modification and proteolysis by an N-degron pathway, abolishing known differences between mtDNA haplogroups. In a further 11,966 individuals, fMet levels contributed to all-cause mortality and the disease risk of several common cardiovascular disorders. Together, these findings indicate that fMet plays a key role in common age-related disease through pleiotropic effects on cell proteostasis.

Source
http://dx.doi.org/10.1038/s41591-021-01441-3
September 2021

Effects of adiposity on the human plasma proteome: observational and Mendelian randomisation estimates.

Int J Obes (Lond) 2021 Oct 5;45(10):2221-2229. Epub 2021 Jul 5.

Medical Research Council (MRC) Integrative Epidemiology Unit at the University of Bristol, Bristol, UK.

Background: Variation in adiposity is associated with cardiometabolic disease outcomes, but mechanisms leading from this exposure to disease are unclear. This study aimed to estimate effects of body mass index (BMI) on an extensive set of circulating proteins.

Methods: We used SomaLogic proteomic data from up to 2737 healthy participants from the INTERVAL study. Associations between self-reported BMI and 3622 unique plasma proteins were explored using linear regression. These were complemented by Mendelian randomisation (MR) analyses using a genetic risk score (GRS) comprised of 654 BMI-associated polymorphisms from a recent genome-wide association study (GWAS) of adult BMI. A disease enrichment analysis was performed using DAVID Bioinformatics 6.8 for proteins which were altered by BMI.

Results: Observationally, BMI was associated with 1576 proteins (P < 1.4 × 10), with particularly strong evidence for a positive association with leptin and fatty acid-binding protein-4 (FABP4), and a negative association with sex hormone-binding globulin (SHBG). Observational estimates were likely confounded, but the GRS for BMI did not associate with measured confounders. MR analyses provided evidence for a causal relationship between BMI and eight proteins including leptin (0.63 standard deviation (SD) per SD BMI, 95% CI 0.48-0.79, P = 1.6 × 10), FABP4 (0.64 SD per SD BMI, 95% CI 0.46-0.83, P = 6.7 × 10) and SHBG (-0.45 SD per SD BMI, 95% CI -0.65 to -0.25, P = 1.4 × 10). There was agreement in the magnitude of observational and MR estimates (R = 0.33) and evidence that proteins most strongly altered by BMI were enriched for genes involved in cardiovascular disease.

Conclusions: This study provides evidence for a broad impact of adiposity on the human proteome. Proteins strongly altered by BMI include those involved in regulating appetite, sex hormones and inflammation; such proteins are also enriched for cardiovascular disease-related genes. Altogether, results help focus attention onto new proteomic signatures of obesity-related disease.
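
The Methods above combine a genetic risk score with Mendelian randomisation. The abstract does not say which estimator was used, so the following is only a minimal sketch under stated assumptions: a weighted GRS built from per-variant effect sizes, and a simple ratio estimate (slope of protein on GRS divided by slope of BMI on GRS). All function names and the toy data are hypothetical and not taken from the study.

```python
from statistics import mean


def genetic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2 per variant).

    `weights` stands in for per-variant effect sizes from an external
    GWAS; the values used below are illustrative only.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def ratio_estimate(grs, exposure, outcome):
    """Ratio estimate of the exposure -> outcome effect.

    Assumes the GRS affects the outcome only through the exposure,
    which is the core (untestable) instrumental-variable assumption.
    """
    return ols_slope(grs, outcome) / ols_slope(grs, exposure)


# Toy, noise-free data: the exposure rises 2 units per GRS unit and
# the outcome is exactly half the exposure.
grs = [0.0, 1.0, 2.0, 3.0]
bmi = [1.0, 3.0, 5.0, 7.0]
protein = [0.5, 1.5, 2.5, 3.5]
effect = ratio_estimate(grs, bmi, protein)
```

With real data the slopes would come from regressions adjusted for covariates, and the ratio would need a standard error; neither is shown here.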

Source
http://dx.doi.org/10.1038/s41366-021-00896-1
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8455324
October 2021

Genetic perturbation of PU.1 binding and chromatin looping at neutrophil enhancers associates with autoimmune disease.

Nat Commun 2021 04 16;12(1):2298. Epub 2021 Apr 16.

Human Genetics, Wellcome Sanger Institute, Genome Campus, Hinxton, UK.

Neutrophils play fundamental roles in innate immune response, shape adaptive immunity, and are a potentially causal cell type underpinning genetic associations with immune system traits and diseases. Here, we profile the binding of myeloid master regulator PU.1 in primary neutrophils across nearly a hundred volunteers. We show that variants associated with differential PU.1 binding underlie genetically-driven differences in cell count and susceptibility to autoimmune and inflammatory diseases. We integrate these results with other multi-individual genomic readouts, revealing coordinated effects of PU.1 binding variants on the local chromatin state, enhancer-promoter contacts and downstream gene expression, and providing a functional interpretation for 27 genes underlying immune traits. Collectively, these results demonstrate the functional role of PU.1 and its target enhancers in neutrophil transcriptional control and immune disease susceptibility.

Source
http://dx.doi.org/10.1038/s41467-021-22548-8
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8052402
April 2021

The Polygenic and Monogenic Basis of Blood Traits and Diseases.

Cell 2020 09;182(5):1214-1231.e11

Laboratory of Epidemiology and Population Science, National Institute on Aging/NIH, Baltimore, MD, 21224, USA.

Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which when perturbed cause a significant global health burden. Here we integrate data from UK Biobank and a large-scale international collaborative effort, including data for 563,085 European ancestry participants, and discover 5,106 new genetic variants independently associated with 29 blood cell phenotypes covering a range of variation impacting hematopoiesis. We holistically characterize the genetic architecture of hematopoiesis, assess the relevance of the omnigenic model to blood cell phenotypes, delineate relevant hematopoietic cell states influenced by regulatory genetic variants and gene networks, identify novel splice-altering variants mediating the associations, and assess the polygenic prediction potential for blood traits and clinical disorders at the interface of complex and Mendelian genetics. These results show the power of large-scale blood cell trait GWAS to interrogate clinically meaningful variants across a wide allelic spectrum of human variation.

Source
http://dx.doi.org/10.1016/j.cell.2020.08.008
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7482360
September 2020

The influence of rare variants in circulating metabolic biomarkers.

PLoS Genet 2020 03 9;16(3):e1008605. Epub 2020 Mar 9.

Wellcome Sanger Institute, Cambridge, United Kingdom.

Circulating metabolite levels are biomarkers for cardiovascular disease (CVD). Here we studied the association of rare variants with 226 serum lipoproteins, lipids and amino acids in 7,142 (discovery plus follow-up) healthy participants. We leveraged the information from multiple metabolite measurements on the same participants to improve discovery in rare variant association analyses for gene-based and gene-set tests by incorporating correlated metabolites as covariates in the validation stage. Gene-based analysis, corrected for the effective number of tests performed, confirmed established associations at APOB, APOC3, PAH, HAL and PCSK (p < 1.32 × 10(-7)) and identified novel gene-trait associations at a lower stringency threshold with ACSL1, MYCN, FBXO36 and B4GALNT3 (p < 2.5 × 10(-6)). Regulation of the pyruvate dehydrogenase (PDH) complex was associated for the first time, in gene-set analyses also corrected for effective number of tests, with IDL and LDL parameters, as well as circulating cholesterol (pMETASKAT < 2.41 × 10(-6)). In conclusion, using an approach that leverages metabolite measurements obtained in the same participants, we identified novel loci and pathways involved in the regulation of these important metabolic biomarkers. As large-scale biobanks continue to amass sequencing and phenotypic information, analytical approaches such as ours will be useful to fully exploit the copious amounts of biological data generated in these efforts.

Source
http://dx.doi.org/10.1371/journal.pgen.1008605
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7108731
March 2020

Population-wide copy number variation calling using variant call format files from 6,898 individuals.

Genet Epidemiol 2020 01 14;44(1):79-89. Epub 2019 Sep 14.

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom.

Copy number variants (CNVs) play an important role in a number of human diseases, but the accurate calling of CNVs remains challenging. Most current approaches to CNV detection use raw read alignments, which are computationally intensive to process. We use a regression tree-based approach to call germline CNVs from whole-genome sequencing (WGS, >18x) variant call sets in 6,898 samples across four European cohorts, and describe a rich large variation landscape comprising 1,320 CNVs. Eighty-one percent of detected events have been previously reported in the Database of Genomic Variants. Twenty-three percent of high-quality deletions affect entire genes, and we recapitulate known events such as the GSTM1 and RHD gene deletions. We test for association between the detected deletions and 275 protein levels in 1,457 individuals to assess the potential clinical impact of the detected CNVs. We describe complex CNV patterns underlying an association with levels of the CCL3 protein (MAF = 0.15, p = 3.6x10 ) at the CCL3L3 locus, and a novel cis-association between a low-frequency NOMO1 deletion and NOMO1 protein levels (MAF = 0.02, p = 2.2x10 ). This study demonstrates that existing population-wide WGS call sets can be mined for germline CNVs with minimal computational overhead, delivering insight into a less well-studied, yet potentially impactful class of genetic variant.

Source
http://dx.doi.org/10.1002/gepi.22260
January 2020

GARFIELD classifies disease-relevant genomic features through integration of functional annotations with association signals.

Nat Genet 2019 02 28;51(2):343-353. Epub 2019 Jan 28.

Human Genetics, Wellcome Sanger Institute, Hinxton, UK.

Loci discovered by genome-wide association studies predominantly map outside protein-coding genes. The interpretation of the functional consequences of non-coding variants can be greatly enhanced by catalogs of regulatory genomic regions in cell lines and primary tissues. However, robust and readily applicable methods are still lacking by which to systematically evaluate the contribution of these regions to genetic variation implicated in diseases or quantitative traits. Here we propose a novel approach that leverages genome-wide association studies' findings with regulatory or functional annotations to classify features relevant to a phenotype of interest. Within our framework, we account for major sources of confounding not offered by current methods. We further assess enrichment of genome-wide association studies for 19 traits within Encyclopedia of DNA Elements- and Roadmap-derived regulatory regions. We characterize unique enrichment patterns for traits and annotations driving novel biological insights. The method is implemented in standalone software and an R package, to facilitate its application by the research community.

Source
http://dx.doi.org/10.1038/s41588-018-0322-6
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6908448
February 2019

Low-frequency variation in TP53 has large effects on head circumference and intracranial volume.

Nat Commun 2019 01 21;10(1):357. Epub 2019 Jan 21.

School of Medicine and Public Health, Faculty of Medicine and Health, The University of Newcastle, Newcastle, NSW, 2308, Australia.

Cranial growth and development is a complex process which affects the closely related traits of head circumference (HC) and intracranial volume (ICV). The underlying genetic influences shaping these traits during the transition from childhood to adulthood are little understood, but might include both age-specific genetic factors and low-frequency genetic variation. Here, we model the developmental genetic architecture of HC, showing this is genetically stable and correlated with genetic determinants of ICV. Investigating up to 46,000 children and adults of European descent, we identify association with final HC and/or final ICV + HC at 9 novel common and low-frequency loci, illustrating that genetic variation from a wide allele frequency spectrum contributes to cranial growth. The largest effects are reported for low-frequency variants within TP53, with 0.5 cm wider heads in increaser-allele carriers versus non-carriers during mid-childhood, suggesting a previously unrecognized role of TP53 transcripts in human cranial development.

Source
http://dx.doi.org/10.1038/s41467-018-07863-x
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6341110
January 2019

Author Correction: Cohort-wide deep whole genome sequencing and the allelic architecture of complex traits.

Nat Commun 2018 12 19;9(1):5460. Epub 2018 Dec 19.

Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom.

The original version of this Article contained an error in Fig. 2. In panel a, the two legend items "rare" and "common" were inadvertently swapped. This has been corrected in both the PDF and HTML versions of the Article.

Source
http://dx.doi.org/10.1038/s41467-018-07730-9
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6300593
December 2018

Cohort-wide deep whole genome sequencing and the allelic architecture of complex traits.

Nat Commun 2018 11 7;9(1):4674. Epub 2018 Nov 7.

Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom.

The role of rare variants in complex traits remains uncharted. Here, we conduct deep whole genome sequencing of 1457 individuals from an isolated population, and test for rare variant burdens across six cardiometabolic traits. We identify a role for rare regulatory variation, which has hitherto been missed. We find evidence of rare variant burdens that are independent of established common variant signals (ADIPOQ and adiponectin, P = 4.2 × 10; APOC3 and triglyceride levels, P = 1.5 × 10), and identify replicating evidence for a burden associated with triglyceride levels in FAM189B (P = 2.2 × 10), indicating a role for this gene in lipid metabolism.

Source
http://dx.doi.org/10.1038/s41467-018-07070-8
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6220258
November 2018

Quality Control of Common and Rare Variants.

Methods Mol Biol 2018 ;1793:25-36

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, Cambridgeshire, United Kingdom.

Thorough data quality control (QC) is a key step to the success of high-throughput genotyping approaches. Following extensive research, several criteria and thresholds have been established for data QC at the sample and variant level. Sample QC is aimed at the identification and removal (when appropriate) of individuals with (1) low call rate, (2) discrepant sex or other identity-related information, (3) excess genome-wide heterozygosity and homozygosity, (4) relations to other samples, (5) ethnicity differences, (6) batch effects, and (7) contamination. Variant QC is aimed at identification and removal or refinement of variants with (1) low call rate, (2) call rate differences by phenotypic status, (3) gross deviation from Hardy-Weinberg Equilibrium (HWE), (4) bad genotype intensity plots, (5) batch effects, (6) differences in allele frequencies with published data sets, (7) very low minor allele counts (MAC), (8) low imputation quality score, (9) low variant quality score log-odds, and (10) few or low quality reads.
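
Three of the variant-level criteria listed above (call rate, minor allele count and deviation from HWE) can be illustrated with a short sketch. This is not the chapter's own code: the function names are hypothetical, the thresholds are placeholders rather than recommended values, and the HWE check uses a plain chi-square goodness-of-fit statistic, which is an assumption made here for simplicity.

```python
def _called(genotypes):
    """Drop missing calls; genotypes are 0, 1, 2 alt-allele counts or None."""
    return [g for g in genotypes if g is not None]


def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    return len(_called(genotypes)) / len(genotypes)


def minor_allele_count(genotypes):
    """Count of the less frequent allele among called genotypes."""
    called = _called(genotypes)
    alt = sum(called)
    return min(alt, 2 * len(called) - alt)


def hwe_chi_square(genotypes):
    """Chi-square statistic comparing observed and HWE-expected genotype counts."""
    called = _called(genotypes)
    n = len(called)
    if n == 0:
        return 0.0
    observed = (called.count(0), called.count(1), called.count(2))
    p = (2 * observed[0] + observed[1]) / (2 * n)  # reference allele frequency
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def passes_variant_qc(genotypes, min_call_rate=0.98, min_mac=3, max_hwe_chi2=23.9):
    """Apply the three filters; all thresholds are illustrative placeholders.

    A chi-square value of 23.9 on one degree of freedom corresponds
    roughly to p = 1e-6.
    """
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_count(genotypes) >= min_mac
        and hwe_chi_square(genotypes) <= max_hwe_chi2
    )


in_hwe = [0] * 25 + [1] * 50 + [2] * 25   # matches HWE proportions exactly
no_hets = [0] * 50 + [2] * 50             # no heterozygotes at all
```

In practice an exact HWE test is often preferred for low counts, and thresholds depend on the platform and study design.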

Source
http://dx.doi.org/10.1007/978-1-4939-7868-7_3
February 2019

Automated typing of red blood cell and platelet antigens: a whole-genome sequencing study.

Lancet Haematol 2018 Jun 17;5(6):e241-e251. Epub 2018 May 17.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA; Harvard Medical School, Boston, MA, USA; Partners Personalized Medicine, Boston, MA, USA; Broad Institute of Massachusetts Institute of Technology and Harvard, Boston, MA, USA.

Background: There are more than 300 known red blood cell (RBC) antigens and 33 platelet antigens that differ between individuals. Sensitisation to antigens is a serious complication that can occur in prenatal medicine and after blood transfusion, particularly for patients who require multiple transfusions. Although pre-transfusion compatibility testing largely relies on serological methods, reagents are not available for many antigens. Methods based on single-nucleotide polymorphism (SNP) arrays have been used, but typing for ABO and Rh, the most important blood groups, cannot be done with SNP typing alone. We aimed to develop a novel method based on whole-genome sequencing to identify RBC and platelet antigens.

Methods: This whole-genome sequencing study is a subanalysis of data from patients in the whole-genome sequencing arm of the MedSeq Project randomised controlled trial (NCT01736566) with no measured patient outcomes. We created a database of molecular changes in RBC and platelet antigens and developed an automated antigen-typing algorithm based on whole-genome sequencing (bloodTyper). This algorithm was iteratively improved to address cis-trans haplotype ambiguities and homologous gene alignments. Whole-genome sequencing data from 110 MedSeq participants (30 × depth) were used to initially validate bloodTyper through comparison with conventional serology and SNP methods for typing of 38 RBC antigens in 12 blood-group systems and 22 human platelet antigens. bloodTyper was further validated with whole-genome sequencing data from 200 INTERVAL trial participants (15 × depth) with serological comparisons.

Findings: We iteratively improved bloodTyper by comparing its typing results with conventional serological and SNP typing in three rounds of testing. The initial whole-genome sequencing typing algorithm was 99·5% concordant across the first 20 MedSeq genomes. Addressing discordances led to development of an improved algorithm that was 99·8% concordant for the remaining 90 MedSeq genomes. Additional modifications led to the final algorithm, which was 99·2% concordant across 200 INTERVAL genomes (or 99·9% after adjustment for the lower depth of coverage).

Interpretation: By enabling more precise antigen-matching of patients with blood donors, antigen typing based on whole-genome sequencing provides a novel approach to improve transfusion outcomes with the potential to transform the practice of transfusion medicine.

Funding: National Human Genome Research Institute, Doris Duke Charitable Foundation, National Health Service Blood and Transplant, National Institute for Health Research, and Wellcome Trust.

Source
http://dx.doi.org/10.1016/S2352-3026(18)30053-X
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6438177
June 2018

Whole-Genome Sequencing Coupled to Imputation Discovers Genetic Signals for Anthropometric Traits.

Am J Hum Genet 2017 Jun 25;100(6):865-884. Epub 2017 May 25.

Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London W2 1PG, UK; Department of Cardiology, Ealing Hospital NHS Trust, Middlesex UB1 3EU, UK.

Deep sequence-based imputation can enhance the discovery power of genome-wide association studies by assessing previously unexplored variation across the common- and low-frequency spectra. We applied a hybrid whole-genome sequencing (WGS) and deep imputation approach to examine the broader allelic architecture of 12 anthropometric traits associated with height, body mass, and fat distribution in up to 267,616 individuals. We report 106 genome-wide significant signals that have not been previously identified, including 9 low-frequency variants pointing to functional candidates. Of the 106 signals, 6 are in genomic regions that have not been implicated with related traits before, 28 are independent signals at previously reported regions, and 72 represent previously reported signals for a different anthropometric trait. 71% of signals reside within genes and fine mapping resolves 23 signals to one or two likely causal variants. We confirm genetic overlap between human monogenic and polygenic anthropometric traits and find signal enrichment in cis expression QTLs in relevant tissues. Our results highlight the potential of WGS strategies to enhance biologically relevant discoveries across the frequency spectrum.

Source
http://dx.doi.org/10.1016/j.ajhg.2017.04.014
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5473732
June 2017

The impact of rare and low-frequency genetic variants in common disease.

Genome Biol 2017 04 27;18(1):77. Epub 2017 Apr 27.

Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, CB10 1HH, UK.

Despite thousands of genetic loci identified to date, a large proportion of genetic variation predisposing to complex disease and traits remains unaccounted for. Advances in sequencing technology enable focused explorations on the contribution of low-frequency and rare variants to human traits. Here we review experimental approaches and current knowledge on the contribution of these genetic variants in complex disease and discuss challenges and opportunities for personalised medicine.

Source
http://dx.doi.org/10.1186/s13059-017-1212-4
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408830
April 2017

An Organismal CNV Mutator Phenotype Restricted to Early Human Development.

Cell 2017 02;168(5):830-842.e7

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Baylor Genetics, Houston, TX 77021, USA.

De novo copy number variants (dnCNVs) arising at multiple loci in a personal genome have usually been considered to reflect cancer somatic genomic instabilities. We describe a multiple dnCNV (MdnCNV) phenomenon in which individuals with genomic disorders carry five to ten constitutional dnCNVs. These CNVs originate from independent formation incidences, are predominantly tandem duplications or complex gains, exhibit breakpoint junction features reminiscent of replicative repair, and show increased de novo point mutations flanking the rearrangement junctions. The active CNV mutation shower appears to be restricted to a transient perizygotic period. We propose that a defect in the CNV formation process is responsible for the "CNV-mutator state," and this state is dampened after early embryogenesis. The constitutional MdnCNV phenomenon resembles chromosomal instability in various cancers. Investigations of this phenomenon may provide unique access to understanding genomic disorders, structural variant mutagenesis, human evolution, and cancer biology.

Source
http://dx.doi.org/10.1016/j.cell.2017.01.037
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5407901
February 2017

Whole-genome view of the consequences of a population bottleneck using 2926 genome sequences from Finland and United Kingdom.

Eur J Hum Genet 2017 04 1;25(4):477-484. Epub 2017 Feb 1.

Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.

Isolated populations with enrichment of variants due to recent population bottlenecks provide a powerful resource for identifying disease-associated genetic variants and genes. As a model of an isolate population, we sequenced the genomes of 1463 Finnish individuals as part of the Sequencing Initiative Suomi (SISu) Project. We compared the genomic profiles of the 1463 Finns to a sample of 1463 British individuals that were sequenced in parallel as part of the UK10K Project. Whereas there were no major differences in the allele frequency of common variants, a significant depletion of variants in the rare frequency spectrum was observed in Finns when comparing the two populations. On the other hand, we observed >2.1 million variants that were twice as frequent among Finns compared with Britons and 800 000 variants that were more than 10 times more frequent in Finns. Furthermore, in Finns we observed a relative proportional enrichment of variants in the minor allele frequency range between 2 and 5% (P<2.2 × 10). When stratified by their functional annotations, loss-of-function variants showed the highest proportional enrichment in Finns (P=0.0291). In the non-coding part of the genome, variants in conserved regions (P=0.002) and promoters (P=0.01) were also significantly enriched in the Finnish samples. These functional categories represent the highest a priori power for downstream association studies of rare variants using population isolates.

Source
http://dx.doi.org/10.1038/ejhg.2016.205
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5346294
April 2017

The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease.

Cell 2016 11;167(5):1415-1429.e19

Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK; Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK.

Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.

Source
http://dx.doi.org/10.1016/j.cell.2016.10.042
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5300907
November 2016

Genetic Drivers of Epigenetic and Transcriptional Variation in Human Immune Cells.

Cell 2016 11;167(5):1398-1414.e24

Human Genetics, McGill University, 740 Dr. Penfield, Montreal, QC H3A 0G1, Canada.

Characterizing the multifaceted contribution of genetic and epigenetic factors to disease phenotypes is a major challenge in human genetics and medicine. We carried out high-resolution genetic, epigenetic, and transcriptomic profiling in three major human immune cell types (CD14+ monocytes, CD16+ neutrophils, and naive CD4+ T cells) from up to 197 individuals. We assess, quantitatively, the relative contribution of cis-genetic and epigenetic factors to transcription and evaluate their impact as potential sources of confounding in epigenome-wide association studies. Further, we characterize highly coordinated genetic effects on gene expression, methylation, and histone variation through quantitative trait locus (QTL) mapping and allele-specific (AS) analyses. Finally, we demonstrate colocalization of molecular trait QTLs at 345 unique immune disease loci. This expansive, high-resolution atlas of multi-omics changes yields insights into cell-type-specific correlation between diverse genomic inputs, more generalizable correlations between these inputs, and defines molecular events that may underpin complex disease risk.

Source
http://dx.doi.org/10.1016/j.cell.2016.10.026
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5119954
November 2016

Discovery and refinement of genetic loci associated with cardiometabolic risk using dense imputation maps.

Nat Genet 2016 11 26;48(11):1303-1312. Epub 2016 Sep 26.

Department of Epidemiology, Erasmus University Medical Center, Rotterdam, Netherlands.

Large-scale whole-genome sequence data sets offer novel opportunities to identify genetic variation underlying human traits. Here we apply genotype imputation based on whole-genome sequence data from the UK10K and 1000 Genomes Project into 35,981 study participants of European ancestry, followed by association analysis with 20 quantitative cardiometabolic and hematological traits. We describe 17 new associations, including 6 rare (minor allele frequency (MAF) < 1%) or low-frequency (1% < MAF < 5%) variants with platelet count (PLT), red blood cell indices (MCH and MCV) and HDL cholesterol. Applying fine-mapping analysis to 233 known and new loci associated with the 20 traits, we resolve the associations of 59 loci to credible sets of 20 or fewer variants and describe trait enrichments within regions of predicted regulatory function. These findings improve understanding of the allelic architecture of risk factors for cardiometabolic and hematological diseases and provide additional functional insights with the identification of potentially novel biological targets.
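
The frequency classes used in this abstract (rare: MAF < 1%; low-frequency: 1% < MAF < 5%) can be written as a small helper. This is a sketch for illustration only; the function names are hypothetical, and assigning a MAF of exactly 1% to the low-frequency class and exactly 5% to the common class is an assumption, since the abstract leaves those boundaries open.

```python
def minor_allele_frequency(alt_allele_count, n_chromosomes):
    """Frequency of the less common allele given an alternate-allele count."""
    if n_chromosomes <= 0 or not 0 <= alt_allele_count <= n_chromosomes:
        raise ValueError("counts must satisfy 0 <= alt <= n and n > 0")
    freq = alt_allele_count / n_chromosomes
    return min(freq, 1.0 - freq)


def frequency_class(maf):
    """Map a minor allele frequency to rare / low-frequency / common.

    Boundary handling (1% -> low-frequency, 5% -> common) is an
    assumption made for this sketch.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"


# Example: 3 alternate alleles observed on 200 chromosomes (100 diploid samples).
example_class = frequency_class(minor_allele_frequency(3, 200))
```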

Source
DOI: http://dx.doi.org/10.1038/ng.3668
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5279872
November 2016

A reference panel of 64,976 haplotypes for genotype imputation.

Nat Genet 2016 Oct 22;48(10):1279-83. Epub 2016 Aug 22.

IRGB, CNR, Sardinia, Italy.

We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.

Source
DOI: http://dx.doi.org/10.1038/ng.3643
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5388176
October 2016

Whole-Exome Sequencing Identifies Loci Associated with Blood Cell Traits and Reveals a Role for Alternative GFI1B Splice Variants in Human Hematopoiesis.

Am J Hum Genet 2016 Aug;99(2):481-8

Cardiovascular Health Research Unit and Department of Medicine, University of Washington, Seattle, WA 98195, USA.

Circulating blood cell counts and indices are important indicators of hematopoietic function and a number of clinical parameters, such as blood oxygen-carrying capacity, inflammation, and hemostasis. By performing whole-exome sequence association analyses of hematologic quantitative traits in 15,459 community-dwelling individuals, followed by in silico replication in up to 52,024 independent samples, we identified two previously undescribed coding variants associated with lower platelet count: a common missense variant in CPS1 (rs1047891, MAF = 0.33, discovery + replication p = 6.38 × 10⁻¹⁰) and a rare synonymous variant in GFI1B (rs150813342, MAF = 0.009, discovery + replication p = 1.79 × 10⁻²⁷). By performing CRISPR/Cas9 genome editing in hematopoietic cell lines and follow-up targeted knockdown experiments in primary human hematopoietic stem and progenitor cells, we demonstrate an alternative splicing mechanism by which the GFI1B rs150813342 variant suppresses formation of a GFI1B isoform that preferentially promotes megakaryocyte differentiation and platelet production. These results demonstrate how unbiased studies of natural variation in blood cell traits can provide insight into the regulation of human hematopoiesis.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2016.06.016
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4974169
August 2016

A genomic approach to therapeutic target validation identifies a glucose-lowering GLP1R variant protective for coronary heart disease.

Sci Transl Med 2016 Jun;8(341):341ra76

Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Strangeways Laboratory, Worts Causeway, Cambridge CB1 8RN, UK.

Regulatory authorities have indicated that new drugs to treat type 2 diabetes (T2D) should not be associated with an unacceptable increase in cardiovascular risk. Human genetics may be able to guide development of antidiabetic therapies by predicting cardiovascular and other health endpoints. We therefore investigated the association of variants in six genes that encode drug targets for obesity or T2D with a range of metabolic traits in up to 11,806 individuals by targeted exome sequencing and follow-up in 39,979 individuals by targeted genotyping, with additional in silico follow-up in consortia. We used these data to first compare associations of variants in genes encoding drug targets with the effects of pharmacological manipulation of those targets in clinical trials. We then tested the association of those variants with disease outcomes, including coronary heart disease, to predict cardiovascular safety of these agents. A low-frequency missense variant (Ala316Thr; rs10305492) in the gene encoding glucagon-like peptide-1 receptor (GLP1R), the target of GLP1R agonists, was associated with lower fasting glucose and T2D risk, consistent with GLP1R agonist therapies. The minor allele was also associated with protection against heart disease, thus providing evidence that GLP1R agonists are not likely to be associated with an unacceptable increase in cardiovascular risk. Our results provide an encouraging signal that these agents may be associated with benefit, a question currently being addressed in randomized controlled trials. Genetic variants associated with metabolic traits and multiple disease outcomes can be used to validate therapeutic targets at an early stage in the drug development process.

Source
DOI: http://dx.doi.org/10.1126/scitranslmed.aad3744
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5219001
June 2016

An integrated map of structural variation in 2,504 human genomes.

Nature 2015 Oct;526(7571):75-81

University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, CA 92093, USA.

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

Source
DOI: http://dx.doi.org/10.1038/nature15394
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4617611
October 2015

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel.

Nat Commun 2015 Sep 14;6:8111. Epub 2015 Sep 14.

The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1HH, UK.

Imputing genotypes from reference panels created by whole-genome sequencing (WGS) provides a cost-effective strategy for augmenting the single-nucleotide polymorphism (SNP) content of genome-wide arrays. The UK10K Cohorts project has generated a data set of 3,781 whole genomes sequenced at low depth (average 7x), aiming to exhaustively characterize genetic variation down to 0.1% minor allele frequency in the British population. Here we demonstrate the value of this resource for improving imputation accuracy at rare and low-frequency variants in both a UK and an Italian population. We show that large increases in imputation accuracy can be achieved by re-phasing WGS reference panels after initial genotype calling. We also present a method for combining WGS panels to improve variant coverage and downstream imputation accuracy, which we illustrate by integrating 7,562 WGS haplotypes from the UK10K project with 2,184 haplotypes from the 1000 Genomes Project. Finally, we introduce a novel approximation that maintains speed without sacrificing imputation accuracy for rare variants.

Source
DOI: http://dx.doi.org/10.1038/ncomms9111
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4579394
September 2015

The UK10K project identifies rare variants in health and disease.

Nature 2015 Oct 14;526(7571):82-90. Epub 2015 Sep 14.

The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.

Source
DOI: http://dx.doi.org/10.1038/nature14962
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4773891
October 2015

Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture.

Nature 2015 Oct 14;526(7571):112-7. Epub 2015 Sep 14.

Institute for Aging Research, Hebrew SeniorLife, Boston, Massachusetts 02131, USA.

The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., P_meta = 2 × 10⁻¹⁴), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10⁻¹¹; n_cases = 98,742 and n_controls = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., P_meta = 1 × 10⁻¹¹). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.

Source
DOI: http://dx.doi.org/10.1038/nature14878
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4755714
October 2015

An interactive genome browser of association results from the UK10K cohorts project.

Bioinformatics 2015 Dec 26;31(24):4029-31. Epub 2015 Aug 26.

Wellcome Trust Sanger Institute, Genome Campus, Hinxton CB10 1HH, UK, Department of Haematology, University of Cambridge, Cambridge CB2 1TN, UK.

High-throughput sequencing technologies survey genetic variation at genome scale and are increasingly used to study the contribution of rare and low-frequency genetic variants to human traits. As part of the Cohorts arm of the UK10K project, genetic variants called from low-read depth (average 7×) whole genome sequencing of 3621 cohort individuals were analysed for statistical associations with 64 different phenotypic traits of biomedical importance. Here, we describe a novel genome browser based on the Biodalliance platform developed to provide interactive access to the association results of the project.

Availability and implementation: The browser is available at http://www.uk10k.org/dalliance.html. Source code for the Biodalliance platform is available under a BSD license from http://github.com/dasmoth/dalliance, and for the LD-display plugin and backend from http://github.com/dasmoth/ldserv.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btv491
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4673976
December 2015
-->