Publications by authors named "Laurent Francioli"

29 Publications

The effect of LRRK2 loss-of-function variants in humans.

Nat Med 2020 06 27;26(6):869-877. Epub 2020 May 27.

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Human genetic variants predicted to cause loss-of-function of protein-coding genes (pLoF variants) provide natural in vivo models of human gene inactivation and can be valuable indicators of gene function and the potential toxicity of therapeutic inhibitors targeting these genes. Gain-of-kinase-function variants in LRRK2 are known to significantly increase the risk of Parkinson's disease, suggesting that inhibition of LRRK2 kinase activity is a promising therapeutic strategy. While preclinical studies in model organisms have raised some on-target toxicity concerns, the biological consequences of LRRK2 inhibition have not been well characterized in humans. Here, we systematically analyze pLoF variants in LRRK2 observed across 141,456 individuals sequenced in the Genome Aggregation Database (gnomAD), 49,960 exome-sequenced individuals from the UK Biobank and over 4 million participants in the 23andMe genotyped dataset. After stringent variant curation, we identify 1,455 individuals with high-confidence pLoF variants in LRRK2. Experimental validation of three variants, combined with previous work, confirmed reduced protein levels in 82.5% of our cohort. We show that heterozygous pLoF variants in LRRK2 reduce LRRK2 protein levels but that these are not strongly associated with any specific phenotype or disease state. Our results demonstrate the value of large-scale genomic databases and phenotyping of human loss-of-function carriers for target validation in drug discovery.

Source
DOI: http://dx.doi.org/10.1038/s41591-020-0893-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7303015
June 2020

The mutational constraint spectrum quantified from variation in 141,456 humans.

Nature 2020 05 27;581(7809):434-443. Epub 2020 May 27.

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2308-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334197
May 2020

A structural variation reference for medical and population genetics.

Nature 2020 05 27;581(7809):444-451. Epub 2020 May 27.

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Structural variants (SVs) rearrange large segments of DNA and can have profound consequences in evolution and human disease. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral in the interpretation of single-nucleotide variants (SNVs). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings. This SV resource is freely distributed via the gnomAD browser and will have broad utility in population genetics, disease-association studies, and diagnostic screening.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2287-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7334194
May 2020

Characterising the loss-of-function impact of 5' untranslated region variants in 15,708 individuals.

Nat Commun 2020 05 27;11(1):2523. Epub 2020 May 27.

National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK.

Upstream open reading frames (uORFs) are tissue-specific cis-regulators of protein translation. Isolated reports have shown that variants that create or disrupt uORFs can cause disease. Here, in a systematic genome-wide study using 15,708 whole genome sequences, we show that variants that create new upstream start codons, and variants disrupting stop sites of existing uORFs, are under strong negative selection. This selection signal is significantly stronger for variants arising upstream of genes intolerant to loss-of-function variants. Furthermore, variants creating uORFs that overlap the coding sequence show signals of selection equivalent to coding missense variants. Finally, we identify specific genes where modification of uORFs likely represents an important disease mechanism, and report a novel uORF frameshift variant upstream of NF2 in neurofibromatosis. Our results highlight uORF-perturbing variants as an under-recognised functional class that contribute to penetrant human disease, and demonstrate the power of large-scale population sequencing data in studying non-coding variant classes.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-10717-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7253449
May 2020

Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes.

Nat Commun 2020 05 27;11(1):2539. Epub 2020 May 27.

Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.

Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.

Source
DOI: http://dx.doi.org/10.1038/s41467-019-12438-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7253413
May 2020

Recurrent TTN metatranscript-only c.39974-11T>G splice variant associated with autosomal recessive arthrogryposis multiplex congenita and myopathy.

Hum Mutat 2020 02 3;41(2):403-411. Epub 2019 Dec 3.

Paediatric Neurology, Bristol Royal Hospital For Children, University Hospitals Bristol NHS Foundation Trust, Bristol, United Kingdom.

We present eight families with arthrogryposis multiplex congenita and myopathy bearing a TTN intron 213 extended splice-site variant (NM_001267550.1:c.39974-11T>G), inherited in trans with a second pathogenic TTN variant. Muscle-derived RNA studies of three individuals confirmed mis-splicing induced by the c.39974-11T>G variant: in-frame exon 214 skipping or use of a cryptic 3' splice-site effecting a frameshift. Confounding interpretation of pathogenicity is the absence of exons 213-217 within the described skeletal muscle TTN N2A isoform. However, RNA-sequencing from 365 adult human gastrocnemius samples revealed that 56% of specimens predominantly include exons 213-217 in TTN transcripts (inclusion rate ≥66%). Further, RNA-sequencing of five fetal muscle samples confirmed that 4/5 specimens predominantly include exons 213-217 (fifth sample inclusion rate 57%). Contractures improved significantly with age for four individuals, which may be linked to decreased expression of pathogenic fetal transcripts. Our study extends emerging evidence supporting a vital developmental role for TTN isoforms containing metatranscript-only exons.

Source
DOI: http://dx.doi.org/10.1002/humu.23938
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7306402
February 2020

Variant Score Ranker-a web application for intuitive missense variant prioritization.

Bioinformatics 2019 11;35(21):4478-4479

Cologne Center for Genomics, University of Cologne, University Hospital Cologne, Cologne, Germany.

Motivation: The correct classification of missense variants as benign or pathogenic remains challenging. Pathogenic variants are expected to have higher deleterious prediction scores than benign variants in the same gene. However, most of the existing variant annotation tools do not reference the score range of benign population variants on gene level.

Results: We present a web-application, Variant Score Ranker, which enables users to rapidly annotate variants and perform gene-specific variant score ranking on the population level. We also provide an intuitive example of how gene- and population-calibrated variant ranking scores can improve epilepsy variant prioritization.

Availability And Implementation: http://vsranker.broadinstitute.org.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz252
November 2019

Disparities in discovery of pathogenic variants for autosomal recessive non-syndromic hearing impairment by ancestry.

Eur J Hum Genet 2019 09 3;27(9):1456-1465. Epub 2019 May 3.

Center for Statistical Genetics, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.

Hearing impairment (HI) is characterized by extensive genetic heterogeneity. To determine the population-specific contribution of known autosomal recessive nonsyndromic (ARNS)HI genes and variants to HI etiology, pathogenic and likely pathogenic (PLP) ARNSHI variants were selected from ClinVar and the Deafness Variation Database and their frequencies were obtained from gnomAD for seven populations. ARNSHI prevalence due to PLP variants varies greatly by population, ranging from 96.9 affected per 100,000 individuals for Ashkenazi Jews to 5.2 affected per 100,000 individuals for Africans/African Americans. For Europeans, Finns have the lowest prevalence due to ARNSHI PLP variants with 9.5 affected per 100,000 individuals. For East Asians, Latinos, non-Finnish Europeans, and South Asians, ARNSHI prevalence due to PLP variants ranges from 17.1 to 33.7 affected per 100,000 individuals. ARNSHI variants that were previously reported in a single ancestry or family were observed in additional populations, e.g., USH1C p.(Q723*) reported in a Chinese family was the most prevalent pathogenic variant observed in gnomAD for Africans/African Americans. Variability between populations is due to how extensively ARNSHI has been studied, ARNSHI prevalence, and ancestry-specific ARNSHI variant architecture, which is impacted by population history. Our study demonstrates that additional gene and variant discovery studies are necessary for all populations and particularly for individuals of African ancestry.

Source
DOI: http://dx.doi.org/10.1038/s41431-019-0417-2
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6777454
September 2019

novoCaller: a Bayesian network approach for de novo variant calling from pedigree and population sequence data.

Bioinformatics 2019 04;35(7):1174-1180

Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.

Motivation: De novo mutations (i.e. newly occurring mutations) are a predominant cause of sporadic dominant monogenic diseases and play a significant role in the genetics of complex disorders. De novo mutation studies also inform population genetics models and shed light on the biology of DNA replication and repair. Despite the broad interest, there is room for improvement with regard to the accuracy of de novo mutation calling.

Results: We designed novoCaller, a Bayesian variant calling algorithm that uses information from read-level data both in the pedigree and in unrelated samples. The method was extensively tested using large trio-sequencing studies, and it consistently achieved over 97% sensitivity. We applied the algorithm to 48 trio cases of suspected rare Mendelian disorders as part of the Brigham Genomic Medicine gene discovery initiative. Its application resulted in a significant reduction in the resources required for manual inspection and experimental validation of the calls. Three de novo variants were found in known genes associated with rare disorders, leading to rapid genetic diagnosis of the probands. Another 14 variants were found in genes that are likely to explain the phenotype, and could lead to novel disease-gene discovery.

Availability And Implementation: Source code implemented in C++ and Python can be downloaded from https://github.com/bgm-cwg/novoCaller.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty749
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6449753
April 2019

Human genetic variation alters CRISPR-Cas9 on- and off-targeting specificity at therapeutically implicated loci.

Proc Natl Acad Sci U S A 2017 12 11;114(52):E11257-E11266. Epub 2017 Dec 11.

Division of Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115.

The CRISPR-Cas9 nuclease system holds enormous potential for therapeutic genome editing of a wide spectrum of diseases. Large efforts have been made to further understanding of on- and off-target activity to assist the design of CRISPR-based therapies with optimized efficacy and safety. However, current efforts have largely focused on the reference genome or the genome of cell lines to evaluate guide RNA (gRNA) efficiency, safety, and toxicity. Here, we examine the effect of human genetic variation on both on- and off-target specificity. Specifically, we utilize 7,444 whole-genome sequences to examine the effect of variants on the targeting specificity of ∼3,000 gRNAs across 30 therapeutically implicated loci. We demonstrate that human genetic variation can alter the off-target landscape genome-wide including creating and destroying protospacer adjacent motifs (PAMs). Furthermore, single-nucleotide polymorphisms (SNPs) and insertions/deletions (indels) can result in altered on-target sites and novel potent off-target sites, which can predispose patients to treatment failure and adverse effects, respectively; however, these events are rare. Taken together, these data highlight the importance of considering individual genomes for therapeutic genome-editing applications for the design and evaluation of CRISPR-based therapies to minimize risk of treatment failure and/or adverse outcomes.

Source
DOI: http://dx.doi.org/10.1073/pnas.1714640114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5748207
December 2017

The role of de novo mutations in the development of amyotrophic lateral sclerosis.

Hum Mutat 2017 11 3;38(11):1534-1541. Epub 2017 Aug 3.

Department of Neurology and Laboratory of Neuroscience, IRCCS Istituto Auxologico Italiano, Milan, Italy.

The genetic basis combined with the sporadic occurrence of amyotrophic lateral sclerosis (ALS) suggests a role of de novo mutations in disease pathogenesis. Previous studies provided some evidence for this hypothesis; however, results were conflicting: no genes with recurrently occurring de novo mutations were identified and different pathways were postulated. In this study, we analyzed whole-exome data from 82 new patient-parents trios and combined it with the datasets of all previously published ALS trios (173 trios in total). The per patient de novo rate was not higher than expected based on the general population (P = 0.40). We showed that these mutations are not part of the previously postulated pathways, and gene-gene interaction analysis found no enrichment of interacting genes in this group (P = 0.57). Also, we were able to show that the de novo mutations in ALS patients are located in genes already prone to de novo mutations (P < 1 × 10 ). Although the individual effect of rare de novo mutations in specific genes could not be assessed, our results indicate that, in contrast to previous hypotheses, de novo mutations in general do not impose a major burden on ALS risk.

Source
DOI: http://dx.doi.org/10.1002/humu.23295
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6599399
November 2017

Negative selection in humans and fruit flies involves synergistic epistasis.

Science 2017 05;356(6337):539-542

Negative selection against deleterious alleles produced by mutation influences within-population variation as the most pervasive form of natural selection. However, it is not known whether deleterious alleles affect fitness independently, so that cumulative fitness loss depends exponentially on the number of deleterious alleles, or synergistically, so that each additional deleterious allele results in a larger decrease in relative fitness. Negative selection with synergistic epistasis should produce negative linkage disequilibrium between deleterious alleles and, therefore, an underdispersed distribution of the number of deleterious alleles in the genome. Indeed, we detected underdispersion of the number of rare loss-of-function alleles in eight independent data sets from human and fly populations. Thus, selection against rare protein-disrupting alleles is characterized by synergistic epistasis, which may explain how human and fly populations persist despite high genomic mutation rates.

Source
DOI: http://dx.doi.org/10.1126/science.aah5238
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6200135
May 2017

A framework for the detection of de novo mutations in family-based sequencing data.

Eur J Hum Genet 2017 02 23;25(2):227-233. Epub 2016 Nov 23.

Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands.

Germline mutation detection from human DNA sequence data is challenging due to the rarity of such events relative to the intrinsic error rates of sequencing technologies and the uneven coverage across the genome. We developed PhaseByTransmission (PBT) to identify de novo single nucleotide variants and short insertions and deletions (indels) from sequence data collected in parent-offspring trios. We compute the joint probability of the data given the genotype likelihoods in the individual family members, the known familial relationships and a prior probability for the mutation rate. Candidate de novo mutations (DNMs) are reported along with their posterior probability, providing a systematic way to prioritize them for validation. Our tool is integrated in the Genome Analysis Toolkit and can be used together with the ReadBackedPhasing module to infer the parental origin of DNMs based on phase-informative reads. Using simulated data, we show that PBT outperforms existing tools, especially in low coverage data and on the X chromosome. We further show that PBT displays high validation rates on empirical parent-offspring sequencing data for whole-exome data from 104 trios and X-chromosome data from 249 parent-offspring families. Finally, we demonstrate an association between father's age at conception and the number of DNMs in female offspring's X chromosome, consistent with previous literature reports.

Source
DOI: http://dx.doi.org/10.1038/ejhg.2016.147
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5255947
February 2017

A high-quality human reference panel reveals the complexity and distribution of genomic structural variants.

Nat Commun 2016 10 6;7:12989. Epub 2016 Oct 6.

European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, Groningen 9713AD, The Netherlands.

Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals.

Source
DOI: http://dx.doi.org/10.1038/ncomms12989
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5059695
October 2016

Leveraging Distant Relatedness to Quantify Human Mutation and Gene-Conversion Rates.

Am J Hum Genet 2015 Dec 12;97(6):775-89. Epub 2015 Nov 12.

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.

The rate at which human genomes mutate is a central biological parameter that has many implications for our ability to understand demographic and evolutionary phenomena. We present a method for inferring mutation and gene-conversion rates by using the number of sequence differences observed in identical-by-descent (IBD) segments together with a reconstructed model of recent population-size history. This approach is robust to, and can quantify, the presence of substantial genotyping error, as validated in coalescent simulations. We applied the method to 498 trio-phased sequenced Dutch individuals and inferred a point mutation rate of 1.66 × 10^-8 per base per generation and a rate of 1.26 × 10^-9 for <20 bp indels. By quantifying how estimates varied as a function of allele frequency, we inferred the probability that a site is involved in non-crossover gene conversion as 5.99 × 10^-6. We found that recombination does not have observable mutagenic effects after gene conversion is accounted for and that local gene-conversion rates reflect recombination rates. We detected a strong enrichment of recent deleterious variation among mismatching variants found within IBD regions and observed summary statistics of local sharing of IBD segments to closely match previously proposed metrics of background selection; however, we found no significant effects of selection on our mutation-rate estimates. We detected no evidence of strong variation of mutation rates in a number of genomic annotations obtained from several recent studies. Our analysis suggests that a mutation-rate estimate higher than that reported by recent pedigree-based studies should be adopted in the context of DNA-based demographic reconstruction.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2015.10.006
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4678427
December 2015

Genome-wide patterns and properties of de novo mutations in humans.

Nat Genet 2015 Jul 18;47(7):822-826. Epub 2015 May 18.

Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.

Mutations create variation in the population, fuel evolution and cause genetic diseases. Current knowledge about de novo mutations is incomplete and mostly indirect. Here we analyze 11,020 de novo mutations from the whole genomes of 250 families. We show that de novo mutations in the offspring of older fathers are not only more numerous but also occur more frequently in early-replicating, genic regions. Functional regions exhibit higher mutation rates due to CpG dinucleotides and show signatures of transcription-coupled repair, whereas mutation clusters with a unique signature point to a new mutational mechanism. Mutation and recombination rates independently associate with nucleotide diversity, and regional variation in human-chimpanzee divergence is only partly explained by heterogeneity in mutation rate. Finally, we provide a genome-wide mutation rate map for medical and population genetics applications. Our results provide new insights and refine long-standing hypotheses about human mutagenesis.

Source
DOI: http://dx.doi.org/10.1038/ng.3292
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4485564
July 2015

Characteristics of de novo structural changes in the human genome.

Genome Res 2015 Jun 16;25(6):792-801. Epub 2015 Apr 16.

Department of Genome Sciences, University of Washington, Seattle, Washington 98105, USA.

Small insertions and deletions (indels) and large structural variations (SVs) are major contributors to human genetic diversity and disease. However, mutation rates and characteristics of de novo indels and SVs in the general population have remained largely unexplored. We report 332 validated de novo structural changes identified in whole genomes of 250 families, including complex indels, retrotransposon insertions, and interchromosomal events. These data indicate a mutation rate of 2.94 indels (1-20 bp) and 0.16 SVs (>20 bp) per generation. De novo structural changes affect on average 4.1 kbp of genomic sequence and 29 coding bases per generation, which is 91 and 52 times more nucleotides than de novo substitutions, respectively. This contrasts with the equal genomic footprint of inherited SVs and substitutions. An excess of structural changes originated on paternal haplotypes. Additionally, we observed a nonuniform distribution of de novo SVs across offspring. These results reveal the importance of different mutational mechanisms to changes in human genome structure across generations.

Source
DOI: http://dx.doi.org/10.1101/gr.185041.114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4448676
June 2015

Genome of The Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels.

Nat Commun 2015 Mar 9;6:6065. Epub 2015 Mar 9.

Department of Clinical Genetics, Erasmus Medical Center, Rotterdam 3000 CA, The Netherlands.

Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (~35,000 samples) with the population-specific reference panel created by the Genome of The Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value <6.61 × 10^-4), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted to be deleterious. The frequency of this ABCA6 variant is 3.65-fold increased in the Dutch and its effect (β_LDL-C = 0.135, β_TC = 0.140) is estimated to be very similar to those observed for single variants in well-known lipid genes, such as LDLR.

Source
DOI: http://dx.doi.org/10.1038/ncomms7065
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4366498
March 2015

A genome-wide association study identifies a functional ERAP2 haplotype associated with birdshot chorioretinopathy.

Hum Mol Genet 2014 Nov 22;23(22):6081-7. Epub 2014 Jun 22.

Department of Medical Genetics,

Birdshot chorioretinopathy (BSCR) is a rare form of autoimmune uveitis that can lead to severe visual impairment. Intriguingly, >95% of cases carry the HLA-A29 allele, which defines the strongest documented HLA association for a human disease. We have conducted a genome-wide association study in 96 Dutch and 27 Spanish cases, and 398 unrelated Dutch and 380 Spanish controls. Fine-mapping the primary MHC association through high-resolution imputation at classical HLA loci identified HLA-A*29:02 as the principal MHC association (odds ratio (OR) = 157.5, 95% CI 91.6-272.6, P = 6.6 × 10^-74). We also identified two novel susceptibility loci at 5q15 near ERAP2 (rs7705093; OR = 2.3, 95% CI 1.7-3.1, for the T allele, P = 8.6 × 10^-8) and at 14q32.31 in the TECPR2 gene (rs150571175; OR = 6.1, 95% CI 3.2-11.7, for the A allele, P = 3.2 × 10^-8). The association near ERAP2 was confirmed in an independent British case-control sample (combined meta-analysis P = 1.7 × 10^-9). Functional analyses revealed that the risk allele of the polymorphism near ERAP2 is strongly associated with high mRNA and protein expression of ERAP2 in B cells. This study further defined an extremely strong MHC risk component in BSCR, and detected evidence for a novel disease mechanism that affects peptide processing in the endoplasmic reticulum.

Source
DOI: http://dx.doi.org/10.1093/hmg/ddu307
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4204766
November 2014

Improved imputation quality of low-frequency and rare variants in European samples using the 'Genome of The Netherlands'.

Eur J Hum Genet 2014 Nov 4;22(11):1321-6. Epub 2014 Jun 4.

[1] University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands [2] University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands.

Although genome-wide association studies (GWAS) have identified many common variants associated with complex traits, low-frequency and rare variants have not been interrogated in a comprehensive manner. Imputation from dense reference panels, such as the 1000 Genomes Project (1000G), enables testing of ungenotyped variants for association. Here we present the results of imputation using a large, new population-specific panel: the Genome of The Netherlands (GoNL). We benchmarked the performance of the 1000G and GoNL reference sets by comparing imputation genotypes with 'true' genotypes typed on ImmunoChip in three European populations (Dutch, British, and Italian). GoNL showed significant improvement in the imputation quality for rare variants (MAF 0.05-0.5%) compared with 1000G. In Dutch samples, the mean observed Pearson correlation, r^2, increased from 0.61 to 0.71. We also saw improved imputation accuracy for other European populations (in the British samples, r^2 improved from 0.58 to 0.65, and in the Italians from 0.43 to 0.47). A combined reference set comprising 1000G and GoNL improved the imputation of rare variants even further. The Italian samples benefitted the most from this combined reference (the mean r^2 increased from 0.47 to 0.50). We conclude that the creation of a large population-specific reference is advantageous for imputing rare variants and that a combined reference panel across multiple populations yields the best imputation results.
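
The benchmark described above compares imputed genotypes with directly typed ones and summarizes agreement as a mean squared Pearson correlation. The sketch below shows one way such a per-variant r^2 could be computed. It is an illustration only, not the authors' pipeline: the function names `imputation_r2` and `mean_r2`, the dosage coding (alternate-allele counts from 0 to 2), and the sample values are all assumptions made for this example.

```python
import math


def imputation_r2(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and typed genotypes.

    Both inputs hold one value per sample for a single variant, coded as
    alternate-allele counts (0-2). Returns NaN when either input is constant,
    because the correlation is undefined in that case.
    """
    n = len(imputed_dosages)
    if n != len(true_genotypes):
        raise ValueError("inputs must have the same length")
    if n < 2:
        raise ValueError("need at least two samples")

    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_genotypes) / n
    cov = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(imputed_dosages, true_genotypes)
    )
    var_x = sum((x - mean_x) ** 2 for x in imputed_dosages)
    var_y = sum((y - mean_y) ** 2 for y in true_genotypes)
    if var_x == 0 or var_y == 0:
        return math.nan
    return (cov * cov) / (var_x * var_y)


def mean_r2(per_variant_r2):
    """Average r^2 over variants, skipping undefined (NaN) entries."""
    defined = [v for v in per_variant_r2 if not math.isnan(v)]
    return sum(defined) / len(defined) if defined else math.nan


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    dosages = [0.1, 0.9, 0.0, 1.8, 0.2]
    genotypes = [0, 1, 0, 2, 0]
    print(round(imputation_r2(dosages, genotypes), 3))
```

Averaging the per-variant values within a frequency bin (for example MAF 0.05-0.5%) would give a summary comparable in spirit to the mean r^2 figures quoted in the abstract, although the study's exact binning and filtering are not specified here.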

Source
DOI: http://dx.doi.org/10.1038/ejhg.2014.19
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4200431
November 2014

An international effort towards developing standards for best practices in analysis, interpretation and reporting of clinical genome sequencing results in the CLARITY Challenge.

Genome Biol 2014 Mar 25;15(3):R53. Epub 2014 Mar 25.

Background: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance.

Results: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity of the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization.

Conclusions: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.

Source
DOI: http://dx.doi.org/10.1186/gb-2014-15-3-r53
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4073084
March 2014

The Genome of the Netherlands: design, and project goals.

Eur J Hum Genet 2014 Feb 29;22(2):221-7. Epub 2013 May 29.

University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands.

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.

Source
DOI: http://dx.doi.org/10.1038/ejhg.2013.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3895638
February 2014

Deleterious alleles in the human genome are on average younger than neutral alleles of the same frequency.

PLoS Genet 2013 Feb 28;9(2):e1003301. Epub 2013 Feb 28.

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.

Large-scale population sequencing studies provide a complete picture of human genetic variation within the studied populations. A key challenge is to identify, among the myriad alleles, those variants that have an effect on molecular function, phenotypes, and reproductive fitness. Most non-neutral variation consists of deleterious alleles segregating at low population frequency due to incessant mutation. To date, studies characterizing selection against deleterious alleles have been based on allele frequency (testing for a relative excess of rare alleles) or ratio of polymorphism to divergence (testing for a relative increase in the number of polymorphic alleles). Here, starting from Maruyama's theoretical prediction (Maruyama T (1974), Am J Hum Genet USA 6:669-673) that a (slightly) deleterious allele is, on average, younger than a neutral allele segregating at the same frequency, we devised an approach to characterize selection based on allelic age. Unlike existing methods, it compares sets of neutral and deleterious sequence variants at the same allele frequency. When applied to human sequence data from the Genome of the Netherlands Project, our approach distinguishes low-frequency coding non-synonymous variants from synonymous and non-coding variants at the same allele frequency and discriminates between sets of variants independently predicted to be benign or damaging for protein structure and function. The results confirm the abundance of slightly deleterious coding variation in humans.

Source
DOI: http://dx.doi.org/10.1371/journal.pgen.1003301
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3585140
June 2013

A numeric model to simulate solar individual ultraviolet exposure.

Photochem Photobiol 2011 May-Jun;87(3):721-8. Epub 2011 Feb 10.

Institute of Work and Health (IST), Lausanne, Switzerland.

Exposure to solar ultraviolet (UV) light is the main causative factor for skin cancer. UV exposure depends on environmental and individual factors. Individual exposure data remain scarce and development of alternative assessment methods is greatly needed. We developed a model simulating human exposure to solar UV. The model predicts the dose and distribution of UV exposure received on the basis of ground irradiation and morphological data. Standard 3D computer graphics techniques were adapted to develop a rendering engine that estimates the solar exposure of a virtual manikin depicted as a triangle mesh surface. The amount of solar energy received by each triangle was calculated, taking into account reflected, direct and diffuse radiation, and shading from other body parts. Dosimetric measurements (n = 54) were conducted in field conditions using a foam manikin as surrogate for an exposed individual. Dosimetric results were compared to the model predictions. The model predicted exposure to solar UV adequately. The symmetric mean absolute percentage error was 13%. Half of the predictions were within 17% range of the measurements. This model provides a tool to assess outdoor occupational and recreational UV exposures, without necessitating time-consuming individual dosimetry, with numerous potential uses in skin cancer prevention and research.
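
The abstract reports agreement between dosimetric measurements and model predictions as a symmetric mean absolute percentage error (sMAPE). Several sMAPE definitions are in use and the abstract does not say which one was applied, so the sketch below assumes one common variant, in which each absolute difference is divided by the mean of the two absolute values. The function name `smape` and the sample numbers are hypothetical.

```python
def smape(measured, predicted):
    """Symmetric mean absolute percentage error, in percent.

    Assumed variant: mean(|p - m| / ((|m| + |p|) / 2)) * 100.
    Pairs where both values are zero contribute zero error.
    """
    if len(measured) != len(predicted):
        raise ValueError("inputs must have the same length")
    if not measured:
        raise ValueError("need at least one measurement")

    total = 0.0
    for m, p in zip(measured, predicted):
        denom = (abs(m) + abs(p)) / 2
        if denom > 0:
            total += abs(p - m) / denom
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Made-up dose values for illustration only; not data from the study.
    measured_doses = [120.0, 80.0, 45.0]
    predicted_doses = [110.0, 90.0, 50.0]
    print(round(smape(measured_doses, predicted_doses), 1))
```

Because the denominator uses both values, the measure is unchanged when measurements and predictions are swapped, which is what distinguishes it from the plain mean absolute percentage error.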

Source
DOI: http://dx.doi.org/10.1111/j.1751-1097.2011.00895.x
September 2011