Publications by authors named "Shane McCarthy"

82 Publications

The genome sequence of the brown trout, Salmo trutta Linnaeus 1758.

Wellcome Open Res 2021 13;6:108. Epub 2021 May 13.

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

We present a genome assembly from an individual female Salmo trutta (the brown trout; Chordata; Actinopteri; Salmoniformes; Salmonidae). The genome sequence is 2.37 gigabases in span. The majority of the assembly is scaffolded into 40 chromosomal pseudomolecules. Gene annotation of this assembly on Ensembl has identified 43,935 protein-coding genes.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.16838.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8488904
May 2021

Genome-wide association analysis of serum alanine and aspartate aminotransferase, and the modifying effects of BMI in 388k European individuals.

Genet Epidemiol 2021 Sep 29;45(6):664-681. Epub 2021 Jun 29.

Regeneron Genetics Center, Regeneron Pharmaceuticals, Tarrytown, New York, USA.

Serum alanine aminotransferase (ALT) and aspartate aminotransferase (AST) are biomarkers for liver health. Here we report the largest genome-wide association analysis to date of serum ALT and AST levels in over 388k people of European ancestry from UK biobank and DiscovEHR. Eleven million imputed markers with a minor allele frequency (MAF) ≥ 0.5% were analyzed. Overall, 300 ALT and 336 AST independent genome-wide significant associations were identified. Among them, 81 ALT and 61 AST associations are reported for the first time. Genome-wide interaction study identified 9 ALT and 12 AST independent associations significantly modified by body mass index (BMI), including several previously reported potential liver disease therapeutic targets, for example, PNPLA3, HSD17B13, and MARC1. While further work is necessary to understand the effect of ALT and AST-associated variants on liver disease, the weighted burden of significant BMI-modified signals is significantly associated with liver disease outcomes. In summary, this study identifies genetic associations which offer an important step forward in understanding the genetic architecture of serum ALT and AST levels. Significant interactions between BMI and genetic loci not only highlight the important role of adiposity in liver damage but also shed light on the genetic etiology of liver disease in obese individuals.

Source
DOI: http://dx.doi.org/10.1002/gepi.22392
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8457092
September 2021

The monoclonal antibody combination REGEN-COV protects against SARS-CoV-2 mutational escape in preclinical and human studies.

Cell 2021 07 5;184(15):3949-3961.e11. Epub 2021 Jun 5.

Regeneron Pharmaceuticals, Inc., Tarrytown, NY 10591, USA.

Monoclonal antibodies against SARS-CoV-2 are a clinically validated therapeutic option against COVID-19. Because rapidly emerging virus mutants are becoming the next major concern in the fight against the global pandemic, it is imperative that these therapeutic treatments provide coverage against circulating variants and do not contribute to development of treatment-induced emergent resistance. To this end, we investigated the sequence diversity of the spike protein and monitored emergence of virus variants in SARS-COV-2 isolates found in COVID-19 patients treated with the two-antibody combination REGEN-COV, as well as in preclinical in vitro studies using single, dual, or triple antibody combinations, and in hamster in vivo studies using REGEN-COV or single monoclonal antibody treatments. Our study demonstrates that the combination of non-competing antibodies in REGEN-COV provides protection against all current SARS-CoV-2 variants of concern/interest and also protects against emergence of new variants and their potential seeding into the population in a clinical setting.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2021.06.002
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8179113
July 2021

Pan-ancestry exome-wide association analyses of COVID-19 outcomes in 586,157 individuals.

Am J Hum Genet 2021 07 3;108(7):1350-1355. Epub 2021 Jun 3.

Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge CB2 0AA, UK.

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) causes coronavirus disease 2019 (COVID-19), a respiratory illness that can result in hospitalization or death. We used exome sequence data to investigate associations between rare genetic variants and seven COVID-19 outcomes in 586,157 individuals, including 20,952 with COVID-19. After accounting for multiple testing, we did not identify any clear associations with rare variants either exome wide or when specifically focusing on (1) 13 interferon pathway genes in which rare deleterious variants have been reported in individuals with severe COVID-19, (2) 281 genes located in susceptibility loci identified by the COVID-19 Host Genetics Initiative, or (3) 32 additional genes of immunologic relevance and/or therapeutic potential. Our analyses indicate there are no significant associations with rare protein-coding variants with detectable effect sizes at our current sample sizes. Analyses will be updated as additional data become available, and results are publicly available through the Regeneron Genetics Center COVID-19 Results Browser.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2021.05.017
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8173480
July 2021

Towards complete and error-free genome assemblies of all vertebrate species.

Nature 2021 Apr 28;592(7856):737-746. Epub 2021 Apr 28.

UQ Genomics, University of Queensland, Brisbane, Queensland, Australia.

High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species. To address this issue, the international Genome 10K (G10K) consortium has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03451-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081667
April 2021

Complete vertebrate mitogenomes reveal widespread repeats and gene duplications.

Genome Biol 2021 04 29;22(1):120. Epub 2021 Apr 29.

Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK.

Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly.

Results: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100-300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization.

Conclusions: Our results indicate that even in the "simple" case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.

Source
DOI: http://dx.doi.org/10.1186/s13059-021-02336-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8082918
April 2021

A high-quality, chromosome-level genome assembly of the Black Soldier Fly (Hermetia illucens L.).

G3 (Bethesda) 2021 05;11(5)

Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK.

Hermetia illucens L. (Diptera: Stratiomyidae), the Black Soldier Fly (BSF) is an increasingly important species for bioconversion of organic material into animal feed. We generated a high-quality chromosome-scale genome assembly of the BSF using Pacific Bioscience, 10X Genomics linked read and high-throughput chromosome conformation capture sequencing technology. Scaffolding the final assembly with Hi-C data produced a highly contiguous 1.01 Gb genome with 99.75% of scaffolds assembled into pseudochromosomes representing seven chromosomes with 16.01 Mb contig and 180.46 Mb scaffold N50 values. The highly complete genome obtained a Benchmarking Universal Single-Copy Orthologs (BUSCO) completeness of 98.6%. We masked 67.32% of the genome as repetitive sequences and annotated a total of 16,478 protein-coding genes using the BRAKER2 pipeline. We analyzed an established lab population to investigate the genomic variation and architecture of the BSF revealing six autosomes and an X chromosome. Additionally, we estimated the inbreeding coefficient (1.9%) of the lab population by assessing runs of homozygosity. This provided evidence for inbreeding events including long runs of homozygosity on chromosome 5. The release of this novel chromosome-scale BSF genome assembly will provide an improved resource for further genomic studies, functional characterization of genes of interest and genetic modification of this economically important species.
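
The contig and scaffold N50 values quoted in this abstract are summary statistics over a list of sequence lengths. The following is a minimal sketch of that calculation, not code from the paper: the function name `n50` and the toy lengths are illustrative assumptions, and it uses the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total span).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence taken is the N50.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("n50() requires at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, lengths in Mb (not data from the paper).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 80 + 70 = 150 is half of 300, so this prints 70
```

Because N50 depends only on the length distribution, a contig N50 and a scaffold N50 for the same assembly are obtained by calling the same function on the contig lengths and the scaffold lengths respectively.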

Source
DOI: http://dx.doi.org/10.1093/g3journal/jkab085
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8104945
May 2021

Twelve years of SAMtools and BCFtools.

Gigascience 2021 Feb;10(2)

Department of Data Sciences, Dana-Farber Cancer Institute, 450 Brookline Avenue, Boston, MA 02215, USA.

Background: SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods.

Findings: The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines.

Conclusion: Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed >1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giab008
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931819
February 2021

The genome sequence of the eastern grey squirrel, Sciurus carolinensis Gmelin, 1788.

Wellcome Open Res 2020 13;5:27. Epub 2020 Feb 13.

Tree of Life, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.

We present a genome assembly from an individual male Sciurus carolinensis (the eastern grey squirrel; Vertebrata; Mammalia; Eutheria; Rodentia; Sciuridae). The genome sequence is 2.82 gigabases in span. The majority of the assembly (92.3%) is scaffolded into 21 chromosomal-level scaffolds, with both X and Y sex chromosomes assembled.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.15721.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7653645
February 2020

The genome sequence of the channel bull blenny, Cottoperca gobio (Günther, 1861).

Wellcome Open Res 2020 24;5:148. Epub 2020 Jun 24.

Wellcome Sanger Institute, Cambridge, CB10 1SA, UK.

We present a genome assembly for Cottoperca gobio (channel bull blenny, (Günther, 1861)); Chordata; Actinopterygii (ray-finned fishes), a temperate water outgroup for Antarctic Notothenioids. The size of the genome assembly is 609 megabases, with the majority of the assembly scaffolded into 24 chromosomal pseudomolecules. Gene annotation on Ensembl of this assembly has identified 21,662 coding genes.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.16012.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7649722
June 2020

A haplotype-resolved, de novo genome assembly for the wood tiger moth (Arctia plantaginis) through trio binning.

Gigascience 2020 08;9(8)

Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK.

Background: Diploid genome assembly is typically impeded by heterozygosity because it introduces errors when haplotypes are collapsed into a consensus sequence. Trio binning offers an innovative solution that exploits heterozygosity for assembly. Short, parental reads are used to assign parental origin to long reads from their F1 offspring before assembly, enabling complete haplotype resolution. Trio binning could therefore provide an effective strategy for assembling highly heterozygous genomes, which are traditionally problematic, such as insect genomes. This includes the wood tiger moth (Arctia plantaginis), which is an evolutionary study system for warning colour polymorphism.

Findings: We produced a high-quality, haplotype-resolved assembly for Arctia plantaginis through trio binning. We sequenced a same-species family (F1 heterozygosity ∼1.9%) and used parental Illumina reads to bin 99.98% of offspring Pacific Biosciences reads by parental origin, before assembling each haplotype separately and scaffolding with 10X linked reads. Both assemblies are contiguous (mean scaffold N50: 8.2 Mb) and complete (mean BUSCO completeness: 97.3%), with annotations and 31 chromosomes identified through karyotyping. We used the assembly to analyse genome-wide population structure and relationships between 40 wild resequenced individuals from 5 populations across Europe, revealing the Georgian population as the most genetically differentiated with the lowest genetic diversity.

Conclusions: We present the first invertebrate genome to be assembled via trio binning. This assembly is one of the highest quality genomes available for Lepidoptera, supporting trio binning as a potent strategy for assembling heterozygous genomes. Using our assembly, we provide genomic insights into the geographic population structure of A. plantaginis.
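
The binning step described in this abstract (using short parental reads to assign each long offspring read to a parent before assembly) can be illustrated with parent-specific k-mers. The sketch below is a toy illustration under stated assumptions, not the pipeline used in the paper: the function names, the value of `k`, and the sequences are hypothetical, reverse complements and sequencing errors are ignored, and a simple majority vote replaces the normalised scoring that production tools apply.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets from parental reads."""
    maternal = set().union(*(kmers(r, k) for r in maternal_reads))
    paternal = set().union(*(kmers(r, k) for r in paternal_reads))
    return maternal - paternal, paternal - maternal


def bin_read(read, maternal_only, paternal_only, k):
    """Assign an offspring read to a parent by counting parent-specific k-mers."""
    read_kmers = kmers(read, k)
    maternal_hits = len(read_kmers & maternal_only)
    paternal_hits = len(read_kmers & paternal_only)
    if maternal_hits > paternal_hits:
        return "maternal"
    if paternal_hits > maternal_hits:
        return "paternal"
    return "unassigned"  # no informative k-mers, or a tie


# Hypothetical toy data: the two parents differ only at the end of the read.
K = 4
mat_only, pat_only = parent_specific_kmers(["ACGTACGTAA"], ["ACGTACGTCC"], K)
for read in ["TTACGTAAGG", "GGACGTCCAA", "ACGTACGT"]:
    print(read, bin_read(read, mat_only, pat_only, K))
```

Reads that carry no parent-specific k-mers fall through to "unassigned"; higher heterozygosity between the parental haplotypes leaves fewer such reads, which is consistent with the abstract's point that trio binning exploits heterozygosity rather than being impeded by it.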

Source
DOI: http://dx.doi.org/10.1093/gigascience/giaa088
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7433188
August 2020

The genome sequence of the Eurasian red squirrel, Sciurus vulgaris Linnaeus 1758.

Wellcome Open Res 2020 3;5:18. Epub 2020 Feb 3.

Tree of Life, Wellcome Sanger Institute, Cambridge, CB10 1SA, UK.

We present a genome assembly from an individual male Sciurus vulgaris (the Eurasian red squirrel; Vertebrata; Mammalia; Eutheria; Rodentia; Sciuridae). The genome sequence is 2.88 gigabases in span. The majority of the assembly is scaffolded into 21 chromosomal-level scaffolds, with both X and Y sex chromosomes assembled.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.15679.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7309416
February 2020

The gene-rich genome of the scallop Pecten maximus.

Gigascience 2020 05;9(5)

Natural History Museum, Department of Life Sciences, Cromwell Road, London SW7 5BD, UK.

Background: The king scallop, Pecten maximus, is distributed in shallow waters along the Atlantic coast of Europe. It forms the basis of a valuable commercial fishery and plays a key role in coastal ecosystems and food webs. Like other filter feeding bivalves it can accumulate potent phytotoxins, to which it has evolved some immunity. The molecular origins of this immunity are of interest to evolutionary biologists, pharmaceutical companies, and fisheries management.

Findings: Here we report the genome assembly of this species, conducted as part of the Wellcome Sanger 25 Genomes Project. This genome was assembled from PacBio reads and scaffolded with 10X Chromium and Hi-C data. Its 3,983 scaffolds have an N50 of 44.8 Mb (longest scaffold 60.1 Mb), with 92% of the assembly sequence contained in 19 scaffolds, corresponding to the 19 chromosomes found in this species. The total assembly spans 918.3 Mb and is the best-scaffolded marine bivalve genome published to date, exhibiting 95.5% recovery of the metazoan BUSCO set. Gene annotation resulted in 67,741 gene models. Analysis of gene content revealed large numbers of gene duplicates, as previously seen in bivalves, with little gene loss, in comparison with the sequenced genomes of other marine bivalve species.

Conclusions: The genome assembly of P. maximus and its annotated gene set provide a high-quality platform for studies on such disparate topics as shell biomineralization, pigmentation, vision, and resistance to algal toxins. As a result of our findings we highlight the sodium channel gene Nav1, known to confer resistance to saxitoxin and tetrodotoxin, as a candidate for further studies investigating immunity to domoic acid.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giaa037
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7191990
May 2020

The genome sequence of the Eurasian river otter, Lutra lutra Linnaeus 1758.

Wellcome Open Res 2020 19;5:33. Epub 2020 Feb 19.

Wellcome Genome Campus, Wellcome Sanger Institute, Hinxton, CB10 1SA, UK.

We present a genome assembly from an individual male Lutra lutra (the Eurasian river otter; Vertebrata; Mammalia; Eutheria; Carnivora; Mustelidae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.15722.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7097881
February 2020

Insights into human genetic variation and population history from 929 diverse genomes.

Science 2020 03;367(6484)

Wellcome Sanger Institute, Hinxton CB10 1SA, UK.

Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to southern Africa, central Africa, Oceania, and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the past 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.

Source
DOI: http://dx.doi.org/10.1126/science.aay5012
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7115999
March 2020

A Model for Integrating the Functions of Neuropsychiatric Risk Genes Identifies Components Required for Normal Dendritic Morphology.

G3 (Bethesda) 2020 05 4;10(5):1617-1628. Epub 2020 May 4.

Cold Spring Harbor Laboratory, One Bungtown Road, Cold Spring Harbor, NY 11724

Analysis of patient-derived DNA samples has identified hundreds of variants that are likely involved in neuropsychiatric diseases such as autism spectrum disorder (ASD) and schizophrenia (SCZ). While these studies couple behavioral phenotypes to individual genotypes, the number and diversity of candidate genes implicated in these disorders highlights the fact that the mechanistic underpinnings of these disorders are largely unknown. Here, we describe an RNAi-based screening platform that uses C. elegans to screen candidate neuropsychiatric risk genes (NRGs) for roles in controlling dendritic arborization. To benchmark this approach, we queried published lists of NRGs whose variants in ASD and SCZ are predicted to result in complete or partial loss of gene function. We found that a significant fraction (>16%) of these candidate NRGs are essential for dendritic development. Furthermore, these gene sets are enriched for dendritic arbor phenotypes (>14 fold) when compared to control RNAi datasets of over 500 human orthologs. The diversity of PVD structural abnormalities observed in these assays suggests that the functions of diverse NRGs (encoding transcription factors, chromatin remodelers, molecular chaperones and cytoskeleton-related proteins) converge to regulate neuronal morphology and that individual NRGs may play distinct roles in dendritic branching. We also demonstrate the experimental value of this platform by providing additional insights into the molecular frameworks of candidate NRGs. Specifically, we show that ANK2/UNC-44 function is directly integrated with known regulators of dendritic arborization and suggest that altering the dosage of ARID1B/LET-526 expression during development affects neuronal morphology without diminishing aspects of cell fate specification.

Source
DOI: http://dx.doi.org/10.1534/g3.119.400925
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7202017
May 2020

Identifying and removing haplotypic duplication in primary genome assemblies.

Bioinformatics 2020 05;36(9):2896-2898

Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK.

Motivation: Rapid development in long-read sequencing and scaffolding technologies is accelerating the production of reference-quality assemblies for large eukaryotic genomes. However, haplotype divergence in regions of high heterozygosity often results in assemblers creating two copies rather than one copy of a region, leading to breaks in contiguity and compromising downstream steps such as gene annotation. Several tools have been developed to resolve this problem. However, they either focus only on removing contained duplicate regions, also known as haplotigs, or fail to use all the relevant information and hence make errors.

Results: Here we present a novel tool, purge_dups, that uses sequence similarity and read depth to automatically identify and remove both haplotigs and heterozygous overlaps. In comparison with current tools, we demonstrate that purge_dups can reduce heterozygous duplication and increase assembly continuity while maintaining completeness of the primary assembly. Moreover, purge_dups is fully automatic and can easily be integrated into assembly pipelines.

Availability And Implementation: The source code is written in C and is available at https://github.com/dfguan/purge_dups.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa025
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7203741
May 2020

Birth, expansion, and death of VCY-containing palindromes on the human Y chromosome.

Genome Biol 2019 10 14;20(1):207. Epub 2019 Oct 14.

The Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.

Background: Large palindromes (inverted repeats) make up substantial proportions of mammalian sex chromosomes, often contain genes, and have high rates of structural variation arising via ectopic recombination. As a result, they underlie many genomic disorders. Maintenance of the palindromic structure by gene conversion between the arms has been documented, but over longer time periods, palindromes are remarkably labile. Mechanisms of origin and loss of palindromes have, however, received little attention.

Results: Here, we use fiber-FISH, 10x Genomics Linked-Read sequencing, and breakpoint PCR sequencing to characterize the structural variation of the P8 palindrome on the human Y chromosome, which contains two copies of the VCY (Variable Charge Y) gene. We find a deletion of almost an entire arm of the palindrome, leading to death of the palindrome, a size increase by recruitment of adjacent sequence, and other complex changes including the formation of an entire new palindrome nearby. Together, these changes are found in ~ 1% of men, and we can assign likely molecular mechanisms to these mutational events. As a result, healthy men can have 1-4 copies of VCY.

Conclusions: Gross changes, especially duplications, in palindrome structure can be relatively frequent and facilitate the evolution of sex chromosomes in humans, and potentially also in other mammalian species.

Source
DOI: http://dx.doi.org/10.1186/s13059-019-1816-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6790999
October 2019

Targeted Treatment of Individuals With Psychosis Carrying a Copy Number Variant Containing a Genomic Triplication of the Glycine Decarboxylase Gene.

Biol Psychiatry 2019 10 9;86(7):523-535. Epub 2019 May 9.

McLean Hospital, Belmont, Massachusetts; Department of Psychiatry, Harvard Medical School, Boston, Massachusetts.

Background: The increased mutational burden for rare structural genomic variants in schizophrenia and other neurodevelopmental disorders has so far not yielded therapies targeting the biological effects of specific mutations. We identified two carriers (mother and son) of a triplication of the gene encoding glycine decarboxylase, GLDC, presumably resulting in reduced availability of the N-methyl-D-aspartate receptor coagonists glycine and D-serine and N-methyl-D-aspartate receptor hypofunction. Both carriers had a diagnosis of a psychotic disorder.

Methods: We carried out two double-blind, placebo-controlled clinical trials of N-methyl-D-aspartate receptor augmentation of psychotropic drug treatment in these two individuals. Glycine was used in the first clinical trial, and D-cycloserine was used in the second one.

Results: Glycine or D-cycloserine augmentation of psychotropic drug treatment each improved psychotic and mood symptoms in placebo-controlled trials.

Conclusions: These results provide two independent proof-of-principle demonstrations of symptom relief by targeting a specific genotype and explicitly link an individual mutation to the pathophysiology of psychosis and treatment response.

Source
DOI: http://dx.doi.org/10.1016/j.biopsych.2019.04.031
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6745274
October 2019

Rare Protein-Truncating Variants in APOB, Lower Low-Density Lipoprotein Cholesterol, and Protection Against Coronary Heart Disease.

Circ Genom Precis Med 2019 05;12(5):e002376

Program in Medical and Population Genetics, Broad Institute, Cambridge, MA (A.V.K., M.C., S.K.).

Background: Familial hypobetalipoproteinemia is a genetic disorder caused by rare protein-truncating variants (PTV) in the gene encoding APOB (apolipoprotein B), the major protein component of LDL (low-density lipoprotein) and triglyceride-rich lipoprotein particles. Whether heterozygous APOB deficiency is associated with decreased risk for coronary heart disease (CHD) is uncertain. We combined family-based and large scale gene-sequencing to characterize the association of rare PTVs in APOB with circulating LDL-C (LDL cholesterol), triglycerides, and risk for CHD.

Methods: We sequenced the APOB gene in 29 Japanese hypobetalipoproteinemia families, as well as 57 973 individuals derived from 12 CHD case-control studies (18 442 with early-onset CHD and 39 531 controls). We defined PTVs as variants that lead to a premature stop, disrupt canonical splice-sites, or lead to insertions/deletions that shift reading frame. We tested the association of rare APOB PTV carrier status with blood lipid levels and CHD.

Results: Among 29 familial hypobetalipoproteinemia families, 8 families harbored APOB PTVs. Carrying 1 APOB PTV was associated with 55 mg/dL lower LDL-C (P=3×10) and 53% lower triglyceride level (P=2×10). Among 12 case-control studies, an APOB PTV was present in 0.038% of CHD cases as compared to 0.092% of controls. APOB PTV carrier status was associated with a 43 mg/dL lower LDL-C (P=2×10), a 30% decrease in triglycerides (P=5×10), and a 72% lower risk for CHD (odds ratio, 0.28; 95% CI, 0.12-0.64; P=0.002).

Conclusions: Rare PTV mutations in APOB which are associated with lower LDL-C and reduced triglycerides also confer protection against CHD.

Source
DOI: http://dx.doi.org/10.1161/CIRCGEN.118.002376
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7044908
May 2019

Population-based identity-by-descent mapping combined with exome sequencing to detect rare risk variants for schizophrenia.

Am J Med Genet B Neuropsychiatr Genet 2019 04 23;180(3):223-231. Epub 2019 Feb 23.

Cognitive Genetics and Cognitive Therapy Group, Neuroimaging, Cognition & Genomics (NICOG) Centre & NCBES Galway Neuroscience Centre, School of Psychology and Discipline of Biochemistry, National University of Ireland Galway, Galway, Ireland.

Genome-wide association studies (GWASs) are highly effective at identifying common risk variants for schizophrenia. Rare risk variants are also important contributors to schizophrenia etiology but, with the exception of large copy number variants, are difficult to detect with GWAS. Exome and genome sequencing, which have accelerated the study of rare variants, are expensive so alternative methods are needed to aid detection of rare variants. Here we re-analyze an Irish schizophrenia GWAS dataset (n = 3,473) by performing identity-by-descent (IBD) mapping followed by exome sequencing of individuals identified as sharing risk haplotypes to search for rare risk variants in coding regions. We identified 45 rare haplotypes (>1 cM) that were significantly more common in cases than controls. By exome sequencing 105 haplotype carriers, we investigated these haplotypes for functional coding variants that could be tested for association in independent GWAS samples. We identified one rare missense variant in PCNT but did not find statistical support for an association with schizophrenia in a replication analysis. However, IBD mapping can prioritize both individual samples and genomic regions for follow-up analysis but genome rather than exome sequencing may be more effective at detecting risk variants on rare haplotypes.

Source
DOI: http://dx.doi.org/10.1002/ajmg.b.32716
April 2019

Biobank-driven genomic discovery yields new insight into atrial fibrillation biology.

Nat Genet 2018 09 30;50(9):1234-1239. Epub 2018 Jul 30.

Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA.

To identify genetic variation underlying atrial fibrillation, the most common cardiac arrhythmia, we performed a genome-wide association study of >1,000,000 people, including 60,620 atrial fibrillation cases and 970,216 controls. We identified 142 independent risk variants at 111 loci and prioritized 151 functional candidate genes likely to be involved in atrial fibrillation. Many of the identified risk variants fall near genes where more deleterious mutations have been reported to cause serious heart defects in humans (GATA4, MYH6, NKX2-5, PITX2, TBX5), or near genes important for striated muscle function and integrity (for example, CFL2, MYH7, PKP2, RBM20, SGCG, SSPN). Pathway and functional enrichment analyses also suggested that many of the putative atrial fibrillation genes act via cardiac structural remodeling, potentially in the form of an 'atrial cardiomyopathy', either during fetal heart development or as a response to stress in the adult heart.

Source
DOI: http://dx.doi.org/10.1038/s41588-018-0171-3
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6530775
September 2018

Crumble: reference free lossy compression of sequence quality values.

Bioinformatics 2019 01;35(2):337-339

DNA Pipelines, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

Motivation: The bulk of space taken up by NGS sequencing CRAM files consists of per-base quality values. Most of these are unnecessary for variant calling, offering an opportunity for space saving.

Results: On the Syndip test set, a 17 fold reduction in the quality storage portion of a CRAM file can be achieved while maintaining variant calling accuracy. The size reduction of an entire CRAM file varied from 2.2 to 7.4 fold, depending on the non-quality content of the original file (see Supplementary Material S6 for details).

Availability and Implementation: Crumble is open source and can be obtained from https://github.com/jkbonfield/crumble.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/bty608
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6330002
January 2019

DNA sequence-level analyses reveal potential phenotypic modifiers in a large family with psychiatric disorders.

Mol Psychiatry 2018 12 7;23(12):2254-2265. Epub 2018 Jun 7.

Centre for Genomic and Experimental Medicine, MRC Institute of Genetic and Molecular Medicine, University of Edinburgh, Edinburgh, UK.

Psychiatric disorders are a group of genetically related diseases with highly polygenic architectures. Genome-wide association analyses have made substantial progress towards understanding the genetic architecture of these disorders. More recently, exome- and whole-genome sequencing of cases and families have identified rare, high penetrant variants that provide direct functional insight. There remains, however, a gap in the heritability explained by these complementary approaches. To understand how multiple genetic variants combine to modify both severity and penetrance of a highly penetrant variant, we sequenced 48 whole genomes from a family with a high loading of psychiatric disorder linked to a balanced chromosomal translocation. The (1;11)(q42;q14.3) translocation directly disrupts three genes: DISC1, DISC2, DISC1FP and has been linked to multiple brain imaging and neurocognitive outcomes in the family. Using DNA sequence-level linkage analysis, functional annotation and population-based association, we identified common and rare variants in GRM5 (minor allele frequency (MAF) > 0.05), PDE4D (MAF > 0.2) and CNTN5 (MAF < 0.01) that may help explain the individual differences in phenotypic expression in the family. We suggest that whole-genome sequencing in large families will improve the understanding of the combined effects of the rare and common sequence variation underlying psychiatric phenotypes.

Source
DOI: http://dx.doi.org/10.1038/s41380-018-0087-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6294736
December 2018

Profiling and Leveraging Relatedness in a Precision Medicine Cohort of 92,455 Exomes.

Am J Hum Genet 2018 05;102(5):874-889

Regeneron Genetics Center, Regeneron Pharmaceuticals, Tarrytown, NY 10591, USA.

Large-scale human genetics studies are ascertaining increasing proportions of populations as they continue growing in both number and scale. As a result, the amount of cryptic relatedness within these study cohorts is growing rapidly and has significant implications on downstream analyses. We demonstrate this growth empirically among the first 92,455 exomes from the DiscovEHR cohort and, via a custom simulation framework we developed called SimProgeny, show that these measures are in line with expectations given the underlying population and ascertainment approach. For example, within DiscovEHR we identified ∼66,000 close (first- and second-degree) relationships, involving 55.6% of study participants. Our simulation results project that >70% of the cohort will be involved in these close relationships, given that DiscovEHR scales to 250,000 recruited individuals. We reconstructed 12,574 pedigrees by using these relationships (including 2,192 nuclear families) and leveraged them for multiple applications. The pedigrees substantially improved the phasing accuracy of 20,947 rare, deleterious compound heterozygous mutations. Reconstructed nuclear families were critical for identifying 3,415 de novo mutations in ∼1,783 genes. Finally, we demonstrate the segregation of known and suspected disease-causing mutations, including a tandem duplication that occurs in LDLR and causes familial hypercholesterolemia, through reconstructed pedigrees. In summary, this work highlights the prevalence of cryptic relatedness expected among large healthcare population-genomic studies and demonstrates several analyses that are uniquely enabled by large amounts of cryptic relatedness.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2018.03.012
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5986700
May 2018

Marker chromosome genomic structure and temporal origin implicate a chromoanasynthesis event in a family with pleiotropic psychiatric phenotypes.

Hum Mutat 2018 07 11;39(7):939-946. Epub 2018 May 11.

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas.

Small supernumerary marker chromosomes (sSMC) are chromosomal fragments difficult to characterize genomically. Here, we detail a proband with schizoaffective disorder and a mother with bipolar disorder with psychotic features who present with a marker chromosome that segregates with disease. We explored the architecture of this marker and investigated its temporal origin. Array comparative genomic hybridization (aCGH) analysis revealed three duplications and three triplications that spanned the short arm of chromosome 9, suggestive of a chromoanasynthesis-like event. Segregation of marker genotypes, phased using sSMC mosaicism in the mother, provided evidence that it was generated during a germline-level event in the proband's maternal grandmother. Whole-genome sequencing (WGS) was performed to resolve the structure and junctions of the chromosomal fragments, revealing further complexities. While structural variations have been previously associated with neuropsychiatric disorders and marker chromosomes, here we detail the precise architecture, human life-cycle genesis, and propose a DNA replicative/repair mechanism underlying formation.

Source
DOI: http://dx.doi.org/10.1002/humu.23537
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5995661
July 2018

A Protein-Truncating HSD17B13 Variant and Protection from Chronic Liver Disease.

N Engl J Med 2018 03;378(12):1096-1106

From the Regeneron Genetics Center (N.S.A.-H., A.H.L., C.S., S. McCarthy, C.O., J.S.P., S.B., N.G., S. Mukherjee, A.E.L., E.D.F., J.P., I.B.B., A.R.S., J.G.R., J.D.O., O.G., T.M.T., A.B., F.E.D.) and Regeneron Pharmaceuticals (X. Cheng, Y.X., P.S., Y.L., D.E., S.Y.K., B.Z., W.O., A.J.M., G.D.Y., J.G.), Tarrytown, NY; the University of Texas Southwestern Medical Center at Dallas, Dallas (J.K., S.S., H.H.H., J.C.C.); and Geisinger Health System, Danville (G.C.W., A.N.S., M.D.S., X. Chu, J.Z.L., U.L.M., D.J.C., C.D.S., T.M.), and Perelman School of Medicine, University of Pennsylvania, Philadelphia (M.D.F., A.S., S.M.D., D.J.R.) - both in Pennsylvania.

Background: Elucidation of the genetic factors underlying chronic liver disease may reveal new therapeutic targets.

Methods: We used exome sequence data and electronic health records from 46,544 participants in the DiscovEHR human genetics study to identify genetic variants associated with serum levels of alanine aminotransferase (ALT) and aspartate aminotransferase (AST). Variants that were replicated in three additional cohorts (12,527 persons) were evaluated for association with clinical diagnoses of chronic liver disease in DiscovEHR study participants and two independent cohorts (total of 37,173 persons) and with histopathological severity of liver disease in 2391 human liver samples.

Results: A splice variant (rs72613567:TA) in HSD17B13, encoding the hepatic lipid droplet protein hydroxysteroid 17-beta dehydrogenase 13, was associated with reduced levels of ALT (P=4.2×10) and AST (P=6.2×10). Among DiscovEHR study participants, this variant was associated with a reduced risk of alcoholic liver disease (by 42% [95% confidence interval {CI}, 20 to 58] among heterozygotes and by 53% [95% CI, 3 to 77] among homozygotes), nonalcoholic liver disease (by 17% [95% CI, 8 to 25] among heterozygotes and by 30% [95% CI, 13 to 43] among homozygotes), alcoholic cirrhosis (by 42% [95% CI, 14 to 61] among heterozygotes and by 73% [95% CI, 15 to 91] among homozygotes), and nonalcoholic cirrhosis (by 26% [95% CI, 7 to 40] among heterozygotes and by 49% [95% CI, 15 to 69] among homozygotes). Associations were confirmed in two independent cohorts. The rs72613567:TA variant was associated with a reduced risk of nonalcoholic steatohepatitis, but not steatosis, in human liver samples. The rs72613567:TA variant mitigated liver injury associated with the risk-increasing PNPLA3 p.I148M allele and resulted in an unstable and truncated protein with reduced enzymatic activity.

Conclusions: A loss-of-function variant in HSD17B13 was associated with a reduced risk of chronic liver disease and of progression from steatosis to steatohepatitis. (Funded by Regeneron Pharmaceuticals and others.).

Source
DOI: http://dx.doi.org/10.1056/NEJMoa1712191
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668033
March 2018

Altered DNA methylation associated with a translocation linked to major mental illness.

NPJ Schizophr 2018 Mar 19;4(1). Epub 2018 Mar 19.

Medical Genetics Section, Centre for Genomic and Experimental Medicine, Institute of Genetics and Molecular Medicine, Western General Hospital, University of Edinburgh, Crewe Road, Edinburgh, EH4 2XU, UK.

Recent work has highlighted a possible role for altered epigenetic modifications, including differential DNA methylation, in susceptibility to psychiatric illness. Here, we investigate blood-based DNA methylation in a large family where a balanced translocation between chromosomes 1 and 11 shows genome-wide significant linkage to psychiatric illness. Genome-wide DNA methylation was profiled in whole-blood-derived DNA from 41 individuals using the Infinium HumanMethylation450 BeadChip (Illumina Inc., San Diego, CA). We found significant differences in DNA methylation when translocation carriers (n = 17) were compared to related non-carriers (n = 24) at 13 loci. All but one of the 13 significant differentially methylated positions (DMPs) mapped to the regions surrounding the translocation breakpoints. Methylation levels of five DMPs were associated with genotype at SNPs in linkage disequilibrium with the translocation. Two of the five genes harbouring significant DMPs, DISC1 and DUSP10, have been previously shown to be differentially methylated in schizophrenia. Gene Ontology analysis revealed enrichment for terms relating to neuronal function and neurodevelopment among the genes harbouring the most significant DMPs. Differentially methylated region (DMR) analysis highlighted a number of genes from the MHC region, which has been implicated in psychiatric illness previously through genetic studies. We show that inheritance of a translocation linked to major mental illness is associated with differential DNA methylation at loci implicated in neuronal development/function and in psychiatric illness. As genomic rearrangements are over-represented in individuals with psychiatric illness, such analyses may be valuable more widely in the study of these conditions.

Source
DOI: http://dx.doi.org/10.1038/s41537-018-0047-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859082
March 2018

Rare Variant Analysis of Human and Rodent Obesity Genes in Individuals with Severe Childhood Obesity.

Sci Rep 2017 06 29;7(1):4394. Epub 2017 Jun 29.

University of Cambridge Metabolic Research Laboratories and NIHR Cambridge Biomedical Research Centre, Wellcome Trust-MRC Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge, UK.

Obesity is a genetically heterogeneous disorder. Using targeted and whole-exome sequencing, we studied 32 human and 87 rodent obesity genes in 2,548 severely obese children and 1,117 controls. We identified 52 variants contributing to obesity in 2% of cases including multiple novel variants in GNAS, which were sometimes found with accelerated growth rather than short stature as described previously. Nominally significant associations were found for rare functional variants in BBS1, BBS9, GNAS, MKKS, CLOCK and ANGPTL6. The p.S284X variant in ANGPTL6 drives the association signal (rs201622589, MAF~0.1%, odds ratio = 10.13, p-value = 0.042) and results in complete loss of secretion in cells. Further analysis including additional case-control studies and population controls (N = 260,642) did not support association of this variant with obesity (odds ratio = 2.34, p-value = 2.59 × 10), highlighting the challenges of testing rare variant associations and the need for very large sample sizes. Further validation in cohorts with severe obesity and engineering the variants in model organisms will be needed to explore whether human variants in ANGPTL6 and other genes that lead to obesity when deleted in mice, do contribute to obesity. Such studies may yield druggable targets for weight loss therapies.

Source
DOI: http://dx.doi.org/10.1038/s41598-017-03054-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5491520
June 2017

Enrichment of low-frequency functional variants revealed by whole-genome sequencing of multiple isolated European populations.

Nat Commun 2017 06 23;8:15927. Epub 2017 Jun 23.

The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK.

The genetic features of isolated populations can boost power in complex-trait association studies, and an in-depth understanding of how their genetic variation has been shaped by their demographic history can help leverage these advantageous characteristics. Here, we perform a comprehensive investigation using 3,059 newly generated low-depth whole-genome sequences from eight European isolates and two matched general populations, together with published data from the 1000 Genomes Project and UK10K. Sequencing data give deeper and richer insights into population demography and genetic characteristics than genotype-chip data, distinguishing related populations more effectively and allowing their functional variants to be studied more fully. We demonstrate relaxation of purifying selection in the isolates, leading to enrichment of rare and low-frequency functional variants, using novel statistics, DVxy and SVxy. We also develop an isolation-index (Isx) that predicts the overall level of such key genetic characteristics and can thus help guide population choice in future complex-trait association studies.

Source
DOI: http://dx.doi.org/10.1038/ncomms15927
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5490002
June 2017
-->