Publications by authors named "Evan E Eichler"

400 Publications

Reflections on the genetics-first approach to advancements in molecular genetic and neurobiological research on neurodevelopmental disorders.

J Neurodev Disord 2021 Jun 21;13(1):24. Epub 2021 Jun 21.

Department of Psychiatry and Behavioral Sciences, University of Washington, CHDD, Box 357920, Seattle, WA, 98195, USA.

Background: Neurodevelopmental disorders (NDDs), including autism spectrum disorder (ASD) and intellectual disability (ID), are common diagnoses with highly heterogeneous phenotypes and etiology. The genetics-first approach to research on NDDs has led to the identification of hundreds of genes conferring risk for ASD, ID, and related symptoms.

Main Body: Although relatively few individuals with NDDs share likely gene-disruptive (LGD) mutations in the same gene, characterization of overlapping functions, protein networks, and temporospatial expression patterns among these genes has led to increased understanding of the neurobiological etiology of NDDs. This shift in focus away from single genes and toward broader gene-brain-behavior pathways has been accelerated by the development of publicly available transcriptomic databases, cell type-specific research methods, and sequencing of non-coding genomic regions.

Conclusions: The genetics-first approach to research on NDDs has advanced the identification of critical protein function pathways and temporospatial expression patterns, expanding the impact of this research beyond individuals with single-gene mutations to the broader population of patients with NDDs.

Source
DOI: http://dx.doi.org/10.1186/s11689-021-09371-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8215789
June 2021

The CHD8/CHD7/Kismet family links blood-brain barrier glia and serotonin to ASD-associated sleep defects.

Sci Adv 2021 Jun 4;7(23). Epub 2021 Jun 4.

Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud university medical center, 6525 GA, Nijmegen, Netherlands.

Sleep disturbances in autism and neurodevelopmental disorders are common and adversely affect patients' quality of life, yet the underlying mechanisms are understudied. We found that individuals with mutations in CHD8, among the highest-confidence autism risk genes, or CHD7 suffer from disturbed sleep maintenance. These defects are recapitulated in Drosophila mutants affecting kismet, the sole CHD8/CHD7 ortholog. We show that Kismet is required in glia for early developmental and adult sleep architecture. This role localizes to subperineurial glia constituting the blood-brain barrier. We demonstrate that Kismet-related sleep disturbances are caused by high serotonin during development, paralleling a well-established but genetically unsolved autism endophenotype. Despite their developmental origin, Kismet's sleep architecture defects can be reversed in adulthood by a behavioral regime resembling human sleep restriction therapy. Our findings provide fundamental insights into glial regulation of sleep and propose a causal mechanistic link between the CHD8/CHD7/Kismet family, developmental hyperserotonemia, and autism-associated sleep disturbances.

Source
DOI: http://dx.doi.org/10.1126/sciadv.abe2626
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8177706
June 2021

Mining the gaps of chromosome 8.

Nature 2021 May 14. Epub 2021 May 14.

Source
DOI: http://dx.doi.org/10.1038/d41586-021-01095-8
May 2021

Sleep Problems in Children with ASD and Gene Disrupting Mutations.

J Genet Psychol 2021 May 17:1-18. Epub 2021 May 17.

Center for Youth Development and Intervention and Department of Psychology, University of Alabama, Tuscaloosa, Alabama, USA.

Sleep difficulties are pervasive in autism spectrum disorder (ASD), yet how sleep problems relate to underlying biological mechanisms such as genetic etiology is unclear, despite recent reports of profound sleep problems in children with ASD-associated de novo likely gene-disrupting (dnLGD) mutations, and . We aimed to inform etiological contributions to ASD and sleep by characterizing sleep problems in individuals with dnLGD mutations. Participants (N = 2886) were families who completed dichotomous questions about sleep problems within a medical history interview for their child with ASD (age 3-28 years). Confirmatory factor analyses compared between those with ASD and a dnLGD mutation and those with idiopathic ASD (i.e., no known genetic event, NON) highlighted four domains (sleep onset, breathing issues, nighttime awakenings, and daytime tiredness) with sleep onset as a strong factor for both groups. Overall, participant predictors indicated that internalizing behavioral problems and lower cognitive scores were related to increased sleep problems. Internalizing problems were also related to increased nighttime awakenings in the dnLGD group. As an exploratory aim, patterns of sleep issues are described for genetic subgroups with unique patterns including more overall sleep issues in (n = 19), problems falling asleep in (n = 22), and increased daytime naps in (n = 23). Implications for considering genetically defined subgroups when approaching sleep problems in children with ASD are discussed.

Source
DOI: http://dx.doi.org/10.1080/00221325.2021.1922869
May 2021

A high-quality bonobo genome refines the analysis of hominid evolution.

Nature 2021 Jun 5;594(7861):77-81. Epub 2021 May 5.

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03519-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8172381
June 2021

Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C.

Nat Commun 2021 04 28;12(1):1935. Epub 2021 Apr 28.

Pacific Biosciences, Menlo Park, CA, USA.

Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without parental data, and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80-91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-20536-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081726
April 2021

Rare deleterious mutations of HNRNP genes result in shared neurodevelopmental disorders.

Genome Med 2021 Apr 19;13(1):63. Epub 2021 Apr 19.

The Atwal Clinic: Genomic & Personalized Medicine, Jacksonville, FL, USA.

Background: With the increasing number of genomic sequencing studies, hundreds of genes have been implicated in neurodevelopmental disorders (NDDs). The rate of gene discovery far outpaces our understanding of genotype-phenotype correlations, with clinical characterization remaining a bottleneck for understanding NDDs. Most disease-associated Mendelian genes are members of gene families, and we hypothesize that those with related molecular function share clinical presentations.

Methods: We tested our hypothesis by considering gene families that have multiple members with an enrichment of de novo variants among NDDs, as determined by previous meta-analyses. One of these gene families is the heterogeneous nuclear ribonucleoproteins (hnRNPs), which has 33 members, five of which have been recently identified as NDD genes (HNRNPK, HNRNPU, HNRNPH1, HNRNPH2, and HNRNPR) and two of which have significant enrichment in our previous meta-analysis of probands with NDDs (HNRNPU and SYNCRIP). Utilizing protein homology, mutation analyses, gene expression analyses, and phenotypic characterization, we provide evidence for variation in 12 HNRNP genes as candidates for NDDs. Seven are potentially novel while the remaining genes in the family likely do not significantly contribute to NDD risk.

Results: We report 119 new NDD cases (64 de novo variants) identified through sequencing and international collaborations and combine them with published clinical case reports. We consider 235 cases with gene-disruptive single-nucleotide variants or indels and 15 cases with small copy number variants. Three hnRNP-encoding genes reach nominal or exome-wide significance for de novo variant enrichment, while nine are candidates for pathogenic mutations. Comparison of HNRNP gene expression shows a pattern consistent with a role in cerebral cortical development with enriched expression among radial glial progenitors. Clinical assessment of probands (n = 188-221) expands the phenotypes associated with HNRNP rare variants, and phenotypes associated with variation in the HNRNP genes distinguish them as a subgroup of NDDs.

Conclusions: Overall, our novel approach of exploiting gene families in NDDs identifies new HNRNP-related disorders, expands the phenotypes of known HNRNP-related disorders, strongly implicates disruption of the hnRNPs as a whole in NDDs, and supports that NDD subtypes likely have shared molecular pathogenesis. To date, this is the first study to identify novel genetic disorders based on the presence of disorders in related genes. We also perform the first phenotypic analyses focusing on related genes. Finally, we show that radial glial expression of these genes is likely critical during neurodevelopment. This is important for diagnostics, as well as developing strategies to best study these genes for the development of therapeutics.

Source
DOI: http://dx.doi.org/10.1186/s13073-021-00870-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8056596
April 2021

The structure, function and evolution of a complete human chromosome 8.

Nature 2021 05 7;593(7857):101-107. Epub 2021 Apr 7.

Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA.

The complete assembly of each human chromosome is essential for understanding human biology and evolution. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03420-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8099727
May 2021

Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies.

Am J Hum Genet 2021 05 30;108(5):919-928. Epub 2021 Mar 30.

Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Disorders, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA; Division of Medical Sciences, Harvard Medical School, Boston, MA 02115, USA.

Virtually all genome sequencing efforts in national biobanks, complex and Mendelian disease programs, and medical genetic initiatives are reliant upon short-read whole-genome sequencing (srWGS), which presents challenges for the detection of structural variants (SVs) relative to emerging long-read WGS (lrWGS) technologies. Given this ubiquity of srWGS in large-scale genomics initiatives, we sought to establish expectations for routine SV detection from this data type by comparison with lrWGS assembly, as well as to quantify the genomic properties and added value of SVs uniquely accessible to each technology. Analyses from the Human Genome Structural Variation Consortium (HGSVC) of three families captured ~11,000 SVs per genome from srWGS and ~25,000 SVs per genome from lrWGS assembly. Detection power and precision for SV discovery varied dramatically by genomic context and variant class: 9.7% of the current GRCh38 reference is defined by segmental duplication (SD) and simple repeat (SR), yet 91.4% of deletions that were specifically discovered by lrWGS localized to these regions. Across the remaining 90.3% of reference sequence, we observed extremely high (93.8%) concordance between technologies for deletions in these datasets. In contrast, lrWGS was superior for detection of insertions across all genomic contexts. Given that non-SD/SR sequences encompass 95.9% of currently annotated disease-associated exons, improved sensitivity from lrWGS to discover novel pathogenic deletions in these currently interpretable genomic regions is likely to be incremental. However, these analyses highlight the considerable added value of assembly-based lrWGS to create new catalogs of insertions and transposable elements, as well as disease-associated repeat expansions in genomic sequences that were previously recalcitrant to routine assessment.
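
The contrast between 9.7% of the reference and 91.4% of lrWGS-specific deletions can be read as a density comparison. The sketch below is illustrative arithmetic only, not a statistic reported by the authors: the function name `fold_enrichment` is hypothetical, and it assumes deletions are simply counted per unit of reference sequence inside versus outside SD/SR regions.

```python
def fold_enrichment(frac_events_in_region: float, frac_genome_in_region: float) -> float:
    """Ratio of event density inside a region class to event density outside it.

    Both arguments are fractions in the open interval (0, 1).
    """
    for value in (frac_events_in_region, frac_genome_in_region):
        if not 0.0 < value < 1.0:
            raise ValueError("fractions must lie strictly between 0 and 1")
    # Density inside: share of events divided by share of sequence.
    density_inside = frac_events_in_region / frac_genome_in_region
    # Density outside: the complementary shares.
    density_outside = (1.0 - frac_events_in_region) / (1.0 - frac_genome_in_region)
    return density_inside / density_outside


if __name__ == "__main__":
    # Values quoted in the abstract: 91.4% of lrWGS-specific deletions
    # fall in the 9.7% of GRCh38 annotated as SD/SR.
    print(round(fold_enrichment(0.914, 0.097), 1))
```

Under these assumptions the per-base density of lrWGS-specific deletions is roughly two orders of magnitude higher inside SD/SR sequence than outside it, which is consistent with the abstract's statement that detection power varied dramatically by genomic context.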

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2021.03.014
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8206509
May 2021

2020 William Allan Award introduction: Mary-Claire King.

Authors:
Evan E Eichler

Am J Hum Genet 2021 03;108(3):383-385

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.

This article is based on the address given by the author at the 2020 virtual meeting of the American Society of Human Genetics (ASHG) on October 26, 2020. The video of the original address can be found at the ASHG website. Photo credit: Clare McLean.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2020.12.011
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8175867
March 2021

Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Science 2021 04 25;372(6537). Epub 2021 Feb 25.

New York Genome Center, New York, NY 10013, USA.

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
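
The contiguity measure described in parentheses above (the minimum contig length needed to cover 50% of the genome) is commonly called N50. The following is a minimal sketch of that calculation, assuming the total is taken as the sum of the contig lengths supplied; substituting a fixed genome size for that total would give the related NG50 instead. The function name `n50` and the toy lengths are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the largest length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare in integers to avoid floating-point rounding at the midpoint.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_assembly = [80, 70, 50, 40, 30, 20]  # hypothetical contig lengths
    print(n50(toy_assembly))
```

For the toy assembly the total is 290, and the two longest contigs (80 + 70 = 150) are the first to reach half of it, so the N50 is 70.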

Source
DOI: http://dx.doi.org/10.1126/science.abf7117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8026704
April 2021

SPEN haploinsufficiency causes a neurodevelopmental disorder overlapping proximal 1p36 deletion syndrome with an episignature of X chromosomes in females.

Am J Hum Genet 2021 03 16;108(3):502-516. Epub 2021 Feb 16.

Division of Medical Genetics, Department of Pediatrics, UPMC Children's Hospital of Pittsburgh, Pittsburgh, PA 15224, USA.

Deletion 1p36 (del1p36) syndrome is the most common human disorder resulting from a terminal autosomal deletion. This condition is molecularly and clinically heterogeneous. Deletions involving two non-overlapping regions, known as the distal (telomeric) and proximal (centromeric) critical regions, are sufficient to cause the majority of the recurrent clinical features, although with different facial features and dysmorphisms. SPEN encodes a transcriptional repressor commonly deleted in proximal del1p36 syndrome and is located centromeric to the proximal 1p36 critical region. Here, we used clinical data from 34 individuals with truncating variants in SPEN to define a neurodevelopmental disorder presenting with features that overlap considerably with those of proximal del1p36 syndrome. The clinical profile of this disease includes developmental delay/intellectual disability, autism spectrum disorder, anxiety, aggressive behavior, attention deficit disorder, hypotonia, brain and spine anomalies, congenital heart defects, high/narrow palate, facial dysmorphisms, and obesity/increased BMI, especially in females. SPEN also emerges as a relevant gene for del1p36 syndrome by co-expression analyses. Finally, we show that haploinsufficiency of SPEN is associated with a distinctive DNA methylation episignature of the X chromosome in affected females, providing further evidence of a specific contribution of the protein to the epigenetic control of this chromosome, and a paradigm of an X chromosome-specific episignature that classifies syndromic traits. We conclude that SPEN is required for multiple developmental processes and SPEN haploinsufficiency is a major contributor to a disorder associated with deletions centromeric to the previously established 1p36 critical regions.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2021.01.015
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8008487
March 2021

Human disease genes website series: An international, open and dynamic library for up-to-date clinical information.

Am J Med Genet A 2021 04 13;185(4):1039-1046. Epub 2021 Jan 13.

Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud university medical center, Nijmegen, The Netherlands.

Since the introduction of next-generation sequencing, an increasing number of disorders have been discovered to have genetic etiology. To address diverse clinical questions and coordinate research activities that arise with the identification of these rare disorders, we developed the Human Disease Genes website series (HDG website series): an international digital library that records detailed information on the clinical phenotype of novel genetic variants in the human genome (https://humandiseasegenes.info/). Each gene website is moderated by a dedicated team of clinicians and researchers, focused on specific genes, and provides up-to-date (including unpublished) clinical information. The HDG website series is expanding rapidly with 424 genes currently adopted by 325 moderators from across the globe. On average, a gene website has detailed phenotypic information of 14.4 patients. There are multiple examples of added value, one being the ARID1B gene website, which was recently utilized in research to collect clinical information of 81 new patients. Additionally, several gene websites have more data available than currently published in the literature. In conclusion, the HDG website series provides an easily accessible, open and up-to-date clinical data resource for patients with pathogenic variants of individual genes. This is a valuable resource not only for clinicians dealing with rare genetic disorders such as developmental delay and autism, but also for other professionals working in diagnostics and basic research. Since the HDG website series is a dynamic platform, its data also include the phenotype of yet unpublished patients curated by professionals providing higher quality clinical detail to improve management of these rare disorders.

Source
DOI: http://dx.doi.org/10.1002/ajmg.a.62057
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7986414
April 2021

Sequence diversity analyses of an improved rhesus macaque genome enhance its biomedical utility.

Science 2020 12;370(6523)

Department of Biology, University of Bari 'Aldo Moro', 70125 Bari, Italy.

The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.

Source
DOI: http://dx.doi.org/10.1126/science.abc6617
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7818670
December 2020

Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads.

Nat Biotechnol 2021 03 7;39(3):302-308. Epub 2020 Dec 7.

Heinrich Heine University Düsseldorf, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Düsseldorf, Germany.

Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing with continuous long-read or high-fidelity sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate (quality value > 40) and highly contiguous (contig N50 > 23 Mbp) with low switch error rates (0.17%), providing fully phased single-nucleotide variants, indels and structural variants. A comparison of Oxford Nanopore Technologies and Pacific Biosciences phased assemblies identified 154 regions that are preferential sites of contig breaks, irrespective of sequencing technology or phasing algorithms.

Source
DOI: http://dx.doi.org/10.1038/s41587-020-0719-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7954704
March 2021

Brief Report: Associations Between Self-injurious Behaviors and Abdominal Pain Among Individuals with ASD-Associated Disruptive Mutations.

J Autism Dev Disord 2020 Nov 11. Epub 2020 Nov 11.

Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, Washington, USA.

Self-injurious behaviors (SIB) are elevated in autism spectrum disorder (ASD) and related genetic disorders, but the genetic and biological mechanisms that contribute to SIB in ASD are poorly understood. This study examined rates and predictors of SIB in 112 individuals with disruptive mutations to ASD-risk genes. Current SIB were reported in 30% of participants and associated with poorer cognitive and adaptive skills. History of severe abdominal pain predicted higher rates of SIB and SIB severity after controlling for age and adaptive behavior; individuals with a history of severe abdominal pain were eight times more likely to exhibit SIB than those with no history. Future research is needed to examine associations between genetic risk, pain, and SIB in this population.

Source
DOI: http://dx.doi.org/10.1007/s10803-020-04774-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8110605
November 2020

NCKAP1 Disruptive Variants Lead to a Neurodevelopmental Disorder with Core Features of Autism.

Am J Hum Genet 2020 11;107(5):963-976

Service of Endocrinology, Diabetology, and Metabolism, Lausanne University Hospital, Lausanne 1011, Switzerland.

NCKAP1/NAP1 regulates neuronal cytoskeletal dynamics and is essential for neuronal differentiation in the developing brain. Deleterious variants in NCKAP1 have been identified in individuals with autism spectrum disorder (ASD) and intellectual disability; however, its clinical significance remains unclear. To determine its significance, we assemble genotype and phenotype data for 21 affected individuals from 20 unrelated families with predicted deleterious variants in NCKAP1. This includes 16 individuals with de novo (n = 8), transmitted (n = 6), or inheritance unknown (n = 2) truncating variants, two individuals with structural variants, and three with potentially disruptive de novo missense variants. We report a de novo and ultra-rare deleterious variant burden of NCKAP1 in individuals with neurodevelopmental disorders, which needs further replication. ASD or autistic features, language and motor delay, and variable expression of intellectual or learning disability are common clinical features. Among inherited cases, there is evidence of deleterious variants segregating with neuropsychiatric disorders. Based on available human brain transcriptomic data, we show that NCKAP1 is broadly and highly expressed in both prenatal and postnatal periods and demonstrate enriched expression in excitatory neurons and radial glia but depleted expression in inhibitory neurons. Mouse in utero electroporation experiments reveal that Nckap1 loss of function promotes neuronal migration during early cortical development. Combined, these data support a role for disruptive NCKAP1 variants in neurodevelopmental delay/autism, possibly by interfering with neuronal migration early in cortical development.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2020.10.002
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7674997
November 2020

Single-cell strand sequencing of a macaque genome reveals multiple nested inversions and breakpoint reuse during primate evolution.

Genome Res 2020 11 22;30(11):1680-1693. Epub 2020 Oct 22.

Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy.

Rhesus macaque is an Old World monkey that shared a common ancestor with human ∼25 Myr ago and is an important animal model for human disease studies. A deep understanding of its genetics is therefore required for both biomedical and evolutionary studies. Among structural variants, inversions represent a driving force in speciation and play an important role in disease predisposition. Here we generated a genome-wide map of inversions between human and macaque, combining single-cell strand sequencing with cytogenetics. We identified 375 total inversions between 859 bp and 92 Mbp, increasing by eightfold the number of previously reported inversions. Among these, 19 inversions flanked by segmental duplications overlap with recurrent copy number variants associated with neurocognitive disorders. Evolutionary analyses show that in 17 out of 19 cases, the Hominidae orientation of these disease-associated regions is always derived. This suggests that duplicated sequences likely played a fundamental role in generating inversions in humans and great apes, creating architectures that nowadays predispose these regions to disease-associated genetic instability. Finally, we identified 861 genes mapping at 156 inversion breakpoints, with some showing evidence of differential expression in human and macaque cell lines, thus highlighting candidates that might have contributed to the evolution of species-specific features. This study depicts the most accurate fine-scale map of inversions between human and macaque using a two-pronged integrative approach, combining single-cell strand sequencing and cytogenetics, and represents a valuable resource toward understanding the biology and evolution of primate species.

Source
DOI: http://dx.doi.org/10.1101/gr.265322.120
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7605249
November 2020

A Novel Framework for Characterizing Genomic Haplotype Diversity in the Human Immunoglobulin Heavy Chain Locus.

Front Immunol 2020 23;11:2136. Epub 2020 Sep 23.

Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY, United States.

An incomplete ascertainment of genetic variation within the highly polymorphic immunoglobulin heavy chain locus (IGH) has hindered our ability to define genetic factors that influence antibody-mediated processes. Due to locus complexity, standard high-throughput approaches have failed to accurately and comprehensively capture IGH polymorphism. As a result, the locus has only been fully characterized two times, severely limiting our knowledge of human IGH diversity. Here, we combine targeted long-read sequencing with a novel bioinformatics tool, IGenotyper, to fully characterize IGH variation in a haplotype-specific manner. We apply this approach to eight human samples, including a haploid cell line and two mother-father-child trios, and demonstrate the ability to generate high-quality assemblies (>98% complete and >99% accurate), genotypes, and gene annotations, identifying 2 novel structural variants and 15 novel IGH alleles. We show multiplexing allows for scaling of the approach without impacting data quality, and that our genotype call sets are more accurate than short-read (>35% increase in true positives and >97% decrease in false-positives) and array/imputation-based datasets. This framework establishes a desperately needed foundation for leveraging IG genomic data to study population-level variation in antibody-mediated immunity, critical for bettering our understanding of disease risk, and responses to vaccines and therapeutics.

Source
DOI: http://dx.doi.org/10.3389/fimmu.2020.02136
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7539625
May 2021

Large-scale targeted sequencing identifies risk genes for neurodevelopmental disorders.

Nat Commun 2020 Oct 1;11(1):4932. Epub 2020 Oct 1.

Oasi Research Institute-IRCCS, Troina, Italy.

Most genes associated with neurodevelopmental disorders (NDDs) were identified with an excess of de novo mutations (DNMs) but the significance in case-control mutation burden analysis is unestablished. Here, we sequence 63 genes in 16,294 NDD cases and an additional 62 genes in 6,211 NDD cases. By combining these with published data, we assess a total of 125 genes in over 16,000 NDD cases and compare the mutation burden to nonpsychiatric controls from ExAC. We identify 48 genes (25 newly reported) showing significant burden of ultra-rare (MAF < 0.01%) gene-disruptive mutations (FDR 5%), six of which reach family-wise error rate (FWER) significance (p < 1.25E-06). Among these 125 targeted genes, we also reevaluate DNM excess in 17,426 NDD trios with 6,499 new autism trios. We identify 90 genes enriched for DNMs (FDR 5%; e.g., GABRG2 and UIMC1); of which, 61 reach FWER significance (p < 3.64E-07; e.g., CASZ1). In addition to doubling the number of patients for many NDD risk genes, we present phenotype-genotype correlations for seven risk genes (CTCF, HNRNPU, KCNQ3, ZBTB18, TCF12, SPEN, and LEO1) based on this large-scale targeted sequencing effort.

Source
DOI: http://dx.doi.org/10.1038/s41467-020-18723-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7530681
October 2020

Developmental Predictors of Cognitive and Adaptive Outcomes in Genetic Subtypes of Autism Spectrum Disorder.

Autism Res 2020 Oct 12;13(10):1659-1669. Epub 2020 Sep 12.

Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA.

Approximately one-fourth of autism spectrum disorder (ASD) cases are associated with a disruptive genetic variant. Many of these ASD genotypes have been described previously, and are characterized by unique constellations of medical, psychiatric, developmental, and behavioral features. Development of precision medicine care for affected individuals has been challenging due to the phenotypic heterogeneity that exists even within each genetic subtype. In the present study, we identify developmental milestones that predict cognitive and adaptive outcomes for five of the most common ASD genotypes. Sixty-five youth with a known pathogenic variant involving ADNP, CHD8, DYRK1A, GRIN2B, or SCN2A genes participated in cognitive and adaptive testing. Exploratory linear regressions were used to identify developmental milestones that predicted cognitive and adaptive outcomes within each gene group. We hypothesized that the earliest and most predictive milestones would vary across gene groups, but would be consistent across outcomes within each genetic subtype. Within the ADNP group, age of walking predicted cognitive outcomes, while age of first words predicted adaptive behaviors. Age of phrases predicted adaptive functioning in the CHD8 group, but cognitive outcomes were not clearly associated with early developmental milestones. Verbal milestones were the strongest predictors of cognitive and adaptive outcomes for individuals with mutations to DYRK1A, GRIN2B, or SCN2A. These trends inform decisions about treatment planning and long-term expectations for affected individuals, and they add to the growing body of research linking molecular genetic function to brain development and phenotypic outcomes.

LAY SUMMARY: Researchers have found many genetic causes of autism including mutations to ADNP, CHD8, DYRK1A, GRIN2B, and SCN2A genes. We found that each genetic cause had different early developmental milestones that explained the overall functioning of the children when they were older. Depending on the genetic cause, the age at which a child first starts walking and/or talking may help to better understand and support the development of a child who has a mutation to one of the above genes. Autism Res 2020, 13: 1659-1669. © 2020 International Society for Autism Research and Wiley Periodicals LLC.

Source
DOI: http://dx.doi.org/10.1002/aur.2385
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7861657
October 2020

HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

Genome Res 2020 Sep 14;30(9):1291-1305. Epub 2020 Aug 14.

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA.

Complete and accurate genome assemblies form the basis of most downstream genomic analyses and are of critical importance. Recent genome assembly projects have relied on a combination of noisy long-read sequencing and accurate short-read sequencing, with the former offering greater assembly continuity and the latter providing higher consensus accuracy. The recently introduced Pacific Biosciences (PacBio) HiFi sequencing technology bridges this divide by delivering long reads (>10 kbp) with high per-base accuracy (>99.9%). Here we present HiCanu, a modification of the Canu assembler designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering. We benchmark HiCanu with a focus on the recovery of haplotype diversity, major histocompatibility complex (MHC) variants, satellite DNAs, and segmental duplications. For diploid human genomes sequenced to 30× HiFi coverage, HiCanu achieved superior accuracy and allele recovery compared to the current state of the art. On the effectively haploid CHM13 human cell line, HiCanu achieved an NG50 contig size of 77 Mbp with a per-base consensus accuracy of 99.999% (QV50), surpassing recent assemblies of high-coverage, ultralong Oxford Nanopore Technologies (ONT) reads in terms of both accuracy and continuity. This HiCanu assembly correctly resolves 337 out of 341 validation BACs sampled from known segmental duplications and provides the first preliminary assemblies of nine complete human centromeric regions. Although gaps and errors still remain within the most challenging regions of the genome, these results represent a significant advance toward the complete assembly of human genomes.
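
The abstract above quotes consensus accuracy both as a percentage (99.999%) and as a Phred-scaled quality value (QV50). As a minimal sketch of how the two relate, assuming the standard Phred definition QV = -10 * log10(error rate), the conversion can be written as follows. The function names are illustrative and are not taken from HiCanu.

```python
import math


def accuracy_to_qv(accuracy: float) -> float:
    """Convert a per-base accuracy (a fraction between 0 and 1) to a Phred-scaled QV.

    Assumes the standard Phred definition: QV = -10 * log10(error rate).
    """
    error_rate = 1.0 - accuracy
    if not 0.0 < error_rate <= 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(error_rate)


def qv_to_accuracy(qv: float) -> float:
    """Convert a Phred-scaled QV back to a per-base accuracy fraction."""
    return 1.0 - 10.0 ** (-qv / 10.0)


if __name__ == "__main__":
    # 99.999% accuracy is one expected error per 100,000 bases.
    print(round(accuracy_to_qv(0.99999), 3))
    print(qv_to_accuracy(30))
```

Under this definition each additional 10 QV units corresponds to a tenfold reduction in the expected per-base error rate.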

Source
DOI: http://dx.doi.org/10.1101/gr.263566.120
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7545148
September 2020

An evolutionary driver of interspersed segmental duplications in primates.

Genome Biol 2020 Aug 10;21(1):202. Epub 2020 Aug 10.

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA.

Background: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP).

Results: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis.

Conclusions: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02074-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7419210
August 2020

Evolution of a Human-Specific Tandem Repeat Associated with ALS.

Am J Hum Genet 2020 Sep 3;107(3):445-460. Epub 2020 Aug 3.

Division of Medical Genetics, University of Washington School of Medicine, Seattle, WA 98195, USA; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.

Tandem repeats are proposed to contribute to human-specific traits, and more than 40 tandem repeat expansions are known to cause neurological disease. Here, we characterize a human-specific 69 bp variable number tandem repeat (VNTR) in the last intron of WDR7, which exhibits striking variability in both copy number and nucleotide composition, as revealed by long-read sequencing. In addition, greater repeat copy number is significantly enriched in three independent cohorts of individuals with sporadic amyotrophic lateral sclerosis (ALS). Each unit of the repeat forms a stem-loop structure with the potential to produce microRNAs, and the repeat RNA can aggregate when expressed in cells. We leveraged its remarkable sequence variability to align the repeat in 288 samples and uncover its mechanism of expansion. We found that the repeat expands in the 3'-5' direction, in groups of repeat units divisible by two. The expansion patterns we observed were consistent with duplication events, and a replication error called template switching. We also observed that the VNTR is expanded in both Denisovan and Neanderthal genomes but is fixed at one copy or fewer in non-human primates. Evaluating the repeat in 1000 Genomes Project samples reveals that some repeat segments are solely present or absent in certain geographic populations. The large size of the repeat unit in this VNTR, along with our multiplexed sequencing strategy, provides an unprecedented opportunity to study mechanisms of repeat expansion, and a framework for evaluating the roles of VNTRs in human evolution and disease.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2020.07.004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7477013
September 2020

De novo SMARCA2 variants clustered outside the helicase domain cause a new recognizable syndrome with intellectual disability and blepharophimosis distinct from Nicolaides-Baraitser syndrome.

Genet Med 2020 Nov 22;22(11):1838-1850. Epub 2020 Jul 22.

Department of Genetics, Robert Debré Hospital, AP-HP, Paris, France.

Purpose: Nontruncating variants in SMARCA2, encoding a catalytic subunit of SWI/SNF chromatin remodeling complex, cause Nicolaides-Baraitser syndrome (NCBRS), a condition with intellectual disability and multiple congenital anomalies. Other disorders due to SMARCA2 are unknown.

Methods: By next-generation sequencing, we identified candidate variants in SMARCA2 in 20 individuals from 18 families with a syndromic neurodevelopmental disorder not consistent with NCBRS. To stratify variant interpretation, we functionally analyzed SMARCA2 variants in yeast and performed transcriptomic and genome methylation analyses on blood leukocytes.

Results: Of 20 individuals, 14 showed a recognizable phenotype with recurrent features including epicanthal folds, blepharophimosis, and downturned nasal tip along with a variable degree of intellectual disability (or blepharophimosis intellectual disability syndrome [BIS]). In contrast to most NCBRS variants, all SMARCA2 variants associated with BIS are localized outside the helicase domains. Yeast phenotype assays differentiated NCBRS from non-NCBRS SMARCA2 variants. Transcriptomic and DNA methylation signatures differentiated NCBRS from BIS and from those with a nonspecific phenotype. In the remaining six individuals with nonspecific dysmorphic features, clinical and molecular data did not permit variant reclassification.

Conclusion: We identified a novel recognizable syndrome named BIS associated with clustered de novo SMARCA2 variants outside the helicase domains, phenotypically and molecularly distinct from NCBRS.

Source
DOI: http://dx.doi.org/10.1038/s41436-020-0898-y
November 2020

Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes.

Nat Biotechnol 2020 Sep 4;38(9):1044-1053. Epub 2020 May 4.

Chan Zuckerberg Initiative, Redwood City, CA, USA.

De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced a complete haploid human genome assembly in under 6 h on a single commercial compute node. MarginPolish and HELEN polished haploid assemblies to more than 99.9% identity (Phred quality score QV = 30) with nanopore reads alone. Addition of proximity-ligation sequencing enabled near chromosome-level scaffolds for all 11 genomes. We compare our assembly performance to existing methods for diploid, haploid and trio-binned human samples and report superior accuracy and speed.
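
The abstract above reports a 42-kb read N50. As a minimal sketch, assuming the usual definition (the length L such that reads or contigs of length at least L contain at least half of the total bases), the statistic can be computed as below. The optional `genome_size` argument, which switches the target to half of an assumed genome size and so yields NG50, is an illustrative addition. The function is not part of Shasta.

```python
from typing import Iterable, Optional


def n50(lengths: Iterable[int], genome_size: Optional[int] = None) -> int:
    """Return the N50 of a set of sequence lengths.

    If genome_size is given, the target is half of that size instead of half
    of the summed lengths, which gives NG50. Returns 0 when the sequences
    never reach the target (possible only when genome_size is supplied).
    """
    ordered = sorted(lengths, reverse=True)
    total = genome_size if genome_size is not None else sum(ordered)
    target = total / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running total reaches the target.
        if running >= target:
            return length
    return 0


if __name__ == "__main__":
    toy_lengths = [80, 70, 50, 40, 30, 20]    # hypothetical lengths, total 290
    print(n50(toy_lengths))                   # N50 of the toy set
    print(n50(toy_lengths, genome_size=400))  # NG50 against an assumed genome size
```

Because N50 depends on the summed lengths of the set itself, it cannot be compared fairly across sets of very different total size. NG50 avoids this by fixing the denominator.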

Source
DOI: http://dx.doi.org/10.1038/s41587-020-0503-6
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7483855
September 2020

Telomere-to-telomere assembly of a complete human X chromosome.

Nature 2020 Sep 14;585(7823):79-84. Epub 2020 Jul 14.

Arima Genomics, San Diego, CA, USA.

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist. Here we present a human genome assembly that surpasses the continuity of GRCh38, along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome, we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2547-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7484160
September 2020

Evaluating heterogeneity in ASD symptomatology, cognitive ability, and adaptive functioning among 16p11.2 CNV carriers.

Autism Res 2020 Aug 28;13(8):1300-1310. Epub 2020 Jun 28.

Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, Washington, USA.

Individuals with a 16p11.2 copy number variant (CNV) show considerable phenotypic heterogeneity. Although autism spectrum disorder (ASD) is reported in approximately 20-23% of individuals with 16p11.2 CNVs, ASD-associated symptoms are observed in those without a clinical ASD diagnosis. Previous work has shown that genetic variation and prenatal and perinatal birth complications influence ASD risk and symptom severity. This study examined the impact of genetic and environmental risk factors on phenotypic heterogeneity among 16p11.2 CNV carriers. Participants included individuals with a 16p11.2 deletion (N = 96) or duplication (N = 77) with exome sequencing from the Simons VIP study. The presence of prenatal factors, perinatal events, additional genetic events, and gender was studied. Regression analyses examined the contribution of each risk factor to ASD symptomatology, cognitive functioning, and adaptive abilities. For deletion carriers, perinatal and additional genetic events were associated with increased ASD symptomatology and decrements in cognitive and adaptive functioning. For duplication carriers, secondary genetic events were associated with greater cognitive impairments. Female sex was a protective factor for both deletion and duplication carriers. Our findings suggest that ASD-associated risk factors contribute to the variability in symptom presentation in individuals with 16p11.2 CNVs.

LAY SUMMARY: There is a wide range of autism spectrum disorder (ASD) symptoms and abilities observed for individuals with genetic changes of the 16p11.2 region. Here, we found perinatal complications contributed to more severe ASD symptoms (deletion carriers) and additional genetic mutations contributed to decreased cognitive abilities (deletion and duplication carriers). A potential protective factor was also observed for females with 16p11.2 variations. Autism Res 2020, 13: 1300-1310. © 2020 International Society for Autism Research, Wiley Periodicals, Inc.

Source
DOI: http://dx.doi.org/10.1002/aur.2332
August 2020