Publications by authors named "Egor Dolzhenko"

24 Publications

Long-Term m5C Methylome Dynamics Parallel Phenotypic Adaptation in the Cyanobacterium Trichodesmium.

Mol Biol Evol 2021 03;38(3):927-939

Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA.

A major challenge in modern biology is understanding how the effects of short-term biological responses influence long-term evolutionary adaptation, defined as a genetically determined increase in fitness to novel environments. This is particularly important in globally important microbes experiencing rapid global change, due to their influence on food webs, biogeochemical cycles, and climate. Epigenetic modifications like methylation have been demonstrated to influence short-term plastic responses, which ultimately impact long-term adaptive responses to environmental change. However, there remains a paucity of empirical research examining long-term methylation dynamics during environmental adaptation in nonmodel, ecologically important microbes. Here, we show the first empirical evidence in a marine prokaryote for long-term m5C methylome modifications correlated with phenotypic adaptation to CO2, using a 7-year evolution experiment (1,000+ generations) with the biogeochemically important marine cyanobacterium Trichodesmium. We identify m5C methylated sites that rapidly changed in response to high (750 µatm) CO2 exposure and were maintained for at least 4.5 years of CO2 selection. After 7 years of CO2 selection, however, m5C methylation levels that initially responded to high-CO2 returned to ancestral, ambient CO2 levels. Concurrently, high-CO2 adapted growth and N2 fixation rates remained significantly higher than those of ambient CO2 adapted cell lines irrespective of CO2 concentration, a trend consistent with genetic assimilation theory. These data demonstrate the maintenance of CO2-responsive m5C methylation for 4.5 years alongside phenotypic adaptation before returning to ancestral methylation levels. These observations in a globally distributed marine prokaryote provide critical evolutionary insights into biogeochemically important traits under global change.

Source
DOI: http://dx.doi.org/10.1093/molbev/msaa256
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947765
March 2021

Repeat expansions confer WRN dependence in microsatellite-unstable cancers.

Nature 2020 10 30;586(7828):292-298. Epub 2020 Sep 30.

Department of Oncology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, UK.

The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair. Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides. Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2769-8
October 2020

Large scale in silico characterization of repeat expansion variation in human genomes.

Sci Data 2020 09 8;7(1):294. Epub 2020 Sep 8.

Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, USA.

Significant progress has been made in elucidating single nucleotide polymorphism diversity in the human population. However, the majority of the variation space in the genome is structural and remains partially elusive. One form of structural variation is tandem repeats (TRs). Expansions of TRs are responsible for over 40 diseases, but we hypothesize these represent only a fraction of the pathogenic repeat expansions that exist. Here we characterize long or expanded TR variation in 1,115 human genomes as well as a replication cohort of 2,504 genomes, identified using ExpansionHunter Denovo. We found that individual genomes typically harbor several rare, large TRs, generally in non-coding regions of the genome. We noticed that these large TRs are enriched in their proximity to Alu elements. The vast majority of these large TRs seem to be expansions of smaller TRs that are already present in the reference genome. We are providing this TR profile as a resource for comparison to undiagnosed rare disease genomes in order to detect novel disease-causing repeat expansions.

Source
DOI: http://dx.doi.org/10.1038/s41597-020-00633-9
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7479135
September 2020

Genome-wide detection of tandem DNA repeats that are expanded in autism.

Nature 2020 10 27;586(7827):80-86. Epub 2020 Jul 27.

Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada.

Tandem DNA repeats vary in the size and sequence of each unit (motif). When expanded, these tandem DNA repeats have been associated with more than 40 monogenic disorders. Their involvement in disorders with complex genetics is largely unknown, as is the extent of their heterogeneity. Here we investigated the genome-wide characteristics of tandem repeats that had motifs with a length of 2-20 base pairs in 17,231 genomes of families containing individuals with autism spectrum disorder (ASD) and population control individuals. We found extensive polymorphism in the size and sequence of motifs. Many of the tandem repeat loci that we detected correlated with cytogenetic fragile sites. At 2,588 loci, gene-associated expansions of tandem repeats that were rare among population control individuals were significantly more prevalent among individuals with ASD than their siblings without ASD, particularly in exons and near splice junctions, and in genes related to the development of the nervous system and cardiovascular system or muscle. Rare tandem repeat expansions had a prevalence of 23.3% in children with ASD compared with 20.7% in children without ASD, which suggests that tandem repeat expansions make a collective contribution to the risk of ASD of 2.6%. These rare tandem repeat expansions included previously undescribed ASD-linked expansions in DMPK and FXN, which are associated with neuromuscular conditions, and in previously unknown loci such as FGF14 and CACNB1. Rare tandem repeat expansions were associated with lower IQ and adaptive ability. Our results show that tandem DNA repeat expansions contribute strongly to the genetic aetiology and phenotypic complexity of ASD.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2579-z
October 2020

ExpansionHunter Denovo: a computational method for locating known and novel repeat expansions in short-read sequencing data.

Genome Biol 2020 04 28;21(1):102. Epub 2020 Apr 28.

Illumina Inc., 5200 Illumina Way, San Diego, CA, 92122, USA.

Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats. To address this need, we introduce ExpansionHunter Denovo, an efficient catalog-free method for genome-wide repeat expansion detection. Analysis of real and simulated data shows that our method can identify large expansions of 41 out of 44 pathogenic repeats, including nine recently reported non-reference repeat expansions not discoverable via existing methods.

Source
DOI: http://dx.doi.org/10.1186/s13059-020-02017-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7187524
April 2020

Genome sequencing in persistently unsolved white matter disorders.

Ann Clin Transl Neurol 2020 01 7;7(1):144-152. Epub 2020 Jan 7.

Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.

Genetic white matter disorders have heterogeneous etiologies and overlapping clinical presentations. We performed a study of the diagnostic efficacy of genome sequencing in 41 unsolved cases with prior exome sequencing, resolving an additional 14 from an historical cohort (n = 191). Reanalysis in the context of novel disease-associated genes and improved variant curation and annotation resolved 64% of cases. The remaining diagnoses were directly attributable to genome sequencing, including cases with small and large copy number variants (CNVs) and variants in deep intronic and technically difficult regions. Genome sequencing, in combination with other methodologies, achieved a diagnostic yield of 85% in this retrospective cohort.

Source
DOI: http://dx.doi.org/10.1002/acn3.50957
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6952322
January 2020

Paragraph: a graph-based structural variant genotyper for short-read sequence data.

Genome Biol 2019 12 19;20(1):291. Epub 2019 Dec 19.

Illumina Inc, 5200 Illumina Way, San Diego, CA, USA.

Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies.

Source
DOI: http://dx.doi.org/10.1186/s13059-019-1909-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6921448
December 2019

Bioinformatics-Based Identification of Expanded Repeats: A Non-reference Intronic Pentamer Expansion in RFC1 Causes CANVAS.

Am J Hum Genet 2019 07 20;105(1):151-165. Epub 2019 Jun 20.

McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.

Genomic technologies such as next-generation sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG) short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS-affected families and identified a core ancestral haplotype, estimated to have arisen in Europe more than twenty-five thousand years ago. WGS of the four RFC1-negative CANVAS-affected families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type, and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2019.05.016
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6612533
July 2019

ExpansionHunter: a sequence-graph-based tool to analyze variation in short tandem repeat regions.

Bioinformatics 2019 11;35(22):4754-4756

Illumina Inc., San Diego, CA 92122, USA.

Summary: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci.

Availability And Implementation: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btz431
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853681
November 2019

Length of Uninterrupted CAG, Independent of Polyglutamine Size, Results in Increased Somatic Instability, Hastening Onset of Huntington Disease.

Am J Hum Genet 2019 06 16;104(6):1116-1126. Epub 2019 May 16.

Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4, Canada.

Huntington disease (HD) is caused by a CAG repeat expansion in the huntingtin (HTT) gene. Although the length of this repeat is inversely correlated with age of onset (AOO), it does not fully explain the variability in AOO. We assessed the sequence downstream of the CAG repeat in HTT [reference: (CAG)n-CAA-CAG], since variants within this region have been previously described, but no study of AOO has been performed. These analyses identified a variant that results in complete loss of interrupting (LOI) adenine nucleotides in this region [(CAG)n-CAG-CAG]. Analysis of multiple HD pedigrees showed that this LOI variant is associated with dramatically earlier AOO (average of 25 years) despite the same polyglutamine length as in individuals with the interrupting penultimate CAA codon. This LOI allele is particularly frequent in persons with reduced penetrance alleles who manifest with HD and increases the likelihood of presenting clinically with HD with a CAG of 36-39 repeats. Further, we show that the LOI variant is associated with increased somatic repeat instability, highlighting this as a significant driver of this effect. These findings indicate that the number of uninterrupted CAG repeats, which is lengthened by the LOI, is the most significant contributor to AOO of HD and is more significant than polyglutamine length, which is not altered in these individuals. In addition, we identified another variant in this region, where the CAA-CAG sequence is duplicated, which was associated with later AOO. Identification of these cis-acting modifiers has potentially important implications for genetic counselling in HD-affected families.
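The distinction the abstract draws between uninterrupted CAG length and polyglutamine length can be illustrated with a small sketch. It is not the authors' analysis: the function name `tract_lengths`, the toy sequences, and the repeat count of 40 are all assumptions chosen for illustration. The only facts it relies on are that CAG and CAA both encode glutamine and that the allele structures are those given in the abstract.

```python
from typing import Tuple

GLUTAMINE_CODONS = ("CAG", "CAA")  # both codons encode glutamine


def tract_lengths(seq: str) -> Tuple[int, int]:
    """Return (uninterrupted CAG codons, polyglutamine codons), counted from the start of seq."""
    # Split the sequence into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

    # Count leading CAG codons until the first codon that is not CAG.
    uninterrupted_cag = 0
    for codon in codons:
        if codon != "CAG":
            break
        uninterrupted_cag += 1

    # Count leading glutamine codons (CAG or CAA) until the first other codon.
    polyglutamine = 0
    for codon in codons:
        if codon not in GLUTAMINE_CODONS:
            break
        polyglutamine += 1

    return uninterrupted_cag, polyglutamine


n = 40  # arbitrary illustrative repeat count, not taken from the study
reference_allele = "CAG" * n + "CAA" + "CAG"                    # (CAG)n-CAA-CAG
loi_allele = "CAG" * n + "CAG" + "CAG"                          # (CAG)n-CAG-CAG
duplication_allele = "CAG" * n + "CAA" + "CAG" + "CAA" + "CAG"  # duplicated CAA-CAG

for name, allele in [
    ("reference", reference_allele),
    ("LOI", loi_allele),
    ("CAA-CAG duplication", duplication_allele),
]:
    print(name, tract_lengths(allele))
```

In this toy setup the reference and LOI alleles encode the same number of glutamines, but the LOI allele has two more uninterrupted CAG codons, which is the quantity the abstract identifies as the stronger contributor to age of onset.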

Source
DOI: http://dx.doi.org/10.1016/j.ajhg.2019.04.007
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6556907
June 2019

Glutaminase Deficiency Caused by Short Tandem Repeat Expansion in GLS.

N Engl J Med 2019 04;380(15):1433-1441

From Amsterdam University Medical Centers, University of Amsterdam, Departments of Clinical Chemistry, Pediatrics, and Clinical Genetics, Emma Children's Hospital, Amsterdam Gastroenterology and Metabolism (A.B.P.K., R.L., J.K., J. Meijer, L.A.T., M.T., M.W., R.J.A.W., H.R.W., C.D.M.K.), and United for Metabolic Diseases (A.B.P.K., R.J.A.W., H.R.W., C.D.M.K.), Amsterdam, and the Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht (J.J.F.A.V., J.H.V.), and the Project MinE ALS Sequencing Consortium (J.J.F.A.V., J.H.V.), Utrecht - all in the Netherlands; the Departments of Biochemistry and Molecular Biology and Medical Genetics, Cumming School of Medicine, and Alberta Children's Hospital Research Institute, University of Calgary, Calgary (M.T.-G.), Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute (P.A.R., M.J.J., M.S.K., J. MacIsaac, W.W.W., C.D.M.K.), the Faculty of Pharmaceutical Sciences (B.I.D., G.E.B.W., C.J.R.), and the Departments of Medical Genetics (C.M., I.-S.R.-B., W.W.W.) and Pediatrics (C.D.M.K.), University of British Columbia, Vancouver, the Zebrafish Centre for Advanced Drug Discovery, St. Michael's Hospital and University of Toronto (K.B.-A., F.K., M.L., Y.W., X.-Y.W.), the Centre for Applied Genomics, Genetics and Genome Biology, the Hospital for Sick Children (C.N., S.W.S., B.T., R.K.C.Y.), and the Department of Molecular Genetics (C.N., S.W.S., R.K.C.Y.), the McLaughlin Centre (S.W.S.), and the Departments of Medicine, Physiology, and Laboratory Medicine and Pathobiology, Institute of Medical Science (X.-Y.W.), University of Toronto, Toronto, and the Division of Medical Genetics, Department of Pediatrics, Children's Hospital Eastern Ontario, University of Ottawa, Ottawa (J.S.W., M.T.G.) - all in Canada; the Departments of Medicine and Physiology, National University of Singapore (M.A.P.), and the Translational Laboratory in Genetic Medicine, Agency for Science, Technology, and Research (M.A.P., B.S., X.X., J.Z.) - both in Singapore; Uppsala University, Department of Chemistry-Biomedical Center, Uppsala, Sweden (D.D.); Illumina, San Diego, CA (E.D., M.A.E.); Gene Structure and Disease Section, Laboratory of Cell and Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD (B.H., D.K., K.U.); and the Department of Clinical Inherited Metabolic Disorders, Birmingham Children's Hospital, Birmingham, United Kingdom (S.S.).

We report an inborn error of metabolism caused by an expansion of a GCA-repeat tract in the 5' untranslated region of the gene encoding glutaminase (GLS) that was identified through detailed clinical and biochemical phenotyping, combined with whole-genome sequencing. The expansion was observed in three unrelated patients who presented with an early-onset delay in overall development, progressive ataxia, and elevated levels of glutamine. In addition to ataxia, one patient also showed cerebellar atrophy. The expansion was associated with a relative deficiency of messenger RNA transcribed from the expanded allele, which probably resulted from repeat-mediated chromatin changes upstream of the repeat. Our discovery underscores the importance of careful examination of regions of the genome that are typically excluded from or poorly captured by exome sequencing.

Source
DOI: http://dx.doi.org/10.1056/NEJMoa1806627
April 2019

Copy-number variants in clinical genome sequencing: deployment and interpretation for rare and undiagnosed disease.

Genet Med 2019 05 8;21(5):1121-1130. Epub 2018 Oct 8.

Illumina Inc., San Diego, CA, USA.

Purpose: Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test.

Methods: We performed analytical validation of CNV calling on 17 reference samples, compared the sensitivity of GS-based variants with those from a clinical microarray, and set a bound on precision using orthogonal technologies. We developed a protocol for family-based analysis of GS-based CNV calls, and deployed this across a clinical cohort of 79 rare and undiagnosed cases.

Results: We found that CNV calls from GS are at least as sensitive as those from microarrays, while only creating a modest increase in the number of variants interpreted (~10 CNVs per case). We identified clinically significant CNVs in 15% of the first 79 cases analyzed, all of which were confirmed by an orthogonal approach. The pipeline also enabled discovery of a uniparental disomy (UPD) and a 50% mosaic trisomy 14. Directed analysis of select CNVs enabled breakpoint level resolution of genomic rearrangements and phasing of de novo CNVs.

Conclusion: Robust identification of CNVs by GS is possible within a clinical testing environment.

Source
DOI: http://dx.doi.org/10.1038/s41436-018-0295-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6752263
May 2019

Molecular characterization of the transition from acute to chronic kidney injury following ischemia/reperfusion.

JCI Insight 2017 09 21;2(18). Epub 2017 Sep 21.

Department of Stem Cell Biology and Regenerative Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, California, USA.

Though an acute kidney injury (AKI) episode is associated with an increased risk of chronic kidney disease (CKD), the mechanisms determining the transition from acute to irreversible chronic injury are not well understood. To extend our understanding of renal repair, and its limits, we performed a detailed molecular characterization of a murine ischemia/reperfusion injury (IRI) model for 12 months after injury. Together, the data comprising RNA-sequencing (RNA-seq) analysis at multiple time points, histological studies, and molecular and cellular characterization of targeted gene activity provide a comprehensive profile of injury, repair, and long-term maladaptive responses following IRI. Tubular atrophy, interstitial fibrosis, inflammation, and development of multiple renal cysts were major long-term outcomes of IRI. Progressive proximal tubular injury tracks with de novo activation of multiple Krt genes, including Krt20, a biomarker of renal tubule injury. RNA-seq analysis highlights a cascade of temporal-specific gene expression patterns related to tubular injury/repair, fibrosis, and innate and adaptive immunity. Intersection of these data with human kidney transplant expression profiles identified overlapping gene expression signatures correlating with different stages of the murine IRI response. The comprehensive characterization of incomplete recovery after ischemic AKI provides a valuable resource for determining the underlying pathophysiology of human CKD.

Source
DOI: http://dx.doi.org/10.1172/jci.insight.94716
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5612583
September 2017

Biogeographic conservation of the cytosine epigenome in the globally important marine, nitrogen-fixing cyanobacterium Trichodesmium.

Environ Microbiol 2017 11 2;19(11):4700-4713. Epub 2017 Nov 2.

Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA.

Cytosine methylation has been shown to regulate essential cellular processes and impact biological adaptation. Despite its evolutionary importance, only a handful of bacterial, genome-wide cytosine studies have been conducted, with none for marine bacteria. Here, we examine the genome-wide, C5-Methyl-cytosine (m5C) methylome and its correlation to global transcription in the marine nitrogen-fixing cyanobacterium Trichodesmium. We characterize genome-wide methylation and highlight conserved motifs across three Trichodesmium isolates and two Trichodesmium metagenomes, thereby identifying highly conserved, novel genomic signatures of potential gene regulation in Trichodesmium. Certain gene bodies with the highest methylation levels correlate with lower expression levels. Several methylated motifs were highly conserved across spatiotemporally separated Trichodesmium isolates, thereby elucidating biogeographically conserved methylation potential. These motifs were also highly conserved in Trichodesmium metagenomic samples from natural populations suggesting them to be potential in situ markers of m5C methylation. Using these data, we highlight predicted roles of cytosine methylation in global cellular metabolism providing evidence for a 'core' m5C methylome spanning different ocean regions. These results provide important insights into the m5C methylation landscape and its biogeochemical implications in an important marine N2-fixer, as well as advancing evolutionary theory examining methylation influences on adaptation.

Source
DOI: http://dx.doi.org/10.1111/1462-2920.13934
November 2017

Detection of long repeat expansions from PCR-free whole-genome sequence data.

Genome Res 2017 11 8;27(11):1895-1903. Epub 2017 Sep 8.

Department of Basic and Clinical Neuroscience, Maurice Wohl Clinical Neuroscience Institute, King's College London, London SE5 9RX, United Kingdom.

Identifying large expansions of short tandem repeats (STRs), such as those that cause amyotrophic lateral sclerosis (ALS) and fragile X syndrome, is challenging for short-read whole-genome sequencing (WGS) data. A solution to this problem is an important step toward integrating WGS into precision medicine. We developed a software tool called ExpansionHunter that, using PCR-free WGS short-read data, can genotype repeats at the locus of interest, even if the expanded repeat is larger than the read length. We applied our algorithm to WGS data from 3001 ALS patients who have been tested for the presence of the repeat expansion with repeat-primed PCR (RP-PCR). Compared against this truth data, ExpansionHunter correctly classified all (212/212, 95% CI [0.98, 1.00]) of the expanded samples as either expansions (208) or potential expansions (4). Additionally, 99.9% (2786/2789, 95% CI [0.997, 1.00]) of the wild-type samples were correctly classified as wild type by this method with the remaining three samples identified as possible expansions. We further applied our algorithm to a set of 152 samples in which every sample had one of eight different pathogenic repeat expansions, including those associated with fragile X syndrome, Friedreich's ataxia, and Huntington's disease, and correctly flagged all but one of the known repeat expansions. Thus, ExpansionHunter can be used to accurately detect known pathogenic repeat expansions and provides researchers with a tool that can be used to identify new pathogenic repeat expansions.
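The proportions and 95% confidence intervals quoted above can be approximated with a short calculation. The abstract does not say which interval method was used, so the Wilson score interval below is an assumption, as is the function name `wilson_interval`; it is a sketch for checking the order of magnitude of the reported bounds, not a reproduction of the paper's statistics.

```python
import math
from typing import Tuple

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wilson_interval(successes: int, total: int, z: float = Z_95) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method, see lead-in)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = (z / denom) * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    # Clamp to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Counts taken from the abstract: expanded samples and wild-type samples.
for label, k, n in [("expanded", 212, 212), ("wild type", 2786, 2789)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.4f}, 95% CI [{low:.3f}, {high:.3f}]")
```

Under this assumed method the lower bounds round to the values quoted in the abstract; other interval methods, such as an exact binomial interval, give slightly different bounds.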

Source
DOI: http://dx.doi.org/10.1101/gr.225672.117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5668946
November 2017

lncRNA requirements for mouse acute myeloid leukemia and normal differentiation.

Elife 2017 09 6;6. Epub 2017 Sep 6.

Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, United Kingdom.

A substantial fraction of the genome is transcribed in a cell-type-specific manner, producing long non-coding RNAs (lncRNAs), rather than protein-coding transcripts. Here, we systematically characterize transcriptional dynamics during hematopoiesis and in hematological malignancies. Our analysis of annotated and de novo assembled lncRNAs showed many are regulated during differentiation and mis-regulated in disease. We assessed lncRNA function via an in vivo RNAi screen in a model of acute myeloid leukemia. This identified several lncRNAs essential for leukemia maintenance, and found that a number act by promoting leukemia stem cell signatures. Leukemia blasts showed a myeloid differentiation phenotype when these lncRNAs were depleted, and our data indicate that this effect is mediated via effects on the MYC oncogene. Bone marrow reconstitutions showed that a lncRNA expressed across all progenitors was required for the myeloid lineage, whereas the other leukemia-induced lncRNAs were dispensable in the normal setting.

Source
DOI: http://dx.doi.org/10.7554/eLife.25607
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5619947
September 2017

The APOE4 allele shows opposite sex bias in microbleeds and Alzheimer's disease of humans and mice.

Neurobiol Aging 2016 Jan 19;37:47-57. Epub 2015 Oct 19.

Davis School of Gerontology, University of Southern California, Los Angeles, CA, USA; Department of Molecular and Computational Biology, University of Southern California, Los Angeles, CA, USA; Department of Biological Sciences, Dornsife College, University of Southern California, Los Angeles, CA, USA.

The apolipoprotein APOE4 allele confers greater risk of Alzheimer's disease (AD) for women than men, in conjunction with greater clinical deficits per unit of AD neuropathology (plaques, tangles). Cerebral microbleeds, which contribute to cognitive dysfunctions during AD, also show APOE4 excess, but sex-APOE allele interactions are not described. We report that elderly men diagnosed for mild cognitive impairment and AD showed a higher risk of cerebral cortex microbleeds with APOE4 allele dose effect in 2 clinical cohorts (ADNI and KIDS). Sex-APOE interactions were further analyzed in EFAD mice carrying human APOE alleles and familial AD genes (5XFAD (+/-) /human APOE(+/+)). At 7 months, E4FAD mice had cerebral cortex microbleeds with female excess, in contrast to humans. Cerebral amyloid angiopathy, plaques, and soluble Aβ also showed female excess. Both the cerebral microbleeds and cerebral amyloid angiopathy increased in proportion to individual Aβ load. In humans, the opposite sex bias of APOE4 allele for microbleeds versus the plaques and tangles is the first example of organ-specific, sex-linked APOE allele effects, and further shows AD as a uniquely human condition.

Source
DOI: http://dx.doi.org/10.1016/j.neurobiolaging.2015.10.010
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4687024
January 2016

An epigenetic memory of pregnancy in the mouse mammary gland.

Cell Rep 2015 May 7;11(7):1102-9. Epub 2015 May 7.

Howard Hughes Medical Institute, Watson School of Biological Sciences, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA; Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Robinson Way, Cambridge CB2 0RE, UK.

Pregnancy is the major modulator of mammary gland activity. It induces a tremendous expansion of the mammary epithelium and the generation of alveolar structures for milk production. Anecdotal evidence from multiparous humans indicates that the mammary gland may react less strongly to the first pregnancy than it does to subsequent pregnancies. Here, we verify that the mouse mammary gland responds more robustly to a second pregnancy, indicating that the gland retains a long-term memory of pregnancy. A comparison of genome-wide profiles of DNA methylation in isolated mammary cell types reveals substantial and long-lasting alterations. Since these alterations are maintained in the absence of the signal that induced them, we term them epigenetic. The majority of alterations in DNA methylation affect sites occupied by the Stat5a transcription factor and mark specific genes that are upregulated during pregnancy. We postulate that the epigenetic memory of a first pregnancy primes the activation of gene expression networks that promote mammary gland function in subsequent reproductive cycles. More broadly, our data indicate that physiological experience can broadly alter epigenetic states, functionally modifying the capacity of the affected cells to respond to later stimulatory events.

Source
DOI: http://dx.doi.org/10.1016/j.celrep.2015.04.015
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4439279
May 2015

Genus Ranges of 4-Regular Rigid Vertex Graphs.

Electron J Comb 2015 ;22(3)

European Molecular Biology Laboratory, Heidelberg, Germany.

A rigid vertex of a graph is one that has a prescribed cyclic order of its incident edges. We study orientable genus ranges of 4-regular rigid vertex graphs. The (orientable) genus range is a set of genera values over all orientable surfaces into which a graph is embedded cellularly, and the embeddings of rigid vertex graphs are required to preserve the prescribed cyclic order of incident edges at every vertex. The genus ranges of 4-regular rigid vertex graphs are sets of consecutive integers, and we address two questions: which intervals of integers appear as genus ranges of such graphs, and what types of graphs realize a given genus range. For graphs with 2n vertices (n > 1), we prove that all intervals [a, b] for all a < b ≤ n, and singletons [h, h] for some h ≤ n, are realized as genus ranges. For graphs with 2n - 1 vertices (n ≥ 1), we prove that all intervals [a, b] for all a < b ≤ n except [0, n], and [h, h] for some h ≤ n, are realized as genus ranges. We also provide constructions of graphs that realize these ranges.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5087815
January 2015

The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development.

Cell 2014 Aug;158(5):1187-1198

Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA.

Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci often interweave with each other. Single gene segments can contribute to multiple, distinct somatic loci. Terminal precursor segments from neighboring somatic loci map extremely close to each other, often overlapping. This genome assembly provides a draft of a scrambled genome and a powerful model for studies of genome rearrangement.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2014.07.034
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199391
August 2014

Using beta-binomial regression for high-precision differential methylation analysis in multifactor whole-genome bisulfite sequencing experiments.

BMC Bioinformatics 2014 Jun 24;15:215. Epub 2014 Jun 24.

Molecular and Computational Biology Section, Division of Biological Sciences, University of Southern California, Los Angeles, California, USA.

Background: Whole-genome bisulfite sequencing currently provides the highest-precision view of the epigenome, with quantitative information about populations of cells down to single nucleotide resolution. Several studies have demonstrated the value of this precision: meaningful features that correlate strongly with biological functions can be found associated with only a few CpG sites. Understanding the role of DNA methylation, and more broadly the role of DNA accessibility, requires that methylation differences between populations of cells are identified with extreme precision and in complex experimental designs.

Results: In this work we investigated the use of beta-binomial regression as a general approach for modeling whole-genome bisulfite data to identify differentially methylated sites and genomic intervals.

Conclusions: The regression-based analysis can handle medium- and large-scale experiments where it becomes critical to accurately model variation in methylation levels between replicates and account for influence of various experimental factors like cell types or batch effects.
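
As a rough illustration of the kind of model the abstract describes, the following is a minimal sketch of a beta-binomial log-likelihood for one CpG site across several samples. It is not the authors' implementation: the function names, the logit link for the mean methylation level, and the dispersion parametrization (alpha + beta = (1 - dispersion) / dispersion) are assumptions chosen for the example.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, alpha, beta):
    """Log-probability of k methylated reads out of n under Beta-Binomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)


def site_loglik(coefs, dispersion, design, meth, total):
    """Log-likelihood of one site (hypothetical helper, assumed parametrization).

    design: one row of covariates per sample (e.g. intercept, cell type, batch)
    meth, total: methylated and total read counts per sample
    dispersion: value in (0, 1); larger means more variation between replicates
    """
    ll = 0.0
    for row, k, n in zip(design, meth, total):
        eta = sum(c * x for c, x in zip(coefs, row))
        mu = 1.0 / (1.0 + math.exp(-eta))  # logit link: mean methylation level
        alpha = mu * (1.0 - dispersion) / dispersion
        beta = (1.0 - mu) * (1.0 - dispersion) / dispersion
        ll += betabinom_logpmf(k, n, alpha, beta)
    return ll


# Two samples, an intercept plus one binary factor (illustrative data only).
design = [[1, 0], [1, 1]]
print(site_loglik([0.0, 0.0], 1.0 / 3.0, design, meth=[3, 2], total=[10, 4]))
```

In a regression setting, a differential methylation test would compare the maximized value of such a likelihood with and without the factor of interest; the maximization step is omitted here to keep the sketch short.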

Source
DOI: http://dx.doi.org/10.1186/1471-2105-15-215
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4230021
June 2014

Genomes on the edge: programmed genome instability in ciliates.

Cell 2013 Jan;152(3):406-16

Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA.

Ciliates are an ancient and diverse group of microbial eukaryotes that have emerged as powerful models for RNA-mediated epigenetic inheritance. They possess extensive sets of both tiny and long noncoding RNAs that, together with a suite of proteins that includes transposases, orchestrate a broad cascade of genome rearrangements during somatic nuclear development. This Review emphasizes three important themes: the remarkable role of RNA in shaping genome structure, recent discoveries that unify many deeply diverged ciliate genetic systems, and a surprising evolutionary "sign change" in the role of small RNAs between major species groups.

Source
DOI: http://dx.doi.org/10.1016/j.cell.2013.01.005
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3725814
January 2013

LUCApedia: a database for the study of ancient life.

Nucleic Acids Res 2013 Jan 27;41(Database issue):D1079-82. Epub 2012 Nov 27.

Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08542, USA.

Organisms represented by the root of the universal evolutionary tree were most likely complex cells with a sophisticated protein translation system and a DNA genome encoding hundreds of genes. The growth of bioinformatics data from taxonomically diverse organisms has made it possible to infer the likely properties of early life in greater detail. Here we present LUCApedia (http://eeb.princeton.edu/lucapedia), a unified framework for simultaneously evaluating multiple data sets related to the Last Universal Common Ancestor (LUCA) and its predecessors. This unification is achieved by mapping eleven such data sets onto UniProt, KEGG and BioCyc IDs. LUCApedia may be used to rapidly acquire evidence that a certain gene or set of genes is ancient, to examine the early evolution of metabolic pathways, or to test specific hypotheses related to ancient life by corroborating them against the rest of the database.

Source
DOI: http://dx.doi.org/10.1093/nar/gks1217
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531223
January 2013

Transducer generated arrays of robotic nano-arms.

Nat Comput 2010 Jun;9(2):437-455

New York University, New York, NY 10003, USA.

We consider sets of two-dimensional arrays, called here transducer generated languages, obtained by iterative applications of transducers (finite state automata with output). Each transducer generates a set of blocks of symbols such that the bottom row of a block is an input string accepted by the transducer and, by iterative application of the transducer, each row of the block is an output of the transducer on the preceding row. We show how these arrays can be implemented through molecular assembly of triple crossover DNA molecules. Such assembly could serve as a scaffold for arranging molecular robotic arms capable of simultaneous movements. We observe that transducer generated languages define a class of languages which is a proper subclass of recognizable picture languages, but contains the class of all factorial local two-dimensional languages. By taking the average growth rate of the number of blocks in the language as a measure of its complexity, we further observe that arrays with high complexity patterns can be generated in this way.
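
The row-by-row construction described in this abstract can be sketched in a few lines. The code below is only an illustration under simplifying assumptions: it uses a deterministic transducer (the paper's transducers need not be deterministic), and the names `run`, `generate_block`, and the example `INCREMENT` transducer (binary increment, least significant bit first) are hypothetical and not taken from the paper.

```python
def run(transitions, start, accepting, word):
    """Feed `word` to a deterministic transducer.

    transitions maps (state, input_symbol) -> (next_state, output_symbol).
    Returns the output string, or None if the word is not accepted.
    """
    state, out = start, []
    for symbol in word:
        step = transitions.get((state, symbol))
        if step is None:
            return None  # no transition defined: input rejected
        state, emitted = step
        out.append(emitted)
    return "".join(out) if state in accepting else None


def generate_block(transitions, start, accepting, bottom, height):
    """Build a block whose rows are successive transducer outputs.

    rows[0] is the bottom row; rows[i + 1] is the output on rows[i].
    Returns None if some row that must be fed back is not accepted.
    """
    rows = [bottom]
    while len(rows) < height:
        nxt = run(transitions, start, accepting, rows[-1])
        if nxt is None:
            return None
        rows.append(nxt)
    return rows


# Example transducer (an assumption for illustration): add 1 to a binary
# number written least significant bit first. State "c" carries, "n" does not.
INCREMENT = {
    ("c", "0"): ("n", "1"),
    ("c", "1"): ("c", "0"),
    ("n", "0"): ("n", "0"),
    ("n", "1"): ("n", "1"),
}

block = generate_block(INCREMENT, "c", {"n"}, "000", 4)
for row in reversed(block):  # print with the bottom row last
    print(row)
```

Here the block of height 4 grown from the bottom row "000" lists the numbers 0 to 3, and an input such as "111" is rejected because the carry is never absorbed.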

Source
DOI: http://dx.doi.org/10.1007/s11047-009-9157-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3957271
June 2010