Publications by authors named "Jacob O Kitzman"

40 Publications

A transcription start site map in human pancreatic islets reveals functional regulatory signatures.

Diabetes 2021 Apr 13. Epub 2021 Apr 13.

Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA

Identifying the tissue-specific molecular signatures of active regulatory elements is critical to understand gene regulatory mechanisms. Here, we identify transcription start sites (TSS) using cap analysis of gene expression (CAGE) across 57 human pancreatic islet samples. We identify 9,954 reproducible CAGE tag clusters (TCs), ∼20% of which are islet-specific and occur mostly distal to known gene TSSs. We integrated islet CAGE data with histone modification and chromatin accessibility profiles to identify epigenomic signatures of transcription initiation. Using a massively parallel reporter assay, we validated the transcriptional enhancer activity for 2,279 of 3,378 (∼68%) tested islet CAGE elements (5% FDR). TCs within accessible enhancers show higher enrichment to overlap type 2 diabetes genome-wide association study (GWAS) signals than existing islet annotations, which emphasizes the utility of mapping CAGE profiles in disease-relevant tissue. This work provides a high-resolution map of transcriptional initiation in human pancreatic islets with utility for dissecting active enhancers at GWAS loci.
http://dx.doi.org/10.2337/db20-1087
April 2021

Author Correction: Inherited causes of clonal haematopoiesis in 97,691 whole genomes.

Authors:
Alexander G Bick Joshua S Weinstock Satish K Nandakumar Charles P Fulco Erik L Bao Seyedeh M Zekavat Mindy D Szeto Xiaotian Liao Matthew J Leventhal Joseph Nasser Kyle Chang Cecelia Laurie Bala Bharathi Burugula Christopher J Gibson Abhishek Niroula Amy E Lin Margaret A Taub Francois Aguet Kristin Ardlie Braxton D Mitchell Kathleen C Barnes Arden Moscati Myriam Fornage Susan Redline Bruce M Psaty Edwin K Silverman Scott T Weiss Nicholette D Palmer Ramachandran S Vasan Esteban G Burchard Sharon L R Kardia Jiang He Robert C Kaplan Nicholas L Smith Donna K Arnett David A Schwartz Adolfo Correa Mariza de Andrade Xiuqing Guo Barbara A Konkle Brian Custer Juan M Peralta Hongsheng Gui Deborah A Meyers Stephen T McGarvey Ida Yii-Der Chen M Benjamin Shoemaker Patricia A Peyser Jai G Broome Stephanie M Gogarten Fei Fei Wang Quenna Wong May E Montasser Michelle Daya Eimear E Kenny Kari E North Lenore J Launer Brian E Cade Joshua C Bis Michael H Cho Jessica Lasky-Su Donald W Bowden L Adrienne Cupples Angel C Y Mak Lewis C Becker Jennifer A Smith Tanika N Kelly Stella Aslibekyan Susan R Heckbert Hemant K Tiwari Ivana V Yang John A Heit Steven A Lubitz Jill M Johnsen Joanne E Curran Sally E Wenzel Daniel E Weeks Dabeeru C Rao Dawood Darbar Jee-Young Moon Russell P Tracy Erin J Buth Nicholas Rafaels Ruth J F Loos Peter Durda Yongmei Liu Lifang Hou Jiwon Lee Priyadarshini Kachroo Barry I Freedman Daniel Levy Lawrence F Bielak James E Hixson James S Floyd Eric A Whitsel Patrick T Ellinor Marguerite R Irvin Tasha E Fingerlin Laura M Raffield Sebastian M Armasu Marsha M Wheeler Ester C Sabino John Blangero L Keoki Williams Bruce D Levy Wayne Huey-Herng Sheu Dan M Roden Eric Boerwinkle JoAnn E Manson Rasika A Mathias Pinkal Desai Kent D Taylor Andrew D Johnson Paul L Auer Charles Kooperberg Cathy C Laurie Thomas W Blackwell Albert V Smith Hongyu Zhao Ethan Lange Leslie Lange Stephen S Rich Jerome I Rotter James G Wilson Paul Scheet Jacob O Kitzman Eric S Lander Jesse M Engreitz 
Benjamin L Ebert Alexander P Reiner Siddhartha Jaiswal Gonçalo Abecasis Vijay G Sankaran Sekar Kathiresan Pradeep Natarajan

Nature 2021 Mar;591(7851):E27

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

http://dx.doi.org/10.1038/s41586-021-03280-1
March 2021

Chromatin information content landscapes inform transcription factor and DNA interactions.

Nat Commun 2021 02 26;12(1):1307. Epub 2021 Feb 26.

Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, USA.

Interactions between transcription factors and chromatin are fundamental to genome organization and regulation and, ultimately, cell state. Here, we use information theory to measure signatures of organized chromatin resulting from transcription factor-chromatin interactions encoded in the patterns of the accessible genome, which we term chromatin information enrichment (CIE). We calculate CIE for hundreds of transcription factor motifs across human samples and identify two classes: low and high CIE. The 10-20% of common and tissue-specific high CIE transcription factor motifs associate with higher protein-DNA residence time, including different binding site subclasses of the same transcription factor, increased nucleosome phasing, specific protein domains, and the genetic control of both chromatin accessibility and gene expression. These results show that variations in the information encoded in chromatin architecture reflect functional biological variation, with implications for cell state dynamics and memory.
http://dx.doi.org/10.1038/s41467-021-21534-4
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7910283
February 2021

Massively parallel functional testing of MSH2 missense variants conferring Lynch syndrome risk.

Am J Hum Genet 2021 01 23;108(1):163-175. Epub 2020 Dec 23.

Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.

The lack of functional evidence for the majority of missense variants limits their clinical interpretability and poses a key barrier to the broad utility of carrier screening. In Lynch syndrome (LS), one of the most highly prevalent cancer syndromes, nearly 90% of clinically observed missense variants are deemed "variants of uncertain significance" (VUS). To systematically resolve their functional status, we performed a massively parallel screen in human cells to identify loss-of-function missense variants in the key DNA mismatch repair factor MSH2. The resulting functional effect map is substantially complete, covering 94% of the 17,746 possible variants, and is highly concordant (96%) with existing functional data and expert clinicians' interpretations. The large majority (89%) of missense variants were functionally neutral, perhaps unexpectedly in light of its evolutionary conservation. These data provide ready-to-use functional evidence to resolve the ∼1,300 extant missense VUSs in MSH2 and may facilitate the prospective classification of newly discovered variants in the clinic.
http://dx.doi.org/10.1016/j.ajhg.2020.12.003
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7820803
January 2021

Inherited causes of clonal haematopoiesis in 97,691 whole genomes.

Authors:
Alexander G Bick Joshua S Weinstock Satish K Nandakumar Charles P Fulco Erik L Bao Seyedeh M Zekavat Mindy D Szeto Xiaotian Liao Matthew J Leventhal Joseph Nasser Kyle Chang Cecelia Laurie Bala Bharathi Burugula Christopher J Gibson Amy E Lin Margaret A Taub Francois Aguet Kristin Ardlie Braxton D Mitchell Kathleen C Barnes Arden Moscati Myriam Fornage Susan Redline Bruce M Psaty Edwin K Silverman Scott T Weiss Nicholette D Palmer Ramachandran S Vasan Esteban G Burchard Sharon L R Kardia Jiang He Robert C Kaplan Nicholas L Smith Donna K Arnett David A Schwartz Adolfo Correa Mariza de Andrade Xiuqing Guo Barbara A Konkle Brian Custer Juan M Peralta Hongsheng Gui Deborah A Meyers Stephen T McGarvey Ida Yii-Der Chen M Benjamin Shoemaker Patricia A Peyser Jai G Broome Stephanie M Gogarten Fei Fei Wang Quenna Wong May E Montasser Michelle Daya Eimear E Kenny Kari E North Lenore J Launer Brian E Cade Joshua C Bis Michael H Cho Jessica Lasky-Su Donald W Bowden L Adrienne Cupples Angel C Y Mak Lewis C Becker Jennifer A Smith Tanika N Kelly Stella Aslibekyan Susan R Heckbert Hemant K Tiwari Ivana V Yang John A Heit Steven A Lubitz Jill M Johnsen Joanne E Curran Sally E Wenzel Daniel E Weeks Dabeeru C Rao Dawood Darbar Jee-Young Moon Russell P Tracy Erin J Buth Nicholas Rafaels Ruth J F Loos Peter Durda Yongmei Liu Lifang Hou Jiwon Lee Priyadarshini Kachroo Barry I Freedman Daniel Levy Lawrence F Bielak James E Hixson James S Floyd Eric A Whitsel Patrick T Ellinor Marguerite R Irvin Tasha E Fingerlin Laura M Raffield Sebastian M Armasu Marsha M Wheeler Ester C Sabino John Blangero L Keoki Williams Bruce D Levy Wayne Huey-Herng Sheu Dan M Roden Eric Boerwinkle JoAnn E Manson Rasika A Mathias Pinkal Desai Kent D Taylor Andrew D Johnson Paul L Auer Charles Kooperberg Cathy C Laurie Thomas W Blackwell Albert V Smith Hongyu Zhao Ethan Lange Leslie Lange Stephen S Rich Jerome I Rotter James G Wilson Paul Scheet Jacob O Kitzman Eric S Lander Jesse M Engreitz Benjamin L Ebert 
Alexander P Reiner Siddhartha Jaiswal Gonçalo Abecasis Vijay G Sankaran Sekar Kathiresan Pradeep Natarajan

Nature 2020 10 14;586(7831):763-768. Epub 2020 Oct 14.

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Age is the dominant risk factor for most chronic human diseases, but the mechanisms through which ageing confers this risk are largely unknown. The age-related acquisition of somatic mutations that lead to clonal expansion in regenerating haematopoietic stem cell populations has recently been associated with both haematological cancer and coronary heart disease; this phenomenon is termed clonal haematopoiesis of indeterminate potential (CHIP). Simultaneous analyses of germline and somatic whole-genome sequences provide the opportunity to identify root causes of CHIP. Here we analyse high-coverage whole-genome sequences from 97,691 participants of diverse ancestries in the National Heart, Lung, and Blood Institute Trans-omics for Precision Medicine (TOPMed) programme, and identify 4,229 individuals with CHIP. We identify associations with blood cell, lipid and inflammatory traits that are specific to different CHIP driver genes. Association of a genome-wide set of germline genetic variants enabled the identification of three genetic loci associated with CHIP status, including one locus at TET2 that was specific to individuals of African ancestry. In silico-informed in vitro evaluation of the TET2 germline locus enabled the identification of a causal variant that disrupts a TET2 distal enhancer, resulting in increased self-renewal of haematopoietic stem cells. Overall, we observe that germline genetic variation shapes haematopoietic stem cell function, leading to CHIP through mechanisms that are specific to clonal haematopoiesis as well as shared mechanisms that lead to somatic mutations across tissues.
http://dx.doi.org/10.1038/s41586-020-2819-2
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7944936
October 2020

SOX10-regulated promoter use defines isoform-specific gene expression in Schwann cells.

BMC Genomics 2020 Aug 8;21(1):549. Epub 2020 Aug 8.

Neuroscience Graduate Program, University of Michigan, Ann Arbor, MI, USA.

Background: Multicellular organisms adopt various strategies to tailor gene expression to cellular contexts, including the employment of multiple promoters (and the associated transcription start sites (TSSs)) at a single locus that encodes distinct gene isoforms. Schwann cells, the myelinating cells of the peripheral nervous system (PNS), exhibit a specialized gene expression profile directed by the transcription factor SOX10, which is essential for PNS myelination. SOX10 regulates promoter elements associated with unique TSSs and gene isoforms at several target loci, implicating SOX10-mediated, isoform-specific gene expression in Schwann cell function. Here, we report on genome-wide efforts to identify SOX10-regulated promoters and TSSs in Schwann cells to prioritize genes and isoforms for further study.

Results: We performed global TSS analyses and mined previously reported ChIP-seq datasets to assess the activity of SOX10-bound promoters in three models: (i) an adult mammalian nerve; (ii) differentiating primary Schwann cells, and (iii) cultured Schwann cells with ablated SOX10 function. We explored specific characteristics of SOX10-dependent TSSs, which provides confidence in defining them as SOX10 targets. Finally, we performed functional studies to validate our findings at four previously unreported SOX10 target loci: ARPC1A, CHN2, DDR1, and GAS7. These findings suggest roles for the associated SOX10-regulated gene products in PNS myelination.

Conclusions: In sum, we provide comprehensive computational and functional assessments of SOX10-regulated TSS use in Schwann cells. The data presented in this study will stimulate functional studies on the specific mRNA and protein isoforms that SOX10 regulates, which will improve our understanding of myelination in the peripheral nerve.
http://dx.doi.org/10.1186/s12864-020-06963-7
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7430845
August 2020

Elevated exopolysaccharide levels in Pseudomonas aeruginosa flagellar mutants have implications for biofilm growth and chronic infections.

PLoS Genet 2020 06 12;16(6):e1008848. Epub 2020 Jun 12.

Department of Microbiology, University of Washington, Seattle, Washington, United States of America.

Pseudomonas aeruginosa colonizes the airways of cystic fibrosis (CF) patients, causing infections that can last for decades. During the course of these infections, P. aeruginosa undergoes a number of genetic adaptations. One such adaptation is the loss of swimming motility functions. Another involves the formation of the rugose small colony variant (RSCV) phenotype, which is characterized by overproduction of the exopolysaccharides Pel and Psl. Here, we provide evidence that the two adaptations are linked. Using random transposon mutagenesis, we discovered that flagellar mutations are linked to the RSCV phenotype. We found that flagellar mutants overexpressed Pel and Psl in a surface-contact dependent manner. Genetic analyses revealed that flagellar mutants were selected for at high frequencies in biofilms, and that Pel and Psl expression provided the primary fitness benefit in this environment. Suppressor mutagenesis of flagellar RSCVs indicated that Psl overexpression required the mot genes, suggesting that the flagellum stator proteins function in a surface-dependent regulatory pathway for exopolysaccharide biosynthesis. Finally, we identified flagellar mutant RSCVs among CF isolates. The CF environment has long been known to select for flagellar mutants, with the classic interpretation being that the fitness benefit gained relates to an impairment of the host immune system to target a bacterium lacking a flagellum. Our new findings lead us to propose that exopolysaccharide production is a key gain-of-function phenotype that offers a new way to interpret the fitness benefits of these mutations.
http://dx.doi.org/10.1371/journal.pgen.1008848
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7314104
June 2020

Quantification, Dynamic Visualization, and Validation of Bias in ATAC-Seq Data with ataqv.

Cell Syst 2020 03;10(3):298-306.e4

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Human Genetics, University of Michigan, Ann Arbor, MI 48109, USA.

The assay for transposase-accessible chromatin using sequencing (ATAC-seq) has become the preferred method for mapping chromatin accessibility due to its time and input material efficiency. However, it can be difficult to evaluate data quality and identify sources of technical bias across samples. Here, we present ataqv, a computational toolkit for efficiently measuring, visualizing, and comparing quality control (QC) results across samples and experiments. We use ataqv to analyze 2,009 public ATAC-seq datasets; their QC metrics display a 10-fold range. Tn5 dosage experiments and statistical modeling show that technical variation in the ratio of Tn5 transposase to nuclei and sequencing flowcell density induces systematic bias in ATAC-seq data by changing the enrichment of reads across functional genomic annotations including promoters, enhancers, and transcription-factor-bound regions, with the notable exception of CTCF. ataqv can be integrated into existing computational pipelines and is freely available at https://github.com/ParkerLab/ataqv/.
http://dx.doi.org/10.1016/j.cels.2020.02.009
March 2020

Allele-specific RNA interference prevents neuropathy in Charcot-Marie-Tooth disease type 2D mouse models.

J Clin Invest 2019 12;129(12):5568-5583

The Jackson Laboratory, Bar Harbor, Maine, USA.

Gene therapy approaches are being deployed to treat recessive genetic disorders by restoring the expression of mutated genes. However, the feasibility of these approaches for dominantly inherited diseases - where treatment may require reduction in the expression of a toxic mutant protein resulting from a gain-of-function allele - is unclear. Here we show the efficacy of allele-specific RNAi as a potential therapy for Charcot-Marie-Tooth disease type 2D (CMT2D), caused by dominant mutations in glycyl-tRNA synthetase (GARS). A de novo mutation in GARS was identified in a patient with a severe peripheral neuropathy, and a mouse model precisely recreating the mutation was produced. These mice developed a neuropathy by 3-4 weeks of age, validating the pathogenicity of the mutation. RNAi sequences targeting mutant GARS mRNA, but not wild-type, were optimized and then packaged into AAV9 for in vivo delivery. This almost completely prevented the neuropathy in mice treated at birth. Delaying treatment until after disease onset showed modest benefit, though this effect decreased the longer treatment was delayed. These outcomes were reproduced in a second mouse model of CMT2D using a vector specifically targeting that allele. The effects were dose dependent, and persisted for at least 1 year. Our findings demonstrate the feasibility of AAV9-mediated allele-specific knockdown and provide proof of concept for gene therapy approaches for dominant neuromuscular diseases.
http://dx.doi.org/10.1172/JCI130600
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6877339
December 2019

CRISPR knockout screen implicates three genes in lysosome function.

Sci Rep 2019 07 3;9(1):9609. Epub 2019 Jul 3.

Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109-5618, USA.

Defective biosynthesis of the phospholipid PI(3,5)P2 underlies neurological disorders characterized by cytoplasmic accumulation of large lysosome-derived vacuoles. To identify novel genetic causes of lysosomal vacuolization, we developed an assay for enlargement of the lysosome compartment that is amenable to cell sorting and pooled screens. We first demonstrated that the enlarged vacuoles that accumulate in fibroblasts lacking FIG4, a PI(3,5)P2 biosynthetic factor, have a hyperacidic pH compared to that of normal cells. We then carried out a genome-wide knockout screen in human HAP1 cells for accumulation of acidic vesicles by FACS sorting. A pilot screen captured fifteen genes, including VAC14, a previously identified cause of endolysosomal vacuolization. Three genes not previously associated with lysosome dysfunction were selected to validate the screen: C10orf35, LRRC8A, and MARCH7. We analyzed two clonal knockout cell lines for each gene. All of the knockout lines contained enlarged acidic vesicles that were positive for LAMP2, confirming their endolysosomal origin. This assay will be useful in the future for functional evaluation of patient variants in these genes, and for a more extensive genome-wide screen for genes required for endolysosome function. This approach may also be adapted for drug screens to identify small molecules that rescue endolysosomal vacuolization.
http://dx.doi.org/10.1038/s41598-019-45939-w
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6610096
July 2019

Genomic annotation of disease-associated variants reveals shared functional contexts.

Diabetologia 2019 05 12;62(5):735-743. Epub 2019 Feb 12.

Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, 2049 Palmer Commons Building, Ann Arbor, MI, 48109, USA.

Variation in non-coding DNA, encompassing gene regulatory regions such as enhancers and promoters, contributes to risk for complex disorders, including type 2 diabetes. While genome-wide association studies have successfully identified hundreds of type 2 diabetes loci throughout the genome, the vast majority of these reside in non-coding DNA, which complicates the process of determining their functional significance and level of priority for further study. Here we review the methods used to experimentally annotate these non-coding variants, to nominate causal variants and to link them to diabetes pathophysiology. In recent years, chromatin profiling, massively parallel sequencing, high-throughput reporter assays and CRISPR gene editing technologies have rapidly become indispensable tools. Rather than treating individual variants in isolation, we discuss the importance of accounting for context, both genetic (such as flanking DNA sequence) and environmental (such as cellular state or environmental exposure). Incorporating these features shows promise in terms of revealing biologically convergent molecular signatures across distant and seemingly unrelated loci. Studying regulatory elements in the proper context will be crucial for interpreting the functional significance of disease-associated variants and applying the resulting knowledge to improve patient care.
http://dx.doi.org/10.1007/s00125-019-4823-3
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6451673
May 2019

Next generation sequencing panel based on single molecule molecular inversion probes for detecting genetic variants in children with hypopituitarism.

Mol Genet Genomic Med 2018 May 8. Epub 2018 May 8.

Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA.

Background: Congenital hypopituitarism is caused by genetic and environmental factors. Over 30 genes have been implicated in isolated and/or combined pituitary hormone deficiency. The etiology remains unknown for up to 80% of the patients, but most cases have been analyzed by limited candidate gene screening. Mutations in the PROP1 gene are the most common known cause, and the frequency of mutations in this gene varies greatly by ethnicity. We designed a custom array to assess the frequency of mutations in known hypopituitarism genes and new candidates, using single molecule molecular inversion probes sequencing (smMIPS).

Methods: We used this panel for the first systematic screening for causes of hypopituitarism in children. Molecular inversion probes were designed to capture 693 coding exons of 30 known genes and 37 candidate genes. We captured genomic DNA from 51 pediatric patients with CPHD (n = 43) or isolated GH deficiency (IGHD) (n = 8) and their parents and conducted next generation sequencing.

Results: We obtained deep coverage over targeted regions and demonstrated accurate variant detection by comparison to whole-genome sequencing in a control individual. We found a dominant mutation in GH1, p.R209H, in a three-generation pedigree with IGHD.

Conclusions: smMIPS is an efficient and inexpensive method to detect mutations in patients with hypopituitarism, drastically limiting the need for screening individual genes by Sanger sequencing.
http://dx.doi.org/10.1002/mgg3.395
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6081231
May 2018

Variant Interpretation: Functional Assays to the Rescue.

Am J Hum Genet 2017 Sep;101(3):315-325

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Department of Bioengineering, University of Washington, Seattle, WA 98195, USA.

Classical genetic approaches for interpreting variants, such as case-control or co-segregation studies, require finding many individuals with each variant. Because the overwhelming majority of variants are present in only a few living humans, this strategy has clear limits. Fully realizing the clinical potential of genetics requires that we accurately infer pathogenicity even for rare or private variation. Many computational approaches to predicting variant effects have been developed, but they can identify only a small fraction of pathogenic variants with the high confidence that is required in the clinic. Experimentally measuring a variant's functional consequences can provide clearer guidance, but individual assays performed only after the discovery of the variant are both time and resource intensive. Here, we discuss how multiplex assays of variant effect (MAVEs) can be used to measure the functional consequences of all possible variants in disease-relevant loci for a variety of molecular and cellular phenotypes. The resulting large-scale functional data can be combined with machine learning and clinical knowledge for the development of "lookup tables" of accurate pathogenicity predictions. A coordinated effort to produce, analyze, and disseminate large-scale functional data generated by multiplex assays could be essential to addressing the variant-interpretation crisis.
http://dx.doi.org/10.1016/j.ajhg.2017.07.014
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5590843
September 2017

Genetics of Combined Pituitary Hormone Deficiency: Roadmap into the Genome Era.

Endocr Rev 2016 12 9;37(6):636-675. Epub 2016 Nov 9.

Department of Human Genetics (Q.F., A.S.G., M.L.B., A.H.M., P.G., L.Y.M.C., A.Z.D., M.I.P.M., A.B.O., J.O.K., R.E.M., J.Z.L., S.A.C.), Graduate Program in Bioinformatics (A.S.G.), Endocrine Division, Department of Internal Medicine (A.A.), and Department of Computational Medicine and Bioinformatics (J.O.K., R.E.M., J.Z.L.), University of Michigan, Ann Arbor, Michigan 48109.

The genetic basis for combined pituitary hormone deficiency (CPHD) is complex, involving 30 genes in a variety of syndromic and nonsyndromic presentations. Molecular diagnosis of this disorder is valuable for predicting disease progression, avoiding unnecessary surgery, and family planning. We expect that the application of high throughput sequencing will uncover additional contributing genes and eventually become a valuable tool for molecular diagnosis. For example, in the last 3 years, six new genes have been implicated in CPHD using whole-exome sequencing. In this review, we present a historical perspective on gene discovery for CPHD and predict approaches that may facilitate future gene identification projects conducted by clinicians and basic scientists. Guidelines for systematic reporting of genetic variants and assigning causality are emerging. We apply these guidelines retrospectively to reports of the genetic basis of CPHD and summarize modes of inheritance and penetrance for each of the known genes. In recent years, there have been great improvements in databases of genetic information for diverse populations. Some issues remain that make molecular diagnosis challenging in some cases. These include the inherent genetic complexity of this disorder, technical challenges like uneven coverage, differing results from variant calling and interpretation pipelines, the number of tolerated genetic alterations, and imperfect methods for predicting pathogenicity. We discuss approaches for future research in the genetics of CPHD.
http://dx.doi.org/10.1210/er.2016-1101
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5155665
December 2016

Fragment Length of Circulating Tumor DNA.

PLoS Genet 2016 07 18;12(7):e1006162. Epub 2016 Jul 18.

Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America.

Malignant tumors shed DNA into the circulation. The transient half-life of circulating tumor DNA (ctDNA) may afford the opportunity to diagnose, monitor recurrence, and evaluate response to therapy solely through a non-invasive blood draw. However, detecting ctDNA against the normally occurring background of cell-free DNA derived from healthy cells has proven challenging, particularly in non-metastatic solid tumors. In this study, distinct differences in fragment length size between ctDNAs and normal cell-free DNA are defined. Human ctDNA in rat plasma derived from human glioblastoma multiforme stem-like cells in the rat brain and human hepatocellular carcinoma in the rat flank were found to have a shorter principal fragment length than the background rat cell-free DNA (134-144 bp vs. 167 bp, respectively). Subsequently, a similar shift in the fragment length of ctDNA in humans with melanoma and lung cancer was identified compared to healthy controls. Comparison of fragment lengths from cell-free DNA between a melanoma patient and healthy controls found that the BRAF V600E mutant allele occurred more commonly at a shorter fragment length than the fragment length of the wild-type allele (132-145 bp vs. 165 bp, respectively). Moreover, size-selecting for shorter cell-free DNA fragment lengths substantially increased the EGFR T790M mutant allele frequency in human lung cancer. These findings provide compelling evidence that experimental or bioinformatic isolation of a specific subset of fragment lengths from cell-free DNA may improve detection of ctDNA.
http://dx.doi.org/10.1371/journal.pgen.1006162
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4948782
July 2016

Haplotypes drop by drop.

Authors:
Jacob O Kitzman

Nat Biotechnol 2016 Mar;34(3):296-8

Department of Human Genetics and Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA.

http://dx.doi.org/10.1038/nbt.3500
March 2016

An essential cell cycle regulation gene causes hybrid inviability in Drosophila.

Science 2015 Dec;350(6267):1552-5

Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA. Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.

Speciation, the process by which new biological species arise, involves the evolution of reproductive barriers, such as hybrid sterility or inviability between populations. However, identifying hybrid incompatibility genes remains a key obstacle in understanding the molecular basis of reproductive isolation. We devised a genomic screen, which identified a cell cycle-regulation gene as the cause of male inviability in hybrids resulting from a cross between Drosophila melanogaster and D. simulans. Ablation of the D. simulans allele of this gene is sufficient to rescue the adult viability of hybrid males. This dominantly acting cell cycle regulator causes mitotic arrest and, thereby, inviability of male hybrid larvae. Our genomic method provides a facile means to accelerate the identification of hybrid incompatibility genes in other model and nonmodel systems.

Source
DOI: http://dx.doi.org/10.1126/science.aac7504
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4703311
December 2015

Experimental Evolution Identifies Vaccinia Virus Mutations in A24R and A35R That Antagonize the Protein Kinase R Pathway and Accompany Collapse of an Extragenic Gene Amplification.

J Virol 2015 Oct 22;89(19):9986-97. Epub 2015 Jul 22.

Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA; Division of Clinical Research, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA; Departments of Microbiology and Medicine, University of Washington, Seattle, Washington, USA.

Unlabelled: Most new human infectious diseases emerge from cross-species pathogen transmissions; however, it is not clear how viruses adapt to productively infect new hosts. Host restriction factors represent one species-specific barrier that viruses may initially have little ability to inhibit in new hosts. For example, viral antagonists of protein kinase R (PKR) vary in their ability to block PKR-mediated inhibition of viral replication, in part due to PKR allelic variation between species. We previously reported that amplification of a weak PKR antagonist encoded by rhesus cytomegalovirus, rhtrs1, improved replication of a recombinant poxvirus (VVΔEΔK+RhTRS1) in several resistant primate cells. To test whether amplification increases the opportunity for mutations that improve virus replication with only a single copy of rhtrs1 to evolve, we passaged rhtrs1-amplified viruses in semipermissive primate cells. After passage, we isolated two viruses that contained only a single copy of rhtrs1 yet replicated as well as the amplified virus. Surprisingly, rhtrs1 was not mutated in these viruses; instead, we identified mutations in two vaccinia virus (VACV) genes, A24R and A35R, either of which was sufficient to improve VVΔEΔK+RhTRS1 replication. Neither of these genes has previously been implicated in PKR antagonism. Furthermore, the mutation in A24R, but not A35R, increased resistance to the antipoxviral drug isatin-β-thiosemicarbazone, suggesting that these mutations employ different mechanisms to evade PKR. This study supports our hypothesis that gene amplification may provide a "molecular foothold," broadly improving replication to facilitate rapid adaptation, while subsequent mutations maintain this efficient replication in the new host without requiring gene amplification.

Importance: Understanding how viruses adapt to a new host may help identify viruses poised to cross species barriers before an outbreak occurs. Amplification of rhtrs1, a weak viral antagonist of the host antiviral protein PKR, enabled a recombinant vaccinia virus to replicate in resistant cells from humans and other primates. After serial passage of rhtrs1-amplified viruses, mutations arose in two vaccinia virus genes that improved viral replication without requiring rhtrs1 amplification. Neither of these genes has previously been associated with inhibition of the PKR pathway. These data suggest that gene amplification can improve viral replication in a resistant host species and facilitate the emergence of novel adaptations that maintain the foothold needed for continued replication and spread in the new host.

Source
DOI: http://dx.doi.org/10.1128/JVI.01233-15
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4577882
October 2015

Whole genome prediction for preimplantation genetic diagnosis.

Genome Med 2015 8;7(1):35. Epub 2015 Apr 8.

Natera Inc, San Carlos, CA 94070 USA.

Background: Preimplantation genetic diagnosis (PGD) enables profiling of embryos for genetic disorders prior to implantation. The majority of PGD testing is restricted in the scope of variants assayed or by the availability of extended family members. While recent advances in single cell sequencing show promise, they remain limited by bias in DNA amplification and the rapid turnaround time (<36 h) required for fresh embryo transfer. Here, we describe and validate a method for inferring the inherited whole genome sequence of an embryo for preimplantation genetic diagnosis (PGD).

Methods: We combine haplotype-resolved, parental genome sequencing with rapid embryo genotyping to predict the whole genome sequence of a day-5 human embryo in a couple at risk of transmitting alpha-thalassemia.

Results: Inheritance was predicted at approximately 3 million paternally and/or maternally heterozygous sites with greater than 99% accuracy. Furthermore, we successfully phase and predict the transmission of an HBA1/HBA2 deletion from each parent.

Conclusions: Our results suggest that preimplantation whole genome prediction may facilitate the comprehensive diagnosis of diseases with a known genetic basis in embryos.

Source
DOI: http://dx.doi.org/10.1186/s13073-015-0160-4
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4445980
May 2015

Haplotype-resolved genome sequencing: experimental methods and applications.

Nat Rev Genet 2015 Jun 7;16(6):344-58. Epub 2015 May 7.

Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA.

Human genomes are diploid and, for their complete description and interpretation, it is necessary not only to discover the variation they contain but also to arrange it onto chromosomal haplotypes. Although whole-genome sequencing is becoming increasingly routine, nearly all such individual genomes are mostly unresolved with respect to haplotype, particularly for rare alleles, which remain poorly resolved by inferential methods. Here, we review emerging technologies for experimentally resolving (that is, 'phasing') haplotypes across individual whole-genome sequences. We also discuss computational methods relevant to their implementation, metrics for assessing their accuracy and completeness, and the relevance of haplotype information to applications of genome sequencing in research and clinical medicine.

Source
DOI: http://dx.doi.org/10.1038/nrg3903
June 2015

Copy-number variation and false positive prenatal aneuploidy screening results.

N Engl J Med 2015 Apr 1;372(17):1639-45. Epub 2015 Apr 1.

From the Department of Genome Sciences (M.W.S., J.O.K., B.P.C., R.M.D., E.E.E., J.S.), Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology (L.E.S., J.M.H., H.S.G.), and Howard Hughes Medical Institute (E.E.E.), University of Washington, and the Clinical Research Division, Fred Hutchinson Cancer Research Center (H.S.G.) - both in Seattle.

Investigations of noninvasive prenatal screening for aneuploidy by analysis of circulating cell-free DNA (cfDNA) have shown high sensitivity and specificity in both high-risk and low-risk cohorts. However, the overall low incidence of aneuploidy limits the positive predictive value of these tests. Currently, the causes of false positive results are poorly understood. We investigated four pregnancies with discordant prenatal test results and found in two cases that maternal duplications on chromosome 18 were the likely cause of the discordant results. Modeling based on population-level copy-number variation supports the possibility that some false positive results of noninvasive prenatal screening may be attributable to large maternal copy-number variants. (Funded by the National Institutes of Health and others.).
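
The link between low incidence and limited positive predictive value follows from Bayes' rule, and a short Python sketch makes the arithmetic concrete. The function name and the sensitivity, specificity, and prevalence values below are hypothetical illustrations, not figures from the study.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Bayes' rule: probability the condition is present given a positive result."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical test characteristics, held fixed while prevalence drops tenfold.
for prevalence in (0.01, 0.001):
    ppv = positive_predictive_value(sensitivity=0.99, specificity=0.999, prevalence=prevalence)
    print(f"prevalence={prevalence:.3f} PPV={ppv:.3f}")
```

With these assumed values, the same test yields a much lower positive predictive value in the lower-prevalence setting, because false positives from the large unaffected group make up a larger share of all positive results.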

Source
DOI: http://dx.doi.org/10.1056/NEJMoa1408408
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4411081
April 2015

Massively Parallel Functional Analysis of BRCA1 RING Domain Variants.

Genetics 2015 Jun 30;200(2):413-22. Epub 2015 Mar 30.

Department of Genome Sciences, University of Washington, Seattle, Washington 98195; Department of Medicine, University of Washington, Seattle, Washington 98195; Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195.

Interpreting variants of uncertain significance (VUS) is a central challenge in medical genetics. One approach is to experimentally measure the functional consequences of VUS, but to date this approach has been post hoc and low throughput. Here we use massively parallel assays to measure the effects of nearly 2000 missense substitutions in the RING domain of BRCA1 on its E3 ubiquitin ligase activity and its binding to the BARD1 RING domain. From the resulting scores, we generate a model to predict the capacities of full-length BRCA1 variants to support homology-directed DNA repair, the essential role of BRCA1 in tumor suppression, and show that it outperforms widely used biological-effect prediction algorithms. We envision that massively parallel functional assays may facilitate the prospective interpretation of variants observed in clinical sequencing.

Source
DOI: http://dx.doi.org/10.1534/genetics.115.175802
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4492368
June 2015

Deep sequencing of multiple regions of glial tumors reveals spatial heterogeneity for mutations in clinically relevant genes.

Genome Biol 2014 Dec 3;15(12):530. Epub 2014 Dec 3.

Background: The extent of intratumoral mutational heterogeneity remains unclear in gliomas, the most common primary brain tumors, especially with respect to point mutation. To address this, we applied single molecule molecular inversion probes targeting 33 cancer genes to assay both point mutations and gene amplifications within spatially distinct regions of 14 glial tumors.

Results: We find evidence of regional mutational heterogeneity in multiple tumors, including mutations in TP53 and RB1 in an anaplastic oligodendroglioma and amplifications in PDGFRA and KIT in two glioblastomas (GBMs). Immunohistochemistry confirms heterogeneity of TP53 mutation and PDGFRA amplification. In all, 3 out of 14 glial tumors surveyed have evidence for heterogeneity for clinically relevant mutations.

Conclusions: Our results underscore the need to sample multiple regions in GBM and other glial tumors when devising personalized treatments based on genomic information, and furthermore demonstrate the importance of measuring both point mutation and copy number alteration while investigating genetic heterogeneity within cancer samples.

Source
DOI: http://dx.doi.org/10.1186/s13059-014-0530-z
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4272528
December 2014

Massively parallel single-amino-acid mutagenesis.

Nat Methods 2015 Mar 5;12(3):203-6, 4 p following 206. Epub 2015 Jan 5.

Department of Genome Sciences, University of Washington, Seattle, Washington, USA.

Random mutagenesis methods only partially cover the mutational space and are constrained by DNA synthesis length limitations. Here we demonstrate programmed allelic series (PALS), a single-volume, site-directed mutagenesis approach using microarray-programmed oligonucleotides. We created libraries including nearly every missense mutation as singleton events for the yeast transcription factor Gal4 (99.9% coverage) and human tumor suppressor p53 (93.5%). PALS-based comprehensive missense mutational scans may aid structure-function studies, protein engineering, and the interpretation of variants identified by clinical sequencing.
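
A coverage figure like those quoted above is the fraction of all possible single-amino-acid substitutions that appear in a library. The Python sketch below shows one way such a figure could be computed. The function names, the three-residue toy protein, and the simulated "observed" set are hypothetical, and this is not the PALS analysis code.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

Substitution = tuple[int, str, str]  # (1-based position, wild-type, mutant)


def all_single_substitutions(protein: str) -> set[Substitution]:
    """Enumerate every missense change: 19 alternatives at each position."""
    return {
        (pos, wt, mut)
        for pos, wt in enumerate(protein, start=1)
        for mut in AMINO_ACIDS
        if mut != wt
    }


def coverage(protein: str, observed: set[Substitution]) -> float:
    """Fraction of possible single substitutions present in `observed`."""
    possible = all_single_substitutions(protein)
    if not possible:
        return 0.0
    return len(possible & observed) / len(possible)


# Toy example: a 3-residue "protein" with three substitutions missing from the library.
toy_protein = "MKL"
possible = all_single_substitutions(toy_protein)
missing = {(1, "M", "A"), (2, "K", "R"), (3, "L", "V")}
observed = possible - missing
print(f"{len(possible)} possible, coverage={coverage(toy_protein, observed):.3f}")
```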

Source
DOI: http://dx.doi.org/10.1038/nmeth.3223
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4344410
March 2015

Large-scale genomic sequencing of extraintestinal pathogenic Escherichia coli strains.

Genome Res 2015 Jan 4;25(1):119-28. Epub 2014 Nov 4.

Department of Genome Sciences,

Large-scale bacterial genome sequencing efforts to date have provided limited information on the most prevalent category of disease: sporadically acquired infections caused by common pathogenic bacteria. Here, we performed whole-genome sequencing and de novo assembly of 312 blood- or urine-derived isolates of extraintestinal pathogenic (ExPEC) Escherichia coli, a common agent of sepsis and community-acquired urinary tract infections, obtained during the course of routine clinical care at a single institution. We find that ExPEC E. coli are highly genomically heterogeneous, consistent with pan-genome analyses encompassing the larger species. Investigation of differential virulence factor content and antibiotic resistance phenotypes reveals markedly different profiles among lineages and among strains infecting different body sites. We use high-resolution molecular epidemiology to explore the dynamics of infections at the level of individual patients, including identification of possible person-to-person transmission. Notably, a limited number of discrete lineages caused the majority of bloodstream infections, including one subclone (ST131-H30) responsible for 28% of bacteremic E. coli infections over a 3-yr period. We additionally use a microbial genome-wide-association study (GWAS) approach to identify individual genes responsible for antibiotic resistance, successfully recovering known genes but notably not identifying any novel factors. We anticipate that in the near future, whole-genome sequencing of microorganisms associated with clinical disease will become routine. Our study reveals what kind of information can be obtained from sequencing clinical isolates on a large scale, even well-characterized organisms such as E. coli, and provides insight into how this information might be utilized in a healthcare setting.

Source
DOI: http://dx.doi.org/10.1101/gr.180190.114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4317167
January 2015

In vitro, long-range sequence information for de novo genome assembly via transposase contiguity.

Genome Res 2014 Dec 19;24(12):2041-9. Epub 2014 Oct 19.

Department of Genome Sciences, University of Washington, Seattle, Washington 98115, USA;

We describe a method that exploits contiguity-preserving transposase sequencing (CPT-seq) to facilitate the scaffolding of de novo genome assemblies. CPT-seq is an entirely in vitro means of generating libraries comprising 9216 indexed pools, each of which contains thousands of sparsely sequenced long fragments ranging from 5 kilobases to >1 megabase. These pools are "subhaploid," in that the lengths of fragments contained in each pool sum to ∼5% to 10% of the full genome. The scaffolding approach described here, termed fragScaff, leverages coincidences between the content of different pools as a source of contiguity information. Specifically, CPT-seq data are mapped to a de novo genome assembly, followed by the identification of pairs of contigs or scaffolds whose ends disproportionately co-occur in the same indexed pools, consistent with true adjacency in the genome. Such candidate "joins" are used to construct a graph, which is then resolved by a minimum spanning tree. As a proof-of-concept, we apply CPT-seq and fragScaff to substantially boost the contiguity of de novo assemblies of the human, mouse, and fly genomes, increasing the scaffold N50 of de novo assemblies by eight- to 57-fold with high accuracy. We also demonstrate that fragScaff is complementary to Hi-C-based contact probability maps, providing midrange contiguity to support robust, accurate chromosome-scale de novo genome assemblies without the need for laborious in vivo cloning steps. Finally, we demonstrate CPT-seq as a means of anchoring unplaced novel human contigs to the reference genome as well as for detecting misassembled sequences.
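
The general idea of scoring candidate joins by shared pools and resolving the resulting graph with a spanning tree can be sketched in Python. This is an illustration under stated assumptions, not fragScaff's actual scoring or code: the Jaccard score, the `min_score` threshold, the function names, and the toy pool assignments are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set[int], b: set[int]) -> float:
    """Shared-pool score: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_joins(pools_by_end: dict[str, set[int]], min_score: float) -> list[tuple[float, str, str]]:
    """Score every pair of contig ends and keep pairs at or above min_score."""
    joins = []
    for u, v in combinations(sorted(pools_by_end), 2):
        score = jaccard(pools_by_end[u], pools_by_end[v])
        if score >= min_score:
            joins.append((score, u, v))
    return joins


def spanning_forest(nodes: list[str], joins: list[tuple[float, str, str]]) -> list[tuple[str, str]]:
    """Kruskal's algorithm taking the strongest joins first.

    Equivalent to a minimum spanning forest over the negated scores.
    """
    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for _score, u, v in sorted(joins, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # this edge does not close a cycle
            parent[root_u] = root_v
            kept.append((u, v))
    return kept


# Toy input (made up): indexed pools in which each contig end was observed.
pools = {"A": {1, 2, 3, 4}, "B": {2, 3, 4, 5}, "C": {4, 5, 6, 7}}
joins = candidate_joins(pools, min_score=0.1)
tree = spanning_forest(list(pools), joins)
print(tree)
```

In the toy input the weakest join (A with C) is dropped because it would close a cycle. A spanning tree can still branch, so a real scaffolder needs further steps to order and orient sequences, which this sketch does not attempt.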

Source
DOI: http://dx.doi.org/10.1101/gr.178319.114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4248320
December 2014

Haplotype-resolved whole-genome sequencing by contiguity-preserving transposition and combinatorial indexing.

Nat Genet 2014 Dec 19;46(12):1343-9. Epub 2014 Oct 19.

Illumina, Inc., Advanced Research Group, San Diego, California, USA.

Haplotype-resolved genome sequencing enables the accurate interpretation of medically relevant genetic variation, deep inferences regarding population history and non-invasive prediction of fetal genomes. We describe an approach for genome-wide haplotyping based on contiguity-preserving transposition (CPT-seq) and combinatorial indexing. Tn5 transposition is used to modify DNA with adaptor and index sequences while preserving contiguity. After DNA dilution and compartmentalization, the transposase is removed, resolving the DNA into individually indexed libraries. The libraries in each compartment, enriched for neighboring genomic elements, are further indexed via PCR. Combinatorial 96-plex indexing at both the transposition and PCR stage enables the construction of phased synthetic reads from each of the nearly 10,000 'virtual compartments'. We demonstrate the feasibility of this method by assembling >95% of the heterozygous variants in a human genome into long, accurate haplotype blocks (N50 = 1.4-2.3 Mb). The rapid, scalable and cost-effective workflow could enable haplotype resolution to become routine in human genome sequencing.
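
The N50 statistic quoted for haplotype blocks is the length at which blocks of that size or longer cover at least half of the total. A minimal Python sketch of that standard definition follows; the function name and the toy block lengths are hypothetical.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Largest length L such that items of length >= L sum to at least half the total."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 is undefined for an empty collection")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:  # integer comparison avoids float rounding
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Toy haplotype block lengths in kb (made up).
toy_blocks_kb = [10, 9, 8, 7, 6, 5, 4, 3, 2]
print(n50(toy_blocks_kb))
```

For the toy lengths the total is 54, and the three longest blocks (10 + 9 + 8 = 27) reach half of it, so the N50 is 8.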

Source
DOI: http://dx.doi.org/10.1038/ng.3119
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4409979
December 2014

Adaptive gene amplification as an intermediate step in the expansion of virus host range.

PLoS Pathog 2014 Mar 13;10(3):e1004002. Epub 2014 Mar 13.

Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America; Division of Clinical Research, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America; Departments of Microbiology and Medicine, University of Washington, Seattle, Washington, United States of America.

The majority of recently emerging infectious diseases in humans is due to cross-species pathogen transmissions from animals. To establish a productive infection in new host species, viruses must overcome barriers to replication mediated by diverse and rapidly evolving host restriction factors such as protein kinase R (PKR). Many viral antagonists of these restriction factors are species specific. For example, the rhesus cytomegalovirus PKR antagonist, RhTRS1, inhibits PKR in some African green monkey (AGM) cells, but does not inhibit human or rhesus macaque PKR. To model the evolutionary changes necessary for cross-species transmission, we generated a recombinant vaccinia virus that expresses RhTRS1 in a strain that lacks PKR inhibitors E3L and K3L (VVΔEΔK+RhTRS1). Serially passaging VVΔEΔK+RhTRS1 in minimally permissive AGM cells increased viral replication 10- to 100-fold. Notably, adaptation in these AGM cells also improved virus replication 1000- to 10,000-fold in human and rhesus cells. Genetic analyses including deep sequencing revealed amplification of the rhtrs1 locus in the adapted viruses. Supplying additional rhtrs1 in trans confirmed that amplification alone was sufficient to improve VVΔEΔK+RhTRS1 replication. Viruses with amplified rhtrs1 completely blocked AGM PKR, but only partially blocked human PKR, consistent with the replication properties of these viruses in AGM and human cells. Finally, in contrast to AGM-adapted viruses, which could be serially propagated in human cells, VVΔEΔK+RhTRS1 yielded no progeny virus after only three passages in human cells. Thus, rhtrs1 amplification in a minimally permissive intermediate host was a necessary step, enabling expansion of the virus range to previously nonpermissive hosts. These data support the hypothesis that amplification of a weak viral antagonist may be a general evolutionary mechanism to permit replication in otherwise resistant host species, providing a molecular foothold that could enable further adaptations necessary for efficient replication in the new host.

Source
DOI: http://dx.doi.org/10.1371/journal.ppat.1004002
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3953438
March 2014

The complete genome sequence of a Neanderthal from the Altai Mountains.

Nature 2014 Jan 18;505(7481):43-9. Epub 2013 Dec 18.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany.

We present a high-quality genome sequence of a Neanderthal woman from Siberia. We show that her parents were related at the level of half-siblings and that mating among close relatives was common among her recent ancestors. We also sequenced the genome of a Neanderthal from the Caucasus to low coverage. An analysis of the relationships and population history of available archaic genomes and 25 present-day human genomes shows that several gene flow events occurred among Neanderthals, Denisovans and early modern humans, possibly including gene flow into Denisovans from an unknown archaic group. Thus, interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene. In addition, the high-quality Neanderthal genome allows us to establish a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neanderthals and Denisovans.

Source
DOI: http://dx.doi.org/10.1038/nature12886
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4031459
January 2014

Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions.

Nat Biotechnol 2013 Dec 3;31(12):1119-25. Epub 2013 Nov 3.

Department of Genome Sciences, University of Washington, Seattle, Washington, USA.

Genomes assembled de novo from short reads are highly fragmented relative to the finished chromosomes of Homo sapiens and key model organisms generated by the Human Genome Project. To address this problem, we need scalable, cost-effective methods to obtain assemblies with chromosome-scale contiguity. Here we show that genome-wide chromatin interaction data sets, such as those generated by Hi-C, are a rich source of long-range information for assigning, ordering and orienting genomic sequences to chromosomes, including across centromeres. To exploit this finding, we developed an algorithm that uses Hi-C data for ultra-long-range scaffolding of de novo genome assemblies. We demonstrate the approach by combining shotgun fragment and short jump mate-pair sequences with Hi-C data to generate chromosome-scale de novo assemblies of the human, mouse and Drosophila genomes, achieving, for the human genome, 98% accuracy in assigning scaffolds to chromosome groups and 99% accuracy in ordering and orienting scaffolds within chromosome groups. Hi-C data can also be used to validate chromosomal translocations in cancer genomes.

Source
DOI: http://dx.doi.org/10.1038/nbt.2727
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4117202
December 2013