Publications by authors named "William Chow"

32 Publications

The genome sequence of the European golden eagle, Aquila chrysaetos chrysaetos Linnaeus 1758.

Wellcome Open Res 2021 May 14;6:112. Epub 2021 May 14.

Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.

We present a genome assembly from an individual female Aquila chrysaetos chrysaetos (the European golden eagle; Chordata; Aves; Accipitridae). The genome sequence is 1.23 gigabases in span. The majority of the assembly is scaffolded into 28 chromosomal pseudomolecules, including the W and Z sex chromosomes.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.16631.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8499043
May 2021

The genome sequence of the Norway rat, Rattus norvegicus Berkenhout 1769.

Wellcome Open Res 2021 May 18;6:118. Epub 2021 May 18.

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

We present a genome assembly from an individual male Rattus norvegicus (the Norway rat; Chordata; Mammalia; Rodentia; Muridae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled. This genome assembly, mRatBN7.2, represents the new reference genome for R. norvegicus and has been adopted by the Genome Reference Consortium.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.16854.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8495504
May 2021

The genome sequence of the brown trout, Salmo trutta Linnaeus 1758.

Wellcome Open Res 2021 May 13;6:108. Epub 2021 May 13.

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.

We present a genome assembly from an individual female Salmo trutta (the brown trout; Chordata; Actinopteri; Salmoniformes; Salmonidae). The genome sequence is 2.37 gigabases in span. The majority of the assembly is scaffolded into 40 chromosomal pseudomolecules. Gene annotation of this assembly on Ensembl has identified 43,935 protein-coding genes.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.16838.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8488904
May 2021

A Chromosome-Level Genome Assembly of the Reed Warbler (Acrocephalus scirpaceus).

Genome Biol Evol 2021 Sep;13(9)

Centre for Ecological and Evolutionary Synthesis, University of Oslo, Norway.

The reed warbler (Acrocephalus scirpaceus) is a long-distance migrant passerine with a wide distribution across Eurasia. This species has fascinated researchers for decades, especially its role as host of a brood parasite, and its capacity for rapid phenotypic change in the face of climate change. Currently, it is expanding its range northwards in Europe, and is altering its migratory behavior in certain areas. Thus, there is great potential to discover signs of recent evolution and its impact on the genomic composition of the reed warbler. Here, we present a high-quality reference genome for the reed warbler, based on PacBio, 10×, and Hi-C sequencing. The genome has an assembly size of 1,075,083,815 bp with a scaffold N50 of 74,438,198 bp and a contig N50 of 12,742,779 bp. BUSCO analysis using aves_odb10 as a model showed that 95.7% of BUSCO genes were complete. We found unequivocal evidence of two separate macrochromosomal fusions in the reed warbler genome, in addition to the previously identified fusion between chromosome Z and a part of chromosome 4A in the Sylvioidea superfamily. We annotated 14,645 protein-coding genes, and a BUSCO analysis of the protein sequences indicated 97.5% completeness. This reference genome will serve as an important resource, and will provide new insights into the genomic effects of evolutionary drivers such as coevolution, range expansion, and adaptations to climate change, as well as chromosomal rearrangements in birds.

Source
DOI: http://dx.doi.org/10.1093/gbe/evab212
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8459166
September 2021

Towards complete and error-free genome assemblies of all vertebrate species.

Nature 2021 Apr 28;592(7856):737-746. Epub 2021 Apr 28.

UQ Genomics, University of Queensland, Brisbane, Queensland, Australia.

High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species. To address this issue, the international Genome 10K (G10K) consortium has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03451-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081667
April 2021

Evolutionary and biomedical insights from a marmoset diploid genome assembly.

Nature 2021 06 28;594(7862):227-233. Epub 2021 Apr 28.

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

The accurate and complete assembly of both haplotype sequences of a diploid organism is essential to understanding the role of variation in genome functions, phenotypes and diseases. Here, using a trio-binning approach, we present a high-quality, diploid reference genome, with both haplotypes assembled independently at the chromosome level, for the common marmoset (Callithrix jacchus), a primate model system that is widely used in biomedical research. The full spectrum of heterozygosity between the two haplotypes involves 1.36% of the genome, much higher than the 0.13% indicated by the standard estimation based on single-nucleotide heterozygosity alone. The de novo mutation rate is 0.43 × 10⁻⁸ per site per generation, and the paternally inherited genome acquired twice as many mutations as the maternal. Our diploid assembly enabled us to discover a recent expansion of the sex-differentiation region and unique evolutionary changes in the marmoset Y chromosome. In addition, we identified many genes with signatures of positive selection that might have contributed to the evolution of Callithrix biological features. Brain-related genes were highly conserved between marmosets and humans, although several genes experienced lineage-specific copy number variations or diversifying selection, with implications for the use of marmosets as a model system.

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03535-x
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8189906
June 2021

Significantly improving the quality of genome assemblies through curation.

Gigascience 2021 01;10(1)

Tree of Life, Wellcome Sanger Institute, Cambridge CB10 1SA, UK.

Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved goal of a multitude of research projects. Despite the ever-advancing improvements in data generation, assembly algorithms and pipelines, no automated approach has so far reliably generated near error-free genome assemblies for eukaryotes. Whilst working towards improved datasets and fully automated pipelines, assembly evaluation and curation is actively used to bridge this shortcoming and significantly reduce the number of assembly errors. In addition to this increase in product value, the insights gained from assembly curation are fed back into the automated assembly strategy and contribute to notable improvements in genome assembly quality. We describe our tried and tested approach for assembly curation using gEVAL, the genome evaluation browser. We outline the procedures applied to genome curation using gEVAL and also our recommendations for assembly curation in a gEVAL-independent context to facilitate the uptake of genome curation in the wider community.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giaa153
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7794651
January 2021

The genome sequence of the channel bull blenny, Cottoperca gobio (Günther, 1861).

Wellcome Open Res 2020 Jun 24;5:148. Epub 2020 Jun 24.

Wellcome Sanger Institute, Cambridge, CB10 1SA, UK.

We present a genome assembly for Cottoperca gobio (channel bull blenny, (Günther, 1861)); Chordata; Actinopterygii (ray-finned fishes), a temperate-water outgroup for Antarctic notothenioids. The size of the genome assembly is 609 megabases, with the majority of the assembly scaffolded into 24 chromosomal pseudomolecules. Gene annotation of this assembly on Ensembl has identified 21,662 coding genes.

Source
DOI: http://dx.doi.org/10.12688/wellcomeopenres.16012.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7649722
June 2020

Reference genome and demographic history of the most endangered marine mammal, the vaquita.

Mol Ecol Resour 2021 May 20;21(4):1008-1020. Epub 2020 Nov 20.

Marine Mammal Research, Department of Bioscience, Aarhus University, Roskilde, Denmark.

The vaquita is the most critically endangered marine mammal, with fewer than 19 remaining in the wild. First described in 1958, the vaquita has been in rapid decline for more than 20 years resulting from inadvertent deaths due to the increasing use of large-mesh gillnets. To understand the evolutionary and demographic history of the vaquita, we used combined long-read sequencing and long-range scaffolding methods with long- and short-read RNA sequencing to generate a near error-free annotated reference genome assembly from cell lines derived from a female individual. The genome assembly consists of 99.92% of the assembled sequence contained in 21 nearly gapless chromosome-length autosome scaffolds and the X-chromosome scaffold, with a scaffold N50 of 115 Mb. Genome-wide heterozygosity is the lowest (0.01%) of any mammalian species analysed to date, but heterozygosity is evenly distributed across the chromosomes, consistent with long-term small population size at genetic equilibrium, rather than low diversity resulting from a recent population bottleneck or inbreeding. Historical demography of the vaquita indicates long-term population stability at less than 5,000 (Ne) for over 200,000 years. Together, these analyses indicate that the vaquita genome has had ample opportunity to purge highly deleterious alleles and potentially maintain diversity necessary for population health.

Source
DOI: http://dx.doi.org/10.1111/1755-0998.13284
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8247363
May 2021

Telomere-to-telomere assembly of a complete human X chromosome.

Nature 2020 09 14;585(7823):79-84. Epub 2020 Jul 14.

Arima Genomics, San Diego, CA, USA.

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist. Here we present a human genome assembly that surpasses the continuity of GRCh38, along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome, we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-2547-7
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7484160
September 2020

An improved pig reference genome sequence to enable pig genetics and genomics research.

Gigascience 2020 06;9(6)

Department of Pathology, University of Cambridge, Tennis Court Road, Cambridge CB2 1QP, UK.

Background: The domestic pig (Sus scrofa) is important both as a food source and as a biomedical model given its similarity in size, anatomy, physiology, metabolism, pathology, and pharmacology to humans. The draft reference genome (Sscrofa10.2) of a purebred Duroc female pig established using older clone-based sequencing methods was incomplete, and unresolved redundancies, short-range order and orientation errors, and associated misassembled genes limited its utility.

Results: We present 2 annotated highly contiguous chromosome-level genome assemblies created with more recent long-read technologies and a whole-genome shotgun strategy, 1 for the same Duroc female (Sscrofa11.1) and 1 for an outbred, composite-breed male (USMARCv1.0). Both assemblies are of substantially higher (>90-fold) continuity and accuracy than Sscrofa10.2.

Conclusions: These highly contiguous assemblies plus annotation of a further 11 short-read assemblies provide an unprecedented view of the genetic make-up of this important agricultural and biomedical model species. We propose that the improved Duroc assembly (Sscrofa11.1) become the reference genome for genomic research in pigs.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giaa051
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7448572
June 2020

Birth, expansion, and death of VCY-containing palindromes on the human Y chromosome.

Genome Biol 2019 10 14;20(1):207. Epub 2019 Oct 14.

The Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.

Background: Large palindromes (inverted repeats) make up substantial proportions of mammalian sex chromosomes, often contain genes, and have high rates of structural variation arising via ectopic recombination. As a result, they underlie many genomic disorders. Maintenance of the palindromic structure by gene conversion between the arms has been documented, but over longer time periods, palindromes are remarkably labile. Mechanisms of origin and loss of palindromes have, however, received little attention.

Results: Here, we use fiber-FISH, 10x Genomics Linked-Read sequencing, and breakpoint PCR sequencing to characterize the structural variation of the P8 palindrome on the human Y chromosome, which contains two copies of the VCY (Variable Charge Y) gene. We find a deletion of almost an entire arm of the palindrome, leading to death of the palindrome, a size increase by recruitment of adjacent sequence, and other complex changes including the formation of an entire new palindrome nearby. Together, these changes are found in ~1% of men, and we can assign likely molecular mechanisms to these mutational events. As a result, healthy men can have 1-4 copies of VCY.

Conclusions: Gross changes, especially duplications, in palindrome structure can be relatively frequent and facilitate the evolution of sex chromosomes in humans, and potentially also in other mammalian species.

Source
DOI: http://dx.doi.org/10.1186/s13059-019-1816-y
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6790999
October 2019

Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci.

Nat Genet 2018 11 1;50(11):1574-1583. Epub 2018 Oct 1.

Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci, reflecting distinct strain phenotypes. We used these genomes to improve the mouse reference genome, resulting in the completion of 10 new gene structures. Also, 62 new coding loci were added to the reference genome annotation. These genomes identified a large, previously unannotated, gene (Efcab3-like) encoding 5,874 amino acids. Mutant Efcab3-like mice display anomalies in multiple brain regions, suggesting a possible role for this gene in the regulation of brain development.

Source
DOI: http://dx.doi.org/10.1038/s41588-018-0223-8
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6205630
November 2018

Non-selective beta blockers inhibit angiosarcoma cell viability and increase progression free- and overall-survival in patients diagnosed with metastatic angiosarcoma.

Oncoscience 2018 Mar 29;5(3-4):109-119. Epub 2018 Apr 29.

Department of Biomedical Sciences, Texas Tech University Health Sciences Center, El Paso, TX, USA.

Patients with metastatic angiosarcoma undergoing chemotherapy, radiation, and/or surgery experience a median progression-free survival of less than 6 months and a median overall survival of less than 12 months. Given the aggressive nature of this cancer, angiosarcoma clinical responses to chemotherapy or targeted therapeutics are generally very poor. Inhibition of beta adrenergic receptor (β-AR) signaling has recently been shown to decrease angiosarcoma tumor cell viability, abrogate tumor growth in mouse models, and decrease proliferation rates in preclinical and clinical settings. In the current study we used cell and animal tumor models to show that β-AR antagonism abrogates mitogenic signaling and reduces angiosarcoma tumor cell viability, and these molecular alterations translated into patient tumors. We demonstrated that non-selective β-AR antagonists are superior to selective β-AR antagonists at inhibiting angiosarcoma cell viability. A prospective analysis of non-selective β-AR antagonists in a single-arm clinical study of metastatic angiosarcoma patients revealed that incorporation of either propranolol or carvedilol into patients' treatment regimens leads to a median progression-free and overall survival of 9 and 36 months, respectively. These data suggest that incorporation of non-selective β-AR antagonists into existing therapies against metastatic angiosarcoma can enhance clinical outcomes.

Source
DOI: http://dx.doi.org/10.18632/oncoscience.413
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5978448
March 2018

Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

Genome Res 2018 04 21;28(4):448-459. Epub 2018 Mar 21.

Yale University Medical School, Computational Biology and Bioinformatics Program, New Haven, Connecticut 06520, USA.

Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.

Source
DOI: http://dx.doi.org/10.1101/gr.234096.117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5880236
April 2018

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

Genome Res 2017 05 10;27(5):849-864. Epub 2017 Apr 10.

Pacific Biosciences, Menlo Park, California 94025, USA.

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.

Source
DOI: http://dx.doi.org/10.1101/gr.213611.116
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5411779
May 2017

A New Chicken Genome Assembly Provides Insight into Avian Genome Structure.

G3 (Bethesda) 2017 01 5;7(1):109-117. Epub 2017 Jan 5.

The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, United Kingdom.

The importance of the Gallus gallus (chicken) as a model organism and agricultural animal merits a continuation of sequence assembly improvement efforts. We present a new version of the chicken genome assembly (Gallus_gallus-5.0; GCA_000002315.3), built from combined long single molecule sequencing technology, finished BACs, and improved physical maps. In overall assembled bases, we see a gain of 183 Mb, including 16.4 Mb in placed chromosomes with a corresponding gain in the percentage of intact repeat elements characterized. Of the 1.21 Gb genome, we include three previously missing autosomes, GGA30, 31, and 33, and improve sequence contig length 10-fold over the previous Gallus_gallus-4.0. Despite the significant base representation improvements made, 138 Mb of sequence is not yet located to chromosomes. When annotated for gene content, Gallus_gallus-5.0 shows an increase of 4679 annotated genes (2768 noncoding and 1911 protein-coding) over those in Gallus_gallus-4.0. We also revisited the question of what genes are missing in the avian lineage, as assessed by the highest quality avian genome assembly to date, and found that a large fraction of the original set of missing genes are still absent in sequenced bird species. Finally, our new data support a detailed map of MHC-B, encompassing two segments: one with a highly stable gene copy number and another in which the gene copy number is highly variable. The chicken model has been a critical resource for many other fields of study, and this new reference assembly will substantially further these efforts.

Source
DOI: http://dx.doi.org/10.1534/g3.116.035923
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5217101
January 2017

gEVAL - a web-based browser for evaluating genome assemblies.

Bioinformatics 2016 08 7;32(16):2508-10. Epub 2016 Apr 7.

Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

Motivation: For most research approaches, genome analyses are dependent on the existence of a high quality genome reference assembly. However, the local accuracy of an assembly remains difficult to assess and improve. The gEVAL browser allows the user to interrogate an assembly in any region of the genome by comparing it to different datasets and evaluating the concordance. These analyses include: a wide variety of sequence alignments, comparative analyses of multiple genome assemblies, and consistency with optical and other physical maps. gEVAL highlights allelic variations, regions of low complexity, abnormal coverage, and potential sequence and assembly errors, and offers strategies for improvement. Although gEVAL focuses primarily on sequence integrity, it can also display arbitrary annotation including from Ensembl or TrackHub sources. We provide gEVAL web sites for many human, mouse, zebrafish and chicken assemblies to support the Genome Reference Consortium, and gEVAL is also downloadable to enable its use for any organism and assembly.

Availability And Implementation: Web Browser: http://geval.sanger.ac.uk, Plugin: http://wchow.github.io/wtsi-geval-plugin

Contact: [email protected]

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btw159
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4978925
August 2016

The pig X and Y Chromosomes: structure, sequence, and evolution.

Genome Res 2016 Jan 11;26(1):130-9. Epub 2015 Nov 11.

Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom.

We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes (both single copy and amplified) on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution.

Source
DOI: http://dx.doi.org/10.1101/gr.188839.114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4691746
January 2016

Growth Attenuation of Cutaneous Angiosarcoma With Propranolol-Mediated β-Blockade.

JAMA Dermatol 2015 Nov;151(11):1226-9

Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso.

Importance: Patients with stage T2 multilesion angiosarcomas of the scalp and face that are larger than 10 cm demonstrate a 2-year survival rate of 0%. To our knowledge, major therapeutic advances against this disease have not been reported for decades. Preclinical data indicate that blocking β-adrenergic signaling with propranolol hydrochloride disrupts angiosarcoma cell survival and xenograft angiosarcoma progression.

Observations: A patient presented with a β-adrenergic-positive multifocal stage T2 cutaneous angiosarcoma (≥20 cm) involving 80% of the scalp, left forehead, and left cheek, with no evidence of metastasis. The patient was immediately administered propranolol hydrochloride, 40 mg twice a day, as his workup progressed and treatment options were elucidated. Evaluation of the proliferative index of the tumor before and after only 1 week of propranolol monotherapy revealed a reduction in the proliferative index of the tumor by approximately 34%. A combination of propranolol hydrochloride, 40 mg 3 times a day, paclitaxel poliglumex, 2 mg/m2 infused weekly, and radiotherapy during the subsequent 8 months resulted in extensive tumor regression with no detectable metastases.

Conclusions And Relevance: Our data suggest that β-blockade alone substantially reduced angiosarcoma proliferation and, in combination with standard therapy, is effective for reducing the size of the tumor and preventing metastases. If successful, β-blockade could be the first major advancement in the treatment of angiosarcoma in decades.

Source
DOI: http://dx.doi.org/10.1001/jamadermatol.2015.2554
November 2015

Evaluation of WHO criteria for diagnosis of polycythemia vera: a prospective analysis.

Blood 2013 Sep 30;122(11):1881-6. Epub 2013 Jul 30.

Division of Hematology and Medical Oncology, Department of Medicine.

We prospectively evaluated the accuracy of the 2007 World Health Organization (WHO) criteria for diagnosing polycythemia vera (PV), especially in "early-stage" patients. A total of 28 of 30 patients were diagnosed as PV owing to an elevated Cr-51 red cell mass (RCM), JAK2 positivity, and at least 1 minor criterion. A total of 18 PV patients did not meet the WHO criterion for an increased hemoglobin value and 8 did not meet the WHO criterion for an increased hematocrit value. Bone marrow morphology was very valuable for diagnosis. Low serum erythropoietin (EPO) values were specific for PV, but normal EPO values were found at presentation (20%). We recommend revision of the WHO criteria, especially to distinguish early-stage PV from essential thrombocythemia. Major criteria remain JAK2 positivity and increased red cell volume, but Cr-51 RCM is mandatory for patients who do not meet the defined elevated hemoglobin or hematocrit value (>18.5 g/dL and 60% in men and >16.5 g/dL and 56% in women, respectively). Minor criteria remain bone marrow histology or a low serum EPO value. For patients with a normal EPO value, marrow examination is mandatory for diagnostic confirmation. Because the therapies for myeloproliferative disorders differ, our data have major clinical implications.

Source
DOI: http://dx.doi.org/10.1182/blood-2013-06-508416
September 2013

The zebrafish reference genome sequence and its relationship to the human genome.

Nature 2013 Apr 17;496(7446):498-503. Epub 2013 Apr 17.

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.

Source
DOI: http://dx.doi.org/10.1038/nature12111
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3703927
April 2013

Analyses of pig genomes provide insight into porcine demography and evolution.

Nature 2012 Nov;491(7424):393-8

Animal Breeding and Genomics Centre, Wageningen University, De Elst 1, 6708 WD, Wageningen, The Netherlands.

For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ∼1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model.

Source
DOI: http://dx.doi.org/10.1038/nature11622
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3566564
November 2012

Decrease in JAK2 V617F allele burden is not a prerequisite to clinical response in patients with polycythemia vera.

Haematologica 2012 Apr 18;97(4):538-42. Epub 2011 Nov 18.

Weill Cornell Medical College, Department of Medicine, Division of Hematology and Medical Oncology, New York, NY 10021, USA.

Background: Although reduction in the JAK2(V617F) allele burden (%V617F) has been suggested as a criterion for defining disease response to cytoreductive therapy in polycythemia vera, its value as a response monitor is unclear. The purpose of this study is to determine whether a reduction in %V617F in polycythemia vera is a prerequisite to achieving hematologic remission in response to cytoreductive therapy.

Design And Methods: We compared the clinical and hematologic responses to change in %V617F (molecular response) in 73 patients with polycythemia vera treated with either interferon (rIFNα-2b: 28, Peg-rIFNα-2a: 18) or non-interferon drugs (n=27), which included hydroxyurea (n=8), imatinib (n=12), dasatinib (n=5), busulfan (n=1), and radioactive phosphorus (n=1). Hematologic response evaluation employed Polycythemia Vera Study Group criteria, and molecular response evaluation, European Leukemia Net criteria.

Results: Of the 46 treated with interferon, 41 (89.1%) had a hematologic response, whereas only 7 (15.2%) had a partial molecular response. Of the 27 who received non-interferon treatments, 16 (59.3%) had a hematologic response, but only 2 (7.4%) had a molecular response. Median duration of follow up was 2.8 years. Statistical agreement between hematologic response and molecular response was poor in all treatment groups.

Conclusions: Generally, hematologic response was not accompanied by molecular response. Therefore, a quantitative change in %V617F is not required for clinical response in patients with polycythemia vera.
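
As an arithmetic check on the figures reported above, the following Python sketch recomputes the group sizes and response percentages from the counts given in the abstract. The variable and function names are illustrative and are not taken from the paper. The abstract does not report the joint counts that a formal agreement statistic would need, so none is computed here.

```python
# Counts as reported in the abstract; all names here are illustrative only.
interferon = {"rIFNa-2b": 28, "Peg-rIFNa-2a": 18}
non_interferon = {
    "hydroxyurea": 8,
    "imatinib": 12,
    "dasatinib": 5,
    "busulfan": 1,
    "radioactive phosphorus": 1,
}


def percent(responders: int, total: int) -> float:
    """Return responders/total as a percentage rounded to one decimal place."""
    return round(100.0 * responders / total, 1)


n_ifn = sum(interferon.values())      # patients treated with interferon
n_non = sum(non_interferon.values())  # patients treated with other drugs

rates = {
    "interferon, hematologic response": percent(41, n_ifn),
    "interferon, partial molecular response": percent(7, n_ifn),
    "non-interferon, hematologic response": percent(16, n_non),
    "non-interferon, molecular response": percent(2, n_non),
}

print(f"interferon n={n_ifn}, non-interferon n={n_non}, total n={n_ifn + n_non}")
for label, value in rates.items():
    print(f"{label}: {value}%")
```

The recomputed totals and percentages can be compared directly with the values quoted in the Design And Methods and Results paragraphs.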

Source
DOI: http://dx.doi.org/10.3324/haematol.2011.053348
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3347655
April 2012

Genomic organization and evolution of the Atlantic salmon hemoglobin repertoire.

BMC Genomics 2010 Oct 5;11:539. Epub 2010 Oct 5.

Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada.

Background: The genomes of salmonids are considered pseudo-tetraploid undergoing reversion to a stable diploid state. Given the genome duplication and extensive biological data available for salmonids, they are excellent model organisms for studying comparative genomics, evolutionary processes, fates of duplicated genes and the genetic and physiological processes associated with complex behavioral phenotypes. The evolution of the tetrapod hemoglobin genes is well studied; however, little is known about the genomic organization and evolution of teleost hemoglobin genes, particularly those of salmonids. The Atlantic salmon serves as a representative salmonid species for genomics studies. Given the well documented role of hemoglobin in adaptation to varied environmental conditions as well as its use as a model protein for evolutionary analyses, an understanding of the genomic structure and organization of the Atlantic salmon α and β hemoglobin genes is of great interest.

Results: We identified four bacterial artificial chromosomes (BACs) comprising two hemoglobin gene clusters spanning the entire α and β hemoglobin gene repertoire of the Atlantic salmon genome. Their chromosomal locations were established using fluorescence in situ hybridization (FISH) analysis and linkage mapping, demonstrating that the two clusters are located on separate chromosomes. The BACs were sequenced and assembled into scaffolds, which were annotated for putatively functional and pseudogenized hemoglobin-like genes. This revealed that the tail-to-tail organization and alternating pattern of the α and β hemoglobin genes are well conserved in both clusters, as well as that the Atlantic salmon genome houses substantially more hemoglobin genes, including non-Bohr β globin genes, than the genomes of other teleosts that have been sequenced.

Conclusions: We suggest that the most parsimonious evolutionary path leading to the present organization of the Atlantic salmon hemoglobin genes involves the loss of a single hemoglobin gene cluster after the whole genome duplication (WGD) at the base of the teleost radiation but prior to the salmonid-specific WGD, which then produced the duplicated copies seen today. We also propose that the relatively high number of hemoglobin genes as well as the presence of non-Bohr β hemoglobin genes may be due to the dynamic life history of salmon and the diverse environmental conditions that the species encounters. Data deposition: BACs S0155C07 and S0079J05 (fps135): GenBank GQ898924; BACs S0055H05 and S0014B03 (fps1046): GenBank GQ898925.

Source
DOI: http://dx.doi.org/10.1186/1471-2164-11-539
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3091688
October 2010

Genomic organization of Atlantic salmon (Salmo salar) fatty acid binding protein (fabp2) genes reveals independent loss of duplicate loci in teleosts.

Mar Genomics 2009 Sep-Dec;2(3-4):193-200. Epub 2009 Nov 25.

Department of Molecular Biology and Biochemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia, Canada V5A 1S6.

Gene and genome duplications are considered to be driving forces of evolution. The relatively recent genome duplication in the common ancestor of salmonids makes this group of fish an excellent system for studying the re-diploidization process and the fates of duplicate genes. We characterized the structure and genome organization of the intestinal fatty acid binding protein (fabp2) genes in Atlantic salmon as a means of understanding the evolutionary fates of members of this protein family in teleosts. A survey of EST databases identified three unique salmonid fabp2 transcripts (fabp2aI, fabp2aII and fabp2b) compared to one transcript in zebrafish. We screened the CHORI-214 Atlantic salmon BAC library and identified BACs containing each of the three fabp2 genes. Physical mapping, genetic mapping and fluorescence in situ hybridization of Atlantic salmon chromosomes revealed that Atlantic salmon fabp2aI, fabp2aII and fabp2b correspond to separate genetic loci that reside on different chromosomes. Comparative genomic analyses indicated that these genes are related to one another by two genome duplications and a gene loss. The first genome duplication occurred in the common ancestor of all teleosts, giving rise to fabp2a and fabp2b, and the second in the common ancestor of salmonids, producing fabp2aI, fabp2aII, fabp2bI and fabp2bII. A subsequent loss of fabp2bI or fabp2bII gave the complement of fabp2 genes seen in Atlantic salmon today. There is also evidence for independent losses of fabp2b genes in zebrafish and tetraodon. Although there is no evidence for partitioning of tissue expression of fabp2 genes (i.e., sub-functionalization) in Atlantic salmon, the pattern of amino acid substitutions in Atlantic salmon and rainbow trout fabp2aI and fabp2aII suggests that neo-functionalization is occurring.

Source
DOI: http://dx.doi.org/10.1016/j.margen.2009.10.003
October 2012

Genomic organization and evolution of the vomeronasal type 2 receptor-like (OlfC) gene clusters in Atlantic salmon, Salmo salar.

Mol Biol Evol 2009 May 12;26(5):1117-25. Epub 2009 Feb 12.

Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada.

There are three major multigene superfamilies of olfactory receptors (OR, V1R, and V2R) in mammals. The ORs are expressed in the main olfactory organ, whereas the V1Rs and V2Rs are located in the vomeronasal organ. Fish only possess one olfactory organ in each nasal cavity, the olfactory rosette; therefore, it has been proposed that their V2R-like genes be classified as olfactory C family G protein-coupled receptors (OlfC). There are large variations in the sizes of OR gene repertoires. Previous studies have shown that fish have between 12 and 46 functional V2R-like genes, whereas humans have lost all functional V2Rs, and frog sp. have more than 240. Pseudogenization of V2R genes is a prevalent event across species. In the mouse and frog genomes, there are approximately double the number of pseudogenes compared with functional genes. An oligonucleotide probe was designed from a conserved sequence from four Atlantic salmon OlfC genes and used to screen the Atlantic salmon bacterial artificial chromosome (BAC) library. Hybridization-positive BACs were matched to fingerprint contigs, and representative BACs were shotgun cloned and sequenced. We identified 55 OlfC genes. Twenty-nine of the OlfC genes are classified as putatively functional genes and 26 as pseudogenes. The OlfC genes are found in two genomic clusters on chromosomes 9 and 20. Phylogenetic analysis revealed that the OlfC genes could be divided into 10 subfamilies, with nine of these subfamilies corresponding to subfamilies found in other teleosts and one being salmon specific. There is also a large expansion in the number of OlfC genes in one subfamily in Atlantic salmon. Subfamily gene expansions have been identified in other teleosts, and these differences in gene number reflect species-specific evolutionary requirements for olfaction. Total RNA was isolated from the olfactory epithelium and other tissues from a presmolt to examine the expression of the odorant genes. Several of the putative OlfC genes that we identified are expressed only in the olfactory epithelium, consistent with these genes encoding odorant receptors.

Source
DOI: http://dx.doi.org/10.1093/molbev/msp027
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2668830
May 2009

Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome.

BMC Genomics 2008 Aug 28;9:404. Epub 2008 Aug 28.

Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada.

Background: With a whole genome duplication event and wealth of biological data, salmonids are excellent model organisms for studying evolutionary processes, fates of duplicated genes and genetic and physiological processes associated with complex behavioral phenotypes. It is surprising therefore, that no salmonid genome has been sequenced. Atlantic salmon (Salmo salar) is a good representative salmonid for sequencing given its importance in aquaculture and the genomic resources available. However, the size and complexity of the genome combined with the lack of a sequenced reference genome from a closely related fish makes assembly challenging. Given the cost and time limitations of Sanger sequencing as well as recent improvements to next generation sequencing technologies, we examined the feasibility of using the Genome Sequencer (GS) FLX pyrosequencing system to obtain the sequence of a salmonid genome. Eight pooled BACs belonging to a minimum tiling path covering approximately 1 Mb of the Atlantic salmon genome were sequenced by GS FLX shotgun and Long Paired End sequencing and compared with a ninth BAC sequenced by Sanger sequencing of a shotgun library.

Results: An initial assembly using only GS FLX shotgun sequences (average read length 248.5 bp) with approximately 30x coverage allowed gene identification, but was incomplete even when 126 Sanger-generated BAC-end sequences (approximately 0.09x coverage) were incorporated. The addition of paired end sequencing reads (additional approximately 26x coverage) produced a final assembly comprising 175 contigs assembled into four scaffolds with 171 gaps. Sanger sequencing of the ninth BAC (approximately 10.5x coverage) produced nine contigs and two scaffolds. The number of scaffolds produced by the GS FLX assembly was comparable to Sanger-generated sequencing; however, the number of gaps was much higher in the GS FLX assembly.

Conclusion: These results represent the first use of GS FLX paired end reads for de novo sequence assembly. Our data demonstrated that this improved the GS FLX assemblies; however, with respect to de novo sequencing of complex genomes, the GS FLX technology is limited to gene mining and establishing a set of ordered sequence contigs. Currently, for a salmonid reference sequence, it appears that a substantial portion of sequencing should be done using Sanger technology.
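
The coverage figures quoted in the Results can be related to read counts through the usual relation coverage = (number of reads × mean read length) / target length. The Python sketch below is a rough illustration only. It assumes the sequenced region is exactly 1 Mb (the abstract says approximately 1 Mb) and treats the quoted coverages as averages. The function names are hypothetical, and the derived numbers are estimates, not values reported by the paper.

```python
def reads_for_coverage(coverage: float, read_length_bp: float, target_bp: int) -> float:
    """Number of reads implied by a mean coverage over a target of given size."""
    return coverage * target_bp / read_length_bp


def mean_read_length(coverage: float, n_reads: int, target_bp: int) -> float:
    """Mean read length implied by a read count and a mean coverage."""
    return coverage * target_bp / n_reads


TARGET_BP = 1_000_000  # assumption: the abstract gives "approximately 1 Mb"

# GS FLX shotgun: ~30x coverage at an average read length of 248.5 bp.
shotgun_reads = reads_for_coverage(30, 248.5, TARGET_BP)

# Sanger BAC-end sequences: 126 reads giving ~0.09x coverage.
bac_end_length = mean_read_length(0.09, 126, TARGET_BP)

print(f"Implied GS FLX shotgun reads: about {shotgun_reads:,.0f}")
print(f"Implied mean BAC-end read length: about {bac_end_length:.0f} bp")
```

Under these assumptions the shotgun data correspond to roughly 120,000 reads, and the BAC-end sequences to a mean length of roughly 700 bp, which is in the range typical of Sanger reads.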

Source
DOI: http://dx.doi.org/10.1186/1471-2164-9-404
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2532694
August 2008

Genomic organization and characterization of two vomeronasal 1 receptor-like genes (ora1 and ora2) in Atlantic salmon Salmo salar.

Mar Genomics 2008 Mar 28;1(1):23-31. Epub 2008 Apr 28.

Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada.

Olfactory receptors are encoded by three large multigene superfamilies (OR, V1R and V2R) in mammals. Fish do not possess a vomeronasal system; therefore, it has been proposed that their V1R-like genes be classified as olfactory receptors related to class A G protein-coupled receptors (ora). Unlike mammalian genomes, which contain more than a hundred V1R genes, the five species of teleost fish that have been investigated to date appear to have six ora genes (ora1-6) except for pufferfish that have lost ora1. The common ancestor of salmonid fishes is purported to have undergone a whole genome duplication. As salmonids have a life history that requires the use of olfactory cues to navigate back to their natal habitats to spawn, we set out to determine if ora1 or ora2 is duplicated in a representative species, Atlantic salmon (Salmo salar). We used an oligonucleotide probe designed from a conserved sequence of several teleost ora2 genes to screen an Atlantic salmon BAC library (CHORI-214). Hybridization-positive BACs belonged to a single fingerprint contig of the Atlantic salmon physical map. All were also positive for ora2 by PCR. One of these BACs was chosen for further study, and shotgun sequencing of this BAC identified two V1R-like genes, ora1 and ora2, that are in a head-to-head conformation as is seen in some other teleosts. The gene products, ora1 and ora2, are highly conserved among teleosts. We only found evidence for a single ora1-2 locus in the Atlantic salmon genome, which was mapped to linkage group 6. Fluorescent in situ hybridization (FISH) analysis placed ora1-2 on chromosome 12. Conserved synteny was found surrounding the ora1 and ora2 genes in Atlantic salmon, medaka and three-spined stickleback, but not zebrafish.

Source
DOI: http://dx.doi.org/10.1016/j.margen.2008.04.003
March 2008