Publications by authors named "Harris A Lewin"

93 Publications

Towards complete and error-free genome assemblies of all vertebrate species.

Nature 2021 Apr 28;592(7856):737-746. Epub 2021 Apr 28.

UQ Genomics, University of Queensland, Brisbane, Queensland, Australia.

High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species. To address this issue, the international Genome 10K (G10K) consortium has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

Source
DOI: http://dx.doi.org/10.1038/s41586-021-03451-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081667
April 2021

Platypus and echidna genomes reveal mammalian biology and evolution.

Nature 2021 Apr 6;592(7856):756-762. Epub 2021 Jan 6.

Tree of Life Programme, Wellcome Sanger Institute, Cambridge, UK.

Egg-laying mammals (monotremes) are the only extant mammalian outgroup to therians (marsupial and eutherian animals) and provide key insights into mammalian evolution. Here we generate and analyse reference genomes of the platypus (Ornithorhynchus anatinus) and echidna (Tachyglossus aculeatus), which represent the only two extant monotreme lineages. The nearly complete platypus genome assembly has anchored almost the entire genome onto chromosomes, markedly improving the genome continuity and gene annotation. Together with our echidna sequence, the genomes of the two species allow us to detect the ancestral and lineage-specific genomic changes that shape both monotreme and mammalian evolution. We provide evidence that the monotreme sex chromosome complex originated from an ancestral chromosome ring configuration. The formation of such a unique chromosome complex may have been facilitated by the unusually extensive interactions between the multi-X and multi-Y chromosomes that are shared by the autosomal homologues in humans. Further comparative genomic analyses unravel marked differences between monotremes and therians in haptoglobin genes, lactation genes and chemosensory receptor genes for smell and taste that underlie the ecological adaptation of monotremes.

Source
DOI: http://dx.doi.org/10.1038/s41586-020-03039-0
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8081666
April 2021

Vertebrate Chromosome Evolution.

Annu Rev Anim Biosci 2021 02 13;9:1-27. Epub 2020 Nov 13.

The Genome Center, University of California, Davis, California 95616, USA.

The study of chromosome evolution is undergoing a resurgence of interest owing to advances in DNA sequencing technology that facilitate the production of chromosome-scale whole-genome assemblies de novo. This review focuses on the history, methods, discoveries, and current challenges facing the field, with an emphasis on vertebrate genomes. A detailed examination of the literature on the biology of chromosome rearrangements is presented, specifically the relationship between chromosome rearrangements and phenotypic evolution, adaptation, and speciation. A critical review of the methods for identifying, characterizing, and visualizing chromosome rearrangements and computationally reconstructing ancestral karyotypes is presented. We conclude by looking to the future, identifying the enormous technical and scientific challenges presented by the accumulation of hundreds and eventually thousands of chromosome-scale assemblies.

Source
DOI: http://dx.doi.org/10.1146/annurev-animal-020518-114924
February 2021

Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates.

Proc Natl Acad Sci U S A 2020 09 21;117(36):22311-22322. Epub 2020 Aug 21.

The Genome Center, University of California, Davis, CA 95616.

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of COVID-19. The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of ACE2 sequences from 410 vertebrate species, including 252 mammals, to study the conservation of ACE2 and its potential to be used as a receptor by SARS-CoV-2. We designed a five-category binding score based on the conservation properties of 25 amino acids important for the binding between ACE2 and the SARS-CoV-2 spike protein. Only mammals fell into the medium to very high categories and only catarrhine primates into the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 spike protein binding and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (frequency <0.001) variants in 10/25 binding sites. In addition, we found significant signals of selection and accelerated evolution in the ACE2 coding sequence across all mammals, and specific to the bat lineage. Our results, if confirmed by additional experimental data, may lead to the identification of intermediate host species for SARS-CoV-2, guide the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care.

Source
DOI: http://dx.doi.org/10.1073/pnas.2010146117
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7486773
September 2020

Broad Host Range of SARS-CoV-2 Predicted by Comparative and Structural Analysis of ACE2 in Vertebrates.

bioRxiv 2020 Apr 18. Epub 2020 Apr 18.

The Genome Center, University of California Davis, Davis, CA 95616, USA.

The novel coronavirus SARS-CoV-2 is the cause of Coronavirus Disease-2019 (COVID-19). The main receptor of SARS-CoV-2, angiotensin I converting enzyme 2 (ACE2), is now undergoing extensive scrutiny to understand the routes of transmission and sensitivity in different species. Here, we utilized a unique dataset of 410 vertebrates, including 252 mammals, to study cross-species conservation of ACE2 and its likelihood to function as a SARS-CoV-2 receptor. We designed a five-category ranking score based on the conservation properties of 25 amino acids important for the binding between receptor and virus, classifying all species from very high to very low. Only mammals fell into the medium to very high categories, and only catarrhine primates in the very high category, suggesting that they are at high risk for SARS-CoV-2 infection. We employed a protein structural analysis to qualitatively assess whether amino acid changes at variable residues would be likely to disrupt ACE2/SARS-CoV-2 binding, and found the number of predicted unfavorable changes significantly correlated with the binding score. Extending this analysis to human population data, we found only rare (<0.1%) variants in 10/25 binding sites. In addition, we observed evidence of positive selection in ACE2 in multiple species, including bats. Utilized appropriately, our results may lead to the identification of intermediate host species for SARS-CoV-2, justify the selection of animal models of COVID-19, and assist the conservation of animals both in native habitats and in human care.

Source
DOI: http://dx.doi.org/10.1101/2020.04.16.045302
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7263403
April 2020

Introduction.

Annu Rev Anim Biosci 2020 02;8

Co-Editors.

Source
DOI: http://dx.doi.org/10.1146/annurev-av-08-112619-100001
February 2020

Precision nomenclature for the new genomics.

Gigascience 2019 08;8(8)

Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg 194044, Russia.

The confluence of two scientific disciplines may lead to nomenclature conflicts that require new terms while respecting historical definitions. This is the situation with the current state of cytology and genomics, which offer examples of distinct nomenclature and vocabularies that require reconciliation. In this article, we propose the new terms C-scaffold (for chromosome-scale assemblies of sequenced DNA fragments, commonly named scaffolds) and scaffotype (the resulting collection of C-scaffolds that represent an organism's genome). This nomenclature avoids conflict with the historical definitions of the terms chromosome (a microscopic body made of DNA and protein) and karyotype (the collection of images of all chromosomes of an organism or species). As large-scale sequencing projects progress, adoption of this nomenclature will assist end users to properly classify genome assemblies, thus facilitating genomic analysis.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giz086
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6705538
August 2019

An integrated chromosome-scale genome assembly of the Masai giraffe (Giraffa camelopardalis tippelskirchi).

Gigascience 2019 08;8(8)

The Genome Center, University of California, Davis, CA 95616, USA.

Background: The Masai giraffe (Giraffa camelopardalis tippelskirchi) is the largest-bodied giraffe and the world's tallest terrestrial animal. With its extreme size and height, the giraffe's unique anatomical and physiological adaptations have long been of interest to diverse research fields. Giraffes are also critical to ecosystems of sub-Saharan Africa, with their long neck serving as a conduit to food sources not shared by other herbivores. Although the genome of a Masai giraffe has been sequenced, the assembly was highly fragmented and suboptimal for genome analysis. Herein we report an improved giraffe genome assembly to facilitate evolutionary analysis of the giraffe and other ruminant genomes.

Findings: Using SOAPdenovo2 and 170 Gbp of Illumina paired-end and mate-pair reads, we generated a 2.6-Gbp male Masai giraffe genome assembly, with a scaffold N50 of 3 Mbp. The incorporation of 114.6 Gbp of Chicago library sequencing data resulted in a HiRise SOAPdenovo + Chicago assembly with an N50 of 48 Mbp and containing 95% of expected genes according to BUSCO analysis. Using the Reference-Assisted Chromosome Assembly tool, we were able to order and orient scaffolds into 42 predicted chromosome fragments (PCFs). Using fluorescence in situ hybridization, we placed 153 cattle bacterial artificial chromosomes onto giraffe metaphase spreads to assess and assign the PCFs on 14 giraffe autosomes and the X chromosome resulting in the final assembly with an N50 of 177.94 Mbp. In this assembly, 21,621 protein-coding genes were identified using both de novo and homology-based predictions.

Conclusions: We have produced the first chromosome-scale genome assembly for a Giraffidae species. This assembly provides a valuable resource for the study of artiodactyl evolution and for understanding the molecular basis of the unique adaptive traits of giraffes. In addition, the assembly will provide a powerful resource to assist conservation efforts of Masai giraffe, whose population size has declined by 52% in recent years.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giz090
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6669057
August 2019

Large-scale ruminant genome sequencing provides insights into their evolution and distinct traits.

Science 2019 06;364(6446)

Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark.

The ruminants are one of the most successful mammalian lineages, exhibiting morphological and habitat diversity and containing several key livestock species. To better understand their evolution, we generated and analyzed de novo assembled genomes of 44 ruminant species, representing all six Ruminantia families. We used these genomes to create a time-calibrated phylogeny to resolve topological controversies, overcoming the challenges of incomplete lineage sorting. Population dynamic analyses show that population declines commenced between 100,000 and 50,000 years ago, which is concomitant with expansion in human populations. We also reveal genes and regulatory elements that possibly contribute to the evolution of the digestive system, cranial appendages, immune system, metabolism, body size, cursorial locomotion, and dentition of the ruminants.

Source
DOI: http://dx.doi.org/10.1126/science.aav6202
June 2019

Fine-tuned adaptation of embryo-endometrium pairs at implantation revealed by transcriptome analyses in Bos taurus.

PLoS Biol 2019 04 12;17(4):e3000046. Epub 2019 Apr 12.

UMR BDR, INRA, ENVA, Université Paris Saclay, Jouy-en-Josas, France.

Interactions between embryo and endometrium at implantation are critical for the progression of pregnancy. These reciprocal actions involve exchange of paracrine signals that govern implantation and placentation. However, it remains unknown how these interactions between the conceptus and the endometrium are coordinated at the level of an individual pregnancy. Under the hypothesis that gene expression in endometrium is dependent on gene expression of extraembryonic tissues and genes expressed in extraembryonic tissues are dependent on genes expressed in the endometrium, we performed an integrative analysis of transcriptome profiles of paired extraembryonic tissue and endometria obtained from cattle (Bos taurus) pregnancies initiated by artificial insemination. We quantified strong dependence (|r| > 0.95, empirical false discovery rate [eFDR] < 0.01) in transcript abundance of genes expressed in the extraembryonic tissues and genes expressed in the endometrium. The profiles of connectivity revealed distinct coexpression patterns of extraembryonic tissues with caruncular and intercaruncular areas of the endometrium. Notably, a subset of highly coexpressed genes between extraembryonic tissue (n = 229) and caruncular areas of the endometrium (n = 218, r > 0.9999, eFDR < 0.001) revealed a blueprint of gene expression specific to each pregnancy. Gene ontology analyses of genes coexpressed between extraembryonic tissue and endometrium revealed significantly enriched modules with critical contribution for implantation and placentation, including "in utero embryonic development," "placenta development," and "regulation of transcription." Coexpressing modules were remarkably specific to caruncular or intercaruncular areas of the endometrium. The quantitative association between genes expressed in extraembryonic tissue and endometrium emphasizes a coordinated communication between these two entities in mammals. We provide evidence that implantation in mammalian pregnancy relies on the ability of the extraembryonic tissue and the endometrium to develop a fine-tuned adaptive response characteristic of each pregnancy.

Source
DOI: http://dx.doi.org/10.1371/journal.pbio.3000046
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6481875
April 2019

Evolution of gene regulation in ruminants differs between evolutionary breakpoint regions and homologous synteny blocks.

Genome Res 2019 04 13;29(4):576-589. Epub 2019 Feb 13.

Royal Veterinary College, University of London, London NW1 0TU, United Kingdom.

The role of chromosome rearrangements in driving evolution has been a long-standing question of evolutionary biology. Here we focused on ruminants as a model to assess how rearrangements may have contributed to the evolution of gene regulation. Using reconstructed ancestral karyotypes of Cetartiodactyls, Ruminants, Pecorans, and Bovids, we traced patterns of gross chromosome changes. We found that the lineage leading to the ruminant ancestor after the split from other cetartiodactyls was characterized by mostly intrachromosomal changes, whereas the lineage leading to the pecoran ancestor (including all livestock ruminants) included multiple interchromosomal changes. We observed that the liver cell putative enhancers in the ruminant evolutionary breakpoint regions are highly enriched for DNA sequences under selective constraint acting on lineage-specific transposable elements (TEs) and a set of 25 specific transcription factor (TF) binding motifs associated with recently active TEs. Coupled with gene expression data, we found that genes near ruminant breakpoint regions exhibit more divergent expression profiles among species, particularly in cattle, which is consistent with the phylogenetic origin of these breakpoint regions. This divergence was significantly greater in genes with enhancers that contain at least one of the 25 specific TF binding motifs and located near bovidae-to-cattle lineage breakpoint regions. Taken together, by combining ancestral karyotype reconstructions with analysis of regulatory element and gene expression evolution, our work demonstrated that lineage-specific regulatory elements colocalized with gross chromosome rearrangements may have provided valuable functional modifications that helped to shape ruminant evolution.

Source
DOI: http://dx.doi.org/10.1101/gr.239863.118
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6442394
April 2019

A near-chromosome-scale genome assembly of the gemsbok (Oryx gazella): an iconic antelope of the Kalahari desert.

Gigascience 2019 02 1;8(2). Epub 2019 Feb 1.

The UC Davis Genome Center, Department of Evolution and Ecology, College of Biological Sciences, and the Department of Reproduction and Population Health, School of Veterinary Medicine, University of California, Davis, USA.

Background: The gemsbok (Oryx gazella) is one of the largest antelopes in Africa. Gemsbok are heterothermic and thus highly adapted to live in the desert, changing their feeding behavior when faced with extreme drought and heat. A high-quality genome sequence of this species will assist efforts to elucidate these and other important traits of gemsbok and facilitate research on conservation efforts.

Findings: Using 180 Gbp of Illumina paired-end and mate-pair reads, a 2.9 Gbp assembly with scaffold N50 of 1.48 Mbp was generated using SOAPdenovo. Scaffolds were extended using Chicago library sequencing, which yielded an additional 114.7 Gbp of DNA sequence. The HiRise assembly using SOAPdenovo + Chicago library sequencing produced a scaffold N50 of 47 Mbp and a final genome size of 2.9 Gbp, representing 90.6% of the estimated genome size and including 93.2% of expected genes according to Benchmarking Universal Single-Copy Orthologs analysis. The Reference-Assisted Chromosome Assembly tool was used to generate a final set of 47 predicted chromosome fragments with N50 of 86.25 Mbp and containing 93.8% of expected genes. A total of 23,125 protein-coding genes and 1.14 Gbp of repetitive sequences were annotated using de novo and homology-based predictions.

Conclusions: Our results provide the first high-quality, chromosome-scale genome sequence assembly for gemsbok, which will be a valuable resource for studying adaptive evolution of this species and other ruminants.

Source
DOI: http://dx.doi.org/10.1093/gigascience/giy162
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6351727
February 2019

Chromosome Segregation Is Biased by Kinetochore Size.

Curr Biol 2018 05 26;28(9):1344-1356.e5. Epub 2018 Apr 26.

Chromosome Instability & Dynamics Laboratory, Instituto de Biologia Molecular e Celular, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal; Instituto de Investigação e Inovação em Saúde (i3S), Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal; Cell Division Group, Experimental Biology Unit, Department of Biomedicine, Faculdade de Medicina, Universidade do Porto, Alameda Professor Hernâni Monteiro, 4200-319 Porto, Portugal.

Chromosome missegregation during mitosis or meiosis is a hallmark of cancer and the main cause of prenatal death in humans. The gain or loss of specific chromosomes is thought to be random, with cell viability being essentially determined by selection. Several established pathways including centrosome amplification, sister-chromatid cohesion defects, or a compromised spindle assembly checkpoint can lead to chromosome missegregation. However, how specific intrinsic features of the kinetochore-the critical chromosomal interface with spindle microtubules-impact chromosome segregation remains poorly understood. Here we used the unique cytological attributes of female Indian muntjac, the mammal with the lowest known chromosome number (2n = 6), to characterize and track individual chromosomes with distinct kinetochore size throughout mitosis. We show that centromere and kinetochore functional layers scale proportionally with centromere size. Measurement of intra-kinetochore distances, serial-section electron microscopy, and RNAi against key kinetochore proteins confirmed a standard structural and functional organization of the Indian muntjac kinetochores and revealed that microtubule binding capacity scales with kinetochore size. Surprisingly, we found that chromosome segregation in this species is not random. Chromosomes with larger kinetochores bi-oriented more efficiently and showed a 2-fold bias to congress to the equator in a motor-independent manner. Despite robust correction mechanisms during unperturbed mitosis, chromosomes with larger kinetochores were also strongly biased to establish erroneous merotelic attachments and missegregate during anaphase. This bias was impervious to the experimental attenuation of polar ejection forces on chromosome arms by RNAi against the chromokinesin Kif4a. Thus, kinetochore size is an important determinant of chromosome segregation fidelity.

Source
DOI: http://dx.doi.org/10.1016/j.cub.2018.03.023
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5954971
May 2018

Earth BioGenome Project: Sequencing life for the future of life.

Proc Natl Acad Sci U S A 2018 04;115(17):4325-4333

China National Genebank, BGI-Shenzhen, 518083 Shenzhen, Guangdong, China.

Increasing our understanding of Earth's biodiversity and responsibly stewarding its resources are among the most crucial scientific and social challenges of the new millennium. These challenges require fundamental new knowledge of the organization, evolution, functions, and interactions among millions of the planet's organisms. Herein, we present a perspective on the Earth BioGenome Project (EBP), a moonshot for biology that aims to sequence, catalog, and characterize the genomes of all of Earth's eukaryotic biodiversity over a period of 10 years. The outcomes of the EBP will inform a broad range of major issues facing humanity, such as the impact of climate change on biodiversity, the conservation of endangered species and ecosystems, and the preservation and enhancement of ecosystem services. We describe hurdles that the project faces, including data-sharing policies that ensure a permanent, freely available resource for future scientific discovery while respecting access and benefit sharing guidelines of the Nagoya Protocol. We also describe scientific and organizational challenges in executing such an ambitious project, and the structure proposed to achieve the project's goals. The far-reaching potential benefits of creating an open digital repository of genomic information for life on Earth can be realized only by a coordinated international effort.

Source
DOI: http://dx.doi.org/10.1073/pnas.1720115115
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5924910
April 2018

Reconstruction and evolutionary history of eutherian chromosomes.

Proc Natl Acad Sci U S A 2017 07 19;114(27):E5379-E5388. Epub 2017 Jun 19.

Department of Evolution and Ecology, University of California, Davis, CA 95616.

Whole-genome assemblies of 19 placental mammals and two outgroup species were used to reconstruct the order and orientation of syntenic fragments in chromosomes of the eutherian ancestor and six other descendant ancestors leading to human. For ancestral chromosome reconstructions, we developed an algorithm (DESCHRAMBLER) that probabilistically determines the adjacencies of syntenic fragments using chromosome-scale and fragmented genome assemblies. The reconstructed chromosomes of the eutherian, boreoeutherian, and euarchontoglires ancestor each included >80% of the entire length of the human genome, whereas reconstructed chromosomes of the most recent common ancestor of simians, catarrhini, great apes, and humans and chimpanzees included >90% of human genome sequence. These high-coverage reconstructions permitted reliable identification of chromosomal rearrangements over ∼105 My of eutherian evolution. Orangutan was found to have eight chromosomes that were completely conserved in homologous sequence order and orientation with the eutherian ancestor, the largest number for any species. Ruminant artiodactyls had the highest frequency of intrachromosomal rearrangements, and interchromosomal rearrangements dominated in murid rodents. A total of 162 chromosomal breakpoints in evolution of the eutherian ancestral genome to the human genome were identified; however, the rate of rearrangements was significantly lower (0.80/My) during the first ∼60 My of eutherian evolution, then increased to greater than 2.0/My along the five primate lineages studied. Our results significantly expand knowledge of eutherian genome evolution and will facilitate greater understanding of the role of chromosome rearrangements in adaptation, speciation, and the etiology of inherited and spontaneously occurring diseases.

Source
DOI: http://dx.doi.org/10.1073/pnas.1702012114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5502614
July 2017

Systematic Profiling of Short Tandem Repeats in the Cattle Genome.

Genome Biol Evol 2017 01;9(1):20-31

Animal Genomics and Improvement Laboratory, Agricultural Research Service, Beltsville, MD.

Short tandem repeats (STRs), or microsatellites, are genetic variants with repetitive 2–6 base pair motifs in many mammalian genomes. Using high-throughput sequencing and experimental validations, we systematically profiled STRs in five Holsteins. We identified a total of 60,106 microsatellites and generated the first high-resolution STR map, representing a substantial pool of polymorphism in dairy cattle. We observed significant overlap of STRs with functional genes and quantitative trait loci (QTL). We performed evolutionary and population genetic analyses using over 20,000 common dinucleotide STRs. Besides corroborating the well-established positive correlation between allele size and variance in allele size, these analyses also identified dozens of outlier STRs based on two anomalous relationships that counter expected characteristics of neutral evolution. In addition, one STR locus overlaps with a significant region of a summary statistic designed to detect STR-related selection. Our results also showed that only 57.1% of STRs were located within SNP-based linkage disequilibrium (LD) blocks, whereas the other 42.9% were outside the blocks. Therefore, a substantial number of STRs are not tagged by SNPs in the cattle genome, likely due to the distinct mutation mechanism and elevated polymorphism of STRs. This study provides the foundation for future STR-based studies of cattle genome evolution and selection.

Source
DOI: http://dx.doi.org/10.1093/gbe/evw256
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5381564
January 2017

Massive dysregulation of genes involved in cell signaling and placental development in cloned cattle conceptus and maternal endometrium.

Proc Natl Acad Sci U S A 2016 12 8;113(51):14492-14501. Epub 2016 Dec 8.

Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61820.

A major unresolved issue in the cloning of mammals by somatic cell nuclear transfer (SCNT) is the mechanism by which the process fails after embryos are transferred to the uterus of recipients before or during the implantation window. We investigated this problem by using RNA sequencing (RNA-seq) to compare the transcriptomes in cattle conceptuses produced by SCNT and artificial insemination (AI) at day (d) 18 (preimplantation) and d 34 (postimplantation) of gestation. In addition, endometrium was profiled to identify the communication pathways that might be affected by the presence of a cloned conceptus, ultimately leading to mortality before or during the implantation window. At d 18, the effects on the transcriptome associated with SCNT were massive, involving more than 5,000 differentially expressed genes (DEGs). Among them are 121 genes that have embryonic lethal phenotypes in mice, cause defects in trophoblast and placental development, and/or affect conceptus survival in mice. In endometria at d 18, <0.4% of expressed genes were affected by the presence of a cloned conceptus, whereas at d 34, ∼36% and <0.7% of genes were differentially expressed in intercaruncular and caruncular tissues, respectively. Functional analysis of DEGs in placental and endometrial tissues suggests a major disruption of signaling between the cloned conceptus and the endometrium, particularly the intercaruncular tissue. Our results support a "bottleneck" model for cloned conceptus survival during the periimplantation period determined by gene expression levels in extraembryonic tissues and the endometrial response to altered signaling from clones.

Source
DOI: http://dx.doi.org/10.1073/pnas.1520945114
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5187692
December 2016

Systems Analysis of Early Host Gene Expression Provides Clues for Transient Mycobacterium avium ssp avium vs. Persistent Mycobacterium avium ssp paratuberculosis Intestinal Infections.

PLoS One 2016;11(9):e0161946. Epub 2016 Sep 21.

Department of Veterinary Pathobiology, College of Veterinary Medicine & Biomedical Sciences, Texas A&M University, College Station, Texas, 77843, United States of America.

It has long been a quest in ruminants to understand how two very similar mycobacterial species, Mycobacterium avium ssp. paratuberculosis (MAP) and Mycobacterium avium ssp. avium (MAA), lead to either a chronic persistent infection or a rapid-transient infection, respectively. Here, we hypothesized that when the host immune response is activated by MAP or MAA, the outcome of the infection depends on the early activation of signaling molecules and host temporal gene expression. To test our hypothesis, ligated jejuno-ileal loops including Peyer's patches in neonatal calves were inoculated with PBS, MAP, or MAA. A temporal analysis of the host transcriptome profile was conducted at several times post-infection (0.5, 1, 2, 4, 8 and 12 hours). When comparing the transcriptional responses of calves infected with MAA versus MAP, discordant patterns of mucosal expression were clearly evident, and the number of unique transcripts altered was moderately lower in MAA-infected tissue than in MAP-infected mucosal tissue. To interpret these complex data, changes in gene expression were further analyzed by dynamic Bayesian analysis. Bayesian network modeling identified mechanistic genes, gene-to-gene relationships, pathways and Gene Ontology (GO) biological processes that are involved in specific cell activation during infection. MAP and MAA had significantly different pathway perturbations at 0.5 and 12 hours post inoculation. Inverse processes were observed between MAP and MAA responses for epithelial cell proliferation, negative regulation of chemotaxis, cell-cell adhesion mediated by integrin and regulation of cytokine-mediated signaling. MAP-inoculated tissue had significantly lower expression of phagocytosis receptors such as mannose receptor and complement receptors.
This study reveals that perturbation of genes and cellular pathways during MAP infection resulted in host evasion through weakening of the mucosal membrane barrier to gain entry in the ileum, inhibition of calcium signaling associated with decreased phagosome-lysosome fusion as well as phagocytosis inhibition, and a bias toward a Th2 cell immune response accompanied by cell recruitment, cell proliferation and cell differentiation, leading to persistent infection. In contrast, MAA infection was related to cellular responses associated with activation of molecular pathways that release chemicals and cytokines involved with containment of infection and a strong bias toward a Th1 immune response, resulting in a transient infection.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5031438
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0161946
September 2016

Identification of a nonsense mutation in APAF1 that is likely causal for a decrease in reproductive efficiency in Holstein dairy cattle.

J Dairy Sci 2016 Aug 8;99(8):6693-6701. Epub 2016 Jun 8.

Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana 61801; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana 61801; Department of Evolution and Ecology and the UC Davis Genome Center, University of California, Davis, Davis 95616.

The HH1 haplotype on chromosome 5 is associated with a reduced conception rate and a deficit of homozygotes at the population level in Holstein cattle. The source HH1 haplotype was traced to the bull Pawnee Farm Arlinda Chief (Chief), who was born in 1962 and has sired more than 16,000 daughters. We identified a nonsense mutation in APAF1 (apoptotic protease activating factor 1; APAF1 p.Q579X) within HH1 using whole-genome resequencing of Chief and 3 of his sons. This mutation is predicted to truncate 670 AA (53.7%) of the encoded APAF1 protein that contains a WD40 domain critical to protein-protein interactions. Initial screening revealed no homozygous individuals for the mutation in 758 animals previously genotyped, whereas all 497 HH1 carriers possessed 1 copy of the mutant allele. Subsequent commercial genotyping of 246,773 Holsteins revealed 5,299 APAF1 heterozygotes and zero homozygotes for the mutation. The causative role of this mutation is also supported by functional data in mice that have demonstrated Apaf1 to be an essential molecule in the cytochrome-c-mediated apoptotic cascade and directly implicated in developmental and neurodegenerative disorders. In addition, most Apaf1 homozygous knockouts die by day 16.5 of development. We thus propose that the APAF1 p.Q579X nonsense mutation is the functional equivalent of the Apaf1 knockout. This mutation has caused an estimated 525,000 spontaneous abortions worldwide over the past 35 years, accounting for approximately $420 million in losses. With the mutation identified, selection against the deleterious allele in breeding schemes has aided in eliminating this defect from the population, reducing carrier frequency from 8% in past decades to 2% in 2015.
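The homozygote deficit reported above can be illustrated with a back-of-the-envelope calculation. The sketch below is not part of the original study: it assumes Hardy-Weinberg proportions in the genotyped animals (a simplification, since commercially genotyped Holsteins are not a randomly mating sample) and uses a Poisson approximation for the chance of seeing no homozygotes. All variable names are illustrative.

```python
import math

# Counts reported in the abstract (commercial genotyping of Holsteins).
n_genotyped = 246_773
n_heterozygotes = 5_299
n_homozygotes_observed = 0

# Each heterozygote carries one mutant copy among the 2N alleles sampled.
q = n_heterozygotes / (2 * n_genotyped)

# Expected homozygote count under Hardy-Weinberg proportions (q^2 * N).
expected_homozygotes = q ** 2 * n_genotyped

# Poisson approximation: chance of seeing zero homozygotes if they were viable.
p_zero = math.exp(-expected_homozygotes)

print(f"mutant allele frequency q = {q:.4f}")
print(f"expected homozygotes      = {expected_homozygotes:.1f}")
print(f"P(observe 0 | viable)     = {p_zero:.1e}")
```

Under these assumptions roughly 28 homozygotes would be expected, so observing none is very unlikely by chance (probability on the order of 10^-13), which is consistent with the abstract's conclusion that homozygosity for the mutation is lethal.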

Source
DOI: http://dx.doi.org/10.3168/jds.2015-10517
August 2016

Diversity and population-genetic properties of copy number variations and multicopy genes in cattle.

DNA Res 2016 Jun 15;23(3):253-62. Epub 2016 Apr 15.

USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA

The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage, to identify 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, representing a significant increase over previous estimates of the area of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the taurine and indicine cattle breeds and uncovered potential diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance play roles in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future.

Source
DOI: http://dx.doi.org/10.1093/dnares/dsw013
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4909312
June 2016

Genome-wide adaptive complexes to underground stresses in blind mole rats Spalax.

Nat Commun 2014 Jun 3;5:3966. Epub 2014 Jun 3.

Institute of Evolution, University of Haifa, Mount Carmel, Haifa 31905, Israel.

The blind mole rat (BMR), Spalax galili, is an excellent model for studying mammalian adaptation to life underground and medical applications. The BMR spends its entire life underground, which protects it from predators and climatic fluctuations while challenging it with multiple stressors such as darkness, hypoxia, hypercapnia, energetics and high pathogenicity. Here we sequence and analyse the BMR genome and transcriptome, highlighting the possible genomic adaptive responses to the underground stressors. Our results show high rates of RNA/DNA editing, reduced chromosome rearrangements, an over-representation of short interspersed elements (SINEs) probably linked to hypoxia tolerance, degeneration of vision and progression of photoperiodic perception, tolerance to hypercapnia and hypoxia and resistance to cancer. The remarkable traits of the BMR, together with its genomic and transcriptomic information, enhance our understanding of adaptation to extreme environments and will enable the utilization of BMR models for biomedical research in the fight against cancer, stroke and cardiovascular diseases.

Source
DOI: http://dx.doi.org/10.1038/ncomms4966
June 2014

Memories of Carl from an improbable friend.

Authors:
Harris A Lewin

RNA Biol 2014 14;11(3):273-8. Epub 2014 Apr 14.

University of California at Davis; Department of Evolution and Ecology and the UC Davis Genome Center; Davis, CA USA.

Source
DOI: http://dx.doi.org/10.4161/rna.28866
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008559
March 2015

Postpartal subclinical endometritis alters transcriptome profiles in liver and adipose tissue of dairy cows.

Bioinform Biol Insights 2014 19;8:45-63. Epub 2014 Feb 19.

Department of Animal Sciences, University of Illinois, Urbana, Illinois, USA; Division of Nutritional Sciences, University of Illinois, Urbana, Illinois, USA.

Transcriptome alterations in liver and adipose tissue of cows with subclinical endometritis (SCE) at 29 d postpartum were evaluated. Bioinformatics analysis was performed using the Dynamic Impact Approach by means of KEGG and DAVID databases. Milk production, blood metabolites (non-esterified fatty acids, magnesium), and disease biomarkers (albumin, aspartate aminotransferase) did not differ greatly between healthy and SCE cows. In liver tissue of cows with SCE, alterations in gene expression revealed an activation of complement and coagulation cascade, steroid hormone biosynthesis, apoptosis, inflammation, oxidative stress, MAPK signaling, and the formation of fibrinogen complex. Bioinformatics analysis also revealed an inhibition of vitamin B3 and B6 metabolism with SCE. In adipose, the most activated pathways by SCE were nicotinate and nicotinamide metabolism, long-chain fatty acid transport, oxidative phosphorylation, inflammation, T cell and B cell receptor signaling, and mTOR signaling. Results indicate that SCE in dairy cattle during early lactation induces molecular alterations in liver and adipose tissue indicative of immune activation and cellular stress.

Source
DOI: http://dx.doi.org/10.4137/BBI.S13735
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3934763
February 2014

Systems biology analysis of Brucella infected Peyer's patch reveals rapid invasion with modest transient perturbations of the host transcriptome.

PLoS One 2013 9;8(12):e81719. Epub 2013 Dec 9.

Department of Veterinary Pathobiology, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, United States of America.

Brucella melitensis causes the most severe and acute symptoms of all Brucella species in human beings and infects hosts primarily through the oral route. The epithelium covering domed villi of jejunal-ileal Peyer's patches is an important site of entry for several pathogens, including Brucella. Here, we use the calf ligated ileal loop model to study temporal in vivo Brucella-infected host molecular and morphological responses. Our results document Brucella bacteremia occurring within 30 min after intraluminal inoculation of the ileum without histopathologic traces of lesions. Based on a systems biology Dynamic Bayesian Network (DBN) modeling approach applied to microarray data, a very early transient perturbation of the host enteric transcriptome was associated with the initial host response to Brucella contact that is rapidly averted allowing invasion and dissemination. A detailed analysis revealed active expression of Syndecan 2, Integrin alpha L and Integrin beta 2 genes, which may favor initial Brucella adhesion. Also, two intestinal barrier-related pathways (Tight Junction and Trefoil Factors Initiated Mucosal Healing) were significantly repressed in the early stage of infection, suggesting subversion of mucosal epithelial barrier function to facilitate Brucella transepithelial migration. Simultaneously, the strong activation of the innate immune response pathways would suggest that the host mounts an appropriate protective immune response; however, the expression of the two key genes that encode the innate immunity anti-Brucella cytokines TNF-α and IL12p40 was not significantly changed throughout the study. Furthermore, the defective expression of Toll-Like Receptor Signaling pathways may partially explain the lack of proinflammatory cytokine production and consequently the absence of morphologically detectable inflammation at the site of infection.
Cumulatively, our results indicate that the in vivo pathogenesis of the early infectious process of Brucella is primarily accomplished by compromising the mucosal immune barrier and subverting critical immune response mechanisms.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0081719
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857238
September 2014

Changes in WNT signaling-related gene expression associated with development and cloning in bovine extra-embryonic and endometrial tissues during the peri-implantation period.

Mol Reprod Dev 2013 Dec 17;80(12):977-87. Epub 2013 Oct 17.

Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois.

We determined if somatic cell nuclear transfer (SCNT) cloning is associated with WNT-related gene expression in cattle development, and if the expression of genes in the WNT pathway changes during the peri-implantation period. Extra-embryonic and endometrial tissues were collected at gestation days 18 and 34 (d18, d34). WNT5A, FZD4, FZD5, LRP5, CTNNB1, GNAI2, KDM1A, BCL2L1, and SFRP1 transcripts were localized in extra-embryonic tissue, whereas SFRP1 and DKK1 were localized in the endometrium. There were no differences in the localization of these transcripts in extra-embryonic tissue or endometrium from SCNT or artificial insemination (AI) pregnancies. Expression levels of WNT5A were 11-fold greater in the allantois of SCNT than AI samples. In the trophoblast, expression of WNT5A, FZD5, CTNNB1, and DKK1 increased significantly from d18 to d34, whereas expression of KDM1A and SFRP1 decreased, indicating that implantation is associated with major changes in WNT signaling. SCNT was associated with altered WNT5A expression in trophoblasts, with levels increasing 2.3-fold more in AI than SCNT conceptuses from d18 to d34. In the allantois, expression of WNT5A increased 6.3-fold more in SCNT than AI conceptuses from d18 to d34. Endometrial tissue expression levels of the genes tested did not differ between AI and SCNT pregnancies, although expression of individual genes showed variation across developmental stages. Our results demonstrate that SCNT is associated with altered expression of specific WNT-related genes in extra-embryonic tissue in a time- and tissue-specific manner. The pattern of gene expression in the WNT pathway suggests that noncanonical WNT signal transduction is important for implantation of cattle conceptuses.

Source
DOI: http://dx.doi.org/10.1002/mrd.22257
December 2013

Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.

Bioinform Biol Insights 2013 4;7:253-70. Epub 2013 Aug 4.

Mammalian NutriPhysioGenomics, Department of Animal Sciences, University of Illinois, Urbana, Illinois, USA; Division of Nutritional Sciences, University of Illinois, Urbana, Illinois, USA.

Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase), most of the highly impacted pathways (e.g., ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences, with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promote proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.

Source
DOI: http://dx.doi.org/10.4137/BBI.S12328
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3738383
August 2013

Inferring demography from runs of homozygosity in whole-genome sequence, with correction for sequence errors.

Mol Biol Evol 2013 Sep 10;30(9):2209-23. Epub 2013 Jul 10.

Department of Agriculture and Food Systems, Melbourne School of Land and Environment, University of Melbourne, Victoria, Australia.

Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present. There was a strong upward bias in estimates of recent effective population size (Ne) if the correction method was not applied to the data, both for our method and the Li and Durbin (Inference of human population history from individual whole-genome sequences. Nature 475:493-496) pairwise sequentially Markovian coalescent method. To infer demography, we use an analytical predictor of multiloci linkage disequilibrium (LD) based on a simple coalescent model that allows for changes in Ne. The LD statistic summarizes the distribution of runs of homozygosity for any given demography. We infer a best fit demography as one that predicts a match with the observed distribution of runs of homozygosity in the corrected sequence data. We use multiloci LD because it potentially holds more information about ancestral demography than pairwise LD. The inferred demography indicates a strong reduction in the Ne around 170,000 years ago, possibly related to the divergence of African and European Bos taurus cattle. This is followed by a further reduction coinciding with the period of cattle domestication, with Ne of between 3,500 and 6,000. The most recent reduction of Ne to approximately 100 in the Holstein breed agrees well with estimates from pedigrees. Our approach can be applied to whole-genome sequence from any diploid species and can be scaled up to use sequence from multiple individuals.
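The "observed distribution of runs of homozygosity" mentioned above can be pictured with a toy example. The sketch below is only an illustration, not the authors' method: it assumes a single diploid individual whose heterozygous sites are given as 1-based positions, and it simply measures the homozygous stretches between them. The function name and inputs are hypothetical, and no error correction or demographic inference is attempted.

```python
from collections import Counter


def runs_of_homozygosity(het_positions, seq_length):
    """Return lengths of homozygous stretches between heterozygous sites.

    het_positions: 1-based positions of heterozygous sites (any order).
    seq_length: total number of sites in the sequence.
    """
    # Sentinels at 0 and seq_length + 1 capture the runs at both ends.
    boundaries = [0] + sorted(het_positions) + [seq_length + 1]
    lengths = []
    for left, right in zip(boundaries, boundaries[1:]):
        run = right - left - 1  # sites strictly between two boundaries
        if run > 0:
            lengths.append(run)
    return lengths


# Toy sequence of 30 sites with heterozygous calls at sites 5, 6 and 20.
runs = runs_of_homozygosity([5, 6, 20], seq_length=30)

# Observed distribution: how many runs of each length were seen.
distribution = Counter(runs)
print(runs)
print(distribution)
```

A spurious heterozygous call caused by a sequencing error would split one long run into two shorter ones in a tally like this, which is one way to see why the abstract stresses correcting for hidden errors before inferring demography.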

Source
DOI: http://dx.doi.org/10.1093/molbev/mst125
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3748359
September 2013

Draft genome sequence of the Tibetan antelope.

Nat Commun 2013 ;4:1858

Key Laboratory for High Altitude Medicine of Ministry of Chinese Education and Research Center for High Altitude Medicine, Qinghai University, Xining, Qinghai 810001, China.

The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared with other plain-dwelling mammals, the genome of the Tibetan antelope shows signals of adaptive evolution and gene-family expansion in genes associated with energy metabolism and oxygen transmission. Both the highland American pika and the Tibetan antelope have signals of positive selection for genes involved in DNA repair and the production of ATPase. Genes associated with hypoxia seem to have experienced convergent evolution. Thus, our study suggests that common genetic mechanisms might have been utilized to enable high-altitude adaptation.

Source
DOI: http://dx.doi.org/10.1038/ncomms2860
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3674232
December 2013

Reference-assisted chromosome assembly.

Proc Natl Acad Sci U S A 2013 Jan 10;110(5):1785-90. Epub 2013 Jan 10.

Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

One of the most difficult problems in modern genomics is the assembly of full-length chromosomes using next generation sequencing (NGS) data. To address this problem, we developed "reference-assisted chromosome assembly" (RACA), an algorithm to reliably order and orient sequence scaffolds generated by NGS and assemblers into longer chromosomal fragments using comparative genome information and paired-end reads. Evaluation of results using simulated and real genome assemblies indicates that our approach can substantially improve genomes generated by a wide variety of de novo assemblers if a good reference assembly of a closely related species and outgroup genomes are available. We used RACA to reconstruct 60 Tibetan antelope (Pantholops hodgsonii) chromosome fragments from 1,434 SOAPdenovo sequence scaffolds, of which 16 chromosome fragments were homologous to complete cattle chromosomes. Experimental validation by PCR showed that predictions made by RACA are highly accurate. Our results indicate that RACA will significantly facilitate the study of chromosome evolution and genome rearrangements for the large number of genomes being sequenced by NGS that do not have a genetic or physical map.

Source
DOI: http://dx.doi.org/10.1073/pnas.1220349110
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3562798
January 2013