Publications by authors named "Paolo Ribeca"

34 Publications

Transcriptomic Profiling of Dromedary Camels Immunised with a MERS Vaccine Candidate.

Vet Sci 2021 Aug 3;8(8). Epub 2021 Aug 3.

Vaccine Development Unit, Department of Infectious Disease Research, King Abdullah International Medical Research Center (KAIMRC), Riyadh 11481, Saudi Arabia.

Middle East Respiratory Syndrome coronavirus (MERS-CoV) infects dromedary camels and zoonotically infects humans, causing a respiratory disease with severe pneumonia and death. With no approved antiviral or vaccine interventions for MERS, vaccines are being developed for camels to prevent virus transmission into humans. We have previously developed a chimpanzee adenoviral vector-based vaccine for MERS-CoV (ChAdOx1 MERS) and reported its strong humoral immunogenicity in dromedary camels. Here, we looked back at total RNA isolated from whole blood of three immunised dromedaries pre and post-vaccination during the first day; and performed RNA sequencing and bioinformatic analysis in order to shed light on the molecular immune responses following a ChAdOx1 MERS vaccination. Our finding shows that a number of transcripts were differentially regulated as an effect of the vaccination, including genes that are involved in innate and adaptive immunity, such as type I and II interferon responses. The camel Bcl-3 and Bcl-6 transcripts were significantly upregulated, indicating a strong activation of Tfh cell, B cell, and NF-κB pathways. In conclusion, this study gives an overall view of the first changes in the immune transcriptome of dromedaries after vaccination; it supports the potency of ChAdOx1 MERS as a potential camel vaccine to block transmission and prevent new human cases and outbreaks.
Source
http://dx.doi.org/10.3390/vetsci8080156
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8402689
August 2021

Engineered Promoter-Switched Viruses Reveal the Role of Poxvirus Maturation Protein A26 as a Negative Regulator of Viral Spread.

J Virol 2021 09 9;95(19):e0101221. Epub 2021 Sep 9.

Department of Microbial Sciences, School of Biosciences and Medicine, University of Surrey, Guildford, United Kingdom.

Vaccinia virus produces two types of virions known as single-membraned intracellular mature virus (MV) and double-membraned extracellular enveloped virus (EV). EV production peaks earlier when initial MVs are further wrapped and secreted to spread infection within the host. However, late during infection, MVs accumulate intracellularly and become important for host-to-host transmission. The process that regulates this switch remains elusive and is thought to be influenced by host factors. Here, we examined the hypothesis that EV and MV production are regulated by the virus through expression of F13 and the MV-specific protein A26. By switching the promoters and altering the expression kinetics of F13 and A26, we demonstrate that A26 expression downregulates EV production and plaque size, thus limiting viral spread. This process correlates with A26 association with the MV surface protein A27 and exclusion of F13, thus reducing EV titers. Thus, MV maturation is controlled by the abundance of the viral A26 protein, independently of other factors, and is rate limiting for EV production. The A26 gene is conserved within vertebrate poxviruses but is strikingly lost in poxviruses known to be transmitted exclusively by biting arthropods. A26-mediated virus maturation thus has the appearance to be an ancient evolutionary adaptation to enhance transmission of poxviruses that has subsequently been lost from vector-adapted species, for which it may serve as a genetic signature. The existence of virus-regulated mechanisms to produce virions adapted to fulfill different functions represents a novel level of complexity in mammalian viruses with major impacts on evolution, adaptation, and transmission. 
Chordopoxviruses are mammalian viruses that uniquely produce a first type of virion adapted to spread within the host and a second type that enhances transmission between hosts, which can take place by multiple ways, including direct contact, respiratory droplets, oral/fecal routes, or via vectors. Both virion types are important to balance intrahost dissemination and interhost transmission, so virus maturation pathways must be tightly controlled. Here, we provide evidence that the abundance and kinetics of expression of the viral protein A26 regulates this process by preventing formation of the first form and shifting maturation toward the second form. A26 is expressed late after the initial wave of progeny virions is produced, so sufficient viral dissemination is ensured, and A26 provides virions with enhanced environmental stability. Conservation of A26 in all vertebrate poxviruses, but not in those transmitted exclusively via biting arthropods, reveals the importance of A26-controlled virus maturation for transmission routes involving environmental exposure.
Source
http://dx.doi.org/10.1128/JVI.01012-21
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8428399
September 2021

Mutagenesis Mapping of RNA Structures within the Foot-and-Mouth Disease Virus Genome Reveals Functional Elements Localized in the Polymerase (3D)-Encoding Region.

mSphere 2021 08 14;6(4):e0001521. Epub 2021 Jul 14.

Biomedical Sciences Research Complex (BSRC), School of Biology, University of St. Andrews, St. Andrews, United Kingdom.

RNA structures can form functional elements that play crucial roles in the replication of positive-sense RNA viruses. While RNA structures in the untranslated regions (UTRs) of several picornaviruses have been functionally characterized, the roles of putative RNA structures predicted for protein coding sequences (or open reading frames [ORFs]) remain largely undefined. Here, we have undertaken a bioinformatic analysis of the foot-and-mouth disease virus (FMDV) genome to predict 53 conserved RNA structures within the ORF. Forty-six of these structures were located in the regions encoding the nonstructural proteins (nsps). To investigate whether structures located in the regions encoding the nsps are required for FMDV replication, we used a mutagenesis method, CDLR mapping, where sequential coding segments were shuffled to minimize RNA secondary structures while preserving protein coding, native dinucleotide frequencies, and codon usage. To examine the impact of these changes on replicative fitness, mutated sequences were inserted into an FMDV subgenomic replicon. We found that three of the RNA structures, all at the 3' termini of the FMDV ORF, were critical for replicon replication. In contrast, disruption of the other 43 conserved RNA structures that lie within the regions encoding the nsps had no effect on replicon replication, suggesting that these structures are not required for initiating translation or replication of viral RNA. Conserved RNA structures that are not essential for virus replication could provide ideal targets for the rational attenuation of a wide range of FMDV strains. Some RNA structures formed by the genomes of RNA viruses are critical for viral replication. Our study shows that of 46 conserved RNA structures located within the regions of the foot-and-mouth disease virus (FMDV) genome that encode the nonstructural proteins, only three are essential for replication of an FMDV subgenomic replicon. 
Replicon replication is dependent on RNA translation and synthesis; thus, our results suggest that the three RNA structures are critical for either initiation of viral RNA translation and/or viral RNA synthesis. Although further studies are required to identify whether the remaining 43 RNA structures have other roles in virus replication, they may provide targets for the rational large-scale attenuation of a wide range of FMDV strains. FMDV causes a highly contagious disease, posing a constant threat to global livestock industries. Such weakened FMDV strains could be investigated as live-attenuated vaccines or could enhance biosecurity of conventional inactivated vaccine production.
Source
http://dx.doi.org/10.1128/mSphere.00015-21
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8386395
August 2021

Whole genome de novo sequencing and comparative genomic analyses suggests that Chlamydia psittaci strain 84/2334 should be reclassified as Chlamydia abortus species.

BMC Genomics 2021 Mar 6;22(1):159. Epub 2021 Mar 6.

Department of Animal Science and Aquatic Ecology, Faculty of Bioscience Engineering, University of Ghent, Ghent, Belgium.

Background: Chlamydia abortus and Chlamydia psittaci are important pathogens of livestock and avian species, respectively. While C. abortus is recognized as descended from C. psittaci species, there is emerging evidence of strains that are intermediary between the two species, suggesting they are recent evolutionary ancestors of C. abortus. Such strains include C. psittaci strain 84/2334 that was isolated from a parrot. Our aim was to classify this strain by sequencing its genome and explore its evolutionary relationship to both C. abortus and C. psittaci.

Results: In this study, methods based on multi-locus sequence typing (MLST) of seven housekeeping genes and on typing of five species discriminant proteins showed that strain 84/2334 clustered with C. abortus species. Furthermore, whole genome de novo sequencing of the strain revealed greater similarity to C. abortus in terms of GC content, while 16S rRNA and whole genome phylogenetic analysis, as well as network and recombination analysis showed that the strain clusters more closely with C. abortus strains. The analysis also suggested a closer evolutionary relationship between this strain and the major C. abortus clade, than to two other intermediary avian C. abortus strains or C. psittaci strains. Molecular analyses of genes (polymorphic membrane protein and transmembrane head protein genes) and loci (plasticity zone), found in key virulence-associated regions that exhibit greatest diversity within and between chlamydial species, reveal greater diversity than present in sequenced C. abortus genomes as well as similar features to both C. abortus and C. psittaci species. The strain also possesses an extrachromosomal plasmid, as found in most C. psittaci species but absent from all sequenced classical C. abortus strains.

Conclusion: Overall, the results show that C. psittaci strain 84/2334 clusters very closely with C. abortus strains, and are consistent with the strain being a recent C. abortus ancestral species. This suggests that the strain should be reclassified as C. abortus. Furthermore, the identification of a C. abortus strain bearing an extra-chromosomal plasmid has implications for plasmid-based transformation studies to investigate gene function as well as providing a potential route for the development of a next generation vaccine to protect livestock from C. abortus infection.
Source
http://dx.doi.org/10.1186/s12864-021-07477-6
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7937271
March 2021

A Systematic Evaluation of High-Throughput Sequencing Approaches to Identify Low-Frequency Single Nucleotide Variants in Viral Populations.

Viruses 2020 10 20;12(10). Epub 2020 Oct 20.

Department of Microbial and Cellular Sciences, Faculty of Health and Medical Sciences, School of Biosciences and Medicine, University of Surrey, Guildford GU2 7XH, UK.

High-throughput sequencing approaches, such as those provided by Illumina, are an efficient way to understand sequence variation within viral populations. However, challenges exist in distinguishing process-introduced error from biological variance, which significantly impacts our ability to identify sub-consensus single-nucleotide variants (SNVs). Here we have taken a systematic approach to evaluate laboratory and bioinformatic pipelines to accurately identify low-frequency SNVs in viral populations. Artificial DNA and RNA "populations" were created by introducing known SNVs at predetermined frequencies into template nucleic acid before being sequenced on an Illumina MiSeq platform. These were used to assess the effects of abundance and starting input material type, technical replicates, read length and quality, short-read aligner, and percentage frequency thresholds on the ability to accurately call variants. Analyses revealed that the abundance and type of input nucleic acid had the greatest impact on the accuracy of SNV calling as measured by a micro-averaged Matthews correlation coefficient score, with DNA and high RNA inputs (10 copies) allowing for variants to be called at a 0.2% frequency. Reduced input RNA (10 copies) required more technical replicates to maintain accuracy, while low RNA inputs (10 copies) suffered from consensus-level errors. Base errors at specific motifs, present in all technical replicates, were also identified; these can be excluded to further increase SNV calling accuracy. These findings indicate that samples with low RNA inputs should be excluded for SNV calling and reinforce the importance of optimising the technical and bioinformatics steps in pipelines that are used to accurately identify sequence variants.
Source
http://dx.doi.org/10.3390/v12101187
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7594041
October 2020
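
The abstract above scores SNV calling with a micro-averaged Matthews correlation coefficient (MCC). As a minimal sketch of what that metric usually means, the Python below pools true/false positive and negative counts across samples before computing a single MCC. The pooling rule, the function names, and the tuple layout are assumptions made for illustration; the listing does not say how the authors implemented the averaging.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from one confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def micro_averaged_mcc(tables):
    """Pool (tp, fp, tn, fn) counts over all samples, then compute one MCC.

    `tables` is a hypothetical list of 4-tuples, one per sample or replicate.
    """
    tp = sum(t[0] for t in tables)
    fp = sum(t[1] for t in tables)
    tn = sum(t[2] for t in tables)
    fn = sum(t[3] for t in tables)
    return mcc(tp, fp, tn, fn)
```

Pooling before scoring lets samples with many candidate positions weigh more than small ones, which differs from averaging per-sample MCC values (macro-averaging).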

Full genome sequencing of archived wild type and vaccine rinderpest virus isolates prior to their destruction.

Sci Rep 2020 04 16;10(1):6563. Epub 2020 Apr 16.

The Pirbright Institute, Ash Road, Pirbright, Surrey, GU24 0NF, UK.

When rinderpest virus (RPV) was declared eradicated in 2011, the only remaining samples of this once much-feared livestock virus were those held in various laboratories. In order to allow the destruction of our institute's stocks of RPV while maintaining the ability to recover the various viruses if ever required, we have determined the full genome sequence of all our distinct samples of RPV, including 51 wild type viruses and examples of three different types of vaccine strain. Examination of the sequences of these virus isolates has shown that the African isolates form a single disparate clade, rather than two separate clades, which is more in accord with the known history of the virus in Africa. We have also identified two groups of goat-passaged viruses which have acquired an extra 6 bases in the long untranslated region between the M and F protein coding sequences, and shown that, for more than half the genomes sequenced, translation of the F protein requires translational frameshift or non-standard translation initiation. Curiously, the clade containing the lapinised vaccine viruses that were developed originally in Korea appears to be more similar to the known African viruses than to any other Asian viruses.
Source
http://dx.doi.org/10.1038/s41598-020-63707-z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7162898
April 2020

Pervasive Differential Splicing in Marek's Disease Virus can Discriminate CVI-988 Vaccine Strain from RB-1B Very Virulent Strain in Chicken Embryonic Fibroblasts.

Viruses 2020 03 18;12(3). Epub 2020 Mar 18.

Integrative Biology and Bioinformatics, The Pirbright Institute, Ash Road, Woking GU24 0NF, UK.

Marek's disease is a major scourge challenging poultry health worldwide. It is caused by the highly contagious Marek's disease virus (MDV), an alphaherpesvirus. Here, we showed that, similar to other members of its family, MDV also presents a complex landscape of splicing events, most of which are uncharacterised and/or not annotated. Quite strikingly, and although the biological relevance of this fact is unknown, we found that a number of viral splicing isoforms are strain-specific, despite the close sequence similarity of the strains considered: very virulent RB-1B and vaccine CVI-988. We validated our findings by devising an assay that discriminated infections caused by the two strains in chicken embryonic fibroblasts on the basis of the presence of some RNA species. To our knowledge, this study is the first to accomplish such a result, emphasizing how relevant a comprehensive picture of the viral transcriptome is to fully understand viral pathogenesis.
Source
http://dx.doi.org/10.3390/v12030329
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7150913
March 2020

Pervasive within-host recombination and epistasis as major determinants of the molecular evolution of the foot-and-mouth disease virus capsid.

PLoS Pathog 2020 01 6;16(1):e1008235. Epub 2020 Jan 6.

The Pirbright Institute, Woking, Surrey, United Kingdom.

Although recombination is known to occur in foot-and-mouth disease virus (FMDV), it is considered only a minor determinant of virus sequence diversity. Analysis at phylogenetic scales shows inter-serotypic recombination events are rare, whereby recombination occurs almost exclusively in non-structural proteins. In this study we have estimated recombination rates within a natural host in an experimental setting. African buffaloes were inoculated with a SAT-1 FMDV strain containing two major viral sub-populations differing in their capsid sequence. This population structure enabled the detection of extensive within-host recombination in the genomic region coding for structural proteins and allowed recombination rates between the two sub-populations to be estimated. Quite surprisingly, the effective recombination rate in VP1 during the acute infection phase turns out to be about 0.1 per base per year, i.e. comparable to the mutation/substitution rate. Using a high-resolution map of effective within-host recombination in the capsid-coding region, we identified a linkage disequilibrium pattern in VP1 that is consistent with a mosaic structure with two main genetic blocks. Positive epistatic interactions between co-evolved variants appear to be present both within and between blocks. These interactions are due to intra-host selection both at the RNA and protein level. Overall our findings show that during FMDV co-infections by closely related strains, capsid-coding genes recombine within the host at a much higher rate than expected, despite the presence of strong constraints dictated by the capsid structure. Although these intra-host results are not immediately translatable to a phylogenetic setting, recombination and epistasis must play a major and so far underappreciated role in the molecular evolution of the virus at all scales.
Source
http://dx.doi.org/10.1371/journal.ppat.1008235
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6964909
January 2020

Transposons played a major role in the diversification between the closely related almond and peach genomes: results from the almond genome sequence.

Plant J 2020 01 22;101(2):455-472. Epub 2019 Oct 22.

IRTA, Campus UAB, Edifici CRAG, Cerdanyola del Vallès (Bellaterra), 08193, Barcelona, Spain.

We sequenced the genome of the highly heterozygous almond Prunus dulcis cv. Texas combining short- and long-read sequencing. We obtained a genome assembly totaling 227.6 Mb of the estimated almond genome size of 238 Mb, of which 91% is anchored to eight pseudomolecules corresponding to its haploid chromosome complement, and annotated 27 969 protein-coding genes and 6747 non-coding transcripts. By phylogenomic comparison with the genomes of 16 additional close and distant species we estimated that almond and peach (Prunus persica) diverged around 5.88 million years ago. These two genomes are highly syntenic and show a high degree of sequence conservation (20 nucleotide substitutions per kb). However, they also exhibit a high number of presence/absence variants, many attributable to the movement of transposable elements (TEs). Transposable elements have generated an important number of presence/absence variants between almond and peach, and we show that the recent history of TE movement seems markedly different between them. Transposable elements may also be at the origin of important phenotypic differences between both species, and in particular for the sweet kernel phenotype, a key agronomic and domestication character for almond. Here we show that in sweet almond cultivars, highly methylated TE insertions surround a gene involved in the biosynthesis of amygdalin, whose reduced expression has been correlated with the sweet almond phenotype. Altogether, our results suggest a key role of TEs in the recent history and diversification of almond and its close relative peach.
Source
http://dx.doi.org/10.1111/tpj.14538
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7004133
January 2020

Mass Die-Off of Saiga Antelopes, Kazakhstan, 2015.

Emerg Infect Dis 2019 06;25(6):1169-1176

In 2015, a mass die-off of ≈200,000 saiga antelopes in central Kazakhstan was caused by hemorrhagic septicemia attributable to the bacterium Pasteurella multocida serotype B. Previous analyses have indicated that environmental triggers associated with weather conditions, specifically air moisture and temperature in the region of the saiga antelope calving during the 10-day period running up to the event, were critical to the proliferation of latent bacteria and were comparable to conditions accompanying historically similar die-offs in the same areas. We investigated whether additional viral or bacterial pathogens could be detected in samples from affected animals using 3 different high-throughput sequencing approaches. We did not identify pathogens associated with commensal bacterial opportunisms in blood, kidney, or lung samples and thus concluded that P. multocida serotype B was the primary cause of the disease.
Source
http://dx.doi.org/10.3201/eid2506.180990
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6537709
June 2019

Persistent Infection of African Buffalo (Syncerus caffer) with Foot-and-Mouth Disease Virus: Limited Viral Evolution and No Evidence of Antibody Neutralization Escape.

J Virol 2019 08 17;93(15). Epub 2019 Jul 17.

The Pirbright Institute, Woking, Surrey, United Kingdom

African buffaloes (Syncerus caffer) are the principal "carrier" hosts of foot-and-mouth disease virus (FMDV). Currently, the epithelia and lymphoid germinal centers of the oropharynx have been identified as sites for FMDV persistence. We carried out studies in FMDV SAT1 persistently infected buffaloes to characterize the diversity of viruses in oropharyngeal epithelia, germinal centers, probang samples (oropharyngeal scrapings), and tonsil swabs to determine if sufficient virus variation is generated during persistence for immune escape. Most sequencing reads of the VP1 coding region of the SAT1 virus inoculum clustered around 2 subpopulations differing by 22 single-nucleotide variants of intermediate frequency. Similarly, most sequences from oropharynx tissue clustered into two subpopulations, albeit with different proportions, depending on the day postinfection (dpi). There was a significant difference between the populations of viruses in the inoculum and in lymphoid tissue taken at 35 dpi. Thereafter, until 400 dpi, no significant variation was detected in the viral populations in samples from individual animals, germinal centers, and epithelial tissues. Deep sequencing of virus from probang or tonsil swab samples harvested prior to postmortem showed less within-sample variability of VP1 than that of tissue sample sequences analyzed at the same time. Importantly, there was no significant difference in the ability of sera collected between 14 and 400 dpi to neutralize the inoculum or viruses isolated at later time points in the study from the same animal. Therefore, based on this study, there is no evidence of escape from antibody neutralization contributing to FMDV persistent infection in African buffalo. Foot-and-mouth disease virus (FMDV) is a highly contagious virus of cloven-hoofed animals and is recognized as the most important constraint to international trade in animals and animal products.
African buffaloes (Syncerus caffer) are efficient carriers of FMDV, and it has been proposed that new virus variants are produced in buffalo during the prolonged carriage after acute infection, which may spread to cause disease in livestock populations. Here, we show that despite an accumulation of low-frequency sequence variants over time, there is no evidence of significant antigenic variation leading to immune escape. Therefore, carrier buffalo are unlikely to be a major source of new virus variants.
Source
http://dx.doi.org/10.1128/JVI.00563-19
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6639274
August 2019

Bovine Derived Cultures Generate Heterogeneous Populations of Antigen Presenting Cells.

Front Immunol 2019 29;10:612. Epub 2019 Mar 29.

The Pirbright Institute, Woking, United Kingdom.

Antigen presenting cells (APC) of the mononuclear phagocytic system include dendritic cells (DCs) and macrophages (Macs) which are essential mediators of innate and adaptive immune responses. Many of the biological functions attributed to these cell subsets have been elucidated using models that utilize in vitro-matured cells derived from common progenitors. However, it has recently been shown that monocyte culture systems generate heterogeneous populations of cells, DCs, and Macs. In light of these findings, we analyzed the most commonly used bovine in vitro-derived APC models and compared them to DCs. Here, we show that bovine monocyte-derived DCs and Macs can be differentiated on the basis of CD11c and MHC class II (MHCII) expression and that in vitro conditions generate a heterologous group of both DCs and Macs with defined and specific biological activities. In addition, skin-migrating macrophages present in the bovine afferent lymph were identified and phenotyped for the first time. RNA sequencing analyses showed that these monophagocytic cells have distinct transcriptomic profiles similar to those described in other species. These results have important implications for the interpretation of data obtained using in vitro systems.
Source
http://dx.doi.org/10.3389/fimmu.2019.00612
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450137
August 2020

The Site Frequency/Dosage Spectrum of Autopolyploid Populations.

Front Genet 2018 23;9:480. Epub 2018 Oct 23.

Centre for Research in Agricultural Genomics, Barcelona, Spain.

The Site Frequency Spectrum (SFS) and the heterozygosity of allelic variants are among the most important summary statistics for population genetic analysis of diploid organisms. We discuss the generalization of these statistics to populations of autopolyploid organisms in terms of the joint Site Frequency/Dosage Spectrum and its expected value for autopolyploid populations that follow the standard neutral model. Based on these results, we present estimators of nucleotide variability from High-Throughput Sequencing (HTS) data of autopolyploids and discuss potential issues related to sequencing errors and variant calling. We use these estimators to generalize Tajima's and other SFS-based neutrality tests to HTS data from autopolyploid organisms. Finally, we discuss how these approaches fail when the number of individuals is small. In fact, in autopolyploids there are many possible deviations from the Hardy-Weinberg equilibrium, each reflected in a different shape of the individual dosage distribution. The SFS from small samples is often dominated by the shape of these deviations of the dosage distribution from its Hardy-Weinberg expectations.
Source
http://dx.doi.org/10.3389/fgene.2018.00480
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6207136
October 2018
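
For orientation, the SFS-based statistics that the abstract above generalizes can be sketched in their classical form. The Python below computes Tajima's D from an unfolded site frequency spectrum for a sample of n sequences under the standard neutral model. It is the textbook version only, not the autopolyploid estimator from the paper, and the function name and input layout are assumptions made for illustration.

```python
import math


def tajimas_d(sfs, n):
    """Classical Tajima's D from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of segregating sites at which the derived
    allele is carried by i of the n sampled sequences, for i = 1..n-1.
    """
    if len(sfs) != n - 1:
        raise ValueError("sfs must have n - 1 entries")
    s = sum(sfs)  # total number of segregating sites
    if s == 0:
        return math.nan  # D is undefined without variation

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Mean pairwise differences (pi) and Watterson's estimator.
    pi = sum(c * 2 * i * (n - i) for i, c in enumerate(sfs, start=1)) / (n * (n - 1))
    theta_w = s / a1

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    var = e1 * s + e2 * s * (s - 1)
    if var <= 0:
        return math.nan  # e.g. n = 2, where the variance term vanishes
    return (pi - theta_w) / math.sqrt(var)
```

A spectrum proportional to 1/i, the neutral expectation, gives D close to zero, while an excess of singletons pushes D negative.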

Differential gene regulation underlies variation in melanic plumage coloration in the dark-eyed junco (Junco hyemalis).

Mol Ecol 2018 11 22;27(22):4501-4515. Epub 2018 Oct 22.

National Museum of Natural Sciences, Spanish National Research Council (CSIC), Madrid, Spain.

Colour plays a prominent role in species recognition; therefore, understanding the proximate basis of pigmentation can provide insight into reproductive isolation and speciation. Colour differences between taxa may be the result of regulatory differences or be caused by mutations in coding regions of the expressed genes. To investigate these two alternatives, we studied the pigment composition and the genetic basis of coloration in two divergent dark-eyed junco (Junco hyemalis) subspecies, the slate-coloured and Oregon juncos, which have evolved marked differences in plumage coloration since the Last Glacial Maximum. We used HPLC and light microscopy to investigate pigment composition and deposition in feathers from four body areas. We then used RNA-seq to compare the relative roles of differential gene expression in developing feathers and sequence divergence in transcribed loci under common-garden conditions. Junco feathers differed in eumelanin and pheomelanin content and distribution. Within subspecies, in lighter feathers melanin synthesis genes were downregulated (including PMEL, TYR, TYRP1, OCA2 and MLANA), and ASIP was upregulated. Feathers from different body regions also showed differential expression of HOX and WNT genes. Feathers from the same body regions that differed in colour between the two subspecies showed differential expression of ASIP and three other genes (MFSD12, KCNJ13 and HAND2) associated with pigmentation in other taxa. Sequence variation in the expressed genes was not related to colour differences. Our findings support the hypothesis that differential regulation of a few genes can account for marked differences in coloration, a mechanism that may facilitate the rapid phenotypic diversification of juncos.
Source
http://dx.doi.org/10.1111/mec.14878
November 2018

Within-Host Recombination in the Foot-and-Mouth Disease Virus Genome.

Viruses 2018 04 25;10(5). Epub 2018 Apr 25.

The Pirbright Institute, Ash Road, Woking, Surrey GU24 0NF, UK.

Recombination is one of the determinants of genetic diversity in the foot-and-mouth disease virus (FMDV). FMDV sequences have a mosaic structure caused by extensive intra- and inter-serotype recombination, with the exception of the capsid-encoding region. While these genome-wide patterns of broad-scale recombination are well studied, not much is known about the patterns of recombination that may exist within infected hosts. In addition, detection of recombination among viruses evolving at the within-host level is challenging due to the similarity of the sequences and the limitations in differentiating recombination from point mutations. Here, we present the first analysis of recombination events between closely related FMDV sequences occurring within buffalo hosts. The detection of these events was made possible by the occurrence of co-infection of two viral swarms with about 1% nucleotide divergence. We found more than 15 recombination events, unequally distributed across eight samples from different animals. The distribution of these events along the FMDV genome was neither uniform nor related to the phylogenetic distribution of recombination breakpoints, suggesting a mismatch between within-host evolutionary pressures and long-term selection for infectivity and transmissibility.

Source
http://dx.doi.org/10.3390/v10050221 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5977214 (PMC)
April 2018

A 3-way hybrid approach to generate a new high-quality chimpanzee reference genome (Pan_tro_3.0).

Gigascience 2017 11;6(11):1-6

Institut de Biologia Evolutiva, (CSIC-Universitat Pompeu Fabra), PRBB, Doctor Aiguader 88, Barcelona, Catalonia 08003, Spain.

The chimpanzee is arguably the most important species for the study of human origins. A key resource for these studies is a high-quality reference genome assembly; however, as with most mammalian genomes, the current iteration of the chimpanzee reference genome assembly is highly fragmented. In the current iteration of the chimpanzee reference genome assembly (Pan_tro_2.1.4), the sequence is scattered across more than 183 000 contigs, incorporating more than 159 000 gaps, with a genome-wide contig N50 of 51 Kbp. In this work, we produce an extensive and diverse array of sequencing datasets to rapidly assemble a new chimpanzee reference that surpasses previous iterations in bases represented and organized in large scaffolds. To this end, we show substantial improvements over the current release of the chimpanzee genome (Pan_tro_2.1.4) by several metrics, such as increased contiguity by >750% and 300% on contigs and scaffolds, respectively, and closure of 77% of the gaps in the Pan_tro_2.1.4 assembly, spanning >850 Kbp of novel coding sequence based on RNASeq data. We further report more than 2700 genes that had putatively erroneous frame-shift predictions to human in Pan_tro_2.1.4 and show a substantial increase in the annotation of repetitive elements. We apply a simple 3-way hybrid approach to considerably improve the reference genome assembly for the chimpanzee, providing a valuable resource for the study of human origins. Furthermore, we produce extensive sequencing datasets that are all derived from the same cell line, generating a broad non-human benchmark dataset.

Source
http://dx.doi.org/10.1093/gigascience/gix098 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5714192 (PMC)
November 2017

ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data.

BMC Genomics 2017 01 3;18(1). Epub 2017 Jan 3.

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain.

Background: Chimeric transcripts are commonly defined as transcripts linking two or more different genes in the genome, and can be explained by various biological mechanisms such as genomic rearrangement, read-through or trans-splicing, but also by technical or biological artefacts. Several studies have shown their importance in cancer, cell pluripotency and motility. Many programs have recently been developed to identify chimeras from Illumina RNA-seq data (mostly fusion genes in cancer). However, outputs of different programs on the same dataset can be widely inconsistent, and tend to include many false positives. Other issues relate to simulated datasets restricted to fusion genes, real datasets with limited numbers of validated cases, result inconsistencies between simulated and real datasets, and gene rather than junction level assessment.

Results: Here we present ChimPipe, a modular and easy-to-use method to reliably identify fusion genes and transcription-induced chimeras from paired-end Illumina RNA-seq data. We have also produced realistic simulated datasets for three different read lengths, and enhanced two gold-standard cancer datasets by associating exact junction points to validated gene fusions. Benchmarking ChimPipe together with four other state-of-the-art tools on these data showed ChimPipe to be the top program at identifying exact junction coordinates for both kinds of datasets, and the one showing the best trade-off between sensitivity and precision. Applied to 106 ENCODE human RNA-seq datasets, ChimPipe identified 137 high confidence chimeras connecting the protein coding sequence of their parent genes. In subsequent experiments, three out of four predicted chimeras, two of which were recurrently expressed in a large majority of the samples, could be validated. Cloning and sequencing of the three cases revealed several new chimeric transcript structures, three of which have the potential to encode a chimeric protein for which we hypothesized a new role. Applying ChimPipe to human and mouse ENCODE RNA-seq data led to the identification of 131 recurrent chimeras common to both species, and therefore potentially conserved.

Conclusions: ChimPipe combines discordant paired-end reads and split-reads to detect any kind of chimeras, including those originating from polymerase read-through, and shows an excellent trade-off between sensitivity and precision. The chimeras found by ChimPipe can be validated in vitro with high accuracy.

Source
http://dx.doi.org/10.1186/s12864-016-3404-9 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5209911 (PMC)
January 2017

Extreme genomic erosion after recurrent demographic bottlenecks in the highly endangered Iberian lynx.

Genome Biol 2016 12 14;17(1):251. Epub 2016 Dec 14.

CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028, Barcelona, Spain.

Background: Genomic studies of endangered species provide insights into their evolution and demographic history, reveal patterns of genomic erosion that might limit their viability, and offer tools for their effective conservation. The Iberian lynx (Lynx pardinus) is the most endangered felid and a unique example of a species on the brink of extinction.

Results: We generate the first annotated draft of the Iberian lynx genome and carry out genome-based analyses of lynx demography, evolution, and population genetics. We identify a series of severe population bottlenecks in the history of the Iberian lynx that predate its known demographic decline during the 20th century and have greatly impacted its genome evolution. We observe drastically reduced rates of weak-to-strong substitutions associated with GC-biased gene conversion and increased rates of fixation of transposable elements. We also find multiple signatures of genetic erosion in the two remnant Iberian lynx populations, including a high frequency of potentially deleterious variants and substitutions, as well as the lowest genome-wide genetic diversity reported so far in any species.

Conclusions: The genomic features observed in the Iberian lynx genome may hamper short- and long-term viability through reduced fitness and adaptive potential. The knowledge and resources developed in this study will boost the research on felid evolution and conservation genomics and will benefit the ongoing conservation and management of this emblematic species.

Source
http://dx.doi.org/10.1186/s13059-016-1090-1 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5155386 (PMC)
December 2016

Genome sequence of the olive tree, Olea europaea.

Gigascience 2016 06 27;5:29. Epub 2016 Jun 27.

Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain.

Background: The Mediterranean olive tree (Olea europaea subsp. europaea) was one of the first trees to be domesticated and is currently of major agricultural importance in the Mediterranean region as the source of olive oil. The molecular bases underlying the phenotypic differences among domesticated cultivars, or between domesticated olive trees and their wild relatives, remain poorly understood. Both wild and cultivated olive trees have 46 chromosomes (2n).

Findings: A total of 543 Gb of raw DNA sequence from whole genome shotgun sequencing, and a fosmid library containing 155,000 clones from a 1,000+ year-old olive tree (cv. Farga) were generated by Illumina sequencing using different combinations of mate-pair and paired-end libraries. Assembly gave a final genome with a scaffold N50 of 443 kb, and a total length of 1.31 Gb, which represents 95% of the estimated genome length (1.38 Gb). In addition, the associated fungus Aureobasidium pullulans was partially sequenced. Genome annotation, assisted by RNA sequencing from leaf, root, and fruit tissues at various stages, resulted in 56,349 unique protein coding genes, suggesting recent genomic expansion. Genome completeness, as estimated using the CEGMA pipeline, reached 98.79%.

Conclusions: The assembled draft genome of O. europaea will provide a valuable resource for the study of the evolution and domestication processes of this important tree, and allow determination of the genetic bases of key phenotypic traits. Moreover, it will enhance breeding programs and the formation of new varieties.

Source
http://dx.doi.org/10.1186/s13742-016-0134-5 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4922053 (PMC)
June 2016

CARGO: effective format-free compressed storage of genomic information.

Nucleic Acids Res 2016 07 29;44(12):e114. Epub 2016 Apr 29.

Algorithm Development, Centro Nacional de Análisis Genómico, Carrer Baldiri i Reixac 4, Barcelona 08028, Spain; Integrative Biology, The Pirbright Institute, Ash Road, Pirbright, Woking, GU24 0NF, United Kingdom.

The recent super-exponential growth in the amount of sequencing data generated worldwide has put techniques for compressed storage into the focus. Most available solutions, however, are strictly tied to specific bioinformatics formats, sometimes inheriting from them suboptimal design choices; this hinders flexible and effective data sharing. Here, we present CARGO (Compressed ARchiving for GenOmics), a high-level framework to automatically generate software systems optimized for the compressed storage of arbitrary types of large genomic data collections. Straightforward applications of our approach to FASTQ and SAM archives require a few lines of code, produce solutions that match and sometimes outperform specialized format-tailored compressors and scale well to multi-TB datasets. All CARGO software components can be freely downloaded for academic and non-commercial use from http://bio-cargo.sourceforge.net.

Source
http://dx.doi.org/10.1093/nar/gkw318 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4937321 (PMC)
July 2016

A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.

Nat Commun 2015 Dec 9;6:10001. Epub 2015 Dec 9.

Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.

As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ∼100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.

Source
http://dx.doi.org/10.1038/ncomms10001 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4682041 (PMC)
December 2015

Boosting the FM-Index on the GPU: Effective Techniques to Mitigate Random Memory Access.

IEEE/ACM Trans Comput Biol Bioinform 2015 Sep-Oct;12(5):1048-59

The recent advent of high-throughput sequencing machines producing large amounts of short reads has boosted the interest in efficient string searching techniques. As of today, many mainstream sequence alignment software tools rely on a special data structure, called the FM-index, which allows for fast exact searches in large genomic references. However, such searches translate into a pseudo-random memory access pattern, thus making memory access the limiting factor of all computation-efficient implementations, both on CPUs and GPUs. Here, we show that several strategies can be put in place to remove the memory bottleneck on the GPU: more compact indexes can be implemented by having more threads work cooperatively on larger memory blocks, and a k-step FM-index can be used to further reduce the number of memory accesses. The combination of those and other optimisations yields an implementation that is able to process about two Gbases of queries per second on our test platform, being about 8× faster than a comparable multi-core CPU version, and about 3× to 5× faster than the FM-index implementation on the GPU provided by the recently announced Nvidia NVBIO bioinformatics library.

Source
http://dx.doi.org/10.1109/TCBB.2014.2377716 (DOI)
July 2016

Efficient Alignment of Illumina-Like High-Throughput Sequencing Reads with the GEnomic Multi-tool (GEM) Mapper.

Curr Protoc Bioinformatics 2015 Jun 19;50:11.13.1-11.13.20. Epub 2015 Jun 19.

Centro Nacional de Análisis Genómico, Barcelona, Spain.

Modern Illumina-like high-throughput sequencing machines allow the cheap decoding of great amounts of DNA. The GEnomic Multi-tool (GEM) mapper is one of the fastest and most sensitive methods known to date to align such data to a known genomic reference. This unit explains how to use it effectively.

Source
http://dx.doi.org/10.1002/0471250953.bi1113s50 (DOI)
June 2015

Transcriptome and genome sequencing uncovers functional variation in humans.

Nature 2013 Sep 15;501(7468):506-11. Epub 2013 Sep 15.

Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland.

Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of messenger RNA and microRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project, the first uniformly processed high-throughput RNA-sequencing data from multiple human populations with high-quality genome sequences. We discover extremely widespread genetic variation affecting the regulation of most genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on the cellular mechanisms of regulatory and loss-of-function variation, and allows us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome.

Source
http://dx.doi.org/10.1038/nature12531 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3918453 (PMC)
September 2013

The GEM mapper: fast, accurate and versatile alignment by filtration.

Nat Methods 2012 Dec 28;9(12):1185-8. Epub 2012 Oct 28.

Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain.

Because of ever-increasing throughput requirements of sequencing data, most existing short-read aligners have been designed to focus on speed at the expense of accuracy. The Genome Multitool (GEM) mapper can leverage string matching by filtration to search the alignment space more efficiently, simultaneously delivering precision (performing fully tunable exhaustive searches that return all existing matches, including gapped ones) and speed (being several times faster than comparable state-of-the-art tools).

Source
http://dx.doi.org/10.1038/nmeth.2221 (DOI)
December 2012

Modelling and simulating generic RNA-Seq experiments with the flux simulator.

Nucleic Acids Res 2012 Nov 7;40(20):10073-83. Epub 2012 Sep 7.

Bioinformatics and Genomics Program, Centre de Regulació Genòmica (CRG), 08003 Barcelona, Spain.

High-throughput sequencing of cDNA libraries constructed from cellular RNA complements (RNA-Seq) naturally provides a digital quantitative measurement for every expressed RNA molecule. Nature, impact and mutual interference of biases in different experimental setups are, however, still poorly understood, mostly due to the lack of data from intermediate protocol steps. We analysed multiple RNA-Seq experiments, involving different sample preparation protocols and sequencing platforms: we broke them down into their common (and currently indispensable) technical components (reverse transcription, fragmentation, adapter ligation, PCR amplification, gel segregation and sequencing), investigating how such different steps influence abundance and distribution of the sequenced reads. For each of those steps, we developed universally applicable models, which can be parameterised by empirical attributes of any experimental protocol. Our models are implemented in a computer simulation pipeline called the Flux Simulator, and we show that read distributions generated by different combinations of these models reproduce well the evidence obtained from the corresponding experimental setups. We further demonstrate that our in silico RNA-Seq provides insights about hidden precursors that determine the final configuration of reads along gene bodies; enhancing or compensatory effects that explain apparently controversial observations can be observed. Moreover, our simulations identify hitherto unreported sources of systematic bias from RNA hydrolysis, a fragmentation technique currently employed by most RNA-Seq protocols.

Source
http://dx.doi.org/10.1093/nar/gks666 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3488205 (PMC)
November 2012

Landscape of transcription in human cells.

Nature 2012 Sep;489(7414):101-8

Centre for Genomic Regulation and UPF, Doctor Aiguader 88, Barcelona 08003, Catalonia, Spain.

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

Source
http://dx.doi.org/10.1038/nature11233 (DOI)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3684276 (PMC)
September 2012

Fast computation and applications of genome mappability.

PLoS One 2012 19;7(1):e30377. Epub 2012 Jan 19.

Institut de Génétique et Développement (IGDR), Université Rennes 1, Rennes, France.

We present a fast mapping-based algorithm to compute the mappability of each region of a reference genome up to a specified number of mismatches. Knowing the mappability of a genome is crucial for the interpretation of massively parallel sequencing experiments. We investigate the properties of the mappability of eukaryotic DNA/RNA both as a whole and at the level of the gene family, providing for various organisms tracks which allow the mappability information to be visually explored. In addition, we show that mappability varies greatly between species and gene classes. Finally, we suggest several practical applications where mappability can be used to refine the analysis of high-throughput sequencing data (SNP calling, gene expression quantification and paired-end experiments). This work highlights mappability as an important concept which deserves to be taken into full account, in particular when massively parallel sequencing technologies are employed. The GEM mappability program belongs to the GEM (GEnome Multitool) suite of programs, which can be freely downloaded for any use from its website (http://gemlibrary.sourceforge.net).

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030377 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3261895 (PMC)
June 2012

Evidence for transcript networks composed of chimeric RNAs in human cells.

PLoS One 2012 4;7(1):e28213. Epub 2012 Jan 4.

Bioinformatics and Genomics, Centre for Genomic Regulation and Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.

The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5' and 3' transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0028213 (PLOS)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3251577 (PMC)
May 2012
-->