Publications by authors named "Arthur Gruber"

23 Publications

Characterization of a Novel Mitovirus of the Sand Fly Lutzomyia longipalpis Using Genomic and Virus-Host Interaction Signatures.

Viruses 2020 Dec 23;13(1). Epub 2020 Dec 23.

Bioinformatics Postgraduate Program, Universidade de São Paulo, São Paulo 05508-000, Brazil.

Hematophagous insects act as the major reservoirs of infectious agents due to their intimate contact with a large variety of vertebrate hosts. Lutzomyia longipalpis is the main vector of Leishmania infantum in the New World, but its role as a host of viruses is poorly understood. In this work, RNA libraries were subjected to progressive assembly using viral profile HMMs as seeds. A sequence phylogenetically related to fungal viruses of the genus Mitovirus was identified and this novel virus was named Lul-MV-1. The 2697-base genome presents a single gene coding for an RNA-directed RNA polymerase with an organellar genetic code. To determine the possible host of Lul-MV-1, we analyzed the molecular characteristics of the viral genome. Dinucleotide composition and codon usage showed profiles similar to mitochondrial DNA of invertebrate hosts. Also, the virus-derived small RNA profile was consistent with the activation of the siRNA pathway, with size distribution and 5' base enrichment analogous to those observed in viruses of sand flies, reinforcing Lutzomyia longipalpis as a putative host. Finally, RT-PCR of different insect pools and sequences of public RNA libraries confirmed the high prevalence of Lul-MV-1. This is the first report of a mitovirus infecting an insect host.

DOI: http://dx.doi.org/10.3390/v13010009
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7822452
December 2020
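
The abstract above refers to dinucleotide composition profiles. As an illustration only, the following Python sketch computes one widely used composition measure, the dinucleotide odds ratio (observed dinucleotide frequency divided by the product of the two mononucleotide frequencies). The abstract does not state which metric the authors used, so the choice of measure and the function name `dinucleotide_odds_ratios` are assumptions made for this example.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds_ratios(seq: str) -> dict[str, float]:
    """Return the odds ratio f(XY) / (f(X) * f(Y)) for all 16 dinucleotides.

    Hypothetical helper for illustration; not taken from the cited paper.
    RNA input is accepted by mapping U to T. Symbols other than A, C, G, T
    are ignored when computing frequencies.
    """
    seq = seq.upper().replace("U", "T")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))

    n_mono = sum(mono[b] for b in "ACGT")
    n_di = sum(di[a + b] for a, b in product("ACGT", repeat=2))
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence needs at least one unambiguous dinucleotide")

    ratios = {}
    for a, b in product("ACGT", repeat=2):
        f_a = mono[a] / n_mono
        f_b = mono[b] / n_mono
        f_ab = di[a + b] / n_di
        # The ratio is undefined when one of the bases never occurs.
        ratios[a + b] = f_ab / (f_a * f_b) if f_a and f_b else float("nan")
    return ratios


ratios = dinucleotide_odds_ratios("ACACACAC")
print(ratios["AC"], ratios["CA"], ratios["AA"])
```

Values above 1 indicate an over-represented dinucleotide and values below 1 an under-represented one; comparing such profiles between a viral genome and candidate host sequences is one way a composition-based host signature could be explored.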

A single unidirectional piRNA cluster similar to the flamenco locus is the major source of EVE-derived transcription and small RNAs in Aedes aegypti mosquitoes.

RNA 2020 May 29;26(5):581-594. Epub 2020 Jan 29.

Department of Biochemistry and Immunology, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, CEP 30270-901, Brazil.

Endogenous viral elements (EVEs) are found in many eukaryotic genomes. Despite considerable knowledge about genomic elements such as transposons (TEs) and retroviruses, we still lack information about nonretroviral EVEs. Aedes aegypti mosquitoes have a highly repetitive genome that is covered with EVEs. Here, we identified 129 nonretroviral EVEs in the AaegL5 version of the A. aegypti genome. These EVEs were significantly associated with TEs and preferentially located in repeat-rich clusters within intergenic regions. Genome-wide transcriptome analysis showed that most EVEs generated transcripts although only around 1.4% were sense RNAs. The majority of EVE transcription was antisense and correlated with the generation of EVE-derived small RNAs. A single genomic cluster of EVEs located in a 143 kb repetitive region in chromosome 2 contributed 42% of antisense transcription and 45% of small RNAs derived from viral elements. This region was enriched for TE-EVE hybrids organized in the same coding strand. These generated a single long antisense transcript that correlated with the generation of phased primary PIWI-interacting RNAs (piRNAs). The putative promoter of this region had a conserved binding site for the transcription factor Cubitus interruptus, a key regulator of the flamenco locus in Drosophila melanogaster. Here, we have identified a single unidirectional piRNA cluster in the A. aegypti genome that is the major source of EVE transcription fueling the generation of antisense small RNAs in mosquitoes. We propose that this region is a flamenco-like locus in A. aegypti due to its relatedness to the major unidirectional piRNA cluster in Drosophila melanogaster.

DOI: http://dx.doi.org/10.1261/rna.073965.119
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7161354
May 2020

Bioinformatics Meets Virology: The European Virus Bioinformatics Center's Second Annual Meeting.

Viruses 2018 May 14;10(5). Epub 2018 May 14.

European Virus Bioinformatics Center, 07743 Jena, Germany.

The Second Annual Meeting of the European Virus Bioinformatics Center (EVBC), held in Utrecht, Netherlands, focused on computational approaches in virology, with topics including (but not limited to) virus discovery, diagnostics, (meta-)genomics, modeling, epidemiology, molecular structure, evolution, and viral ecology. The goals of the Second Annual Meeting were threefold: (i) to bring together virologists and bioinformaticians from across the academic, industrial, professional, and training sectors to share best practice; (ii) to provide a meaningful and interactive scientific environment to promote discussion and collaboration between students, postdoctoral fellows, and both new and established investigators; (iii) to inspire and suggest new research directions and questions. Approximately 120 researchers from around the world attended the Second Annual Meeting of the EVBC this year, including 15 renowned international speakers. This report presents an overview of new developments and novel research findings that emerged during the meeting.

DOI: http://dx.doi.org/10.3390/v10050256
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5977249
May 2018

Draft Genome Sequence of Curtobacterium sp. Strain ER1/6, an Endophytic Strain Isolated from Citrus sinensis with Potential To Be Used as a Biocontrol Agent.

Genome Announc 2016 Nov 17;4(6). Epub 2016 Nov 17.

NAP/BIOP, Departamento de Microbiologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, Biomédicas II, Cidade Universitária, São Paulo, São Paulo, Brazil

Herein, we report a draft genome sequence of the endophytic Curtobacterium sp. strain ER1/6, which was isolated from a surface-sterilized Citrus sinensis branch and showed the capability to control phytopathogens. Functional annotation of the ~3.4-Mb genome revealed 3,100 protein-coding genes, with many products related to known ecological and biotechnological aspects of this bacterium.

DOI: http://dx.doi.org/10.1128/genomeA.01264-16
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5114373
November 2016

GenSeed-HMM: A Tool for Progressive Assembly Using Profile HMMs as Seeds and its Application in Alpavirinae Viral Discovery from Metagenomic Data.

Front Microbiol 2016 Mar 4;7:269. Epub 2016 Mar 4.

Department of Parasitology, Institute of Biomedical Sciences, University of São Paulo São Paulo, Brazil.

This work reports the development of GenSeed-HMM, a program that implements seed-driven progressive assembly, an approach to reconstruct specific sequences from unassembled data, starting from short nucleotide or protein seed sequences or profile Hidden Markov Models (HMM). The program can use any one of a number of sequence assemblers. Assembly is performed in multiple steps and relatively few reads are used in each cycle; consequently, the program demands low computational resources. As a proof-of-concept and to demonstrate the power of HMM-driven progressive assemblies, GenSeed-HMM was applied to metagenomic datasets in the search for diverse ssDNA bacteriophages from the recently described Alpavirinae subfamily. Profile HMMs were built using Alpavirinae-specific regions from multiple sequence alignments (MSA) using either the viral protein 1 (VP1; major capsid protein) or VP4 (genome replication initiation protein). These profile HMMs were used by GenSeed-HMM (running Newbler assembler) as seeds to reconstruct viral genomes from sequencing datasets of human fecal samples. All contigs obtained were annotated and taxonomically classified using similarity searches and phylogenetic analyses. The most specific profile HMM seed enabled the reconstruction of 45 partial or complete Alpavirinae genomic sequences. A comparison with conventional (global) assembly of the same original dataset, using Newbler in a standalone execution, revealed that GenSeed-HMM outperformed global genomic assembly in several metrics employed. This approach is capable of detecting organisms that have not been used in the construction of the profile HMM, which opens up the possibility of diagnosing novel viruses, without previous specific information, constituting a de novo diagnosis. Additional applications include, but are not limited to, the specific assembly of extrachromosomal elements such as plastid and mitochondrial genomes from metagenomic data. Profile HMM seeds can also be used to reconstruct specific protein coding genes for gene diversity studies, and to determine all possible gene variants present in a metagenomic sample. Such surveys could be useful to detect the emergence of drug-resistance variants in sensitive environments such as hospitals and animal production facilities, where antibiotics are regularly used. Finally, GenSeed-HMM can be used as an adjunct for gap closure on assembly finishing projects, by using multiple contig ends as anchored seeds.

DOI: http://dx.doi.org/10.3389/fmicb.2016.00269
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4777721
March 2016

The Pangenome of the Anticarsia gemmatalis Multiple Nucleopolyhedrovirus (AgMNPV).

Genome Biol Evol 2015 Nov 27;8(1):94-108. Epub 2015 Nov 27.

Department of Microbiology, Institute of Biomedical Sciences-ICB II, Laboratory of Molecular Evolution and Bioinformatics, University of São Paulo-USP, São Paulo, SP, Brazil

The alphabaculovirus Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV) is the world's most successful viral bioinsecticide. Through the 1980s and 1990s, this virus was extensively used for biological control of populations of Anticarsia gemmatalis (Velvetbean caterpillar) in soybean crops. During this period, genetic studies identified several variable loci in the AgMNPV; however, most of them were not characterized at the sequence level. In this study we report a full genome comparison among 17 wild-type isolates of AgMNPV. We found the pangenome of this virus to contain at least 167 hypothetical genes, 151 of which are shared by all genomes. The gene bro-a that might be involved in host specificity and carrying transporter is absent in some genomes, and new hypothetical genes were observed. Among these genes there is a unique rnf12-like gene, probably implicated in ubiquitination. Events of gene fission and fusion are common, as four genes have been observed as single or split open reading frames. Gains and losses of genomic fragments (from 20 to 900 bp) are observed within tandem repeats, such as in eight direct repeats and four homologous regions. Most AgMNPV genes present low nucleotide diversity, and variable genes are mainly located in a locus known to evolve through homologous recombination. The evolution of AgMNPV is mainly driven by small indels, substitutions, gain and loss of nucleotide stretches or entire coding sequences. These variations may cause relevant phenotypic alterations, which probably affect the infectivity of AgMNPV. This work provides novel information on genomic evolution of the AgMNPV in particular and of baculoviruses in general.

DOI: http://dx.doi.org/10.1093/gbe/evv231
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4758234
November 2015

Genomic analysis of the causative agents of coccidiosis in domestic chickens.

Genome Res 2014 Oct 11;24(10):1676-85. Epub 2014 Jul 11.

Wellcome Trust Sanger Institute, Genome Campus, Hinxton, Cambridgeshire CB10 1SA, United Kingdom;

Global production of chickens has trebled in the past two decades and they are now the most important source of dietary animal protein worldwide. Chickens are subject to many infectious diseases that reduce their performance and productivity. Coccidiosis, caused by apicomplexan protozoa of the genus Eimeria, is one of the most important poultry diseases. Understanding the biology of Eimeria parasites underpins development of new drugs and vaccines needed to improve global food security. We have produced annotated genome sequences of all seven species of Eimeria that infect domestic chickens, which reveal the full extent of previously described repeat-rich and repeat-poor regions and show that these parasites possess the most repeat-rich proteomes ever described. Furthermore, while no other apicomplexan has been found to possess retrotransposons, Eimeria is home to a family of chromoviruses. Analysis of Eimeria genes involved in basic biology and host-parasite interaction highlights adaptations to a relatively simple developmental life cycle and a complex array of co-expressed surface proteins involved in host cell binding.

DOI: http://dx.doi.org/10.1101/gr.168955.113
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4199364
October 2014

A selective review of advances in coccidiosis research.

Adv Parasitol 2013;83:93-171

Department of Poultry Science, University of Arkansas, Fayetteville, Arkansas, USA.

Coccidiosis is a widespread and economically significant disease of livestock caused by protozoan parasites of the genus Eimeria. This disease is worldwide in occurrence and costs the animal agricultural industry many millions of dollars to control. In recent years, the modern tools of molecular biology, biochemistry, cell biology and immunology have been used to expand greatly our knowledge of these parasites and the disease they cause. Such studies are essential if we are to develop new means for the control of coccidiosis. In this chapter, selective aspects of the biology of these organisms, with emphasis on recent research in poultry, are reviewed. Topics considered include taxonomy, systematics, genetics, genomics, transcriptomics, proteomics, transfection, oocyst biogenesis, host cell invasion, immunobiology, diagnostics and control.

DOI: http://dx.doi.org/10.1016/B978-0-12-407705-8.00002-1
February 2014

The Eimeria transcript DB: an integrated resource for annotated transcripts of protozoan parasites of the genus Eimeria.

Database (Oxford) 2013 Feb 14;2013:bat006. Epub 2013 Feb 14.

Department of Parasitology, Institute of Biomedical Sciences, University of São Paulo, Avenida Professor Lineu Prestes 1374, São Paulo SP 05508-000, Brazil.

Parasites of the genus Eimeria infect a wide range of vertebrate hosts, including chickens. We have recently reported a comparative analysis of the transcriptomes of Eimeria acervulina, Eimeria maxima and Eimeria tenella, integrating ORESTES data produced by our group and publicly available Expressed Sequence Tags (ESTs). All cDNA reads have been assembled, and the reconstructed transcripts have been submitted to a comprehensive functional annotation pipeline. Additional studies included orthology assignment across apicomplexan parasites and clustering analyses of gene expression profiles among different developmental stages of the parasites. To make all this body of information publicly available, we constructed the Eimeria Transcript Database (EimeriaTDB), a web repository that provides access to sequence data, annotation and comparative analyses. Here, we describe the web interface, available sequence data sets and query tools implemented on the site. The main goal of this work is to offer a public repository of sequence and functional annotation data of reconstructed transcripts of parasites of the genus Eimeria. We believe that EimeriaTDB will represent a valuable and complementary resource for the Eimeria scientific community and for those researchers interested in comparative genomics of apicomplexan parasites. Database URL: http://www.coccidia.icb.usp.br/eimeriatdb/

DOI: http://dx.doi.org/10.1093/database/bat006
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3572530
June 2013

A comparative transcriptome analysis reveals expression profiles conserved across three Eimeria spp. of domestic fowl and associated with multiple developmental stages.

Int J Parasitol 2012 Jan 22;42(1):39-48. Epub 2011 Nov 22.

Departamento de Parasitologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, Av. Prof. Lineu Prestes, 1374, São Paulo, SP 05508-000, Brazil.

Coccidiosis of the domestic fowl is a worldwide disease caused by seven species of protozoan parasites of the genus Eimeria. The genome of the model species, Eimeria tenella, presents a complexity of 55-60 Mb distributed in 14 chromosomes. Relatively few studies have been undertaken to unravel the complexity of the transcriptome of Eimeria parasites. We report here the generation of more than 45,000 open reading frame expressed sequence tag (ORESTES) cDNA reads of E. tenella, Eimeria maxima and Eimeria acervulina, covering several developmental stages: unsporulated oocysts, sporoblastic oocysts, sporulated oocysts, sporozoites and second generation merozoites. All reads were assembled to constitute gene indices and submitted to a comprehensive functional annotation pipeline. In the case of E. tenella, we also incorporated publicly available ESTs to generate an integrated body of information. Orthology analyses have identified genes conserved across different apicomplexan parasites, as well as genes restricted to the genus Eimeria. Digital expression profiles obtained from ORESTES/EST counts, submitted to clustering analyses, revealed a high conservation pattern across the three Eimeria spp. Distance trees showed that unsporulated and sporoblastic oocysts constitute a distinct clade in all species, with sporulated oocysts forming a more external branch. This latter stage also shows a close relationship with sporozoites, whereas first and second generation merozoites are more closely related to each other than to sporozoites. The profiles were unambiguously associated with the distinct developmental stages and strongly correlated with the order of the stages in the parasite life cycle. Finally, we present The Eimeria Transcript Database (http://www.coccidia.icb.usp.br/eimeriatdb), a website that provides open access to all sequencing data, annotation and comparative analysis. We expect this repository to represent a useful resource to the Eimeria scientific community, helping to define potential candidates for the development of new strategies to control coccidiosis of the domestic fowl.

DOI: http://dx.doi.org/10.1016/j.ijpara.2011.10.008
January 2012

Development of molecular assays for the identification of the 11 Eimeria species of the domestic rabbit (Oryctolagus cuniculus).

Vet Parasitol 2011 Mar 4;176(2-3):275-80. Epub 2010 Nov 4.

Departamento de Parasitologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo SP 05508-000, Brazil.

Coccidioses are the major parasitic diseases in poultry and other domestic animals including the domestic rabbit (Oryctolagus cuniculus). Eleven distinct Eimeria species have been identified in this host, but no PCR-based method has been developed so far for unequivocal species differentiation. In this work, we describe the development of molecular diagnostic assays that allow for the detection and discrimination of the 11 Eimeria species that infect rabbits. We determined the nucleotide sequences of the ITS1 ribosomal DNAs and designed species-specific primers for each species. We performed specificity tests of the assays using heterologous sets of primers and DNA samples, and no cross-specific bands were observed. We obtained a detection limit varying from 500 fg to 1 pg, which corresponds approximately to 0.8-1.7 sporulated oocysts, respectively. The test reported here showed good reproducibility and presented a consistent sensitivity with three different brands of amplification enzymes. These novel diagnostic assays will permit population surveys to be performed with high sensitivity and specificity, thus contributing to a better understanding of the epidemiology of this important group of coccidian parasites.

DOI: http://dx.doi.org/10.1016/j.vetpar.2010.10.054
March 2011

Sequence-specific reconstruction from fragmentary databases using seed sequences: implementation and validation on SAGE, proteome and generic sequencing data.

Bioinformatics 2008 Aug 9;24(15):1676-80. Epub 2008 Jun 9.

Instituto do Coração - USP, Av. Prof. Enéas de Carvalho Aguiar 44, São Paulo SP, 05403-000, Brazil.

Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.

Availability: GenSeed is available under the GNU General Public License at http://www.coccidia.icb.usp.br/genseed/

DOI: http://dx.doi.org/10.1093/bioinformatics/btn283
August 2008
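
The seed-driven cycle described in the abstract above (similarity search, read selection, assembly, then repeat with the extended sequence) can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not GenSeed itself: exact suffix/prefix matching stands in for the similarity search, greedy merging stands in for a real assembler, and the names `overlap` and `extend_seed` are hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def extend_seed(seed: str, reads: list[str], min_overlap: int = 4,
                max_cycles: int = 50) -> str:
    """Toy progressive assembly: grow `seed` until no read extends it."""
    contig = seed
    for _ in range(max_cycles):
        grown = contig
        for read in reads:
            if read in grown:
                continue                      # read already incorporated
            if grown in read:
                grown = read                  # read spans the whole contig
                continue
            n = overlap(grown, read, min_overlap)
            if n:
                grown += read[n:]             # extend the 3' end
                continue
            n = overlap(read, grown, min_overlap)
            if n:
                grown = read[:-n] + grown     # extend the 5' end
        if grown == contig:
            break                             # no progress: stop iterating
        contig = grown                        # extended contig is the new seed
    return contig


# Made-up example data: four overlapping reads tiled across a short target.
target = "ATGGCGTACGTTAGCCTAGGATCCGATTACAGGCTT"
reads = [target[0:14], target[8:22], target[16:30], target[24:36]]
seed = target[12:20]

assembled = extend_seed(seed, reads)
print(assembled)
```

Only reads that touch the current contig are used in each cycle, which is the property the abstract highlights: the target is reconstructed without an all-against-all comparison of the whole read set.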

Sequencing and analysis of chromosome 1 of Eimeria tenella reveals a unique segmental organization.

Genome Res 2007 Mar 6;17(3):311-9. Epub 2007 Feb 6.

Malaysia Genome Institute, UKM-MTDC Smart Technology Centre, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor DE, Malaysia.

Eimeria tenella is an intracellular protozoan parasite that infects the intestinal tracts of domestic fowl and causes coccidiosis, a serious and sometimes lethal enteritis. Eimeria falls in the same phylum (Apicomplexa) as several human and animal parasites such as Cryptosporidium, Toxoplasma, and the malaria parasite, Plasmodium. Here we report the sequencing and analysis of the first chromosome of E. tenella, a chromosome believed to carry loci associated with drug resistance and known to differ between virulent and attenuated strains of the parasite. The chromosome, which appears to be representative of the genome, is gene-dense and rich in simple-sequence repeats, many of which appear to give rise to repetitive amino acid tracts in the predicted proteins. Most striking is the segmentation of the chromosome into repeat-rich regions peppered with transposon-like elements and telomere-like repeats, alternating with repeat-free regions. Predicted genes differ in character between the two types of segment, and the repeat-rich regions appear to be associated with strain-to-strain variation.

DOI: http://dx.doi.org/10.1101/gr.5823007
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1800922
March 2007

TRAP: automated classification, quantification and annotation of tandemly repeated sequences.

Bioinformatics 2006 Feb 6;22(3):361-2. Epub 2005 Dec 6.

Departamento de Parasitologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, São Paulo SP, 05508-000, Brazil.

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.

DOI: http://dx.doi.org/10.1093/bioinformatics/bti809
February 2006

T cell epitope characterization in tandemly repetitive Trypanosoma cruzi B13 protein.

Microbes Infect 2005 Aug-Sep;7(11-12):1184-95

Laboratory of Immunology, Heart Institute (InCor), University of São Paulo School of Medicine, Av. Dr. Enéas de Carvalho Aguiar, 44, Bloco II, 9th andar, São Paulo, SP 05403-000, Brazil.

Proteins containing tandemly repetitive sequences are present in several immunodominant protein antigens in pathogenic protozoan parasites. The tandemly repetitive Trypanosoma cruzi B13 protein is recognized by IgG antibodies from 98% of Chagas' disease patients. Little is known about the molecular mechanisms that lead to the immunodominance of the repeated sequences, and there is limited information on T cell epitopes in such repetitive antigens. We finely characterized the T cell recognition of the tandemly repetitive, degenerate B13 protein by T cell lines, clones and PBMC from Chagas' disease cardiomyopathy (CCC), asymptomatic T. cruzi infected (ASY) and non-infected individuals (N). PBMC proliferative responses to recombinant B13 protein were restricted to individuals bearing HLA-DQA1*0501(DQ7), -DR1, and -DR2; B13 peptides bound to the same HLA molecules in binding assays. The HLA-DQ7-restricted minimal T cell epitope [FGQAAAG(D/E)KP] was identified with an overlapping combinatorial peptide library including all B13 sequence variants in T. cruzi Y strain B13 protein; the underlined small residues GQA were the major HLA contact residues. Among natural B13 15-mer variant peptides, molecular modeling showed that several variant positions were solvent (TCR)-exposed, and substitutions at exposed positions abolished recognition. While natural B13 variant peptide S15.9 seems to be the immunodominant epitope for Chagas' disease patients, S15.4 was preferentially recognized by CCC rather than ASY patients, which may be pathogenically relevant. This is the first thorough characterization of T cell epitopes of a tandemly repetitive protozoan antigen and may suggest a role for T cell help in the immunodominance of protozoan repetitive antigens.

DOI: http://dx.doi.org/10.1016/j.micinf.2005.03.033
December 2005

EGene: a configurable pipeline generation system for automated sequence analysis.

Bioinformatics 2005 Jun 6;21(12):2812-3. Epub 2005 Apr 6.

Depto. de Ciências da Computação, Instituto de Matemática e Estatística São Paulo, SP, 05508-900, Brazil.

EGene is a generic, flexible and modular pipeline generation system that makes pipeline construction a modular job. EGene allows for third-party programs to be used and integrated according to the needs of distinct projects and without any previous programming or formal language experience being required. EGene comes with CoEd, a visual tool to facilitate pipeline construction and documentation. A series of components to build pipelines for sequence processing is provided.

Availability: http://www.lbm.fmvz.usp.br/egene/

Supplementary Information: http://www.lbm.fmvz.usp.br/egene/

DOI: http://dx.doi.org/10.1093/bioinformatics/bti424
June 2005

Identification and complete sequencing of novel human transcripts through the use of mouse orthologs and testis cDNA sequences.

Genet Mol Res 2004 Dec 30;3(4):493-511. Epub 2004 Dec 30.

Laboratory of Molecular Biology and Genomics, Ludwig Institute for Cancer Research, São Paulo, SP, Brazil.

The correct identification of all human genes, and their derived transcripts, has not yet been achieved, and it remains one of the major aims of the worldwide genomics community. Computational programs suggest the existence of 30,000 to 40,000 human genes. However, definitive gene identification can only be achieved by experimental approaches. We used two distinct methodologies, one based on the alignment of mouse orthologous sequences to the human genome, and another based on the construction of a high-quality human testis cDNA library, in an attempt to identify new human transcripts within the human genome sequence. We generated 47 complete human transcript sequences, comprising 27 unannotated and 20 annotated sequences. Eight of these transcripts are variants of previously known genes. These transcripts were characterized according to size, number of exons, and chromosomal localization, and a search for protein domains was undertaken based on their putative open reading frames. In silico expression analysis suggests that some of these transcripts are expressed at low levels and in a restricted set of tissues.

December 2004

Characterization of SCAR markers of Eimeria spp. of domestic fowl and construction of a public relational database (The Eimeria SCARdb).

FEMS Microbiol Lett 2004 Sep;238(1):183-8

Departamento de Patologia, Faculdade de Medicina Veterinária e Zootecnia, USP, Av. Prof. Orlando Marques de Paiva 87, São Paulo, SP 05508-000, Brazil.

This study reports the development and characterization of 151 sequence characterized amplified region (SCAR) markers for the seven Eimeria species that infect the domestic fowl. From this set, 84 markers are species-specific and 67 present partial specificity. The complete nucleotide sequence was derived for all markers, revealing the presence of micro- and minisatellite repetitive units in 22 SCARs, with up to five distinct repeat units being observed per marker. Only 15 markers showed significant hits in similarity searches against public sequence databases, thus confirming their anonymous and non-coding character. Finally, a relational database of the markers (the Eimeria SCARdb) was developed and made available on the Internet, providing a valuable resource of SCAR markers that can be useful for molecular diagnosis, and also for epizootiological, genetic variability and genome mapping studies.

DOI: http://dx.doi.org/10.1016/j.femsle.2004.07.034
September 2004

A transcript finishing initiative for closing gaps in the human transcriptome.

Genome Res 2004 Jul 14;14(7):1413-23. Epub 2004 Jun 14.

We report the results of a transcript finishing initiative, undertaken for the purpose of identifying and characterizing novel human transcripts, in which RT-PCR was used to bridge gaps between paired EST clusters, mapped against the genomic sequence. Each pair of EST clusters selected for experimental validation was designated a transcript finishing unit (TFU). A total of 489 TFUs were selected for validation, and an overall efficiency of 43.1% was achieved. We generated a total of 59,975 bp of transcribed sequences organized into 432 exons, contributing to the definition of the structure of 211 human transcripts. The structure of several transcripts reported here was confirmed during the course of this project, through the generation of their corresponding full-length cDNA sequences. Nevertheless, for 21% of the validated TFUs, a full-length cDNA sequence is not yet available in public databases, and the structure of 69.2% of these TFUs was not correctly predicted by computer programs. The TF strategy provides a significant contribution to the definition of the complete catalog of human genes and transcripts, because it appears to be particularly useful for identification of low abundance transcripts expressed in a restricted set of tissues as well as for the delineation of gene boundaries and alternatively spliced isoforms.

Source
http://dx.doi.org/10.1101/gr.2111304
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC442158
July 2004

The Eimeria genome projects: a sequence of events.

Trends Parasitol 2004 May;20(5):199-201

Institute for Animal Health, Compton Laboratory, Compton, Nr Newbury, Berkshire RG20 7NN, UK.

Source
http://dx.doi.org/10.1016/j.pt.2004.02.005
May 2004

The generation and utilization of a cancer-oriented representation of the human transcriptome by using expressed sequence tags.

Proc Natl Acad Sci U S A 2003 Nov 30;100(23):13418-23. Epub 2003 Oct 30.

Laboratorio de Genética Molecular do Cancer, Departamento de Radiologia, Universidade de São Paulo, Travessa da Rua Dr. Ovídeo Pires de Campos S/N, 4°, Brazil.

Whereas genome sequencing defines the genetic potential of an organism, transcript sequencing defines the utilization of this potential and links the genome with most areas of biology. To exploit the information within the human genome in the fight against cancer, we have deposited some two million expressed sequence tags (ESTs) from human tumors and their corresponding normal tissues in the public databases. The data currently define approximately 23,500 genes, of which only approximately 1,250 are still represented only by ESTs. Examination of the EST coverage of known cancer-related (CR) genes reveals that <1% do not have corresponding ESTs, indicating that the representation of genes associated with commonly studied tumors is high. The careful recording of the origin of all ESTs we have produced has enabled detailed definition of where the genes they represent are expressed in the human body. More than 100,000 ESTs are available for seven tissues, indicating a surprising variability of gene usage that has led to the discovery of a significant number of genes with restricted expression, and that may thus be therapeutically useful. The ESTs also reveal novel nonsynonymous germline variants (although the one-pass nature of the data necessitates careful validation) and many alternatively spliced transcripts. Although widely exploited by the scientific community, vindicating our totally open source policy, the EST data generated still provide extensive information that remains to be systematically explored, and that may further facilitate progress toward both the understanding and treatment of human cancers.

Source
http://dx.doi.org/10.1073/pnas.1233632100
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC263829
November 2003

Pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of Plasmodium vivax in human patients.

Malar J 2003 Jul 21;2:21. Epub 2003 Jul 21.

Departamento de Parasitologia, ICB, Universidade de São Paulo, São Paulo, Brazil.

Background: Plasmodium vivax is the most widely distributed human malaria parasite, responsible for 70-80 million clinical cases each year and large socio-economic burdens for countries such as Brazil, where it is the most prevalent species. Unfortunately, due to the impossibility of growing this parasite in continuous in vitro culture, research on P. vivax remains largely neglected.

Methods: A pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of P. vivax was performed. To do so, 1,184 clones from a cDNA library constructed with parasites obtained from 10 different human patients in the Brazilian Amazon were sequenced. Sequences were automatically processed to remove contaminants and low-quality reads. A total of 806 sequences with an average length of 586 bp met these criteria, and their clustering revealed 666 distinct events. The consensus sequence of each cluster and the unique sequences of the singlets were used in similarity searches against different databases that included P. vivax, Plasmodium falciparum, Plasmodium yoelii, Plasmodium knowlesi, Apicomplexa and the GenBank non-redundant database. An E-value of <10^-30 was used to define a significant database match. ESTs were manually assigned a gene ontology (GO) terminology.

Results: A total of 769 ESTs could be assigned a putative identity based upon sequence similarity to known proteins in GenBank. Moreover, 292 ESTs were annotated and a GO terminology was assigned to 164 of them.

Conclusion: These are the first ESTs reported for P. vivax and, as such, they represent a valuable resource to assist in the annotation of the P. vivax genome currently being sequenced. Moreover, since the GC-content of the P. vivax genome is strikingly different from that of P. falciparum, these ESTs will help in the validation of gene predictions for P. vivax and to create a gene index of this malaria parasite.

Source
http://dx.doi.org/10.1186/1475-2875-2-21
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC183858
July 2003