Analysis and functional annotation of an expressed sequence tag collection for tropical crop sugarcane.

Genome Res 2003 Dec 12;13(12):2725-35. Epub 2003 Nov 12.

Centro de Biologia Molecular e Engenharia Genética, Instituto da Computação, Universidade Estadual de Campinas, 13083-970 Campinas-SP, Brazil.

To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After sequence processing, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% had no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1,415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged.
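The percentages quoted in the abstract can be cross-checked from its own counts. The following is a minimal sketch, not part of the original study: the variable names are illustrative, and the mean number of ESTs per transcript and the absolute count of unmatched sequences are derived here rather than reported by the authors.

```python
# Counts taken directly from the abstract above.
high_quality_ests = 237_954   # ESTs kept after sequence processing
assembled = 43_141            # putative transcripts from the first assembly
full_length = 14_409          # assembled sequences with a full-length clone
unique_genes = 33_620         # estimated unique genes after reassembly


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of assembled sequences containing a full-length insert (quoted as 33%).
full_length_pct = percent(full_length, assembled)

# Redundancy of the first assembly (quoted as 22%): the fraction of
# transcripts that collapsed when the consensus sequences were reassembled.
redundancy_pct = percent(assembled - unique_genes, assembled)

# Derived figures (assumptions of this sketch, not stated in the abstract).
ests_per_transcript = high_quality_ests / assembled
no_match_count = round(0.356 * assembled)  # 35.6% without database matches

print(f"full-length share: {full_length_pct:.1f}%")
print(f"first-assembly redundancy: {redundancy_pct:.1f}%")
print(f"mean ESTs per transcript: {ests_per_transcript:.2f}")
print(f"sequences without database matches: {no_match_count}")
```

Both recomputed percentages round to the values given in the abstract, which suggests the quoted 33% and 22% are rounded from the underlying counts.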

Source
DOI: http://dx.doi.org/10.1101/gr.1532103
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC403815
December 2003

Similar Publications

A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus.

BMC Genomics 2009 Sep 11;10:428. Epub 2009 Sep 11.

Instituto de Biología Molecular y Celular de Plantas, Universidad Politécnica de Valencia and Consejo Superior de Investigaciones Científicas, Avenida de los Naranjos s/n, Valencia 46022, Spain.

Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology.

September 2009

Serial analysis of gene expression in sugarcane (Saccharum spp.) leaves revealed alternative C4 metabolism and putative antisense transcripts.

Plant Mol Biol 2007 Apr 9;63(6):745-62. Epub 2007 Jan 9.

Laboratório de Melhoramento de Plantas, Centro de Energia Nuclear na Agricultura, Universidade de São Paulo, Piracicaba, SP, Brazil.

Sugarcane (Saccharum spp.) is a highly efficient biomass- and sugar-producing crop. Leaf reactions have been considered a potential rate-limiting step for sucrose accumulation in sugarcane stalks.

April 2007

Exploiting EST databases for the development and characterisation of 3425 gene-tagged CISP markers in biofuel crop sugarcane and their transferability in cereals and orphan tropical grasses.

BMC Res Notes 2013 Feb 4;6:47. Epub 2013 Feb 4.

Division of Plant Physiology and Biochemistry, Indian Institute of Sugarcane Research, Rae Bareli Road, Lucknow, Uttar Pradesh 226002, India.

Background: Sugarcane is an important cash crop, providing 70% of the global raw sugar as well as raw material for biofuel production. Genetic analysis is hindered in sugarcane because of its large and complex polyploid genome and the lack of sufficiently informative gene-tagged markers. Modern genomics has produced large numbers of ESTs, which can be exploited to develop molecular markers based on comparative analysis with EST datasets of related crops and the whole rice genome sequence, and accentuate their cross-technical functionality in orphan crops like tropical grasses.

February 2013

Sugarcane genome sequencing by methylation filtration provides tools for genomic research in the genus Saccharum.

Plant J 2014 Jul 17;79(1):162-72. Epub 2014 Jun 17.

Laboratório de Biologia Molecular de Plantas, Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Av. Carlos Chagas Filho 373, CCS, Bl.L-29, Cidade Universitária, Rio de Janeiro, 21941-599, RJ, Brazil.

Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene-rich regions. Gene-enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes.

July 2014