OrfM: a fast open reading frame predictor for metagenomic data.

Bioinformatics 2016 09 3;32(17):2702-3. Epub 2016 May 3.

Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD 4072, Australia.

Finding and translating stretches of DNA that lack stop codons is a common task in the analysis of sequence data. However, the computational tools for finding open reading frames are slow enough that they are becoming a bottleneck as the volume of sequence data grows. This bottleneck is especially problematic in metagenomics, when searching unassembled reads or screening assembled contigs for genes of interest. Here, we present OrfM, a tool that rapidly identifies open reading frames (ORFs) in sequence data by applying the Aho-Corasick algorithm to find regions uninterrupted by stop codons. Benchmarking showed that OrfM finds the same ORFs as similar tools ('GetOrf' and 'Translate') but is four to five times faster. While OrfM is sequencing platform-agnostic, it is best suited to large, high-quality datasets such as those produced by Illumina sequencers.
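The idea of locating stop codons with Aho-Corasick and reporting the stretches between them can be illustrated with a minimal Python sketch. This is not OrfM's C implementation: the function names (`build_automaton`, `find_stops`, `forward_orfs`) and the `min_codons` parameter are hypothetical, only the three forward frames are scanned, and reverse-strand handling and translation to protein are left out.

```python
from collections import deque

STOPS = ("TAA", "TAG", "TGA")  # standard stop codons (assumed genetic code)

def build_automaton(patterns):
    """Build an Aho-Corasick automaton: trie, failure links, outputs."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out

def find_stops(seq, automaton):
    """Return start positions of every stop codon, found in one pass."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(seq):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append(i - len(pat) + 1)
    return hits

def forward_orfs(seq, min_codons=1):
    """List (frame, start, dna) for stop-free stretches in frames 0-2."""
    seq = seq.upper()
    stops = find_stops(seq, build_automaton(STOPS))
    orfs = []
    for frame in range(3):
        start = frame
        # Sentinel marking the end of the last complete codon in this frame.
        frame_end = frame + (len(seq) - frame) // 3 * 3
        in_frame = [p for p in stops if p % 3 == frame]
        for stop in in_frame + [frame_end]:
            if stop - start >= 3 * min_codons:
                orfs.append((frame, start, seq[start:stop]))
            start = stop + 3  # resume just after the stop codon
    return orfs

if __name__ == "__main__":
    for frame, start, dna in forward_orfs("ATGAAATAGCCC", min_codons=2):
        print(frame, start, dna)
```

Each stop-codon hit is assigned to a reading frame by its position modulo 3, so a single scan of the sequence serves all three forward frames. As in the abstract's definition, an ORF here is any region uninterrupted by stop codons, with no start codon required.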

Availability and Implementation: Source code and binaries are freely available under the LGPL 3+ license, for download at http://github.com/wwood/OrfM or through GNU Guix. OrfM is implemented in C and supported on GNU/Linux and OS X.

Contact: b.woodcroft@uq.edu.au

Supplementary Information: Supplementary data are available at Bioinformatics online.


Source
DOI: http://dx.doi.org/10.1093/bioinformatics/btw241
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5013905
September 2016

