Detecting genomic indel variants with exact breakpoints in single- and paired-end sequencing data using SplazerS.

Bioinformatics 2012 Mar 11;28(5):619-27. Epub 2012 Jan 11.

Department of Computer Science, Freie Universität Berlin, Takustrasse 9, Max-Planck-Institute for Molecular Genetics, Berlin, Germany.

Motivation: The reliable detection of genomic variation in resequencing data is still a major challenge, especially for variants larger than a few base pairs. Sequencing reads crossing boundaries of structural variation carry the potential for their identification, but are difficult to map.

Results: Here we present a method for 'split' read mapping, where prefix and suffix match of a read may be interrupted by a longer gap in the read-to-reference alignment. We use this method to accurately detect medium-sized insertions and long deletions with precise breakpoints in genomic resequencing data. Compared with alternative split mapping methods, SplazerS significantly improves sensitivity for detecting large indel events, especially in variant-rich regions. Our method is robust in the presence of sequencing errors as well as alignment errors due to genomic mutations/divergence, and can be used on reads of variable lengths. Our analysis shows that SplazerS is a versatile tool applicable to unanchored or single-end as well as anchored paired-end reads. In addition, application of SplazerS to targeted resequencing data led to the interesting discovery of a complete, possibly functional gene retrocopy variant.
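The idea of a split alignment can be illustrated with a small sketch. The Python code below is not the SplazerS algorithm, which tolerates sequencing errors and also handles insertions and paired-end anchoring. It is a toy version that assumes exact matching and a single deletion, and the function name `split_map_deletion` and the parameter `min_anchor` are hypothetical names chosen for this illustration. It tries each split point in the read, looks up the prefix and the suffix in the reference, and reports the gap between the two matches as a candidate deletion with its breakpoint.

```python
def split_map_deletion(read, reference, min_anchor=4):
    """Toy split-read mapping for one deletion, using exact matches only.

    Returns a dict describing the candidate deletion, or None if no split
    of the read yields a prefix and suffix separated by a reference gap.
    """
    # Try every split point that leaves at least min_anchor bases per side.
    for k in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:k], read[k:]

        # First exact occurrence of the prefix in the reference.
        p = reference.find(prefix)
        if p == -1:
            continue

        # The suffix must match at or after the end of the prefix match.
        q = reference.find(suffix, p + k)
        if q == -1:
            continue

        # Reference bases skipped between the two matches.
        gap = q - (p + k)
        if gap > 0:
            return {
                "prefix_pos": p,
                "split": k,
                "deletion_start": p + k,
                "deletion_length": gap,
            }
    return None


# Reference = left flank + deleted segment + right flank.
reference = "ACGTTGCA" + "GGGCCC" + "TTAGACTG"
# A read from a sample lacking the middle segment, spanning the breakpoint.
read = "GTTGCA" + "TTAGAC"
result = split_map_deletion(read, reference)
```

In this example the read's first six bases match the reference at position 2 and its last six bases match at position 14, so the sketch reports a 6 bp deletion starting at reference position 8 (0-based). A read that matches the reference contiguously yields no gap and returns None.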

Availability: SplazerS is available from http://www.seqan.de/projects/splazers.

Supplementary Information: Supplementary data are available at Bioinformatics online.


Source: http://dx.doi.org/10.1093/bioinformatics/bts019
