Identification and removal of low-complexity sites in allele-specific analysis of ChIP-seq data.

Bioinformatics 2014 Jan 18;30(2):165-71. Epub 2013 Nov 18.

Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland, Department of Genetic Medicine and Development, University of Geneva Medical School, Institute of Genetics and Genomics in Geneva, University of Geneva, 1211, Geneva, Switzerland and Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, 1011, Lausanne, Switzerland.

Motivation: High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts.

Results: We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy that filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays.
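
The two filtering criteria described above can be illustrated with a minimal Python sketch. This is not the absfilter implementation, and the abstract does not say which statistical test the authors use. The sketch assumes that reads sharing a start position count as clonal, that complexity is the number of distinct start positions at a site, and that Fisher's exact test is an acceptable stand-in for detecting an allele-specific excess of clonal reads. The function names and the `min_unique` and `alpha` thresholds are hypothetical.

```python
from math import comb


def split_unique_clonal(read_starts):
    """Return (unique, clonal): distinct start positions and the duplicate reads beyond them."""
    unique = len(set(read_starts))
    return unique, len(read_starts) - unique


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def flag_site(ref_starts, alt_starts, min_unique=5, alpha=0.05):
    """Classify one heterozygous site (assumed thresholds, not from the paper)."""
    ref_u, ref_c = split_unique_clonal(ref_starts)
    alt_u, alt_c = split_unique_clonal(alt_starts)
    # Criterion 1: too few distinct reads to trust any allelic ratio.
    if ref_u + alt_u < min_unique:
        return "low_complexity"
    # Criterion 2: clonal reads are distributed unevenly between the alleles.
    if fisher_exact_two_sided(ref_u, ref_c, alt_u, alt_c) < alpha:
        return "clonal_excess"
    return "keep"


# Reference allele: 10 distinct reads plus 30 duplicates of one of them.
biased_ref = list(range(10)) + [0] * 30
balanced_alt = list(range(10))
print(flag_site(biased_ref, balanced_alt))
```

In this toy input the reference allele appears four times as abundant as the alternative allele, yet both alleles are supported by the same number of distinct reads, so the site is flagged rather than reported as allele-specific binding.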

Availability: The R package absfilter for library clonality simulations and detection of amplification-biased sites is available from http://updepla1srv1.epfl.ch/waszaks/absfilter

Source
http://dx.doi.org/10.1093/bioinformatics/btt667
