Friday, April 9, 2010

A Signal-Noise Model for Significance Analysis of ChIP-seq with Negative Control

Han Xu 1,3, Lusy Handoko 2, Xueliang Wei 4, Chaopeng Ye 2, Jianpeng Sheng 5, Chia-Lin Wei 2, Feng Lin 3,* and Wing-Kin Sung 1,4,*

1Computational & Mathematical Biology Group, Genome Institute of Singapore, 138672, Singapore; 2Genome Technology & Biology Group, Genome Institute of Singapore, 138672, Singapore; 3School of Computer Engineering, Nanyang Technological University, 637553, Singapore; 4School of Computing, National University of Singapore, 117543, Singapore; 5School of Biological Science, Nanyang Technological University, 637551, Singapore

*To whom correspondence should be addressed. Feng Lin, Wing-Kin Sung, E-mail: sungk@gis.a-star.edu.sg, asflin@ntu.edu.sg


Abstract

Motivation: ChIP-seq is becoming the main approach to the genome-wide study of protein-DNA interactions and histone modifications. Existing informatics tools perform well at extracting strong ChIP-enriched sites. However, two questions remain to be answered: (a) to what extent can a ChIP-seq experiment reveal the weak ChIP-enriched sites? (b) are the weak sites biologically meaningful? To answer these questions, it is necessary to distinguish weak ChIP signals from background noise.

Results: We propose a linear signal-noise model in which a noise rate is introduced to represent the fraction of noise in a ChIP library. We developed an iterative algorithm to estimate the noise rate using a control library, and derived a library-swapping strategy for FDR estimation. These approaches were integrated into a general-purpose framework, named CCAT (Control based ChIP-seq Analysis Tool), for the significance analysis of ChIP-seq. Applications to H3K4me3 and H3K36me3 datasets showed that CCAT predicted significantly more ChIP-enriched sites than previous methods. With the high sensitivity of CCAT prediction, we revealed distinct chromatin features associated with the strong and weak H3K4me3 sites.
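
To make the two ideas in the Results concrete, here is a minimal Python sketch. It is not the CCAT implementation: the abstract does not give the algorithm's details, so the binning of reads, the random half-splitting of libraries, the convergence rule, and the function names `estimate_noise_rate` and `swap_fdr` are all assumptions made for illustration. The first function iteratively estimates a noise rate from per-bin ChIP and control counts; the second estimates an empirical FDR by comparing the number of sites called in the real analysis with the number called when the ChIP and control libraries are swapped.

```python
import numpy as np


def estimate_noise_rate(chip, control, max_iter=50, tol=1e-4, seed=0):
    """Estimate the fraction of ChIP reads attributable to noise.

    chip, control: integer read counts over the same genomic bins
    (hypothetical input format, assumed for this sketch).
    """
    rng = np.random.default_rng(seed)
    chip = np.asarray(chip, dtype=np.int64)
    control = np.asarray(control, dtype=np.int64)
    if control.sum() == 0:
        raise ValueError("control library is empty")

    # Assumption: split each library into two random halves. One half picks
    # background bins, the other estimates the rate, so that the selection
    # step does not bias the estimate downward.
    chip_sel = rng.binomial(chip, 0.5)
    chip_est = chip - chip_sel
    ctrl_sel = rng.binomial(control, 0.5)
    ctrl_est = control - ctrl_sel

    scale = chip.sum() / control.sum()  # sequencing-depth normalisation
    alpha = 1.0  # start by assuming the whole ChIP library is noise
    for _ in range(max_iter):
        # Bins whose ChIP count does not exceed the expected noise level
        background = chip_sel <= alpha * scale * ctrl_sel
        expected = scale * ctrl_est[background].sum()
        if expected == 0:
            break
        new_alpha = min(1.0, chip_est[background].sum() / expected)
        converged = abs(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    return float(alpha)


def swap_fdr(real_scores, swapped_scores, threshold):
    """Empirical FDR at a score threshold.

    real_scores: site scores with ChIP as treatment and control as background.
    swapped_scores: site scores with the two libraries swapped.
    """
    n_real = int(np.sum(np.asarray(real_scores) >= threshold))
    n_swap = int(np.sum(np.asarray(swapped_scores) >= threshold))
    if n_real == 0:
        return 0.0
    return min(1.0, n_swap / n_real)
```

In this sketch, sites called after swapping the libraries stand in for false positives, so their count divided by the count of real calls gives a rough FDR at each threshold. How CCAT actually scores regions and uses the noise rate in the swapped analysis is described in the paper, not here.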

Availability: http://cmb.gis.a-star.edu.sg/ChIPSeq/tools.htm
