Detection of splice junctions from paired-end RNA-seq data by SpliceMap
Kin Fai Au1, Hui Jiang1,2, Lan Lin3, Yi Xing3 and Wing Hung Wong1,*
1Department of Statistics, Stanford University, Stanford, CA 94305, 2Stanford Genome Technology Center, 855 California Ave, Palo Alto, CA 94304 and 3Department of Internal Medicine and Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
*To whom correspondence should be addressed. Tel: ; Fax: +1 650 725 8977; Email: email@example.com
Received December 7, 2009. Revised March 10, 2010. Accepted March 12, 2010.
Alternative splicing is a prevalent post-transcriptional process that is not only important to normal cellular function but is also involved in human diseases. Newly developed second-generation sequencing techniques provide high-throughput data (RNA-seq data) for studying alternative splicing events in different types of cells. Here, we present a computational method, SpliceMap, to detect splice junctions from RNA-seq data. This method does not depend on any existing annotation of gene structures and is capable of finding novel splice junctions with high sensitivity and specificity. It can handle long reads (50–100 nt) and can exploit paired-read information to improve mapping accuracy. Several parameters are included in the output to indicate the reliability of each predicted junction and to help filter out false predictions. We applied SpliceMap to analyze 23 million paired 50-nt reads from human brain tissue. The results show that, at this depth of sequencing, RNA-seq can support reliable detection of splice junctions, except for those that are present at very low levels. Compared to current methods, SpliceMap can achieve 12% higher sensitivity without sacrificing specificity.
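To make the sensitivity comparison concrete, the following minimal Python sketch shows one way a set of predicted junctions could be scored against a reference set. It is an illustration only, not SpliceMap's own evaluation procedure. The junction representation (chromosome, donor position, acceptor position), the function name `evaluate_junctions` and the toy coordinates are all assumptions introduced here. Because true negatives are not well defined for junction discovery, the sketch reports the fraction of predictions found in the reference (precision) as a stand-in for specificity, which is also an assumption.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical junction representation: (chromosome, donor position, acceptor position).
Junction = Tuple[str, int, int]


class Evaluation(NamedTuple):
    sensitivity: float  # fraction of reference junctions that were predicted
    precision: float    # fraction of predicted junctions present in the reference


def evaluate_junctions(predicted: Set[Junction], reference: Set[Junction]) -> Evaluation:
    """Score predicted junctions against a reference set (illustrative only)."""
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return Evaluation(sensitivity, precision)


# Toy data, invented for this example.
reference = {("chr1", 100, 500), ("chr1", 700, 900), ("chr2", 50, 300), ("chr2", 400, 800)}
predicted = {("chr1", 100, 500), ("chr1", 700, 900), ("chr2", 50, 300), ("chr3", 10, 90)}

result = evaluate_junctions(predicted, reference)
print(f"sensitivity={result.sensitivity:.2f}, precision={result.precision:.2f}")
```

In this toy case three of the four reference junctions are recovered and three of the four predictions are correct, so both values are 0.75.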