Biases in Illumina transcriptome sequencing caused by random hexamer priming
Kasper D. Hansen1,*, Steven E. Brenner2 and Sandrine Dudoit1,3
1Division of Biostatistics, School of Public Health, UC Berkeley, 101 Haviland Hall, Berkeley, CA 94720-7358, 2Department of Plant and Microbial Biology, UC Berkeley, 461 Koshland Hall, Berkeley, CA 94720-3102 and 3Department of Statistics, UC Berkeley, 367 Evans Hall, Berkeley, CA 94720-3860, USA
*To whom correspondence should be addressed. Email: email@example.com
Received December 1, 2009. Revised March 16, 2010. Accepted March 17, 2010.
Generation of cDNA using random hexamer priming induces biases in the nucleotide composition at the beginning of transcriptome sequencing reads from the Illumina Genome Analyzer. The bias is independent of organism and laboratory and impacts the uniformity of the reads along the transcriptome. We provide a read count reweighting scheme, based on the nucleotide frequencies of the reads, that mitigates the impact of the bias.
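The abstract does not spell out the reweighting scheme, so the following is only a minimal sketch of one way such a scheme could work, not necessarily the authors' exact method. It assumes that each read is weighted by its leading k-mer, and that the weight is the ratio of that k-mer's proportion at interior read positions (taken here as roughly bias-free) to its proportion at the read start. The function names, the choice of k = 7, and the interior positions are all illustrative assumptions.

```python
from collections import Counter

def kmer_proportions(reads, pos, k):
    """Proportion of each k-mer observed at offset `pos` across all reads."""
    counts = Counter(r[pos:pos + k] for r in reads if len(r) >= pos + k)
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def read_weights(reads, k=7, interior=(10, 11, 12)):
    """Weight per leading k-mer: mean interior proportion / start proportion.

    `k` and `interior` are hypothetical defaults chosen for illustration.
    """
    start = kmer_proportions(reads, 0, k)
    interior_props = [kmer_proportions(reads, p, k) for p in interior]
    weights = {}
    for kmer, p_start in start.items():
        p_mid = sum(d.get(kmer, 0.0) for d in interior_props) / len(interior_props)
        # k-mers over-represented at the read start get a weight below 1.
        weights[kmer] = p_mid / p_start
    return weights

def reweighted_count(reads, weights, k=7):
    """Sum of weights over reads, replacing the raw read count for a region."""
    # Reads shorter than k (or with an unseen k-mer) keep a weight of 1.
    return sum(weights.get(r[:k], 1.0) for r in reads)
```

In use, the weights would be estimated once from all mapped reads in a sample, and `reweighted_count` would then be applied to the reads falling in each gene or region in place of a plain count.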