LLM3D: a log-linear modeling-based method to predict functional gene regulatory interactions from genome-wide expression data
Geert Geeven¹, Harold D. MacGillavry², Ruben Eggers³, Marion M. Sassen², Joost Verhaagen¹,³, August B. Smit², Mathisca C. M. de Gunst¹ and Ronald E. van Kesteren²,*
¹Department of Mathematics, Faculty of Sciences, VU University, De Boelelaan 1081, 1081 HV Amsterdam, ²Department of Molecular and Cellular Neurobiology, Center for Neurogenomics and Cognitive Research, Neuroscience Campus Amsterdam, VU University, De Boelelaan 1085, 1081 HV Amsterdam and ³Department of Neuroregeneration, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, The Netherlands
*To whom correspondence should be addressed. Tel: +31 20 5987111; Fax: +31 20 5989281; Email: firstname.lastname@example.org
* Received October 25, 2010.
* Revision received February 24, 2011.
* Accepted February 25, 2011.
All cellular processes are regulated by condition-specific and time-dependent interactions between transcription factors and their target genes. In simple organisms such as bacteria and yeast, a large amount of experimental data is available to support functional transcription regulatory interactions; in mammalian systems, however, reconstruction of gene regulatory networks still depends heavily on the accurate prediction of transcription factor binding sites. Here, we present a new method, log-linear modeling of 3D contingency tables (LLM3D), to predict functional transcription factor binding sites. LLM3D combines gene expression data, gene ontology annotation and computationally predicted transcription factor binding sites in a single statistical analysis, and offers a methodological improvement over existing enrichment-based methods. We show that LLM3D successfully identifies novel transcriptional regulators of the yeast metabolic cycle, and predicts key regulators of mouse embryonic stem cell self-renewal more accurately than existing enrichment-based methods. Moreover, in a clinically relevant in vivo injury model of mammalian neurons, LLM3D identified peroxisome proliferator-activated receptor γ (PPARγ) as a neuron-intrinsic transcriptional regulator of regenerative axon growth. In conclusion, LLM3D provides a significant improvement over existing methods in predicting functional transcription regulatory interactions in the absence of experimental transcription factor binding data.
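To make the idea of log-linear modeling of a 3D contingency table concrete, the following is a minimal sketch and not the authors' implementation. It assumes each gene is scored on three binary attributes (member of an expression cluster, annotated with a given gene ontology term, carrying a predicted binding site for a given transcription factor); this binary coding is an illustrative assumption, since the abstract does not specify how the table is built. The sketch fits the log-linear model with all two-way interactions but no three-way interaction by iterative proportional fitting, and tests the three-way term with a likelihood-ratio (G²) statistic on one degree of freedom. All function names are hypothetical, and the fit assumes every two-way margin of the table is positive.

```python
import math
from itertools import product

# All (expression, GO, TFBS) index triples of a 2x2x2 table.
CELLS = list(product((0, 1), repeat=3))


def build_table(expr, go, tfbs):
    """Cross-classify genes by three binary attributes (assumed coding)."""
    table = {cell: 0.0 for cell in CELLS}
    for e, g, t in zip(expr, go, tfbs):
        table[(int(e), int(g), int(t))] += 1.0
    return table


def margin(table, keep):
    """Two-way margin: sum over the one axis not listed in `keep`."""
    out = {}
    for cell, value in table.items():
        key = (cell[keep[0]], cell[keep[1]])
        out[key] = out.get(key, 0.0) + value
    return out


def fit_no_three_way(table, n_iter=500, tol=1e-10):
    """Iterative proportional fitting of the no-three-way-interaction model.

    Assumes all two-way margins of `table` are positive.
    """
    fitted = {cell: 1.0 for cell in CELLS}
    for _ in range(n_iter):
        previous = dict(fitted)
        # Rescale the fitted counts to match each observed two-way margin.
        for keep in ((0, 1), (0, 2), (1, 2)):
            observed = margin(table, keep)
            current = margin(fitted, keep)
            for cell in CELLS:
                key = (cell[keep[0]], cell[keep[1]])
                fitted[cell] *= observed[key] / current[key]
        if max(abs(fitted[c] - previous[c]) for c in CELLS) < tol:
            break
    return fitted


def three_way_test(table):
    """Likelihood-ratio test of the three-way interaction (1 df for 2x2x2)."""
    fitted = fit_no_three_way(table)
    g2 = 2.0 * sum(n * math.log(n / fitted[c]) for c, n in table.items() if n > 0)
    # Chi-square survival function with 1 df, written via erfc.
    p_value = math.erfc(math.sqrt(max(g2, 0.0) / 2.0))
    return g2, p_value


if __name__ == "__main__":
    # Toy counts (invented for illustration): the GO-TFBS association
    # reverses direction between the two expression groups.
    toy = {cell: 30.0 if sum(cell) % 2 == 0 else 10.0 for cell in CELLS}
    g2, p = three_way_test(toy)
    print(f"G2 = {g2:.2f}, p = {p:.3g}")
```

A small p-value here indicates that the association between the annotation and the predicted binding site differs between expression groups, which is the kind of joint dependence that separate pairwise enrichment tests cannot capture.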