PLoS One 2010 Mar 3;5(3):e9513. Epub 2010 Mar 3.
Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America.
Signaling and regulatory pathways that guide gene expression have only been partially defined for most organisms. However, given the increasing number of microarray measurements, it may be possible to reconstruct such pathways and uncover missing connections directly from experimental data. Using a compendium of microarray gene expression data obtained from Escherichia coli, we constructed a series of Bayesian network models for the reactive oxygen species (ROS) pathway as defined by EcoCyc. A consensus Bayesian network model was generated using those networks sharing the top recovered score. This microarray-based network only partially agreed with the known ROS pathway curated from the literature and databases. A top network was then expanded to predict genes that could enhance the Bayesian network model using an algorithm we termed 'BN+1'. This expansion procedure predicted many stress-related genes (e.g., dusB and uspE) and their possible interactions with other ROS pathway genes. A term enrichment method showed that biofilm-associated microarray data usually contained high expression levels of both uspE and gadX. The predicted involvement of gene uspE in the ROS pathway and interactions between uspE and gadX were confirmed experimentally using E. coli reporter strains. Genes gadX and uspE showed a feedback relationship in regulating each other's expression. Both genes were verified to regulate biofilm formation through gene knockout experiments. These data suggest that the BN+1 expansion method can faithfully uncover hidden or unknown genes that play significant biological roles in a selected pathway. The BN+1 expansion method reported here is a generalized approach applicable to the characterization and expansion of other biological pathways and living systems.
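The abstract does not spell out how the BN+1 expansion is computed, so the following is only a minimal sketch of the general idea: add one candidate gene at a time to a fixed core network and rank candidates by how much they improve the network score. Everything specific here is an assumption made for illustration and is not taken from the paper: the BIC score, the restriction that a candidate may only be added as one extra parent of a core gene, the function names `local_bic` and `bn_plus_one`, the gene labels, and the toy discretized data.

```python
import math
from collections import Counter


def local_bic(data, child, parents):
    """BIC local score of `child` given `parents` on discrete data.

    `data` maps a gene name to a list of discrete expression states,
    one entry per array. BIC is an assumed score chosen for brevity.
    """
    n = len(data[child])
    joint = Counter()          # counts of (parent configuration, child state)
    parent_counts = Counter()  # counts of parent configuration alone
    for i in range(n):
        config = tuple(data[p][i] for p in parents)
        joint[(config, data[child][i])] += 1
        parent_counts[config] += 1
    # Maximum-likelihood log-likelihood of the child given its parents.
    loglik = sum(
        count * math.log(count / parent_counts[config])
        for (config, _state), count in joint.items()
    )
    # Free parameters: (child states - 1) * (number of parent configurations).
    n_parent_configs = 1
    for p in parents:
        n_parent_configs *= len(set(data[p]))
    n_params = (len(set(data[child])) - 1) * n_parent_configs
    return loglik - 0.5 * n_params * math.log(n)


def bn_plus_one(data, core_network, candidates):
    """Rank candidate genes by their best score gain over the core network.

    `core_network` maps each core gene to a tuple of its parents.
    Simplification (assumed): the core structure stays fixed and each
    candidate is tried only as one additional parent of each core gene.
    Returns (candidate, best_child, gain) tuples, best gain first.
    """
    ranking = []
    for gene in candidates:
        best_gain, best_child = -math.inf, None
        for child, parents in core_network.items():
            gain = (local_bic(data, child, parents + (gene,))
                    - local_bic(data, child, parents))
            if gain > best_gain:
                best_gain, best_child = gain, child
        ranking.append((gene, best_child, best_gain))
    ranking.sort(key=lambda item: item[2], reverse=True)
    return ranking


# Hypothetical toy data: 8 arrays, binary expression states.
toy_data = {
    "core_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "core_b": [0, 1, 0, 1, 0, 1, 0, 1],
    "cand_x": [0, 1, 0, 1, 0, 1, 0, 1],  # tracks core_b exactly
    "cand_y": [0, 0, 1, 1, 0, 0, 1, 1],  # carries no information on the core
}
core = {"core_a": (), "core_b": ("core_a",)}

ranking = bn_plus_one(toy_data, core, ["cand_x", "cand_y"])
for gene, child, gain in ranking:
    print(f"{gene}: best as parent of {child}, score gain {gain:.3f}")
```

On this toy input the informative candidate ranks first with a positive gain, while the uninformative one is penalized for the extra parameters it adds. A fuller treatment would also search over structures in which the candidate is a child of core genes, and would likely use a Bayesian score rather than BIC.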