J Mol Biol 2009 Apr 10;388(1):48-70. Epub 2009 Mar 10.
Department of Biochemistry and Cell Biology, Rice University, Houston, TX 77251, USA.
We report the genome sequence of Bacillus subtilis phage SPO1. The unique genome sequence is 132,562 bp long, and DNA packaged in the virion (the chromosome) has a 13,185-bp terminal redundancy, giving a total of 145,747 bp. We predict 204 protein-coding genes and 5 tRNA genes, and we correlate these findings with the extensive body of investigations of SPO1, including studies of the functions of the 61 previously defined genes and studies of the virion structure. Sixty-nine percent of the encoded proteins show no similarity to any previously known protein. We identify 107 probable transcription promoters; most are members of the promoter classes identified in earlier studies, but we also see a new class that has the same sequence as the host sigma K promoters. We find three genes encoding potential new transcription factors, one of which is a distant homologue of the host sigma factor K. We also identify 75 probable transcription terminator structures. Promoters and terminators are generally located between genes and together with earlier data give what appears to be a rather complete picture of how phage transcription is regulated. There are complete genome sequences available for five additional phages of Gram-positive hosts that are similar to SPO1 in genome size and in composition and organization of genes. Comparative analysis of SPO1 in the context of these other phages yields insights about SPO1 and the other phages that would not be apparent from the analysis of any one phage alone. These include assigning identities as well as probable functions for several specific genes and inferring evolutionary events in the phages' histories. The comparative analysis also allows us to put SPO1 into a phylogenetic context. 
We see a pattern similar to that noted in phage T4 and its relatives: there is minimal successful horizontal exchange among a "core" set of genes, which includes most of the virion structural genes and some genes of DNA metabolism, but extensive horizontal transfer of genes over the remainder of the genome. Genes in rapid evolutionary flux through these genomes tend to be small.
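
The chromosome length quoted in the abstract follows from adding the terminal redundancy to the unique genome sequence. The short sketch below checks that arithmetic; the constant and function names are illustrative assumptions, not anything taken from the paper, and the redundancy fraction is a derived figure that the abstract does not itself report.

```python
# Figures quoted in the abstract (bp = base pairs).
UNIQUE_GENOME_BP = 132_562        # unique genome sequence
TERMINAL_REDUNDANCY_BP = 13_185   # terminal redundancy of the packaged DNA


def packaged_chromosome_length(unique_bp: int, redundancy_bp: int) -> int:
    """Length of the virion DNA: the unique sequence plus the terminal
    redundancy, which is present a second time at the other end."""
    return unique_bp + redundancy_bp


total_bp = packaged_chromosome_length(UNIQUE_GENOME_BP, TERMINAL_REDUNDANCY_BP)

# Derived illustration only (not stated in the abstract): share of the
# packaged chromosome taken up by the repeated terminal segment.
redundancy_fraction = TERMINAL_REDUNDANCY_BP / total_bp

print(total_bp)  # 145747
print(f"{redundancy_fraction:.1%}")
```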