Appl Bioinformatics 2004;3(1):3-8
Department of Biochemistry, Queen's University, Kingston, ON, Canada.
DNA base compositions were determined chemically long before sequencing technologies permitted the direct counting of bases. Some recent observations made using modern sequencing technologies could have been deduced by applying elementary principles to those early chemical observations. This paper draws attention to the fact that the potential for significant stem-loop structure is a general property of single-stranded DNA (genic and non-genic), and hence of any corresponding transcripts, whether they function by virtue of their structure (eg rRNA) or as mRNA. There is also Chargaff's second parity rule: in single strands, the percentage of purines approximately equals the percentage of pyrimidines. Since purines pair with pyrimidines in stems, any excess of purines cannot reside in stems; Szybalski's rule, that transcripts violate the second parity rule in favour of purines, must therefore apply to loops. Since purine loading occurs in both mesophilic and thermophilic species, genes whose transcripts need stable secondary structures to function at high temperatures must achieve this by selectively increasing the GC percentage (GC%) of stems, while retaining purine loading of loops. Arguments that purine loading is specific for the loops of RNA-synonymous strands of genes whose transcripts function by virtue of their secondary structure (ie rRNAs, not mRNAs) need to take into account, as controls, the loop regions of mRNA-synonymous strands. Entire genes, or entire genomes where gene orientation is not considered, are not appropriate controls.
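The quantities discussed above (GC%, purine percentage, and the departure from the second parity rule) can be illustrated by direct base counting on a single strand. The following is only a minimal sketch: the function name `base_composition`, the key names, and the "purine excess per kb" measure, defined here as 1000 × (purines − pyrimidines) / total bases, are assumptions made for illustration and are not taken from the paper. To compare stems with loops, the same function would be applied separately to the stem and loop segments of a predicted fold, which this sketch does not compute.

```python
def base_composition(seq: str) -> dict:
    """Return GC%, purine% and purine excess per kb for one strand.

    Assumption: `seq` is an mRNA-synonymous (or RNA) sequence; U is treated
    as T, and any character other than A, C, G, T is ignored.
    """
    seq = seq.upper().replace("U", "T")
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T/U bases")

    purines = counts["A"] + counts["G"]      # R = A + G
    pyrimidines = counts["C"] + counts["T"]  # Y = C + T

    return {
        "gc_percent": 100.0 * (counts["G"] + counts["C"]) / total,
        "purine_percent": 100.0 * purines / total,
        # Zero when the second parity rule holds exactly; positive
        # values indicate purine loading (illustrative measure).
        "purine_excess_per_kb": 1000.0 * (purines - pyrimidines) / total,
    }


if __name__ == "__main__":
    # Hypothetical 8-base fragment, for demonstration only.
    print(base_composition("AAGGAGCU"))
```

A strand obeying the second parity rule exactly gives a purine percentage of 50 and a purine excess of zero; a purine-loaded loop region would give a positive excess, while paired stem regions taken together would be expected to sit near zero.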