The comparative structure models contain 139 oppositions in 16 S rRNA and 263 oppositions in 23 S rRNA of unpaired nucleotides that are adjacent to the end of a helix. In a hypothetical world where the frequency of each of the four nucleotides is 25% at paired and unpaired positions and there is no bias for any nucleotide pair at these positions, we expect, for each opposition, a 12.5% chance of finding an AA or AG (2/16 = 0.125 = 12.5%; GA is excluded, see above). Thus, for any one rRNA sequence, random sampling predicts approximately 17 sites in 16 S and 33 sites in 23 S rRNA with an AA or AG opposition at the end of a helix (referred to hereafter as AA.AG@helix.ends) [139 × 0.125 ≈ 17; 263 × 0.125 ≈ 33]. The expected number of AA and AG sites that occur at the same positions in our collection of approximately 5850 16 S and 325 23 S rRNA sequences is extremely small. For $n$ sequences, assuming that nucleotide pairs are unbiased and that each sequence is independent of all other sequences, the chance of finding 90% of the sequences with either an AA or an AG opposition at a single site is $1:(0.125)^{0.9n}$ ([footnote 1]).
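The per-site chance and the expected site counts above can be reproduced with a short sketch. It assumes, as the text does, that all 16 ordered nucleotide oppositions are equally likely; the names `TARGETS` and `expected_sites` are illustrative only and do not come from the original analysis.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# All 16 ordered oppositions, assumed equally likely (the unbiased model in the text).
oppositions = ["".join(pair) for pair in product(NUCLEOTIDES, repeat=2)]

# AA and AG count as hits; GA is deliberately excluded, as in the text.
TARGETS = {"AA", "AG"}
p_hit = sum(o in TARGETS for o in oppositions) / len(oppositions)


def expected_sites(n_oppositions, p=p_hit):
    """Expected number of oppositions that are AA or AG in one sequence."""
    return n_oppositions * p


print(p_hit)                # 0.125
print(expected_sites(139))  # 17.375, i.e. about 17 sites in 16 S rRNA
print(expected_sites(263))  # 32.875, i.e. about 33 sites in 23 S rRNA
```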
Therefore, for each site, the probability that 90% of the approximately 5850 sequences in 16 S rRNA will have an AA or AG opposition is $1:5.9 \times 10^{4754}$; the corresponding probability for 23 S rRNA (approximately 325 sequences) is $1:1.4 \times 10^{264}$. Using the values calculated above for the expected number of AA.AG@helix.ends sites in the 16 S and 23 S rRNAs, there are $1.7 \times 10^{11}$ possible sets of AA.AG@helix.ends in 16 S rRNA (where each set contains an AA or AG at 17 of the 139 sites) and $1.0 \times 10^{42}$ possible sets of AA.AG@helix.ends in 23 S rRNA (AA or AG at 33 of 263 sites). From these two sets of calculations, we conclude that the odds of finding the same pattern in 90% of the sequence sets by random chance are extremely low. In fact, we observe that 30% of the oppositions at the ends of 16 S rRNA helices (42 of 139) and 28% of the oppositions at the ends of 23 S rRNA helices (73 of 263) have an AA or AG opposition in at least 90% of the sequences.
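The per-site odds quoted above are far too small to hold in a floating-point number, so one way to check them is to work in base-10 logarithms. The following is a minimal sketch under the same independence and no-bias assumptions; `log10_odds` and `as_scientific` are hypothetical helper names, not part of the original analysis. The last two lines recompute the observed fractions from the counts given in the text.

```python
import math


def log10_odds(n_sequences, fraction=0.9, p_hit=0.125):
    """log10 of X in '1:X' odds, where X = 1 / p_hit ** (fraction * n_sequences)."""
    return -fraction * n_sequences * math.log10(p_hit)


def as_scientific(log10_value):
    """Split a base-10 logarithm into (mantissa, exponent)."""
    exponent = math.floor(log10_value)
    mantissa = 10 ** (log10_value - exponent)
    return mantissa, exponent


for label, n in (("16 S", 5850), ("23 S", 325)):
    mantissa, exponent = as_scientific(log10_odds(n))
    print(f"{label}: 1:{mantissa:.1f}e{exponent}")
# 16 S: 1:5.9e4754
# 23 S: 1:1.4e264

# Observed fractions of helix-end oppositions that are AA or AG in >= 90% of sequences.
print(f"{42 / 139:.0%}")  # 30%
print(f"{73 / 263:.0%}")  # 28%
```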
1 When flipping a non-biased coin, the odds of it landing "heads" are 1:2 (or 1/2). For each additional flip, we multiply by 1/2 again, so that the odds of flipping "heads" five consecutive times are 1:32 ($(1/2)^5$). We can equate the chances of finding an AA or AG opposition at an individual site to flipping a (hypothetical) 16-sided coin. There is a 2/16 chance (assuming no bias for nucleotide pairs at a site) of having either AA or AG present. For each additional sequence, again, we must multiply by 2/16. So, for $n$ sequences, the probability of finding an AA or AG opposition at the same site in all sequences is $1:(2/16)^n$ (or $1:(0.125)^n$).
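The coin analogy in this footnote can be checked exactly with rational arithmetic. This is a small illustrative sketch; `chance_all_same` is a hypothetical name, and it assumes independent trials with a fixed per-trial chance, as the footnote does.

```python
from fractions import Fraction


def chance_all_same(p_single, n_trials):
    """Chance that n independent trials all give the outcome with chance p_single."""
    return p_single ** n_trials


# Five consecutive "heads" with a fair coin.
print(chance_all_same(Fraction(1, 2), 5))   # 1/32

# AA or AG at the same site in three sequences (the "16-sided coin" with 2 winning faces).
print(chance_all_same(Fraction(2, 16), 3))  # 1/512
```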
