9 − 100% similarity), closely followed by flaA (84.4 − 100%). The 16S rRNA gene had by far the lowest levels of inter-strain sequence variation (99.3 − 100% similarity). This indicated that the pyrH and rrsA/B gene sequences had the best and worst strain-differentiating abilities, respectively. The levels of nucleotide diversity per site
(Pi) within each of the eight genes are shown in Table 4. In the protein-encoding genes, Pi values ranged from ca. 0.033 (pyrH, recA) to 0.026 (dnaN).

Figure 2 Taxonomic resolution based on the ranges of intraspecific sequence similarity (%) for the individual 16S rRNA, flaA, recA, pyrH, ppnK, dnaN, era and radC genes, within the 20 Treponema denticola strains analyzed. The y-axis indicates the levels of nucleotide identity (%) shared between the eight individual gene sequences analyzed from each strain, with the range represented as a bar.

Detection of recombination using concatenated multi-gene sequence data

Failing to account for homologous DNA recombination (i.e. horizontal genetic exchange) can lead to erroneous phylogenetic reconstruction and can also elevate the false-positive error rate in positive selection inference. Therefore, we checked for evidence of recombination within each of the eight individual genetic loci in all 20 strains, by identifying possible DNA ‘breakpoints’
using the HYPHY 2.0 software suite [41]. No evidence of genetic recombination was found within any gene sequences in any strain. This indicated that all the sites in the respective gene sequences shared a common evolutionary history.

Analysis of selection pressure at each genetic locus

Selection pressure was analyzed by determining the ratios of non-synonymous
to synonymous mutations (ω = dN/dS) for each codon site within each of the seven protein-encoding genes, in each of the 20 strains. When ω < 1, the codon is under negative selection pressure, i.e. purifying or stabilizing selection, to conserve the amino acid composition of the encoded protein. Table 4 summarizes the global rate ratios (ω = dN/dS) with 95% confidence intervals, as well as the numbers of negatively selected codon sites for each of the genes investigated. The global ratios show that all seven genes were subject to strong purifying selection (ω < 0.106), indicating that there was strong selective pressure to conserve the function of the encoded proteins. No positively selected sites were found in any of the 140 gene sequences.

Phylogenetic analyses of T. denticola strains using concatenated multi-gene sequence data

The DNA sequences of the seven protein-encoding genes were concatenated in the order: flaA − recA − pyrH − ppnK − dnaN − era − radC, for analysis using BA and ML approaches. The combined data matrix contained 6,513 nucleotides for each strain.
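The concatenation step can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that each locus has already been aligned, so that every strain has a sequence of equal length at that locus. The function name `concatenate_loci`, the nested-dictionary input layout and the toy sequences are hypothetical and serve only as an illustration; only the gene order is taken from the text.

```python
# Gene order as stated in the text; everything else here is illustrative.
GENE_ORDER = ["flaA", "recA", "pyrH", "ppnK", "dnaN", "era", "radC"]


def concatenate_loci(alignments, gene_order=GENE_ORDER):
    """Join per-gene aligned sequences into one sequence per strain.

    alignments: dict mapping gene name -> dict mapping strain name -> aligned
    sequence (hypothetical layout). Returns a dict mapping strain name ->
    concatenated sequence, with genes joined in `gene_order`.
    """
    strains = sorted(alignments[gene_order[0]])
    for gene in gene_order:
        seqs = alignments[gene]
        # Every locus must contain exactly the same set of strains.
        if set(seqs) != set(strains):
            raise ValueError(f"strain set differs at locus {gene}")
        # Within an aligned locus, all sequences must have equal length.
        if len({len(s) for s in seqs.values()}) != 1:
            raise ValueError(f"unequal sequence lengths at locus {gene}")
    return {
        strain: "".join(alignments[gene][strain] for gene in gene_order)
        for strain in strains
    }


if __name__ == "__main__":
    # Toy placeholder sequences for two made-up strains, not real data.
    toy = {gene: {"strain1": "ATG", "strain2": "ATA"} for gene in GENE_ORDER}
    matrix = concatenate_loci(toy)
    for strain, seq in matrix.items():
        print(strain, len(seq))
```

Applied to real alignments of the seven loci, each concatenated sequence would be expected to have the same length for every strain (6,513 nucleotides in the data matrix described above). Checking this length is a simple safeguard against a missing or misaligned locus.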