
Results and Discussion
- Co-evolution of genomes and plasmids within Chlamydia trachomatis and the emergence in Sweden of a new variant strain

Genome sequences of serotype A and B isolates of C. trachomatis

The genomes of two serotype B ocular isolates of C. trachomatis were sequenced. These provide the first high-quality genome sequences of ocular isolates: B/TZ1A828/OT and B/Jali20/OT (referred to as CTB and Jali20, respectively, for ease of nomenclature), summarised in Table 1. Whole genome comparisons showed that the genomes are highly syntenic, and that there are no whole gene differences between the strains. The pairwise mean nucleotide identity between orthologous genes is 99.8%. A comparison of the genomes against that of C. trachomatis serotype A strain A/HAR-13 (referred to as A/HAR-13) sequenced by microarray [21] showed no insertions or deletions of predicted coding sequences (CDSs).
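The mean pairwise identity quoted above can be illustrated with a minimal sketch. The text does not state how the 99.8% figure was computed, so this assumes that each orthologous gene pair is already aligned, that gapped columns are skipped, and that the per-gene identities are averaged with equal weight. The gene names and sequences are toy data, not taken from the genomes.

```python
def nucleotide_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of ungapped aligned columns at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


def mean_identity(ortholog_pairs: dict) -> float:
    """Unweighted mean of per-gene identities (an assumed averaging scheme)."""
    identities = [nucleotide_identity(a, b) for a, b in ortholog_pairs.values()]
    return sum(identities) / len(identities)


# Toy aligned ortholog pairs (hypothetical, for illustration only).
toy_pairs = {
    "gene_1": ("ATGGCTAAAT", "ATGGCTAAAT"),      # identical
    "gene_2": ("ATGCCCGGGTAA", "ATGCCCGGATAA"),  # one mismatch in 12 columns
    "gene_3": ("ATG-AATTT", "ATGCAATTT"),        # one gapped column, skipped
}

toy_mean = mean_identity(toy_pairs)
print(f"mean identity: {toy_mean:.4f}")
```

A length-weighted mean (total matches over total compared columns) is an equally plausible definition and would give a slightly different value on real data.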

Previous microarray and sequencing studies have shown that almost all sequence and gene variation distinguishing serotypes of C. trachomatis from each other is restricted to a region at the terminus of replication known as the Plasticity Zone (PZ) [22-24]. It was demonstrated by microarray and PCR analysis that a strain belonging to serotype B lacked most of the genes found within this region [22,23]. However, analysis of the serotype B isolates sequenced for this study showed that they possess many of the CDSs found within the PZ (see additional file 1). These include cytotoxin gene fragments, remnants of a much larger cytotoxin gene which is thought to have been similar to those still found intact in C. muridarum [25]. In a strain of C. trachomatis serotype D, a remnant of the cytotoxin gene is expressed, and is cytotoxic to HeLa cells [26]. Consistent with previous analysis of the PZ, the phospholipase D genes display high levels of disruption in both serotype B strains [24]. The PZ also encompasses the trpRBA operon, which has been found to be functional only in genital isolates, and to carry mutations in ocular strains [27]. This operon is non-functional in both of the strains presented here, with either trpB (B/TZ1A828/OT) or trpA (B/Jali20/OT) being disrupted (see additional file 1).

Additional file 1. Comparison of the Plasticity Zone between several strains, visualised by the Artemis Comparison Tool. The grey lines indicate forward and reverse reading frames of sequenced genomes, with predicted coding sequences superimposed. The red bars indicate regions of 97–100% nucleotide identity. Brown CDSs denote pseudogenes. The cytotoxin locus is reduced in D/UW-3/CX, yet produces an active cytotoxin. It is further deleted in strain L2/434/BU. The phospholipase D locus contains pseudogenes in all strains. The trp operon is complete in strains D/UW-3/CX and L2/434/BU, but has pseudogene components in the serotype A and B strains: trpB in B/TZ1A828/OT and A/HAR-13, and trpA in Jali20 and A/HAR-13.

Format: PPT Size: 85KB Download file


Despite there being no whole gene differences between the strains, analysis of the genomes identified several pseudogene differences when they were compared with each other and with the genomes of other C. trachomatis serotypes and biovariants, namely A/HAR-13 and the serotype L2 strain L2/434/BU (see additional file 2). Eleven of the disrupted CDSs are common to both serotype B strains, whereas only one of these is also common to A/HAR-13 and L2/434/BU (CTB_3211/JALI_3211/CTA_0350/CTL0578). The equivalent putative membrane proteins CTB_0571/JALI_0571 contain different inactivating mutations: a single base deletion causing a frameshift after codon 333 (CTB), or a single nucleotide polymorphism (SNP) prematurely truncating the protein after codon 177 (Jali20). Another example of a CDS with distinct inactivating mutations is the putative exported protein JALI_1341/CTA_0142. This CDS carries a frameshift mutation in Jali20, and is truncated in A/HAR-13 by a premature stop codon that removes 60 amino acids from the C terminus.

Additional file 2. Pseudogene differences between strains B/TZ1A828/OT, B/Jali20, A/HAR-13 and L2/434/BU. Pseudogenes (Ψ) are highlighted in brown, and the CDSs contained within the plasticity zone are shown in mauve.

Format: XLS Size: 22KB Download file


Inclusion proteins are an important family of chlamydial proteins, associated with virulence, which target the host inclusion membrane. Consequently, the inactivation of CDSs CTB_2231 and JALI_2231 encoding candidate inclusion membrane proteins may have implications with regard to how the cell interacts with the host. This CDS has been disrupted by an identical SNP creating a stop codon after codon 113 in both strains, with JALI_2231 having undergone a further single base insertion and deletion at other sites to create two additional frameshifts. Another notable difference is the variation in the sequence of the secF gene, which is present as a full length gene in the serotype B strains, but is found as two separate CDSs secD/secF in A/HAR-13 [21]. A further functional loss is the operon comprising pyruvoyl-dependent arginine decarboxylase and arginine/ornithine antiporter, involved in pH homeostasis. The antiporter (CTB_3721/JALI_3721/CTA_0406-7) is disrupted in the serotype A and B strains, the decarboxylase (CTL0627) is disrupted in L2/434/BU, whereas the operon is intact in the serotype D strain D/UW-3/CX [24].

Set against a high level of sequence identity (generally in excess of 99% identity at the nucleotide level), some predicted CDSs display higher levels of variation (see additional file 3). As has been noted before, these include ompA, which is used to distinguish between serotypes [3], tarp, encoding the translocated actin recruiting phosphoprotein [24], and hctB, encoding histone-like HC2 [28].

Additional file 3. Variable CDSs, comparing strains B/TZ1A828/OT, B/Jali20 and L2/434/BU. CDSs with significant variability between the strains are listed, with brief description of variation.

Format: XLS Size: 19KB Download file


Phylogenetic analysis of C. trachomatis genomes

The first genome-scale, SNP-based phylogenetic analysis of all six available C. trachomatis genomes was carried out, covering serotypes A, B, D, L2 and L2b (Table 2). Comparative genome analysis identified 11,500 SNPs, of which the large majority define splits between the three major groups (Figure 1). Monophyly of the LGV strains is supported by 6,200 SNPs, 1,477 SNPs unite the three ocular strains, and 1,377 are unique to the genital serotype D strain. These splits are also strongly supported by the phylogenetic analysis of the SNPs, which used a general time reversible model of evolution and four discrete gamma-distributed rate categories to account for among-site rate variation. Pairwise comparisons of the genomes by SNP number also confirm this clustering (Table 3). Within the ocular strains, the two serotype B isolates cluster together, suggesting that serotypes are identifiable on the basis of SNP phylogenies. These data are derived from the genomes of six isolates (five serotypes). Although the numbers are small, they reflect the same patterns of genome evolution observed using fragments of the genome [5], and our data from complete genomes are strong enough to support these associations. Further studies would nevertheless be beneficial to confirm these findings once technology allows rapid and easy purification of genomes from these obligate intracellular pathogens.
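A pairwise SNP comparison of the kind summarised in Table 3 can be sketched as follows. This is a toy illustration, not the pipeline used in the study: it assumes the sequences are already aligned to equal length and counts only columns where both sequences carry an unambiguous base. The strain labels and sequences are hypothetical.

```python
from itertools import combinations


def pairwise_snp_counts(alignment: dict) -> dict:
    """Count differing unambiguous columns for every pair of aligned sequences."""
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    counts = {}
    for name_a, name_b in combinations(alignment, 2):
        differences = 0
        for x, y in zip(alignment[name_a], alignment[name_b]):
            # Assumption: gaps and ambiguity codes are not counted as SNPs.
            if x in "ACGT" and y in "ACGT" and x != y:
                differences += 1
        counts[(name_a, name_b)] = differences
    return counts


# Hypothetical toy alignment; labels only mimic the three groups in the text.
toy_alignment = {
    "ocular_1": "ACGTACGTAC",
    "ocular_2": "ACGTACGTAT",
    "genital":  "ACGAACGTTC",
    "lgv":      "TCGAAGGCTC",
}

snp_counts = pairwise_snp_counts(toy_alignment)
for pair, n in snp_counts.items():
    print(pair, n)
```

In this toy data the two ocular sequences differ at the fewest positions, which is the kind of pattern a pairwise SNP table makes visible before any tree is built.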

Phylogenetic analysis of C. trachomatis plasmids

The plasmid sequences from the strains CTB and Jali20 were assembled from the genome shotgun. To investigate the new variant strain which evaded diagnosis, isolates from epithelial sexually transmitted infections (STIs) from the city of Malmo in Sweden were included. Genital tract isolates representing the new variant C. trachomatis (strain Sweden2, serotype E) and three concurrently isolated strains (Sweden3, serotype E; Sweden4 and Sweden5, both serotype F) were selected. To represent plasmids from other chlamydial serotypes, plasmid sequences covering further trachoma and LGV strains were obtained from GenBank (Table 2).

Alignment of these 11 plasmid sequences showed that there are 83 SNP locations, representing approximately 1.1% variation. Six of these occur in intergenic locations. The SNPs and their effects on coding sequences are shown in Figure 2. Each plasmid is unique and identifiable by the presence of at least one SNP and/or indel (Figure 1B–D). Only two SNPs, at positions 5,328 and 7,458 (using pSW3 as the reference sequence), allow differentiation of the chlamydial plasmids into LGV, trachoma and genital tract groupings. Most of the SNPs are located within CDSs and there is no significant clustering of SNPs within the plasmid. Only one non-synonymous mutation occurs within CDS2, which may suggest that this gene is under strong selection. Analysis of the informative SNPs and indels allowed phylogenetic reconstruction of the relationships between the plasmids (Figure 1). The resulting phylogenetic tree shows that the chlamydial plasmids segregate into tight groupings reflecting the phenotypes of their host bacteria. The LGV plasmids are the most distantly related, with the STI and ocular strain plasmids having apparently diverged at a later time.

Comparative phylogenetics of complete genomes and plasmids

A comparison of the phylogenies of the genomes and plasmids is given in Figure 1. For the serotype A, B, L2 and L2b strains, for which both complete genome and plasmid sequences exist from the same strain, and for serotype D, for which there are independent complete genome and plasmid sequences, the two phylogenies mirror each other. The complete genomes and plasmids from the LGV strains cluster tightly, as do those from the ocular strains, and the STI strains branch from these and also cluster together.

These data suggest that the chlamydial plasmids have not been freely exchanged, but have remained closely linked to their cognate host chromosome (Figure 1).

Analysis of the new variant plasmid

The plasmid with the most variation (pSW2) was found in strain Sweden2. However, pSW2 still belongs to the genital tract lineage and therefore has not appeared as a result of a transfer event. The complete nucleotide sequence of pSW2 comprises 7,169 bp, 333 bp smaller than the 7,502 bp plasmid (pSW3) from strain Sweden3. Sweden3 could be hypothesised to be the potential progenitor strain of Sweden2 because the two strains have identical sequences of the chromosomal ompA gene. The difference in size between pSW2 and pSW3 is accounted for by a deletion of 377 bp and a duplication of 44 bp at a different locus (Figure 3). pSW2 is the smallest chlamydial plasmid described, some 200 bp smaller than the previously smallest known chlamydial plasmid, pCpnE1 (7,369 bp), which is from an equine strain of C. pneumoniae [18,29].

The 377 bp deletion within pSW2 is situated within CDS1. It creates a frameshift which shortens the predicted protein from 305 to 178 amino acids, and it removes a primer binding site for the diagnostic nucleic acid amplification tests (NAATs). This region of the plasmid was originally selected as the target for several commercial diagnostic NAATs, and the new variant strain became established because infection by this strain went undetected and hence untreated [9]. Plasmid pL2 from strain L2/434/BU contains a single nucleotide deletion within CDS1 (position 910) [15,30] leading to a truncated CDS1 protein (260 amino acids). pCpnE1 also has a deletion within CDS1, but the location is different to that within pSW2 and the effect is to create two small putative CDSs, which are unlikely to be functional.
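The size arithmetic and the frameshift reported for pSW2 can be checked with a few lines of code. The lengths below are the figures quoted in the text; the helper names are illustrative only, and the frameshift test is simply the standard rule that an indel whose length is not a multiple of three shifts the reading frame.

```python
# Lengths in base pairs, as quoted in the text.
PSW3_LENGTH_BP = 7502     # plasmid from strain Sweden3
PSW2_LENGTH_BP = 7169     # new variant plasmid from strain Sweden2
PCPNE1_LENGTH_BP = 7369   # plasmid from an equine C. pneumoniae strain
DELETION_BP = 377         # deletion within CDS1
DUPLICATION_BP = 44       # tandem duplication upstream of CDS2 and CDS3


def expected_length(reference_bp: int, deleted_bp: int, inserted_bp: int) -> int:
    """Length of a derived sequence after one deletion and one insertion."""
    return reference_bp - deleted_bp + inserted_bp


def shifts_reading_frame(indel_bp: int) -> bool:
    """An indel inside a CDS shifts the frame unless its length is a multiple of 3."""
    return indel_bp % 3 != 0


predicted_psw2_length = expected_length(PSW3_LENGTH_BP, DELETION_BP, DUPLICATION_BP)
net_size_difference = PSW3_LENGTH_BP - PSW2_LENGTH_BP
deletion_causes_frameshift = shifts_reading_frame(DELETION_BP)

print(predicted_psw2_length)                # 7169
print(net_size_difference)                  # 333
print(PCPNE1_LENGTH_BP - PSW2_LENGTH_BP)    # 200
print(deletion_causes_frameshift)           # True
```

The 44 bp duplication lies upstream of CDS2 and CDS3 rather than inside a coding sequence, so the frame test is applied only to the 377 bp deletion.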

The observation of CDS1 as a region of the plasmid apparently prone to inactivation and therefore potentially dispensable may be explained by the possible functional redundancy between the proteins encoded by CDS1 and CDS2. These proteins have similar sizes (305 and 332 amino acids respectively), share 35% amino acid sequence identity, and both match to the Pfam domain PF00589 (CDS1 e-value of 0.003, CDS2 e-value of 4.9e-39), suggesting some functional equivalence [29].

A second difference between pSW2 and the other C. trachomatis plasmids is a 44 bp perfect tandem duplication, located immediately upstream of both CDS2 and CDS3, which are divergently transcribed. The transcription start points (tsp) for CDS2 and CDS3 (encoding a homologue of DnaB, a protein involved in forming the replication complex) have been mapped previously [31] and are both located within this 44 bp section. The duplication of the tsp could potentially boost CDS2 expression. Both the deletion in CDS1 and the 44 bp duplication are unique to pSW2, and no intermediate plasmid carrying either of the mutations separately has been identified. This could indicate that these changes are related events, and that potential up-regulation of CDS2 may compensate for the loss of a functional product from CDS1.

Candidate plasmid regions for improved diagnostic targets

When all C. trachomatis plasmid CDSs from the eleven complete nucleotide sequences were compared, CDS2 was found to be the most highly conserved. Although there are eleven SNPs within the coding sequence, only one results in an amino acid change (Figure 2), suggestive of a functional requirement. This SNP (Met-Leu, position 1,147), present in pSW2 and pSW3, is at the extreme carboxy terminus of the protein. A further constraint on variation within CDS2 is the presence of two short RNA molecules (225 and 415 nucleotides), which are complementary to the 3' terminus of the primary transcript encoding CDS2 [32]. These two short 'antisense' transcripts are differentially expressed during the developmental cycle. This level of sequence conservation, possibly tied to an essential function, suggests that the region of the plasmid encompassing CDS2 would be a good target for future screening.
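The distinction drawn above between SNPs that change an amino acid and those that do not can be sketched in code. This is a minimal illustration under stated assumptions: both sequences are in-frame, uppercase coding sequences of equal length that differ only by substitutions, and the standard codon assignments apply. The toy sequences are hypothetical and are not CDS2.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snps(ref_cds: str, alt_cds: str) -> list:
    """Return (1-based position, class) for every substitution between two CDSs.

    If two substitutions fall in the same codon, their effect is assessed
    jointly, because the whole alternative codon is translated.
    """
    if len(ref_cds) != len(alt_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("expected two in-frame sequences of equal length")
    results = []
    for index, (ref_base, alt_base) in enumerate(zip(ref_cds, alt_cds)):
        if ref_base == alt_base:
            continue
        start = index - index % 3  # first base of the codon containing the SNP
        ref_aa = CODON_TABLE[ref_cds[start:start + 3]]
        alt_aa = CODON_TABLE[alt_cds[start:start + 3]]
        kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
        results.append((index + 1, kind))
    return results


# Toy example: GCT->GCC keeps Ala, ATG->TTG changes Met to Leu.
toy_ref = "ATGGCTATGAAATAA"
toy_alt = "ATGGCCTTGAAATAA"
toy_result = classify_snps(toy_ref, toy_alt)
print(toy_result)  # [(6, 'synonymous'), (7, 'non-synonymous')]
```

Applied to a real alignment, a tally of the two classes per CDS would give the kind of contrast described in the text, where most substitutions in a conserved gene are silent.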

CDS6, CDS7 and CDS8 also show high levels of amino acid conservation. CDS6 (unknown function) is the smallest plasmid encoded protein and contains a single SNP. The proteins encoded by CDS7 and CDS8 display homology to proteins involved in the process of plasmid partitioning [29], and have been shown to be active at cell division. These proteins may play an important role for these relatively low copy number plasmids, ensuring that each daughter cell acquires an equal number of plasmid copies.

The protein predicted to be encoded by CDS5, previously designated ORF 5 (pgp3), has the largest number of non-synonymous SNPs. There are 14 SNPs, evenly spread throughout the CDS, resulting in ten amino acid changes (Figure 2). SNP 5,112 differentiates LGV plasmids from the trachoma plasmids and SNP 5,114 is unique to the blinding trachoma isolates (using pSW3 as the reference sequence). The protein encoded by CDS5 (pgp3) has been located to the cell surface, and it has recently been suggested that the CDS5 product can be secreted from inside the inclusion to the cytoplasm of Chlamydia-infected cells [33]. Thus the higher number of non-synonymous changes in this CDS could result from immune selection giving rise to more variation.

The area of the plasmid from the stop codon of CDS8 to the start codon of CDS1 has the highest density of intergenic SNPs, as well as apparent deletions, making the region the most susceptible to mutation within the C. trachomatis plasmids. Thus the area around the replication origin is the most variable and is a poor region in which to design diagnostic PCR primers.

Analysis of plasmid copy number

The sequencing of pSW5 revealed that this plasmid carries one 22 bp repeat fewer than the others at the putative origin of replication. To test whether this affects plasmid copy number, DNA from several strains was subjected to quantitative PCR. The results showed that, where loss of the repeat sequence had occurred, plasmid copy number was not adversely affected, with plasmid/genome (P/G) ratios in the range of 2–6 (Figure 4). Interestingly, it appears that genital strains have a slightly lower plasmid copy number than the others, and strain Jali20 has the highest P/G ratio.
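A plasmid/genome ratio of this kind can be derived from quantitative PCR data in several ways, and the text does not say which was used. The sketch below assumes absolute quantification, with one plasmid assay and one chromosomal assay each calibrated by a standard curve of the form Ct = slope * log10(copies) + intercept. All Ct values and curve parameters are hypothetical.

```python
def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert a standard curve Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def plasmid_genome_ratio(plasmid_ct: float, genome_ct: float,
                         plasmid_curve: tuple, genome_curve: tuple) -> float:
    """Plasmid copies per genome copy measured in the same DNA sample."""
    plasmid_copies = copies_from_ct(plasmid_ct, *plasmid_curve)
    genome_copies = copies_from_ct(genome_ct, *genome_curve)
    return plasmid_copies / genome_copies


# Hypothetical standard curves as (slope, intercept); a slope near -3.32
# corresponds to roughly 100% amplification efficiency.
toy_plasmid_curve = (-3.32, 38.0)
toy_genome_curve = (-3.32, 38.0)

# Hypothetical Ct values: the plasmid target crosses threshold two cycles early.
toy_ratio = plasmid_genome_ratio(20.0, 22.0, toy_plasmid_curve, toy_genome_curve)
print(f"P/G ratio: {toy_ratio:.2f}")
```

With identical curves for the two assays, a two-cycle difference corresponds to a ratio of roughly 4, which falls inside the 2–6 range reported above. In practice the two assays would have their own measured slopes and intercepts.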


Updated on: 30 Jun 2009