Sequence-Based Classification Scheme for the Genus Legionella Targeting the mip Gene |
| |
Authors: | Rodney M. Ratcliff, Janice A. Lanser, Paul A. Manning, Michael W. Heuzenroeder |
| |
Affiliation: | Infectious Diseases Laboratories, Institute of Medical and Veterinary Science, Adelaide, South Australia 5000 (1), and Microbial Pathogenesis Unit, Department of Microbiology and Immunology, University of Adelaide, Adelaide, South Australia 5005 (2), Australia |
| |
Abstract: | The identification and speciation of strains of Legionella is often difficult, and even the more successful chromatographic classification techniques have struggled to discriminate newly described species. A sequence-based genotypic classification scheme is reported, targeting approximately 700 nucleotide bases of the mip gene and utilizing gene amplification and direct amplicon sequencing. With the exception of Legionella geestiana, for which an amplicon was not produced, the scheme clearly and unambiguously discriminated among the remaining 39 Legionella species and correctly grouped 26 additional serogroup and reference strains within those species. Additionally, the genotypic classification of approximately 150 wild strains from several continents was consistent with their phenotypic classification, with the exception of a few strains where serological cross-reactivity was complex, potentially confusing the latter classification. Strains thought to represent currently uncharacterized species were also found to be genotypically unique. The scheme is technically simple for a laboratory with even basic molecular capabilities and equipment, if access to a sequencing laboratory is available.
The genus Legionella comprises approximately 40 species, at least 7 of which have more than one serotype (3, 15, 31). Approximately half of the species have been associated with human disease (28). Legionella-like organisms isolated from clinical specimens, or from the environment during the course of an outbreak, need to be identified to elucidate the disease process and to identify the source. Legionellae have proved to be relatively unreactive when traditional biochemical tests are utilized, necessitating more complex identification methods (6, 7, 26, 41). Serologically based methods are widely used in clinical laboratories, but antigen cross-reactivity limits specificity and restricts their confident use to a few frequently isolated species (38). This is especially true for countries where legionellosis caused by species other than L. pneumophila is common (12). More complex classification schemes have been proposed (26, 38), the most successful being one based on the range and proportion of cellular fatty acids and ubiquinones (21, 22, 40, 43). As additional species have been characterized, this method has become less discriminating, since apparently unique patterns were proved to be shared by several species (43). The inclusion of hydroxylated fatty acids has improved discrimination, but it requires the analysis of both mono- and dihydroxylated fatty acids, and individual patterns are complex, making analysis difficult (21).
Gene sequence-based phylogenic (genotypic) schemes have become widely used for organisms which are difficult to classify, as more sequences have been determined and sequencing methods have become simpler, more widely available, and cost effective (11, 23, 24, 29, 32, 34). Genotypic schemes have the great advantage of being unaffected by colony age and growth conditions and, in contrast to chromatographic methods, are not subject to extraction and chromatographic conditions or constituent equipment. Additionally, because a gene sequence is essentially a long digital string, with each digit being one of only four nucleotides, genotypic schemes are less ambiguous and can utilize significantly more discriminatory data than phenotypic ones, and in a form that lends itself to widely available computer analysis software. Many genotypic schemes utilize variation in the 16S rRNA sequence (11, 23, 24, 32, 34), because of the ease with which regions can be amplified and sequenced with universal primers. The 16S rRNA sequences of Legionella species have been reported (18), as have the sequences of the mip gene (2, 12, 13, 31), which codes for an immunophilin of the FK506 binding protein (FKBP) class (14). This protein, which ranges in size from 232 to 251 amino acids, depending on the Legionella species (31), is an outer membrane protein important in the intracellular cycle of Legionella. While it is known to be involved with the survival of the bacterium immediately after uptake into phagocytic cells (9, 12, 28), its exact role is unclear. Additionally, analogs are found widely in both prokaryotes and eukaryotes and are likely to have a significant cellular role (14). Other gene sequences have been determined for Legionella (5, 17, 36), but only the rRNA sequences and the mip gene have been comprehensively determined for most species, an essential prerequisite for any gene to be the basis of a genotyping scheme. Ratcliff et al. (31) recently phylogenetically compared most Legionella species, using the species variation among both the 16S rRNA and mip genes, and found over twice the variation in the mip gene at the DNA level (56% of base sites) as in 16S rRNA (23% of base sites). A pairwise comparison of species reveals a mip gene variation of 3 to 31% (mean, 20%) between species pairs compared with 1 to 10% (mean, 6%) for 16S rRNA. For the mip gene, interspecies nucleotide variation occurred throughout the gene but especially within a hypervariable insert of up to 51 bases immediately adjacent to the region coding for the signal sequence, at redundant third codon sites, and in sequences coding for either single or small regions of variable amino acids interspersed among regions coding for total or near-total amino acid homology, especially toward the 3′ end of the gene (31). These last regions are known to encode the active portions of the protein’s enzymatic peptidylprolyl cis-trans-isomerase (PPIase) activity (31).
Additionally, the mip gene appeared to be relatively stable genetically, with no evidence of homologous recombination, in that identical or near-identical sequences were not found for the mip genes from phenotypically divergent species. With respect to genetic stability, the mip gene may therefore behave like housekeeping genes, which are known to be more stable than other gene classes (1). Homologous recombination would severely compromise a sequence-based classification scheme (1), and it is a theoretical possibility at least for rRNA targets (37). Thus, the genetic stability and greater mutational variation of the mip gene suggest that it is an ideal target for a classification scheme, with results likely to be more discriminating in identifying species and more resilient to clonal variation within each species. It may even be possible to discriminate between serogroups where these are present or to demonstrate distinct intraspecies clonal groups.
The present study reports the use of the mip gene to develop a sequence-based classification scheme for Legionella, the first proposed for this genus. Further, it reports the comparison of sequences from species which have additional serogroups, to determine if serogroups can be discriminated. Similarly, it reports the comparison of sequences from wild strains isolated on several continents, for which there is confirmatory phenotypic or DNA hybridization identification data, to test the robustness of the scheme for variation within strains of the same species. Lastly, isolates which appear phenotypically or from DNA hybridization studies to be different from currently characterized species were tested to determine if a sequence-based classification scheme can clarify their identities. Some of these unusual isolates have been previously reported (43). |
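The pairwise variation figures quoted above (for example, 3 to 31% between species pairs for mip) are percentages of nucleotide sites that differ between two aligned sequences. The sketch below illustrates that kind of comparison under stated assumptions. It is not the authors' analysis pipeline: it assumes the sequences are already aligned to equal length, the function names `percent_difference` and `closest_reference` are hypothetical, and the 20-base strings are invented toy data, not real mip sequences.

```python
from typing import Dict, Tuple

NUCLEOTIDES = set("ACGT")


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: gaps and ambiguous bases are skipped, so only
        # unambiguous A/C/G/T sites contribute to the percentage.
        if a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * mismatches / compared


def closest_reference(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return (name, percent difference) of the reference nearest to the query."""
    return min(
        ((name, percent_difference(query, ref)) for name, ref in references.items()),
        key=lambda pair: pair[1],
    )


# Invented toy "alignment" for illustration only; NOT real mip sequences.
toy_references = {
    "species_A": "ATGAAGATGAAATTGGTGAC",
    "species_B": "ATGAAACTGAAGCTGGCGAC",
}
toy_query = "ATGAAGATGAAATTGGTGAT"

if __name__ == "__main__":
    name, pct = closest_reference(toy_query, toy_references)
    print(f"closest reference: {name} ({pct:.1f}% different)")
```

A nearest-reference lookup like this only suggests an identity. Whether a given percentage difference falls within a species or between species would have to be judged against the variation actually observed among reference strains.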
| |
Keywords: | |
|
|