IJSEM
Int J Syst Evol Microbiol 53 (2003), 289-293; DOI  10.1099/ijs.0.02441-0
© 2003 International Union of Microbiological Societies

Chimeric 16S rDNA sequences of diverse origin are accumulating in the public databases

Philip Hugenholtz† and Thomas Huber

ComBinE group, Advanced Computational Modelling Centre, The University of Queensland, Brisbane 4072, Australia

Correspondence
Philip Hugenholtz
philiph@nature.berkeley.edu


   ABSTRACT
 
A significant number of chimeric 16S rDNA sequences of diverse origin were identified in the public databases by partial treeing analysis. This suggests that chimeric sequences, representing phylogenetically novel non-existent organisms, are routinely being overlooked in molecular phylogenetic surveys despite a general awareness of PCR-generated artefacts amongst researchers.


Published online ahead of print on 26 July 2002 as DOI 10.1099/ijs.0.02441-0.

†Present address: Environmental Science, Policy and Management, Division of Ecosystem Sciences, 151 Hilgard Hall, University of California Berkeley, Berkeley, CA 94720-3110, USA.


   INTRODUCTION
 
Culture-independent studies based on obtaining 16S rRNA genes directly from the environment by broad-specificity primer PCR and cloning have greatly improved our understanding of microbial diversity (Hugenholtz et al., 1998; Pace, 1997). However, such PCR-based surveys have a number of recognized limitations (Hugenholtz & Goebel, 2001; von Wintzingerode et al., 1997), perhaps the most insidious of which is the formation of recombinant or chimeric sequences during PCR amplification. Chimera formation is thought to occur when a prematurely terminated amplicon reanneals to a foreign DNA strand and is copied to completion in the following PCR cycles. This results in a sequence composed of two or more phylogenetically distinct parent sequences and, when comparatively analysed with other 16S rDNA sequences, suggests the presence of a non-existent organism. This problem was recognized early on in the application of PCR-clone library studies (Kopczynski et al., 1994; Liesack et al., 1991) and significant efforts have been made both to quantify (and hopefully reduce) chimera formation (Qiu et al., 2001; Speksnijder et al., 2001; Wang & Wang, 1996, 1997) and to improve their detection (Komatsoulis & Waterman, 1997; Liesack et al., 1991; Maidak et al., 2001; Robinson-Cox et al., 1995). Despite these precautions, a surprising number of chimeric 16S rDNA sequences from molecular phylogenetic surveys were detected in the public databases during a recent collation (Hugenholtz, 2002).
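The recombination event described above can be pictured with a toy model. The following is a minimal sketch, assuming two aligned parent sequences of equal length and a single junction; the function name `make_chimera` and the example sequences are hypothetical and are not taken from any of the cited studies.

```python
def make_chimera(parent_a: str, parent_b: str, breakpoint: int) -> str:
    """Join the 5' part of parent_a to the 3' part of parent_b.

    Models a prematurely terminated amplicon of parent_a that anneals to
    parent_b and is extended to full length. Both parents are assumed to be
    aligned and of equal length, so one column index serves as the junction.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be aligned to the same length")
    if not 0 < breakpoint < len(parent_a):
        raise ValueError("breakpoint must fall inside the sequence")
    return parent_a[:breakpoint] + parent_b[breakpoint:]


# Hypothetical 10-nt parents; real 16S rDNA sequences are about 1500 nt.
parent_a = "ACGTACGTAC"
parent_b = "TTGTACCCAC"
print(make_chimera(parent_a, parent_b, 5))
```

The resulting sequence matches neither parent over its full length, which is why it can appear to represent an organism that does not exist.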


   METHODS
 
Phylogenetic analysis.
16S rDNA sequences from several PCR-clone library studies were obtained from the public databases and imported into an ARB (http://www.arb-home.de/) database, where they were automatically aligned against existing sequences using FAST ALIGNER (version 1.03) followed by a manual refinement of the alignment. Studies were selected on the basis that at least one sequence in the study had been putatively identified as chimeric during routine database updating. Only almost-complete 16S rDNA sequences (>1300 nt) were included in the analysis, because the phylogenetic placement of shorter sequences can be unreliable, particularly if they lack close relatives in the database (Hugenholtz et al., 1998). Datasets comprising all sequences (>1300 nt) from a single study and 341 or 200 reference sequences representing the bacterial or archaeal domains, respectively, were selected for phylogenetic inference (Hugenholtz, 2002). These datasets are available through the Ribosomal Database Project (RDP; Maidak et al., 2001; http://rdp.cme.msu.edu/html/alignments.html).

Evolutionary distance trees were inferred independently from 5' and 3' halves of each dataset (partial treeing) applying the Lane mask (Lane, 1991) from absolute positions 0 to 4000 (635 nt for comparative analysis equivalent to Escherichia coli positions 28–762; 5' half) and 4000 to 0 (653 nt equivalent to E. coli positions 762–1512; 3' half) using the ‘column selection’ option in the filter selection menu. The environmental clone sequences in the dataset were then marked and tree topologies were compared for branching incongruencies indicative of chimeric sequences (Wang & Wang, 1997). The alignments of putatively identified chimeras were examined against their closest 5' and 3' matches (at least two of each) and inspected for nucleotide signature shifts characteristic of chimeric sequences (Wang & Wang, 1997). Breakpoints (also known as chimeric junctions or recombination sites) were estimated as being halfway between the change of nucleotide signatures characteristic of each parent group. Exact breakpoints are difficult to determine because the parent sequences are usually identical around the recombination site (Hugenholtz & Goebel, 2001). Positively identified chimeras were flagged in the database by appending ‘#’ to the clone name and annotating the ‘warning’ field with the affiliations of the parent sequences and approximate breakpoint. This information is summarized in Table 1.
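The logic of the comparison can be illustrated outside ARB. The following is a simplified sketch, not the ARB procedure used in this study: instead of inferring distance trees, it assigns each half of a query to the group of its nearest reference by uncorrected distance, flags a query whose halves affiliate with different groups, and estimates the breakpoint as the midpoint between the flanking signature positions. All function names, the group labels and the toy sequences are assumptions made for illustration.

```python
def p_distance(a, b):
    """Fraction of differing positions, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_group(query, references, start, end):
    """Group label of the reference closest to the query over columns [start, end)."""
    best = min(references,
               key=lambda r: p_distance(query[start:end], r["seq"][start:end]))
    return best["group"]


def partial_affiliations(query, references, split):
    """Return (incongruent, 5' group, 3' group) for an aligned query."""
    five = nearest_group(query, references, 0, split)
    three = nearest_group(query, references, split, len(query))
    return five != three, five, three


def estimate_breakpoint(query, parent5, parent3):
    """Midpoint between the last 5'-parent signature and the first 3'-parent signature.

    Only columns where the two parents differ and the query matches one of
    them are informative. Returns None if no junction lies between two
    informative columns.
    """
    sites = [(i, query[i] == parent5[i]) for i in range(len(query))
             if parent5[i] != parent3[i] and query[i] in (parent5[i], parent3[i])]
    best_k, best_score = None, -1
    for k in range(len(sites) + 1):
        # Score a junction placed before informative site k.
        score = (sum(is5 for _, is5 in sites[:k])
                 + sum(not is5 for _, is5 in sites[k:]))
        if score > best_score:
            best_k, best_score = k, score
    if best_k in (0, len(sites)):
        return None
    return (sites[best_k - 1][0] + sites[best_k][0]) / 2


# Hypothetical aligned references from two groups, and a query built from both.
references = [
    {"group": "X", "seq": "ACGTACGTACGT"},
    {"group": "Y", "seq": "ACCTACGTACTT"},
]
query = "ACGTACGTACTT"
print(partial_affiliations(query, references, split=6))
print(estimate_breakpoint(query, references[0]["seq"], references[1]["seq"]))
```

Because the parents are identical between the two signature positions, the estimate is only the midpoint of that interval, which reflects the difficulty of locating exact breakpoints noted above.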


Table 1. Chimeric 16S rDNA sequences detected in the public databases

 

   RESULTS AND DISCUSSION
 
A sampling of nine published and three unpublished studies with publicly available sequence data revealed 21 inter-phylum and 18 intra-phylum chimeras by partial treeing (Table 1). Numerous smaller local topological rearrangements of sequences in the partial trees were observed, suggesting the presence of additional chimeras created from closely related parent sequences (data not shown). Inter-phylum chimeras are particularly problematic in phylogenetic inference, as they can result in novel lines of descent that may be misinterpreted as representing novel species, genera or even families of prokaryotes, although they should also be the easiest to detect (von Wintzingerode et al., 1997). The most popular method for detecting chimeric 16S rDNA sequences is the CHIMERA_CHECK program available through the RDP (Maidak et al., 2001). This is one of a number of nearest-neighbour methods that detect chimeric sequences by determining whether fragments of two independent database entries have a higher overall similarity to the query sequence than a single, full-length database entry (Komatsoulis & Waterman, 1997; Robinson-Cox et al., 1995). Unfortunately, once a chimeric sequence is added to the RDP database it becomes invisible to CHIMERA_CHECK because it is simply compared against itself in the analysis. Currently, there is no way to use CHIMERA_CHECK against subsets of the database, such as 16S rDNA sequences from cultivated organisms, to bypass this problem.
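The nearest-neighbour idea, and the reason a deposited chimera hides itself, can be shown with a toy example. This is a minimal sketch under stated assumptions: it scores aligned sequences by counting identical columns, which is not how CHIMERA_CHECK computes similarity, and the function names, the `min_fragment` parameter and the database contents are all hypothetical.

```python
def matches(a, b):
    """Number of identical aligned positions."""
    return sum(x == y for x, y in zip(a, b))


def best_full(query, db):
    """Best score of the query against any single full-length entry."""
    return max(matches(query, s) for s in db.values())


def best_split(query, db, min_fragment=3):
    """Best combined score when each fragment may match a different entry."""
    best_score, best_bp = 0, None
    for bp in range(min_fragment, len(query) - min_fragment + 1):
        left = max(matches(query[:bp], s[:bp]) for s in db.values())
        right = max(matches(query[bp:], s[bp:]) for s in db.values())
        if left + right > best_score:
            best_score, best_bp = left + right, bp
    return best_score, best_bp


def chimera_gain(query, db):
    """How much better two fragments score than one full-length entry."""
    split_score, bp = best_split(query, db)
    return split_score - best_full(query, db), bp


db = {"A": "AAAAAAAAAA", "C": "CCCCCCCCCC"}   # hypothetical database entries
query = "AAAAACCCCC"                          # hypothetical chimera of A and C

print(chimera_gain(query, db))                # large gain: looks chimeric

# Once the chimera itself is in the database, it is its own best
# full-length match and the gain collapses to zero.
db_with_query = dict(db, Q=query)
print(chimera_gain(query, db_with_query)[0])
```

Restricting the comparison to a trusted subset, for example by passing in only entries from cultivated organisms, would avoid the self-match in this sketch; the text above notes that CHIMERA_CHECK did not offer this option at the time of writing.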

One instance of a chimeric sequence with two breakpoints was detected, SAGMA-C (Table 1). PCR-mediated recombinant sequences with multiple recombination sites have been documented previously (Bradley & Hillis, 1997) and are more likely to occur between closely related sequences, as seen in this instance. The sequences presented in Table 1 reduce the quality of the public databases and should be removed, or divided at the breakpoint and resubmitted as separate entries designated A and B to distinguish the chimeric fragments.
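Dividing an entry at its breakpoint is mechanically simple. The following sketch is an assumption about how such a split might be scripted, not a description of any database submission tool: the `_A`/`_B` suffix format, the function name and the example record are hypothetical, and extending the lettering to three fragments for a sequence with two breakpoints is our own extrapolation.

```python
from string import ascii_uppercase


def split_at_breakpoints(name, sequence, breakpoints):
    """Split a sequence at each breakpoint and label the fragments A, B, C...

    Breakpoints are 0-based column indices; each fragment runs from one
    cut up to, but not including, the next.
    """
    for bp in breakpoints:
        if not 0 < bp < len(sequence):
            raise ValueError(f"breakpoint {bp} lies outside the sequence")
    cuts = [0] + sorted(breakpoints) + [len(sequence)]
    fragments = {}
    for letter, (start, end) in zip(ascii_uppercase, zip(cuts, cuts[1:])):
        fragments[f"{name}_{letter}"] = sequence[start:end]
    return fragments


# Hypothetical clone with one breakpoint at column 4.
print(split_at_breakpoints("clone1", "AAAACCCC", [4]))
```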

This study is by no means an exhaustive search of the public databases and simply serves to illustrate that chimeric 16S rDNA sequences are being overlooked in molecular phylogenetic surveys, despite a general appreciation of PCR-generated chimeras amongst researchers. Nearest-neighbour chimera detection methods should be routinely supplemented with partial treeing analysis, as this method is less sensitive to the absence of closely related parent sequences in the databases and is relatively simple to implement using ARB. In addition, we have recently written a program, called BELLEROPHON, that detects chimeric sequences in aligned datasets based on partial treeing analysis; this program is available online (http://cassandra.visac.uq.edu.au/perl/bellerophon.pl).


   NOTE ADDED IN PROOF
 
One of the three unpublished studies addressed in this paper has now been published (Lowe et al., 2002; sequences d011 and d035 in Table 1).


   REFERENCES
 
Bano, N. & Hollibaugh, J. T. (2002). Phylogenetic composition of bacterioplankton assemblages from the Arctic Ocean. Appl Environ Microbiol 68, 505–518.

Bradley, R. D. & Hillis, D. M. (1997). Recombinant DNA sequences generated by PCR amplification. Mol Biol Evol 14, 592–593.

Garrity, G. M., Winters, M. & Searles, D. B. (2001). Taxonomic Outline of the Procaryotes. Bergey's Manual of Systematic Bacteriology, release 1.0, April 2001. New York: Springer-Verlag (http://www.cme.msu.edu/bergeys/).

Hugenholtz, P. (2002). Exploring prokaryotic diversity in the genomic era. Genome Biol 3, REVIEWS0003.

Hugenholtz, P. & Goebel, B. M. (2001). The polymerase chain reaction as a tool to investigate microbial diversity in environmental samples. In Environmental Molecular Microbiology: Protocols and Applications, pp. 31–42. Edited by P. A. Rochelle. Wymondham, UK: Horizon Scientific Press.

Hugenholtz, P., Goebel, B. M. & Pace, N. R. (1998). Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity. J Bacteriol 180, 4765–4774; erratum, 6793.

Komatsoulis, G. & Waterman, M. (1997). A new computational method for detection of chimeric 16S rRNA artifacts generated by PCR amplification from mixed bacterial populations. Appl Environ Microbiol 63, 2338–2346.

Kopczynski, E. D., Bateson, M. M. & Ward, D. M. (1994). Recognition of chimeric small-subunit ribosomal DNAs composed of genes from uncultivated microorganisms. Appl Environ Microbiol 60, 746–748.

Lane, D. J. (1991). 16S/23S rRNA sequencing. In Nucleic Acid Techniques in Bacterial Systematics, pp. 115–175. Edited by E. Stackebrandt & M. Goodfellow. New York: Wiley.

Lanoil, B. D., Sassen, R., La Duc, M. T., Sweet, S. T. & Nealson, K. H. (2001). Bacteria and Archaea physically associated with Gulf of Mexico gas hydrates. Appl Environ Microbiol 67, 5143–5153.

Li, L., Kato, C. & Horikoshi, K. (1999). Bacterial diversity in deep-sea sediments from different depths. Biodivers Conserv 8, 659–677.

Liesack, W., Weyland, H. & Stackebrandt, E. (1991). Potential risks of gene amplification by PCR as determined by 16S rDNA analysis of a mixed-culture of strict barophilic bacteria. Microb Ecol 21, 191–198.

Lowe, M., Madsen, E. L., Schindler, K., Smith, C., Emrich, S., Robb, F. & Halden, R. U. (2002). Geochemistry and microbial diversity of a trichloroethene-contaminated Superfund site undergoing intrinsic in situ reductive dechlorination. FEMS Microbiol Ecol 40, 123–134.

Maidak, B. L., Cole, J. R., Lilburn, T. G. & 7 other authors (2001). The RDP-II (Ribosomal Database Project). Nucleic Acids Res 29, 173–174.

Pace, N. R. (1997). A molecular view of microbial diversity and the biosphere. Science 276, 734–740.

Qiu, X. Y., Wu, L. Y., Huang, H. S., McDonel, P. E., Palumbo, A. V., Tiedje, J. M. & Zhou, J. Z. (2001). Evaluation of PCR-generated chimeras: mutations, and heteroduplexes with 16S rRNA gene-based cloning. Appl Environ Microbiol 67, 880–887.

Reysenbach, A. L., Longnecker, K. & Kirshtein, J. (2000). Novel bacterial and archaeal lineages from an in situ growth chamber deployed at a Mid-Atlantic Ridge hydrothermal vent. Appl Environ Microbiol 66, 3798–3806.

Robinson-Cox, J. F., Bateson, M. M. & Ward, D. M. (1995). Evaluation of nearest-neighbor methods for detection of chimeric small-subunit rRNA sequences. Appl Environ Microbiol 61, 1240–1245.

Speksnijder, A., Kowalchuk, G. A., De Jong, S., Kline, E., Stephen, J. R. & Laanbroek, H. J. (2001). Microvariation artifacts introduced by PCR and cloning of closely related 16S rRNA gene sequences. Appl Environ Microbiol 67, 469–472.

Stein, L. Y., La Duc, M. T., Grundl, T. J. & Nealson, K. H. (2001). Bacterial and archaeal populations associated with freshwater ferromanganous micronodules and sediments. Environ Microbiol 3, 10–18.

Takai, K. & Horikoshi, K. (1999). Genetic diversity of archaea in deep-sea hydrothermal vent environments. Genetics 152, 1285–1297.

Takai, K., Moser, D. P., DeFlaun, M., Onstott, T. C. & Fredrickson, J. K. (2001). Archaeal diversity in waters from deep South African gold mines. Appl Environ Microbiol 67, 5750–5760.

von Wintzingerode, F., Gobel, U. B. & Stackebrandt, E. (1997). Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis. FEMS Microbiol Rev 21, 213–229.

Wang, G. C.-Y. & Wang, Y. (1996). The frequency of chimeric molecules as a consequence of PCR co-amplification of 16S rRNA genes from different bacterial species. Microbiology 142, 1107–1114.

Wang, G. C.-Y. & Wang, Y. (1997). Frequency of formation of chimeric molecules is a consequence of PCR coamplification of 16S rRNA genes from mixed bacterial genomes. Appl Environ Microbiol 63, 4645–4650.

Wu, J. H., Liu, W. T., Tseng, I. C. & Cheng, S. S. (2001a). Characterization of a 4-methylbenzoate-degrading methanogenic consortium as determined by small-subunit rDNA sequence analysis. J Biosci Bioeng 91, 449–455.

Wu, J. H., Liu, W. T., Tseng, I. C. & Cheng, S. S. (2001b). Characterization of microbial consortia in a terephthalate-degrading anaerobic granular sludge system. Microbiology 147, 373–382.



