Comparative Study
Genome Biol Evol. 2010:2:646-55.
doi: 10.1093/gbe/evq048. Epub 2010 Aug 4.

Evolutionary dynamics of complete Campylobacter pan-genomes and the bacterial species concept

Tristan Lefébure et al. Genome Biol Evol. 2010.

Abstract

Defining bacterial species and understanding the relative cohesiveness of different components of their genomes remain fundamental problems in microbiology. Bacterial species tend to comprise both core and dispensable genes, with the sum of these two components forming the species pan-genome. The role of the core and dispensable genes in defining bacterial species and the question of whether pan-genomes are finite or infinite remain unclear. Here we demonstrate, through the analysis of 96 genome sequences derived from two closely related sympatric sister species of pathogenic bacteria (Campylobacter coli and C. jejuni), that their pan-genome is indeed finite and that there are unique and cohesive features to each of their genomes defining their genomic identity. The two species have a similar pan-genome size; however, C. coli has acquired a larger core genome and each species has evolved a number of species-specific core genes, possibly reflecting different adaptive strategies. Genome-wide assessment of the level of lateral gene transfer within and between the two sister species, as well as within the core and non-core genes, demonstrates a resistance to interspecies recombination in the core genome of the two species and therefore provides persuasive support for the core genome hypothesis for bacterial species.

Figures

FIG. 1. Pipeline combining de novo assemblies and read mapping, yielding a gene content table and core gene alignments.
FIG. 2. Core genome (A) and pan-genome (B) size estimates, as well as number of newly discovered genes (C), as a function of the number of sequenced genomes. The genome input order was randomly permuted 1,000 times. The lines describe the average number of genes (using median statistics), whereas the vertical bars delimit the second and third quartiles, with the exception of panel (C), where quartiles are represented by short dashed lines. On panel (A), the long dashed lines correspond to the average core genome size when one taxon is allowed a missing core gene, whereas on the (B and C) panels, they describe the pan-genome size or number of new genes for the combined species data set when the putative pseudogenes are excluded.
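The permutation procedure described in the Fig. 2 caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `accumulation_curves`, the input format (one set of ortholog identifiers per genome), and the toy data are assumptions made for illustration only. For each random genome order, the sketch tracks the running core genome (intersection), pan-genome (union), and newly discovered genes, then summarizes each step by its median and quartiles.

```python
import random
from statistics import median, quantiles


def accumulation_curves(gene_sets, n_permutations=1000, seed=0):
    """Summarize core, pan-genome, and new-gene counts over random genome orders.

    gene_sets: list of sets, one per genome, holding ortholog identifiers
    (a hypothetical input format, not the one used in the paper).
    Returns a dict mapping "core", "pan", and "new" to a list with one
    (first quartile, median, third quartile) tuple per number of genomes.
    """
    rng = random.Random(seed)
    n = len(gene_sets)
    samples = {key: [[] for _ in range(n)] for key in ("core", "pan", "new")}

    for _ in range(n_permutations):
        order = list(gene_sets)
        rng.shuffle(order)              # random genome input order
        core = set(order[0])
        pan = set()
        for k, genes in enumerate(order):
            new = genes - pan           # genes not seen in earlier genomes
            pan |= genes                # pan-genome: union so far
            core &= genes               # core genome: intersection so far
            samples["core"][k].append(len(core))
            samples["pan"][k].append(len(pan))
            samples["new"][k].append(len(new))

    def summarize(per_step):
        summary = []
        for values in per_step:
            q1, _, q3 = quantiles(values, n=4)
            summary.append((q1, median(values), q3))
        return summary

    return {key: summarize(per_step) for key, per_step in samples.items()}


# Toy data: three genomes sharing a single gene "a".
toy = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "e"}]
curves = accumulation_curves(toy, n_permutations=200)
```

With real data, a core curve that flattens and a new-gene curve that approaches zero as genomes are added are the patterns such plots are typically read for; the sketch itself makes no claim about the values reported in the paper.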
FIG. 3. Principal component analysis (PCA) of Campylobacter coli and C. jejuni gene characteristics aimed at discriminating different gene frequency groups (between-group PCA). The first and second components of the PCA summarized 45% and 21% of the total inertia, respectively, whereas the gene frequency group factor accounted for 17% of the total inertia. (A) Histogram of gene occurrence frequency and the frequency groups g1–g5 that were used in the analysis. (B) Canonical graph, with B(g|C) representing the codon usage distance to the average genomic codon usage; nNR, the number of hits with the NR database; Pfam, the best Pfam score; GC, the GC content; P2, the U/C choice in degenerate codon position index; length, protein length; Glim, Glimmer score; and eNR, the log transformed best NR E value. (C) Projection of gene orthologs on the first and second components, with red dots representing genes belonging to the g5 group, black dots the g2–g4 groups, and blue dots g1. Ellipses and gravity centers are used to represent the frequency group distribution.
FIG. 4. Overlap between the core and dispensable (disp.) genomic components of Campylobacter coli and C. jejuni; core genes were allowed to be missing in one strain per species. The absent/absent section represents genes that were found in other Campylobacter species but absent in C. coli and C. jejuni. Circle radii are proportional to the number of genes. The black surface represents the proportion of putative pseudogenes.
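The classification behind the Fig. 4 overlap can be sketched as follows, assuming a simple gene presence table. The names `classify` and `overlap_table`, the strain labels, and the input format are hypothetical and are not taken from the paper. A gene counts as core in a species when it is missing from at most one strain, as the caption states, as dispensable when it is present in some strains but missing from more than one, and as absent when no strain of that species carries it.

```python
from collections import Counter


def classify(present_in, strains, max_missing=1):
    """Label one gene as 'core', 'dispensable', or 'absent' within a species.

    present_in: set of strain names carrying the gene.
    strains: set of all strain names belonging to the species.
    """
    n_present = len(present_in & strains)
    if n_present == 0:
        return "absent"
    if len(strains) - n_present <= max_missing:
        return "core"
    return "dispensable"


def overlap_table(gene_presence, species_a, species_b, max_missing=1):
    """Count genes for each (status in species A, status in species B) pair."""
    counts = Counter()
    for present_in in gene_presence.values():
        status = (
            classify(present_in, species_a, max_missing),
            classify(present_in, species_b, max_missing),
        )
        counts[status] += 1
    return counts


# Toy data with made-up strain and gene names.
species_a = {"a1", "a2", "a3"}
species_b = {"b1", "b2", "b3"}
gene_presence = {
    "g1": {"a1", "a2", "a3", "b1", "b2", "b3"},  # core in both species
    "g2": {"a1", "a2", "b1"},                    # core in A, dispensable in B
    "g3": set(),                                 # absent from both species
    "g4": {"b1", "b2", "b3"},                    # absent in A, core in B
}
table = overlap_table(gene_presence, species_a, species_b)
```

Each key of `table` corresponds to one section of an overlap diagram like the one in the figure, and the count is the quantity a circle radius would be scaled to.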
FIG. 5. Campylobacter jejuni NCTC11168 genome map with genes (CDS) displayed on either strand as tracks 1 and 2 (tracks numbered from outside in). Gene frequencies and core genes are displayed for C. jejuni (tracks 3 and 4) and C. coli (tracks 5 and 6), as well as recombinant genes that showed evidence of interspecies (tracks 7 and 9, the height being proportional to the number of LGT events) and intraspecies (tracks 8 and 10) LGTs. Coding genes that are core genes in C. jejuni but absent (in black) or dispensable (in gray) in C. coli are labeled with gene names or locus names. Color code for each track is given at the top of the figure.
FIG. 6. Gene tree bipartition support. This graph displays the percentage of gene trees supporting or rejecting a set of 5,264 bipartitions that are supported by at least one gene tree. Support was assessed using nonparametric bootstrap, with support higher than 70% and 90% in red and blue, respectively. Bipartitions are sorted from the least rejected to the most rejected, resulting in bipartitions supporting species paraphyly at the extreme right. The five most commonly supported bipartitions are labeled as “(C.jejuni),” Campylobacter jejuni monophyly; “(C.coli),” C. coli monophyly; “a,” cje12 and cjj81116 monophyly; “b,” cje23 and cjjhb9313 monophyly; and “c,” monophyly of the C. coli species with the exception of cco71.
FIG. 7. Overlap between the intra- (intra) and interspecies (inter) divergences in the core (white) and dispensable (gray) genes.
