Dennis V. Lavrov, Walker Pett, Oliver Voigt, Gert Wörheide, Lise Forget, B. Franz Lang, Ehsan Kayal, Mitochondrial DNA of Clathrina clathrus (Calcarea, Calcinea): Six Linear Chromosomes, Fragmented rRNAs, tRNA Editing, and a Novel Genetic Code, Molecular Biology and Evolution, Volume 30, Issue 4, April 2013, Pages 865–880, https://doi.org/10.1093/molbev/mss274
Abstract
Sponges (phylum Porifera) are a large and ancient group of morphologically simple but ecologically important aquatic animals. Although their body plan and lifestyle are relatively uniform, sponges show extensive molecular and genetic diversity. In particular, mitochondrial genomes from three of the four previously studied classes of Porifera (Demospongiae, Hexactinellida, and Homoscleromorpha) have distinct gene contents, genome organizations, and evolutionary rates. Here, we report the mitochondrial genome of Clathrina clathrus (Calcinea, Clathrinidae), a representative of the fourth poriferan class, the Calcarea, which proves to be the most unusual. Clathrina clathrus mitochondrial DNA (mtDNA) consists of six linear chromosomes 7.6–9.4 kb in size and encodes at least 37 genes: 13 protein-coding genes, 2 ribosomal RNAs (rRNAs), and 24 transfer RNAs (tRNAs). Protein genes include atp9, which has now been found in all major sponge lineages, but no atp8. Our analyses further reveal the presence of a novel genetic code that involves unique reassignments of the UAG codons from termination to tyrosine and of the CGN codons from arginine to glycine. Clathrina clathrus mitochondrial rRNAs are encoded in three (srRNA) and ⩾6 (lrRNA) fragments distributed out of order and on several chromosomes. The encoded tRNAs contain multiple mismatches in the aminoacyl acceptor stems that are repaired posttranscriptionally by 3′-end RNA editing. Although our analysis does not resolve the phylogenetic position of calcareous sponges, likely due to their high rates of mitochondrial sequence evolution, it confirms mtDNA as a promising marker for population studies in this group. The combination of unusual mitochondrial features in C. clathrus redefines the extremes of mtDNA evolution in animals and further argues against the idea of a “typical animal mtDNA.”
Introduction
Mitochondria, the double-membrane organelles found in most eukaryotic cells and best known for their role in energy production, contain their own genome (mitochondrial DNA [mtDNA]), which is maintained, expressed, and inherited separately from the nuclear genome (reviewed in Gibor and Granick 1964; Birky 2001). Although the genetic function of mtDNA is well conserved, its size, structure, and gene expression mechanisms vary greatly among and within major lineages of eukaryotes (Lang et al. 1999; Burger, Gray, et al. 2003). By contrast, mtDNA of animals is generally considered remarkably uniform and is typically described as a small and economically organized circular molecule with a conserved gene set, modified genetic code, stable gene order, and high rates of sequence evolution. However, such a portrayal of animal mtDNA is based on early studies limited primarily to bilaterian animals (e.g., Clary and Wolstenholme 1985) and is now out of date. A more expanded sampling of bilaterian animals and, especially, recent work on nonbilaterian animals (phyla Cnidaria, Ctenophora, Placozoa, and Porifera) revealed substantial mitochondrial genomic diversity within these groups (reviewed in Lavrov 2011). This diversity is particularly pronounced in sponges, which should be expected given their ancient origin (Love et al. 2009) and early diversification (Pisera 2006).
Sponges (phylum Porifera) are sessile aquatic animals that play a major role in marine and many freshwater benthic communities (Van Soest et al. 2012). They are subdivided into four classes: Calcarea (calcareous sponges), Demospongiae (demosponges), Hexactinellida (glass sponges), and Homoscleromorpha (homoscleromorphs) (Hooper et al. 2002; Gazave et al. 2011). The first two complete mitochondrial genomes of sponges were determined only in 2005 (Lavrov et al. 2005). Since then, approximately 30 other sponge mt-genomes have been published (Erpenbeck et al. 2007, 2009; Haen et al. 2007; Wang and Lavrov 2007, 2008; Belinky et al. 2008; Lukic-Bilela et al. 2008; Rosengarten et al. 2008; Gazave et al. 2010; Lavrov 2010, 2012; Ereskovsky et al. 2011; Pleše et al. 2012; Sperling et al. 2012). This body of work revealed distinct evolutionary trajectories in mitochondrial genome evolution among and within major lineages of sponges. Thus, mitochondrial genomes of demosponges and homoscleromorphs are characterized by the retention of several features characteristic of many nonmetazoan eukaryotes, including a minimally modified genetic code, presence of several extra genes, more bacteria-like structures of encoded transfer and ribosomal RNAs (rRNAs), low rate of sequence evolution, and the presence of multiple noncoding regions (Lavrov et al. 2005; Wang and Lavrov 2007, 2008). By contrast, mitochondrial genomes of glass sponges have evolved several features in parallel with their counterparts in bilaterian animals and share with them a similar mtDNA organization with a single large noncoding region, a similar nucleotide composition of the coding strand, and, most surprisingly, an identical reassignment of the AGR codons from arginine to serine (Haen et al. 2007; Rosengarten et al. 2008). Within Demospongiae, loss of transfer RNA (tRNA) genes has occurred in several lineages (Wang and Lavrov 2008; Erpenbeck et al. 2009), and a transfer of atp9 to the nucleus has been reported (Erpenbeck et al. 2007). Within Homoscleromorpha, two different mitochondrial organizations have been found: all genomes from the family Oscarellidae are characterized by the presence of tatC, a gene for subunit C of the twin arginine translocase, otherwise not known in animals, as well as 27 tRNA genes, whereas those from the family Plakinidae lack all but five tRNA genes present in Oscarellidae (Gazave et al. 2010). Remarkably, to date there are no complete and only a few partial mt sequences from Calcarea (Voigt, Eichmann, et al. 2012), the fourth class of sponges. Moreover, several mitochondrial sequences in GenBank assigned to this group appear to be contaminations of demosponge origin (Borchiellini et al. 2004).
Calcareous sponges (class Calcarea) are the only sponges with skeletons made of calcium carbonate (CaCO3) spicules. The majority of approximately 680 described species are small, colorless, and inconspicuous organisms found mostly in cryptic marine habitats (e.g., beneath rocks and inside cavities) in shallow waters (Wörheide and Hooper 2003); however, a few can reach up to 50 cm in size and some have been collected from bathyal and abyssal zones (Van Soest et al. 2012). Calcareous sponges are present in seas at all latitudes and display a wide variety of organizations, including the famous morphocline in the organization of the aquiferous system, which can be of an asconoid, syconoid, solenoid, sylleibid, and leuconoid type (Voigt, Wülfing, et al. 2012). Based on morphological and cytological data, calcareous sponges have been placed as a sister group to either Demospongiae (e.g., Cellularia [Reiswig et al. 1983]) or Demospongiae + Hexactinellida (e.g., Silicea [Gray 1867]). By contrast, most molecular phylogenetic studies using rRNA (Cavalier-Smith et al. 1996; Zrzavy et al. 1998; Borchiellini et al. 2001; Medina et al. 2001; Manuel et al. 2003) and nuclear housekeeping (Sperling et al. 2009) genes reconstructed Porifera as a paraphyletic group, with calcareous sponges being more closely related to Eumetazoa than to other sponges. Recent phylogenomic analyses based on expressed sequence tag (EST) data reversed this trend and supported the monophyly of sponges (Philippe et al. 2009, 2011; Pick et al. 2010) but placed Calcarea as the sister group to Homoscleromorpha, a newly established class of sponges (Gazave et al. 2011) (reviewed in Wörheide et al. 2012). Within Calcarea, two subclasses, Calcinea and Calcaronea, are recognized and well supported by the analyses of cytological, developmental, and molecular characters (see Manuel 2006, for a review). However, the phylogenetic relationships within these subclasses remain highly contentious (Manuel et al. 2003; Dohrmann et al. 2006; Voigt, Wülfing, et al. 2012).
Here, we report the mitochondrial genome sequence of the calcareous sponge Clathrina clathrus (Calcinea, Clathrinidae) and four mitochondrial complementary DNA (cDNA) sequences of Leucetta chagosensis (Calcinea, Leucettidae). These highly unusual sequences provide a first glimpse of mitochondrial genome evolution in Calcarea and broaden our view on the existing diversity of mitochondrial genomes in sponges and animals in general.
Results
Multipartite and Linear Architecture of the C. clathrus Mitochondrial Genome
The mitochondrial genome of C. clathrus collected at the Station Marine d'Endoume, Marseille (France) was assembled from paired end Illumina DNA sequencing reads in six contigs 5,738; 6,382; 7,920; 8,013; 8,413; and 9,199 bp in size, which were extended by polymerase chain reaction (PCR)/Sanger sequencing using primers designed for the ends of the longer contigs to 7,162+; 8,857; 7,920; 8,866; 9,269; and 9,370 bp, respectively (fig. 1). The sequences of all contigs were identical at one end and similar at the other. The contigs also displayed a comparable internal organization with genes compactly arrayed in approximately one-half of each of them and another half containing exclusively noncoding sequence. The total size of the assembly was >50 kb, making C. clathrus mtDNA the largest animal mitochondrial genome reported to date.
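The assembly total quoted above can be checked with simple arithmetic. The following is a minimal Python sketch that uses only the extended contig lengths given in the text; the variable names are illustrative, and the first length is treated as a lower bound because it is reported as "7,162+".

```python
# Extended contig lengths in bp, as listed in the text.
# The first value (reported as "7,162+") is a lower bound for an incomplete contig.
extended_contigs_bp = [7162, 8857, 7920, 8866, 9269, 9370]

total_bp = sum(extended_contigs_bp)
print(f"minimum assembly size: {total_bp:,} bp ({total_bp / 1000:.1f} kb)")

# The text only claims that the total exceeds 50 kb.
assert total_bp > 50_000
```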
Three disjoint mtDNA regions were also PCR-amplified and sequenced from another specimen of C. clathrus collected off Isola del Giglio (Italy) (fig. 1A) and used to design probes for Southern hybridization (SH) analysis. SH analysis with probes for rnl_5, cob, and cox1 revealed a peculiar double-band hybridization pattern, with the estimated sizes of the lower bands at 7, 9, and ∼10 kb, and estimated sizes of the upper bands about twice as large (fig. 1B). The difference in the band sizes for individual probes suggests that each assembled contig represents a mitochondrial chromosome. The double-band hybridization pattern can be interpreted either as that of partially dimerized linear DNA molecules, or of circular plasmids with fast-migrating supercoiled circles and slowly migrating open circles. However, because no Illumina sequences have been found that would link the two terminal regions and no PCR reaction across the ends was successful, we interpreted C. clathrus mitochondrial chromosomes as linear.
High Inter- and Intraspecific Variation in Coding Sequences
Thirteen protein genes (atp6, atp9, cob, cox1–3, nad1–6, and nad4L) were identified in the mitochondrial genome of C. clathrus (France). These genes were localized to five different chromosomes in both transcriptional directions (fig. 1). Seven of them (atp6, cob, cox1, nad1–4) were also present in PCR-amplified fragments from C. clathrus (Italy). In addition, four mitochondrial protein genes (cob, cox1, cox3, and nad1) were identified in an EST library of L. chagosensis and PCR-amplified from another specimen of the same species. Analyses of calcareous mitochondrial coding sequences revealed substantial variation at both the intra- and interspecific levels. The two specimens of C. clathrus had overall 7.9% sequence divergence in coding sequences (from 4% for cob to 17.9% for nad2) (table 1). Similarly, the two specimens of L. chagosensis displayed on average 4.3% sequence divergence at the nucleotide level. Finally, coding sequences in C. clathrus and L. chagosensis differed at more than 50% of sites, comparable with the differences between C. clathrus and the homoscleromorph Oscarella carmela (table 1). These large sequence divergences in both intra- and interspecific comparisons reflect high rates of sequence evolution in calcareous mtDNA as proposed earlier (Lavrov et al. 2006; Voigt, Eichmann, et al. 2012), which also manifest themselves as exceptionally long branches in phylogenetic analyses (discussed later). At the same time, the ratios of nonsynonymous to synonymous substitutions (dN/dS) in mitochondrial coding sequences were consistently <1 (table 1), indicating the presence of purifying selection and hence the functionality of these genes.
Gene | Size^a (codons) | dN^b | dS^b | dN/dS^b | Identity^c CC/CC1 (%) | Identity^c CC/LC (%) | Identity^c CC/OC (%) | Initiation codon | Stop codon
---|---|---|---|---|---|---|---|---|---
atp6 | 254 | 0.046 | 0.217 | 0.213 | 91.6 | — | 39.3 | ATG | TAA
atp9 | 78 | — | — | — | — | — | 61.0 | ACA^d | TAA
cob | 372 | 0.004 | 0.123 | 0.031 | 96.0 | 53.0 | 46.1 | ATT | TAA
cox1 | 474 | 0.006 | 0.187 | 0.032 | 95.2 | 58.3 | 50.2 | AAA^d | TAA
cox2 | 216 | — | — | — | — | — | 43.5 | ATG | TAG
cox3 | 269 | — | — | — | — | 44.3 | 43.3 | CCT^d | TAA
nad1 | 312 | 0.019 | 0.140 | 0.136 | 95.1 | 46.6 | 45.9 | ATC | TAA
nad2 | 462 | 0.143 | 0.468 | 0.306 | 82.1 | — | 33.6 | ATG | TAA
nad3 | 169 | 0.047 | 0.326 | 0.146 | 89.3 | — | 29.7 | ATG | TAA
nad4 | 477 | 0.010 | 0.201 | 0.051 | 94.3 | — | 40.4 | ATG | TAG
nad4L | 89 | — | — | — | — | — | 39.7 | GGG^d | TAA
nad5 | 530 | — | — | — | — | — | 38.6 | ATG | TAA
nad6 | 198 | — | — | — | — | — | 34.6 | ATG | AAT^e
^a Including stop codon.
^b In comparison between two C. clathrus specimens.
^c Percent sequence identity between the following specimens: CC, C. clathrus (France); CC1, C. clathrus (Italy); LC, Leucetta chagosensis (Australia); and OC, Oscarella carmela.
^d No conventional start codon has been found for the reading frame.
^e No conventional termination codon has been found for the reading frame.
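As a quick consistency check on table 1, the dN/dS ratios can be recomputed from the tabulated dN and dS values. The sketch below is illustrative only: the dictionary and function names are our own, and the input values are copied from the table for the seven genes compared between the two C. clathrus specimens.

```python
# dN and dS for the C. clathrus France/Italy comparison, copied from table 1.
rates = {
    "atp6": (0.046, 0.217),
    "cob":  (0.004, 0.123),
    "cox1": (0.006, 0.187),
    "nad1": (0.019, 0.140),
    "nad2": (0.143, 0.468),
    "nad3": (0.047, 0.326),
    "nad4": (0.010, 0.201),
}

def omega(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates (dN/dS)."""
    return dn / ds

ratios = {gene: omega(dn, ds) for gene, (dn, ds) in rates.items()}
for gene, w in ratios.items():
    print(f"{gene}: dN/dS = {w:.3f}")

# Ratios below 1 are the usual signature of purifying selection.
assert all(w < 1 for w in ratios.values())
```

Because the tabulated dN and dS values are rounded to three decimals, the recomputed ratios can differ slightly from the published dN/dS column.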
A Novel Mitochondrial Genetic Code in Calcinean Sponges
Conceptual translation of C. clathrus mitochondrial coding regions revealed multiple internal UAG “termination” codons (63 in total), suggesting a modified genetic code. Comparative sequence analysis using GenDecoder (Abascal et al. 2006) showed that 79% of C. clathrus UAG codons present at highly conserved positions (entropy <1.0 and gaps <20%) corresponded to tyrosine codons in other animals. In addition, this analysis uncovered an unexpected pattern of occurrence of CGN codons, preferentially used at positions specifying glycine in other animals (>85% codons at highly conserved sites) and not found at highly conserved arginine positions. Finally, C. clathrus UGA codons were mostly located at tryptophan positions in the alignments (∼88% codons at highly conserved sites), consistent with the use of this codon to specify tryptophan in animal mitochondria. We also analyzed patterns of codon usage in four mitochondrial genes from the L. chagosensis cDNA library (Philippe et al. 2009) with similar results: 72% of UAG codons and >80% of CGN codons at highly conserved sites corresponded to tyrosine and glycine, respectively. Because unusual codon usage was observed in both gDNA and cDNA sequences, we ruled out mRNA editing as an explanation for it. Furthermore, the fact that identical changes were found in two phylogenetically distant species of calcinean sponges (Rossi et al. 2011) indicates that they may be characteristic of the whole subclass. Together, these results suggested a novel mitochondrial genetic code in calcinean sponges with the reassignment of CGN codons from arginine to glycine, and the UAG codon from termination to tyrosine. The latter reassignment is supported by the presence of an encoded tRNA with a CUA anticodon in C. clathrus mtDNA as described in the following section.
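The logic of this comparative analysis can be sketched in a few lines of Python. This is not GenDecoder itself: the function names, the toy alignment, and the choice of Shannon entropy in bits are assumptions made for illustration. For each alignment column where the query uses a given codon, the sketch keeps only conserved reference columns (entropy <1.0, gaps <20%) and tallies the most common reference amino acid.

```python
import math
from collections import Counter

def column_entropy(residues):
    """Shannon entropy (bits, an assumption) of the non-gap residues in one column."""
    counts = Counter(r for r in residues if r != "-")
    total = sum(counts.values())
    if total == 0:
        return float("inf")  # an all-gap column is never treated as conserved
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def gap_fraction(residues):
    """Fraction of gap characters in one alignment column."""
    return residues.count("-") / len(residues)

def tally_codon_meaning(query_codons, reference_columns, codon,
                        max_entropy=1.0, max_gaps=0.20):
    """Tally the consensus reference residue at conserved columns where the query uses `codon`."""
    tally = Counter()
    for q_codon, column in zip(query_codons, reference_columns):
        if q_codon != codon:
            continue
        if column_entropy(column) >= max_entropy or gap_fraction(column) >= max_gaps:
            continue
        consensus = Counter(r for r in column if r != "-").most_common(1)[0][0]
        tally[consensus] += 1
    return tally

# Invented toy data: one query codon per aligned column of reference residues.
query_codons = ["AUG", "UAG", "CGA", "UAG", "UGA"]
reference_columns = ["MMMM", "YYYY", "GGGA", "YFYY", "WWW-"]

for codon in ("UAG", "CGA", "UGA"):
    print(codon, dict(tally_codon_meaning(query_codons, reference_columns, codon)))
```

In the toy data the UGA column is skipped because a quarter of its reference residues are gaps, which shows how the conservation filter removes ambiguous positions before codons are counted.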
Clathrina clathrus mt-tRNAs Include an Unusual tRNACUA and Show Evidence of Template-Dependent Editing
We identified 23 sequences in C. clathrus mtDNA that could be folded into characteristic tRNA secondary structure with a capacity to decode all codon families except for alanine GCN and tyrosine UAY. The set of potential tRNAs included an unusual tRNA with a CUA anticodon, inferred to be responsible for the translation of the reassigned UAG and possibly also UAY codons as tyrosine. The predicted structure of this tRNA supports its identification as a tyrosine tRNA, as it belongs to the so-called type II tRNAs, distinguished by a long variable arm, a group that is restricted to three families: leucine, serine, and tyrosine tRNAs (Normanly et al. 1992). However, the C. clathrus tyrosine tRNA has an unusual structure as it contains an extra nucleotide between the aminoacyl acceptor stem and the D-arm, which can form a Watson–Crick pair with nucleotide 26 (fig. 2B). In addition, the set of encoded tRNAs includes two with the CAU anticodon, inferred to represent two different tRNA species (supplementary fig. S3, Supplementary Material online) as in demosponges, homoscleromorphs, placozoans, and many nonmetazoan eukaryotes (Lavrov 2011).
All encoded tRNAs displayed well-conserved sequences of the anticodon, DHU (D-) and TΨC (T-) loops (fig. 2A), but had multiple mismatches in the aminoacyl acceptor stems, suggesting tRNA editing. The occurrence of editing has been experimentally confirmed for five tRNAs using the reverse transcriptase-PCR (RT-PCR) approach described in Price and Gray (1999) (fig. 2B and C). The edited tRNAs formed perfect Watson–Crick complementary paired T-arms and aminoacyl acceptor arms with the CCA sequence added at the 3′-end (Chen et al. 1992). The editing involved all four nucleotides and thus appears to be template dependent (but see Schürer et al. 2001).
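The idea of template-dependent repair of the acceptor stem can be illustrated with a short Python sketch. It is a deliberate simplification and not a model of the enzymatic mechanism: the sequences are invented, the function names are hypothetical, G–U wobble pairs are counted as mismatches, the whole 3′ strand is rebuilt rather than only the edited positions, and the discriminator nucleotide that normally precedes CCA is omitted.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def stem_mismatches(five_prime, three_prime):
    """Return 0-based positions along the 5' strand that are not Watson-Crick paired.

    Both strands are written 5'->3'. Position i of the 5' strand pairs with
    position len-1-i of the 3' strand because the strands are antiparallel.
    """
    assert len(five_prime) == len(three_prime)
    partner = three_prime[::-1]
    return [i for i, (a, b) in enumerate(zip(five_prime, partner))
            if WATSON_CRICK[a] != b]

def repair_three_prime(five_prime):
    """Rebuild the 3' strand as the reverse complement of the 5' strand (the template)."""
    return "".join(WATSON_CRICK[n] for n in reversed(five_prime))

# Invented 7-nt acceptor-stem strands, both written 5'->3'.
five = "GCGAUUA"
encoded = "UAAACAC"  # genomic 3' strand with two mismatches

print("mismatches before editing:", stem_mismatches(five, encoded))
repaired = repair_three_prime(five)
mature_three_prime = repaired + "CCA"  # CCA added at the 3' end
print("repaired 3' strand:", repaired, "-> with CCA:", mature_three_prime)
print("mismatches after editing:", stem_mismatches(five, repaired))
```

Because the repaired strand is derived from the 5′ side, any of the four nucleotides can be introduced, which is the property the text uses to infer template dependence.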
An additional interesting feature of the set of C. clathrus mt-tRNAs is the presence of a Y11–R24 pair in all of them. Although Y11–R24 is a conserved feature of most tRNAs, some have a characteristic R–Y pair at these positions. tRNAs with the R11–Y24 pair include prokaryotic and organellar initiator tRNA methionine (Marck and Grosjean 2002), animal mt-tRNA tryptophan (Wolstenholme 1992), and mt-tRNA proline in glass sponges, demosponges, and homoscleromorphs (Lavrov 2007). The exclusive occurrence of the Y11–R24 pair in all C. clathrus mt-tRNAs suggests concomitant changes at these positions in methionine, tryptophan, and proline tRNAs in this species.
SSU- and LSU-rRNA Are Encoded by Fragmented Genes Distributed Across Several Chromosomes
We used Infernal 1.0.2 (Nawrocki et al. 2009) and BLAST (Altschul et al. 1990) to identify genes for the large and small subunit rRNA (LSU–rRNA and SSU–rRNA, respectively) in the mitochondrial sequences of C. clathrus. Both rRNA genes were discontinuous and located on several chromosomes (fig. 1), but when pieced together, encoded well-conserved rRNA structures (fig. 3). The gene for SSU–rRNA (rns) was split into three fragments located on chromosomes 1 and 2 (fig. 3A). The gene for LSU–rRNA (rnl) was split into at least six fragments located on three different chromosomes. The 5′ half of LSU–rRNA (helices 1–60) was encoded by three genomic regions on chromosome 2 in a nonsequential order (fig. 1). Because the 5′ half is a less conserved part of the molecule, additional unidentified regions encoding this part may be present. The better conserved 3′-end of the molecule was also split into three fragments: two encoded on chromosome 1 and separated by trnP(ugg) and trnK(uuu), and one encoded on chromosome 6 (figs. 1 and 3B). We used RT-PCR to confirm that the transcribed fragments are not spliced together (amplifications were observed within but not between individual fragments) and to confirm the expression of trnP(ugg) located between rnl_4 and rnl_5, as described in the previous section. Interestingly, despite their unusual organization, the rate of sequence evolution in C. clathrus mitochondrial rRNAs appeared to be similar to that in other animals, in contrast to protein genes, where it was highly accelerated (supplementary fig. S4, Supplementary Material online).
Noncoding Regions and Terminal Repeats
Each mitochondrial chromosome of C. clathrus contained a large terminal noncoding region (hereafter referred to as noncoding region 1 or nc1) that comprised about half of its length (3,593–5,210 bp). In addition, a second large noncoding region (noncoding region 2 or nc2) was present at the other end of the chromosomes and varied from 909 to 1,671 bp in size. The sequences of the nc1 showed 62–79% sequence identity among chromosomes (fig. 4A), with the distal part being more conserved (fig. 4C). In particular, the terminal sequence of approximately 1,000 bp in the nc1 was identical among all chromosomes. This terminal sequence contained five direct repeats and had a potential to form multiple closely spaced hairpin structures (fig. 4E).
The sequence of nc2 was very similar among chromosomes 2, 4, and 5 (97–98% sequence identity) and, to a lesser extent, chromosome 6 (∼90% identity with the other three chromosomes) (fig. 4B and D). By contrast, the nc2 sequence of chromosome 1 showed on average only ∼73% overall identity, and that of chromosome 3 only 61% overall identity, with the corresponding regions on the other chromosomes (fig. 4B and D). However, on all chromosomes, there was a strong drop in the number of guanosines in the terminal part of nc2 (this terminal part was preceded by a dinucleotide [TC] repeat in four of the chromosomes). Very few sequences ⩾10 bp in length were shared between the two terminal noncoding regions in any of the chromosomes (supplementary fig. S5, Supplementary Material online).
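One simple way to look for sequences of at least 10 bp shared between two regions is to intersect their sets of 10-mers. The Python sketch below illustrates that idea only; it is not the procedure used for supplementary fig. S5, the function name and input sequences are invented, and it compares the same strand without considering reverse complements.

```python
def shared_kmers(seq_a, seq_b, k=10):
    """Return the set of length-k substrings present in both sequences (same strand only)."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b

# Invented toy sequences standing in for the two terminal noncoding regions.
region_1 = "AAAAACCCCCGGGGG"
region_2 = "CCCCCGGGGGTTTTT"

common = shared_kmers(region_1, region_2)
print(sorted(common))
```

Any exact match longer than k shows up as a run of overlapping k-mers, so the size of the returned set gives a rough measure of how much sequence the two regions share.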
Phylogenetic Analysis of Mitochondrial Protein Sequences Does Not Resolve the Phylogenetic Position of Calcareous Sponges
We used maximum likelihood (ML) and Bayesian (BI) methods to explore the phylogenetic position of calcareous sponges based on mitochondrial sequence data (fig. 5 and supplementary fig. S6, Supplementary Material online). Representatives of all phyla of nonbilaterian animals except Ctenophora as well as of major phyla of bilaterian animals and several outgroups were included in the analysis. The two available mitochondrial genomes of ctenophores (Pett et al. 2011; Kohn et al. 2012) were not included in the analysis because they have been shown to form extremely long branches on phylogenetic trees that are likely to distort the placement of other long branches due to the long branch attraction (LBA) artifact (Felsenstein 1978).
Most of the reconstructed phylogenetic relationships corresponded closely to the current view on animal phylogeny, including the relationships among bilaterian animals and those within major lineages of nonbilaterian animals. However, several unconventional relationships have been recovered, consistent with the earlier phylogenetic studies based on mtDNA data (Lavrov et al. 2005; Haen et al. 2007; Lavrov 2011). The phylogenetic position of calcareous sponges was unstable in our analyses, and was influenced by the type of the analysis, the model of sequence evolution used, and the inclusion/exclusion of the partial sequence from L. chagosensis. In general, the PhyloBayes analyses with the CAT or CAT + GTR model of sequence evolution placed calcareous sponges either in a group with demosponges and homoscleromorphs, or in a polytomy with other nonbilaterian animals (fig. 5). By contrast, ML analyses using a site-homogeneous model of amino acid substitutions produced phylogenies with fast evolving sponges (Calcarea and Hexactinellida) grouping with bilaterian animals (supplementary fig. S6, Supplementary Material online). Importantly, rates of sequence evolution were markedly different among individual lineages, making the results susceptible to the LBA artifact. In addition, we found large deviations in amino acid composition of mitochondrial proteins in bilaterian animals and glass sponges compared with the rest of the animals and outgroups (fig. 5C) that might violate the stationarity assumption made by most models of sequence evolution (Lockhart et al. 1994; Gowri-Shankar and Rattray 2006). The use of a site and time-heterogeneous model (CAT–BP) of amino acid replacement (Blanquart and Lartillot 2008) was not able to overcome this bias and produced results similar to the CAT and CAT-GTR models (supplementary fig. S6, Supplementary Material online). 
Finally, the inferred topology changed little when we eliminated various percentages of sites with the highest rates of sequence evolution from the alignment as described in Goremykin et al. (2010) and Derelle and Lang (2012) (data not shown).
Discussion
A Unique Combination of Unusual Features in C. clathrus mtDNA
Early sampling of animal mitochondrial genomes was biased by two different factors: a prevailing scientific interest in some (mostly vertebrate) taxa and technical difficulties associated with others. Although the invention of PCR and design of Metazoa-specific mtDNA primers allowed a broader sampling of animal mt-genomes, unusual mtDNA architecture and/or rapid evolution remained an insurmountable obstacle for some taxa. The advent of next-generation sequencing technology overcame these technical limitations and allowed the sequencing of several highly unusual mt-genomes (Shao et al. 2009; Pett et al. 2011; Smith et al. 2012). The mitochondrial genome of the calcareous sponge C. clathrus characterized in this study is among the most unusual in sponges and, arguably, among all animals. In this section, we discuss four of its outstanding features: a multipartite and linear genome architecture, a novel genetic code, edited tRNAs, and fragmented rRNAs.
Although it is common to depict animal mtDNA as a single, small, circular molecule, it has been known for more than 20 years that Medusozoa, one of the two major lineages in the phylum Cnidaria, have linear and, in some cases, multipartite genomes (Bridge et al. 1992). Recent studies have characterized several of these genomes and showed that they are composed of 1–8 linear chromosomes (Shao et al. 2006; Kayal and Lavrov 2008; Voigt et al. 2008; Smith et al. 2012; Kayal et al. 2012). In addition, studies of several isopod species have shown that their mtDNA is composed of a combination of ∼14 kb linear monomers and approximately 28 kb head-to-head dimers (Raimond et al. 1999; Marcadé et al. 2007; Doublet et al. 2012). This study revealed that another independent origin of linear genome organization in animal mtDNA occurred within calcareous sponges.
Linear mtDNA is also found in close unicellular relatives of animals: Amoebidium parasiticum (Burger, Forget, et al. 2003) and, potentially, Capsaspora owczarzaki and Ministeria vibrans (D.V. Lavrov and B.F. Lang, unpublished data). It is also widespread in plants and fungi, where so-called “polydisperse linear DNA” made of linear concatemers of various sizes appears to be the norm (Nosek et al. 1998; Valach et al. 2011). However, a multipartite linear organization is relatively rare in these groups, with the notable exception of the A. parasiticum genome, consisting of several hundred linear chromosomes (Burger, Forget, et al. 2003). By contrast, a multipartite genome organization evolved at least twice among linear mt-genomes in Cnidaria: in the common ancestor of Cubozoa and within Hydrozoa (Voigt et al. 2008; Kayal et al. 2012). Linear and multipartite genome organization requires special mechanisms for the maintenance and transmission of mtDNA (reviewed by Nosek et al. 1998). Based on the presence of two DNA bands in Southern blot analysis, we propose that C. clathrus mitochondrial chromosomes may resemble those found in Paramecium aurelia, where approximately 10% of the molecules are present as linear duplexes of dimer length and where replication of mtDNA is initiated by a cross-linking of the duplex strands at one end of the molecule (Pritchard and Cummings 1981). The mechanism of concerted transmission of all six chromosomes to the daughter mitochondria is unknown. However, a similar problem has been solved several times in organisms that have multichromosomal mitochondrial genomes (Burger, Gray, et al. 2003), including extreme cases of thousands of circular and linear chromosomes (Burger, Forget, et al. 2003; Liu et al. 2005).
The second unusual feature of calcinean sponge mitochondria is a novel genetic code that includes unique reassignments of the UAG codons from termination to tyrosine and of the CGN codon family from arginine to glycine (Calcinean mitochondrial genetic code). Although CGN codons were previously identified as “mutable in their meaning” (Soll and RajBhandary 2006) and may not code for an amino acid in yeast mitochondria (Clark-Walker et al. 1985), this is the first report of their actual reassignment. Similarly, the reassignment of the UAG codon to tyrosine is novel, although changes of the UAG codon to leucine/alanine and glutamine have been found in mtDNA of green plants and nuclear genomes of ciliates, respectively (Knight et al. 2001). Interestingly, a convergent change to tyrosine has been recently inferred for another (UAA) termination codon in a nematode (Jacob et al. 2009).
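The effect of these reassignments on conceptual translation can be shown with a small Python sketch. It assumes, for illustration only, that the calcinean code differs from the standard genetic code solely by the changes named in the text (UAG to tyrosine, CGN to glycine, and UGA read as tryptophan); any other differences are not modeled, and the example mRNA is invented.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (first, second, third base); '*' marks termination.
STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
standard = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Assumed calcinean table: standard code plus only the changes described in the text.
calcinean = dict(standard)
calcinean["UAG"] = "Y"           # termination -> tyrosine
calcinean["UGA"] = "W"           # read as tryptophan, as in animal mitochondria
for third in BASES:
    calcinean["CG" + third] = "G"  # arginine -> glycine

def translate(mrna, code):
    """Translate codon by codon, stopping at the first termination codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

example = "AUGUAGCGAUGAUAA"  # invented sequence: AUG UAG CGA UGA UAA
print(translate(example, standard))   # M
print(translate(example, calcinean))  # MYGW
```

Under the standard code the reading frame ends at the first UAG, which mirrors the internal "termination" codons seen when C. clathrus genes are translated conceptually without the reassignment.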
The third interesting feature of calcinean mitochondrial genomes is fragmented and discontinuous rRNA genes. Fragmented rRNA genes are highly unusual in animal mtDNA and so far have been found only in the oysters Crassostrea gigas, C. hongkongensis, and C. virginica, where LSU–rRNA is encoded in two pieces (Milbury and Gaffney 2005; Milbury et al. 2010), and in Placozoa, where LSU–rRNA is encoded in two or three pieces (Signorovitch et al. 2007; Burger et al. 2009). Outside of Metazoa, fragmented mitochondrial rRNAs have been described in bacteria, green algae, ciliates, and some other protist lineages (Schnare et al. 1986; Heinonen et al. 1987; Boer and Gray 1988; Denovan-Wright and Lee 1995; Nedelcu 1997; Evguenieva-Hackenberg 2005) as well as monoblepharidalean fungi (Forget et al. 2002; Bullerwell et al. 2003). Extreme rRNA segmentation has been documented for mitochondrial rRNA genes of apicomplexans and dinoflagellates (Feagin et al. 1997; Jackson et al. 2007, 2012; Waller and Jackson 2009). As is also the case in other organisms (Gillespie et al. 1999), there was no obvious RNA processing signal at the inferred ends of C. clathrus mt rRNA fragments, and the mechanisms of their processing and assembly remain unknown.
Finally, mitochondrial sequences of calcareous sponges encode tRNA molecules with multiple mismatches in the aminoacyl acceptor stem, which undergo posttranscriptional RNA editing. Several different types of mitochondrial tRNA editing have been reported to date, including C-to-U editing in mitochondria of marsupials (Janke and Pääbo 1993; Borner et al. 1996), trypanosomes (Alfonzo et al. 1999), and plants (Maréchal-Drouard et al. 1993; Fey et al. 2002); insertion editing in slime molds (Antes et al. 1998); 5′-end editing in amoebozoans (Lonergan and Gray 1993a, 1993b) and some fungi (Laforest et al. 1997); or a combination of these (Gott et al. 2010). The 3′-end editing similar to that found in calcareous sponges has been found in some bilaterian animals (Yokobori and Pääbo 1995; Tomita et al. 1996; Yokobori and Pääbo 1997; Reichert et al. 1998; Lavrov et al. 2000; Segovia et al. 2011) and a jakobid flagellate (Leigh and Lang 2004). Further studies are needed to determine whether the tRNA editing in C. clathrus is a recent acquisition that is specific to this species or group of species, as is typical for animals (Brennicke et al. 1999), or whether it is a common feature of all calcinean/calcareous sponges. In either case, calcareous sponges represent yet another independent acquisition of 3′-end tRNA editing in Metazoa.
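The outcome of such 3′-end editing can be illustrated with a small Python sketch that compares the two strands of an acceptor stem, lists the positions on the 3′ strand that fail to pair, and rewrites those positions so that they complement the 5′ strand. This models only the end result; it says nothing about the enzymatic mechanism. The function names, the treatment of G-U wobble pairs as paired, and the example sequences are assumptions made for illustration and are not taken from the C. clathrus data.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}
WOBBLE = {("G", "U"), ("U", "G")}  # assumption: wobble pairs count as paired


def is_paired(a, b):
    """True for Watson-Crick or G-U wobble pairs."""
    return WATSON_CRICK[a] == b or (a, b) in WOBBLE


def acceptor_mismatches(five_prime, three_prime):
    """Return 0-based positions on the 3' strand that do not pair.

    Both strands are written 5'->3'; the stem is antiparallel, so the first
    base of the 5' strand faces the last base of the 3' strand.
    """
    if len(five_prime) != len(three_prime):
        raise ValueError("stem strands must have equal length")
    n = len(five_prime)
    return sorted(
        n - 1 - i
        for i in range(n)
        if not is_paired(five_prime[i], three_prime[n - 1 - i])
    )


def repair_three_prime(five_prime, three_prime):
    """Replace unpaired 3'-strand bases with the complement of the 5' strand."""
    n = len(three_prime)
    edited = list(three_prime)
    for j in acceptor_mismatches(five_prime, three_prime):
        edited[j] = WATSON_CRICK[five_prime[n - 1 - j]]
    return "".join(edited)


# Hypothetical 7-bp stem with two mismatches at the 3' end of the 3' strand.
encoded_5p, encoded_3p = "GCGGAUU", "AAUCCAA"
print(acceptor_mismatches(encoded_5p, encoded_3p))  # [5, 6]
print(repair_three_prime(encoded_5p, encoded_3p))   # AAUCCGC
```

Comparing genomic sequences with cloned cDNA sequences in this way would flag the positions at which the mature tRNA differs from its gene.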
The combination of the four unusual mitochondrial features listed above makes C. clathrus mtDNA among the most unusual in animals. Elucidation of its structure adds a new dimension to the observed diversity of animal mtDNA by showing that all four classes of sponges (Demospongiae, Hexactinellida, Homoscleromorpha, and Calcarea) have distinct modes and tempos of mitochondrial genome evolution. Furthermore, our previous studies of demosponges and homoscleromorphs (Wang and Lavrov 2008; Gazave et al. 2010), as well as our preliminary mitochondrial data from Sycon ciliatum and Petrobiona massiliana, two calcareous sponges belonging to the second calcarean subclass Calcaronea, reveal additional variation in mt genome organization within classes of sponges.
Conclusion
Because of their phylogenetic position as the sister group to the rest of the animals as well as their ancient origin and early diversification, sponges can be expected to harbor a large molecular and genomic diversity, most of which is only starting to be explored. The variation in mitochondrial genome architecture among the four classes of sponges is, in our view, a reflection of this diversity and a telltale sign of sponge nuclear genome diversity. A highly unusual mitochondrial genome of C. clathrus with multiple linear chromosomes, several changes in the genetic code, fragmented rRNA genes, tRNA editing, and high rate of sequence evolution pushes the boundaries of what is possible in mtDNA evolution in animals and warrants further studies of molecular mechanisms underlying these changes. Although high rates of sequence evolution in calcareous sponge mtDNA prevented us from resolving their phylogenetic position, they make calcarean mtDNA a promising marker for population studies in this group.
Materials and Methods
DNA Extraction and Sequencing
Specimens of C. clathrus were collected by Michael Nickel off Isola del Giglio (Italy) in 2004 and by DVL at the Station marine d'Endoume, Marseille (France) in 2008, and preserved in 8M guanidine hydrochloride solution. Total DNA from both specimens was prepared by phenol–chloroform extraction following proteinase K digestion (Saghai-Maroof et al. 1984). We used two different approaches to determine mtDNA sequences from these specimens.
A PCR-based approach (Burger et al. 2007) was used for the first specimen. Regions of cob, cox1, and rnl were amplified and sequenced with conserved primers developed for animal mtDNA (Burger et al. 2007). Sequences were extended using a modified step-out protocol developed in our lab (Burger et al. 2007) and reamplified by long PCR with the TAKARA LA-PCR kit. Long PCR products were combined in equimolar concentration, sheared into pieces 1–2 kb in size, end-repaired, and cloned using the TOPO Shotgun Subcloning Kit from Invitrogen. White colonies containing inserts were collected, grown overnight in 96-well blocks, and submitted to the DNA Sequencing and Synthesis Facility of the ISU Office of Biotechnology for high-throughput plasmid preparation and sequencing.
A total DNA sequencing approach was used for the second specimen. Paired 100 bp reads of the total genomic DNA were generated with Illumina sequencing by synthesis chemistry. mtDNA sequence was assembled directly from these reads (supplementary fig. S1, Supplementary Materials online). Repeated sequences in the terminal regions were PCR-amplified using primers designed at the ends of the longest contigs and sequenced using Sanger sequencing. In addition to mtDNA, both 18S and 28S rRNA gene sequences were assembled for this specimen of C. clathrus and used to confirm its identification by comparison with orthologs from other calcareous sponges (supplementary fig. S2, Supplementary Materials online). Both conventional (Sanger) and Illumina (Solexa) sequencing were conducted at the Iowa State University DNA Facility on an Applied Biosystems 3730xl DNA Analyzer and Illumina HiSeq 2000, respectively. Sanger sequences were assembled with the STADEN software suite (Staden 1996). Illumina sequences were assembled with ABySS using a k-mer size of 82 (Simpson et al. 2009). All assemblies were submitted to NCBI’s GenBank (accessions JX978466–JX978471 and JX996194–JX9961946).
Southern Blot Analysis
Southern hybridization was performed following a modified version of the protocol described in Sambrook et al. (1989) on a total DNA preparation from the specimen of C. clathrus collected off Isola del Giglio (Italy). Partial sequences of three genes, cob (chromosome 4), cox1 (chromosome 5), and rnl_5 (chromosome 1), were independently amplified with specific primers and radiolabeled by random priming. Three samples of total DNA (∼1 µg each) and approximately 0.1 ng of each PCR product were run on a 0.8% agarose gel and capillary transferred to a Hybond-XL nylon membrane (GE Healthcare). After treatment, the membrane was cut into three pieces, one per PCR product, which were exposed separately but simultaneously to the corresponding radiolabeled probes during an overnight incubation at 60°C. Hybridization signals were visualized by exposing X-ray film to the membranes for 40 h.
RNA Extraction and cDNA Synthesis
Total RNA was extracted from the specimen of C. clathrus collected at the Station marine d'Endoume using the TRIZOL Reagent (Invitrogen) and treated with DNase. An aliquot of total RNA was circularized using T4 RNA ligase (Fermentas). tRNA, partial rRNA (both rns and rnl), and mRNA (cox1) sequences were amplified by RT-PCR, with primers designed based on mtDNA sequences (supplementary table S1, Supplementary Material online). Reverse transcription was performed using SuperScript III Reverse Transcriptase (Invitrogen) under conditions described by the manufacturer. PCR amplification of cDNA products was performed with recombinant Taq DNA polymerase (Invitrogen). PCR products for tRNAs were cloned using the TA Cloning Kit from Invitrogen, while PCR amplicons for the mRNA and rRNAs were directly sequenced. At least eight colonies were analyzed per tRNA gene.
Partial mt Sequences of L. chagosensis
EST sequences of one specimen of L. chagosensis sampled at North Stradbroke Island, Australia, were available from a previous study (Philippe et al. 2009). Mitochondrial protein-coding genes were identified from the EST pool via BLAST searches (Altschul et al. 1990). Leucetta-specific primers were designed for cob, cox1, cox3, and nad1 (supplementary table S1, Supplementary Material online) and used to PCR-amplify and sequence these genes from another specimen of L. chagosensis sampled in the Gulf of Aqaba, Egypt.
Gene Annotation, RNA Folding, and Genome Description
Open reading frames in the nucleotide sequences were first characterized using the minimally derived code (with TGA = Trp). Further deviations from this genetic code were discovered with the GenDecoder v.1.6 program (Abascal et al. 2006). tRNA genes were identified by tRNAscan-SE (Lowe and Eddy 1997); protein genes were recognized by similarity searches in local databases using the FASTA program (Pearson 1994), and in GenBank at NCBI using the BLAST network service (Benson et al. 2003). rRNA genes were characterized by the cmsearch program in the Infernal 1.0.2 package (Nawrocki et al. 2009) based on manually curated alignments of nonbilaterian mitochondrial rRNA sequences, and by BLAST searches (Altschul et al. 1990) against other mitochondrial sequences. The secondary structures of these genes were adjusted manually according to the patterns of conservation in the three phylogenetic domains + chloroplasts + mitochondria compiled by the Gutell lab at the Comparative RNA Web Site (http://www.rna.icmb.utexas.edu, last accessed 18 December 2012) and drawn with RNAviz (De Rijk et al. 2003).
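The first step above, scanning for open reading frames when TGA is read as Trp rather than as a stop, can be sketched as follows. This is a simplified illustration that covers the three forward frames only and treats any stretch between stop codons as an open frame; the function name, the `min_codons` threshold, and the example sequence are hypothetical, and the sketch does not reproduce the annotation tools cited in the text.

```python
def open_reading_frames(dna, stops=("TAA", "TAG"), min_codons=3):
    """Return (start, end) spans of stop-free stretches in the forward frames.

    `end` is exclusive and excludes the terminating stop codon. With the
    default `stops`, TGA is treated as a sense codon (TGA = Trp).
    """
    dna = dna.upper()
    spans = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(dna) - 2, 3):
            if dna[pos:pos + 3] in stops:
                if (pos - start) // 3 >= min_codons:
                    spans.append((start, pos))
                start = pos + 3
        # Stretch running to the end of the sequence without a stop codon.
        end = frame + ((len(dna) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            spans.append((start, end))
    return spans


seq = "ATGTGAAAATAACCC"  # hypothetical: ATG TGA AAA TAA CCC
print(open_reading_frames(seq))
print(open_reading_frames(seq, stops=("TAA", "TAG", "TGA")))
```

In the example, the frame-0 stretch ATG TGA AAA is reported only when TGA is not treated as a stop, which is why the choice of the initial code matters for gene finding.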
Phylogenetic Inference
Two data sets were created from mitochondrial sequences of C. clathrus and representatives of major lineages of opisthokonts (supplementary table S2, Supplementary Materials online) with and without partial sequences of L. chagosensis. Inferred amino acid sequences of individual mitochondrial proteins were aligned with Mafft v6.861b (Katoh and Toh 2008). Conserved blocks within the alignments were selected with Gblocks 0.91b (Castresana 2000) using relaxed parameters (parameters 1 and 2 = ½, parameter 3 = 8, parameter 4 = 5, all gap positions in parameter 5). Cleaned alignments were concatenated in two data sets 2,492 and 2,473 amino acids in length that encompassed 53 and 54 species, respectively. Additional alignments were created by removing 10–50% of the positions with the highest rate of evolution (as estimated by PAML [Yang 2007]) from the data set of 53 species. Finally, a reduced alignment consisting of 39 species and 2,492 amino acid positions was created and used for phylogenetic analysis with a nonhomogeneous model of amino acid substitution.
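The removal of the fastest-evolving positions can be illustrated with a short Python sketch. It assumes that per-site rates are already available as a list with one value per alignment column (in the study these were estimated with PAML); the function name, the rounding rule for the number of columns to drop, and the toy alignment are assumptions made for illustration.

```python
def drop_fastest_sites(alignment, site_rates, fraction):
    """Remove the given fraction of columns with the highest rates.

    alignment  -- dict mapping sequence name to aligned sequence
    site_rates -- one rate per alignment column
    fraction   -- proportion of columns to remove, for example 0.1 to 0.5
    """
    n_sites = len(site_rates)
    if any(len(seq) != n_sites for seq in alignment.values()):
        raise ValueError("every sequence must have one rate per column")
    n_drop = round(n_sites * fraction)  # assumption: round to nearest column
    ranked = sorted(range(n_sites), key=lambda i: site_rates[i], reverse=True)
    dropped = set(ranked[:n_drop])
    keep = [i for i in range(n_sites) if i not in dropped]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment with hypothetical rates; columns 1 and 3 are the fastest.
toy = {"species_a": "MKLVA", "species_b": "MRLIA"}
rates = [0.1, 2.5, 0.3, 1.8, 0.2]
print(drop_fastest_sites(toy, rates, 0.4))
# {'species_a': 'MLA', 'species_b': 'MLA'}
```

Calling the function with fractions from 0.1 to 0.5 yields a series of progressively more conservative alignments of the kind described above.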
Bayesian inferences were performed with the CAT + Γ4 and CAT + GTR + Γ4 mixture models implemented in the program PhyloBayes version 3.2f (Lartillot et al. 2009) and the CATBP model implemented in nhPhyloBayes v.0.2.1 (Blanquart and Lartillot 2008). PhyloBayes analyses consisted of four chains over 15,000 generations using the CAT + GTR + Γ model (maxdiff < 0.1), and four chains over 65,000 generations using the CAT + Γ model (maxdiff < 0.15). The chains were sampled every 10th tree after the first 100 burn-in cycles. The nhPhyloBayes analysis consisted of two chains over 5,000 generations using the CATBP + GTR + Γ model, sampled every 10th tree after the first 500 burn-in cycles. ML analyses were performed in RAxML version 7.3 (Stamatakis 2006), using “the easy and fast way” (100 bootstrap replicates followed by an ML search). We used the PROTGAMMALGF model (Le and Gascuel 2008) for ML searches, which is the best-fitting model for the concatenated matrix according to the Akaike information criterion (computed with ProtTest 3.2; Darriba et al. 2011). Site rates were calculated using CODEML (Yang 2007) under the same model, and positions with the highest substitution rates were sequentially eliminated from the alignment.
The correspondence analysis was performed using the R package ca (Greenacre and Nenadic 2007). For all species used in phylogenetic analysis, we determined the total amino acid usage, obtaining a matrix where the rows represent the species and the 20 columns are the respective amino acid frequencies.
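The input matrix for such an analysis can be assembled as in the following Python sketch, which counts the 20 standard amino acids in the concatenated protein sequences of each species and converts the counts to frequencies. The function name, the decision to ignore gaps and ambiguous characters, and the toy sequences are assumptions made for illustration; the correspondence analysis itself was carried out in R and is not reproduced here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # column order of the matrix


def usage_matrix(proteomes):
    """Return (species, matrix) with one row of 20 frequencies per species.

    proteomes -- dict mapping species name to its concatenated protein
                 sequence; characters outside the 20 standard amino acids
                 (gaps, X, stops) are ignored.
    """
    species = sorted(proteomes)
    matrix = []
    for name in species:
        counts = Counter(aa for aa in proteomes[name].upper() if aa in AMINO_ACIDS)
        total = sum(counts.values())
        matrix.append([counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS])
    return species, matrix


# Toy input with hypothetical sequences.
names, freqs = usage_matrix({"sp1": "AAGG", "sp2": "W-XW"})
print(names)                     # ['sp1', 'sp2']
print(freqs[0][0], freqs[0][5])  # 0.5 0.5  (Ala and Gly in sp1)
print(freqs[1][18])              # 1.0      (Trp in sp2)
```

Each row sums to one, so species with different total sequence lengths can be compared directly.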
Acknowledgments
This work was supported by the National Science Foundation grant (no. DEB-0828783) to D.V.L.; the National Sciences and Engineering Research Council of Canada grant (no. NSERC 194560-2011) to B.F.L.; and the German Research Foundation (DFG) projects WO896/3 and WO896/6 within the SPP1174 “Deep Metazoan Phylogeny” to G.W. The authors thank Michael Nickel and Dan Jackson for specimens of Clathrina clathrus and Leucetta chagosensis, respectively, and Katherine Wilson and Katrina Lutap for technical assistance. They also thank Michelle Klautau, two anonymous reviewers, and the associate editor for their comments on an earlier version of this manuscript.
References
Author notes
†Present address: Department of Invertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC
Associate editor: Todd Oakley