Novel gene re-arrangement in the mitochondrial genome of Pisidiaserratifrons (Anomura, Galatheoidea, Porcellanidae) and phylogenetic associations in Anomura

Abstract To improve the taxonomy and systematics of Porcellanidae within the evolution of Anomura, we describe the complete mitochondrial genomes (mitogenomes) sequence of Pisidiaserratifrons, which is 15,344 bp in size, contains the entire set of 37 genes and has an AT-rich region. Compared with the pancrustacean ground pattern, at least five gene clusters (or genes) are significantly different with the typical genes, involving eleven tRNA genes and four PCGs and the tandem duplication/random loss and recombination models were used to explain the observed large-scale gene re-arrangements. The phylogenetic results showed that all Porcellanidae species clustered together as a group with well nodal support. Most Anomura superfamilies were found to be monophyletic, except Paguroidea. Divergence time estimation implies that the age of Anomura is over 225 MYA, dating back to at least the late Triassic. Most of the extant superfamilies and families arose during the late Cretaceous to early Tertiary. In general, the results obtained in this study will contribute to a better understanding of gene re-arrangements in Porcellanidae mitogenomes and provide new insights into the phylogeny of Anomura.


Introduction
The infraorder Anomura is a highly diverse group of decapod crustaceans, including seven superfamilies, 20 families, 335 genera and more than 2500 species in total, some of the king crab and squat lobster being economically important (Dawson 1989, Lovrich 1997, Poore et al. 2011).However, the phylogenetic relationships within Anomura remain controversial.Earlier, based on adult morphological characteristics, classifications often differed in high-level composition (Baba 2008).Recently, more and more molecular and morphological data have been used to reconstruct the phylogeny of Anomura (Schnabel and Ahyong 2010, Kim et al. 2013, Gong et al. 2019).Although the monophyly of Anomura is widely accepted, phylogenetic relationships at high taxonomic levels remain unresolved, is dynamic and under continuous debate.Initially, the superfamily Galatheoidea was divided into seven families (i.e.Galatheidae, Munididae, Munidopsidae, Porcellanidae, Aeglidae, Chirostylidae and Kiwaidae) (Macpherson et al. 2005, Schnabel et al. 2011).Subsequently, Chirostylidae and Kiwaidae were removed to superfamily Chirostylidea, while Aeglidae was removed to Aegloidea (Pérez-Losada et al. 2002, McLaughlin et al. 2007).The current classification scheme divides Galatheoidea into Galatheidae (squat lobsters), Munididae, Munidopsidae and Porcellanidae (porcelain crabs) (Ahyong et al. 2010).So far, the phylogenetic relationship of some families in Anomura is still unclear.Therefore, data from sufficient species are required for a comprehensive phylogenetic analysis of the infraorder Anomura.
The porcelain crab (Pisidia serratifrons) is one of the marine crustaceans that live in shallow waters less than 200 metres, with various habitats, which belong to the subphylum Crustacea, order Decapoda, infraorder Anomura, family Porcellanidae, genus Pisidia (Kim and Ko 2011).P. serratifrons is mainly distributed in the southern Korea, southern Japan and the southeast coastal region of China (Morton 1997, Qing et al. 2016).So far, most studies of this species have focused on morphology and growth (Morton 1997, Kim andKo 2011).
The mitogenome of metazoans is usually 14-20 kb in size and encoded with a set of 37 genes, including 13 protein-coding genes (PCGs) (cox1-3, cob, nad1-6, nad4L, atp6 and atp8), two ribosomal RNA genes (rrns and rrnl), 22 transport RNA genes (tRNAs) and an AT-rich region (also called D-loop region, CR) which contains some initiation sites for transcription and replication of the genome (Smith andSmith 2002, Sato andSato 2013).Due to its haploid properties, matrilineal inheritance and rapid evolutionary rate, the mitogenome is increasingly being used in re-arrangement trends and phylogenetic analyses.With the rapid development of sequencing technology, more and more complete mitogenome sequences have been used in comparative genomics, molecular evolution and phylogeny (Tan et al. 2019).
Gene re-arrangements in the Anomura mitogenomes are relatively common (Arndt andSmith 1998, Hickerson andCunningham 2000).As the sequence of animal mitogenomes remains stable over a long period of time and a complex shared derivative gene sequence is unlikely to appear independently in different pedigrees, gene re-arrangements can be used as an indicator to clarify the interspecific relationship.So far, several hypotheses have been suggested to help explain gene re-arrangements in animal mitogenomes.The recombination model and tandem duplication/random loss (TDRL) model are more commonly accepted.Recombination models are involved in the breaking and reconnecting of DNA strands.The TDRL model assumes that the re-arranged gene order occurs via tandem duplications followed by random deletion of certain duplications (Moritz and Brown 1987).This model has been widely used to explain the translocation of genes encoded on the same strand (Posada and Crandall 1998).
In this study, we successfully sequenced the complete mitogenome of P. serratifrons.In addition, the gene structure and gene re-arrangement of the mitogenome of P. serratifrons have been reported and a phylogenetic analysis of 31 Anomura species has been conducted, based on the nucleotide sequences of 13 PCGs.Based on the similarities and differences of the gene re-arrangement order in the mitogenome, the possible rearrangement process was discussed in order to have a better understanding of the rearrangement events and evolutionary mechanisms of the Anomura mitogenome.

Sampling and DNA extraction
A specimen of P. serratifrons was collected from Zhoushan, Zhejiang Province, China (29°98′30N, 122°96′99″E).The specimen was immediately preserved in absolute ethanol after collection and then stored at −20°C.This specimen was identified by morphology and fresh tissues were dissected from the operculum and preserved in absolute ethanol before DNA extraction.The total genomic DNA was extracted using the salt-extraction procedure (Aljanabi and Martinez 1997) with a slight modification and stored at −20°C.

Genome sequencing, assembly and annotation
The mitogenomes of P. serratifrons was sequenced by Origin gene Co. Ltd., Shanghai, China and was sequenced on the Illumina HiSeq X Ten platform.HiSeq X Ten libraries with an insert size of 300-500 bp were generated from the genomic DNA.About 10 Gb of raw data were generated for each library.Low-quality reads, adapters and sequences with high "N" ratios and length less than 25 bp were removed.The clean reads were assembled using the software NOVOPlasty (Dierckxsens et al. 2017) (https://github.com/ndierckx/NOVOPlasty) and annotated and manually corrected on the basis of the complete mitogenome sets, assembled de novo by using MITOS tools (Bernt et al. 2012) (MITOS Web Server (uni-leipzig.de)).To confirm the correct sequences, we compared the assembled mitochondrial genes with those of other Porcellanidae species and identified the mitogenomic sequences by checking the cox1 barcode sequence with NCBI BLAST (Altschul et al. 1997).The abnormal start and stop codons were determined by comparing them with the start and stop codons of other marine crustacea.Then, the reads were reconstructed using the de novo assembly programme.The complete mtDNA was annotated using the software Sequin version 16.0.The mitogenome map of the P. serratifrons was drawn using the online tool Poksee ( https://proksee.ca)(Grant and Stothard 2008).The secondary structures predicted of the tRNA genes were plotted by using MITOS Web Server.The relative synonymous codon usage (RSCU) values and Substitution saturation for the 13 PCGs were calculated by DAMBE 5 and analysed with MEGA 7 (Kumar et al. 2016).The GC-skews and AT-skews were used to determine the base compositional difference and strand asymmetry amongst the samples.

Phylogenetic analysis
The phylogenetic relationship within Anomura was reconstructed using the sequences of the 13 PCGs of a total of 34 complete mitogenome sequences downloaded from the GenBank database (https://www.ncbi.nlm.nih.gov/genbank/) and adding two species of Ocypodea to serve as the outgroup (Suppl.material 2).The phylogenetic relationships were analysed with Maximum Likelihood (ML) by using IQ-TREE 1.6.2 and Bayesian Inference (BI) methods in MrBayes 3.2 version programme (Perna and Kocher 1995, Huelsenbeck and Ronquist 2001, Nguyen et al. 2015).The ML analysis was inferred with 1000 ultrafast likelihood bootstrap replicates by using IQ-TREE 1.6.2.The best-fit model for each partition was K3Pu+f+R4, selected according to the Bayesian Information Criterion (BIC).BI was performed in MrBayes 3.2 and the best-fit evolutionary models were determined using MrMTgui (Ronquist et al. 2012).MrMTgui was used to associate PAUP, ModelTest and MrModelTest across platforms.MrBayes settings for the best-fit model (GTR + I + G) were selected by AIC in MrModelTest 2.3 (Nylander et al. 2004).The Bayesian phylogenetic analyses were performed using the parameter values estimated with the commands in MrModelTest or ModelTest (nst = 6, rates = invgamma) (Posada and Crandall 1998).Each with three hot chains and one cold chain were run simultaneously twice by Markov Chain Monte Carlo (MCMC) sampling and the posterior distribution was estimated.The MCMC chains were set for 2,000,000 generations and sampled every 1000 steps, with a relative burn-in of 25%.The convergence of the independent runs was evaluated by the mean standard deviation of the split frequencies (< 0.01).The phylogenetic trees were visualised and edited using Figure Tree v.1.4.3 software (Rambaut 2017).

Divergence time estimation
The divergence times of Anomura were estimated with the programme BEAST v.1.10.4 ( Joseph and Drummond 2011) under the uncorrelated strict clock model and fossil calibration points were used (Suppl.material 3), including with a normal prior distribution.The HKY substitution model was selected using based on BEAUti software and the Yule speciation process model.This study ran four independent Markov Chain Monte Carlo (MCMC) algorithms, the chain length of Markov Chain setting is 800,000,000 generations and sampled every 8000 generations.The first 10% of the trees were discarded as burn-in and each parameter was examined by the effective sample size (ESS) (> 200, as recommended) in Tracer v.1.6.Trees were assessed using TreeAnnotator and a chronogram was constructed in FigTree v.1.4.2.

Genome structure and composition
The complete mitogenome sequence of P. serratifrons is a typical closed-circular molecule of 15，344 bp in size (GenBank accession number OM461359), which is a similar length to the published Porcellanidae mitogenomes (Tan et al. 2014, Lee et al. 2016), ranging from 15,324 to 15,348bp (Suppl.material 2).The mitogenome contents (Table 2) of P. serratifrons is the same as most published Anomura (Hickerson and Cunningham 2000, Yang et al. 2008, Kim et al. 2013), including 37 genes, 13PCGs, 22 tRNAs and two rRNA (rrnl and rrns), as well as a brief non-coding region.All the genes were identified and shown in Fig. 1 and Table 1.Most of the 37 genes are located on the heavy (H-) strand, except four PCGs (i.e.nad5, nad4, nad4l and nad1), eight tRNAs (i.e.tRNA-Phe, His, Pro, Leu, Val, Gln, Cys and Tyr) and two rRNA which are located on the light (L-) strand (Fig. 1).There are seven regions with overlap in the total P. serratifrons mitogenome, with one of them more than 10 bp trnL1 (23 bp) and the other six shorter than 10 bp nad4/atp8 (7 bp), cox1 (5 bp), cob (2 bp) and trnF/atp6 (1 bp) (Table 1).The P. serratifrons mitogenome also contains 376 bp of intergenic spacers located in 20 regions, ranging from 1 to 57 bp (Table 1) and indicating the occurrence of tandem duplications and the deletions of redundant genes.The nucleotide composition of the P. serratifrons mitogenome is A, 37.78%, T, 36.51%,G, 9.7% and C, 16.01%, with a high AT bias.The A + T (%) content of the mitogenomes was 74.29%.The AT-skew and GC-skew values are calculated for the chosen complete mitogenomes (Table 2).AT-skew of the P. serratifrons mitogenome is positive (0.017) and GC-skew of the P. serratifrons mitogenome is negative (−0.246), informing Ts and Cs are more abundant than Ts and Gs.

PCGs and codon usage
The initial and terminal codons of all PCGs of P. serratifrons are listed in Table 3. P. serratifrons has 13 PCGs in the typical order found in Anomura species, containing seven NADH dehydrogenase (nad1-nad6, nad4L), three cytochrome c-oxidases (cox1-cox3), two ATPases (atp6, atp8) and cytochrome b (cob).The total length of the 13 PCGs is 11077 bp.The length of the 13 PCGs ranges from 159 to 1680 bp.The average A+T content is 72.7%, ranging from 67.84% (cox1) to 84.28% (atp8) (Table 1).The AT-skew and GC-skew are −0.182 and 0.011, respectively (Table 3).All of the PCGs are initiated by the start codon ATN (ATT, ATG, ATA and ATC), except cox1 (ACG) and cob (TTG)，which is consistent with the mitogenome of most invertebrate species (Kong et al. 2009, Lee et al. 2016) .The majority of the PCGs are terminated with TAA, whereas nad1 uses TAG as the stop codon (Table 3).The most frequently used amino acid in P. serratifrons is Leu and the least common anion acid is Cys (Fig. 2).The relative synonymous codon usage (RSCU) values for P. serratifrons of the 13 PCGs are shown in Table 3 and Fig. 2. The three most frequently detected codons are CUU (Leu), whereas GCA (Gln) is the least common codon.Based on CDspT and RSCU, comparative analyses showed that the codon usage  Table 3.
The codon number and relative synonymous codon usage in the mitochondrial genome of P. serratifrons.et al. 2016).Fourteen of them are encoded by the heavy strain (H-) and the rest are encoded by the light strain (L-).In the whole mitogenome, the size of tRNAs ranges from 64 to 70 bp and has a total length of 1477 bp, with an obvious AT bias (76.98%) (Table 2).
The AT-skew and GC-skew are 0.043 and 0.111, respectively, showing a slight bias towards the use of As and an apparent bias toward Gs (Table 2).The trnS1 cannot form a secondary structure due to the lack of dihydrouracil (DHU) arms, while other tRNAs are capable of folding into a typical clover-leaf secondary structure (Fig. 3).The variation of trnS1 structure is consistent with the trnS1 structure reported in other invertebrate mitogenomes (Yang et al. 2008, Tan et al. 2019).The rrns and rrnl are 776 and 1303 bp, respectively, which are typically separated by tRNA-Val (Table 1).These sizes are similar to those of other Porcellanidae species.The A-T content of rRNAs is 77.73%.The AT-skew and GC-skew are 0.025 and 0.374, respectively, suggesting a slight bias towards the use of As and an apparent bias toward Gs (Table 2).As in most typical mitogenomes of other crabs, CR is located between rrnS and tRNA-Met.The 371 bp CR is obviously AT biased (77.63%).The AT-skew and GC-skew are −0.143 and −0.320, respectively, indicating an obvious bias towards the use of Ts and Cs.The index of substitution saturation (Iss) was measured as an implemention in DAMBE 5 and the GTR substitution model Iss is for the Novel gene re-arrangement in the mitochondrial genome of Pisidia serratifrons ... combined dataset of all PCGs of the 31 Anomura mitogenomes and was signifcantly lower (Iss = 0.647) than the critical values (Iss, cSym = 0.879).The genes are not saturated, so the reconstructed phylogeny was reliable.

Gene re-arrangement
Compared with the gene arrangement in the ancestral crustaceans (pancrustacean ground pattern), we found that the gene order in P. serratifrons mitogenome underwent a massive re-arrangement.As Fig. 4 shows, at least five gene clusters (or genes) are significantly different from the typical genes, involving eleven tRNA genes (D, G, A, R, N, S1, E, P, I, Q and M) and four PCGs (atp8, atp6, cox3 and nad3) (Fig. 4).The re-arrangement of the five gene clusters (or genes) is as follows (Fig. 5): ( 1   -cox3-G-nad3-A-D-atp8-atp6-cox3-G-nad3-A, I-Q-M-I-Q-M (underline denotes the deleted gene) with formation of two new gene blocks (G-nad3-A-D-atp8-atp6-cox3) and (M-I-Q).Tandem duplication followed by random loss is widely used to explain this type of translocation of mitochondrial genes (Kong et al. 2009, Shi et al. 2015, Sun et al. 2019).Therefore, we ascertain that the duplication-random loss model is the most likely explanation for these two gene block re-arrangements.Then, the two new gene blocks result in a translocation.(G-nad3-A-D-atp8-atp6-cox3) block is translocated downstream to the nad2, leaving G-nad3-A in the original position.( M-I-Q) block is translocated to upstream of W, leaving M-I in the original position.In the second step, four tRNA clusters (R-N-S -E) shifted to upstream of W. P is translocated to downstream of S .Finally, the ultimate gene arrangement of the P. serratifrons mitogenome is shown in Fig. 5C.
Comparing mitochondrial gene order has been proved to be a valuable tool in crustacean phylogeny.Based on the comparative analysis of mitochondrial gene arrangement within Galatheoidea, we found that eight Galatheoidea mitogenomes showed a massive rearrangement, which differs from any gene order ever reported in decapods (Fig. 6).Amongst the eight gene re-arrangement patterns in this study, the (F-nad5-H-nad4-nad4L) and (rrnL-V-rrnS) regions are extremely conserved, which is consistent with the conclusion of Shao et al. ( 2001) that the ( F-nad5-H-nad4-nad4L) and (rrnL-V-rrnS) regions are considered extremely conserved in animals.The P. serratifrons mitochondrial gene arrangement is closest to Neopetrolisthes maculatus and Petrolisthes haswelli which provides further support for the close relationship.The mitochondrial gene orders of Munida gregaria shared the most similarities with Munida isos, while Munidopsis Verrilli and Munidopsis lauensis shared higher similarities with Shinkaia crosnieri.These results are consistent with the conclusion from the gene order based phylogenetic tree.The gene order of the Munididae has a complex within-genus re-arrangement which seems to be related to their particular habitat.Our results support the fact that those comparisons of mitochondrial gene re-arrangements are a useful tool for phylogenetic studies.

Phylogenetic relationships
In the present study, the phylogenetic relationships were analysed, based on the sequences of the 13 PCGs to clarify the relationships in Anomura.P. serratifrons and another 31 known Anomura species were analysed, with O. ceratophthalmus and Q. stimpsoni as outgroups.The two phylogenetic trees (i.e.Maximum Likelihood (ML) tree and Bayesian Inference (BI) tree) resulted in identical topological structuring with different supporting value.Subsequently, only one topology (ML) with both support values was presented and displayed (Fig. 7).It was obvious that P. serratifronsa, N. Amongst the 11 families included in our phylogenetic tree, each family in the tree forms a monophyletic clade with high nodal support values, except Paguridae.At a higher level of The phylogenetic tree was inferred from the nucleotide sequences of 13 mitogenome PCGs using BI and ML methods.Numbers on branches indicate posterior probability (BI) and bootstrap support (ML).
Novel gene re-arrangement in the mitochondrial genome of Pisidia serratifrons ...

Divergence time estimation
The

Conclusion
In this study, the mitogenome of P. serratifrons was sequenced by next-generation sequencing, thereby generating new mitochondrial data for Porcellanidae.We analysed the mitogenome of P. serratifrons and found it is similar to other Anomura with many significant features including AT-skew, a codon usage bias etc.Compared with the pancrustacean ground pattern, the gene order in P. serratifrons mitogenome underwent a massive re-arrangement.The Galatheoidea showed eight re-arrangement patterns and their re-arrangement similarity is consistent with phylogenetic relationships.Our phylogenetic tree had similarities and disagreements with predecessor studies.The phylogenetic analyses indicated that P. serratifronsa, N. maculatus and P. haswelli formed a Porcellanidae clade.Divergence time estimation implies that the age of Anomura is over 225 MYA, dating back to at least the late Triassic.Most of the extant superfamilies and families arose during the late Cretaceous to early Tertiary.These results provide insight into the gene arrangement features of Anomura mitogenomes and lay the foundation for further phylogenetic studies on Anomura.

Figure 1 .
Figure 1.Circular mitogenome map of P. serratifrons.Protein coding, ribosomal and tRNA genes are shown with standard abbreviations.Arrows indicate the orientation of gene transcription.The inner circles show the G-C content and GC-skew, which are plotted as the deviation from the average value of the entire sequence.

Figure 2 .
Figure 2. Codon usage patterns in the mitogenome of P. serratifrons CDspT, codons per thousand codons.Codon families are provided on the x-axis (A) and the relative synonymous codon usage (RSCU) (B).

Figure 3 .Figure 4 .
Figure 3.Putative secondary structures of tRNAs from the P. serratifrons mitogenome.The tRNAs are labelled with the abbreviations of their corresponding amino acids.

Figure 5 .
Figure 5. Inferred intermediate steps between the ancestral gene arrangement of crustaceans and P. serratifrons mitogenome.A Duplication-loss and translocation in the ancestral mitogenome of crustaceans.The duplicated gene block is boxed in dash and the lost genes are labelled with grey B Translocation; C The final gene order in the P. serratifrons mitogenome.

Figure 6 .
Figure 6.Mitochondrial gene arrangements of eight species in Galatheoidea.Gene arrangement of all genes are transcribed from left to right.The re-arranged gene blocks are underlined and compared with ancestral gene arrangement of Anomura.
maculatus and P. haswelli formed a Porcellanidae clade with high support value.The families Munididae and Munidopsidae were grouped into one clade and Porcellanidaeas as the basal group which was similar to what was reported by McLaughlin et al. based on morphological characters and by Gong et al. based on the amino acid dataset of 13 PCGs (McLaughlin et al. 2007, Gong et al. 2019).
divergence time analysis, based on 13 PCGs of the mitochondrial genome, implies that the divergence of Anomura occurred in the early Triassic (~ 225.2 MYA, 95% credibility interval = 182.79-297.16MYA, Fig. 8A), which is roughly the same as the conclusion of Bracken-Grissom et al. (2013) that the origin of Anomura is Late Permian ~ 259 (224-296) MYA, based on the divergence time analysis.The Galatheoidea superfamily diverged in the early Jurassic (208 Ma, 95% credibility interval = 167.73-215.52MYA, Fig. 8B), into the Munidopsidae and Munididae during the Early Jurassic (~ 173 MYA, Fig. 8C), while the family Procellanidae diverged in the Early Jurassic (~ 187 MYA, Fig. 8D) with rapid speciation of present day species occurring since the mid-Miocene (~ 54 MYA, Fig. 8E).The Lomidae, Kiwaidae and Chirostylidae all originated in the Jurassic (~ 183.81 MYA, 175.62 Ma and 158.48 Ma, respectively).The hermit crab formed two subclades during the Jurassic period (~ 191 MYA, Fig. 8 F), the first subclade branches being composed of Lithodidae and Paguridae.The most recent common ancestor of Lithodidae and Paguridae was divided into a new family in the Middle Tertiary (~ 39.84 MYA, Fig. 8G).The Paguridae was first discovered in the Tertiary (~ 29.5 MYA, Fig. 8H).The second subclade was formed by the hermit crabs in the middle Cretaceous (~ 60.3 MYA, Fig. 8I) and differentiation formed the family of Albuneidae, Coenobitidae and Diogenidae.The differentiation time was longer than that of the first subclade and appeared about 20 MYA earlier.The results support the multi-family origin of the hermit crab.

Figure 8 .
Figure 8. Anomura divergence time estimated using the Bayesian relaxed-molecular clock method.The 95% confidence intervals for each node are shown in light blue bars.1-3: 3 fossil calibration nodes (Corresponding to Suppl.material 3).

Table 1 .
Features of the mitochondrial genome of P. serratifrons.Novel gene re-arrangement in the mitochondrial genome of Pisidia serratifrons ...

Table 2 .
Composition and skewness of P. serratifrons mitogenome.Novel gene re-arrangement in the mitochondrial genome of Pisidia serratifrons ...
pattern of P. serratifrons is conserved.The codon usage patterns of 13 PCGs are similar to those of other Porcellanidae species (Tan et al. 2014).
classification, most superfamilies from Anomura were found to be monophyletic, except Paguroidea, which is in line with previous studies(McLaughlin 1983, Tan et al. 2018).It showed that Paguroidea was divided into two clades, ((Coenobitidae + Diogenidae) + (Lithodidae + Paguridae)), which is consistent with previous results(Tan et al. 2018, Gong  et al. 2019), while Tan et al. (2019) deem that Lithodidae was excluded from Paguroidea and belonged to a new superfamily Lithodoidea.Besides, our phylogenetic tree showed that (Porcellanidae + (Munidopsidae + Munididae)) formed a Galatheoidea clade in this tree and (Chirostylidae+ Kiwaidae) formed Chirostylidea in a clade which was consistent with Sun et al. (2019) (based on morphological characters) and Schnabel et al. (2011) (based on mitochondrial 16S rRNA and nuclear 18S and 28S rRNA), while the monophyly of Galatheoidea is still not recognised by some studies, mainly due to the classification of Chirostylidae.According Tan et al. (2018), they regarded Chirostylidae as a member of the Galatheoidea and Galatheoidea formed a polyphyletic clade in their studies.