Biodiversity Data Journal :
Research Article
Corresponding author: Yang Li (liyang@qdio.ac.cn)
Academic editor: Danwei Huang
Received: 09 Aug 2024 | Accepted: 18 Sep 2024 | Published: 25 Sep 2024
© 2024 Junyuan Li, Xuyi Yang, Zifeng Zhan, Juan Feng, Tinghui Xie, Yang Li
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Li J, Yang X, Zhan Z, Feng J, Xie T, Li Y (2024) Development of microsatellite markers and evaluation of the genetic diversity of the edible sea anemone Paracondylactis sinensis (Cnidaria, Anthozoa) in China. Biodiversity Data Journal 12: e134363. https://doi.org/10.3897/BDJ.12.e134363
Paracondylactis sinensis Carlgren, 1934 is a sea anemone with economic value in China. The wild population of P. sinensis has been shrinking due to overfishing and environmental pollution, which have caused price instability. In winter, the price of P. sinensis can reach 25 USD per kilogram. To date, no genetic markers have been developed for P. sinensis, which has prevented further exploration of its population genetic diversity. In this study, the full-length transcriptome of P. sinensis was sequenced and microsatellite DNA markers (simple sequence repeats [SSRs]) were developed from those transcripts. A total of 52 primer pairs, which can amplify specific polymorphic bands in PCR experiments, were designed for the SSR markers. Genetic diversity and population genetics were analysed for P. sinensis populations collected from the coasts of Taizhou and Rizhao using six microsatellite DNA loci. While inbreeding was detected in both populations (Fis > 0), the overall number of alleles (Na = 11.3) and bottleneck analysis suggested that the genetic diversity of P. sinensis has not been greatly impacted. Clustering analyses using STRUCTURE, principal coordinate analysis and an unweighted pair group method with arithmetic mean tree revealed that the Taizhou population diverged from the Rizhao population; however, the genetic differentiation between the populations was moderate. Human-mediated commercial activities may be the principal reason for the gene flow between the populations. Our study provides the first evaluation of the genetic resources of wild P. sinensis populations in China, which can serve as a useful reference for future comparative studies on population genetics and may guide policy-makers in initiating strategies for germplasm conservation and artificial breeding.
Anthozoa, microsatellite DNA loci, population genetics, genetic diversity, ISO-seq
The sea anemone Paracondylactis sinensis Carlgren, 1934 (Cnidaria, Anthozoa) is distributed in the Indo-West Pacific Ocean and has been recorded mainly in countries such as Japan, China, India and Vietnam. P. sinensis lives on sandy beaches in intertidal or shallow subtidal zones. In China, P. sinensis has become an economically important species and is widely regarded as a delicacy, especially by local residents of Zhejiang Province (
Up to now, only the mitochondrial genome and some mitochondrial barcodes are accessible in GenBank for P. sinensis (
One Paracondylactis sinensis individual collected from the coast of Taizhou (
The long reads generated by the PacBio sequencer were processed according to the PacBio ISO-Seq pipeline (https://github.com/PacificBiosciences/IsoSeq/blob/master/isoseq-clustering.md). To polish the transcript consensus sequences clustered from PacBio long reads, both Illumina data (cleaned by Trimmomatic 0.36 (
SSRs were detected with the MicroSatellite identification tool (MISA) software (
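As a rough illustration of what perfect-SSR detection involves, the sketch below scans a sequence for tandem repeats with a regular expression. It is not MISA's implementation. The function names and the minimum repeat counts (10 for mononucleotides, 6 for dinucleotides, 5 for tri- to hexanucleotides) are assumptions, loosely suggested by the repeat-time columns of the SSR-type table rather than taken from a stated parameter set.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "AGAG" x5, which is really "AG" x10.
            if is_primitive(motif):
                hits.append(
                    (match.start(), motif, len(match.group(0)) // unit_len)
                )
    return sorted(hits)


if __name__ == "__main__":
    toy = "ACGC" + "AG" * 10 + "CATC" + "TGT" * 9 + "CC"
    for start, motif, count in find_ssrs(toy):
        print(f"{start}\t({motif}){count}")
```

The toy sequence is invented for the example; compound or imperfect SSRs, which dedicated tools can report, are outside the scope of this sketch.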
Sixteen individuals of Paracondylactis sinensis were collected from the coasts of Taizhou, Zhejiang (
The genetic diversity indicators, including the number of (effective) alleles, Shannon index, observed heterozygosity, expected heterozygosity and inbreeding coefficient, were calculated by GenAlEx 6.5 (
A total of 1,112,498 polymerase reads were obtained using the PacBio Sequel II platform. After preprocessing, 142.47 Gb of subreads with an N50 length of 2,843 bp were obtained. The Illumina sequencer generated 8.68, 6.30 and 8.28 Gb of clean data for the tentacle, body column and mesentery tissues, respectively. The sequencing data produced in this study have been submitted to the National Center for Biotechnology Information Database under BioProject number PRJNA1086232. The clustering procedure for the PacBio long reads generated 54,259 transcripts, which included 54,219 high-quality transcripts and 40 low-quality transcripts. After polishing and removal of redundant sequences, 16,490 non-redundant transcripts remained. The total length of these final transcripts was 47,089,935 bp, with an N50 length of 3,059 bp. The length distribution pattern of the transcripts indicated that most transcripts ranged from 2,000 bp to 3,000 bp in length (Fig.
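For readers unfamiliar with the N50 statistic quoted for the subreads and transcripts, a minimal sketch of the usual definition follows: the length L such that sequences of length at least L together cover at least half of the total bases. The function name and the toy lengths are hypothetical; the study's own values come from its sequencing pipeline, not from this code.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6]))
```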
MISA software revealed 7,596 SSRs distributed within 4,867 transcripts. The distribution frequency of SSRs was 46.06% and the average interval between SSRs was 6.20 kb. The major SSR motifs were mononucleotide, trinucleotide and dinucleotide, accounting for 47.76%, 30.27% and 11.37% of the total number of repeats, respectively (Table
The number and distribution frequency of different SSR types in Paracondylactis sinensis.
Type of repeat | 5 repeats | 6 repeats | 7 repeats | 8 repeats | 9 repeats | 10 repeats | >10 repeats | Total | Ratio/% | Distribution frequency/% |
Mononucleotide | \ | \ | \ | \ | \ | \ | 3628 | 3628 | 47.76 | 22.00 |
Dinucleotide | \ | 284 | 162 | 116 | 68 | 37 | 197 | 864 | 11.37 | 5.24 |
Trinucleotide | 1119 | 471 | 286 | 139 | 96 | 55 | 133 | 2299 | 30.27 | 13.94 |
Tetranucleotide | 184 | 89 | 46 | 17 | 24 | 18 | 95 | 473 | 6.23 | 2.87 |
Pentanucleotide | 19 | 7 | 1 | 5 | 1 | 6 | 28 | 67 | 0.88 | 0.41 |
Hexanucleotide | 51 | 44 | 27 | 16 | 16 | 12 | 99 | 265 | 3.49 | 1.61 |
Total | 1373 | 895 | 522 | 293 | 205 | 128 | 4180 | 7596 | 100 | 46.06 |
Ratio/% | 18.08 | 11.78 | 6.87 | 3.86 | 2.70 | 1.69 | 55.03 |
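The summary figures in this table can be recomputed from the per-type totals. The sketch below assumes that the ratio is each type's share of all SSRs, that the distribution frequency is the SSR count divided by the number of non-redundant transcripts (16,490), and that the average interval is the total transcript length (47,089,935 bp) divided by the number of SSRs. These definitions are inferred because they reproduce the reported values; the function and variable names are hypothetical.

```python
# Per-type SSR totals, number of transcripts and total length as reported.
SSR_COUNTS = {
    "Mononucleotide": 3628,
    "Dinucleotide": 864,
    "Trinucleotide": 2299,
    "Tetranucleotide": 473,
    "Pentanucleotide": 67,
    "Hexanucleotide": 265,
}
N_TRANSCRIPTS = 16_490
TOTAL_LENGTH_BP = 47_089_935


def ssr_summary(counts, n_transcripts, total_length_bp):
    """Recompute ratio, distribution frequency and average SSR interval."""
    total = sum(counts.values())
    rows = {
        name: {
            "ratio_pct": 100 * count / total,
            "dist_freq_pct": 100 * count / n_transcripts,
        }
        for name, count in counts.items()
    }
    return {
        "total": total,
        "rows": rows,
        "overall_dist_freq_pct": 100 * total / n_transcripts,
        "avg_interval_kb": total_length_bp / total / 1000,
    }


if __name__ == "__main__":
    summary = ssr_summary(SSR_COUNTS, N_TRANSCRIPTS, TOTAL_LENGTH_BP)
    print(summary["total"], round(summary["overall_dist_freq_pct"], 2))
    for name, row in summary["rows"].items():
        print(name, round(row["ratio_pct"], 2), round(row["dist_freq_pct"], 2))
```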
SSR primer pairs yielding specific and polymorphic bands, as validated by PCR.
Marker name | SSR motif | Forward primer (5'-3') | Reverse primer (5'-3') | Product size (bp) |
SS1 | (AG)10 | CCCAGGGAGTTGCCATTCT | TCCCCAAATCTCCATCTGCT | 280 |
SS2 | (TG)11 | GGCAAATCCCAGCTCCC | TGTCGGCAAATGTTTTGACAG | 278 |
SS3 | (GT)14 | GCTGTCGCATTGCTTCAGT | GGGCACACGTGACAAGG | 271 |
SS4 | (TC)20 | ACGCCTTCTATAGCTCGCG | GGCGACCAACAGATGCG | 265 |
SS5* | (GA)20 | AGCCGTAGTAGACCCCGT | GCTCTGACGTCACGAGC | 258 |
SS6 | (TG)37 | TGCCCTGAGATCAGCCC | ACATTACAGCTGGTTGGAGG | 257 |
SS7 | (GT)12 | GCGCCCCCACCTATACTG | GCATCTGCAGTAAGGCGT | 253 |
SS8* | (GA)14 | AACACGTCCCTATCGGAGT | AACGGTCGAAAGGGGTCC | 245 |
SS9* | (TATG)26 | CGCCACTCATGCTTGCC | ACACTTGGAAGACCTTTTGCT | 279 |
SS10 | (TGT)9 | CCAAGCCACGAAATCCTTGG | TCCGAGTCCCTGGCTGTT | 279 |
SS11 | (TTG)10 | TGACGATGCGTGCAAGGT | ACAGCGAACGTTGACTTCT | 278 |
SS12 | (CAA)10 | ACGCAAGCAATCCGTGG | GGCGCTGCCTGGATGTT | 277 |
SS13 | (TAGA)14 | GGCCCAAATACGCTAACCG | CGCGGAAACAGGGGTAGG | 274 |
SS14 | (CCA)11 | GCTCAGCACTTCGTACACCT | TCATGAACCCTGCTCCATC | 272 |
SS15* | (TCT)8 | TCAACTTCTCGCCGGCAG | GGGGAGGAGGGAAGGGAG | 271 |
SS16 | (AAGG)31 | AGGCTAAGGACTGTCGCAG | CCGTCGCAAAACGTCTGG | 269 |
SS17 | (TTG)8 | GGCCATGTTGTGAGAGACCT | AGGGCCGGTGGGAATAG | 261 |
SS18 | (GAT)13 | AGGACAAATGCCCACCGAG | GGTTTCCGCCGTCTTCC | 258 |
SS19 | (TTCT)13 | AGCATGCGGTCTGTCGAC | GCATTCTGGTCTAGCTGGGG | 256 |
SS20 | (GAT)10 | AAAAGGAAGCCTGGCAAG | TGAGGGTCCAGCCTTGACT | 255 |
SS21 | (AGG)8 | TAAGCGGCGTCTGTGC | ACTCGCGAGGGCTTCTTG | 254 |
SS22 | (ATG)13 | ATGTCATACGGCAAAATGGC | TGGGTGGCATTCTTCGGT | 253 |
SS23 | (TGG)8 | TGGTGGGGAGGGTGATG | TCACACATGCCCTCTTGAC | 252 |
SS24 | (GACA)9 | TCTTGTCCCGGCTGCATG | CCCGTGGCAGACATCCAT | 251 |
SS25 | (ATG)12 | AGTTGGCAGGGCAAATAAC | TGCCTTGTCAAAAATGTCCG | 244 |
SS26 | (GCTT)8 | CCGGTAGGAACTTCCCTGG | TAACTGTCAACCCGCGCG | 242 |
SS27 | (AAC)9 | CACCTACACCGCAAGCC | CACGAAACGCAAGCAGC | 240 |
SS28 | (TTCC)8 | AGCTCCGTCTTTCCTTTCCT | GCATTACCACGTAAAATGCG | 235 |
SS29 | (TCTT)8 | AGCCGAAGAGCTCTGGGT | TGCTTGCTTGTTCAGTGTTGG | 233 |
SS30 | (AAC)9 | TGCAACAGCAAACGGGAAG | TCACATCACTGGTGAGGC | 226 |
SS31* | (ACA)9 | GGCTTGCAACCCCTTCG | GCCAGTGCCTTTCCTTTC | 222 |
SS32 | (AGA)8 | GTGGTGGGCAACATGGGT | GCGTCAGTCGTGCCACT | 222 |
SS33* | (TTG)8 | GCTCGACTCAGCTGCGT | TGACAGTAGACAGCTACCAC | 219 |
SS34 | (AAC)8 | ACCCCTAACAACAGTGGGC | ATCTCGGGTCGCCAAACC | 214 |
SS35 | (CGT)9 | CGGGGTGGCTATGCGATC | AGATCACTAAAGCTGCAGAC | 197 |
SS36 | (CAA)9 | ACAGCAGGCACGTTTCC | AGGGTGGAGGTCGCATCT | 185 |
SS37 | (TTTC)15 | AGGCATTAGGTCATTCGGACT | TTCGCACGGGGTCTTTCC | 176 |
SS38 | (TTG)8 | TAAGCGCAAGGCCCAC | ACCTTGTGCACTGCTAGCT | 174 |
SS39 | (GTT)11 | TGCCGAGTTTCACAGCG | GCGCGTCTTTGTTGTTGC | 147 |
SS40 | (ATG)8 | TGAAAGTGCGCCCGAAGT | GGTGATGGTACGTCAGTCAGT | 147 |
SS41 | (AGA)8 | CCGTCTGCTCTTGCCAG | TCACAACGTAACGGACAGC | 109 |
SS42 | (TCA)8 | CGGATCGTCACGTCACC | AGCAACACCTTTGTTTTGTGT | 108 |
SS43 | (TTTAC)8 | ATGACGGCCGAAACCACG | GGGAAAAGTTTGGTACTCGGT | 272 |
SS44 | (TTTTG)24 | CGAGAACTCGTTTGTGTTGTT | AGCTGCTTCACTTTGGTCTT | 262 |
SS45 | (T)20 | CCTGAGTCAGTTACGCAAAGG | AGCCATCAAAAGTTACAGGCT | 229 |
SS46 | (TGA)11 | CCTCCAGGGATGAAGGCG | TTGGCCCGATGACAGGTC | 127 |
SS47 | (TC)20 | TGCTTCACTCACCCATGGG | AGCCTCTCCTACTCACAGCT | 129 |
SS48 | (TG)32 | ACCGACGCGTTGAAAGG | CGTTCTCACATCCAAAACGGT | 149 |
SS49 | (CT)15 | GAAGTTGGCCCTAGCGC | TGGACCAAGGTTACTGGACAC | 170 |
SS50 | (TC)13 | CCCTTCTCAGTGGTTGGC | ATCTCGCGGGAGGAGAGT | 174 |
SS51 | (TG)19 | TGTGAGGATTTGGAGGTTTCG | GCATAGCAAAACCAGGCAC | 184 |
SS52 | (GAT)14 | GCAACCATGGATGATACCGC | TGCTGTCTTCATCGTGGCT | 187 |
Note: the marker name labelled with a star indicates its following usage in fluorescent primer synthesis and population genetic analysis. |
The number of alleles (Na), number of effective alleles (Ne) and inbreeding coefficient (Fis) of the P. sinensis population collected from Taizhou were 11.167, 7.713 and 0.085, respectively, slightly smaller than the corresponding values for the population collected from Rizhao (11.500, 7.833 and 0.118). The Shannon index (I), observed heterozygosity (Ho) and expected heterozygosity (He) were slightly greater in the Taizhou population (2.157, 0.781 and 0.856) than in the Rizhao population (2.107, 0.740 and 0.838) (Table
Genetic diversity parameters for the populations of Paracondylactis sinensis from Taizhou (TZ) and Rizhao (RZ).
Population | N | Na | Ne | I | Ho | He | Fis |
TZ | 16 | 11.167 | 7.713 | 2.157 | 0.781 | 0.856 | 0.085 |
RZ | 16 | 11.500 | 7.833 | 2.107 | 0.740 | 0.838 | 0.118 |
Mean | 16 | 11.333 | 7.773 | 2.132 | 0.760 | 0.847 | 0.101 |
Note: N: number of samples; Na: number of different alleles; Ne: number of effective alleles; I: Shannon's information index; Ho: observed heterozygosity; He: expected heterozygosity; Fis: inbreeding coefficient. |
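The indicators in this table follow standard single-locus definitions, sketched below for diploid genotypes: Ne = 1/Σp², I = −Σ p ln p, He = 1 − Σp², Ho as the proportion of heterozygotes, and the inbreeding coefficient as (He − Ho)/He. This is an illustration under those textbook formulas, not GenAlEx's code; the function name and the toy genotypes are hypothetical. Reported values are typically averaged over loci, so applying the last formula to the table's mean Ho and mean He need not reproduce the tabulated Fis exactly.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Diversity indices for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [count / (2 * n) for count in allele_counts.values()]
    sum_sq = sum(p * p for p in freqs)

    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    he = 1 - sum_sq                                  # expected heterozygosity
    return {
        "Na": len(allele_counts),                    # number of different alleles
        "Ne": 1 / sum_sq,                            # effective number of alleles
        "I": -sum(p * math.log(p) for p in freqs),   # Shannon's information index
        "Ho": ho,
        "He": he,
        "Fis": (he - ho) / he if he > 0 else float("nan"),
    }


if __name__ == "__main__":
    # Toy genotypes (allele sizes in bp), invented for the example.
    toy = [(100, 100), (100, 102), (102, 104), (104, 104)]
    for key, value in locus_diversity(toy).items():
        print(key, round(value, 3))
```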
Analysis of molecular variance for the Paracondylactis sinensis populations from Taizhou and Rizhao, based on data from six microsatellite loci.
Source of variation | df | Sum of squares | Estimated variance | Percentage of variance |
Between populations | 1 | 11.719 | 0.273 | 9% |
Amongst individuals within populations | 30 | 89.594 | 0.353 | 12% |
Within individuals | 32 | 73.000 | 2.281 | 78% |
Total | 63 | 174.313 | 2.907 | 100% |
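The estimated variances in this table can be recovered from the sums of squares and degrees of freedom using the expected mean squares of a balanced three-level design (populations, individuals, alleles within diploid individuals). The sketch below assumes 16 individuals per population and these textbook estimators; whether the software used exactly this computation is an assumption, although the estimators reproduce the tabulated values. The F-statistics it derives are illustrative ratios of the variance components, and all names are hypothetical.

```python
def amova_components(ss_pop, df_pop, ss_ind, df_ind, ss_within, df_within,
                     n_per_pop):
    """Variance components and F-statistics for a balanced diploid AMOVA."""
    ms_pop = ss_pop / df_pop           # mean square between populations
    ms_ind = ss_ind / df_ind           # mean square amongst individuals
    ms_within = ss_within / df_within  # mean square within individuals

    var_within = ms_within
    var_ind = (ms_ind - ms_within) / 2           # two alleles per individual
    var_pop = (ms_pop - ms_ind) / (2 * n_per_pop)
    total = var_pop + var_ind + var_within
    return {
        "var_pop": var_pop,
        "var_ind": var_ind,
        "var_within": var_within,
        "total": total,
        "Fst": var_pop / total,
        "Fis": var_ind / (var_ind + var_within),
        "Fit": (var_pop + var_ind) / total,
    }


if __name__ == "__main__":
    # Sums of squares and df from the table; n_per_pop = 16 in both populations.
    result = amova_components(11.719, 1, 89.594, 30, 73.000, 32, n_per_pop=16)
    for key, value in result.items():
        print(key, round(value, 3))
```

Under these assumptions the between-population component is about 9% of the total variance, matching the percentage column above.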
Population analyses of 32 Paracondylactis sinensis individuals collected from Taizhou (16 individuals) and Rizhao (16). (A) Bayesian STRUCTURE clustering results; each colour represents the proportion of inferred ancestry from K (=2) ancestral populations, with each bar representing an individual sample; (B) Two-dimensional projection of the PCoA for 32 P. sinensis samples along the first two principal axes; (C) Relationships of P. sinensis populations based on genetic similarities derived from polymorphism patterns of SSR markers, displayed in a UPGMA dendrogram. TZ: Taizhou; RZ: Rizhao; Psin: P. sinensis.
With the development of high-throughput sequencing technology, population genomics based on single nucleotide polymorphism (SNP) markers has gained widespread use in studies aimed at understanding genetic diversity within species (
The number of alleles (Na) is a parameter that can reflect genetic diversity and adaptability to changing environmental conditions. However, the value of Na depends on the number of samples and the detection method: it increases with larger sample sizes or more sensitive detection methods. In a study by Remy Gatins (
The STRUCTURE, PCoA cluster and UPGMA tree analyses revealed that the populations from Taizhou and Rizhao were two distinct genetic groups (Fig.
This study represents the first successful development of microsatellite markers for Paracondylactis sinensis, an economically valuable sea anemone in China. By employing a combination of PacBio long-read and Illumina short-read sequencing technologies, we identified 52 polymorphic SSR markers and utilised six microsatellite loci to evaluate the genetic diversity of P. sinensis populations from Taizhou and Rizhao. Our analyses revealed mild genetic differentiation between the two populations, with evidence of gene flow likely facilitated by human-mediated commercial activities. The absence of significant bottleneck effects and the relatively high genetic diversity, indicated by the number of alleles and heterozygosity levels, suggest that the genetic resources of P. sinensis have not been critically compromised. However, the detected inbreeding coefficients highlight the need for conservation strategies to safeguard the genetic integrity of this species in the face of ongoing environmental pressures. This research provides a crucial genetic baseline that can inform future studies and conservation efforts, offering valuable insights into the genetic management and sustainable utilisation of P. sinensis in China.
The raw sequencing reads for the transcriptome reported in this paper have been submitted to the National Center for Biotechnology Information (NCBI) database under BioProject PRJNA1086232.
Thanks are given to the Oceanographic Data Center, IOCAS and the Center for High Performance Computing and System Simulation, Pilot National Laboratory for Marine Science and Technology (Qingdao), for providing computing power.
This study was supported by the Taizhou Science and Technology Plan Project (No. 23nya15), the Biological Resources Programme of Chinese Academy of Sciences (No. KFJ-BRP-017-097), the General Scientific Research Project of the Department of Education of Zhejiang Province (Y202250489) and the Science and Technology Program of Nanji Islands National Marine Nature Reserve Administration (No. JJZB-PYCG-2021112901).
Junyuan Li: Conceptualisation, Investigation, Methodology, Software, Validation, Formal analysis, Writing - Original draft, Writing - Review and Editing, Funding Acquisition; Xuyi Yang: Resources, Validation, Data Curation, Writing - Review and Editing; Juan Feng: Investigation, Visualisation; Tinghui Xie: Investigation, Resources, Validation, Data Curation; Zifeng Zhan: Supervision, Project Administration, Funding Acquisition; Yang Li: Conceptualisation, Validation, Investigation, Data Curation, Supervision, Project Administration, Funding Acquisition.