Biodiversity Data Journal :
Research Article
Corresponding author: Weiwei Zhang (zhangweiwei@jxau.edu.cn), Yongtao Xu (ytxu666@jxau.edu.cn)
Academic editor: Ricardo Moratelli
Received: 14 Oct 2024 | Accepted: 11 Nov 2024 | Published: 21 Nov 2024
© 2024 Yuqin Liu, Dandan Wang, Zhiming Cao, Wuhua Liu, Zechun Bao, Weiwei Zhang, Yongtao Xu
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Liu Y, Wang D, Cao Z, Liu W, Bao Z, Zhang W, Xu Y (2024) Assembling and dietary application of a local trnL metabarcoding database for Cervus nippon kopschi in Taohongling Nature Reserve. Biodiversity Data Journal 12: e139269. https://doi.org/10.3897/BDJ.12.e139269
The quality and completeness of the reference database directly affect the accuracy of forage plant identification and, in turn, the conservation and management of wildlife resources. In this study, target amplicons were subjected to first-generation sequencing to assemble a local reference database based on chloroplast trnL metabarcoding. The primer pair c-h outperformed g-h as a universal DNA metabarcoding marker, and 162 valid chloroplast trnL sequences were submitted to GenBank (accession numbers PP081756 - PP081917); these sequences showed a clear bias towards A and T nucleotides (60.49%). The haplotype diversity (Hd), nucleotide diversity (Pi) and average number of nucleotide differences (K) of these trnL sequences were 0.978, 0.0484 and 4.743, respectively. To assess the usefulness of the local database for identifying the diet of South China sika deer (Cervus nippon kopschi), high-throughput metabarcoding sequencing and BLAST analysis were performed. Ultimately, 25 forage plant species were identified, belonging to 19 families and 25 genera. Shrubs and herbaceous plants, such as Potentilla freyniana, Persicaria perfoliata, Rosa laevigata and Ardisia japonica, dominated the forage plants. This study established a local trnL reference database that is of considerable value for forage plant identification and nutritional evaluation for sika deer and other sympatric herbivores, as well as for the conservation and management of biodiversity.
local database, Cervus nippon kopschi, diet, trnL metabarcoding, high-throughput sequencing
Food provides the nutrition and energy needed for the life activities of a species; it is a crucial resource for sustaining the survival and growth of populations and a key topic of research in animal ecology. In recent years, global climate change, environmental pollution and excessive exploitation have caused the fragmentation and loss of animal habitats, and the availability of food resources is declining. Large herbivores have undergone population declines and consistently have the highest proportions of threatened species at multiple spatial and temporal scales (
The Taohongling Sika Deer National Nature Reserve (hereafter, TNNR) is the predominant habitat for the South China sika deer. Throughout the years, the reserve has diligently pursued a rigorous programme of forest conservation and restoration. The vegetation succession of the Reserve has restored the plants in many zones to evergreen broad-leaved mixed forests (
The traditional dietary analysis includes field observations, stomach content analysis (
DNA metabarcoding has the potential to qualitatively and quantitatively analyse the composition of herbivore diets (
The accuracy and precision of dietary research have a direct impact on the level of wildlife resource management (
Study areas
The TNNR is located in Pengze, Jiangxi Province, China, on the south bank of the middle-lower reaches of the Yangtze River, along the northern border of the subtropical zone. The Reserve represents the largest distribution area for the South China sika deer and is a natural habitat for numerous wildlife species. The total area of TNNR is 12,500 hm², comprising a core zone of 2,670 hm², an experimental zone of 1,830 hm² and a buffer zone of 8,000 hm². The core zone is reserved for conservation and most of the sika deer live there; the experimental zone is for human activities and regulated development; and the buffer zone permits some human activities, thereby mitigating human disturbance to the core zone (
Sample collection
Six sampling sites, including Nursery bases, XianLingAn, fir forests, NieJiashan, WuGuiShi and the Bamboo Garden, were designated in the frequent activity areas of sika deer at the TNNR in the autumn of 2022 (Fig.
DNA extraction and PCR amplification
To avoid reducing the yield and concentration of DNA during extraction, plant samples were thoroughly dried using silica gel before the experiment. After complete drying, about 20 mg (not exceeding 30 mg) of dried leaf tissue was placed in a 2 ml EP tube with one steel bead. The plant tissue was then ground for 80 seconds at a frequency of 45 Hz using a plant tissue homogeniser. Samples that were not fully homogenised were ground again until they became a fine powder. To prevent DNA degradation, the ground plant tissue was transferred promptly. DNA was extracted using the Fore Gene Plant DNA Isolation Kit (Chengdu). The optical density (OD) of the DNA was measured with an ultraviolet spectrophotometer; the A260/A280 ratio of most DNA extracts was between 1.70 and 2.21, indicating highly purified DNA. The DNA was then stored at -20℃ until further use.
Two universal primer pairs for the trnL (UAA) gene were used to amplify plant DNA: g (5’-GGGCAATCCTGAGCCAA-3’) and h (5’-CCATTGAGTCTCTGCACCTATC-3’), and c (5’-CGAAATCGGTAGACGCTACG-3’) and h (5’-CCATTGAGTCTCTGCACCTATC-3’) (
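The relationship between the two primer pairs can be illustrated with a minimal in-silico sketch. It uses the primer sequences given above and the primer-trimmed c-h sequence listed for Dalbergia hupeana (OTU2) in the forage table; the function names are hypothetical, and exact string matching is a simplifying assumption (real primer binding tolerates mismatches). The sketch only shows that the primer g motif can be located inside a c-h fragment, which is why the g-h amplicon is the shorter of the two.

```python
# Primer sequences as given in the text (5'->3').
PRIMER_C = "CGAAATCGGTAGACGCTACG"
PRIMER_G = "GGGCAATCCTGAGCCAA"
PRIMER_H = "CCATTGAGTCTCTGCACCTATC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_primer(seq: str, primer: str) -> int:
    """0-based start of the first exact primer match, or -1 if absent.

    Exact matching is an assumption made for illustration only.
    """
    return seq.upper().find(primer.upper())


# c-h reference sequence reported for Dalbergia hupeana (OTU2), primers removed.
OTU2 = (
    "GACTTAATTGGATTGAGCCTTGGTATGGAAACGTACCAAGTGATAACTTTCAAATTCAG"
    "AGAAACCCCGGAATTAACAATGGGCAATCCTGAGCCAAATCCCGTTTTCTGAAAGCAAA"
    "GAAAAATTAAGAAAGAAAAAGG"
)

pos = find_primer(OTU2, PRIMER_G)
if pos >= 0:
    # Bases downstream of primer g approximate the insert a g-h amplicon would carry.
    downstream = OTU2[pos + len(PRIMER_G):]
    print(f"primer g found at index {pos}; {len(downstream)} bp downstream")

# Primer h anneals to the opposite strand, so its reverse complement is
# what would appear at the 3' end of an untrimmed amplicon.
print(reverse_complement(PRIMER_H))
```

Because the reference sequences are stored without the flanking c and h primers, only the internal g motif is searchable in them.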
Data statistics
The obtained sequences were aligned using Clustal W and trimmed. MEGA-11 was then used to calculate the base composition and variation of the aligned sequences, including conserved sites, variable sites, parsimony-informative sites, sequence length, base composition and the transition/transversion ratio. The phylogenetic tree was constructed using the Maximum Likelihood (ML) method based on the Tamura 3-parameter model, and node support was assessed with 1000 bootstrap replicates. Additionally, DnaSP was used to calculate the number of haplotypes for each population and across all species, haplotype diversity (Hd), nucleotide diversity (Pi) and the average number of nucleotide differences (K). Hd and Pi are two key metrics for assessing the level of variation within a population; higher values indicate a greater degree of polymorphism.
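For readers unfamiliar with these indices, the following is a minimal sketch of how Hd, K and Pi can be computed from an alignment using the standard Nei estimators. It is not the DnaSP implementation: the function names are hypothetical, the toy alignment is invented, and the code assumes a gap-free alignment of equal-length sequences (DnaSP handles gaps and missing data with its own rules).

```python
from collections import Counter
from itertools import combinations


def _check_alignment(seqs):
    """Reject inputs that cannot be treated as a simple alignment."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum(p_i^2)), p_i being haplotype frequencies."""
    _check_alignment(seqs)
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def mean_pairwise_differences(seqs):
    """K: average number of differing sites over all sequence pairs."""
    _check_alignment(seqs)
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Pi: K divided by the number of aligned sites."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


# Invented toy alignment: four sequences, three haplotypes.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(haplotype_diversity(toy))        # 5/6
print(mean_pairwise_differences(toy))  # 7/6
print(nucleotide_diversity(toy))       # 7/24
```

As a sanity check on the formula, four sequences forming two haplotypes in a 3:1 split give Hd = 4/3 × (1 − 10/16) = 0.5.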
Diet application
To validate the application and effectiveness of the DNA sequences in the plant reference database, 15 fresh faecal samples of sika deer were collected from TNNR in summer. Two faecal pellets were randomly taken from each faecal sample and mixed to form a single composite sample with three repetitions; in total, five mixed samples were prepared. Total DNA was extracted following the protocol for high-throughput sequencing and sent to Shanghai Personal Biotechnology Co., Ltd. for sequencing. Purified amplicons were pooled in equimolar amounts and paired-end sequenced (2 × 300) on an Illumina MiSeq platform (Illumina, San Diego, USA). Demultiplexed sequences from each sample were quality filtered, trimmed, denoised and merged, and chimeric sequences were then identified and removed using the QIIME2 dada2 plugin to obtain the feature table of OTUs (
DNA extraction from plant samples
A total of 290 plant samples were collected in this study from TNNR, encompassing 282 species from 90 families and 204 genera, with eight species being duplicates (Suppl. material
Amplification and sequence analysis
In this study, a total of 268 DNA samples (including three replicates) were subjected to PCR amplification and gel electrophoresis analysis; part of the gel electrophoresis bands are shown in Fig.
Gel electrophoresis test and sequence peak maps. A Amplification detection of primers g-h, length approximately 50 bp; B Amplification detection of primers c-h, length approximately 150 bp (M: DL2000 DNA Marker; lanes from left to right: Clematis puberula var. ganpiniana, Lactuca sativa, Castanopsis tibetana, Cunninghamia lanceolata, Rorippa cantoniensis, Hedyotis chrysotricha and Lonicera japonica); C Normal sequence peaks (Artemisia caruifolia) and cross-peaks (Dicranopteris pedata).
Venn analysis and Phylogenetic tree maps at the order level. A Single dot indicates the number of endemic species identified within a group and multiple dots connected to the line indicate the number of shared species between groups; B Phylogenetic tree map for the order Liliales (including an outgroup); C Phylogenetic tree map for the order Ericales; D Phylogenetic tree map for the order Magnoliales. The numbers associated with each branch represent the bootstrap values obtained from the Maximum Likelihood analysis (note: the same colour represents the same genus).
The 162 sequences had an average base composition of 37.44% T (thymine), 22.35% C (cytosine), 23.05% A (adenine) and 17.16% G (guanine). The A+T content (60.49%) was markedly higher than the G+C content, indicating a clear bias towards A and T in the base composition. The average transition/transversion ratio (Ts/Tv), represented by the value R, was 1.0. Across the 162 sequences, 70 polymorphic sites were found, comprising 42 singleton sites and 28 parsimony-informative sites. A total of 71 haplotypes were detected; the average number of nucleotide differences (K) was 4.743, haplotype diversity (Hd) was 0.978 and nucleotide diversity (Pi) was 0.04840. At the order level, the 34 orders had haplotype diversity values ranging from 0.50000 to 1.00000 and nucleotide diversity values ranging from 0.00518 to 0.22655. Nucleotide diversity was relatively high in Magnoliales and Asparagales, with Pi values exceeding 0.1, reflecting a wide range of sequence variation and substantial nucleotide differences between individuals; Liliales and other orders had lower nucleotide diversity, with higher sequence similarity amongst individuals. The order Magnoliales had the highest number of polymorphic sites and, correspondingly, the largest nucleotide diversity value (Table
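The base composition figures can be reproduced in principle with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and it pools all bases across sequences, whereas MEGA reports an average over sequences, so the two can differ slightly when sequence lengths vary. The final lines simply check the arithmetic of the reported percentages.

```python
from collections import Counter


def base_composition(seqs):
    """Percentage of A, C, G and T pooled over all sequences.

    Gaps and ambiguity codes are ignored (an assumption of this sketch).
    """
    counts = Counter()
    for seq in seqs:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: 100 * counts[base] / total for base in "ACGT"}


def at_content(composition):
    """A+T percentage from a composition dictionary."""
    return composition["A"] + composition["T"]


# Percentages reported in the text for the 162 reference sequences.
reported = {"T": 37.44, "C": 22.35, "A": 23.05, "G": 17.16}
print(round(sum(reported.values()), 2))  # the four values should total 100
print(round(at_content(reported), 2))    # A+T as quoted in the text

# Invented two-sequence example showing the function on raw sequences.
print(base_composition(["AATT", "GCAT"]))
```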
Genetic diversity parameters of 162 sequences (including variable sites, sample size, number of haplotypes, haplotype diversity, nucleotide diversity and the number of average nucleotide differences)
Orders | Variable sites | Sample size | Number of haplotypes (Nh) | Haplotype diversity (Hd) | Nucleotide diversity (Pi) | Average number of nucleotide differences (K)
Liliales | 2 | 4 | 2 | 0.500 | 0.005 | 1.000
Cupressales | 5 | 2 | 2 | 1.000 | 0.027 | 5.000
Dipsacales | 12 | 5 | 5 | 1.000 | 0.038 | 6.500
Lamiales | 18 | 15 | 11 | 0.933 | 0.026 | 4.152
Fabales | 35 | 6 | 6 | 1.000 | 0.080 | 13.667
Ericales | 43 | 15 | 9 | 0.924 | 0.070 | 12.390
Poales | 30 | 18 | 11 | 0.915 | 0.058 | 9.935
Malpighiales | 16 | 5 | 5 | 1.000 | 0.042 | 7.800
Malvales | 6 | 2 | 2 | 1.000 | 0.029 | 6.000
Asterales | 5 | 4 | 3 | 0.833 | 0.013 | 2.500
Fagales | 16 | 3 | 2 | 0.667 | 0.055 | 10.667
Gentianales | 37 | 7 | 7 | 1.000 | 0.082 | 6.000
Ranunculales | 25 | 7 | 6 | 0.952 | 0.069 | 10.952
Magnoliales | 99 | 5 | 4 | 0.900 | 0.227 | 14.762
Rosales | 31 | 19 | 11 | 0.936 | 0.041 | 7.550
Solanales | 2 | 3 | 2 | 0.667 | 0.007 | 1.333
Caryophyllales | 27 | 9 | 9 | 1.000 | 0.068 | 11.250
Dioscoreales | 4 | 2 | 2 | 1.000 | 0.018 | 4.000
Asparagales | 32 | 3 | 3 | 1.000 | 0.119 | 22.333
Sapindales | 24 | 6 | 4 | 0.800 | 0.060 | 11.200
Sequence alignment and diet identification
Following strict data quality control, sequences with 100% similarity were clustered into one OTU to reduce the computational burden of subsequent species annotation. Finally, 56 OTU sequences were aligned against the reference database using BLAST, and 25 chloroplast trnL sequences were successfully matched to the local reference database. The matched sequences were further analysed and classified; they belonged to 19 families and 25 genera (Fig.
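The two steps described here, collapsing identical sequences and assigning each OTU to its best database match, can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the function names are hypothetical, the 97% identity cut-off is an assumed value that the text does not specify, and the example rows are invented. The column order is that of the standard BLAST tabular output (outfmt 6).

```python
from collections import Counter

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)


def dereplicate(reads):
    """Collapse reads that are identical strings into unique sequences with counts."""
    return Counter(read.upper() for read in reads)


def best_hits(blast_tsv, min_identity=97.0):
    """Keep, per query, the hit with the highest bit score above the cut-off.

    min_identity is an assumed threshold, not a value taken from the study.
    """
    best = {}
    for line in blast_tsv.strip().splitlines():
        if not line or line.startswith("#"):
            continue
        row = dict(zip(BLAST_COLUMNS, line.split("\t")))
        identity = float(row["pident"])
        bitscore = float(row["bitscore"])
        if identity < min_identity:
            continue
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[2]:
            best[row["qseqid"]] = (row["sseqid"], identity, bitscore)
    return best


# Invented BLAST rows for illustration; the numbers are not real results.
EXAMPLE_BLAST = "\n".join([
    "OTU29\tRosa_laevigata\t100.0\t149\t0\t0\t1\t149\t1\t149\t1e-75\t276",
    "OTU29\tPotentilla_freyniana\t97.3\t148\t4\t0\t1\t148\t1\t148\t1e-65\t250",
    "OTU99\tZea_mays\t90.0\t140\t14\t0\t1\t140\t1\t140\t1e-40\t160",
])

print(dereplicate(["ACGT", "acgt", "ACGA"]))
print(best_hits(EXAMPLE_BLAST))
```

Queries whose hits all fall below the cut-off are simply absent from the result, which corresponds to OTUs that could not be matched to the local database.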
The 25 forages that were successfully identified based on the local reference database for sika deer
Number | Forage plants | Sequence information | Sequence length (bp)
OTU2 | Dalbergia hupeana | GACTTAATTGGATTGAGCCTTGGTATGGAAACGTACCAAGTGATAACTTTCAAATTCAGAGAAACCCCGGAATTAACAATGGGCAATCCTGAGCCAAATCCCGTTTTCTGAAAGCAAAGAAAAATTAAGAAAGAAAAAGG | 140
OTU4 | Kadsura longipedunculata | GACTTGATTGGATTGAGCCTTAGTATGGAAACCTACTAAGTGGTAGCTTCCAAATTCAGAGAAACCCTGGAATTAAAAATGGGTAATCCTGAGCCAAATCCTGTTTTCAGAAAACAATGGTTTAGAAGTTTAGAAAGCGAGAATAAAAAAAAGGTAGG | 158
OTU5 | Loropetalum chinense | GACTTGATTAGCTTGAGCCTTGGTATGGAAACCTGCTAAGTGGTAACTTCCAAATTCAGAGAAACCCCGGAATTCAAAATGGGCAATCCTGAGCCAAATCCTGTTTTCCGAAAACAAAGACAAGGGTTCAGAAAGCGAGAATCAAAATAAAAAAAG | 156
OTU6 | Abelia chinensis | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACTAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATTAATAAAAATGGGCAATCCTGAGCCAAATCCAGTTTTACGAAAACAAGGGTTCAGAAAGCTAAAATCAAAAAG | 146
OTU9 | Phyllanthus urinaria | GACTTAATTGAATTGAGCCTTGGTATGGAAATCTACCAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATTAAAAATGGGCAATCCTGAGCCAAATCCAGTTTTCTGAAAACAAACAAAGGTTCGTATCATAAAGATAGAATAAATAAAG | 153
OTU17 | Mucuna sempervirens | GACTTAATTGGATTGAGTCTTGGTATGGAAACTTACCAAGTGAGAACTTTCAAATTCAGAGAAACCCTGGAATTCACAATGGGCAATCCTGAGCCAAATCCTCTTTTTCGAAAACAAAGATTTAAAGGAAAATAAAAAAGGG | 142
OTU19 | Zea mays | GACTTGATTGTATTGAGCCTTGGTATGGAAACCTGCTAAGTGGTAACTTCCAAATTCAGAGAAACCCTGGAATGAAAAATGGACAATCCTGAGCCAAATCCCTTTTTTGAAAAACAAGTGGTTGTCAAACTAGAACCCAAAGAAAAG | 147
OTU22 | Phyllostachys edulis | GACTTGATTGTATTGAGCCTTGGTATGGAAACCTGCTAAGTGGTAACTTCCAAATTCAGAGAAACCCTGGAATTAAAAAAGGGCAATCCTGAGCCAAATCCGTGTTTTGAGAAAACAAGTGGTTCTCGAACTAGAATCCAAAGGAAAAG | 149
OTU23 | Rhus chinensis | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACCAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATCAAAAATGGGCAATCCTGAGCCAAATCCTATTTAATGAGAACAAAAACAAACAAGGGGTCAGAACGGGAGAAAGAG | 149
OTU25 | Cunninghamia lanceolata | GACTTAAATTTTTTGAGCCTTGGTATGGAAACTTACCAAGTGATAGCATCCAAATCCAGGGAACCCTGGGATATTTTGAATGGGCAATCCTGAGCCAAATCCGATTTCTGGAGACAATAGTCTCCTATCCTAGAAAGG | 138
OTU27 | Lysimachia congestiflora | GACTTGATTAGCTTGAGCCTTGGTATGGAAACCTGCTAAGTGGTAACTTCCAAATTCAGAGAAACCCCGGAATTCAAAATGGGCAATCCTGAGCCAAATCCTCTTTTTCGAAAACAAAGATTTAAAGGAAAATAAAAAAGGG | 142
OTU28 | Viburnum dilatatum | GACTTAATTGAATTGAGCCTTGGTATGGAAACCTACTAAGTGAGAACTTTCAAATTCAGAGAAACCCTGGAATTAATAAAAATGGGCAATCCTGAGCCAAATCCTGTTTTCCGAAAACAAACAAAGAATCGAAAAAAAG | 139
OTU29 | Rosa laevigata | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACCAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATTAAAAATGGGCAATCCTGAGCCAAATCCCGTTTTATGAAAACAAACAAAGTTTGCGAAAGCGAGAATAAAAAAAAG | 149
OTU37 | Trachelospermum jasminoides | GACTTAATTGGATTGAGCCTTGGTAAGGAAACCTACTAAGTGATGACTTTCAAATTCAGAGAAACCCCGGAATTAAGAAAAAGGGCAATCCTGAGCCAAATCCTATTTTCCACAAACAAAGGTTCAGAAAACGAAAACAAG | 141
OTU39 | Citrus reticulata | GACTTAATTGGATTGAGCCTTAGTATGGAAACTTACTAAGTGATAACTTTCAAATTCAGAGAAACCCAGGAATTAAAAATGGGTAATCCTGAGCCAAATCCTCTTCTCTTTTCCAAGAACAAACAGGGGTTCAGAAAGCGAAAAAGGGG | 149
OTU40 | Ardisia japonica | GACTTAATTGGATTGAGCCTTAGTATGGAAACCTACTAAGTGAGAACTTTCAAATTCAGAGAAACCCTGGAATTAATAAAAATGGGCAATCCTGAGCCAAATCCTCTTTTTCGAAAACAAAGATTAAAGGAAAATAAAAAAGAGG | 145
OTU42 | Camphora officinarum | GACTTGGTTGGATTGAGCCTTGGTATGGAAACCTACTAAGTGATAACTTCCAAATTCAGAGAAACCCTGGAATTAAAAATGGGCAATCCTGAGCCAAATCCTGTTTTCAGAAAACAAGGGTTCAGAAAGCGAGAACCAAAAAAG | 144
OTU44 | Potentilla freyniana | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACCAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATTAAAAATGGGCAATCCTGAGCCAAATCCCGTTTTATGAAAACAAACAAGGGTTTCATAAACCGAGAATAAAAAAG | 148
OTU45 | Persicaria perfoliata | GACTTAATTGGATTGAGCCTTGGTATGGAAACTTACTAAGTGAGAACTTTCAAATTCAGAGAAACCCTGGAAGTAAAAAAGGGCAATCCTGAGCCAACTCCTGCTTTCCAAAAGGAAAGAAAAAGAG | 127
OTU48 | Eurya nitida | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACTAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATTAATAAAAATGGGCAATCCTGAGCCAAATCCTGTTTTTCGAAAACAAACAAAGATTCAGAAAGCGAAAATCAAAAAAG | 151
OTU49 | Callicarpa bodinieri | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACTAAGTGAGAACTTTCAAATTCAGAGAAACCCCGGAATTAATAAAAATGGGCAATCCTGAGCCAAATCCTGTTTTCTCAAAACAAAGGTTCAAAAAACGAAAAAAAAG | 143
OTU50 | Celtis biondii | GACTTAATTGGATTGAGCCTTGGTATGGAAACCTACCAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATTAAAAAAAATGGGCAATCCTGAGCCAAATCCGGTTTTCTGAAAACAAACAAGGATTCAGGATTCAGAAAGCGATAATAAAAAAGAATCG | 162
OTU53 | Pleuropterus multiflorus | GACTTAATTGGTTTGAGCCTTAGTATGGAAACCTACTAAGTGAGAACTTTCAAATTCAGAGAAACCCTGGAATTAAAAAAATGGGCAATCCTGAGCCAACTCCTTCTTTCCAAAAGGAAGAAAAAAG | 127
OTU54 | Akebia trifoliata | GACTTGATTGGATTGAGCCTTGGTATGGAAACCTACTAAGTGATAACTTTCAAATTCAGAGAAACCCTGGAATGAAAAATGGGCAATCCTGAGCCAAATCCTGTTTTCAGAAAAAAAAAGGTTCAGAAAGCGAGATTAAAAAAATAAAGGAAG | 153
OTU55 | Reynoutria japonica | GACTTAATTGGTTTGAGCCTTAGTATGGAAACCTACTAAGTGAGAACTTTCAAATTCAGAGAAACCCTGGAATTAAAAAAATGGGCAATCCTGAGCCAACTCCTGCTTTCCAAAAGGAAAGAAAAAGAG | 129
Sample quality and barcode selection
The quality and selection of plant samples play a crucial role in efficient database construction. Most plants germinate in the warm spring, subsequently enter a period of colour change and leaf shedding and complete their life cycle in winter (
Selecting the optimal DNA barcode is crucial and should take into account the inter- and intraspecific characteristics of different plant taxa (
In this study, amplification with the primer pairs c-h and g-h was compared; c-h showed better amplification efficiency, although some plant species amplified better with g-h. Additionally, identifying all plant species is likely to be difficult because no primer pair is truly universal, which may be one of the factors limiting our species identification rate. Although many studies have searched for a universal plant barcode, none of the available loci works across all species (
Local barcoding database of potential foraging plants
The accuracy of species identification is determined by the quality and completeness of reference databases (
Diet composition
Diet composition analysis contributes to assessing the nutritional intake of wildlife. Shrubs, which contain more protein and mineral elements than herbs, make up an important proportion of the forage plants of sika deer. In winter, when food resources may be scarce, sika deer prefer woody vegetation, particularly shrubs, which helps to ensure a balanced diet (
Different herbivores have different tolerances to tannins, with ruminants having been shown to tolerate a certain amount of tannins in their natural diet (
This study provides new insight into diet identification for sika deer and other sympatric herbivores based on high-throughput sequencing and a local database, which is essential for clarifying dietary nutrition, food utilisation and transmission and the structure and functionality of ecosystems. However, the construction of a DNA barcoding database is a complex and long-term process that requires continuous optimisation and improvement. In follow-up studies, a more complete local DNA barcoding database should be constructed to cover more potential forage plants and further improve the accuracy of diet identification for herbivores.
We are thankful to Xiaohong Liu, Yongjiang Chen and Yulu Chen from the Taohongling National Nature Reserve and Ruitao Wu, Zhiwei Wu and Ruifeng Wu from Jiangxi Agricultural University for their assistance with sample collection. Additionally, we extend our appreciation to Huiyue Kou and others for their assistance with the laboratory work and data analysis.
This work was supported by grants from the National Natural Science Foundation of China (No. 32470552 and No. 31960118) and the Natural Science Foundation of Jiangxi Province of China (No. 20242BAB25349).
No animals were captured and faecal sample analyses were performed, based on the non-invasive principle.
YL performed the experiments, analysed the data, prepared figures and tables and/or approved the final draft. DW and ZC collected the samples, ZB assisted in conducting the experiments, WL and WZ analysed the data and YX conceived and designed the experiments, analysed the data, prepared figures and/or tables and approved the final draft.
Collection information of potential forage plants for sika deer in TNNR.
The local reference database based on the trnL gene (containing 162 plants’ DNA sequence information).