Biodiversity Data Journal :
Research Article
Corresponding author: Caroline Chimeno (chimeno@snsb.de)
Academic editor: Ralph Peters
Received: 13 Apr 2023 | Accepted: 02 Jun 2023 | Published: 04 Jul 2023
© 2023 Caroline Chimeno, Stefan Schmidt, Hasmiandy Hamid, Raden Pramesa Narakusumo, Djunijanti Peggie, Michael Balke, Bruno Cancian de Araujo
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Chimeno C, Schmidt S, Hamid H, Narakusumo RP, Peggie D, Balke M, Cancian de Araujo B (2023) DNA barcoding data release for the Phoridae (Insecta, Diptera) of the Halimun-Salak National Park (Java, Indonesia). Biodiversity Data Journal 11: e104942. https://doi.org/10.3897/BDJ.11.e104942
Launched in 2015, the large-scale initiative Indonesian Biodiversity Discovery and Information System (IndoBioSys) is a multidisciplinary German-Indonesian collaboration with the main goal of establishing a standardised framework for species discovery and all associated steps. One aspect of the project is the application of DNA barcoding for species identification and biodiversity assessments. In this framework, we conducted a large-scale assessment of the insect fauna of the Mount Halimun-Salak National Park, which is one of the largest tropical rain-forest ecosystems left in West Java. In this study, we present the results of processing 5,034 specimens of Phoridae (scuttle flies) via DNA barcoding. Despite limited sequencing success, we obtained roughly 500 clusters (489 to 522, depending on the algorithm and threshold) using RESL, ASAP and SpeciesIdentifier. Moreover, Chao statistics indicated that we drastically undersampled all trap sites, implying that the true diversity of Phoridae is, in fact, much higher. With this data release, we hope to shed some light on the hidden diversity of this megadiverse group of flies.
tropical forest, Indonesia, Brachycera, Phoridae, Malaise trap, biodiversity, DNA barcoding
Indonesia is the world’s largest archipelago, comprising over 17,000 islands and 95,000 kilometres of coastline (
In 2015, the three-year research project, Indonesian Biodiversity Discovery and Information System (IndoBioSys; www.indobiosys.org), was launched to provide a foundation for the large-scale exploration of the species diversity of Indonesia. Funded by the Indonesian and German Ministries of Research and Education, IndoBioSys is a German-Indonesian collaboration between the Museum für Naturkunde Berlin (MfN), the SNSB-Zoologische Staatssammlung München (ZSM) and the Museum Zoologicum Bogoriense (MZB), Research Center for Biology – LIPI, in Cibinong, Indonesia. Its main goal is to develop a standardised framework for species discovery including all associated steps (e.g. processing, documentation, storage, online inventory) (see
One work package of IndoBioSys is dedicated to the large-scale assessment of the insect fauna of the Mount Halimun-Salak National Park, which is one of the largest tropical rain-forest ecosystems left in West Java. To achieve this, a total of 34 Malaise traps were set up at four localities of the Park in 2015 and 2016. Malaise traps are commonly used for sampling of terrestrial insects because they provide standardised sampling, are very effective at capturing flying insects and are easy to use (
Here, we present the results of large-scale DNA barcoding applied to specimens of Phoridae (scuttle flies) that were captured with the aforementioned Malaise traps. Phorids are megadiverse, highly abundant, have a worldwide distribution and occupy all trophic levels in an environment (
All fieldwork and laboratory procedures were conducted in the framework of IndoBioSys. These are presented in
Samples processed in this study originate from eight Malaise traps that were deployed in the Halimun-Salak National Park (Fig.
Metadata of the collection samples processed in this study, including Malaise trap data, number of phorid specimens processed with DNA barcoding and number of COI-sequences obtained.
Collection sample | Nr. of phorid specimens | Nr. of COI sequences | Malaise trap | Locality | Lat | Long | Elevation (m a.s.l.)
1 | 172 | 157 (91%) | Trap 1 | Cidahu | -6.73761 | 106.714 | 1233
2 | 325 | 234 (72%) | Trap 2 | Cidahu | -6.73438 | 106.713 | 1310
3 | 190 | 124 (65%) | Trap 3 | Cidahu | -6.72846 | 106.712 | 1432
4 | 977 | 161 (17%) | Trap 4 | Cidahu | -6.72636 | 106.714 | 1474
5 | 55 | 0 (0%) | Trap 5 | Cikaniki | -6.75045 | 106.532 | 1233
6 | 258 | 0 (0%) | Trap 6 | Cikaniki | -6.75 | 106.531 | 1276
7 | 1,791 | 1,465 (82%) | Trap 7 | Cikaniki | -6.74863 | 106.536 | 1121
8 | 1,266 | 744 (59%) | Trap 8 | Cikaniki | -6.74775 | 106.537 | 1095
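The sequencing success figures in the table follow from simple ratios of sequences obtained to specimens processed. The following Python sketch recomputes them from the tabulated counts; the variable and function names are illustrative assumptions and are not taken from the authors' R script.

```python
# Counts copied from Table 1: sample -> (specimens processed, COI sequences obtained)
TABLE_1 = {
    1: (172, 157),
    2: (325, 234),
    3: (190, 124),
    4: (977, 161),
    5: (55, 0),
    6: (258, 0),
    7: (1791, 1465),
    8: (1266, 744),
}


def success_rate(specimens: int, sequences: int) -> float:
    """Return sequencing success as a percentage of processed specimens."""
    return 100.0 * sequences / specimens


# Totals across all eight collection samples
total_specimens = sum(n for n, _ in TABLE_1.values())
total_sequences = sum(s for _, s in TABLE_1.values())
overall = success_rate(total_specimens, total_sequences)

for sample, (n, s) in TABLE_1.items():
    print(f"sample {sample}: {s}/{n} = {success_rate(n, s):.1f}%")
print(f"overall: {total_sequences}/{total_specimens} = {overall:.1f}%")
```

The totals come to 5,034 specimens and 2,885 sequences, i.e. an overall success of about 57%. Recomputed per-sample values can differ from the published percentages by a point because of rounding (sample 4 works out to roughly 16.5%).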
We extracted the gDNA by submerging the entire specimen in 10 μl of Lucigen QuickExtract solution and heating it to 65°C for 18 minutes, followed by 98°C for 2 minutes. We then amplified a 313 bp fragment of the cytochrome c oxidase subunit I (COI) gene with the following primer combination: mlCO1intF: 5′-GGWACWGGWTGAACWGTWTAYCCYCC-3′ (
We pooled all the amplicons into a 50 ml Falcon tube, based on the presence and intensity of bands on gels. The pooled samples were cleaned with Bioline SureClean Plus and then outsourced for library preparation at the Genome Institute of Singapore (GIS) using NEBNext DNA Library Preparation Kits (NEB). Paired-end sequencing was carried out on Illumina HiSeq 2500 platforms (2 × 250 bp). We processed the raw Illumina reads through the bioinformatics pipeline and quality-control filters outlined by
All specimen metadata and sequence data were uploaded to the Barcode of Life Data System (BOLD), an online workbench and database. All data are publicly available on BOLD as a dataset with a citable DOI (dx.doi.org/10.5883/DS-IBSPHOR). We applied the RESL algorithm that is provided as part of the analysis tools in BOLD, Assemble Species by Automatic Partitioning (ASAP;
Using R version 4.2.1 (
All data are publicly available on BOLD as a dataset with a citable DOI (dx.doi.org/10.5883/DS-IBSPHOR). The R script and input data are deposited on Figshare (R script: https://doi.org/10.6084/m9.figshare.21806370; BIN input data: https://doi.org/10.6084/m9.figshare.21815142; ASAP input data: https://doi.org/10.6084/m9.figshare.21815064).
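Threshold-based delimitation, such as the 2% and 3% settings used with SpeciesIdentifier, can be illustrated with a minimal Python sketch. It assumes pre-aligned sequences of equal length, uncorrected p-distances and single-linkage grouping. It is not SpeciesIdentifier's implementation or the authors' workflow, and all function names and toy sequences are hypothetical.

```python
from itertools import combinations

# Hypothetical substitution map used only to build toy sequences
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq, positions):
    """Return a copy of seq with a substitution at each given position."""
    chars = list(seq)
    for i in positions:
        chars[i] = SWAP[chars[i]]
    return "".join(chars)


def p_distance(a, b):
    """Uncorrected p-distance over sites where both sequences have A/C/G/T."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def cluster_by_threshold(seqs, threshold):
    """Single-linkage clustering: link any pair with distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Toy data: a 200 bp reference plus variants at 1%, 2.5% and 10% divergence
reference = "ACGT" * 50
seqs = {
    "a": reference,
    "b": mutate(reference, range(0, 2)),      # 2 differences from a (1%)
    "c": mutate(reference, range(10, 15)),    # 5 differences from a (2.5%)
    "d": mutate(reference, range(100, 120)),  # 20 differences from a (10%)
}

for threshold in (0.02, 0.03):
    print(threshold, len(cluster_by_threshold(seqs, threshold)))
```

In this toy set, raising the threshold from 2% to 3% merges sequence c into the cluster of a and b, which mirrors why a 3% threshold yields fewer MOTUs than a 2% threshold on the same data.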
We processed 5,034 phorid specimens and recovered 2,885 COI-barcode sequences, which represents a sequencing success of 57%. We obtained a total of 522 sequence clusters with the RESL algorithm, 504 MOTUs with ASAP, and 506 and 489 MOTUs with SpeciesIdentifier at the 2% and 3% thresholds, respectively (Table
Number of clusters obtained from the COI sequence data of each Malaise trap when applying different clustering algorithms (RESL; ASAP) including output results from biodiversity assessments.
Output | RESL | ASAP | SpeciesIdentifier (2%)
Sample size (n) | 2,886 | 2,886 | 2,886
Number of observed clusters | 522 | 504 | 506
Number of rare clusters | 365 | 353 | 354
Sample coverage | 90.4% | 90.5% | 90.5%
Chao1 estimate | 950 ± 72 | 969 ± 80 | 977 ± 80
Extrapolation to 2n | 725 ± 0.9 | 711 ± 0.9 | 715 ± 0.9
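The Chao1 and sample coverage rows rest on the numbers of clusters seen exactly once (singletons, f1) and exactly twice (doubletons, f2). The Python sketch below shows the classic Chao1 formula and the Chao and Jost coverage estimator on a hypothetical abundance vector. Because the table does not report f1 and f2, it cannot reproduce the published values, and it is an illustration rather than the authors' R script.

```python
def singletons_doubletons(abundances):
    """Return (f1, f2): clusters observed exactly once and exactly twice."""
    f1 = sum(1 for a in abundances if a == 1)
    f2 = sum(1 for a in abundances if a == 2)
    return f1, f2


def chao1(abundances):
    """Classic Chao1 richness estimate from per-cluster abundances."""
    counts = [a for a in abundances if a > 0]
    s_obs = len(counts)
    f1, f2 = singletons_doubletons(counts)
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Fallback form used when no doubletons are present
    return s_obs + f1 * (f1 - 1) / 2


def sample_coverage(abundances):
    """Chao and Jost estimator of sample coverage (between 0 and 1)."""
    counts = [a for a in abundances if a > 0]
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two individuals")
    f1, f2 = singletons_doubletons(counts)
    if f1 == 0:
        return 1.0
    return 1.0 - (f1 / n) * ((n - 1) * f1 / ((n - 1) * f1 + 2 * f2))


# Hypothetical community: 27 specimens in 9 clusters, f1 = 4, f2 = 2
toy = [10, 6, 3, 2, 2, 1, 1, 1, 1]
print("observed clusters:", len(toy))
print("Chao1 estimate:", chao1(toy))
print("sample coverage:", round(sample_coverage(toy), 3))
```

For the toy vector, Chao1 adds f1²/(2·f2) = 16/4 = 4 unseen clusters to the 9 observed ones. A large share of singletons, as in the phorid data, pushes the estimate well above the observed count in the same way.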
Accumulation curves for number of clusters obtained with each method: a. Diversity profile for sequence clusters obtained with the RESL algorithm; b. Diversity profile for sequence clusters obtained with the ASAP algorithm; c. Diversity profile for sequence clusters obtained with the SpeciesIdentifier using a 2% threshold; d. Diversity profile for sequence clusters obtained with the SpeciesIdentifier using a 3% threshold. The empirical (BIN counts; dotted blue) and estimated (Chao1; red) diversity profiles for communities where Malaise traps were deployed, as quantified by Hill numbers (q) from 0 to 3 with 95% confidence intervals (shaded areas, based on bootstrap analysis of 100 permutations). Species richness is depicted by q = 0; Shannon diversity by q = 1; and Simpson diversity by q = 2.
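The Hill numbers behind the diversity profiles can be written as (Σ p_i^q)^(1/(1−q)), with the limit q → 1 equal to the exponential of Shannon entropy. The following Python sketch computes only the empirical profile for a hypothetical abundance vector. It omits the Chao-based estimation and bootstrap confidence intervals shown in the figure, and the function name is an assumption.

```python
import math


def hill_number(abundances, q):
    """Empirical Hill number of order q from per-cluster abundances."""
    counts = [a for a in abundances if a > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("no individuals in sample")
    props = [a / total for a in counts]
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of Shannon entropy
        return math.exp(-sum(p * math.log(p) for p in props))
    return sum(p ** q for p in props) ** (1.0 / (1.0 - q))


# Hypothetical community dominated by a few abundant clusters
toy = [10, 6, 3, 2, 2, 1, 1, 1, 1]
for q in (0, 1, 2):
    print(f"q = {q}: {hill_number(toy, q):.2f}")
```

At q = 0 every cluster counts equally, so the value equals the number of observed clusters. Higher orders down-weight rare clusters, which is why profiles for communities with many singletons drop steeply as q increases.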
Against the backdrop that the majority of the tropics' biodiversity is associated with insects, studying the megadiverse Diptera in such a setting can be overwhelming. Fortunately, the ongoing development of molecular techniques enables fast and accurate diversity assessments, coupled with a much smaller workload than when applying traditional methodologies. Here, we analyse more than 5,000 phorid specimens that were captured with Malaise traps in the Mt Halimun-Salak National Park, located in West Java, Indonesia, to provide a first glimpse of their truly impressive species-richness.
As depicted in the diversity profiles (Figs. 2b-c), we significantly undersampled our sampling sites, indicating that the true diversity of Phoridae is, in fact, much higher. This is not surprising, as we: (1) processed samples that were collected only with Malaise traps; (2) processed specimens from only eight samples and (3) obtained a comparatively low sequencing success. Malaise traps are one of the most effective methods for capturing flying insects (
Scanning through the literature, we were able to find only a handful of studies referring to the fauna of Phoridae specifically from the Indomalayan region, and those that exist focus only on single species or genera without providing information at a larger scale (see
The IndoBioSys project was developed to inventory the insect biodiversity of the Halimun-Salak National Park in order to establish a system that provides baseline information on Indonesia’s entomofauna. With this study (and all past studies conducted in the framework of this project), we show how little is really known about the diversity of generally abundant insect groups like the scuttle flies and that a large proportion of species is still awaiting discovery. For example, when
We thank the Ministry of Research and Higher Education of the Republic of Indonesia for providing a foreign research permit to BCA, SS and OS (number 2B/TKPIPA/E5/Dit.KI/ II/2016). The IndoBioSys project was funded by the German Federal Ministry of Education and Research (BMBF) within the bilateral "Biodiversity and Health" funding programme (Project numbers: 16GW0111K, 16GW0112); the Indonesian counterpart institutions were funded by DIPA PUSLIT Biologi LIPI 2015-2016.