The Mt Halimun-Salak Malaise Trap project - releasing the most species rich DNA Barcode library for Indonesia

Research Article. Biodiversity Data Journal 6: e29927. Pensoft Publishers. ISSN 1314-2836, 1314-2828. doi: 10.3897/BDJ.6.e29927
urn:lsid:arphahub.com:pub:F9B2E808-C883-5F47-B276-6D62129E4FF4
urn:lsid:zoobank.org:pub:245B00E9-BFE5-4B4F-B76E-15C30BA74C02

Authors:
Bruno Cancian de Araujo (1), chalcididae@gmail.com, https://orcid.org/0000-0002-0562-917X
Stefan Schmidt (1), https://orcid.org/0000-0001-5751-8706
Olga Schmidt (1), https://orcid.org/0000-0001-8229-8573
Thomas von Rintelen (2), https://orcid.org/0000-0002-6253-3078
Rosichon Ubaidillah (3)
Michael Balke (1), https://orcid.org/0000-0002-3773-6586

Affiliations:
1 SNSB-Zoologische Staatssammlung München, Munich, Germany
2 Museum für Naturkunde, Leibniz-Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
3 Museum Zoologicum Bogoriense, Research Center for Biology, Indonesian Institute of Sciences, Cibinong, Indonesia

Taxon classification: Chalcidoidea, Coleoptera, Diptera, Hymenoptera, Lepidoptera
Subject classification: Biodiversity & Conservation, Faunistics & Distribution, Molecular systematics, Systematics, Taxonomy
Geographical classification: Indonesia, Java
Corresponding author: Bruno Cancian de Araujo (chalcididae@gmail.com).
Academic editor: Gergin Blagoev
Received: 21 September 2018. Accepted: 28 November 2018. Published: 19 December 2018.
Copyright: Bruno Cancian de Araujo, Stefan Schmidt, Olga Schmidt, Thomas von Rintelen, Rosichon Ubaidillah, Michael Balke. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
The Indonesian archipelago features an extraordinarily rich biota. However, the actual taxonomic inventory of the archipelago remains highly incomplete and there is hardly any significant taxonomic activity that utilises recent technological advances. The IndoBioSys project was established as a biodiversity information system aiming at, amongst other goals, creating inventories of the Indonesian entomofauna using DNA barcoding. Here, we release the first large-scale assessment of the megadiverse insect groups that occur in the Mount Halimun-Salak National Park, one of the largest tropical rain-forest ecosystems in West Java, with a focus on Hymenoptera, Coleoptera, Diptera and Lepidoptera collected with Malaise traps. From September 2015 until April 2016, 34 Malaise traps were placed in different localities in the south-eastern part of the Halimun-Salak National Park. A total of 4,531 specimens were processed for DNA barcoding and 2,382 individuals produced barcode compliant records, representing 1,195 exclusive BINs or putative species in 98 insect families. A total of 1,149 BINs were new to BOLD. Of the 1,195 BINs detected, 804 BINs were singletons and more than 90% of the BINs incorporated fewer than five specimens. The astonishing heterogeneity of BINs, with as few as 1.1 successfully processed specimens per exclusive BIN in Diptera, shows that the cost/benefit ratio of discovering new species in those areas is very low. In four families of Chalcidoidea, a superfamily of the Hymenoptera, the number of discovered species was higher than the number of species known from Indonesia, suggesting that our samples contain many species that are new to science. These numbers show how fast molecular pipelines contribute substantially to the objective inventorying of the fauna, giving us a good picture of how diverse tropical areas might be.
Introduction
The Indonesian archipelago features an extraordinarily rich biota that is, amongst other factors, derived from its sheer size and geographic position, basically linking the Oriental and Australian regions. This transition was first described in detail by Wallace (1860), who laid the foundation for the discipline of biogeography in this region. Our understanding of the biogeography of the region has steadily advanced since then, increasingly embracing new technology and interdisciplinary research approaches (see Lohman et al. 2011). However, the actual taxonomic inventory of the archipelago remains highly incomplete (see Schmidt 2015) and there is hardly any significant taxonomic activity that utilises recent technological advances (but see Barlow and Woiwod 1990, Riedel et al. 2013, Riedel et al. 2014, Wibowo et al. 2017, Hubert et al. 2015, Dahruddin et al. 2016, Cancian de Araujo et al. 2017, Cancian de Araujo et al. 2018). Large-scale databasing, in particular of hyperdiverse invertebrates of the region, is also in its infancy. To date, GBIF features only 147,463 published occurrence records for Indonesia, covering 13,210 species. This is surprisingly few compared, for example, to Germany with 37,917,568 occurrences and 16,742 species (GBIF, accessed on 1 July 2018). At the same time, vast areas of supposedly high biodiversity disappear every year (Brooks et al. 2002, Curran et al. 2004, Gaveau et al. 2013, Wilcove et al. 2013, Abood et al. 2014, Margono et al. 2014) and with them, possibly thousands of species never formally known to mankind, which also means a significant loss of ecosystem services and of knowledge of potentially useful compounds (see Hooper et al. 2005, Loreau 2009, Norris 2011).
The Indonesian and German ministries of Research and Education have therefore provided funding to establish a biodiversity information system (IndoBioSys) that integrates occurrence databasing, species discovery and species characterisation, using morphology and DNA sequence data, specimen vouchering, as well as integrated tools for the discovery of substances of potential use for society. IndoBioSys is, therefore, a case study and foundation for the large-scale exploration of Indonesian species diversity. Moreover, IndoBioSys could be a foundation for the empirical and objective scientific assessment of species distribution patterns across the archipelago, as needed, for example, for conservation priority setting.
One work package of the IndoBioSys project was an assessment of the species diversity of the hyperdiverse insect fauna of the Mount Halimun-Salak National Park in West Java, with a focus on sampling with Malaise traps. The National Park has been recognised as one of the largest tropical rain-forest ecosystems left in Java. It was designated as a National Park in 2003 and presently covers about 113,357 hectares. Malaise trapping (Malaise 1937, Townes 1972) is a method that allows standardised sampling of flying insects, including a number of highly diverse groups of minute species, e.g. in the Diptera and Hymenoptera.
Subsets of the samples obtained were submitted to a well-established pipeline employing DNA barcoding (Hebert et al. 2003, Ivanova et al. 2006) in order to estimate species diversity (see Ratnasingham and Hebert 2007, Ratnasingham and Hebert 2013) and to obtain data for future beta diversity studies with data from other localities.
Here, we release these data with an analysis of their taxonomic content, an approximation of the species diversity encountered and an evaluation of the novelty of the data with respect to publicly available data from the Barcode of Life Data Systems (BOLD).
Materials and Methods
A summary of the fieldwork and laboratory procedures employed in the IndoBioSys project was presented by Schmidt et al. (2017). Methodological steps specific to the work package presented here are described below.
Fieldwork and sample processing
From September 2015 until April 2016, 34 Malaise traps (Townes style, Townes 1972) were placed in four different localities in the southeast of the Halimun-Salak National Park (Fig. 1). The elevation ranged from 932 to 1,638 m with an average of 1,218 m.
The traps were run for about 120 days in total and the collecting bottles were changed monthly. The collecting liquid was 300 ml of 96% ethanol in each bottle.
The samples were taken to the IndoBioSys Indonesian laboratory at the Museum Zoologicum Bogoriense (MZB) in Cibinong, West Java. Using a 3 mm mesh sieve, they were divided into two fractions according to the size of the animals, with the smaller specimens passing through the sieve into a collecting tray.
This fractioning is important for optimising the sorting process as well as for separating the specimens that will be sent entirely for molecular laboratory processing ("voucher recovery pipeline") from the ones that are large enough for a procedure where only one or more legs are removed from the voucher for laboratory use ("leg picking pipeline"). Most of the fractions were sent to the IndoBioSys laboratory at SNSB-ZSM in Munich, where they were sorted to order and family level.
Given the enormous number of specimens (we estimated that over 300,000 invertebrate specimens were collected during the project), the orders Coleoptera and Hymenoptera were chosen as the main target groups for the present analysis. Selected groups of Diptera, in particular Syrphidae and Phoridae, will be dealt with in a separate data release; for Diptera, we present here only the results for a few specimens picked at random from the samples. For Coleoptera and Hymenoptera, specimens were taken quantitatively from the samples, except where a sample contained a long series of morphologically similar individuals. In these cases, only a smaller number of specimens representing the morphological diversity of the series was chosen, in order to prevent cryptic species bias. The number of specimens taken was determined on a case-by-case basis.
Lepidoptera, another target group of the IndoBioSys project, were collected using a different method, as described earlier (Schmidt et al. 2017). Some Geometridae that were collected using Malaise traps and that were suitable for morphological analysis were processed and included in the present study. A specific release of the geometrid barcode data is currently being prepared (OS in prep.).
All specimens that were not further processed were repatriated to the MZB as ethanol samples. All processed specimens were returned to MZB as dry mounted and labelled voucher specimens (Fig. 2 and Fig. 3).
All specimen data are accessible in BOLD as a single citable dataset (dx.doi.org/10.5883/DS-IDBMTP). The data include collecting locality, geographic coordinates, elevation, collector, one or more digital images, identifier and voucher depository. Sequence data can be obtained through BOLD and include a detailed LIMS report, primer information and access to trace files. The sequences are also available on GenBank (accession numbers MH926363-MH929079).
Data analysis
Locality information and molecular data from the Malaise trapping programme were downloaded from the BOLD IndoBioSys campaign projects. The downloaded records were separated by trap and by insect order into individual Excel worksheets for the analysis of spatial distribution and diversity. Here, we focus only on the orders Hymenoptera, Coleoptera, Diptera and Lepidoptera.
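The separation of records by trap and by insect order can also be sketched programmatically. The following is a minimal illustration only, not the workflow used in this study (which relied on spreadsheets); the record fields (trap, order, bin) and the example values are hypothetical and do not reflect the actual BOLD export format.

```python
from collections import defaultdict

# Hypothetical example records; field names and values are assumptions
# made for illustration and are not taken from the BOLD export.
records = [
    {"trap": "MT01", "order": "Hymenoptera", "bin": "BIN-A"},
    {"trap": "MT01", "order": "Hymenoptera", "bin": "BIN-B"},
    {"trap": "MT01", "order": "Coleoptera", "bin": "BIN-C"},
    {"trap": "MT02", "order": "Diptera", "bin": "BIN-D"},
    {"trap": "MT02", "order": "Diptera", "bin": "BIN-D"},
]


def group_by_trap_and_order(rows):
    """Split records into one group per (trap, order) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["trap"], row["order"])].append(row)
    return dict(groups)


def summarise(groups):
    """Count specimens and distinct BINs in each (trap, order) group."""
    return {
        key: {
            "specimens": len(rows),
            "bins": len({row["bin"] for row in rows}),
        }
        for key, rows in groups.items()
    }


summary = summarise(group_by_trap_and_order(records))
for (trap, order), counts in sorted(summary.items()):
    print(trap, order, counts["specimens"], counts["bins"])
```

Each (trap, order) group plays the role of one worksheet, and the summary gives the number of specimens and of distinct BINs per group.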
Results
A total of 4,531 specimens were prepared for DNA barcoding. Of these, we obtained cox1-5P sequences from 2,732 individuals. Sequences from 2,598 of these individuals were longer than 300 base pairs. In total, 2,380 individuals produced barcode compliant records (Table 1). The success rate was therefore comparatively low, with only 60.5% on average, varying between the samples from 2.7% to 100% (Fig. 1).
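As a rough illustration of how such rates can be computed, the sketch below derives pooled proportions from the overall counts given above and contrasts a pooled rate with an unweighted mean of per-sample rates. The per-sample counts are hypothetical, and it is an assumption here that an average over samples is the relevant quantity; the two measures generally differ when sample sizes vary.

```python
from statistics import mean


def success_rate(successful, processed):
    """Percentage of processed specimens that were successful."""
    if processed <= 0:
        raise ValueError("processed must be positive")
    return 100.0 * successful / processed


# Overall counts as reported in the text.
processed_total = 4531
stages = {
    "cox1-5P sequence obtained": 2732,
    "longer than 300 bp": 2598,
    "barcode compliant": 2380,
}
pooled = {
    name: round(success_rate(count, processed_total), 1)
    for name, count in stages.items()
}

# Hypothetical per-sample counts (successful, processed), for illustration only.
samples = [(10, 40), (30, 30), (45, 90)]
per_sample = [success_rate(ok, total) for ok, total in samples]

# Unweighted mean over samples versus the rate pooled over all specimens.
mean_rate = mean(per_sample)
pooled_rate = success_rate(
    sum(ok for ok, _ in samples), sum(total for _, total in samples)
)

print(pooled)
print(round(mean_rate, 1), round(pooled_rate, 1))
```

In this toy example, the small sample with a 100% success rate raises the unweighted mean above the pooled rate.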
These 2,380 individuals represent 1,197 exclusive BINs or putative species. They could be assigned to 98 different insect families (Table 2). Gunung Botol had the highest success rate (80.9%) and Sukamantri the lowest (32.2%) in terms of processed specimens producing barcode compliant sequences. Of those 1,197 exclusive BINs, only 46 BINs (3.8%) are not new to BOLD. Only 15 BINs were each represented by more than 10 specimens. A total of 804 BINs were singletons and more than 90% of the BINs were recorded with fewer than five specimens (Fig. 4).
The highest diversity of BINs was found in Hymenoptera (712 BINs), followed by Coleoptera (398), Diptera (53) and Lepidoptera (34). The diversity per order was always high, with two or fewer individuals per BIN on average. The diversity per family was also impressive, with 50% of the families being composed of BINs represented by singletons or doubletons.
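The abundance measures used above (singletons, doubletons and the average number of specimens per BIN) can be illustrated with a short sketch. The function and the toy list of BIN labels are hypothetical; only the totals in the last lines (2,380 specimens, 1,197 BINs, 804 singletons) are taken from the text.

```python
from collections import Counter


def bin_abundance_summary(bin_ids):
    """Summarise how specimens are distributed over BINs.

    bin_ids holds one BIN label per barcode compliant specimen.
    """
    per_bin = Counter(bin_ids)  # number of specimens in each BIN
    n_bins = len(per_bin)
    return {
        "bins": n_bins,
        "singletons": sum(1 for n in per_bin.values() if n == 1),
        "doubletons": sum(1 for n in per_bin.values() if n == 2),
        "specimens_per_bin": len(bin_ids) / n_bins,
    }


# Toy example with hypothetical BIN labels.
toy = bin_abundance_summary(["A", "A", "B", "C", "C", "C", "D"])
print(toy)

# The same ratios applied to the totals reported in the text.
specimens_per_bin = round(2380 / 1197, 2)
singleton_share = round(100 * 804 / 1197, 1)
print(specimens_per_bin, singleton_share)
```

With the reported totals, this gives about 1.99 specimens per BIN overall, in line with the statement of two or fewer individuals per BIN on average, and a singleton share of about 67.2% of all BINs.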
Discussion
Given the discrepancy in the sampling effort, it was not possible to compare taxonomic disparities amongst the four sampling areas. The sampling was focused on Cikaniki due to the better conservation of the forest in this area and the presence of the research station that provided better infrastructure to the scientific staff.
Even though collecting took place at only four locations within a single nature reserve, the IndoBioSys Malaise trap project alone has added 1,149 new BINs to BOLD. This shows how fast molecular pipelines contribute substantially to objectively inventorying the fauna of megadiverse areas. It also allows us to estimate the enormous diversity of tropical areas like the Halimun-Salak National Park. The astonishing heterogeneity of BINs (see Fig. 5 and Table 2), with as few as 1.1 successfully processed specimens per exclusive BIN in Diptera, shows the magnitude of the diversity that is waiting to be discovered in the tropics. Only 15% of the specimens that produced DNA barcode compliant records belong to putative species with more than five specimens processed, and 81.7% of all BINs are represented by singletons or doubletons. This makes the cost/benefit ratio of discovering new species in those areas very low, even with the low success rates of molecular processing that this project has faced. Such high failure rates have not been encountered in similar projects of the ZSM and we suspect that the poor quality of the ethanol used for the collecting bottles might have been the crucial issue.
The supraspecific taxonomic diversity was relatively high considering the number of specimens analysed. As a comparison, Hendrich et al. (2014), in their release of a comprehensive DNA barcode database for Central European beetles, sequenced 15,948 specimens representing 97 families, meaning that, on average, each family in the database is represented by 164.4 processed specimens. In the present paper, we recorded 39 families of Coleoptera after processing only 788 specimens, corresponding to one family per 20.7 specimens on average. Therefore, even considering that this discovery process is not linear, it is quite clear that we are still far from the plateau of the accumulation curve for families and that there are many more taxa to be discovered at Halimun-Salak National Park, especially at the species level.
The diversity of Chalcidoidea, a superfamily of Hymenoptera, gives us a clear picture of the diversity uncovered at Halimun-Salak National Park. The Universal Chalcidoidea Database (Noyes 2018) returned records for 17 genera and 302 species from Java. Here, we detected 11 genera and 155 species for this superfamily. For four families (Aphelinidae, Eulophidae, Mymaridae and Torymidae), the diversity detected was higher than the diversity described (Fig. 5), suggesting that those samples contain many species new to science.
Acknowledgements
We thank the Ministry of Research and Higher Education of the Republic of Indonesia for providing a foreign research permit to BCA, SS and OS (number 2B/TKPIPA/E5/Dit.KI/II/2016). The IndoBioSys project is funded by the Bundesministerium für Bildung und Forschung within the bilateral "Biodiversity and Health" funding programme (Project numbers: 16GW0111K, 16GW0112). The Indonesian counterpart institutions were funded by DIPA PUSLIT Biologi LIPI 2015-2016.
References
Abood et al. (2014) Relative contributions of the logging, fiber, oil palm, and mining industries to forest loss in Indonesia. 8(1): 58-67. https://doi.org/10.1111/conl.12103
Barlow and Woiwod (1990) Seasonality and diversity of Macrolepidoptera in two lowland sites in Dumoga-Bone National Park, Sulawesi Utara. Royal Entomological Society, London, 343 pp.
Brooks et al. (2002) Habitat Loss and Extinction in the Hotspots of Biodiversity. 16(4): 909-923. https://doi.org/10.1046/j.1523-1739.2002.00530.x
Cancian de Araujo et al. (2017) IndoBioSys – DNA barcoding as a tool for the rapid assessment of hyperdiverse insect taxa in Indonesia: a status report. 44: 67-76. https://doi.org/10.14203/treubia.v44i0.3381
Cancian de Araujo et al. (2018) From field courses to DNA barcoding data release for West Papua - making specimens and identifications from university courses more sustainable. 6: e25237. https://doi.org/10.3897/bdj.6.e25237
Curran et al. (2004) Lowland forest loss in protected areas of Indonesian Borneo. 303(5660): 1000-1003. https://doi.org/10.1126/science.1091714
Dahruddin et al. (2016) Revisiting the ichthyodiversity of Java and Bali through DNA barcodes: taxonomic coverage, identification accuracy, cryptic diversity and identification of exotic species. 17(2): 288-299. https://doi.org/10.1111/1755-0998.12528
Gaveau et al. (2013) Reconciling forest conservation and logging in Indonesian Borneo. 8(8): e69887. https://doi.org/10.1371/journal.pone.0069887
Hebert et al. (2003) Biological identifications through DNA barcodes. 270(1512): 313-321. https://doi.org/10.1098/rspb.2002.2218
Hendrich et al. (2014) A comprehensive DNA barcode database for Central European beetles with a focus on Germany: adding more than 3500 identified species to BOLD. 15: 795-818. https://doi.org/10.1111/1755-0998.12354
Hooper et al. (2005) Effects of biodiversity on ecosystem functioning: a consensus of current knowledge. 75(1): 3-35. https://doi.org/10.1890/04-0922
Hubert et al. (2015) DNA Barcoding Indonesian freshwater fishes: challenges and prospects. 3(1): 144-169. https://doi.org/10.1515/dna-2015-0018
Ivanova et al. (2006) An inexpensive, automation-friendly protocol for recovering high-quality DNA. 6: 998-1002. https://doi.org/10.1111/j.1471-8286.2006.01428.x
Lohman et al. (2011) Biogeography of the Indo-Australian Archipelago. 42(1): 205-226. https://doi.org/10.1146/annurev-ecolsys-102710-145001
Loreau (2009) Linking biodiversity and ecosystems: towards a unifying ecological theory. 365(1537): 49-60. https://doi.org/10.1098/rstb.2009.0155
Malaise (1937) A new insect trap. 58: 148-160.
Margono et al. (2014) Primary forest cover loss in Indonesia over 2000–2012. 4(8): 730-735. https://doi.org/10.1038/nclimate2277
Norris (2011) Biodiversity in the context of ecosystem services: the applied need for systems approaches. 367(1586): 191-199. https://doi.org/10.1098/rstb.2011.0176
Noyes (2018) Universal Chalcidoidea Database. World Wide Web electronic publication. http://www.nhm.ac.uk/chalcidoids. Accessed on: 2018-07-13.
Ratnasingham and Hebert (2007) BOLD: The Barcode of Life Data System. 7(3): 355-364. https://doi.org/10.1111/j.1471-8286.2007.01678.x
Ratnasingham and Hebert (2013) A DNA-based registry for all animal species: The Barcode Index Number (BIN) system. 8(7): e66213. https://doi.org/10.1371/journal.pone.0066213
Riedel et al. (2013) Integrative taxonomy on the fast track - towards more sustainability in biodiversity research. 10(1): 15. https://doi.org/10.1186/1742-9994-10-15
Riedel et al. (2014) Ninety-eight new species of Trigonopterus weevils from Sundaland and the Lesser Sunda Islands. 467: 1-162. https://doi.org/10.3897/zookeys.467.8206
Schmidt (2015) List of primary types of the larentiine moth species (Lepidoptera: Geometridae) described from Indonesia - a starting point for biodiversity assessment of the subfamily in the region. 3: e5447. https://doi.org/10.3897/bdj.3.e5447
Schmidt et al. (2017) A streamlined collecting and preparation protocol for DNA barcoding of Lepidoptera as part of large-scale rapid biodiversity assessment projects, exemplified by the Indonesian Biodiversity Discovery and Information System (IndoBioSys). 5: e20006. https://doi.org/10.3897/bdj.5.e20006
Townes (1972) A light-weight Malaise trap. 83(9): 239-247.
Wallace (1860) On the zoological geography of the Malay Archipelago. 4(16): 172-184. https://doi.org/10.1111/j.1096-3642.1860.tb00090.x
Wibowo et al. (2017) DNA barcoding of fish larvae reveals uncharacterised biodiversity in tropical peat swamps of New Guinea, Indonesia. 68(6): 1079-1087. https://doi.org/10.1071/mf16078
Wilcove et al. (2013) Navjot's nightmare revisited: logging, agriculture, and biodiversity in Southeast Asia. 28(9): 531-540. https://doi.org/10.1016/j.tree.2013.04.005
Figure 1. Spatial and temporal distribution of specimens and of molecular processing success in Mt Halimun-Salak National Park.
Dr Michael Balke (ZSM, Munich) and Dr Rosichon Ubaidillah (MZB) at the Museum Zoologicum Bogoriense in Cibinong, West Java, during the repatriation of over 2,000 voucher specimens and over 20 ethanol samples.
Figure 5. IndoBioSys Chalcidoidea species diversity per family (red line) compared to the Universal Chalcidoidea Database (UCDB) species diversity (blue line). The number of species is given in parentheses next to the family name (IndoBioSys / UCDB).