Biodiversity Data Journal :
Research Article
Corresponding author: Moritz Fahldieck (m.fahldieck@leibniz-lib.de)
Academic editor: Pavel Starkevic
Received: 09 May 2024 | Accepted: 17 Sep 2024 | Published: 24 Sep 2024
© 2024 Moritz Fahldieck, Björn Rulik, Jana Thormann, Ximo Mengual
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Fahldieck M, Rulik B, Thormann J, Mengual X (2024) A DNA barcode reference library for the Tipulidae (Insecta, Diptera) of Germany. Biodiversity Data Journal 12: e127190. https://doi.org/10.3897/BDJ.12.e127190
Tipulidae, commonly known as true crane flies, represent one of the most species-rich dipteran families, boasting approximately 4,500 known species globally. Their larvae serve as vital decomposers across diverse ecosystems, prompting their frequent and close observation in biomonitoring programs. However, traditional morphological identification methods are laborious and time-consuming, underscoring the need for a comprehensive DNA barcode reference library to speed up species determination.
In this study, we present the outcomes of the German Barcode of Life initiative focused on Tipulidae. Our DNA barcode library comprises 824 high-quality cytochrome c oxidase I (COI) barcodes encompassing 76 crane fly species, accounting for ca. 54% of the German tipulid fauna. Our results increased the number of European tipulid species available in the Barcode of Life Data System (BOLD) by 10%. Additionally, the number of barcodes from European tipulid specimens more than doubled, with an increase of 118%, bolstering the DNA resource for future identification inquiries.
Employing diverse species delimitation algorithms, namely the multi-rate Poisson tree processes model (mPTP), Barcode Index Number assignments (BIN), Assemble Species by Automatic Partitioning (ASAP), and the TaxCI R-script, we successfully match 76-86% of the morphologically identified species. Further validation through neighbor-joining tree topology analysis and comparison with 696 additional European tipulid barcodes yields an 89% success rate for the species identification of German tipulids based on COI barcodes.
This comprehensive DNA barcode dataset not only enhances species identification accuracy but also serves as a pivotal resource for ecological and biomonitoring studies, fostering a deeper understanding of crane fly diversity and distribution across terrestrial landscapes.
Tipulidae, DNA barcoding, crane flies, species identification, biodiversity, biomonitoring, taxonomy, COI barcodes, German Barcode of Life initiative (GBOL), species delimitation algorithm
The superfamily Tipuloidea (Insecta, Diptera), commonly known as crane flies, comprises four families, namely Tipulidae, Pediciidae, Cylindrotomidae, and Limoniidae (
The best-known crane flies are the tipulids, which constitute a very large family comprising approximately 4,500 known species worldwide (
Tipulidae are nematoceran flies with elongated bodies, long wings, thread-like antennae, and very long legs. Like most nematoceran flies, one of the most reliable characteristics for identifying them is wing venation, which is quite uniform and therefore family-specific for the crane flies. Tipulidae species are often the largest among crane flies, making them well-known even to non-entomologists. Species of the genus Tipula are typically large and exhibit a dull and relatively uniform coloration, ranging from brown to grey or yellowish (Fig.
Examples for the three main morphotypes of German Tipulidae. (a) Tipula (Tipula) paludosa Meigen, 1830, ♂, © Chris walker (CC BY 4.0); (b) Ctenophora (Cnemoncosis) festiva Meigen, 1804, ♂, © Bartholomeus van der Geer (CC BY 4.0); and (c) Nephrotoma flavescens (Linnaeus, 1758), ♂, © jerry2018 (CC BY 4.0); all photos taken from iNaturalist, https://www.inaturalist.org, accessed 08.05.2024.
The larvae of Tipulidae develop in soil that typically contains at least some degree of moisture or may even be fully water-saturated, such as along the shores of lakes or riverbanks. In Tipulidae, the vast majority of larvae feed on decaying plant material, including leaf litter, humus, or decaying wood (as seen for Ctenophora). Some larvae can also feed on higher plants, like Tipula (Tipula) paludosa Meigen, 1830, which feeds on the roots and leaves of grass and can, under certain circumstances, be considered a pest (
Due to their high species and specimen numbers across a wide range of ecosystems, crane flies are frequently targeted in biomonitoring programs, either intentionally or incidentally as bycatch (e.g.
Therefore, the International Barcode of Life initiative (iBOL) was formed in the first decade of the 21st century with the aim of addressing the gap between the enormous quantities of material and data that need to be analyzed and the lack of expertise. The initiative seeks to establish DNA barcoding as a standard, cost- and time-efficient approach for the analysis of biomonitoring study results (
Since the start of the GBOL project, several important and valuable studies have been published, presenting and testing built-up DNA barcode databases for other arthropod groups in Germany, such as Heteroptera (
Beyond GBOL, numerous other initiatives and approaches focusing on Diptera or crane flies have also emerged globally. For instance,
Our aim with this study is to present the most comprehensive library of high-quality DNA barcodes for German Tipulidae, along with detailed quality checks for each barcode and species hypothesis based on morphological species identification of all specimens, neighbor-joining tree topology analysis, four different species delimitation algorithms, and a comparison with other already published Tipulidae DNA barcodes on BOLD.
This study includes all the COI barcode sequences of Tipulidae specimens that have been sequenced as part of the German Barcode of Life (GBOL) project and are currently housed at the Leibniz Institute for the Analysis of Biodiversity Change/Museum Koenig (LIB/ZFMK) in Bonn, Germany.
Prior to sequencing, specimens were morphologically identified either by associated taxon experts or by entomologists working in the project. The majority of the included specimens were initially identified or re-identified by the first author using the identification keys from
Wet lab work was primarily conducted in the laboratory of the LIB/ZFMK. A small portion of the samples was part of the Global Malaise Trap program and processed at the Canadian Centre for DNA Barcoding (CCDB) in Guelph, Canada. At the LIB/ZFMK, DNA was typically isolated from one leg and muscle tissue from the coxa using a Qiagen (Hilden, Germany) BioSprint96 magnetic bead extractor and corresponding kits. Polymerase chain reactions (PCR) for the 5’ segment of the mitochondrial cytochrome c oxidase subunit 1 (COI) gene were carried out in total reaction volumes of 20 μl. These included 2 μl of undiluted DNA template, 0.8 μl of each primer (10 pmol/μl), and standard amounts of the reagents provided with the Multiplex PCR kit from Qiagen. LCO1490-JJ [5’-CHACWAAYCATAAAGATATYGG-3’] and HCO2198-JJ [5’-AWACTTCVGGRTGVCCAAARAATCA-3’] designed by
The obtained sequences were checked for the occurrence of stop-codons or hints of nuclear mitochondrial DNA segments (NUMTs) and validated using Geneious R7 and Geneious Prime (Biomatters Ltd.) before being linked to their respective entries in the GBOL database through the Diversity Collection module in Diversity Workbench. The complete dataset, including all COI barcode sequences, identifications, and metadata, was uploaded to BOLD (Dataset: DS-GDIPTIP; https://doi.org/10.5883/DS-GDIPTIP) and subsequently submitted to GenBank (ON843162-ON842360, OR178275-OR178255).
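The stop-codon check mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure (they used Geneious); it only shows the idea, assuming an ungapped nucleotide string and the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are the only stop codons. The function names are hypothetical.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI translation table 5).
# In this table TGA codes for tryptophan and AGA/AGG for serine, so only
# TAA and TAG terminate translation.
STOP_CODONS = {"TAA", "TAG"}


def stop_codon_positions(seq, frame=0):
    """Return 0-based nucleotide positions of in-frame stop codons.

    Assumes `seq` is an ungapped nucleotide string (illustrative assumption).
    """
    seq = seq.upper()
    return [
        i
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


def best_frame(seq):
    """Pick the reading frame (0, 1 or 2) with the fewest stop codons."""
    return min(range(3), key=lambda f: len(stop_codon_positions(seq, f)))


if __name__ == "__main__":
    toy = "ATGTGAGGATAAGGA"  # toy sequence, not real data
    for frame in range(3):
        print(frame, stop_codon_positions(toy, frame))
    print("frame with fewest stops:", best_frame(toy))
```

A barcode that still shows stop codons in its best reading frame would be a candidate NUMT or sequencing error and would warrant manual inspection.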
Intra- and interspecific uncorrected distances (p-distances) for the dataset as well as pairwise p-distances were calculated using the DiStats Perl-script (
To better understand the intra- and interspecific p-distances in our dataset, we calculated the mean and mode of these two values. The mean was directly calculated using the DiStats Perl-script (
DNA-based species delimitation was applied to the sequences in the dataset using multiple methods, including the multi-rate Poisson tree processes (mPTP) model (
The mPTP model (
The ASAP method (
The BIN Discordance tool in BOLD uses the methodology developed by
Finally, the TaxCI R-script (
For the sequences of the dataset, the accuracy of the species delimitation methods with prior morphological species identifications was assessed by the match ratio (
As an additional attempt to further scrutinize the dataset, BOLD (https://www.boldsystems.org, accessed on 28.09.2023) was searched for all European tipulid COI barcodes with a length of at least 550 bp and with a species identification of the vouchered specimen. Sequences flagged by BOLD or containing stop codons were excluded. These already published barcodes are used to test the species hypotheses of the specimens in this study. Therefore, a second graphical tree including the GBOL Tipulidae sequences together with the additional European Tipulidae barcodes was created with the TaxCI script. The BOLD dataset DS-EUDIPTIP (https://doi.org/10.5883/DS-EUDIPTIP) comprises all utilized sequences, including flagged ones and those containing stop codons.
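The inclusion criteria described above (minimum length, species-level identification, no database flag, no stop codon) can be expressed as a simple filter. The sketch below is only an illustration: the record structure and field names are hypothetical and do not correspond to the column names of a real BOLD export, and counting only unambiguous A/C/G/T bases towards the length is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BarcodeRecord:
    # Hypothetical fields for illustration; a real BOLD export differs.
    process_id: str
    species: str          # empty string if not identified to species level
    sequence: str
    flagged: bool         # record flagged by the database
    has_stop_codon: bool  # result of a separate stop-codon check


def passes_filters(rec, min_len=550):
    """Apply the four inclusion criteria to one record."""
    # Assumption: only unambiguous bases count towards the barcode length.
    n_bases = sum(rec.sequence.upper().count(base) for base in "ACGT")
    return (
        n_bases >= min_len
        and bool(rec.species.strip())
        and not rec.flagged
        and not rec.has_stop_codon
    )


if __name__ == "__main__":
    records = [
        BarcodeRecord("EX-001", "Tipula paludosa", "ACGT" * 150, False, False),
        BarcodeRecord("EX-002", "", "ACGT" * 150, False, False),
        BarcodeRecord("EX-003", "Tipula paludosa", "ACGT" * 100, False, False),
        BarcodeRecord("EX-004", "Tipula paludosa", "ACGT" * 150, True, False),
    ]
    kept = [r.process_id for r in records if passes_filters(r)]
    print(kept)
```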
For both datasets, one containing only the GBOL Tipulidae sequences and the other incorporating both the GBOL Tipulidae sequences and additional European Tipulidae barcodes, we generated new graphical trees using the TaxCI script. These trees had the same topologies but different tree tip categories for analysis. Instead of analyzing the species names assigned to the sequences as suggested by
Subsequently, a final tree was created using the TaxCI script. It analyzes the species names and incorporates the GBOL Tipulidae and the additional European Tipulidae specimens, while excluding clearly misidentified samples.
We successfully sequenced the DNA barcode region of 824 tipulid specimens, representing 76 out of the 142 recorded species from Germany (54%). The fragment lengths of the analyzed DNA barcodes ranged from 555 to 658 bp, with only 34 barcodes (4%) being shorter than 658 bp. Additionally, the average base composition of the sequences was: A = 28.6%, C = 16.8%, G = 16.8%, and T = 37.8%. Tables S1.1 to S1.6 (Suppl. material
The inclusion of GBOL Tipulidae specimen barcodes expands BOLD by 813 specimens and 53 species for Germany, and by 824 specimens and 14 species for Europe. In terms of specimen numbers for Germany, there is a more than sixfold increase (from 122 specimens before, an increase of 666%), while in terms of species for Germany, we now almost triple the previous number (from 30 species before, an increase of 177%). Concerning specimen numbers for Europe, we also more than double the previous number (from 698 specimens before, an increase of 118%), and in terms of species for Europe, we increase the previous number by 10% (140 species before). All numbers and percentages are based on barcodes with a minimum length of 550 bp. Notably, the 14 species new to BOLD for Europe represent first-time records, not just at a European level, but also globally. These species are Ctenophora (Cnemoncosis) festiva Meigen, 1804, Ctenophora (Ctenophora) flaveolata (Fabricius, 1794), Nephrotoma lamellata (Riedel, 1910), Tipula (Lunatipula) alpina Loew, 1873, Tipula (Lunatipula) laetabilis Zetterstedt, 1838, Tipula (Lunatipula) livida van der Wulp, 1859, Tipula (Lunatipula) mellea Schummel, 1833, Tipula (Lunatipula) selene Meigen, 1830, Tipula (Lunatipula) verrucosa Pierre, 1919, Tipula (Mediotipula) stigmatella Schummel, 1833, Tipula (Pterelachisus) pabulina Meigen, 1818, Tipula (Pterelachisus) sp., Tipula (Pterelachisus) trifascingulata Theowald, 1980, and Tipula (Savtshenkia) subvafra Lackschewitz, 1936.
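The percentages in this paragraph follow from dividing the newly added count by the count previously available on BOLD. The short sketch below reproduces that arithmetic from the figures given in the text; the function name is illustrative only.

```python
def percent_increase(added, before):
    """Increase relative to the previous count, in percent."""
    return 100.0 * added / before


# (added, previously on BOLD) pairs as stated in the text
counts = {
    "Germany, specimens": (813, 122),
    "Germany, species": (53, 30),
    "Europe, specimens": (824, 698),
    "Europe, species": (14, 140),
}

if __name__ == "__main__":
    for label, (added, before) in counts.items():
        print(f"{label}: +{percent_increase(added, before):.1f}%")
```

To whole percentages, these values correspond to the 666%, 177%, 118% and 10% reported above.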
Exploring the Barcode of Life Data System (BOLD, https://www.boldsystems.org, accessed on 01.08.2022) for previously-published COI barcode sequences of identified Tipulidae from Europe, with a minimum length of 550 bp, resulted in 700 specimen barcodes. However, four specimens were excluded from further analyses: one (PII-20110076) contained a stop codon, while three others were flagged by BOLD. The flagged specimens ABOL-BioBlitz 2019 Szucsich-0234 and bf-dip-00503 were suspected of possible contamination or misidentification, and PII-20110081 was marked due to a putative reading frame shift issue. Consequently, the dataset from BOLD comprised 696 specimen barcodes from 141 species.
Tables S2.1 to S2.3 (Suppl. material
Two TaxCI trees scrutinizing the BINs of the barcodes are presented as Figure S3 (Suppl. material
Tables S3 and S4 (Suppl. materials
Although a barcoding gap analysis was conducted, no distinct, generalized barcoding gap was identified. Nevertheless, most species exhibit a barcoding gap, shown by significantly lower intraspecific genetic distances compared to interspecific distances (see Fig.
Regarding distances to the nearest neighboring species, the minimum interspecific p-distance recorded was 0.15% (between T. (V.) hortorum Linnaeus, 1758 and T. (V.) nubeculosa Meigen, 1804), while the maximum was 12.92% (between T. (Odonatisca) nodicornis Meigen, 1818 and T. (P.) trifascingulata). The mean p-distance to the nearest neighboring species stood at 5.94%. The mode of the p-distances to the nearest neighboring species falls within the range of 5.00% to 5.99%, with the most frequent value occurring in 14 out of 76 cases.
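For readers unfamiliar with the metric, the uncorrected p-distance is the proportion of aligned sites at which two sequences differ, and the distance to the nearest neighboring species is the smallest such value between any sequence of one species and any sequence of another. The following is a minimal sketch of both ideas, not the DiStats implementation used in this study; how gaps and ambiguous bases are treated here (skipped) is an assumption, and the toy sequences are invented.

```python
def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (an assumption made for this illustration).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def nearest_neighbor_distance(species, seqs_by_species):
    """Smallest p-distance from `species` to any sequence of another species."""
    return min(
        p_distance(s, t)
        for other, others in seqs_by_species.items()
        if other != species
        for s in seqs_by_species[species]
        for t in others
    )


if __name__ == "__main__":
    toy = {  # invented 10-bp sequences, for illustration only
        "sp_A": ["ACGTACGTAC", "ACGTACGTAT"],
        "sp_B": ["ACGTACGAAC"],
        "sp_C": ["TCGAACGAAC"],
    }
    print(f"{100 * nearest_neighbor_distance('sp_A', toy):.2f}%")
```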
In Table
Match ratios of the species delimitation methods mPTP, BIN, ASAP, and TaxCI applied to the sequences of the 824 specimens of the GBOL Tipulidae.
|             | mPTP | BIN | ASAP | TaxCI |
| Nmatch      | 58   | 65  | 61   | 63    |
| Nmol        | 77   | 76  | 71   | 74    |
| match ratio | 76%  | 86% | 83%  | 84%   |
Nmorph = 76
match ratio = 2 × Nmatch/(Nmol + Nmorph)
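The match ratios in the table can be recomputed directly from the formula in its footnote. The sketch below does so with the tabulated values; the function and variable names are illustrative.

```python
def match_ratio(n_match, n_mol, n_morph):
    """Match ratio = 2 * Nmatch / (Nmol + Nmorph)."""
    return 2 * n_match / (n_mol + n_morph)


N_MORPH = 76  # morphologically identified species

# method: (Nmatch, Nmol) as given in the table
results = {
    "mPTP": (58, 77),
    "BIN": (65, 76),
    "ASAP": (61, 71),
    "TaxCI": (63, 74),
}

if __name__ == "__main__":
    for method, (n_match, n_mol) in results.items():
        print(f"{method}: {match_ratio(n_match, n_mol, N_MORPH):.0%}")
```

Because the denominator sums the molecular and morphological species counts, both oversplitting (Nmol larger than Nmorph) and overlumping (Nmol smaller than Nmorph) lower the ratio whenever they reduce the number of matching clusters.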
Neighbor-joining tree of the 76 species of the GBOL Tipulidae based on morphological identification, contrasted with the outcomes from the species delimitation methods mPTP, BIN, ASAP, and TaxCI applied. Dashed underlined species names indicate more than one cluster of the same species. Blue boxes indicate agreement between molecular species delimitation method and morphological species identification while red boxes indicate disagreement.
Four species pairs are recognized as the same species by at least one of the four species delimitation algorithms, i.e., mPTP, BIN, ASAP, and TaxCI. These pairs also exhibit very low interspecific p-distances (0.15-2.43; see Table S3: Suppl. material
Two other species pairs, with high interspecific p-distances (1.02-3.19 and 6.23; see Table S3: Suppl. material
Three different species are resolved into two unique barcode clusters each: T. (Tipula) paludosa, T. (S.) pagana Meigen, 1818, and the already challenging to identify N. submaculosa. The two clusters of T. (T.) paludosa are not neighboring in the graphical tree, displaying a relatively high p-distance between them (minimal inter-cluster p-distance = 3.34%). For T. (S.) pagana, the two neighboring clusters exhibit an even higher p-distance (minimal inter-cluster p-distance = 5.47%). The two clusters of N. submaculosa are not adjacent in the graphical representation and show a low p-distance between them (minimal inter-cluster p-distance = 1.52%). We were unable to find diagnostic morphological characteristics to distinguish these clusters.
Moreover, we have two species clusters featuring single DNA barcodes with relatively high p-distances (1.67 and 1.37; see Table S4: Suppl. material
Additionally, one female specimen of T. (Pterelachisus) could not be identified to species level. We retained it in the dataset as its barcode did not cluster with any other barcode from the GBOL Tipulidae or with barcodes from other European tipulid specimens in BOLD. For more detailed comments on all species clusters, see Appendix 1 (Suppl. material
The combined cluster of N. submaculosa and N. flavescens presents a complex topology (see Figure S1: Suppl. material
Upon the inclusion of more European tipulid barcodes, the pattern changes (see Figure S2: Suppl. material
Diagnostic morphological characters of the two species are the black marking on the head, which is narrow with straight lateral margins in N. submaculosa and broad with rounded lateral margins in N. flavescens; the three black prescutal stripes, which are completely shiny in N. submaculosa and have a dull border for the medial stripe and dull anterior parts after the bend for the lateral stripes in N. flavescens; and the abdominal lateral black stripes, which are almost continuous in N. submaculosa and broken to single dots in N. flavescens (
Comments. Distinguishing between N. submaculosa and N. flavescens based on adult morphology poses a significant challenge, according to the first author's experience in identifying Tipulidae specimens. The diagnostic characters exhibit transitional variations without clear differentiation. Neither easily recognizable morphological features nor barcoding provides definitive separation between these species. Further molecular markers might be needed in combination with larval and adult morphology to diagnose these taxa.
The barcodes of specimens from N. crocata and N. scalaris form distinct clusters with a remarkably low minimal interspecific p-distance (0.46%; see Figure S1: Suppl. material
After incorporating additional European tipulid COI barcodes from BOLD into the tree, an additional sequence from a Dutch N. crocata specimen emerges, neighboring the primary combined cluster of N. crocata and N. scalaris (see Figure S2: Suppl. material
Morphological characteristics for distinguishing between the two species include: the head, mostly black with a yellow to orange spot in the center for N. crocata, while for N. scalaris, the head is mostly yellow with a small triangular spot; the curved part of the lateral stripe on the thorax back is matt in N. crocata, but glossy in N. scalaris; and the upper side of the abdomen features three to four almost straight transverse bands in N. crocata, compared to five to six more or less triangular bands in N. scalaris (
Comments. The exceptionally low interspecific p-distance suggests that relying solely on COI barcodes may not always definitively differentiate between N. crocata and N. scalaris. This hypothesis gains support from the outcome of all four species delimitation algorithms. Additionally, the inclusion of an additional Dutch N. crocata specimen from BOLD further indicates potential ambiguity in distinguishing these species based solely on their barcodes. Morphologically, however, the two species are easily discernible.
Within the dataset, there are two barcodes for each species, P. subserricornis and P. turcica. The barcodes of both species form distinct clusters, with very low intraspecific p-distances (0.30% in P. subserricornis and 0.46% in P. turcica; see Figure S1: Suppl. material
When additional COI barcode sequences from other European tipulids are added to the P. subserricornis/P. turcica cluster, it results in a combined cluster containing four species of Prionocera. In this expanded cluster, all species' barcodes still form separate and distinct clusters with relatively low interspecific distances to each other. However, one Icelandic specimen's barcode of P. turcica clusters with the barcodes of P. subserricornis (see Figure S2: Suppl. material
Both species are morphologically distinguishable primarily by the characters of the male genitalia. In P. turcica, tergite 9 exhibits four strongly pointed horizontal processes, with the inner pair positioned closely together and splayed. In contrast, in P. subserricornis, tergite 9 features only two pointed processes, and the splayed projections are much further apart (
Comments. Given the small sample size and the already low minimal interspecific p-distance, along with the species delimitation algorithms grouping both species as one, it is conceivable that distinguishing these two species solely based on their COI barcodes may be challenging. However, morphologically, these two species are easily identifiable and distinguishable by examining the shape of the male genitalia.
The barcodes of the specimens of T. (V.) hortorum and T. (V.) nubeculosa form one shared unique cluster without any recognizable separation pattern (see Figure S1: Suppl. material
The inclusion of sequences from four Bavarian specimens and three Finnish specimens of T. (V.) nubeculosa, as well as one Finnish specimen of T. (V.) hortorum from BOLD, does not alter the cluster's composition (see Figure S2: Suppl. material
Morphologically, both species are primarily distinguishable by the characteristics of the male genitalia. In T. (V.) hortorum, the last tergite features short lateral lobes, and the last sternite bears processes on the hind apical corners, which are obliquely truncate and almost twin-spined at the apex. In contrast, in T. (V.) nubeculosa, the last tergite exhibits long sinuous lateral lobes, and the last sternite shows short, simple processes on the hind apical corners (
Comments. T. (V.) nubeculosa and T. (V.) hortorum share the same haplotype in their COI barcode sequences, making them indistinguishable based on their barcodes. Morphologically, these two species are easily distinguishable and can be separated based on the shape of the male genitalia.
Despite a relatively low minimal interspecific p-distance (1.52%) between the barcodes of the specimens of T. (P.) varipennis and T. (P.) pseudovariipennis, the clusters for both species are resolved separately (see Figure S1: Suppl. material
Expanding the tree with additional barcodes of European tipulids (26 for T. (P.) varipennis, five for T. (P.) pseudovariipennis, and eight for T. (P.) submarmorata) maintains the same topology, despite TaxCI continuing to identify T. (P.) varipennis and T. (P.) pseudovariipennis as one species. However, the clusters for all three species remain consistent (see Figure S2: Suppl. material
Morphologically, distinguishing between the two species relies on a combination of several often subtle characteristics. For instance, in T. (P.) varipennis, the third antennal segment is black, the eyes are widely separated beneath, and the front femora are black in the apical third. Additionally, the male inner clasper bears only a tiny spine, and the outer clasper is not excessively slender. In contrast, in T. (P.) pseudovariipennis, the third antennal segment is yellow, the eyes are less separated beneath, and the front femora are black only at the apex. Moreover, the male inner clasper features a strong dorsal spine, and the outer clasper is very slender (
Comments. Differentiating between T. (P.) varipennis and T. (P.) pseudovariipennis through barcoding is consistently achievable. Additionally, these two species are morphologically distinguishable and can be separated based on a combination of characteristics.
One species delimitation algorithm, mPTP, groups the single barcode of T. (S.) staegeri with the single barcode of T. (S.) subvafra, despite their high interspecific p-distance of 6.23% (see Figs
When including all European tipulid barcodes from BOLD, the barcode of T. (S.) staegeri clusters alongside another T. (S.) staegeri specimen from the UK, forming a distinct cluster notably distant from its nearest neighbor. On the other hand, the barcode of T. (S.) subvafra is next to a cluster of three barcodes from Finnish specimens of T. (S.) limbata Zetterstedt, 1838, displaying only a very low interspecific distance (see Figure S2: Suppl. material
Morphologically, distinguishing males of T. (S.) staegeri and T. (S.) subvafra or T. (S.) limbata is straightforward. T. (S.) staegeri is easily identifiable as the only European species of T. (Savtshenkia) with very long paired median lobes on the last sternite (
Comments. Tipula (S.) staegeri and T. (S.) subvafra are easily distinguishable by their COI barcodes and morphological characteristics. However, it remains unclear whether T. (S.) subvafra and T. (S.) limbata genuinely have very similar COI barcodes or whether either the one specimen of T. (S.) subvafra or the three specimens of T. (S.) limbata are misidentified. Given the morphological similarity between both species, both possibilities are plausible.
The barcodes of the T. (T.) paludosa specimens form two distinct clusters, suggesting the presence of potentially two different species (see Figure S1: Suppl. material
Expanding the dataset to include additional European tipulid specimens brings about significant changes in the topology. The barcode of T. (T.) subcunctans now forms a cluster alongside two Finnish and one Norwegian specimen of the same species. This cluster is adjacent to all T. (T.) paludosa barcodes, with the primary cluster remaining unchanged. The previously distinct T. (T.) paludosa barcodes now unite, forming a cluster inclusive of a misidentified specimen of T. (Acutipula) luna Westhoff, 1879, from the United Kingdom, along with eight male T. (T.) paludosa specimens from Portugal (see Figure S2: Suppl. material
Morphologically, T. (T.) paludosa and T. (T.) subcunctans differ notably in antennal segment count, inter-eye distance, and subtle male genitalia characteristics (
Comments. Given the morphological differentiation of T. (T.) paludosa and T. (T.) subcunctans, ASAP shows an evident case of overlumping. For the two T. (T.) paludosa clusters, despite an extensive morphological examination by the first author, no discernible differences were identified between the specimens, all of which align with the species description of T. (T.) paludosa. Notably, within Germany, no other similar species have been identified thus far, although several morphologically similar species exist across Europe, such as T. (T.) italica Lackschewitz, 1930 (
The barcodes of the T. (S.) pagana specimens form two clusters, with a significantly high p-distance (5.47%) between them, each containing three barcodes (see Figure S1: Suppl. material
Additional T. (S.) pagana barcodes from Finland and Norway cluster with the group of three specimens where the brachypterous female lies (see Figure S2: Suppl. material
Comments. Intensive study by the first author failed to find morphological differences between the specimens from both clusters. The most morphologically similar species to T. (S.) pagana is T. (S.) holoptera Edwards, 1939. While well-documented in Great Britain and also known in Czechia, Slovakia, and Spain (
One specimen within the five-specimen cluster of N. guestfalica exhibits a p-distance of 1.67% to its nearest neighbor from the same cluster (see Figure S1: Suppl. material
When we included additional barcodes from three Portuguese specimens in the analysis and applied the TaxCI script, all N. guestfalica barcodes were consistently classified as belonging to a single species (see Figure S2: Suppl. material
Comments. After morphological re-examination of the questionable specimen by the first author, no differences from the species description of N. guestfalica were detected. It seems that mPTP and TaxCI might exhibit oversplitting, possibly due to the small sample size and variability in the COI barcode of the species.
One specimen within the cluster of 39 specimens of T. (L.) lunata exhibits a p-distance of 1.37% to its nearest neighbor from the same cluster (see Figure S1: Suppl. material
Comments. After morphologically re-examining the questionable specimen, the first author found no differences from the species description of T. (L.) lunata. Most probably, this is a case of oversplitting by mPTP and BIN, likely due to the variability in the COI barcode of the species.
The single barcoded female specimen of T. (Pterelachisus) sp. could not be identified to the species level. We decided to include it in the dataset because it belongs to the subgenus T. (Pterelachisus), but it does not correspond to any of the other species within this subgenus found in our dataset. These species include T. (P.) trifascingulata, T. (P.) irrorata Macquart, 1826, T. (P.) submarmorata, T. (P.) varipennis, and T. (P.) pseudovariipennis. The mPTP algorithm recognizes both Tipula sp. and T. trifascingulata as belonging to the same species, while BIN, ASAP and TaxCI recognize Tipula sp. as a separate species (see Figs
In the combined tree of the barcodes from the GBOL specimens and additional barcodes from European tipulids, the barcode of Tipula sp. remains isolated, neighboring a cluster of four barcodes from specimens of T. (P.) cinereocincta Lundstrom, 1907 from Finland (see Figure S2: Suppl. material
Comments. Morphologically, distinguishing between females of T. (Pterelachisus) can be challenging, as many lack diagnostic characters. Based solely on morphology, Tipula sp. could be identified as a female of T. (P.) trifascingulata. However, the high p-distance observed between the barcodes of Tipula sp. and T. (P.) trifascingulata (5.93%) suggests that Tipula sp. is more likely to belong to another European T. (Pterelachisus) species not yet represented in BOLD.
The BIN algorithm most closely mirrored our morphologically identified species for the GBOL Tipulidae dataset, with a match ratio of 86% for the 76 studied species. By analyzing the results of all four species delimitation algorithms (mPTP, BIN, ASAP, and TaxCI) together, and by cross-referencing these results with additional European tipulid barcodes from BOLD, we were able to unambiguously identify (in other words, assign COI barcodes to species identified morphologically) 68 out of 76 species using DNA barcodes. Among the 68 morphological species with an unequivocal molecular correlation, two species exhibited more than one DNA barcode cluster, warranting further investigation with a broader sampling, and two species clusters displayed outlier barcodes, each identified by two species delimitation algorithms as different species. This behavior for the outlier barcodes is likely attributable to oversplitting by the algorithms. Only four species pairs (eight morphological species in total) were not clearly separable by COI barcodes in our GBOL Tipulidae dataset. In addition, one species of these not clearly separable species pairs also exhibited multiple species clusters.
Our success rate of 89% (68 out of 76 species) aligns with similar arthropod barcoding studies within the German Barcode of Life project (GBOL), such as
Despite taking more than a decade to build, our database still lacks more than 45% of the recorded species for Germany. Nevertheless, it represents a significant improvement over previously-published data. Our 824 newly-sequenced high-quality COI barcodes, available on BOLD, complement the existing 696 published barcodes with a length of at least 550 bp for European tipulid specimens. This extends the database for high-quality barcodes of European tipulid specimens by 118% and contributes to enhancing future barcode identification requests for the database.
Additionally, by using the TaxCI script to analyze the BINs assigned to the sequences instead of the species names, we devised and tested a simple method to help identify potential misidentifications in large datasets.
We extend our gratitude to all the contributors and donors of material during all three phases of the German Barcode of Life project (GBOL). Special thanks go to Rainer Heiß, who assisted the first author in gaining expertise in the morphological identification of Tipulidae and who helped in the re-evaluation of some of the more challenging specimens.
The German Barcode of Life project (GBOL) is funded by grants from the German Federal Ministry of Education and Research (FKZ 01LI1101, 01LI1501 and 01LI1901A-E).
Sex distribution, life stage distribution, collection countries, collection years, collection methods, and storage methods for the 824 specimens of the GBOL Tipulidae.
TaxCI tree of the 824 specimens of the GBOL Tipulidae.
Sex distribution, life stage distribution and collection countries of the 696 specimens of the European Tipulidae.
Combined TaxCI tree of the 824 specimens of the GBOL Tipulidae and of the 696 specimens of the European Tipulidae.
TaxCI tree of the 824 specimens of the GBOL Tipulidae scrutinized by their Barcode Index Numbers (BINs).
Combined TaxCI tree of the 824 specimens of the GBOL Tipulidae and of the 696 specimens of the European Tipulidae scrutinized by their Barcode Index Numbers (BINs).
Combined TaxCI tree of the 824 specimens of the GBOL Tipulidae and of 688 specimens of the European Tipulidae, with eight clearly misidentified specimens removed.
Intra- and interspecific p-distances among the 76 species of the GBOL Tipulidae.
Pairwise p-distances among all 824 specimens of the GBOL Tipulidae.
Species-specific remarks for the GBOL Tipulidae.