Biodiversity Data Journal
Taxonomic paper
Academic editor: Daniel Whitmore
Received: 21 Jul 2015 | Accepted: 30 Sep 2015 | Published: 06 Oct 2015
© 2015 Torsten Dikow, Donat Agosti
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Dikow T, Agosti D (2015) Utilizing online resources for taxonomy: a cybercatalog of Afrotropical apiocerid flies (Insecta: Diptera: Apioceridae). Biodiversity Data Journal 3: e5707. https://doi.org/10.3897/BDJ.3.e5707
A cybercatalog to the Apioceridae (apiocerid flies) of the Afrotropical Region is provided. Each taxon entry includes links to open-access, online repositories such as ZooBank, BHL/BioStor/BLR, Plazi, GBIF, Morphbank, EoL, and a research web-site to access taxonomic information, digitized literature, morphological descriptions, specimen occurrence data, and images. Cybercatalogs such as the one presented here will need to become the future of taxonomic catalogs because they take advantage of the growing number of online repositories and linked data and are easily updatable. Comments on the deposition of the holotype of Apiocera braunsi Melander, 1907 are made.
cybertaxonomy, open-access, online repositories
Cybertaxonomic tools enable us to utilize web-based databases and data repositories to store and retrieve information on taxon names, publications, digitized literature, morphological descriptions, molecular sequences, occurrence data, or images. The availability of these kinds of data in an open-access, online framework allows scientists to test and support taxonomic and phylogenetic hypotheses readily as well as link data in support of biodiversity research across taxon boundaries. Furthermore, future research programs are enhanced by re-using and re-purposing available data in analyses and syntheses. The open-access movement encourages researchers to share primary data (
There is currently no central online gateway to deposit all of the above kinds of data. However, the Encyclopedia of Life (EoL) strives to present a species page for each known taxon summarizing diverse data from disparate online sources. As a data aggregator, it relies on information being either stored elsewhere and harvested regularly or entered directly into the EoL database.
The present cybercatalog attempts to summarize available information on Afrotropical Apioceridae and provide unique identifiers or URLs to access these data online. While the EoL species page will provide several of these kinds of data, it lacks others such as ZooBank unique identifiers. Furthermore, EoL can harvest and include data in the respective species page only once they have been deposited elsewhere, for example through the upload of images of Afrotropical Apiocera species to Morphbank or of taxonomic treatments to Plazi by the authors.
The hope is that this cybercatalog will encourage entomologists to utilize available cybertaxonomic tools in their research and publications and upload previously or newly published information to databases and data repositories (
The term "cybercatalog" is used here to denote a taxonomic catalog that refers the reader to taxon-specific, open-access information on the world wide web. The name should not be confused with web-sites that provide access to numerous mail-order catalogs, which a search of the world wide web for this term also reveals.
The forthcoming Manual of Afrotropical Diptera (MAD, web-site) edited by Ashley Kirk-Spriggs and Bradley Sinclair will be an outstanding resource on the status of research on Diptera occurring in the Afrotropical Region. The taxon-specific chapters will address each of the 109 families known to occur in this zoogeographic region and provide identification keys to the genera and a synopsis of each genus. The manual will cover the Afrotropical Region as proposed by
The last comprehensive catalog to the genera and species of Afrotropical flies was published 35 years ago by
While the species diversity of Apioceridae in the Afrotropical Region is small (currently only three species are known), this catalog should be seen as an example of how online, open-access resources can provide valuable taxon and specimen information. Furthermore, it highlights the need to organize research information and unique identifiers locally (see below) in order to export them for inclusion in data papers. The BDJ Checklist template, which has been employed here, allows for the easy import of a taxonomic catalog for publication as well as an update of a previously published catalog.
The present cybercatalog makes use of a novel feature of the Biodiversity Data Journal (BDJ) publishing platform: the catalog can be updated easily in the future and re-published under a new Digital Object Identifier (DOI) should a new species be described or other taxonomic changes be made. This feature will facilitate a continuous update of the taxonomic information, which is not possible through traditional publishing means in book or article form.
The LSIDs (Life Science Identifier), GUIDs (Globally Unique Identifier), UUIDs (Universally Unique Identifier), and URLs (Uniform Resource Locator) to access data in the various repositories listed below are locally stored in a custom FileMaker Pro database (Fig.
A custom layout (Fig.
Of the unique identifiers provided below, only the ZooBank and Plazi identifiers, each consisting of a UUID of 32 alphanumeric characters, are Globally Unique Identifiers (GUIDs), whereas all other identifiers are unique only within their respective data storage system. However, the latter should not be seen as less permanent because they provide a persistent way to link to the resource (as long as funding of the data storage system continues). GUIDs, Digital Object Identifiers (DOIs), or Archival Resource Keys (ARKs) would be preferable (
The data repositories included in this cybercatalog are listed below together with introductory information. While this sort of detail is obviously not needed in a taxonomic catalog, it is included here to make the reader familiar with different tools and in order to encourage the taxonomic community to utilize these repositories in research and publications.
ZooBank (
The present catalog includes the ZooBank GUID for every taxon; all taxa have been registered for the purpose of this catalog. For each nomenclatural act (e.g., a new taxon name), ZooBank issues a UUID as part of the LSID that is minted. ZooBank and Plazi (see below) share the UUID, which allows the related treatment (see below) to be retrieved by adding the respective treatment-specific prefix.
ZooBank will also be the de facto resource to resolve authors, including all of their publications containing new taxon descriptions, e.g., GUID for T. Dikow = F8869067-4618-4CCE-960C-E8A107F162FB. Through the ZooBank API, it can also provide summaries of nomenclatural acts, such as a count of the species newly described by an author.
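The relationship between a UUID and the LSID minted from it can be illustrated with a minimal Python sketch. It assumes the commonly seen ZooBank LSID pattern `urn:lsid:zoobank.org:<kind>:<UUID>` with the kinds `act`, `pub`, and `author`; the function name `zoobank_lsid` is hypothetical and not part of any ZooBank software, and the Plazi treatment prefix is deliberately not covered because it is not stated here.

```python
import uuid

# Assumption: ZooBank LSIDs follow urn:lsid:zoobank.org:<kind>:<UUID>.
ZOOBANK_KINDS = {"act", "pub", "author"}


def zoobank_lsid(kind: str, identifier: str) -> str:
    """Build a ZooBank-style LSID from a record kind and a UUID string."""
    if kind not in ZOOBANK_KINDS:
        raise ValueError(f"unknown ZooBank record kind: {kind!r}")
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    parsed = uuid.UUID(identifier)
    return f"urn:lsid:zoobank.org:{kind}:{str(parsed).upper()}"


# Author GUID for T. Dikow as quoted in the text.
author_guid = "F8869067-4618-4CCE-960C-E8A107F162FB"

print(zoobank_lsid("author", author_guid))
# The UUID carries 32 hexadecimal characters once the hyphens are removed.
print(len(uuid.UUID(author_guid).hex))
```

Validating the UUID before building the identifier catches transcription errors early, which matters when identifiers are copied by hand into a local database.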
The ZooBank records provide access to:
The Biodiversity Heritage Library (BHL) is a digital archive of natural history literature and works collaboratively to make biodiversity literature openly available as part of a global biodiversity community. It digitizes any natural history literature that is out of copyright and published prior to 1923. In-copyright books and journals can also be digitized by BHL after an agreement with the publisher has been signed. The community can propose titles to be digitized by BHL free of charge by entering information on the BHL scanning request form.
BioStor (
The Biodiversity Literature Repository (BLR) is part of CERN’s digital Zenodo archive for scientific data. The BLR is focused on biodiversity literature, specifically articles and illustrations. The items stored are at article- or subarticle-level (e.g., individual treatments) for which a DataCite Digital Object Identifier (DOI) is provided, which allows for a citation of the item in a standardized way. Furthermore, it has the potential to assign DOIs to all legacy literature and with that make these articles first-class citizens. BLR is currently administered by Plazi (see further information below) and Pensoft. Upload of articles is open and free to anybody. Generally, any article published before 2000 without a DOI can be made accessible. BLR is mainly focused on providing access to collections covering specific taxa, regions or subjects such as all ant, proctotrupine or drosophilid taxonomy.
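Because a DOI is cited in several forms (bare, with a `doi:` label, or as a full URL), it helps to normalize it to one resolvable link before storing it. The following Python sketch shows one way to do this, using the DOI of the present article as the example. The function name `doi_to_url` is hypothetical, and the regular expression is a common heuristic rather than the complete DOI syntax.

```python
import re
from urllib.parse import quote

# Heuristic only: a "10." prefix with a 4-9 digit registrant code and a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
KNOWN_PREFIXES = ("https://doi.org/", "http://dx.doi.org/", "doi:")


def doi_to_url(doi: str) -> str:
    """Normalize a DOI given in any common form to an https://doi.org/ URL."""
    doi = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("doi:10.3897/BDJ.3.e5707"))
```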
The DOIs/URLs provide access to, if applicable:
Plazi is an association supporting and promoting the development and service of persistent and openly accessible digital taxonomic literature and its contents. The main emphasis is to provide human- and machine-readable access to taxonomic treatments and the data therein as well as to make them easily citable and retrievable. A treatment is a part of an article that is explicitly provided by an author to define their understanding of a taxonomic name usage at the time of publication (
Through the use of the GoldenGATE editor, taxonomic articles can be marked up in XML, making the underlying information (such as taxonomic names, descriptions, diagnoses, etymology, material examined, type locality, and notes) accessible in machine-readable form for harvest by aggregators. Both previously published (retroactive) and newly published (proactive) articles/books can be marked up to extract the species descriptions (
While the text of the original species description might be available through the BHL, BioStor, or BLR portals, Plazi provides a machine-readable version in TaxonX mark-up language and Resource Description Framework (RDF) that focuses on easy and consistent retrieval of content so that comparison of multiple descriptions dealing with the same or different taxa can be achieved. Plazi treatments can also be used for quantitative comparative studies of published taxonomic research (
In collaboration with Pensoft and the U.S. National Library of Medicine, a taxonomy-specific extension of the Journal Article Tag Suite (JATS), called TaxPub, has been developed. It includes treatments and other elements specific to taxonomic publications and now underlies Pensoft journals such as Biodiversity Data Journal or ZooKeys (
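What machine-readable mark-up makes possible can be shown with a small Python sketch that extracts structured fields from a marked-up treatment. The XML below is a hypothetical, much simplified structure invented for illustration; its element names are not those of the TaxonX or TaxPub schemas. The taxon name, authority, and institutional acronyms are taken from the Apiocera braunsi entry of this catalog.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified mark-up; NOT the TaxonX or TaxPub schema.
SAMPLE_TREATMENT = """
<treatment>
  <taxonName rank="species">Apiocera braunsi</taxonName>
  <authority>Melander, 1907</authority>
  <repositories>
    <institution>AMGS</institution>
    <institution>BMNH</institution>
    <institution>NMSA</institution>
    <institution>SAMC</institution>
    <institution>SANC</institution>
    <institution>SMNS</institution>
    <institution>TMSA</institution>
    <institution>USNM</institution>
    <institution>ZSMC</institution>
  </repositories>
</treatment>
"""


def parse_treatment(xml_text: str) -> dict:
    """Extract name, rank, authority, and repositories from the sample mark-up."""
    root = ET.fromstring(xml_text.strip())
    name = root.find("taxonName")
    return {
        "name": name.text,
        "rank": name.get("rank"),
        "authority": root.findtext("authority"),
        "institutions": [el.text for el in root.iter("institution")],
    }


record = parse_treatment(SAMPLE_TREATMENT)
print(record["name"], record["authority"])
print(len(record["institutions"]), "institutions hold specimens")
```

Once treatments are available in such a form, the same few lines can be run over many descriptions, which is what makes comparison across taxa and publications practical.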
The Plazi persistent identifiers/queries provide access to:
The Global Biodiversity Information Facility (GBIF) is a data aggregator that gathers specimen occurrence data from numerous natural history collections and herbaria around the world. It provides free and open access to biodiversity data on all types of life on Earth.
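Open access to occurrence data also means that a query can be expressed as a plain URL. The Python sketch below only assembles such a URL and does not contact GBIF. It assumes the public GBIF occurrence search endpoint and the parameter names `scientificName`, `hasCoordinate`, and `limit`, which should be checked against the current GBIF API documentation; the function name `occurrence_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: endpoint of the public GBIF occurrence search (API version 1).
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(scientific_name: str,
                          georeferenced_only: bool = True,
                          limit: int = 20) -> str:
    """Assemble a search URL for occurrence records of one taxon."""
    params = {"scientificName": scientific_name, "limit": limit}
    if georeferenced_only:
        # Restrict results to records that carry coordinates.
        params["hasCoordinate"] = "true"
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


print(occurrence_search_url("Apiocera braunsi"))
```

Storing the query rather than its result in a catalog entry means that the link returns more records as additional collections are digitized.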
An increasing number of entomological collections have started to digitize their holdings by databasing the specimen occurrence data and submitting them to GBIF. Once the collecting localities have been geo-referenced, GBIF will include the specimens in global distribution maps. Data can also be directly submitted by a researcher or a natural history collection to GBIF through the use of an Integrated Publishing Toolkit (IPT
The GBIF URLs provide access to:
Morphbank :: Biological Imaging is an image repository for scientific images of organisms and parts thereof. It provides permanent, open access to the "published" images, and the user can access the original image files. This, for example, allows users to zoom in to see more detail than is available when an image is published in a traditional book or journal article. Furthermore, the user can access specimen-level information, e.g., collecting locality, unique specimen identifier, and institutional depository of the photographed specimen, as well as the imaging technique and specifics of the view presented. When submitting images to Morphbank, the user has the option to allow the images to be harvested by the EoL (see below) so that the same image is available on both the Morphbank platform and the EoL species page. However, a separate Morphbank entry is advantageous here because not every user provides access to their images on the EoL. Since the number of images of insect specimens from museum collections and publications of a particular taxon will only increase, Morphbank can be an ever-growing resource for digital photographs.
The Morphbank URLs provide access to:
The Encyclopedia of Life (EoL) is a data aggregator that harvests information such as scientific names, images, descriptions, digitized literature, and other data pertaining to species or higher taxa. It attempts to provide a summary page, the so-called species page, for every single species and higher taxon known to science and provides all information in an open-access framework.
Some of the data provided on the EoL species page is duplicated from other individual data sources employed here, such as GBIF and Morphbank. Similar to GBIF, the EoL can also directly receive data, such as images, from museum collection databases or treatments from Pensoft journals, e.g., ZooKeys, or Plazi. For example, images uploaded into the Smithsonian Institution's National Museum of Natural History (USNM) KE EMu specimen-level database will be shown on the EoL species page (compare images of the USNM record of the holotype of Apiocera braunsi to the EoL species page for that taxon).
The EoL URLs provide access to:
The research web-site of the senior author (asiloidflies.si.edu) provides access to various data on Apioceridae, Asilidae, and Mydidae flies using the Drupal content management system.
The URLs provided will take the user directly to either an interactive distribution map or a table with specimen occurrence data of the respective taxon. These specimen records were gathered and geo-referenced by the senior author from numerous natural history collections. The records will in part duplicate results shown in GBIF, but will also include records not available there because so far only a limited number of institutions provide geo-referenced specimen-level records to GBIF and the vast majority of entomological collections have not been digitized at the specimen level. Through the senior author's continued efforts in databasing specimens of Apioceridae and related taxa for research purposes, the distribution maps are enhanced regularly and will provide a more complete picture of the ranges of taxa based on explicit specimen data. For convenience, the same specimen records can also be accessed in table format, which makes the data easier to search.
The research web-site URLs provide access to:
To facilitate access to specimens for further research, the notes section includes a list of institutions that have specimens of the respective species in their holdings. It should be noted that additional museum collections might have specimens available as this list is based on visits by the senior author to numerous institutions including those holding the majority of Afrotropical Diptera. A link to the record in the Global Registry of Biodiversity Repositories (GRBio) of that institution is provided to disambiguate the institutional acronym.
Apiocera
Ripidosyrma
Apiocera, new junior synonymy by
Apiocera (Ripidosyrma), new subgeneric rank by
Asilus alastor
Apiocera (Ripidosyrma) alastor, new combination by
Apiocera (Ripidosyrma) africana Paramonov, 1950, new junior synonymy by
South Africa (Northern Cape, Western Cape)
Apiocera badipeniculata
South Africa (Northern Cape/Western Cape). The type locality of Tankwa Karoo, which can be interpreted to overlap with the Tankwa Karoo National Park, straddles the border of the Northern and Western Cape provinces.
Institutions with specimens: SAMC.
Apiocera braunsi
Institutions with specimens: AMGS, BMNH, NMSA, SAMC, SANC, SMNS, TMSA, USNM, ZSMC.
Two museum collections claim to house the holotype of A. braunsi.
The present cybercatalog summarizes taxonomic information on Afrotropical Apioceridae by utilizing freely available digital resources from numerous data repositories. While the initial work to locate information in data repositories or to upload images, treatments, and publications to them might seem daunting, the long-term benefits of using openly accessible digital content through persistent identifiers and linked data far outweigh the time spent. Open-access science, in which taxonomists share data freely and electronically, will on the one hand advance the discipline within the biological sciences and on the other hand make taxonomic hypotheses easily testable through the re-use and re-purposing of previously gathered data with the addition of new data. A taxonomist's dream of having access to all previously published descriptions, images, illustrations, and notes of a particular taxon can come true if the community supports sharing research results in machine-readable form and in structured data repositories as exemplified above.
At the same time, this approach opens up taxonomy to the widest possible community, and only this step allows taxonomy to be positioned as a central resource within the life sciences, applied fields of study, and beyond to the public and society at large. We taxonomists have always claimed to fulfill this role, but never understood why it did not happen.
We would like to thank Jeremy Miller and Guido Sautter for discussion. Furthermore, we thank the two peer reviewers and academic editor Daniel Whitmore for providing critical comments that improved the manuscript. Two synthesis meetings in 2011 ("From Taxonomic Literature to Cybertaxonomic Content" organized by J. Miller and T. Dikow) and 2012 ("Asiloidea Flies and Cybertaxonomic Tools" organized by T. Dikow), which took place at the Field Museum of Natural History's (Chicago, IL) Biodiversity Synthesis Center (BioSynC), helped form the background of this contribution. These meetings were funded by the Encyclopedia of Life / Biodiversity Synthesis Center and partial funding for the 2012 meeting was also provided by a U.S. National Science Foundation REVSYS Grant (DEB 0919333; PI T. Dikow, Co-PI David Yeates). Any opinions, findings, and conclusions or recommendations expressed in this manuscript are those of the authors and do not necessarily reflect the views of the National Science Foundation. For the realization of this project Plazi and Pensoft were partially supported by the EC-FP7 EU BON project (ENV 30845, Building the European Biodiversity Observation Network).