Corresponding author: Yde de Jong (
Academic editor: Ross Mounce
Reliable taxonomy underpins communication in all of biology, not least nature conservation and sustainable use of ecosystem resources. The flexibility of taxonomic interpretations, however, presents a serious challenge for end-users of taxonomic concepts. Users need standardised and continuously harmonised taxonomic reference systems, as well as high-quality and complete taxonomic data sets, but these are generally lacking for non-specialists. The solution is in dynamic, expertly curated web-based taxonomic tools.
The
This paper describes the results of PESI and its future prospects, including the involvement in major European biodiversity informatics initiatives and programs.
The
PESI merges data from multiple sources and publishes it online. This requires a mapping between the different schemas used by the different data sources and/or an implementation of standards within those data sources. A pragmatic approach was taken for the main databases. A bespoke procedure was developed at the
To be merged in this way, the data sets need to share common vocabularies for some fields, including taxon status, nomenclatural status and occurrence status. PESI provides species lists based on geographic regions and European legislation (e.g. the Habitats Directive, Birds Directive, CITES and IUCN (conservation status) directives). The resulting dataset includes a total of nearly 450,000 scientific names (which include 240,000 valid species and infraspecific names) and 190,800 vernacular names in 117 languages. In the
PESI builds upon previous European taxonomic projects (like
PESI was initiated during the EC-FP6
The strengthening and integration of European taxonomic communities has been progressing since the start of the taxonomic indexing EU framework programmes
PESI makes a significant addition to the species registers through the expansion of the network of expertise towards Eastern Europe. In addition to providing more comprehensive data for the pan-European species registers, this expansion has overcome the separation of knowledge and taxonomic practice over decades. Another achievement is the closer collaboration of the taxonomic societies, especially with respect to improving taxonomic coverage and addressing long-term maintenance and upgrading of the taxonomy.
In addition to creating a network of taxonomic experts, PESI has developed a network of regional (often national) focal points. These focal points (either individuals or organisations) complement the taxonomic network through: (1) liaising with national governmental bodies on the implementation of European standards relevant to, for instance, national and European regulations and environmental monitoring, (2) collecting and transferring local expertise and applied tools, (3) lobbying and public policy assistance at national and European level, and (4) supporting closer collaboration of scientific contributor and user communities across Europe. Focal points contribute country-specific information about species, relevant databases, local literature, experts, professional societies and major users such as government organisations.
In biology, taxon names provide anchors that allow information about organisms to be linked. A taxonomic name, typically a species name, is attached to every primary data object (field observation, specimen, genetic data, etc.). Therefore names, together with their organisation into taxonomic classifications, are understood as core (meta-)data for biological information systems. There are many challenges in integrating data sources that contain taxonomic names and classifications, particularly where the sources extend over different biological kingdoms or national boundaries. Names may be erroneously assigned or incomplete and so searches based on exact character matching against names in current use may fail. Names that are synonyms or old combinations no longer in current use may occur in museum and herbarium specimen catalogues or in legislative lists. Names with orthographical errors occur in legislative lists on national or international level and some of them are often in use by certain taxonomists. There may also be disagreement amongst experts on the identity of specimens and on the taxonomic constituents of genera and the arrangement of classifications. The partners involved in PESI have extensive experience with handling such problems and PESI has produced practical solutions for many of these issues. The availability of authoritative taxonomic metadata standards is of particular relevance where species are directly linked to societal issues such as conservation and environmental control. PESI promotes harmonisation and certification of taxonomic metadata standards of prioritised taxa that are listed in various EU regulations and legislative lists. To address these issues, PESI had the following objectives:
To prepare a roadmap (conceptual development and strategic plan) for the application of taxonomic standards within Europe, with the purpose of overcoming the instability and inconsistency of taxon names (and concepts) and attached data. This work addressed technical, linguistic, educational and legal barriers to progress in defining and implementing appropriate standards.
To promote co-operation between PESI and other networks and organisations. This optimised the cross-linking of European biodiversity resources using approved taxonomic data standards, and improved data quality and consistency. This facilitated discovery and exchange of biodiversity data, both within Europe and between Europe and the rest of the world.
To work closely with relevant standards organisations to identify appropriate authoritative standards and schemes and to ensure their adoption within the European biodiversity community. Work included development of a management classification scheme, utilisation of globally unique identifiers for names (GUIDs) and support for nomenclators (such as
PESI has technically integrated the pan-European species registers into an e-infrastructure by creating a joint access (middle) layer, the PESI data warehouse. This includes an index of species names associated with a number of attributes, such as synonyms, their place in the management classification and their geographical distribution. This data content results from the integration of the earlier pan-European checklists into a unified directory, following advanced routines on data verification and harmonisation. These routines are laid down in the PESI Common Data Model store (PESI CDM-store), an instance of the
PESI has built an interactive, multilingual web portal to disseminate the species names service and to support the use of the pan-European species data in the e-science domain. This includes relevant supplementary data, such as (region-based) occurrence details, literature and DNA sequence information, and applies dynamic links to other pertinent data services. Additionally, web services allow users to link PESI functions into desktop applications as well as service-oriented information infrastructures, thereby establishing enhanced access to species names, GUIDs, occurrence details and the hierarchical classification. The PESI web portal provides the interface to the European Taxonomic Backbone.
PESI resulted in an integrated overview regarding the taxonomy and occurrence of European species, including their current legislation status and other important metadata annotations (like vernacular or common names). The PESI Taxonomic Backbone serves as a taxonomic data standard resource, facilitating and optimising the integration and sharing of European biodiversity data, supporting a wide range of European services, major biodiversity programs and stakeholders on nature conservation and biodiversity management.
PESI has set out a working Intellectual Property Rights (IPR) model, based on the
To secure the continuity of electronic biodiversity data, PESI has reviewed gaps in taxonomic expertise, species registers and informatics resources throughout Europe and surveyed potential ways to fill these gaps (Suppl. material
The functional and successful operation of ETW involves a series of informatics resources, which will be addressed within the gap analysis, particularly in relation to successful developments in this area, like Scratchpads, the Platform for Cybertaxonomy and tools for e-Publishing.
Recently the hosting of the EDIT Expert Database (including the ETW) at the
The continuity of relevant electronic taxonomic biodiversity data resources and expertise networks was examined (Suppl. material
The terrestrial and freshwater Focal Point Networks in PESI originate from focal point partners within
For the botanical community, the PESI Focal Point network built on the existing infrastructure of a regional advisory network for
A selection of the
Over three years the entire process resulted in two major achievements: a) it created the largest network of taxonomists in Europe through the National Focal Points, and b) it produced a large body of information on existing nomenclators, museum catalogues, taxonomic expertise, societies and systematists' assemblages, taxonomic information publishing gateways, etc., which is now available through the single multilingual PESI portal and can, in turn, be used by any member of the National Focal Point network. The overall achievement, however, is that the fragmentation of the European taxonomic community was addressed and that the process paved the way for this community towards a cohesive and challenging framework in the future. The next steps are already visible in a number of projects and initiatives (e.g. LifeWatch, EU BON, and COST action applications).
At this moment the public
PESI has a significant outreach to eastern European countries and beyond, connecting local knowledge networks to pan-European expertise and supporting the implementation of relevant e-Infrastructures for information management. For example, the Ukraine PESI Focal Point was selected as
PESI is about integration and presentation of data. To do this effectively, a common framework is needed to support the establishment and promotion of controlled vocabularies and metadata standards that enable effective integration of data and best practices. PESI sets out to provide a taxonomic backbone for Europe, at the heart of which is an annotated synonymised checklist of species. The prime focus has been to identify appropriate standards to support this task. In addition to ensuring that data held within PESI are successfully integrated, it is important that our data are accessible and interoperable with other initiatives. We have therefore looked at protocols, data models, controlled vocabularies and ontologies that will facilitate this interoperability, working closely with external organisations involved with biodiversity informatics to assist in the process of developing standards and resources for data exchange and data validation.
More specifically PESI focused on:
Defining PESI as an annotated checklist, listing the range of different taxonomic products, differentiating between standard taxonomies and standards used to exchange taxonomic information and build consensus biological classifications (Suppl. material
Examining the rationale, logistics and challenges to coordinate a set of independent initiatives that collectively catalogue nomenclatural acts according to the different codes of nomenclature, allowing a differentiation between nomenclature and taxonomy and proposing a strategy for linking nomenclators to PESI (Suppl. material
Outlining the success in reaching agreements between key players in the global community and specifically detailing how standards would be used to exchange taxonomic data, in particular by using the
Identifying the opportunities for PESI to contribute to the setting up of a global system for managing scientific names of organisms and potential pitfalls (Suppl. material
PESI researched and proposed standards for working with data associated with biological organisms, including:
A management classification integrating the schemes used in the component datasets.
An informal classification using informal names for higher groupings familiar to the non-expert (terms such as: snails, butterflies, dragonflies).
A scheme for Globally Unique Identifiers (GUIDs) that can be applied to biological names (and other entities) to allow machine-level matching of equivalents.
A vocabulary of approved terms to cover occurrence status of organisms.
A controlled set of terms for geographical areas, including European marine regions.
Adoption of the Darwin Core Archive standard (Taxon & Occurrence terms) for data exchange.
The latter has been developed by the
Controlled vocabularies developed and recommended for occurrence status, taxon status, nomenclatural status and geographical regions in use in the PESI data warehouse and PESI Portal are listed as an appendix in the PESI Focal Point Handbook (Suppl. material
PESI partners organised sessions and presented papers at the annual conferences of the Biodiversity Standards Organisation (TDWG) and, most importantly, made use of these occasions to further the development of PESI objectives through networking with the leading participants in the development of biodiversity informatics standards. A significant outcome from these activities is the ‘Montpellier Declaration’. This is an agreement, proposed by PESI, between major biodiversity informatics projects to use a standard approach to sharing data (discussed in Suppl. material
International infrastructures for the production, maintenance, and publication of taxonomic checklists are highly heterogeneous with regard to scope, information models, workflows, and implementation. The PESI project recognised from the beginning that an effort for integrating European checklist information into a single unified system would need an “information broker” responsible for merging disparate and potentially conflicting information, performing data quality measures, and streamlining the process of data publication from the individual checklist to the common European information portal and service layer.
The EDIT Platform for Cybertaxonomy (
The model can be deployed using almost any Database Management System (DBMS). Application programmers can develop all kinds of systems (e.g. portals, editor software, import and export functions) using well defined APIs (Application Programming Interfaces) exposed as a Java Library (
For its deployment as the central merging facility in PESI, the BGBM-team extended the platform and its library (Suppl. material
importing data from the existing pan-European checklists,
merging them into a single taxonomy across organism groups,
quality control at different levels,
creation of detailed reports for feedback to the checklist managers,
mapping of the vocabularies used by the individual checklists to the agreed PESI vocabularies (e.g. status values, geographic regions),
export of the consolidated checklist into the PESI data warehouse.
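The vocabulary-mapping step in the list above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the shared status terms, the two source checklists and their local labels are hypothetical and are not the actual PESI vocabularies or the EDIT Platform API.

```python
# Hypothetical shared vocabulary; the agreed PESI term lists are not
# reproduced here.
SHARED_TAXON_STATUS = {"accepted", "synonym", "unresolved"}

# Hypothetical per-source mappings from local status labels to shared terms.
SOURCE_STATUS_MAPS = {
    "checklist_a": {"valid": "accepted", "junior synonym": "synonym"},
    "checklist_b": {"A": "accepted", "S": "synonym", "?": "unresolved"},
}


def map_status(source, local_term):
    """Translate a source-specific status label into the shared vocabulary."""
    try:
        shared = SOURCE_STATUS_MAPS[source][local_term.strip()]
    except KeyError:
        raise ValueError(f"{source}: no mapping for status {local_term!r}") from None
    if shared not in SHARED_TAXON_STATUS:
        raise ValueError(f"{source}: {shared!r} is not a shared status term")
    return shared


def map_checklist(source, records):
    """Map all records of one checklist and collect unmapped terms for feedback."""
    mapped, problems = [], []
    for rec in records:
        try:
            mapped.append({**rec, "status": map_status(source, rec["status"])})
        except ValueError as err:
            # Unmapped terms are reported instead of being silently dropped.
            problems.append(str(err))
    return mapped, problems
```

Collecting the unmapped terms in a separate list mirrors the idea of feeding reports back to the checklist managers rather than guessing a mapping.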
With these methods, a data publication cycle in PESI is performed in three basic phases (Fig.
Import. The source checklists are parsed and transformed into the internal CDM data structure using the import layer of the EDIT platform. This step also involves data quality control at three levels. Level 1 (syntax of terms) checks the syntactical correctness of individual terms. Level 2 (structural integrity) checks the completeness and appropriateness of data belonging to individual objects. Level 3 (referential integrity) checks the correctness of relations between objects. The different quality levels and rules have been jointly developed on a common
Merging. The individual taxonomic trees are merged into a single taxonomy, which is later used for data publication. This merging involves a final data quality assessment (level 4) for the detection of overlaps and conflicts between the different checklists. Again, a transcript is produced and fed back to the checklist managers responsible for resolving the conflicts or defining priority rules.
Export. All data are exported into a Data Warehouse Structure optimised for data publication purposes. From the FTP server hosted at the BGBM the data are harvested for publication at VLIZ.
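The three import-time quality levels can be sketched as three small checks over a set of records. This is only an illustration under stated assumptions: the record fields (`name`, `rank`, `status`, `parent_id`, `accepted_id`) and the individual rules are hypothetical and much simpler than the rules actually agreed within PESI.

```python
import re

# Simplified, illustrative rule for a species binomial (assumption).
BINOMIAL = re.compile(r"^[A-Z][a-z]+ [a-z-]+$")


def check_syntax(rec):
    """Level 1: syntactical correctness of individual terms."""
    issues = []
    if rec.get("rank") == "species" and not BINOMIAL.match(rec.get("name", "")):
        issues.append("L1: malformed species name")
    return issues


def check_structure(rec):
    """Level 2: completeness of the data belonging to one object."""
    issues = [f"L2: missing {f}" for f in ("name", "rank", "status") if not rec.get(f)]
    if rec.get("status") == "synonym" and rec.get("accepted_id") is None:
        issues.append("L2: synonym without accepted name")
    return issues


def check_references(rec, known_ids):
    """Level 3: relations must point to records that exist."""
    issues = []
    for field in ("parent_id", "accepted_id"):
        target = rec.get(field)
        if target is not None and target not in known_ids:
            issues.append(f"L3: {field} points to unknown record {target}")
    return issues


def quality_report(records):
    """Run all three levels and return {record id: [issues]} for feedback."""
    report = {}
    for rid, rec in records.items():
        issues = check_syntax(rec) + check_structure(rec) + check_references(rec, records.keys())
        if issues:
            report[rid] = issues
    return report
```

The level 4 check described under Merging (overlaps and conflicts between checklists) is deliberately left out, since it operates on the merged taxonomy rather than on a single source.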
The publication of taxonomic data in web portals (for human consumption) is an important aspect of the PESI infrastructure. However, providing information in machine-readable form is becoming increasingly important, because this makes the information reusable in a wide range of potential applications (e.g. as part of the taxonomic backbone in species information systems). PESI addresses this aspect by including a SOAP-compliant web-service interface in its portal implementation. In addition, RESTful services have been implemented, providing light-weight interfaces to the PESI backbone, like
The agreement on a common identifier system had to include clear rules as to when changes to an object imply issuing a new identifier. To ensure consistent application, these rules have to work at machine level: the operations that turn an existing taxonomic object into a new one had to be defined. The PESI approach has been recognised by the “
The EDIT Platform for Cybertaxonomy is continuously improved by an international team of developers, coordinated by the Biodiversity Informatics Research Group of the BGBM. It is the basis for an increasing number of checklists on different scales as well as geographic and taxonomic scopes. In addition, several new EU e-infrastructure projects have the EDIT-Platform integrated as an important technological component:
With new projects related to EDIT platform developments in progress (like
PESI supported the development of governance plans for EU-based Global or Regional Species Databases (GSDs/RSDs) in the following ways: (a) as a European taxonomic infrastructural component, (b) as a contribution to global efforts like the Catalogue of Life (Fig.
The
Although the registers still operate separately, their data are merged roughly once a year in the PESI Data Warehouse and are made available through this single portal. In addition to taxonomic information, PESI harvests information on species (images, literature, conservation status) and provides links to other portals (e.g. national checklists, red species lists and other bioinformatics databases such as the
The portal is an important instrument for standardisation of species names. The search interface is the main public access point to information on species living in Europe. However, the portal also provides services for those building their own species applications. The PESI-website can be consulted in 21 European languages.
The website attracts around 5,000 unique visitors per month (Fig.
The advanced search interface provides a number of fields to generate output based on selected parameters (Fig.
(part of) Scientific name, Common name, Authority
Equals/above/below taxon rank (e.g., species, family, class, …)
Belonging to a group (a higher rank e.g.,
Priority lists + Priority status (e.g. IUCN-endangered, EU Birds Directive, HYPPZ, ...). For a common agenda on prioritised species see: Suppl. material
Occurrence + Occurrence status (Absent, Present, Introduced, ...)
The occurrence status “present” includes all other statuses except for “absent”. The list of areas is linked to the
Besides creating lists of species, the user can search for a particular taxon by entering (part of) the scientific name, name authority or common name. However, if there is no exact match, the search tool performs a number of ‘intelligent’ consecutive queries until matches are found:
fuzzy match (Tony Rees’
checks if the name is present in the
checks if the name is present in the
checks if the name is present in the
checks other potential genus-species combinations:
FaEu model: it checks for reverse synonyms, e.g. when the species epithet occurs in a current combination and you enter a synonymous genus or species name or both in the search box. For example, if you enter
WoRMS model: checks if the species epithet occurs in other genera within the same
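The idea of consecutive queries that stop at the first step yielding matches can be sketched as follows. This is a simplified stand-in, not the portal's implementation: the fuzzy step uses Python's `difflib` rather than the fuzzy-matching algorithm named above, the name list is a tiny hypothetical sample, and the last step only looks for the same epithet under another genus.

```python
import difflib

# Tiny hypothetical sample of names in current use.
NAMES = ["Apis mellifera", "Bombus terrestris", "Parus major", "Cyanistes caeruleus"]


def search(query, names=NAMES):
    """Return (step, matches) for the first step of the cascade that finds something."""
    q = query.strip()

    # Step 1: exact match, ignoring case.
    exact = [n for n in names if n.lower() == q.lower()]
    if exact:
        return "exact", exact

    # Step 2: fuzzy match (difflib used here as a stand-in).
    close = difflib.get_close_matches(q, names, n=5, cutoff=0.85)
    if close:
        return "fuzzy", close

    # Step 3: the same species epithet combined with another genus.
    parts = q.split()
    if len(parts) == 2:
        epithet = parts[1].lower()
        other = [n for n in names if n.split()[-1].lower() == epithet]
        if other:
            return "epithet", other

    return "none", []
```

For example, a misspelling such as "Apis melifera" is caught by the fuzzy step, whereas the old combination "Parus caeruleus" falls through to the epithet step and resolves to "Cyanistes caeruleus".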
The PESI web portal also provides a number of tools for quality control and to standardise a user's own species names. The ‘
The PESI taxon match tool has been promoted as an important tool within the PESI focal points network and at various other meetings (e.g. GBIF EU-nodes meetings). During the project phase, as part of the PESI Focal Points validation process, 357 files were uploaded and matched.
In contrast to the taxon match, where users have to upload a species list, the portal also provides a platform-independent
A few examples of possible applications:
getGUID: Get the first exact matching GUID for a given name.
getPESIRecords: Get one or more matching (max. 50) PESIRecords for a given name.
getPESINameByGUID: Get the correct name for a given GUID.
getPESIRecordByGUID: Get the complete PESI Record for a given GUID.
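To make the intended behaviour of these four methods concrete, the sketch below mimics them with a toy in-memory class. It does not call the real web service: the record fields (`guid`, `name`, `accepted_guid`), the placeholder GUID strings, the prefix-based matching in `getPESIRecords` and the resolution of a synonym to its accepted name in `getPESINameByGUID` are all assumptions made for illustration.

```python
class ToyPesiService:
    """In-memory stand-in that mirrors the four method names listed above."""

    MAX_RECORDS = 50  # cap taken from the getPESIRecords description

    def __init__(self, records):
        self._by_guid = {r["guid"]: r for r in records}

    def getGUID(self, name):
        # First exact match on the name, or None.
        for guid, rec in self._by_guid.items():
            if rec["name"] == name:
                return guid
        return None

    def getPESIRecords(self, name):
        # Assumption: "matching" is treated as a prefix match here.
        hits = [r for r in self._by_guid.values() if r["name"].startswith(name)]
        return hits[: self.MAX_RECORDS]

    def getPESIRecordByGUID(self, guid):
        return self._by_guid.get(guid)

    def getPESINameByGUID(self, guid):
        rec = self._by_guid.get(guid)
        if rec is None:
            return None
        # Assumption: a synonym record resolves to the name it points to.
        accepted = self._by_guid.get(rec.get("accepted_guid"), rec)
        return accepted["name"]


# Hypothetical records with placeholder identifiers.
demo = ToyPesiService([
    {"guid": "urn:example:1", "name": "Cyanistes caeruleus"},
    {"guid": "urn:example:2", "name": "Parus caeruleus", "accepted_guid": "urn:example:1"},
    {"guid": "urn:example:3", "name": "Parus major"},
])
```

A client would typically call `getGUID` once to pin down an identifier and then use the GUID-based methods, so that later name changes do not break the link.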
The PESI portal shows species distribution maps, if occurrence details are available. There are over 6 million distribution records in the PESI database. The maps are built on
There are two types of occurrence data on the portal:
The first type is provided by the component databases and directly included in the PESI Data Warehouse. The areas used have been standardised to
The second type of occurrence data are those provided by
The PESI portal provides links to other portals (see above). On name matching, more specifically:
The
The
The PESI project recognises the need to link the conventional scientific publications to the species databases to provide users with a more comprehensive resource. To that end the project engaged with major publishers of scientific journals to understand how their online information systems were developing, and to explain how the species databases were evolving. A first workshop was held on 16th July 2009 in Amsterdam inviting key publishers, including PLoS, InterResearch, Allen Press, CRC Press (Taylor & Francis), Oxford University Press, Scopus, Science Direct, Elsevier, ISI Web of Science of Thomson Reuters, OvidSP (Biological Abstracts), Wiley, ProQuest (part of Cambridge), and JSTOR (Suppl. material
The process of contacting publishers revealed that most publishers were very limited in the functionality they had on their websites, and constrained by the limitations of the commercially provided software they used to provide added functionality. A first step in that regard has been the initiation of a special
A similar approach was followed later on by Fauna Europaea in collaboration with
Another conclusion of this publishers' workshop was that an important, practical option for linking to a wider range of journals would be the use of RSS feeds, because this would require little or no action on the side of the journal. Such a system would need the feeds to be aware of what species (or higher taxonomic) names to search and match to the published papers. The
The highest priority journals for species databases to be linked are those describing new species, and rationalising species nomenclatures (e.g. identifying synonyms or reclassifying species). One of the leading taxonomic journals in this field is
In relation to this subject, the 'Environmental and Natural Science Publishing in Europe' (ENSI) proposal was drafted to the European Commission FP7-SCIENCE-IN-SOCIETY-2011-1 call for funding entitled ‘Improved dissemination and preservation of natural history publications’ (FP7 289063).
The project results have been disseminated in various ways. Some of the main public communication tools are mentioned below.
The PESI project homepage can be found here:
An interface to the European Taxonomic Backbone is provided by the PESI webportal:
PESI statistics for version 3 are summarised in Fig.
An introduction to PESI is available as a videoclip:
A PESI brochure is available here: Suppl. material
A PESI Flyer is available here: Suppl. material
PESI is well situated within the EC infrastructural and policy development. This is partly due to the fact that many end-users, stakeholders and EC directives (like
For establishing standards and sharing resources, PESI makes use of a huge network of taxonomic specialists in all European countries. Together with principal partners (like
In addition, PESI is selected by diverse EC bodies (like
PESI contributes to ongoing e-infrastructural developments, like
Some instances of the above synergies and developments are highlighted below.
The
Part of PESI's future progress lies in its potential to further engage and organise the participation of the taxonomic community as a vital, virtual workforce, including (i) the establishment of proper expert network governance, (ii) the application of proper ownership licences, (iii) the use of appropriate mechanisms for acknowledging expert contributions, (iv) the ability to function as an efficient 'knowledge hub', supporting biodiversity research and decision making more generally, and (v) the successful involvement of more open and dynamically organised social networks and communities, like non-professional taxonomists and citizen scientists.
Considering the overall decline of taxonomy as a scientific discipline, the maintenance of a basic (taxonomic) expertise capacity will be essential to satisfy the knowledge needs for a wide range of biodiversity-related information services in the near future, including the pan-European checklists. In PESI we moved forward from the efforts of
A network of European leading taxonomic institutions forms the
An important vehicle for authoring metadata provenance, providing a clear recognition of all contributors and enabling the receipt of credits as a formal scientific publication by means of citation, are 'data papers'. Data papers also play an important role in the publication of small bits and pieces of information, like new distributions, which would otherwise be difficult to publish (termed 'micro-publication'). Data papers allow a more flexible/dynamic way of expert involvement within the process of data collation and reviewing, because contributions are more easily acknowledgeable than is currently the case. Finally, by using novel e-publishing tools, the process of manuscript drafting is highly automated, being more convenient for the editors, but also enabling a direct cross-indexing with other relevant metadata resources, supporting feedback mechanisms in foreseen name annotation workflows (see also 'next generation name infrastructure' subsection, below).
PESI already supports the publishing of data papers accompanying the WoRMS, Fauna Europaea and Euro+Med updating process (see 'Results'). More emphasis will be put on this work within the
As part of the
Currently around 50% of the experts contributing to the updating of the pan-European checklists are professional taxonomists. However, the contributions of non-professional taxonomists will significantly increase in the near future. PESI should effectively anticipate this development by implementing relevant social networking mechanisms, including mentoring, educational and accreditation systems, in close collaboration with taxonomic institutes and societies.
An associated exercise is the adequate integration of efforts of voluntary biodiversity recorders, who provide a major contribution to the continuing monitoring of Europe's biodiversity. During PESI an ESF networking proposal was submitted entitled “Citizens Monitoring Biodiversity (CMD)” to optimise the involvement of the volunteer biodiversity observation community into the European biodiversity programs (
Indexing biodiversity is a global challenge. PESI contributes to worldwide efforts on preparing global catalogues (like CoL) and supports relevant name services (like GBIF-ECAT) on increasing the resolving power for integrating biodiversity data.
More particularly, PESI has extended the geographic scope of the pan-European checklists by involving Focal Points from outer European Union territories, with the ultimate intention of covering the whole Palearctic (see 'Outlooks' section). The Palearctic is the largest biogeographic area of the world, containing a very rich and unique flora and fauna within a vastly diverse environment of unique habitats, especially in biogeographic transition and refuge zones, like the Caucasus. Integrating the available taxonomic expertise, data and resources into shared research infrastructures is crucial for globalising biodiversity assessments.
PESI partners from Eastern European countries, up to Russia and the Caucasus, are actively involved in carrying out aspects of the PESI work plan. To progress the participation of Caucasus partners a proposal was drafted to the European Commission
Similarly, for the Mediterranean, a connection was made with national partners (Morocco, Algeria, Tunisia, Egypt) and biodiversity networks (
Vernacular names are the most important search terms for non-professional users to retrieve biodiversity information. By means of the network of Focal Points (see 'Results' section), PESI collected substantial additional information on European species, including non-scientific names.
As part of the
PESI is participating in innovative biodiversity-informatics projects, developing and staging the ongoing virtualisation of the biodiversity research domain, building virtual tools and workbenches, including the collective use of data from multiple sources, and the automation of workflows of various tasks and processes.
The pan-European checklists have been applying automation for around 15 years since the initial versions of their data management systems. This has included implementing advanced virtual workbenches, including largely automated data-entry and data cleaning routines, which has eliminated a lot of manual processing. PESI continued this practice by installing automated routines for, among others, assembling the PESI Data Warehouse and for supporting the validation of checklists by means of the automated mapping tools (see Results).
Workflow automation becomes more efficient when the restrictions of distributed architectures are further reduced, enhancing cross-platform interoperability. This requires a broad set of infrastructural adaptations, including the harmonisation and standardisation of APIs, data exchange formats and ontologies. As part of the
In its original stage, the PESI infrastructure implemented two gateways to European biodiversity based on the taxon-level information provided by the participating checklists:
a feature-rich web-portal offering convenient human-readable access,
a webservice layer, which has a small and effective set of methods for retrieving XML-encoded information that can be further processed by machines on any platform using any programming language.
Both webportal and webservices are optimised for a usage scenario with requests on individual objects (e.g. a particular name or taxon, information related to a particular identifier, etc.). However, we believe that PESI services will play an increasingly important role in workflow-driven systems. In this context, PESI will, for example, be used to expand a taxon name query to include its synonyms when combining independent scientific services that use different taxonomies. For this purpose several measures should be taken, including: (i) the extension of existing service layers supporting the retrieval of massive amounts of data, (ii) the optimisation of services for performance and reliability to ensure their usefulness in a workflow environment, and (iii) the optimisation of the PESI data warehouse structure for efficient output-oriented queries.
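The synonym-expansion use case mentioned above can be sketched in a few lines. The synonym table and the two example services are hypothetical; in a real workflow the expansion would come from a taxonomic backbone service rather than from a hard-coded dictionary.

```python
# Hypothetical synonymy: accepted name -> list of synonyms.
SYNONYMS = {
    "Cyanistes caeruleus": ["Parus caeruleus"],
}


def expand_query(name, synonyms=SYNONYMS):
    """Return the accepted name plus all its synonyms for any name in the group."""
    for accepted, syns in synonyms.items():
        if name == accepted or name in syns:
            return [accepted, *syns]
    return [name]  # unknown names are passed through unchanged


def query_service(service_records, name):
    """Query one data service with the expanded set of names."""
    wanted = set(expand_query(name))
    return [rec for rec in service_records if rec["name"] in wanted]


# Two hypothetical services that follow different taxonomies.
observations = [{"name": "Parus caeruleus", "site": "A"}]
sequences = [{"name": "Cyanistes caeruleus", "marker": "COI"}]
```

Querying both services for "Cyanistes caeruleus" then returns the observation filed under the older combination as well as the sequence filed under the current name.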
With these extensions, new workflow-oriented scientific applications can be realised and add further value to the PESI infrastructure. Both the EC-FP7
Further European Taxonomic Backbone advancements are scheduled as part of the EU BON and LifeWatch projects:
In the
A complementary effort is foreseen in
Special cases of automated workflows are Virtual Research Environments (VREs), which will provide the next generation of shared research environments. Virtual labs provide an interactive environment in which researchers can access, collect, integrate and explore large amounts of data from multiple resources for analysis, data mining, and visualisation, supported by automated workflows and standardised web services.
PESI is involved in e-Science projects developing virtual labs for scientists (as part of BioVeL) and fishery agencies (as part of
An example of how virtual labs could increase the analytical value of biodiversity portals is given in Fig.
Other instances of interactive environments can be found in advanced biodiversity data portals. As part of the Focal Points workplans, PESI stimulates the sharing of best practices on portal development, especially regarding taxonomic checklist governance, data exchange and sophisticated web tools. As an example, the Romanian PESI Focal Point receives dedicated support from the
Similarly the
A marine virtual research environment (VRE) has been created as the first operating component of the
Environments are changing rapidly all over the planet, making it imperative that our systems for communicating biological information work efficiently and reliably. Data based on proper identification of taxa are essential to monitor changes in nature; information needs to be integrated on the dynamics of species existence (migration, extinction, intrusion) and on the instability of the associated ecosystems. Ongoing, critical environmental assessments are important to document and control critical disorders, such as the decline of (native) pollinator species, the impact of algal blooms, the effect of overexploitation and the invasion of pest species. Taxonomy is a foundational science, but its reliable application is hindered by the limited knowledge of many aspects of biodiversity and the relative disorganisation and inaccessibility of taxonomic information. PESI contributes to the synthesis of, and access to, existing taxonomic knowledge by maintaining a network of outstanding experts and by ensuring the delivery of persistent standards and data integration routines, securing high-level access to biodiversity data.
The biodiversity community, in anticipation of the European Commission H2020 call for larger, more integrated networks, is: (i) pushing biodiversity research as an innovative information science (
PESI contributes to the development of a next generation linked open-data names architecture, expanding the inter-platform operability and making the workflow orchestration and task automation more efficient between associated name services (Fig.
In the longer term, a next-generation names architecture is developing feedback mechanisms to automate the cross-annotation of different workflows using taxonomic information. This further virtualisation provides a number of advantages. Firstly, because most biological information (observations and knowledge) is linked to names, this will significantly increase the shared use and discovery of biological information. Secondly, because every virtual interaction provides a virtual documentation, this will enable the generation and accumulation of new 'information facts', ultimately resulting in a novel data ecosystem, also called 'big data' science.
However, virtualisation and automation are not self-evident processes. They need careful monitoring of the existing information environment and exact knowledge of the amount of virtualisation and automation required, together with a defined strategy on the relevant infrastructural changes to be made. PESI is considering these steps, provisionally called
Taxonomic decisions are based on consideration of data in an interpretive framework. The selection of what data are used, how they are weighted, and the philosophical framework for interpretation are all individual choices made by the taxonomist. The choice is based on a taxonomist's skills, past experience, data availability, educational background, and the structure of diversity in the organisms under study. Thus, many taxonomists have a natural scepticism towards 'big data' attempts, as they see these as limiting their ability to make taxonomic judgements. However, in the case of a
As an example,
ZooBank began in 2008 as a stand-alone system, like many other current nomenclatural and taxonomic sources, but it became clear that its effectiveness would be greatly increased if it were developed as a service within the
ZooBank is the forerunner model for GNUB-based registration systems that can be developed in other nomenclatural domains. In addition there are many other services that GNUB can facilitate, leveraging the power of a robust code-based nomenclatural ‘skeleton’ and a usage-based ‘body’. Some of the GNUB services are indicated in the ZooBank interface shown in Fig.
GNUB/GNA provides a single shared platform for all cross-links, such that any time a record is indexed in GNA, it is automatically cross-linked to all other data systems that are indexed in GNA. Unlike most existing biodiversity data initiatives, the components of GNA (particularly GNUB) are not intended to provide novel information; rather, GNUB is an index of core facts that are shared across all of biology. Nothing in GNUB is original or novel content; it merely represents a structured way of organising information to facilitate broader data integration among other databases that do contain original information. Thus, the GNUB index does not compete with other data resources, but rather serves as a core infrastructure for cross-linking (and thereby empowering) other biological data sources.
Despite the traditional role of scholarly publications as the primary vehicle for disseminating and reusing peer-reviewed scientific findings, it was recognised in PESI that scholarly publishing can no longer persist merely as a method of communicating final results, because in that form it is an obstacle to efficient data sharing, reproducibility and reuse (see section e-Publications). Rather, scholarly publishing should become a part of the scientific process itself (
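The cross-linking principle described above can be sketched as a toy shared index in which each name string points to the records that use it, so that indexing a record makes it discoverable from every other indexed source. The class, its methods, the source labels and the record identifiers are illustrative assumptions; this is not the GNUB data model, which is considerably richer.

```python
from collections import defaultdict


class NameIndex:
    """Toy shared index: a name string points to the records that use it.

    The index stores no original content, only (source, record id) links.
    """

    def __init__(self):
        self._links = defaultdict(set)

    @staticmethod
    def _key(name):
        # Normalise case and whitespace so trivial variants share one key.
        return " ".join(name.lower().split())

    def index(self, name, source, record_id):
        """Register that a record in a given source uses this name."""
        self._links[self._key(name)].add((source, record_id))

    def cross_links(self, name, exclude_source=None):
        """Return all indexed records for the name, optionally skipping
        the source from which the request originates."""
        links = self._links.get(self._key(name), set())
        return sorted(link for link in links if link[0] != exclude_source)


idx = NameIndex()
# Hypothetical sources and identifiers, invented for illustration.
idx.index("Exemplum primum", "checklist_a", "A-1")
idx.index("exemplum  primum", "literature_b", "B-42")

links_for_a = idx.cross_links("Exemplum primum", exclude_source="checklist_a")
```

In this sketch the second source becomes visible to the first as soon as it is indexed, without either source having to know about the other.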
EU-nomen will contribute to the further integration of scholarly publishing services as an integral and interlinked part of the whole data gathering, data mobilisation and research process, synchronised with data sources, data aggregators, attribution and annotation services, and ontology frameworks. Technically this integration of the publishing and research processes will partly be achieved through a commonly developed and shared API library, based on community-agreed data exchange formats.
On expert engagement, a special role in the process will be played by the "data paper" concept as an important instrument of data mobilisation, publication and community involvement (
Because taxonomic information provides the primary identification of an object and is a prerequisite for making other biodiversity data discoverable and available, the evaluation of the state and trends of biodiversity is only possible with the help of an elaborated taxonomy. Accordingly the
EU BON supports advances on the PESI work program, focussing on improving the pan-European terrestrial (
From the earliest stage, because of the extended geographic scope of the pan-European checklists, experts and institutes of European neighbourhood countries (including Russia) have been involved in the checklist work programs. Consequently, enhancing the range and collaboration towards a full Palearctic coverage follows in the footsteps of earlier projects, like
For two centuries, the Natural History Institutes in Eastern European and Central Asian countries have accumulated extensive knowledge and huge collections of the flora and fauna of this wide region. The integration and implementation of this information into existing global or European biodiversity databasing initiatives is, however, inadequate because of a suboptimal use of shared virtual and social infrastructures. The proposed "Flora/Fauna/Mycota Palearctica" project will solve this impediment by intensifying the existing collaboration with
The initial set-ups of the pan-European checklists were established as part of the respective EC-FP4
Preparatory work on PESI was done during the EC-FP6
The further development of PESI (as 'EU-Nomen') is partially covered by the EC-FP7
We thank the
Five community networks (horizontal) are integrated in five categories of coordination effort (vertical) in PESI. Community networks represent the FP4 and FP5 key programs on European taxonomic indexing: Fauna Europaea, ERMS, Euro+Med PlantBase, supplemented by Index Fungorum and AlgaeBase.
PESI Expert Communities Common Infrastructure outline, showing a common governance organisation, including relevant expert(ise) network management tools.
Europe's integrated taxonomic workforce as established in EDIT and PESI, brought together (as a pilot) into a shared expert system.
The
PESI Focal Points Network, showing the respective Fauna Europaea, Euro+Med PlantBase and ERMS focal point partners. Northern African 'proto focal points' are indicated with open circles. Pan-Caucasian Plant Biodiversity Initiative partners are indicated with green asterisks.
Focal Points network details are described in Suppl. material
The role of PESI Focal Points as an addition to the expert network.
Simplified architecture of the EDIT Platform for Cybertaxonomy. Internal data stores are encapsulated by Java and web service APIs.
Information flow for merging and publication of checklist data in PESI (after Suppl. material
Simplified
PESI web-portal homepage (screenshot).
PESI web-portal statistics showing the number of unique (monthly) visitors and the number of (monthly) visits since the start of the project (see: Suppl. material
PESI web-portal advanced search interface (screenshot).
PESI web-portal example of a search and output of all species and infraspecies of molluscs
PESI Portal and PESI Data Warehouse (version 3) statistics. Source:
Aspects of increased biodiversity platform interoperability as established in the ViBRANT project (after: Suppl. material
Example of virtual lab application (Fauna Europaea VRE pilot), showing the species distribution predictions of
Screen capture of a biodiversity dedicated user interface in myBiOSis environment.
Screen capture of the Assessment toolkit in myBiOSis environment. This figure demonstrates flexibility of the system to accommodate distinct projects and research scopes.
LifeWatch marine virtual research environment (VRE) (source:
Potential workflows in a next generation linked open-data names architecture.
Global Names Architecture schematic representation of three cross-reference layers model.
An example ZooBank page (
Contribution to the Global Name Architecture (GBIF/ECAT white paper)
File: oo_44321.pdf
The European Taxonomic Work force (ETW), its tasks, activities and operational standards inspiration by the Open Source Society
File: oo_43994.pdf
The Government of IPR of Electronic Biodiversity Data
File: oo_43998.pdf
How to complete taxonomic gaps in the pan-European species registers, including experts and informatics resources
File: oo_44000.pdf
PESI workshop on linking taxonomic databases with online science journals
File: oo_44002.pdf
Design of a mechanism to keep control of the continuity of European electronic (taxonomic) biodiversity data resources and expertise networks
File: oo_44001.pdf
PESI Business Plan
File: oo_44003.pdf
The future of taxonomy – the role of national focal points networks in taxonomic information infrastructure networks
File: oo_44012.pdf
PESI Focal Points Working Plan
File: oo_44007.pdf
PESI Focal Points Handbook
File: oo_44008.pdf
Report on authoritative taxonomic standards from multiple sources suitable for deployment within European Research Area.
File: oo_44049.pdf
Report on Procedures and Mechanisms for the functioning of Nomenclators within the e-Infrastructure
File: oo_44050.pdf
The future of taxonomy – the role of GSD-networks and nomenclators in taxonomic information infrastructure networks (GNOMA).
Global Nomenclator Architecture (GNOMA) meeting on the contribution of nomenclators to the Global Names Architecture (GNA). ZooBank provisioning meeting on defining strategies for the uploading of ZooBank, especially on the contribution of taxonomic (zoological) key-resources.
File: oo_44047.pdf
The future of taxonomy – the role of GSD-networks and nomenclators in taxonomic information infrastructure networks (ZooBank).
Initial scoping meeting on GSDs and nomenclators involvement
File: oo_44048.pdf
Application and Adoption of Taxonomic Standards
File: oo_44063.pdf
Report on the contributions to the set up of a Global Name Architecture
File: oo_44064.pdf
Global Names Europe (GN-EU) – a names based cyber-infrastructure
File: oo_44068.pdf
GN-EU – a names based cyberinfrastructure contributing to the Global Names Architecture developments as a necessary component of Research Data e-Infrastructures : Framework for Action in H2020
File: oo_44069.pdf
PESI Report on the criteria, procedures and mechanisms for quality control
File: oo_44209.pdf
PESI joint e-infrastructure disseminating pan-European checklists
File: oo_44210.pdf
Versioning and the use of GUIDs for PESI
File: oo_44211.pdf
Working plan to support European GSDs maintenance and updating
File: oo_45976.pdf
Sustainability of European GSDs: Quantify financial and other resources to ensure long-term maintenance of European GSDs database systems
File: oo_45977.pdf
PESI web portal
File: oo_44220.pdf
Towards a Common Agenda on Prioritised Taxa
File: oo_44318.pdf
Applying taxonomy to organise and deliver publications to biologists
File: oo_44186.pdf
EDIT Scientific Publishing in Natural History Institutions 3rd meeting
Joint EDIT - PESI meeting on Scientific Publishing in NHIs
File: oo_44009.pdf
A recommended design for “BiodiversityKnowledge”, a Network of Knowledge to support decision making on biodiversity and ecosystem services in Europe
File: oo_44375.pdf
ViBRANT: Design of robust services
Also available at: http://vbrant.eu/content/d43-design-robust-services
File: oo_44570.pdf
Taxonomic backbone databases integrated with EDIT platform and EU BON portal [EU BON MS121]
File: oo_45086.pdf
EU BON taxonomic backbone and services prototype integrated in EU BON portal [EU BON MS122]
File: oo_45087.pdf
PESI Brochure
File: oo_45625.pdf
PESI Flyer
File: oo_45626.pdf
PESI Focal Point Network description
File: oo_57883.pdf
PESI Webstatistics
xlsx
File: oo_49019.xlsx
PESI videoclip
wmv
File: oo_57459.wmv
Coordination framework for grouped PESI focal points
png
Eight institutes committed themselves to a coordinating role for Focal Point activities covering larger regions in the EU and adjacent countries, should funding become available. This figure gives a schematic presentation of this future effort, which is still to be elaborated.
File: oo_57933.png