Biodiversity Data Journal :
Data Paper (Biosciences)
Corresponding author: Nina Filippova (
Academic editor: Dmitry Schigel
Received: 20 Oct 2021 | Accepted: 06 Dec 2021 | Published: 13 Dec 2021
© 2021 Nina Filippova, Dmitry Ageev, Sergey Bolshakov, Evgeny Davydov, Aleksandra Filippova, Ilya Filippov, Sergei Gashkov, Irina Gorbunova, Ludmila Kalinina, Nadezhda Kudashova, Ekaterina Palomozhnykh, Natalia Shabanova, Maria Tomoshevich, Olga Vayshlya, Anastasia Vlasenko, Vyacheslav Vlasenko, Irina Vorob'eva, Lidia Yakovchenko, Elena Zvyagina
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Filippova N, Ageev D, Bolshakov S, Davydov EA, Filippova A, Filippov I, Gashkov S, Gorbunova I, Kalinina L, Kudashova N, Palomozhnykh E, Shabanova N, Tomoshevich M, Vayshlya O, Vlasenko A, Vlasenko V, Vorobʼeva I, Yakovchenko L, Zvyagina E (2021) The fungal literature-based occurrence database for southern West Siberia (Russia). Biodiversity Data Journal 9: e76789.
The paper presents an initiative on literature-based occurrence data mobilisation of fungi and fungi-related organisms (literature-based occurrences, Darwin Core MaterialCitation) to develop the Fungal literature-based occurrence database for southern West Siberia (FuSWS). The mobilisation of literature-based occurrence data started in the northern part of West Siberia in 2016. The present project extends the initiative to the southern regions and includes ten administrative territories (Tyumen Region, Sverdlovsk Region, Chelyabinsk Region, Omsk Region, Kurgan Region, Tomsk Region, Novosibirsk Region, Kemerovo Region, Altai Territory and Republic of Altai). The area occupies the central to southern part of the West Siberian Plain and extends about 1,500 km from west to east, from the eastern slopes of the Ural Mountains to the Yenisey River, and about 1,300 km from north to south. The total area is about 1.4 million km².
The initiative is actively growing in terms of spatial coverage, collaboration and data accumulation. A working group of about 30 mycologists from eight organisations, dedicated to the data mobilisation, was created as part of the Siberian Mycological Society (an informal organisation since 2019). The group has compiled an almost complete bibliographic list of mycology-related papers for southern West Siberia, including over 900 publications for the last two centuries (the earliest dated 1800). All literature sources were digitised and an online library was created to integrate bibliographic metadata and digitised papers using the Zotero bibliography manager. The analysis of published sources showed that about two-thirds of the works contain occurrences of fungi within the scope of mobilisation.
At the time of the paper submission, the database had been populated with a total of about 8 K records from 93 sources. The dataset is uploaded to GBIF, where it is available for online search of species occurrences and/or download. The project's page with the introduction, templates, bibliography list, video-presentations and written instructions is available (in Russian) at the web site of the Siberian Mycological Society. The initiative will be continued in the following years to extract the records from all published sources.
The paper presents the first project with the aim of literature-based occurrence data mobilisation of fungi and fungi-related organisms in the southern West Siberia. The full bibliography and a digital library of all regional mycological publications created for the first time includes about 900 published works. By the time of paper submission, nearly 8 K occurrence records were extracted from about 90 literature sources and integrated into the FuSWS database published in GBIF.
occurrence, specimen, materialCitation, funga, fungi, Mycobiota, digitisation, biodiversity data mobilisation, GBIF
The mycological research in the southern part of West Siberia stems from isolated studies at the end of the 19th century, yet regular and systematic research only began in the second half of the century. Over the following decades, several dozen researchers worked in the area and a total of over 1000 scientific works were published. The history of research of particular fungal groups was earlier described in a series of publications (
The lichen diversity in the region has been studied for more than a hundred years. Irregular collections of lichens were started at the end of the 19th century by broad-scale collectors (from:
Agaricoid basidiomycetes are a well-studied group in the region, although the coverage is uneven. Scientists performed targeted surveys of the group in Novosibirsk Region, Tomsk Region, Republic of Altai and Altai Territory. In the 1930s, Rolf Singer, a prominent mycologist of the 20th century, visited the Altai Mountains, accompanied by Lubov' N. Vasiljeva. The collections made during the fieldwork were studied and cited in a number of papers, including the monumental "Das System der Agaricales III" (
Gasteroid fungi were studied by Yury A. Rebriev with co-authors (
Clavarioid fungi were inventoried in different regions by Anton G. Shiryaev with colleagues (
The history of aphyllophoroid basidiomycetes research is described in
Study of fungal plant pathogens in West Siberia started at the beginning of the 20th century. Several regional checklists were created back then (
The history of myxomycetes research in West Siberia was presented in a paper (
The description of the history of research was not intended to be complete and only describes the main fields of research of fungal diversity in the region and lists the key researchers and works. For a full mycological bibliography for the southern West Siberia, the reader is invited to read Suppl. material
The data mobilisation working group of the Siberian Mycological Society.
The working group of about 30 mycologists from eight organisations dedicated to the fungal literature-based records mobilisation initiative was created as part of the Siberian Mycological Society (informal organisation since 2019).
The project was aimed at mobilisation of species records accumulated in the course of previous mycological studies and published in peer-reviewed scientific literature from the beginning of research up to date (
The following protocol was used to standardise and improve the mobilisation workflow:
The bibliography was compiled using Zotero bibliographic manager. Only published works (peer-reviewed papers, conference proceedings, PhD theses, monographs or book chapters) were selected. If possible, the sources were scanned and added to the library as PDF files.
The template of the FuSWS database was made with Google Sheets and simple Microsoft Excel templates. The Darwin Core standard was applied to the database field structure to accommodate the relevant information extracted from the publications. In total, 31 fields (see detailed description in Data resources) were selected to describe the literature-based occurrence data in the needed detail.
From the available publications related to the region, only works with species occurrence reports were selected for databasing. The main source of occurrences was annotated species lists with exact localities of the records. However, other kinds of species citations were also included, provided that they were tied to a geographic location and could be georeferenced at least to the regional level.
Most of the occurrences were georeferenced, either from the coordinates provided in the paper or from the verbatim description of the field work locality. The verbatim descriptions were georeferenced using Yandex or Google map services. Depending on the quality of the georeference provided in the publication, the coordinate uncertainty was estimated as follows: 1) a coordinate of a fruiting structure or a plot provided in the publication gives an uncertainty of about 3-30 m; 2) a coordinate of the field work locality provided in the publication gives an uncertainty between 500 m and 5 km; 3) a report of the species presence in a particular region gives the centroid of the area, with an uncertainty radius large enough to include its borders.
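The three-tier uncertainty protocol can be illustrated with a minimal Python sketch. The names `GeorefSource` and `estimate_uncertainty_m` are hypothetical, and the choice of the upper bound of each published range is an assumption made here for a conservative estimate; the paper does not state how a single value was picked within each range.

```python
from enum import Enum
from typing import Optional


class GeorefSource(Enum):
    """Kind of spatial information a publication gives for a record."""
    POINT = "point"        # coordinate of a fruiting structure or a plot
    LOCALITY = "locality"  # coordinate of the field work locality
    REGION = "region"      # species only reported for a whole region


# Uncertainty ranges in metres, as listed in the protocol.
UNCERTAINTY_RANGE_M = {
    GeorefSource.POINT: (3, 30),
    GeorefSource.LOCALITY: (500, 5000),
}


def estimate_uncertainty_m(source: GeorefSource,
                           region_radius_m: Optional[float] = None) -> float:
    """Return a coordinate uncertainty in metres for one record.

    For region-level reports the caller supplies the radius that encloses
    the region borders around its centroid.
    """
    if source is GeorefSource.REGION:
        if region_radius_m is None:
            raise ValueError("region_radius_m is required for region-level records")
        return float(region_radius_m)
    _low, high = UNCERTAINTY_RANGE_M[source]
    # Assumption: use the upper bound of the range as a conservative value.
    return float(high)
```

A region-level record would be handled as `estimate_uncertainty_m(GeorefSource.REGION, region_radius_m=...)`, with the radius measured on a map of the administrative unit.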
The locality names were preserved in the field «verbatimLocality» for accuracy.
When possible, the «eventDate» was extracted from the annotation data. Whenever this information was absent, the date of the publication was used instead, with a remark about the origin of the date in the «eventRemarks» field.
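The date fallback rule can be sketched as follows. The function name `resolve_event_date` and the wording of the remarks are hypothetical; only the rule itself (annotation date first, publication year otherwise, with a note on the origin) comes from the protocol.

```python
from typing import Optional, Tuple


def resolve_event_date(annotation_date: Optional[str],
                       publication_year: int) -> Tuple[str, str]:
    """Return (eventDate, remark on the origin of the date)."""
    if annotation_date and annotation_date.strip():
        # The annotation carries its own date: use it as is.
        return annotation_date.strip(), "date taken from the record annotation"
    # No date in the annotation: fall back to the year of publication.
    return str(publication_year), "date is the year of publication"
```

Keeping the remark next to the date lets a data user filter out records whose date only reflects when the source was published.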
The ecological features, habitat or relief were written in the «habitat» field and preserved in Russian.
The substrate is an important feature of fungal occurrences and was extracted into the «fieldNotes» field.
Other annotation records, including the abundance, fruiting season and others, were accommodated in the «occurrenceRemarks» field.
The original scientific names reported in publications were entered in the «verbatimScientificName» field and preserved in the original database. After correction of spelling errors with the GBIF Species Matching tool, this field was used to create the «scientificName» field. The tool was also used to create the additional fields of taxonomic hierarchy from species to kingdom, to fill in the «taxonRank» field and to synonymise names according to the GBIF Backbone Taxonomy.
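The name-matching step can be approximated programmatically. The sketch below is an assumption-laden illustration, not the authors' workflow: it uses the public GBIF species match endpoint (`https://api.gbif.org/v1/species/match`) instead of the web-based Species Matching tool, and the helper names `build_match_url` and `to_dwc_fields` are hypothetical. The network call itself is left out, so the mapping function only expects a dictionary shaped like the JSON that the endpoint returns.

```python
from typing import Dict
from urllib.parse import urlencode

GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"
HIGHER_RANKS = ("kingdom", "phylum", "class", "order", "family", "genus")


def build_match_url(verbatim_name: str) -> str:
    """Build the request URL for one verbatim scientific name."""
    return f"{GBIF_MATCH_URL}?{urlencode({'name': verbatim_name.strip()})}"


def to_dwc_fields(verbatim_name: str, match: Dict) -> Dict[str, str]:
    """Map a parsed match response to Darwin Core style fields.

    Names without a match keep only the verbatim name, so that they
    can be reviewed manually.
    """
    fields = {"verbatimScientificName": verbatim_name}
    if match.get("matchType", "NONE") == "NONE":
        return fields
    fields["scientificName"] = match.get("scientificName", "")
    fields["taxonRank"] = match.get("rank", "").lower()
    for rank in HIGHER_RANKS:
        if rank in match:
            fields[rank] = match[rank]
    return fields
```

Separating URL construction from response mapping keeps the sketch testable offline; the response can be fetched with any HTTP client and passed to `to_dwc_fields` after JSON parsing.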
To track the digitisation process, a metadata worksheet was maintained. Each bibliographic record had a series of fields to describe the digitisation process and its results: the total number of extracted occurrence records, general description of the occurrence quality, presence of the observation date, details of georeferencing and the name of a person responsible for the digitisation.
The dataset is limited by the administrative borders of ten regions (Tyumen Region, Sverdlovsk Region, Chelyabinsk Region, Kurgan Region, Omsk Region, Tomsk Region, Novosibirsk Region, Kemerovo Region, Altai Territory and Republic of Altai).
The region occupies the central to southern part of the West Siberian Plain. The area extends about 1,500 km from west to east, from the eastern slopes of the Ural Mountains to the Yenisey River, and about 1,300 km from north to south. The total area is about 1.4 million km².
The area is very diverse in biogeographical terms, including several vegetation zones from steppe to taiga forest and mountain ecosystems. The relief in the central part is mainly a plain, but the south-eastern part of the area is occupied by several mountain systems of Altai, Salair, Kuznetsk Alatau and Gornaya Shoriya. The western part of West Siberia is bordered by the Ural Mountains.
Most administrative divisions were covered by mycological research, but the intensity of the research varies (Fig.
49.309 and 60.907 Latitude; 61.518 and 95.4909 Longitude.
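As a simple quality check, the bounding coordinates above can be used to flag records that fall outside the study area. This is a minimal sketch; the function name `in_bounding_box` is hypothetical, and a rectangular box is only a coarse filter, since the administrative borders are not rectangular.

```python
# Bounding coordinates of the dataset in decimal degrees.
MIN_LAT, MAX_LAT = 49.309, 60.907
MIN_LON, MAX_LON = 61.518, 95.4909


def in_bounding_box(lat: float, lon: float) -> bool:
    """Return True if the point lies inside the dataset bounding box."""
    return MIN_LAT <= lat <= MAX_LAT and MIN_LON <= lon <= MAX_LON
```

A point that passes this test may still lie outside the ten regions, so the check is useful for catching gross georeferencing errors such as swapped latitude and longitude.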
According to the database summary report by the time of paper submission, there are occurrences of about 2200 species mobilised in the FuSWS database, which represent 800 genera, 230 families, 80 orders, 19 classes, five phyla and three kingdoms (Fungi, Protozoa, Chromista) (Fig.
About 90 publications for the last century.
This work is licensed under a Creative Commons Attribution (CC-BY) 4.0 License.
The dataset includes a table in Darwin Core format with 31 original fields and about 8 K records.
Column label | Column description
occurrenceID | An identifier of a particular occurrence, unique within this dataset. We used a simple 5-digit incremental number format.
basisOfRecord | According to the DwC recommendation, all literature-based records published to GBIF should have the value "MaterialCitation" (currently unavailable in IPT, but we will change it to this value in the future).
bibliographicCitation | The bibliographic citation of the publication from which the occurrence was extracted, formatted in the Elsevier - Harvard (with titles) style.
catalogNumber | The collection number or field number of the specimen, if provided in the annotation (for example, LE 255111 is a specimen stored in the Komarov Botanical Institute RAS).
coordinateUncertaintyInMeters | See "Sampling methods" for the description of the uncertainty calculation protocol.
countryCode | The standard code for the country in which the locality occurs (RU).
county | The full, unabbreviated name of the next smaller administrative region than stateProvince (район, i.e. district).
decimalLatitude | The geographic latitude provided in the publication or determined from the provided geographic description; see "Sampling methods" for georeferencing details.
decimalLongitude | The geographic longitude provided in the publication or determined from the provided geographic description; see "Sampling methods" for georeferencing details.
eventDate | The full date of the observation event, if provided in the annotation, or the year of publication, if the date is absent from the annotation of the record. If the year of publication was used, a corresponding remark was added in eventRemarks.
fieldNotes | The description of the substrate.
geodeticDatum | The geodetic datum upon which the geographic coordinates are given.
georeferencedBy | The person who determined the georeference.
georeferenceProtocol | See "Sampling methods" for georeferencing details.
georeferenceSources | The resource used to georeference the locality.
habitat | The description of the habitat, including vegetation or relief.
identifiedBy | The person who identified the taxon.
kingdom | The full scientific name of the kingdom in which the taxon is classified.
locality | The original locality description of the collection place below county level, in English.
occurrenceRemarks | Other annotations to the record, including abundance, phenology etc.
recordedBy | The person responsible for the original occurrence record, if present in the annotation.
scientificName | The original name as provided in the publication, but corrected for spelling mistakes using the GBIF Species Matching tool.
stateProvince | The name of the next smaller administrative region than country (область, край, республика, i.e. region, territory, republic).
taxonRank | The taxonomic rank of the most specific name in the scientificName as it appears in the original publication.
verbatimElevation | The original description of the elevation.
eventRemarks | Information on whether the eventDate was extracted from the annotation or from the year of publication.
verbatimLocality | The original locality description of the collection place below county level, preserved in the original language.
verbatimLatitude | The original latitude format as it was provided in the publication.
verbatimLongitude | The original longitude format as it was provided in the publication.
verbatimEventDate | The original representation of the date as it was provided in the publication.
language | The language of the dataset, which is Russian for some of the fields (bibliographicCitation, verbatimLocality, occurrenceRemarks, habitat) and English for other fields.
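A few of the field rules in the table lend themselves to automated checks. The sketch below is illustrative only: the helper names `format_occurrence_id` and `check_record` are hypothetical, the starting number of the identifier sequence is an assumption, and only three rules are covered (5-digit occurrenceID, countryCode equal to RU, latitude and longitude filled together).

```python
import re
from typing import Dict, List


def format_occurrence_id(n: int) -> str:
    """Format a sequence number as a zero-padded 5-digit identifier."""
    # Assumption: the sequence starts at 1; the paper only states the format.
    if not 1 <= n <= 99999:
        raise ValueError("sequence number does not fit in five digits")
    return f"{n:05d}"


def check_record(record: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one table row (empty if none)."""
    problems = []
    if not re.fullmatch(r"[0-9]{5}", record.get("occurrenceID", "")):
        problems.append("occurrenceID is not a 5-digit number")
    if record.get("countryCode") != "RU":
        problems.append("countryCode is not RU")
    has_lat = bool(record.get("decimalLatitude"))
    has_lon = bool(record.get("decimalLongitude"))
    if has_lat != has_lon:
        # Records without coordinates are allowed, half-filled pairs are not.
        problems.append("only one of decimalLatitude/decimalLongitude is filled")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report all issues of a row back to the person who digitised the source.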
The research was partially funded by the grant of the Tyumen Region Government in accordance with the Program of the West Siberian Interregional Scientific and Educational Center, National Project "Nauka"; and a grant for organisation of New Young Researcher Laboratories as part of the implementation of the National Project "Science and Universities"; Elena A. Zvyagina’s work was supported by a grant from the Russian Foundation for Basic Research (RFBR) 20-04-00349; Maria A. Tomoshevich and Irina G. Vorob'eva were funded by budgetary project No. AAAA-A21-121011290027-6 of the Central Siberian Botanical Garden of the Siberian Branch of the Russian Academy of Sciences (CSBG SB RAS), Novosibirsk, Russia.
All authors participated in the compilation of the bibliography, the extraction of the species occurrences included in the database and the revision of the paper. Nina Filippova initiated the digitisation effort and was responsible for data integration and publishing in GBIF. Sergei Bolshakov carried out most of the data cleaning in the database. Sergei Bolshakov and Dmitry Ageev compiled the bibliography in Zotero. Ilya Filippov prepared the distribution map.
The bibliography presents all mycological scientific publications (journal papers, conference proceedings, PhD theses, monographs and book chapters) related to the mycological research in southern West Siberia from the beginning of research to date.