The fungal literature-based occurrence database for southern West Siberia (Russia)

Abstract
Background
The paper presents an initiative to mobilise literature-based occurrence data on fungi and fungi-related organisms (literature-based occurrences, Darwin Core MaterialCitation) in order to develop the Fungal literature-based occurrence database for southern West Siberia (FuSWS). The mobilisation of literature-based occurrence data started in the northern part of West Siberia in 2016. The present project extends the initiative to the southern regions and includes ten administrative territories (Tyumen Region, Sverdlovsk Region, Chelyabinsk Region, Omsk Region, Kurgan Region, Tomsk Region, Novosibirsk Region, Kemerovo Region, Altai Territory and Republic of Altai). The area occupies the central to southern part of the West Siberian Plain and extends for about 1,500 km from west to east, from the eastern slopes of the Ural Mountains to the Yenisey River, and for about 1,300 km from north to south. The total area is about 1.4 million km². The initiative is actively growing in terms of spatial coverage, collaboration and data accumulation. A working group of about 30 mycologists from eight organisations, dedicated to the data mobilisation, was created as part of the Siberian Mycological Society (an informal organisation since 2019). They have compiled an almost complete bibliography of mycology-related papers for southern West Siberia, including over 900 publications from the last two centuries (the earliest dated 1800). All literature sources were digitised and an online library was created to integrate bibliographic metadata and digitised papers using the Zotero bibliography manager. The analysis of published sources showed that about two-thirds of the works contain occurrences of fungi within the scope of the mobilisation. At the time of paper submission, the database had been populated with about 8,000 records from 93 sources. The dataset is uploaded to GBIF, where it is available for online search of species occurrences and for download. The project's page with the introduction, templates, bibliography list, video presentations and written instructions is available (in Russian) on the website of the Siberian Mycological Society. The initiative will be continued in the following years to extract the records from all published sources.
New information
The paper presents the first project aimed at mobilising literature-based occurrence data on fungi and fungi-related organisms in southern West Siberia. The full bibliography and digital library of all regional mycological publications, created for the first time, include about 900 published works. By the time of paper submission, nearly 8,000 occurrence records had been extracted from about 90 literature sources and integrated into the FuSWS database published in GBIF.


Introduction
Mycological research in the southern part of West Siberia stems from isolated studies at the end of the 19th century, yet regular and systematic research only began in the second half of the 20th century. Over the following decades, several dozen researchers worked in the area and a total of over 1000 scientific works were published. The history of research on particular fungal groups was described earlier in a series of publications (Milovidova 1983, Davydov and Skachko 2014, Shirjaeva 2015, Sedel'nikova 2017). Below, we describe the history of mycological research in the southern part of West Siberia by traditionally studied morphological or ecological groups.

Overview of the mycological research reflected in the database
The lichen diversity in the region has been studied for more than a hundred years. Irregular collecting of lichens was started at the end of the 19th century by broad-scale collectors (from: Sedel'nikova 2017). This first period was summarised in a book chapter by Savich and Elenkin (1950). More systematic research on lichens started in the region in the second half of the 20th century. In total, about 10 lichenologists worked in the area and published the results of inventory or monitoring work. Systematic research on lichen diversity was carried out by Nellya V. Sedelnikova in several regions of southern West Siberia (occurrences summarised in Sedel'nikova 2017). Evgeny A. Davydov has been studying the lichen biota of the Altai Mountains (Davydov 2001, Davydov 2004, Davydov et al. 2007, Davydov and Printzen 2012, Davydov and Printzen 2012, Davydov 2012, Davydov et al. 2012) and Elena Y. Skachko that of the Altai plain (Barnaul vicinity) (Skachko 2003). Vera V. Koneva described the lichen communities and diversity in Tomsk Region (Koneva 2003). Eugene V. Barsukov studied lichen communities of pine forests in Novosibirsk Region (Barsukov 2001). In Omsk Region, the lichen diversity of the forest-steppe zone was studied by Natalia V. Sorokina (Sorokina 2001a, Sorokina 2001b). Ekaterina V. Romanova revealed the bioindicator activity of lichens in Novosibirsk Region (Sedel'nikova and Romanova 2010). A detailed history of lichen research in Altai Territory is provided in Davydov and Skachko (2014) and for West Siberia as a whole in Sedel'nikova (2017). A number of important new records and species new to science were reported recently (Vondrák et al. 2016, Davydov and Konoreva 2017, Davydov and Yakovchenko 2017, Yakovchenko et al. 2017, Yakovchenko and Davydov 2018, Paukov et al. 2019, Davydov et al. 2021).
Agaricoid basidiomycetes are a well, but unevenly, studied group in the region. Scientists performed targeted surveys of the group in Novosibirsk Region, Tomsk Region, Republic of Altai and Altai Territory. In the 1930s, Rolf Singer, a prominent mycologist of the 20th century, visited the Altai Mountains, accompanied by Lubov' N. Vasiljeva. The collections made during the fieldwork were studied and cited in a number of papers, including the monumental "Das System der Agaricales III" (Singer 1943). In the 1960s, Nina V. Perova effectively established the "mycological centre" in Novosibirsk, which initiated the surveys of larger fungi of southern West Siberia, namely in Altai Republic, as well as Novosibirsk and Tomsk Regions, with minor data from Kemerovo Region (Perova and Gorbunova 2001). In the 1990s, her successor Irina A. Gorbunova continued the work in various parts of the region, including several protected areas (Perova and Gorbunova 2007, Gorbunova et al. 2011, Gorbunova 2017, Gorbunova 2018). In the 2000s, Natalia P. Kutafieva started surveys in Tomsk Region (Kosheleva and Kutaf'eva 2004). Later, Nadezhda N. Kudashova (Agafonova N.N.) with colleagues summarised all known data on larger fungi of the region (Kosheleva and Kutaf'eva 2004) and subsequently added new information (Kudashova et al. 2016a, Kudashova et al. 2016b). Within the borders of Kurgan, Omsk, Kemerovo and Tyumen Regions, only scattered data on agaricoid fungi can be found in several summarising works (Stepanova and Sirko 1977, Anonymous 1980, Mukhin 1993, Perova and Gorbunova 2001). A detailed history of mycological studies in Sverdlovsk Region can be found in the summary by Olga S. Shiryaeva (Shirjaeva 2015).
Clavarioid fungi were inventoried in different regions by Anton G. Shiryaev with colleagues (Shiryaev 2008b, Shiryaev and Gorbunova 2012, Vlasenko and Vlasenko 2017b). The history of aphyllophoroid basidiomycetes research is described in Vlasenko (2013a). Until recently, southern West Siberia was less studied than the northern part. At the beginning of the 20th century, sporadic collecting was done by Nikolay N. Lavrov, Konstantin E. Murashkinskiy, M. K. Ziling, V. P. Dravert, V. V. Popov and others (from: Vlasenko 2013a). E. Zhukov studied lignicolous basidiomycetes in Altai, Novosibirsk and Tomsk Regions in more detail at the end of the 20th century (Zhukov 2002, Zhukov 2005). Sverdlovsk Region, particularly the Ural Mountains, has been relatively well studied by the school of aphyllophorologists (Mukhin 1993, Mukhin et al. 2003). In Altai Territory, important inventories were made in the pine forests of the forest-steppe zone (Vlasenko 2010, Vlasenko 2013a), in plantations and native forests of Novosibirsk (Vlasenko 2013b, Vlasenko 2014) and in different nature protected areas (Vlasenko and Vlasenko 2015, Vlasenko and Vlasenko 2017a). In Tomsk Region, the first checklist of aphyllophoroid fungi was published in Agafonova et al. (2007b). The biological activity of agaricoid and aphyllophoroid basidiomycetes has been studied in Novosibirsk in several research projects (for example, Protsenko et al. 2019).
The history of myxomycetes research in West Siberia was presented in Vlasenko (2008). The first inventory of this group in Tomsk Region was made by Nikolay N. Lavrov (Lavrov 1927, Lavrov 1931). Recently, the systematic research of different ecological groups of myxomycetes in different regions was advanced by Anastasia V. Vlasenko with co-authors (Vlasenko 2011, Vlasenko and Novozhilov 2011, Vlasenko 2013, Vlasenko 2020, amongst others).
This description of the history of research is not intended to be complete; it covers only the main fields of research on fungal diversity in the region and lists the key researchers and works. For a full mycological bibliography for southern West Siberia, the reader is referred to Suppl. material 1. This list is to be updated in the future and the latest version can be found on the working group's web page. For the history of mycological research in northern West Siberia, please refer to Filippova et al. (2020).

Project description
Title: The data mobilisation working group of the Siberian Mycological Society.

Personnel:
The working group of about 30 mycologists from eight organisations dedicated to the fungal literature-based records mobilisation initiative was created as part of the Siberian Mycological Society (informal organisation since 2019).

Sampling methods
Study extent: The project was aimed at the mobilisation of species records accumulated in the course of previous mycological studies and published in peer-reviewed scientific literature from the beginning of research to date (Filippova et al. 2021). The geographic scope covered the southern part of West Siberia, within the administrative borders of ten regions. About 900 publications were compiled in a bibliography and a digital library, and the species occurrence records were extracted from about 90 selected works by the time of paper submission. The initiative will be continued in the following years to extract the records from all published sources.
Step description: The following protocol was used to standardise and improve the mobilisation workflow:
1. The bibliography was compiled using the Zotero bibliographic manager. Only published works (peer-reviewed papers, conference proceedings, PhD theses, monographs or book chapters) were selected. If possible, the sources were scanned and added to the library as PDF files.
2. The template of the FuSWS database was made using Google Sheets and simple Microsoft Excel templates. The Darwin Core standard was applied to the database field structure to accommodate the relevant information extracted from the publications. In total, 31 fields (see detailed description in Data resources) were selected to describe the literature-based occurrence data in the needed detail.
3. From the available publications related to the region, only the works with species occurrence reports were selected for databasing. The main source of occurrences was annotated species lists with exact localities of the records. However, other kinds of species citations were also included, provided that they were connected to some geography and could be georeferenced at least to the regional level.
4. Most of the occurrences were georeferenced, either from the coordinates provided in the paper or from the verbatim description of the fieldwork locality. The georeferencing of the verbatim descriptions was made using Yandex or Google map services. Depending on the quality of the georeference provided in the publication, the coordinate uncertainty was estimated as follows: 1) the coordinates of a fruiting structure or a plot provided in the publication give an uncertainty of about 3-30 m; 2) the coordinates of the fieldwork locality provided in the publication give an uncertainty between 500 m and 5 km; 3) a report of the species' presence in a particular region gives the centroid of the area, with an uncertainty radius that includes its borders.
5. The locality names were preserved in the «verbatimLocality» field for accuracy.
6. When possible, the «eventDate» was extracted from the annotation data. Whenever this information was absent, the date of the publication was used instead, with a remark in the «verbatimEventDate» field about the origin of the date.
7. The ecological features, habitat or relief were written in the «habitat» field and preserved in Russian.
8. The substrate is an important feature of fungal occurrences and was recorded in the «fieldNotes» field.
9. Other annotation records, including the abundance, fruiting season and others, were accommodated in the «occurrenceRemarks» field.
10. The original scientific names reported in publications were entered in the «verbatimScientificName» field and preserved in the original database. This field was used to create the «scientificName» field after correction of spelling errors using the GBIF Species Matching tool. This tool was also used to create the additional fields of the taxonomic hierarchy from species to kingdom, to fill in the «taxonRank» field and to synonymise the names according to the GBIF Backbone Taxonomy.
11. To track the digitisation process, a metadata worksheet was maintained. Each bibliographic record had a series of fields to describe the digitisation process and its results: the total number of extracted occurrence records, a general description of the occurrence quality, the presence of the observation date, details of georeferencing and the name of the person responsible for the digitisation.
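
The georeferencing and date rules of steps 4 and 6 can be illustrated with a minimal Python sketch. It is not the workflow actually used by the working group, which relied on spreadsheets and map services. The function names, the choice of the upper bound of each uncertainty range, the use of a plain vertex average as the region centre, the wording of the date remark and the use of the Darwin Core terms decimalLatitude, decimalLongitude and coordinateUncertaintyInMeters are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

# Upper bounds of the ranges given in step 4 (3-30 m and 500 m-5 km).
# Taking the upper bound is an assumption of this sketch.
UNCERTAINTY_M = {"plot": 30, "locality": 5_000}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def region_point_and_radius(border):
    """Return (lat, lon, radius_m) for a region given as (lat, lon) border vertices.

    The centre is the plain average of the vertices (a rough stand-in for a
    true polygon centroid); the radius reaches the farthest border vertex.
    """
    lat_c = sum(lat for lat, _ in border) / len(border)
    lon_c = sum(lon for _, lon in border) / len(border)
    radius = max(haversine_m(lat_c, lon_c, lat, lon) for lat, lon in border)
    return lat_c, lon_c, radius


def georeference(kind, lat=None, lon=None, border=None):
    """Build coordinate fields for one occurrence; kind is 'plot', 'locality' or 'region'."""
    if kind == "region":
        lat, lon, radius = region_point_and_radius(border)
    else:
        radius = UNCERTAINTY_M[kind]
    return {
        "decimalLatitude": lat,
        "decimalLongitude": lon,
        "coordinateUncertaintyInMeters": round(radius),
    }


def event_date(annotation_date, publication_year):
    """Prefer the date from the annotation; otherwise fall back to the publication year."""
    if annotation_date:
        return {"eventDate": annotation_date, "verbatimEventDate": annotation_date}
    return {
        "eventDate": str(publication_year),
        # Hypothetical remark text; the source only says that a remark is added.
        "verbatimEventDate": "date of publication used; no collection date in source",
    }
```

For example, `georeference("locality", 55.0, 83.0)` assigns an uncertainty of 5,000 m, while `georeference("region", border=vertices)` derives both the point and the radius from the region outline, so that a region-level report is never given a misleadingly precise location.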

Geographic coverage
Description: The dataset is limited by the administrative borders of ten regions (Tyumen Region, Sverdlovsk Region, Chelyabinsk Region, Kurgan Region, Omsk Region, Tomsk Region, Novosibirsk Region, Kemerovo Region, Altai Territory and Republic of Altai).
The region occupies the central to southern part of the West Siberian Plain. The area extends for about 1,500 km from west to east, from the eastern slopes of the Ural Mountains to the Yenisey River, and for about 1,300 km from north to south. The total area is about 1.4 million km².
The area is very diverse in biogeographical terms, including several vegetation zones from steppe to taiga forest and mountain ecosystems. The relief in the central part is mainly a plain, but the south-eastern part of the area is occupied by several mountain systems of Altai, Salair, Kuznetsk Alatau and Gornaya Shoriya. The western part of West Siberia is bordered by the Ural Mountains.
Figure 1. The distribution of the occurrence records from the FuSWS database on a Landsat satellite image of the area. The clustering of points was made within a radius of 100 km; the scale breaks were selected manually after plotting the frequency distribution histogram.
Most administrative divisions were covered by mycological research, but the intensity of the research varies (Fig. 1). Up to 85% of all records in the database currently come from five regions (Novosibirsk Region, 35%; Tomsk Region, 16%; Republic of Altai, 15%; Altai Territory, 11%; Kemerovo Region, 11% of total occurrences). Other regions are less well covered in the database and are the subject of future work.
Notes: About 90 publications for the last century.

Author contributions
All authors participated in the compilation of the bibliography, the extraction of the species occurrences included in the database and the revision of the paper. Nina Filippova initiated the digitisation effort and was responsible for data integration and publishing in GBIF. Sergei Bolshakov carried out most of the data cleaning in the database. Sergei Bolshakov and Dmitry Ageev compiled the bibliography in Zotero. Ilya Filippov prepared the distribution map.