Vascular plants from European Russia in the CSBG SB RAS Digital Herbarium

Abstract
Background
The Central Siberian Botanical Garden of the Siberian Branch of the Russian Academy of Sciences (CSBG SB RAS) is the largest botanical institution in the Asian part of Russia. Founded in 1946, CSBG SB RAS is historically a consortium of two herbarium collections with their own acronyms (NS and NSK) and registration in the Index Herbariorum (Thiers 2020). At present, the NS+NSK collections contain about 800,000 herbarium specimens comprising vascular plants (680,000), mosses (25,000), lichens (80,000) and fungi (15,000), gathered not only in Siberia but also in the European part of Russia and other parts of the Eurasian and American continents. CSBG SB RAS has the third largest collection in Russia after the Komarov Botanical Institute of RAS (LE) and Moscow State University (MW) collections. The dataset consists of 5,384 records of digitised herbarium specimens of vascular plants belonging to 111 families, collected since the 19th century in 54 administrative regions of the European part of Russia and kept in the NS+NSK collections. Herbarium specimens were digitised on two ObjectScan 1600 herbarium scanners according to international standards, at 600 dpi, with a barcode, 24-colour scale and spatial scale bar, and placed into the CSBG SB RAS Digital Herbarium. For each specimen, the species name, locality, collection date, collector, ecology and revision label are recorded. More than 94% of the records have coordinates that fall within the area of European Russia, west of the Ural Mountains.
New information
A total of 5,384 records of vascular plant occurrences were entered, 94.8% of them with geolocations within European Russia, west of the Ural Mountains.


Introduction
Free and open access to biodiversity data is essential for informed decision-making to achieve conservation of biodiversity and sustainable development (Penev 2011, Penev et al. 2017). Preserved specimen collections are the most important source of scientific information about the distribution of specimens in the past and present, which allows simulation of the dynamics of objects in the future. Only a herbarium specimen reliably confirms the presence of a plant at a specific place and time. Herbarium collections and the data they hold are valuable not only for the traditional studies of taxonomy and systematics, but also for ecology, bioengineering, conservation, food security and the human social and cultural elements of scientific collection (Baird 2010, James et al. 2018). The value and universality of herbarium specimens are recognised in most countries, where national and large regional herbaria are actively developing and improving (Costello et al. 2013, Cranston et al. 2014, Kovtonyuk 2017, Pearse et al. 2017). Digitisation of, and open access to, the collections have become a common trend in biodiversity collections management, the latest stage in improving the inventory and modernisation of herbarium collections of the leading botanical institutions in the world (Heberling et al. 2019, Le Bras et al. 2017, Seregin 2020, Kovtonyuk et al. 2019a). With the digitisation of natural history collections over the last decades, their traditional roles for taxonomic studies and public education have been greatly expanded into the fields of biodiversity assessments, climate change impact studies, trait analyses, sequencing, 3D object analyses etc. (Nelson and Ellis 2018, Raes et al. 2019, Watanabe 2019). Herbarium specimens represent snapshots of phenological events and have been reliably used to characterise phenological responses to climate (Willis et al. 2017).
The CSBG SB RAS was founded in 1946 and is currently the largest botanical institution in the Asian part of Russia. The first herbarium collection at the CSBG SB RAS was organised in 1944 on the basis of herbarium sheets transferred from the Medical and Biological Institute (Novosibirsk); it is now the collection named after I.M. Krasnoborov (NS). The NSK collection, named after M.G. Popov, was transferred from Irkutsk in 1978. Historically, the herbarium is a consortium of these two collections, each with its own acronym (NS and NSK) and registration in the Index Herbariorum.
Digitisation of vascular plants at 600 dpi was initiated in 2014 using the herbarium scanner Herbscan (JSTOR 2020), starting with the type specimens of M.G. Popov's Herbarium (Kovtonyuk 2015). At the end of 2017, the research group "Unique scientific unit - Herbarium of higher plants, lichens and fungi (NS, NSK)", with the short name "USU-Herbarium", was organised at CSBG SB RAS for herbarium digitisation and herbarium management. The goal of the research group is to provide open access to the digitised collections of CSBG SB RAS as a worldwide data resource for the study of biodiversity. The digitisation of herbarium specimens started in 2018 using two ObjectScan 1600 herbarium scanners (Microtek 2020) according to international standards. For each specimen, the species name, locality, collection date, collector and ecology were digitised and verified (Kovtonyuk et al. 2019b). To date, more than 47,000 herbarium specimens have been digitised, verified and placed into the CSBG SB RAS Digital Herbarium (http://herb.csbg.nsc.ru:8081).
This data paper describes the herbarium specimens digitised in 2020 under the initiative "Call for data papers from European Russia". The specimens were digitised and geolocated, and their taxonomic status was revised. The digitisation of the herbarium will continue and the dataset will be updated in the future.

General description
Purpose: The purpose of this paper is to describe a dataset published in GBIF (Kovtonyuk et al. 2020) in the format of a peer-reviewed journal paper and to provide recognition for the effort by means of a scholarly article (Penev 2011, Penev et al. 2017).

Table 2. Most active collectors from European Russia in the dataset.

Sampling description: In both the NS and NSK collections, European Russia is not separated as a single section and does not have a separate catalogue. The digitisation of the herbarium specimens from European Russia in NS and NSK started under the "Call for data papers from European Russia". First, 4,139 herbarium specimens from the NS collection that fell within the target region were digitised. Another 1,245 specimens were mounted, barcoded and digitised, then accessioned in NSK and included in the collection's database. The total number of herbarium specimens from European Russia in CSBG SB RAS is estimated at between 10,000 and 15,000, and their digitisation will continue in the future. Quality control was carried out by staff of the USU-Herbarium group during the verification of the digitised samples. Label metadata were entered into an OpenOffice Calc spreadsheet and then converted into a table following the Darwin Core Standard.
1. Mounting of dry plant material on to a herbarium sheet;
2. Reviewing the identification and nomenclature by a specialist;
3. Barcoding the specimen: printing a barcode on the thermal printer and mounting it on the herbarium sheet;
4. Placing the herbarium sheet, 24-colour scale and scale bar on the scanner platform and capturing the image;
5. Generating metadata, OCR of the label by ScanWizard Botany and verification of the label text by experts;
6. Georeferencing using Google Maps, Yandex Maps and other open maps.
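
The conversion of verified label metadata into a Darwin Core table can be sketched as follows. This is only a minimal illustration, not the workflow actually used by the USU-Herbarium group: the spreadsheet column names on the left of `FIELD_MAP` and the helper names `to_darwin_core` and `write_dwc_csv` are hypothetical, and all example values are invented. Only the target field names (`catalogNumber`, `scientificName`, `locality`, `eventDate`, `recordedBy`, `habitat`, `basisOfRecord`, `collectionCode`) are actual Darwin Core terms.

```python
import csv
import io

# Hypothetical spreadsheet columns (keys) mapped to Darwin Core terms (values).
FIELD_MAP = {
    "barcode": "catalogNumber",
    "species_name": "scientificName",
    "locality": "locality",
    "collection_date": "eventDate",
    "collector": "recordedBy",
    "ecology": "habitat",
}


def to_darwin_core(row, collection_code):
    """Rename label fields to Darwin Core terms; missing fields become empty strings."""
    record = {dwc: (row.get(src) or "").strip() for src, dwc in FIELD_MAP.items()}
    record["basisOfRecord"] = "PreservedSpecimen"  # herbarium sheets are preserved specimens
    record["collectionCode"] = collection_code     # e.g. "NS" or "NSK"
    return record


def write_dwc_csv(records):
    """Serialise Darwin Core records to CSV text with a fixed column order."""
    fieldnames = list(FIELD_MAP.values()) + ["basisOfRecord", "collectionCode"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Invented example row, for illustration only.
example_row = {
    "barcode": "EXAMPLE0001",
    "species_name": "Betula pendula Roth",
    "locality": "Example locality",
    "collection_date": "1950-06-15",
    "collector": "A. Collector",
    "ecology": "forest edge",
}

if __name__ == "__main__":
    print(write_dwc_csv([to_darwin_core(example_row, "NS")]))
```

Keeping the mapping in a single dictionary makes it easy for an expert to review which label field feeds which Darwin Core term before the table is published.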

Taxonomic coverage
Description: The taxonomic coverage of the dataset includes 111 families from 41 orders and 6 classes of vascular plants, following GBIF Backbone Taxonomy (GBIF Secretariat).

Table 3.

Data resources
geodeticDatum: The ellipsoid, geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based.
coordinateUncertaintyInMeters: The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location.
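
A basic consistency check on these georeference fields might look like the sketch below. It assumes each record is a dictionary keyed by the Darwin Core terms `decimalLatitude`, `decimalLongitude`, `geodeticDatum` and `coordinateUncertaintyInMeters` (the Darwin Core spelling of the uncertainty term). The function name `check_georeference` and the example values are hypothetical, and this is not the quality-control procedure described in the paper.

```python
def check_georeference(record):
    """Return a list of problems found in a record's georeference fields."""
    try:
        lat = float(record["decimalLatitude"])
        lon = float(record["decimalLongitude"])
    except (KeyError, TypeError, ValueError):
        return ["missing or non-numeric coordinates"]

    problems = []
    if not -90.0 <= lat <= 90.0:
        problems.append("decimalLatitude out of range")
    if not -180.0 <= lon <= 180.0:
        problems.append("decimalLongitude out of range")
    if not record.get("geodeticDatum"):
        # Without a datum the coordinates cannot be interpreted unambiguously.
        problems.append("geodeticDatum not stated")

    uncertainty = record.get("coordinateUncertaintyInMeters")
    if uncertainty in (None, ""):
        problems.append("coordinateUncertaintyInMeters not stated")
    else:
        try:
            if float(uncertainty) <= 0:
                problems.append("coordinateUncertaintyInMeters must be positive")
        except (TypeError, ValueError):
            problems.append("coordinateUncertaintyInMeters is not numeric")
    return problems


# Invented example record, for illustration only.
example = {
    "decimalLatitude": "55.75",
    "decimalLongitude": "37.62",
    "geodeticDatum": "WGS84",
    "coordinateUncertaintyInMeters": "1000",
}

if __name__ == "__main__":
    print(check_georeference(example))
```

Returning a list of problems rather than raising on the first one lets a reviewer see every issue in a record in a single pass.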