Distribution of vascular plants north of Lake Baikal: a new, open access dataset

Abstract
Background: The area north of Lake Baikal has been poorly studied, and most of the studies conducted in this region focused on mountain ridges or river valleys. The region includes a part of the Baikal-Amur Mainline (BAM), a broad-gauge railway in the centre of Siberia, Russia. The railway is an alternative route to the Trans-Siberian Railway; BAM starts in southern Siberia (Taishet station of Irkutsk Oblast), passes the northern end of Lake Baikal and finishes in the Russian Far East (Sovetskaya Gavan station of Khabarovsky Krai). BAM has four connections with the Trans-Siberian Railway and is the centre of economic development for many regions of Russia. Maya Ivanova and Alexandr Chepurnov summarised the existing floristic information for this region in detailed species distribution maps, which they published in the book “Flora of the western part of developing regions of Baikal-Amur Mainline (BAM)” (1983). Since the publication of this book, very few floristic studies have been carried out in the study region. All available botanical information is still confined to a number of printed papers and books with limited circulation, which are not widely known to the international scientific community.
New information: We have digitised the point distribution maps from the book of Ivanova and Chepurnov and georeferenced all occurrence and sampling localities. The resulting dataset includes 9972 occurrences for 770 vascular plant species and subspecies from the area north of Lake Baikal. The dataset also includes information on the distribution of 43 rare and endangered species with 366 occurrences. In our view, the dataset contributes to global biodiversity data mobilisation by providing plant species distribution data for a remote mountainous area.


Introduction
Lake Baikal and its surrounding terrestrial ecosystems have recently undergone diverse climate change processes (Moore et al. 2009). The surface air temperature has warmed by 1.2°C during the last century; temperature increases have been observed in all seasons, but are greatest in winter and spring (Shimaraev et al. 2002). These changes are reflected in shifts in the phenology of vascular plants in the Barguzinsky Nature Reserve over the last 40 years, with significant advances of spring events and delays of those associated with senescence in autumn (Rosbakh et al. 2021). The purpose of our work is to make available essential baseline data for analysing how the flora and vegetation around the northern end of Lake Baikal (Fig. 1) are responding to global climate change. The northern part of this region includes a section of the Baikal-Amur Mainline (BAM), a broad-gauge railway line through the centre of Siberia, which has led to economic development along its route.
The mountains and river valleys around the northern part of Lake Baikal have been covered by a few botanical studies (Malyshev 1972, Tyulina 1976, Tyulina 1981, Ivanova and Chepurnov 1983). Ivanova and Chepurnov (1983) summarised previous floristic surveys and herbarium collections from the western part of BAM. They listed 1352 species and subspecies from 428 genera and 97 families occurring in the region. Species distribution maps for a larger area were published previously in "Alpine flora of Stanovoye Nagorye Upland" (Malyshev 1972) and "Flora of Central Siberia" (Malyshev and Peshkova 1979a, Malyshev and Peshkova 1979b).
Ivanova and Chepurnov (1983) critically analysed the maps from these monographs. In some cases, herbarium specimens were verified to clarify species localities. New records of vascular species in the study area were added, based on specimens collected by N.S. Vodopyanova, M.M. Ivanova, Yu.N. Petrochenko, A.A. Chepurnov, M.G. Azovsky, V.V. Telyatiev and other researchers who worked at BAM (Table 1). These botanical studies were summarised in a book on the flora of the Baikal Siberian Region and its genesis (Malyshev and Peshkova 1984).
Figure 1. General map of the study area. Information on regional topography, water bodies, floristic regions and protected areas is combined in one map. Floristic regions (according to Ivanova and Chepurnov 1983) are marked by colours.
The eastern part of BAM, the Chara floristic region, has been affected by large-scale human activities: copper mining in the Udokan Range, gold mining in the Olekma-Chara highland and proposals for extensions of BAM.
The development of portable satellite trackers has made incorporating georeference information into collection and observation records common. Our purpose for digitising the maps published in Ivanova & Chepurnov (1983) and freely sharing the resulting species occurrences is to provide the baseline data that will aid all those interested in the BAM's flora and in mapping its changes over time.

General description
Purpose: Digitising the vascular plant species distribution maps covering the western part of the Baikal-Amur Mainline, which were published in Ivanova & Chepurnov (1983). This source contains crucial information on species distributions north of Lake Baikal, which is one of the less studied areas of the Baikal Siberian Region. Other distribution maps currently available for this territory cover a larger area, and many plant species are represented there by only a few occurrences.

Study area description: Baikal Region, Russia
Design description: The project is designed to benefit many different areas of study, such as: plant taxonomy, floristics, vegetation science, plant biology and population ecology, fauna and ecology of insects, ecology and geography of vertebrates.

Study extent:
The study area is situated on the northern edges of three regions of Russia: Irkutsk Oblast, the Republic of Buryatia and Zabaikalsky Krai. Some of the species occurrences in the north-western part of the Lake Baikal area, including the Baikal Range, now fall within the Baikalo-Lensky Nature Reserve. The eastern part of the study area is legally protected in the Barguzinsky and Dzherginsky Nature Reserves, Zabaikalsky National Park and Frolikhinsky Sanctuary. The north-eastern part of Irkutsk Oblast includes the Vitimsky Nature Reserve (Fig. 1).

Sampling description:
In total, 770 maps were scanned from the book. Using the position of Lake Baikal and neighbouring rivers, we defined the projection of the maps (Fig. 2). We used a technique similar to the one employed for our previous dataset describing the distributions of endemic alpine species of northern Asia. All the maps were adjusted to the same size and horizontal position in order to obtain standardised images. Digitisation was performed in QGIS 3.10 with the help of its georeferencing tools. The most accurate projected coordinate system was Asia North Albers Equal Area Conic. The water bodies shapefile was downloaded from an open source (https://vsegei.ru/ru/info/ggk_1000ns/) at a scale of 1:1,000,000. The river drainage shapefile fits the original paper maps very well, but there were problems with the shape of Lake Baikal, especially in its northern part. In such cases, species distribution maps were georeferenced by snapping control points to the destination vector shapefile, which was the contour of Lake Baikal. We used control points (usually 5-8) to link each map to the destination shapefile, which transformed the map according to the spatial reference of the destination features (WGS 1984). Subsequently, species distribution locations were digitised from each map. The coordinates of each location were calculated in the attribute table.
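The control-point step can be illustrated with a minimal sketch. The source does not state which transformation type was chosen in the QGIS georeferencer, so the least-squares affine fit below is an assumption, and the pixel and map coordinates are hypothetical values chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Affine = Tuple[List[float], List[float]]


def solve3(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear; cannot fit an affine map")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - rest) / m[r][r]
    return x


def fit_affine(pixels: Sequence[Point], targets: Sequence[Point]) -> Affine:
    """Least-squares fit of X = a*col + b*row + c and Y = d*col + e*row + f."""
    if len(pixels) != len(targets) or len(pixels) < 3:
        raise ValueError("need at least three matching control points")
    rows = [(col, row, 1.0) for col, row in pixels]
    # Normal equations: (A^T A) p = A^T t, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in (0, 1):
        atb = [sum(r[i] * t[k] for r, t in zip(rows, targets)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params[0], params[1]


def apply_affine(params: Affine, pixel: Point) -> Point:
    """Convert a pixel position on the scanned map to map coordinates."""
    (a, b, c), (d, e, f) = params
    col, row = pixel
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical control points: pixel positions on a scan and their map coordinates.
control_pixels = [(0, 0), (1000, 0), (0, 800), (1000, 800), (500, 400)]
control_coords = [(100.0, 60.0), (110.0, 60.0), (100.0, 52.0), (110.0, 52.0), (105.0, 56.0)]

transform = fit_affine(control_pixels, control_coords)
print(apply_affine(transform, (250, 200)))
```

With five to eight control points the system is overdetermined, so the fit averages out small placement errors in individual points rather than reproducing any single one exactly.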

Quality control:
We performed the final examination of the digitised species distribution maps in QGIS 3.10. For each species, we compared the digitised occurrences with the original map in order to check for missing distribution records. The majority of occurrences (98%) matched the printed maps consistently. The other 187 distribution records were manually adjusted to match their habitats better. These records mostly belong to psammophytes occurring along the shoreline of Lake Baikal, especially in its northern part (Fig. 3). The points denoting species occurrences on the printed maps have a diameter equal to 16 km, and in this process the digitised localities of the psammophyte plants were moved closer to the shoreline. Taking this procedure into account, we estimate the coordinate uncertainty as 20 km for all the species in this study, as a matter of precaution.
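The figures in this paragraph can be checked with a short sketch. The counts, point diameter and uncertainty are taken from the text; the two coordinate pairs are hypothetical and stand for one record before and after a manual shift towards the shoreline. The great-circle distance uses the haversine formula on a spherical Earth, which is an approximation.

```python
import math

TOTAL_OCCURRENCES = 9972        # occurrences in the dataset (from the text)
ADJUSTED_RECORDS = 187          # manually adjusted records (from the text)
POINT_DIAMETER_M = 16_000       # ground diameter of a map symbol (from the text)
STATED_UNCERTAINTY_M = 20_000   # coordinate uncertainty reported for all records


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float,
                radius_m: float = 6_371_000.0) -> float:
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# Share of records that needed no manual adjustment.
matched_share = 100 * (TOTAL_OCCURRENCES - ADJUSTED_RECORDS) / TOTAL_OCCURRENCES

# Half the symbol diameter is the positional error inherent in reading a point.
symbol_radius_m = POINT_DIAMETER_M / 2

# Hypothetical record: digitised position and position after the manual shift.
original = (55.70, 109.40)
adjusted = (55.70, 109.55)
shift_m = haversine_m(*original, *adjusted)

print(f"matched without adjustment: {matched_share:.1f}%")
print(f"symbol radius: {symbol_radius_m:.0f} m")
print(f"shift: {shift_m:.0f} m, within stated uncertainty: {shift_m <= STATED_UNCERTAINTY_M}")
```

The symbol radius of 8 km plus a shift of a few kilometres stays below the stated 20 km, which is consistent with the precautionary value chosen for all records.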

Geographic coverage
Description: The study area covers the western part of BAM, from Ust-Kut Town in the west to the Chara Depression in the east. It is a mountainous region comprising several ranges of the Stanovoy Highlands (Upper Angara, North Muya, South Muya, Kodar, Udokan) and the Baikal and Barguzin Ranges (Fig. 1). The main river of the study area is the Lena River. One of its southern tributaries is the Vitim River, which flows to the Lena from the area northeast of Lake Baikal. The Vitim is fed by the Muya, Mamakan and Mama Rivers from the west and by the Kalar and Kalakan Rivers from the east. The two major rivers on the west side of the region originate in the western part of the Stanovoy Highlands: the Upper Angara River, which flows into the northern end of Lake Baikal, and the Chaya River, which flows into the Lena River.
Georeferencing the distribution map of Scrophularia incisa in QGIS 3.10. The overlay shows the GIS shapefiles (in colour); the background is the original printed map (black and white).
The territory is divided into several floristic regions (Ivanova and Chepurnov 1983) (Fig. 1). We mapped all localities recorded in the original printed maps. Most of these localities lie within the study area, but a few lie outside the digitised floristic regions (Fig. 4) because the original maps also show general distribution data.

Taxonomic coverage
Description: The dataset includes 770 species and subspecies of vascular plants with 9972 occurrences from 81 families and 266 genera. The whole list of the flora of this region includes 1352 species and subspecies. The dataset therefore covers more than half of the flora (57%); distribution maps were provided for the most common species only. In reporting the data, we retained the family attributions used in the source to facilitate comparisons. The top 10 families include 58.9% of the taxa and 56.9% of the occurrences (Table 2). In the original floristic analysis, Scrophulariaceae appeared in the top 10 families (Ivanova and Chepurnov 1983), but it is replaced by Apiaceae in our dataset. Scrophulariaceae in the current circumscription is represented in the dataset by only one species, Scrophularia incisa (Fig. 3). Comparisons of percentages for all other families reveal further similarities between the complete floristic checklist and the species included in the dataset (Table 2). These comparisons indicate that the dataset is representative of a part of the flora that includes the families with high numbers of species. The digitised data can also be used for studies of the distribution patterns of key vascular plant species in the study region. Our comparisons revealed that the list of top 10 genera was the same in the book and the dataset (Table 3). Carex and Salix are the leading genera in both lists. Potentilla and Artemisia, which come next in the floristic list, rank lower in the dataset because no distribution maps were provided for their widely distributed species. The other genera hold positions similar to those in the whole floristic checklist of the region. The dataset contains information on the distribution of vascular plant species that are included in the regional Red Data Books of the Baikal Siberian Region (Pronin 2013, Polyakov 2017, Trofimova 2020) (Table 4).
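The coverage figure quoted above follows directly from the counts given in the text, as this short arithmetic sketch shows. All numbers are taken from the description; only the variable names are introduced here for illustration.

```python
DATASET_TAXA = 770          # species and subspecies with digitised maps
CHECKLIST_TAXA = 1352       # species and subspecies in the full floristic list
DATASET_OCCURRENCES = 9972  # occurrences in the dataset

# Share of the regional flora represented in the dataset.
coverage_percent = 100 * DATASET_TAXA / CHECKLIST_TAXA

# Average number of occurrences per mapped taxon.
mean_occurrences_per_taxon = DATASET_OCCURRENCES / DATASET_TAXA

print(f"flora covered: {coverage_percent:.0f}%")
print(f"mean occurrences per taxon: {mean_occurrences_per_taxon:.1f}")
```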
These data are complementary to the recently published dataset with occurrences of rare and endangered species of Transbaikalia and will be helpful in planning and implementing future conservation activities.

Temporal coverage
Notes: Dates of the specimen records used to prepare the printed maps ranged from 1912 to 1979 (Table 1).
Table 4. The list of vascular plant species included in regional Red Data Books of the Baikal Siberian Region.
Floristic information for this region, with point distribution maps of vascular plant species, is summarised in the book by M.M. Ivanova and A.A. Chepurnov, "Flora of the western part of developing regions of Baikal-Amur Mainline (BAM)" (Ivanova and Chepurnov 1983). All available maps from this book have been digitised and the occurrences of vascular plants were organised in a dedicated dataset. The dataset includes 9972 occurrences for 770 vascular plant species and subspecies occurring around the northern part of Lake Baikal (the western part of the Baikal-Amur Mainline), which is a hard-to-access mountainous region.

order: The full scientific name of the order in which the taxon is classified.
family: The full scientific name of the family in which the taxon is classified.
decimalLatitude: The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location.
decimalLongitude: The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location.
georeferencedBy: A list of persons who determined the georeference (spatial representation) for the Location.
geodeticDatum: The ellipsoid, geodetic datum or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based.
eventDate: The date-time or interval during which an Event occurred. This is the publication date of the book by M.M. Ivanova and A.A. Chepurnov (1983) "Flora of the western part of developing regions of Baikal-Amur Mainline (BAM)".
coordinateUncertaintyInMeters: The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location.
taxonRemarks: Comments or notes about the taxon or name. Usually contains notes about definition of the taxon "sensu lato" or "sensu stricto" as recorded in the book by M.M. Ivanova and A.A. Chepurnov (1983) "Flora of the western part of developing regions of Baikal-Amur Mainline (BAM)".
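To show how the columns described above fit together, here is a minimal sketch of one occurrence record and a basic consistency check. The record values are hypothetical (only the taxon, the 1983 publication year and the 20 km uncertainty come from the text), the field list covers only the columns described here, and the uncertainty column uses the Darwin Core spelling coordinateUncertaintyInMeters. The `validate_record` helper is an illustrative name, not part of any library.

```python
from typing import Dict, List

# Columns described in the text; taxonRemarks may legitimately be empty.
REQUIRED_FIELDS = (
    "order", "family", "decimalLatitude", "decimalLongitude",
    "georeferencedBy", "geodeticDatum", "eventDate",
    "coordinateUncertaintyInMeters",
)


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one occurrence record (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    try:
        lat = float(record.get("decimalLatitude", ""))
        lon = float(record.get("decimalLongitude", ""))
    except ValueError:
        problems.append("coordinates are not numeric")
        return problems
    if not -90.0 <= lat <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= lon <= 180.0:
        problems.append("longitude out of range")
    try:
        uncertainty = float(record.get("coordinateUncertaintyInMeters", ""))
    except ValueError:
        problems.append("uncertainty is not numeric")
        return problems
    if uncertainty <= 0:
        problems.append("uncertainty must be positive")
    return problems


# Hypothetical record; coordinates and georeferencer name are placeholders.
example = {
    "order": "Lamiales",
    "family": "Scrophulariaceae",
    "decimalLatitude": "55.70",
    "decimalLongitude": "109.55",
    "georeferencedBy": "A. Example",
    "geodeticDatum": "WGS84",
    "eventDate": "1983",
    "coordinateUncertaintyInMeters": "20000",
    "taxonRemarks": "",
}

print(validate_record(example))
```

Values are kept as strings because that is how they arrive when the dataset is read from a delimited text file; the check converts them only where a numeric range matters.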