Flora of Vladimir Oblast, Russia: an updated grid dataset (1867–2020)

Abstract

Background

The dataset covers wild tracheophytes (native species, naturalised aliens and casuals) of Vladimir Oblast, Russia. It includes only one occurrence per species per grid square, so earlier records that were recently confirmed are not duplicated. Georeferences are based on the WGS84 grid scheme with 342 squares (5′ lat. × 10′ long.), with areas ranging from 94.7 km² in the northernmost part to 98.2 km² on the southern boundary. Each occurrence is linked to the corresponding grid square centroid; therefore, actual coordinates, habitat details and voucher information are unavailable. In late 2011, the earlier version of the dataset was used for the production of grid maps in the standard "Flora of Vladimir Oblast: checklist and atlas". Additional records, obtained during field excursions of 2012 and 2013, were fully included in the "Flora of Vladimir Oblast: grid data analysis". The stable version of the dataset with 123,054 grid records (as of 1867–2013) was published in GBIF in 2017.

New information

Data obtained in the field during 2014–2020, as well as those extracted from recently published sources, were digitised, structured and finally published in GBIF in April 2021. The last update added 7,000 new grid records. Currently, "Flora of Vladimir Oblast, Russia: an updated grid dataset (1867–2020)" contains 130,054 unique occurrences of 1,465 vascular plant taxa (species, hybrids, species aggregates) from Vladimir Oblast and tiny parts of the adjacent areas. The average number of grid records per square has grown over the seven years from 363 to 380 species. The grid occurrences are largely based on the field studies by the author, performed during 1999–2020 (121,737 records), as well as on data extracted from the relevant literature, unpublished sources, herbarium collections and citizen science projects (8,317 records). The taxonomic backbone of the occurrence grid dataset follows the accompanying checklist dataset to ensure correct cross-linking of the names.
As of April 2021, the dataset on the Vladimir Oblast flora represents the fourth largest dataset on vascular plants of Russia published in GBIF.


Introduction
Since 1999, the author has been working on the grid mapping of the Vladimir Oblast flora. The region covers an area of 29,074 km². The oblast was divided into 342 grid squares measuring 5′ lat. × 10′ long. or ca. 9.2 × 10.4 km. Thus, the area of the grid cells slightly increases southwards from 94.7 to 98.2 km² (Fig. 1). Cyrillic letters were used to designate 21 rows from north to south, while numbers were used to indicate the squares within the rows from west to east. The northern border of the northernmost row А follows 56°50′N, while the southern border of the southernmost row Х follows 55°05′N; the western border of the squares Г1 and Д0 follows 38°10′E, while the eastern border of the square З28 follows 43°00′E. The grid is available as a supplementary *.kml file (Suppl. material 1) with a copy on Zenodo (https://doi.org/10.5281/zenodo.4724913). The grid is visualised on Google Maps at https://maps.google.com/maps/ms?msid=200284766630468455543.000462414ec0fd70a9c6f&msa=0

Every year, data obtained by the author in the field were imported into the distribution database on the Vladimir Oblast flora (MS Excel spreadsheet). The earlier version of the database, supplemented by all available records from the literature and herbarium collections, was used to produce maps for the standard "Flora of Vladimir Oblast: checklist and atlas" (Seregin 2012). At the time of map production for the flora in November 2011, the database contained 118,231 records. In 2012-2013, the author continued the grid mapping of the Vladimir Oblast flora. By the end of 2013, the regional flora included 1,399 species of vascular plants. The stable version of the dataset with 123,054 grid records (as of 1867-2013) was published in GBIF in November 2017 (Seregin 2021b).

In line with the call by GBIF for data papers describing datasets from Russia, we completely revised the dataset and made the following improvements and amendments:

1. Field data obtained by the author during 2014-2020 and new data published recently in various references were fully integrated into the dataset. New field data were obtained by the author during 77 standard one-day grid square surveys, as well as dozens of occasional field excursions focused on specific plant habitats, communities or species.

2. This update added 7,000 new grid records into the dataset, including records of 26 new species. For at least 11,190 grid records, the date of the last record was updated to show current presence of the species.

3. Three new grid squares were added on the fringes of Vladimir Oblast. The average number of grid records per square increased within seven years from 363 to 380 species (Table 1).

4. The taxonomic backbone of this occurrence dataset follows the accompanying checklist, which is available in GBIF as a checklist dataset (Seregin 2021c), to ensure correct cross-linking of the names.

5. An aggregation of the records by standard grid square surveys was performed using the "eventID" field of Darwin Core.

We amended the dataset on 29 Apr 2021 after a thorough data audit, performed by Dr Robert Mesibov (https://www.datafix.com.au) in line with preparation of the data paper.
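The aggregation of records by the "eventID" field mentioned above can be illustrated with a minimal sketch. It assumes that each record is a dictionary keyed by Darwin Core terms and that records taken from literature or collections may carry an empty eventID; the event identifiers and the toy records are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def taxa_per_event(records):
    """Count distinct taxa recorded under each eventID."""
    taxa = defaultdict(set)
    for rec in records:
        event_id = rec.get("eventID")
        if event_id:  # assumption: non-survey records may have no eventID
            taxa[event_id].add(rec["scientificName"])
    return {event_id: len(names) for event_id, names in taxa.items()}


# Hypothetical toy records; the identifiers are invented for illustration.
records = [
    {"eventID": "survey-001", "scientificName": "Carex elongata L."},
    {"eventID": "survey-001", "scientificName": "Oenothera biennis L."},
    {"eventID": "survey-002", "scientificName": "Carex elongata L."},
    {"eventID": "", "scientificName": "Jacobaea vulgaris Gaertn."},
]
print(taxa_per_event(records))
```

Because the dataset keeps one occurrence per species per grid square, the number of records under one survey event equals the number of taxa credited to that survey.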
As of 19 April 2021, the Vladimir Oblast occurrence dataset on the flora is the seventh largest dataset on the biodiversity of Russia published in GBIF (Table 2) and the fourth largest for vascular plants after Ueda (2021), Seregin (2021a) and Artemov and Egorova (2021). Amongst the datasets published by Russian institutions, this occurrence dataset on the flora is the fourth largest dataset available in GBIF.

The growth of the dataset during 2011-2020. The earlier version of the dataset with 118,231 grid records (as of late 2011) was used for the map production in the standard flora (Seregin 2012).

[...] 16-18 April (Strizhev 1973). However, in the last decades, spring phenological events have been shown to begin earlier as compared to the long-term average values. For instance, T. farfara now starts blooming 21 days earlier than a century ago in the City of Kirov (Soloviev 2007).

Vegetation and floristic divisions:
Vladimir Oblast is situated in the ecotone zone between boreal coniferous and temperate broadleaf (hardwood) forests. Distribution of the forest types within the region is clearly determined by the soil conditions. Both boreal coniferous forests dominated by Pinus sylvestris L. and Picea abies (L.) H. Karst. on various nutrient-poor substrata and temperate broadleaf forests with Quercus robur L., Tilia cordata L. and Ulmus glabra Huds. on loamy eutrophic soils are the main components of the original (pre-man) vegetation.
Other native plant communities of Vladimir Oblast are peat bogs, xeric meadows on steep slopes and alder stands along smaller streams, as well as meadows, marshes and willow thickets on flood plains. Currently, 29.9% of land is used for agriculture, while 55% is covered by forests (official data).
Gorokhovets Ridge (no counts due to small area of the division and low number of corresponding grid squares).

Sampling methods
Study extent: The dataset combines two types of records, namely, field records by the author and data from other sources. The field records collected by the author (121,737 occurrences) were obtained during 594 standard grid surveys. Typically, two surveys were performed in each grid square: (1) a summer survey (between June and September) and (2) an additional spring survey (late April to May). The numbers of grid records, obtained during the most comprehensive one-day standard grid surveys, are given on the map (Fig. 3).
Data extracted from the relevant literature, unpublished sources, herbarium collections and citizen science projects are not numerous (8,317 records), since the dataset comprises only the latest record per grid square for each species. A short historical overview of the most important sources was published in Russian in Seregin (2012). Additionally, we integrated data from the citizen science project "Flora of Vladimir Oblast" (https://www.inaturalist.org/projects/vladimir-oblast-flora), initiated by the author on iNaturalist as part of the "Flora of Russia" initiative (Seregin et al. 2020). Surprisingly, the number of new grid records from the community was fairly modest. Only 959 occurrences out of 19,239 (as of 29 March 2021) were identified as new grid records, whereas another 200 occurrences accounted for recent confirmations of historical records.
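The rule of keeping only the latest record per grid square for each species, and the split of incoming observations into new grid records and confirmations, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the author's actual workflow: the dataset is modelled as a dictionary from (grid square, taxon) to the year of the latest record, undated records are stored as None, and any later-dated repeat is counted as a confirmation, which is broader than the "historical records" meant in the text. The years in the toy data are hypothetical.

```python
def merge_observations(dataset, observations):
    """Fold observations into a {(grid, taxon): year} mapping in place.

    Returns (new_records, confirmations), where a confirmation is an
    existing grid record whose latest year was moved forward.
    """
    new_records = confirmations = 0
    for grid, taxon, year in observations:
        key = (grid, taxon)
        if key not in dataset:
            dataset[key] = year
            new_records += 1
        elif dataset[key] is None or year > dataset[key]:
            dataset[key] = year
            confirmations += 1
        # Otherwise the observation is older than the stored record: skipped.
    return new_records, confirmations


# Hypothetical toy data; grid labels follow the row-letter plus number scheme.
dataset = {
    ("Г1", "Carex elongata L."): 1998,
    ("Д0", "Oenothera biennis L."): None,  # undated record
}
observations = [
    ("Г1", "Carex elongata L.", 2020),     # confirmation
    ("З28", "Carex elongata L.", 2019),    # new grid record
    ("Д0", "Oenothera biennis L.", 2020),  # dates an undated record
    ("Г1", "Carex elongata L.", 2015),     # older than stored year, ignored
]
result = merge_observations(dataset, observations)
print(result, len(dataset))
```

Under this rule the dataset grows only by new (grid square, taxon) pairs, which is why a large pool of community observations yields comparatively few new grid records.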
Sampling description: A standard one-day survey began with the preparation of the route using satellite images. The route was designed to link known localities of rare species and areas of potential interest. Route planning helped to avoid delays and fruitless searches. Plants that are difficult to identify in the field were collected for further examination as herbarium specimens. Previously known localities of rare species were revisited.
Numbers of grid records obtained during the most comprehensive one-day standard grid surveys (equalling the number of taxa).
Usually, a floristic survey of a grid square took one day (6-9 h, sometimes up to 12 h). The track was continuously recorded using GPS in the field. Before 2018, the author used a printed spreadsheet in a field notebook with a list of the most common plants, which comprised about half of the regional flora (Fig. 4). Rarer plants were added at the end of the list, whereas species not identified with certainty and those of special interest were collected. In 2019 and 2020, field documentation of the flora was performed using a smartphone in line with the "Flora of Russia" initiative (Seregin et al. 2020).
Quality control: During field surveys, we kept a record of the 680 most widely distributed species on printed spreadsheets to avoid omissions of common species. Nonetheless, a map of omissions of the top-100 most recorded species (Fig. 5) suggests that some grid squares were likely under-surveyed. These include some grid squares on the fringes of Vladimir Oblast (i.e. on the borders of the Region), as well as a few poorly sampled grid squares across the area. A group of red squares in the north-eastern corner corresponds to the Balakhna Lowland (Fig. 2) with unfavourable conditions of nutrient-poor acid habitats, such as extremely dry pine forests on alluvial sands.
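The idea behind the omission map can be sketched in a few lines: rank taxa by the number of grid squares in which they are recorded, take the top N, and count for each square how many of those taxa are missing. The sketch below is an assumption about how such a map could be derived, not the procedure actually used for Fig. 5; the function name, the square labels and the toy records are hypothetical.

```python
from collections import Counter


def omissions_per_square(records, top_n=100):
    """Count, per grid square, how many of the top_n most recorded taxa are absent."""
    taxa_by_square = {}
    for grid, taxon in records:
        taxa_by_square.setdefault(grid, set()).add(taxon)
    # Frequency of a taxon = number of grid squares in which it is recorded.
    frequency = Counter(t for taxa in taxa_by_square.values() for t in taxa)
    top_taxa = {taxon for taxon, _ in frequency.most_common(top_n)}
    return {grid: len(top_taxa - taxa) for grid, taxa in taxa_by_square.items()}


# Hypothetical toy records: (grid square, taxon).
records = [
    ("sq1", "Carex elongata L."), ("sq2", "Carex elongata L."), ("sq3", "Carex elongata L."),
    ("sq1", "Oenothera biennis L."), ("sq2", "Oenothera biennis L."),
    ("sq1", "Jacobaea vulgaris Gaertn."),
]
print(omissions_per_square(records, top_n=2))
```

Squares with many omissions of otherwise ubiquitous species are candidates for under-surveying, although poor habitats (as in the Balakhna Lowland) can produce the same signal.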

Geographic coverage
Description: Vladimir Oblast, Russia, within its administrative borders, together with some records from the adjacent parts of grid squares that lie only partly within the oblast. Over 21 years, the area was evenly sampled; thus, the number of recorded species across grid squares gives a good overview of natural patterns rather than of sampling effort (Fig. 6). Spatial data on the vascular plant flora of Vladimir Oblast were published earlier in the form of 1,370 species distribution maps (Seregin 2012).
The second book of the series included an analytical part of the survey. A quantitative spatial assessment at various scales, an overview of distributional patterns for common and rare species and spatial analysis of grid distributions led to recognition of the regional chorotypes (i.e. distributional species groups within the Region) and confirmed the presence of ten floristic divisions (Fig. 2).
Coordinates: 55 and 57 Latitude; 38 and 43 Longitude.
Number of records per grid (equalling the number of taxa).
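Since each grid record stands for one taxon in one square, the average number of taxa per square is simply the number of records divided by the number of squares. The short check below reproduces the averages of 363 and 380 species reported for the two versions of the dataset. The figure of 339 squares for the 2013 version is derived here from the statement that three new squares were added to reach 342; it is an inference, not a number given in the text.

```python
def mean_records_per_square(total_records, squares):
    """Average number of grid records (taxa) per grid square."""
    return total_records / squares


# 2013 version: 123,054 records; 342 - 3 = 339 squares (derived, see lead-in).
before = mean_records_per_square(123_054, 342 - 3)
# 2020 version: 130,054 records in 342 squares.
after = mean_records_per_square(130_054, 342)
print(round(before), round(after))  # prints: 363 380
```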

Taxonomic coverage
Description: A total of 1,465 vascular plant taxa: largely species, but also hybrids, microspecies, undivided genera and some uncertain species.

The year of observation is clearly indicated in 113,578 grid records (87.3%). Undated records resulted from digitisation of old references and specimen records, as well as from earlier surveys during which an interval instead of a specific date was indicated. As we include only the latest grid records for each species, the number of undated records is steadily decreasing. Nearly all dated records (i.e. 112,992) were made during 2000-2020. In 2009, 21,220 grid records were added into the dataset (Fig. 7).

Table 5. Growth in the number of grid records during the last three years (2017 vs. 2020) across Vladimir Oblast.
Presumed causes of the data growth include true expansion of alien species across the region; earlier under-recording of species from some habitats (such as alder forests, nutrient-poor meadows, flood plains etc.); the life cycle of some orchids, which can be abundant in one year and completely invisible in another; or the short life cycle of spring plants.
Carex elongata L. (Cyperaceae) is used here as an example of a previously under-recorded species to show the recent progress in data collection (Fig. 8). This species was reported from 79 grid squares in the standard flora (Fig. 8a) (Seregin 2012). In Vladimir Oblast, C. elongata is a typical plant of Alnetea glutinosae communities (alder forests), which are extremely inhospitable for a researcher during the spring and summer seasons due to mosquitoes and boggy ground. Therefore, the data on this species were far from complete. Further focused surveying of this habitat during the last decade and expertise in identification of this sedge without fruits helped us to double the number of the known records published in this dataset (Fig. 8f).
By the end of 2017, many biased maps of species grid distributions had been updated as a result of extensive field surveys. Accordingly, the data collected during the last three years (2018 to 2020) clearly indicate further expansion of invasive or potentially invasive species (Seregin 2010, Seregin 2015). For instance, Erigeron septentrionalis (Fernald et Wiegand) Holub, Epilobium tetragonum L. agg., Oenothera biennis L., Anisantha tectorum (L.) Nevski and Jacobaea vulgaris Gaertn. are the most rapidly expanding aliens in the last three years (Table 5). Surprisingly, a steady growth of the grid records for common orchids like Platanthera bifolia (L.) Rich. and Dactylorhiza fuchsii (Druce) Soó is noticeable as well.
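A comparison of two versions of the dataset, of the kind summarised in Table 5, can be sketched as a per-taxon difference in the number of occupied grid squares. This is a minimal illustration that assumes each version is available as a collection of (grid square, taxon) pairs; the function name and the toy data are hypothetical and do not reproduce the values of Table 5.

```python
from collections import Counter


def grid_count_growth(old_records, new_records):
    """Return (taxon, increase in occupied grid squares) pairs, largest increase first."""
    old_counts = Counter(taxon for _, taxon in set(old_records))
    new_counts = Counter(taxon for _, taxon in set(new_records))
    growth = {taxon: new_counts[taxon] - old_counts[taxon] for taxon in new_counts}
    # Sort by descending growth, then alphabetically for a stable order.
    return sorted(growth.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy versions of the dataset: (grid square, taxon).
version_2017 = [("sq1", "Oenothera biennis L."), ("sq1", "Carex elongata L.")]
version_2020 = version_2017 + [
    ("sq2", "Oenothera biennis L."),
    ("sq3", "Oenothera biennis L."),
    ("sq2", "Carex elongata L."),
]
print(grid_count_growth(version_2017, version_2020))
```

Such a ranking does not by itself separate true range expansion from earlier under-recording; that distinction still relies on knowledge of the species and habitats.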

Usage licence
Usage licence: Other

IP rights notes: This work is licensed under a Creative Commons Attribution (CC-BY) 4.0 License.
Year of the latest grid records within the dataset (Seregin 2021b).
Alternative identifiers: 7afb26e9-aad6-47cb-a5bf-de49dc7597a4, https://depo.msu.ru/ipt/resource?r=vladimir

The grid occurrences are largely based on the field studies by the author performed in 1999-2020 (121,737 records), as well as on data extracted from relevant literature, manuscripts, herbarium collections and citizen science projects (8,317 records). An aggregation of the grid records by 342 grid squares was performed using the "eventID" field of Darwin Core. The taxonomic backbone of the occurrence grid dataset follows the accompanying checklist, which is available in GBIF as a checklist dataset (Seregin 2021c), to ensure smooth cross-linking of the names.
As of April 2021, "Flora of Vladimir Oblast, Russia: an updated grid dataset (1867-2020)" is the fourth largest dataset on vascular plants of Russia published via GBIF.

basisOfRecord: The specific nature of the data record, a subtype of the dcterms:type. A variable (three terms before translation: "Literature", "PreservedSpecimen", "HumanObservation"). "Literature" was translated as "HumanObservation" following the Darwin Core Type Vocabulary.
informationWithheld: Additional information that exists, but that has not been shared in the given record.
continent: The name of the continent in which the location occurs. A constant ("Europe").
country: The name of the country or major administrative unit in which the location occurs. A constant ("Russian Federation").
countryCode: The standard code for the country in which the location occurs. A constant ("RU").
stateProvince: The name of the next smaller administrative region than country (state, province, canton, department, region etc.) in which the location occurs. A constant ("Vladimir Oblast").
decimalLatitude: The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a location. A variable (latitude of a grid square centroid).
decimalLongitude: The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a location. A variable (longitude of a grid square centroid).
geodeticDatum: The ellipsoid, geodetic datum or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based.
coordinateUncertaintyInMeters: The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the location. A constant ("7000", i.e. an average distance between a grid square centroid and a grid square corner).
georeferencedBy: A list (concatenated and separated) of names of people, groups or organisations who determined the georeference (spatial representation) of the location. A constant ("Alexey P. Seregin").
identifiedBy: A list (concatenated and separated) of names of people, groups or organisations who assigned the Taxon to the subject. A variable (for example, "Alexey P. Seregin").
scientificName: The full scientific name, with authorship and date information, if known. A variable (for example, "Diphasiastrum complanatum (L.) Holub").
kingdom: The full scientific name of the kingdom in which the taxon is classified. A constant ("Plantae").
phylum: The full scientific name of the phylum or division in which the taxon is classified. A constant ("Tracheophyta").
genus: The full scientific name of the genus in which the taxon is classified. A variable (for example, "Diphasiastrum").
taxonRank: The taxonomic rank of the most specific name in the scientificName. A variable (three options: "Species", "Genus", "Variety").
taxonomicStatus: The status of the use of the scientificName as a label for a taxon. A constant ("accepted"). The taxonomy is linked to a checklist dataset (https://doi.org/10.15468/7zk2y5) that defines the concept.
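The constant of 7000 m given for coordinateUncertaintyInMeters can be cross-checked with a rough calculation of the centroid-to-corner distance of a 5′ × 10′ cell. The sketch below assumes a spherical Earth with a mean radius of 6,371 km and treats the cell as a flat rectangle, so it is only an approximation; the centroid latitudes of the northernmost and southernmost rows (56°47.5′N and 55°07.5′N) are derived from the row borders given in the Introduction.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumption: mean radius, spherical approximation


def half_diagonal_m(lat_deg, dlat_min=5.0, dlon_min=10.0):
    """Approximate centroid-to-corner distance (m) of a cell centred at lat_deg."""
    metres_per_arcmin = EARTH_RADIUS_M * math.pi / (180 * 60)
    half_height = 0.5 * dlat_min * metres_per_arcmin
    # A minute of longitude shrinks with the cosine of the latitude.
    half_width = 0.5 * dlon_min * metres_per_arcmin * math.cos(math.radians(lat_deg))
    return math.hypot(half_height, half_width)


north = half_diagonal_m(56 + 47.5 / 60)  # centroid latitude of row А
south = half_diagonal_m(55 + 7.5 / 60)   # centroid latitude of row Х
print(round(north), round(south))
```

Both values fall within roughly 150 m of 7000 m, which is consistent with the use of a single rounded constant for all grid squares.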