Biodiversity Data Journal :
Data Paper (Biosciences)
Corresponding author: Alexey P. Seregin (botanik.seregin@gmail.com)
Academic editor: Alexander Sennikov
Received: 15 Sep 2021 | Accepted: 18 Oct 2021 | Published: 20 Oct 2021
© 2021 Alexey P. Seregin, Yurii Basov
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Seregin AP, Basov YM (2021) Fleroff goes digital: georeferenced records from "Flora des Gouvernements Wladimir" (Fleroff, 1902). Biodiversity Data Journal 9: e75299. https://doi.org/10.3897/BDJ.9.e75299
The Global Biodiversity Information Facility (GBIF) has uneven data coverage across taxonomic, spatial and temporal dimensions. Temporal imbalances in the data coverage are particularly dramatic: 188.3 M GBIF records were made in 2020, more than all of the currently available pre-1986 electronic data combined. This underscores the importance of reliable and precise biodiversity spatial data collected in early times. Biological collections certainly play a key role in our knowledge of biodiversity in the past. However, digitisation of historical literature is also underway and has become a modern trend in biodiversity data mining. The grid dataset for the flora of Vladimir Oblast, Russia, includes many historical records borrowed from the "Flora des Gouvernements Wladimir" by Alexander F. Fleroff (also known as Flerov or Flerow). Intensive study of Fleroff's collections and field surveys in exactly the same localities where he worked showed that the quality of his data is superb. Species lists collected across hundreds of localities form a unique source of reliable information on the floristic diversity of Vladimir Oblast and adjacent areas for the period from 1894 to 1901. Since the grid dataset holds generalised data, we georeferenced Fleroff's literature records precisely and published them as a GBIF-mediated dataset.
A dataset, based on "Flora des Gouvernements Wladimir. I. Pflanzengeographische Beschreibung des Gouvernements Wladimir" by
GBIF has uneven data coverage across taxonomic, spatial and temporal dimensions. Temporal imbalances in the data coverage are particularly dramatic (Fig.
Distribution of GBIF-mediated records from 1800 to 2021 by years showing the disproportion in temporal data coverage across GBIF (source: https://www.gbif.org/, as of 05 September 2021).
Biological collections certainly play a key role in our knowledge of biodiversity in the past. In GBIF, 8.23 M out of 10.54 M pre-1900 records are based upon museum specimens. Nonetheless, digitisation of literature is underway. Purposeful digitisation and transcription of published sources into GBIF-mediated data is a modern trend in biodiversity data mining. In particular, numerous datasets from the Plazi.org platform (https://www.gbif.org/publisher/7ce8aef0-9e92-11dc-8738-b8a03c50a862) have contributed 480,751 occurrences from taxonomic treatment articles.
In the Russian segment of GBIF, digitised points from the printed atlases are the largest datasets based upon literature sources. For instance, dot maps from the "Flora of Siberia" (
Vladimir Oblast in GBIF. Vladimir Oblast (29,084 km²) is a first-level administrative unit of the Russian Federation situated east of Moscow. This is a region with a high density of GBIF-mediated data on floristic diversity. To date, 188,790 records of tracheophytes originate from Vladimir Oblast out of 3,437,051 records available for the flora of Russia. The average data density on vascular plants from this area is 6.49 records per km². The most extensive datasets are:
The largest grid dataset with ca. 130 K records (
The experience of the author's (A.P. Seregin) work on the grid atlas, his intensive study of Fleroff's herbarium collections and field surveys exactly in the same localities where Alexander F. Fleroff (Fig.
Spelling of the surname. In modern standards, the Russian surname "Флёров" could be transcribed into English as "Flerov" following the spelling (BSI standard) or "Flyorov" following the pronunciation (GOST 7.79-2000). However, in the past, it was a common practice to use "-off" ending for the Russian surnames like "Sokoloff" (Соколов), "Smirnoff" (Смирнов) etc. In his book,
However, IPNI suggests other forms as the standard ones: "Flerow" (urn:lsid:ipni.org:authors:2781-1, https://www.ipni.org/a/2781-1) for tracheophytes and "Flerov" (urn:lsid:ipni.org:authors:20035717-1, https://www.ipni.org/a/20035717-1) for fungi. Both LSIDs refer to him.
The purpose of this newly-created dataset (
Structure of the original source: The book "Flora of Vladimir Governorate" by
The first part is written in two languages, i.e. the main text in Russian (338 pages) with the extended summary in German (18 pages) (Fig.
Schmutz-titles of two parts of the original source by
From the point of view of a 21st century researcher, the most important fragments of the first part are lists of species in Latin for individual communities with a clear indication of localities (Fig.
Examples of pages from the original source (
The second part of Fleroff's book is a 70-page checklist written in Latin and entitled "Flora Gubernii Wladimiriensis. II. Enumeratio plantarum" (Fig.
Fleroff intensely revised the nomenclature of the checklist prior to its publication. He made some adjustments and name substitutions according to the recently-published monographs. Therefore, he altered some names widely used in the first part (like Betonica officinalis L., Clinopodium vulgare L., Orobus vernus L. etc.). Later, species entries from the second part of
In 1902,
Fleroff's herbarium
Fleroff's herbarium collections from Vladimir Governorate are now preserved in two herbaria, i.e. the Moscow University Herbarium (MW) and the Komarov Institute Herbarium (LE). The specimens collected in 1894–1901 document data from the original source (
The MW Herbarium has been entirely digitised (
The MW Herbarium holds 676 specimens collected by Fleroff in Vladimir Governorate in 1894–1896: nine specimens of fairly rare species are dated back to 1894 (Fig.
A herbarium specimen MW0389465 collected by Fleroff in 1894 (preserved and digitised in the Moscow University Herbarium).
A herbarium specimen MW0271262 collected by Fleroff in 1896 (preserved and digitised in the Moscow University Herbarium).
A herbarium specimen MW0298466 collected by Fleroff in 1900 (preserved and digitised in the Moscow University Herbarium).
The LE Herbarium contains later collections by Fleroff from Vladimir Governorate (1897–1907). Judging by the labels, the specimens for 1897, 1900 and 1901 were undoubtedly collected during the preparation of the original source (
Georeferencing of digitised species lists (see below) was carried out, based on expert knowledge of the area, analysis of modern satellite images and old topographic maps. Fleroff's lists of routes, which he gave at the beginning of each chapter of the original source, were of great help to us. For each route, he gave a sequential list of localities (i.e. villages, stations, rivers, lakes etc.), which allowed us to reconstruct his movements. The mean accuracy of records of the entire dataset is 2,447 m. For 2,460 records, the georeferencing accuracy is 1,000 m or less (28%), whereas for 6,070 records, it is 2,000 m or less (68%). That level of accuracy was unattainable for most herbarium collections of the late 19th century.
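The percentages quoted above follow directly from the record counts. The short Python sketch below re-derives them and shows how such shares could be computed from per-record uncertainties. The function name `share_within` and the sample values are illustrative assumptions, not part of the dataset or of the authors' workflow.

```python
from statistics import mean


def share_within(uncertainties_m, threshold_m):
    """Return the fraction of records whose uncertainty is <= threshold_m."""
    if not uncertainties_m:
        raise ValueError("no records supplied")
    hits = sum(1 for u in uncertainties_m if u <= threshold_m)
    return hits / len(uncertainties_m)


# Counts reported in the text.
TOTAL_RECORDS = 8889
pct_1000 = round(100 * 2460 / TOTAL_RECORDS)  # records with accuracy <= 1,000 m
pct_2000 = round(100 * 6070 / TOTAL_RECORDS)  # records with accuracy <= 2,000 m

# Illustrative uncertainties in metres (invented values, not real records).
sample = [500, 1000, 1500, 2000, 5000]
sample_share = share_within(sample, 1000)
sample_mean = mean(sample)
```

With the real dataset, the same function could be applied to the `coordinateUncertaintyInMeters` column.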
1. List of species. In the original source (
2. Georeferences and their list. Simultaneously, but independently from the first step, we made a spreadsheet of localities and communities studied and documented by
We used two main sources for georeferencing: (1) modern satellite images and electronic maps of Yandex (https://yandex.ru/maps/) and (2) a detailed digitised map of Vladimir Governorate by Mende, 1848–1850 (http://www.etomesto.ru/map-vladimir_mende/). Occasionally, we used other cartographic sources and textual descriptions of places from a wide variety of sources on the Internet. We georeferenced Fleroff's records to 367 centroids, because the author sometimes described several closely-situated communities within the same locality (for example, aquatic plants and coastal plants of a lake). The first map (Fig.
Three places mentioned by
3. Harmonisation of species lists and georeferences. At this step, we merged and harmonised the two spreadsheets, i.e. the species lists by pages and the list of georeferenced localities. The original source was always at hand during this work. We identified and eliminated some accidental omissions and typos. Location descriptions were standardised. We excluded some Latin names mentioned without localities (for example, in conclusions or discussion).
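The merge described in this step can be pictured as a key-based join between the two spreadsheets. The sketch below is only an illustration under stated assumptions: the column names (`locality_key`, `page`, `name`), the keys and the coordinates are hypothetical and do not reproduce the authors' actual spreadsheets.

```python
def harmonise(species_rows, localities):
    """Attach coordinates to each species record via its locality key.

    Returns (merged, unmatched): the merged records, and the species rows
    whose locality key is absent from the locality table.
    """
    merged, unmatched = [], []
    for row in species_rows:
        loc = localities.get(row["locality_key"])
        if loc is None:
            unmatched.append(row)  # e.g. a name mentioned without a locality
        else:
            merged.append({**row, **loc})
    return merged, unmatched


# Hypothetical inputs: keys, pages and coordinates are invented for illustration.
species_rows = [
    {"page": 10, "name": "Pinus sylvestris L.", "locality_key": "LOC001"},
    {"page": 10, "name": "Scirpus sylvaticus L.", "locality_key": "LOC001"},
    {"page": 300, "name": "Pinus sylvestris L.", "locality_key": None},
]
localities = {
    "LOC001": {"decimalLatitude": 56.0, "decimalLongitude": 40.0},
}

merged, unmatched = harmonise(species_rows, localities)
```

Keeping the unmatched rows in a separate list makes it easy to review them against the original source before deciding whether to exclude them.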
4. Excluding non-original data.
5. Adding records based upon the Russian vernacular names. A remarkable feature of the book by
6. Cleaning list of species, synchronisation with a backbone. We checked the list of re-typed names for errors of two kinds, i.e. typos in the original text and typos by the input operator. These cases have been standardised. The standardisation of orthography reduced the number of taxa entries from 766 to 678.
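A reduction of this kind happens whenever several spellings collapse onto one standardised name. The sketch below illustrates the idea with whitespace normalisation plus a manual correction table. The variant spellings and the `corrections` mapping are invented for illustration and are not taken from the original source.

```python
def clean_name(raw, corrections):
    """Collapse repeated whitespace, then apply a manual correction table."""
    name = " ".join(raw.split())
    return corrections.get(name, name)


# Invented examples: three raw entries that reduce to two standardised names.
raw_names = [
    "Pinus  sylvestris L.",   # doubled space, as an input-operator typo
    "Pinus silvestris L.",    # orthographic variant
    "Scirpus sylvaticus L.",
]
corrections = {"Pinus silvestris L.": "Pinus sylvestris L."}

cleaned = {clean_name(n, corrections) for n in raw_names}
n_before = len(set(raw_names))
n_after = len(cleaned)
```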
The orthographically-clean set of names was further synchronised with the nomenclature according to
7. DarwinCore format. We transformed the final spreadsheet with 8,889 records into the DarwinCore format. It includes 20 variable fields, whereas an additional 28 constant fields were set directly in the IPT. After publication, the data cleaning procedure was based on the "Issues and flags" section on the dataset page (https://doi.org/10.15468/8qf7sh).
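The split between variable and constant fields can be sketched as follows. The constants shown are a small subset of the 28 listed in the data description, and the identifier patterns are inferred from the single examples given there ("urn:lsid:biocol.org:col:15550:11:5030" and "Flerov:5030"). The function `to_darwin_core` is hypothetical, and in the actual workflow the constants were set in the IPT rather than in code.

```python
DATASET_ID = "urn:lsid:biocol.org:col:15550:11"

# A subset of the constant fields, with values from the data description.
CONSTANTS = {
    "basisOfRecord": "HumanObservation",
    "occurrenceStatus": "present",
    "eventDate": "1894/1901",
    "geodeticDatum": "WGS84",
    "countryCode": "RU",
}


def to_darwin_core(record_id, variable_fields):
    """Combine constants, variable fields and derived identifiers in one row."""
    row = dict(CONSTANTS)
    row.update(variable_fields)
    # occurrenceID = datasetID + ID of the record within the dataset.
    row["occurrenceID"] = f"{DATASET_ID}:{record_id}"
    # Pattern assumed from the example "Flerov:5030".
    row["catalogNumber"] = f"Flerov:{record_id}"
    return row


# Illustrative pairing only: record 5030 is not claimed to be this taxon.
row = to_darwin_core(5030, {"scientificName": "Scirpus sylvaticus L."})
```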
The dataset covers Vladimir Governorate of the Russian Empire within its 1901 borders. Currently, records by
General overview of digitised data from
Modern region | Number of centroids | Number of species | Number of records |
---|---|---|---|
Vladimir Oblast | 195 | 534 | 4,611 |
Yaroslavl Oblast | 66 | 409 | 2,013 |
Nizhny Novgorod Oblast | 37 | 307 | 942 |
Ivanovo Oblast | 36 | 273 | 667 |
Moscow Oblast | 32 | 203 | 656 |
Total | 367 | 654 | 8,889 |
General overview of digitised data from
Modern district | Modern region | Number of centroids | Number of species | Number of records |
---|---|---|---|---|
Pereslavsky District | Yaroslavl Oblast | 66 | 409 | 2,013 |
Aleksandrovsky District | Vladimir Oblast | 40 | 317 | 1,318 |
Sergievo-Posadsky District | Moscow Oblast | 29 | 199 | 599 |
Yuryev-Polsky District | Vladimir Oblast | 14 | 257 | 586 |
Vachsky District | Nizhny Novgorod Oblast | 13 | 249 | 523 |
Suzdalsky District | Vladimir Oblast | 28 | 200 | 482 |
Vyaznikovsky District | Vladimir Oblast | 26 | 187 | 399 |
Gorokhovetsky District | Vladimir Oblast | 14 | 190 | 375 |
Gavrilovo-Posadsky District | Ivanovo Oblast | 11 | 202 | 368 |
Navashinsky District | Nizhny Novgorod Oblast | 16 | 144 | 283 |
Kovrovsky District | Vladimir Oblast | 11 | 134 | 258 |
Kirzhachsky District | Vladimir Oblast | 13 | 132 | 258 |
Yuzhsky District | Ivanovo Oblast | 18 | 137 | 244 |
City of Kovrov | Vladimir Oblast | 5 | 113 | 193 |
Kameshkovsky District | Vladimir Oblast | 6 | 102 | 139 |
Sudogodsky District | Vladimir Oblast | 6 | 76 | 123 |
Muromsky District | Vladimir Oblast | 7 | 83 | 108 |
Melenkovsky District | Vladimir Oblast | 4 | 76 | 94 |
Petushinsky District | Vladimir Oblast | 8 | 63 | 85 |
City of Vladimir | Vladimir Oblast | 4 | 55 | 65 |
Volodarsky District | Nizhny Novgorod Oblast | 2 | 57 | 64 |
Teykovsky District | Ivanovo Oblast | 7 | 41 | 55 |
Pavlovsky District | Nizhny Novgorod Oblast | 5 | 44 | 53 |
City of Murom | Vladimir Oblast | 1 | 40 | 45 |
Town of Aleksandrov | Vladimir Oblast | 1 | 37 | 42 |
Orekhovo-Zuyevsky District | Moscow Oblast | 2 | 30 | 39 |
Selivanovsky District | Vladimir Oblast | 3 | 22 | 24 |
Kulebaksky District | Nizhny Novgorod Oblast | 1 | 17 | 19 |
Taldomsky District | Moscow Oblast | 1 | 15 | 18 |
Town of Suzdal | Vladimir Oblast | 1 | 7 | 7 |
Sobinsky District | Vladimir Oblast | 1 | 1 | 1 |
Town of Vyazniki | Vladimir Oblast | 1 | 1 | 1 |
Kolchuginsky District | Vladimir Oblast | 1 | 1 | 1 |
The list of localities includes some places completely transformed by human activity in the 20th century. For instance, Berendeyevo Peat Bog has been drained and mined since 1918 (Fig.
Berendeyevo peat bog (Pereslavsky District, Yaroslavl Oblast) in 1901 (a) and 2021 (b), an example of a fully transformed natural object precisely studied by
55 and 57 Latitude; 37.5 and 43.5 Longitude.
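These limits can serve as a simple sanity check on georeferenced points. The sketch below assumes that the values are decimal degrees north and east. The function name `in_bounding_box` and the test coordinates are illustrative, with city positions given only approximately.

```python
# Bounding box from the text, assumed to be degrees north and degrees east.
LAT_MIN, LAT_MAX = 55.0, 57.0
LON_MIN, LON_MAX = 37.5, 43.5


def in_bounding_box(lat, lon):
    """Return True if the point lies within the dataset's bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


inside = in_bounding_box(56.13, 40.41)   # approximate position of Vladimir
outside = in_bounding_box(59.94, 30.31)  # approximate position of St Petersburg
```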
The checklist by
The following species names by
Rank | Scientific Name |
---|---|
phylum | Tracheophyta |
phylum | Bryophyta |
phylum | Marchantiophyta |
phylum | Chlorophyta |
phylum | Charophyta |
phylum | Ascomycota |
A book by
8,889 georeferenced records of 654 taxa from the first part of "Flora des Gouvernements Wladimir" (Fleroff 1902), which include species lists by localities studied by the author in 1894–1901. The nomenclature is given against Seregin, A.P. 2014. Flora of Vladimir Oblast, Russia: Grid data analysis. Moscow, KMK Scientific Press. 441 p. ISBN 978-5-9905832-9-0 (http://dx.doi.org/10.13140/2.1.1148.2407).
Column label | Column description |
---|---|
occurrenceID | An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). A variable constructed from a combination of two identifiers in the record that will most closely make the occurrenceID globally unique (datasetID + ID of a record within the dataset). For example, "urn:lsid:biocol.org:col:15550:11:5030". |
dcterms:type | The nature or genre of the resource. A constant ("Dataset"). |
dcterms:modified | The most recent date-time on which the resource was changed. A constant ("2021-09-04"). |
dcterms:language | A language of the resource. A constant ("en | ru", i.e. English and Russian). |
dcterms:license | A legal document giving official permission to do something with the resource. A constant ("http://creativecommons.org/licenses/by/4.0/legalcode"). |
dcterms:rightsHolder | A person or organisation owning or managing rights over the resource. A constant ("Moscow State University"). |
dcterms:accessRights | Information about who can access the resource or an indication of its security status. A constant ("Use under CC BY 4.0"). |
institutionID | An identifier for the institution having custody of the object(s) or information referred to in the record. A constant ("http://grbio.org/institution/moscow-stateuniversity" for the Moscow State University). |
collectionID | An identifier for the collection or dataset from which the record was derived. A constant ("urn:lsid:biocol.org:col:15550" for the Moscow University Herbarium). |
datasetID | An identifier for the set of data. May be a global unique identifier or an identifier specific to a collection or institution. A constant ("urn:lsid:biocol.org:col:15550:11"). |
institutionCode | The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. A constant ("Moscow State University"). |
datasetName | The name identifying the dataset from which the record was derived. A constant ("Flora des Gouvernements Wladimir" (Fleroff, 1902): georeferenced records). |
ownerInstitutionCode | The name (or acronym) in use by the institution having ownership of the object(s) or information referred to in the record. A constant ("Moscow State University"). |
basisOfRecord | The specific nature of the data record - a subtype of the dcterms:type. A constant ("HumanObservation"). |
catalogNumber | An identifier (preferably unique) for the record within the dataset or collection. A variable. For example, "Flerov:5030". |
recordedBy | A list (concatenated and separated) of names of people, groups or organisations responsible for recording the original occurrence. A variable. For example, "Alexander F. Fleroff". |
occurrenceStatus | A statement about the presence or absence of a taxon at a location. A constant ("present"). |
associatedReferences | A list (concatenated and separated) of identifiers (publication, bibliographic reference, global unique identifier, URI) of literature associated with the Occurrence. A variable with a page reference. For example, "Fleroff (1902), p. 182 [Fleroff A. (1902). Flora des Gouvernements Wladimir. I. Pflanzengeographische Beschreibung des Gouvernements Wladimir. Moskva. 338 p.]". |
eventDate | The date or interval during which an event occurred. For occurrences, this is the date when the event was recorded. A constant ("1894/1901"). |
higherGeography | A list (concatenated and separated) of geographic names less specific than the information captured in the locality term. A variable. For example, "Europe | Russian Federation | Vladimir Oblast | Petushinskii raion". |
continent | The name of the continent in which the location occurs. A constant ("Europe"). |
country | The name of the country or major administrative unit in which the location occurs. A constant ("Russian Federation"). |
countryCode | The standard code for the country in which the location occurs. A constant ("RU"). |
stateProvince | The name of the next smaller administrative region than country (state, province, canton, department, region etc.) in which the location occurs. A variable. For example, "Vladimir Oblast". |
county | The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the Location occurs. A variable. For example, "Petushinskii raion". |
verbatimLocality | The original textual description of the place. A variable. For example, "озеро Верхнее по р. Ушма, берега". |
locationRemarks | Comments or notes about the Location. A constant ("original description in Russian by Fleroff (1902)"). |
decimalLatitude | The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a location. A variable. |
decimalLongitude | The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a location. A variable. |
geodeticDatum | The ellipsoid, geodetic datum or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based. A constant ("WGS84"). |
coordinateUncertaintyInMeters | The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the location. A variable. |
coordinatePrecision | A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude. A constant ("0.0001"). |
georeferencedBy | A list (concatenated and separated) of names of people, groups or organisations who determined the georeference (spatial representation) of the location. A constant ("Alexey P. Seregin"). |
georeferencedDate | The date on which the Location was georeferenced. A constant ("2021-08"). |
georeferenceSources | A list (concatenated and separated) of maps, gazetteers or other resources used to georeference the Location, described specifically enough to allow anyone in the future to use the same resources. A constant ("https://yandex.ru/maps/ | http://www.etomesto.ru/map-vladimir_mende/"). |
georeferenceRemarks | Notes or comments about the spatial description determination, explaining assumptions made in addition or opposition to those formalised in the method. A variable. For example, "centroid position: у Бельских двориков". |
identifiedBy | A list (concatenated and separated) of names of people, groups or organisations who assigned the Taxon to the subject. A constant ("Alexander F. Fleroff"). |
dateIdentified | The date on which the subject was identified as representing the Taxon. A constant ("1894/1901"). |
taxonID | An identifier for the set of taxon information (data associated with the Taxon class). May be a global unique identifier or an identifier specific to the dataset. A variable. For example, "VLA0034" as a reference for Pinus sylvestris L. in https://doi.org/10.15468/7zk2y5. |
nameAccordingToID | An identifier for the source in which the specific taxon concept circumscription is defined or implied. See nameAccordingTo. A variable. For example, doi: 10.15468/7zk2y5. |
scientificName | The full scientific name, with authorship and date information, if known. A variable (for example, "Scirpus sylvaticus L."). |
nameAccordingTo | For taxa that result from identifications, a reference to the keys, monographs, experts and other sources should be given. A variable. Two options: (1) "Seregin A (2021). Flora of Vladimir Oblast (Seregin, 2014): accepted names. Lomonosov Moscow State University. Checklist dataset https://doi.org/10.15468/7zk2y5 accessed via GBIF.org on 2021-09-04"; (2) "Fleroff A. (1902). Flora des Gouvernements Wladimir. I. Pflanzengeographische Beschreibung des Gouvernements Wladimir. Moskva. 338 p." |
phylum | The full scientific name of the phylum or division in which the taxon is classified. A variable. For example, "Tracheophyta". |
taxonRank | The taxonomic rank of the most specific name in the scientificName. A variable (four options: "species", "variety", "genus", "speciesAggregate"). |
vernacularName | A common or vernacular name. A variable. For example, "сфагны". |
nomenclaturalCode | The nomenclatural code (or codes in the case of an ambiregnal name) under which the scientificName is constructed. A constant ("International Code of Nomenclature for algae, fungi and plants"). |
taxonomicStatus | The status of the use of the scientificName as a label for a taxon. A constant ("accepted"). |
taxonRemarks | Comments or notes about the taxon or name. A variable. For example, "тростник in Fleroff (1902)". |
We are deeply indebted to an academic editor of the paper and an anonymous reviewer for a number of suggestions and amendments, which helped to improve the style and clarity of the manuscript.
The study was supported by a grant from the Russian Science Foundation (project # 21-77-20042) for georeferencing, data processing and publication of the GBIF-mediated dataset.
Curation of the MW Herbarium is performed within the State Assignment 121032500090-7 for the Moscow State University ("Plant biodiversity of Russia and adjacent countries: scientific approach to processing of collections of the Herbarium of Moscow State University as a basis for the study of regional floras", under A.P. Seregin).
Yurii M. Basov digitised species lists from the original source and produced maps for the paper.
Alexey P. Seregin designed the study, performed georeferencing, produced GBIF-mediated dataset and wrote the paper.