Biodiversity Data Journal :
Data Paper (Biosciences)
Corresponding author: Florencia Grattarola (grattarola@fzp.czu.cz), Daniel Pincheira-Donoso (d.pincheira-donoso@qub.ac.uk)
Academic editor: Gianniantonio Domina
Received: 23 Jul 2020 | Accepted: 27 Aug 2020 | Published: 26 Oct 2020
© 2020 Florencia Grattarola, Andrés González, Patricia Mai, Laura Cappuccio, César Fagúndez-Pachón, Florencia Rossi, Franco Teixeira de Mello, Lucía Urtado, Daniel Pincheira-Donoso
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Grattarola F, González A, Mai P, Cappuccio L, Fagúndez-Pachón C, Rossi F, Teixeira de Mello F, Urtado L, Pincheira-Donoso D (2020) Biodiversidata: A novel dataset for the vascular plant species diversity in Uruguay. Biodiversity Data Journal 8: e56850. https://doi.org/10.3897/BDJ.8.e56850
South America hosts some of the world’s most prominent biodiversity hotspots. Yet, Uruguay – a country where multiple major ecosystems converge – ranks amongst the countries with the lowest levels of available digital biodiversity data in the continent. Such prevalent data scarcity has significantly undermined our ability to progress towards evidence-based conservation actions – a critical limitation for a country with a strong focus on agricultural industries and only 1.3% of the land surface guarded by protected areas. Under today’s rapid biodiversity loss and environmental changes, the need for open-access biodiversity data is more pressing than ever before. To address this national issue, Biodiversidata – Uruguay’s first Consortium of Biodiversity Data – has recently emerged with the aim of assembling a constantly growing database for the biodiversity of this country. While the first phase of the project targeted vertebrate biodiversity, the second phase presented in this paper spans the biodiversity of plants.
As part of the second phase of the Biodiversidata initiative, we present the first comprehensive open-access species-level database of the vascular plant diversity recorded in Uruguay to date (i.e. all species for which data are currently available and species presence has been confirmed). It contains 12,470 occurrence records from across 1,648 species and 160 families, which roughly represents 60% of the total recorded flora of Uruguay. The primary biodiversity data include extant native and introduced species from the lycophytes, ferns, gymnosperms and angiosperms groups. Records were collated from multiple sources, including data available in peer-reviewed scientific literature, institutional scientific collections and datasets contributed by members of the Biodiversidata initiative. The complete database can be accessed at the Zenodo repository: doi.org/10.5281/zenodo.3954406
Species occurrence records, biodiversity data gaps, data mobilisation, Tracheophyta, Río de la Plata grasslands, South America, Uruguay
South America stands out as one of the planet’s regions with the highest levels of species richness and endemism (
The compilation of georeferenced plant data is a relatively recent practice in Uruguay (
Biodiversidata – Uruguay’s first Consortium of Biodiversity Data (https://biodiversidata.org/) – has recently emerged with the aim of assembling a constantly growing, open-access database for Uruguay’s biodiversity (
Records collected per group showing number of occurrence records, number of species, records with date of collection, records collected in the last 30 years and records with coordinates, with percentage in parentheses.
Group | Number of Occurrence Records | Number of Species | Records with Date (%) | Records from the last 30 years (%) | Records with Coordinates (%) |
---|---|---|---|---|---|
Lycophytes | 13 | 6 | 13 (100) | 11 (84.6) | 13 (100) |
Ferns | 540 | 78 | 540 (100) | 508 (94.1) | 540 (100) |
Gymnosperms | 48 | 5 | 41 (85.4) | 39 (81.2) | 48 (100) |
Angiosperms | 11,869 | 1,559 | 10,527 (88.7) | 9,585 (80.8) | 11,869 (100) |
Total | 12,470 | 1,648 | 11,121 (89.2) | 10,143 (81.3) | 12,470 (100) |
The primary data were collated from a range of different sources such as online databases, field guides, reports and primary literature, as well as Biodiversidata members’ original field/herbarium records. A complete list of sources for the occurrence records is shown in Table
List of sources used to build the Biodiversidata plant dataset, including the source type, the plant groups included in each source and the number of records extracted from each of the sources.
| Source | Source Type | Groups | Records |
|---|---|---|---|
| | Journal Article | Ferns, Gymnosperms, Angiosperms | 252 |
| | Journal Article | Ferns, Gymnosperms, Angiosperms | 107 |
| | Journal Article | Angiosperms | 3 |
| | Thesis | Gymnosperms, Angiosperms | 34 |
| | Short Communication | Angiosperms | 2 |
| | Short Communication | Angiosperms | 2 |
| | Short Communication | Angiosperms | 3 |
| | Journal Article | Angiosperms | 8 |
| | Journal Article | Angiosperms | 6 |
| | Journal Article | Ferns, Angiosperms | 153 |
| | Journal Article | Angiosperms | 26 |
| | Thesis | Angiosperms | 71 |
| | Biodiversidata member | Lycophytes, Ferns, Gymnosperms, Angiosperms | 340 |
| | Journal Article | Angiosperms | 52 |
| | Online Database | Lycophytes, Ferns, Gymnosperms, Angiosperms | 3,428 |
| | Biodiversidata member | Angiosperms | 101 |
| | Thesis | Angiosperms | 781 |
| | Thesis | Angiosperms | 991 |
| | Thesis | Gymnosperms, Angiosperms | 1,343 |
| | Journal Article | Gymnosperms, Angiosperms | 897 |
| | Journal Article | Angiosperms | 17 |
| | Journal Article | Angiosperms | 14 |
| | Thesis | Angiosperms | 68 |
| | Thesis | Angiosperms | 20 |
| | Thesis | Lycophytes, Ferns, Angiosperms | 220 |
| | Biodiversidata member | Ferns, Angiosperms | 520 |
| | Journal Article | Angiosperms | 50 |
| | Thesis | Angiosperms | 9 |
| | Thesis | Angiosperms | 152 |
| | Report | Lycophytes, Ferns, Gymnosperms, Angiosperms | 1,357 |
| | Journal Article | Angiosperms | 2 |
| | Journal Article | Ferns, Angiosperms | 53 |
| | Journal Article | Lycophytes, Ferns, Gymnosperms, Angiosperms | 283 |
| | Report | Lycophytes, Ferns, Gymnosperms, Angiosperms | 710 |
| | Journal Article | Angiosperms | 20 |
| | Thesis | Angiosperms | 9 |
| | Biodiversidata member | Angiosperms, Ferns | 366 |
The data from bibliographic references were obtained from searches based on the use of more than 30 sources which were largely heterogeneous in the amount of information available for each record. The information about the source was captured for each record using the ‘associatedReferences’ Darwin Core term. The data extracted consisted of taxa names, their geographic location and date of the collection/observation event when available, as well as information about collectors and identifiers. In some cases, georeferencing of the point locations was needed and relevant information was captured under the terms ‘coordinateUncertaintyInMeters’, ‘coordinatePrecision’ and ‘georeferenceRemarks’ (see more details in Steps description subsection).
The data from online sources were accessed through GBIF via ‘rgbif’ (
A single dataset with 5,138 occurrence records was downloaded, available at: https://doi.org/10.15468/dl.wc2fm7. After the data cleaning and quality check process (see details in Quality control subsection), we kept 3,428 records. Of those records, 1,787 corresponded to specimens and were contributed to GBIF by 51 different institutions around the world. The major contributor was the Missouri Botanical Garden (28.8% of the 1,787 records), followed by Universidade Federal do Rio Grande do Sul of Brazil (11.8%) and Universidade de São Paulo (6.6%). The 1,637 human observations were mainly derived (99.6%) from the citizen-science platform iNaturalist.
The data provided by Biodiversidata members were curated (e.g. taxonomic names updated, fields standardised) and uploaded to GBIF as four separate datasets, one for each data contributor (see sources in Table
For data to be fit for use, they must be accurate, complete, consistent with other sources and provide a proper level of detail (
We checked for misspellings and format errors, resolved synonymy and completed the higher taxonomic and infraspecific rank terms and the taxonomic authority for the scientific names using the R packages 'taxize' (
We checked the accuracy of dates and completed the 'eventDate' term with the format YYYY-MM-DD (e.g. 2020-02-10 for 10 February 2020). If only the year was known, 'eventDate' was represented as YYYY and, if only the year and month were known, as YYYY-MM.
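The 'eventDate' convention described above can be illustrated with a minimal Python sketch. The function name `build_event_date` is hypothetical and this is not the authors' code (the paper reports using R). It only shows how a date of varying precision maps to YYYY, YYYY-MM or YYYY-MM-DD.

```python
from datetime import date
from typing import Optional


def build_event_date(year: int, month: Optional[int] = None,
                     day: Optional[int] = None) -> str:
    """Format a collection date at the precision that is actually known.

    Hypothetical helper: returns YYYY, YYYY-MM or YYYY-MM-DD.
    """
    if month is None:
        if day is not None:
            raise ValueError("a day cannot be given without a month")
        return f"{year:04d}"
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if day is None:
        return f"{year:04d}-{month:02d}"
    # date() raises ValueError for impossible dates such as 30 February.
    return date(year, month, day).isoformat()
```

For example, `build_event_date(2020, 2, 10)` gives `2020-02-10`, while `build_event_date(1877)` gives `1877`.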
We filtered out records occurring outside Uruguay's continental territory and checked for inaccuracy and incompleteness in georeferences. The data accessed via GBIF were filtered by keeping records with coordinate uncertainty values of less than 10 km and discarding those records with country centroid as georeference protocol. This hard filter was applied to reduce processing time and avoid location inaccuracy in subsequent analyses. For the data extracted from literature, when coordinates were missing, we georeferenced point localities from map figures using Google Earth Pro 2020 and marked them as requiring further verification. For the data provided by members of Biodiversidata or collated from literature, when geographic coordinates were presented either as degrees, minutes and seconds or as degrees and decimal minutes, we converted them to decimal degrees, following georeferencing best practices (
Finally, we generated a unique 'occurrenceID' for every record in our database, except the data accessed from GBIF for which we kept the original ID.
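The coordinate handling in the quality-control steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' workflow: the function names are hypothetical, records with no uncertainty value are assumed to be discarded and the country-centroid check is assumed to be a simple text match on 'georeferenceProtocol'.

```python
from typing import Mapping

MAX_UNCERTAINTY_M = 10_000  # "less than 10 km", expressed in metres


def to_decimal_degrees(degrees: float, minutes: float = 0.0,
                       seconds: float = 0.0, hemisphere: str = "N") -> float:
    """Convert degrees/minutes/seconds (or degrees and decimal minutes)
    to signed decimal degrees. South and West are negative."""
    if degrees < 0:
        raise ValueError("pass unsigned degrees; the sign comes from hemisphere")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in [0, 60)")
    hemisphere = hemisphere.upper()
    if hemisphere not in {"N", "S", "E", "W"}:
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in {"S", "W"} else value


def passes_gbif_filter(record: Mapping[str, object]) -> bool:
    """Keep a record only if its uncertainty is below 10 km and it was
    not georeferenced to a country centroid."""
    protocol = str(record.get("georeferenceProtocol") or "").lower()
    if "country centroid" in protocol:  # assumed wording of the protocol value
        return False
    raw = record.get("coordinateUncertaintyInMeters")
    if raw in (None, ""):
        return False  # assumption: missing uncertainty is treated as a failure
    return float(raw) < MAX_UNCERTAINTY_M
```

For a locality given as 34°54' S, `to_decimal_degrees(34, 54, 0, "S")` returns approximately -34.9. The strict treatment of missing uncertainty values is a design choice for this sketch; a more permissive filter would keep those records and flag them for review.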
The database covers extant species of vascular plants reported for locations within the borders of Uruguay. The occurrence records are spatially biased (Fig.
Distribution in Uruguay of (a) the total 12,470 occurrence records of vascular plants in Biodiversidata, (b) sampling effort with 25 × 25 km grid-cell resolution (the mid-size resolution used for Biodiversidata's analyses) and (c) urban areas (orange dots with size relative to surface in km2), routes (international, primary and secondary) and main rivers. Projection WGS1984.
-30.10818 and -34.973188 Latitude; -58.43882 and -53.266525 Longitude.
The database includes 1,362 native species, 271 introduced and 15 species of yet unknown establishment means. According to
Number of occurrence records of vascular plants from Uruguay per family within each clade, in Biodiversidata. For Ferns and Angiosperms, only families with more than 20 and 50 records, respectively, are shown. On top of the bars, the number of species for each family is included along with the corresponding number of species that is expected by
Rank | Scientific Name | Common Name |
---|---|---|
kingdom | Plantae | Plants |
phylum | Tracheophyta | Vascular Plants |
The records included in the database cover samples reported in Uruguay during the period of 1877–2020 (Fig.
Representation of vascular plants families from Uruguay over time, grouped by clades, in Biodiversidata. Dots indicate there is at least one occurrence record for a species in the given family. For Ferns and Angiosperms, only families with more than 20 and 50 records, respectively, are shown.
Creative Commons Attribution License (CC-BY 4.0)
The dataset provides primary biodiversity data on extant vascular plant species recorded within Uruguay between 1877–2020 (Suppl. material
Column label | Column description | Darwin Core / Dublin Core term |
---|---|---|
occurrenceID | An identifier for the existence of a particular organism at a particular place at a particular time | dwc:occurrenceID |
otherCatalogNumbers | A list (concatenated and separated) of previous or alternate fully qualified catalogue numbers or other human-used identifiers for the same particular organism, whether in the current or any other dataset or collection | dwc:otherCatalogNumbers |
basisOfRecord | The specific nature of the data record (e.g. PreservedSpecimen, HumanObservation, unknown) | dwc:basisOfRecord |
recordedBy | A list (concatenated and separated) of names of people, groups or organisations responsible for recording the original existence of a particular organism at a particular place at a particular time | dwc:recordedBy |
establishmentMeans | The process by which the biological individual(s) represented in the record established at the spatial region or named place (e.g. native (= nativa), introduced (= introducida), unknown (= desconocido)) | dwc:establishmentMeans |
previousIdentifications | A list (concatenated and separated) of previous assignments of names to the recorded organism | dwc:previousIdentifications |
eventDate | The date during which the recording event occurred (format YYYY-MM-DD) | dwc:eventDate |
year | The four-digit year in which the recording event occurred, according to the Common Era Calendar | dwc:year |
month | The ordinal month in which the recording event occurred | dwc:month |
day | The integer day of the month on which the recording event occurred | dwc:day |
higherGeography | A list (concatenated and separated) of geographic names less specific than the information captured in the locality term | dwc:higherGeography |
continent | The name of the continent in which the spatial region or named place occurs | dwc:continent |
country | The name of the country or major administrative unit in which the spatial region or named place occurs | dwc:country |
countryCode | The standard code for the country in which the spatial region or named place occurs | dwc:countryCode |
stateProvince | The name of the next smaller administrative region than country (state, province, canton, department, region etc.) in which the location occurs | dwc:stateProvince |
locality | The standardised description of the spatial region or named place of an event | dwc:locality |
decimalLatitude | The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a spatial region or named place | dwc:decimalLatitude |
decimalLongitude | The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a spatial region or named place | dwc:decimalLongitude |
geodeticDatum | The ellipsoid, geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based | dwc:geodeticDatum |
coordinateUncertaintyInMeters | The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the spatial region or named place | dwc:coordinateUncertaintyInMeters |
coordinatePrecision | A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude | dwc:coordinatePrecision |
georeferencedBy | A list (concatenated and separated) of names of people, groups or organisations who determined the georeference (spatial representation) for the spatial region or named place | dwc:georeferencedBy |
identifiedBy | A list (concatenated and separated) of names of people, groups or organisations who assigned the taxon to the subject | dwc:identifiedBy |
taxonID | A globally unique identifier for the set of taxon information | dwc:taxonID |
scientificName | The full scientific name, with authorship and date information if known | dwc:scientificName |
nameAccordingTo | The reference to the source in which the specific taxon concept circumscription is defined or implied | dwc:nameAccordingTo |
higherClassification | A list (concatenated and separated) of taxa names terminating at the rank immediately superior to the taxon referenced in the taxon record | dwc:higherClassification |
kingdom | The full scientific name of the kingdom in which the taxon is classified | dwc:kingdom |
phylum | The full scientific name of the phylum or division in which the taxon is classified | dwc:phylum |
class | The full scientific name of the class in which the taxon is classified | dwc:class |
order | The full scientific name of the order in which the taxon is classified | dwc:order |
family | The full scientific name of the family in which the taxon is classified | dwc:family |
genus | The full scientific name of the genus in which the taxon is classified | dwc:genus |
specificEpithet | The name of the first or species epithet of the scientificName | dwc:specificEpithet |
infraspecificEpithet | The name of the lowest or terminal infraspecific epithet of the scientificName, excluding any rank designation | dwc:infraspecificEpithet |
taxonRank | The taxonomic rank of the most specific name in the scientificName | dwc:taxonRank |
scientificNameAuthorship | The authorship information for the scientificName | dwc:scientificNameAuthorship |
institutionCode | The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record | dwc:institutionCode |
collectionCode | The name, acronym, coden or initialism identifying the collection or dataset from which the record was derived | dwc:collectionCode |
catalogNumber | An identifier (preferably unique) for the record within the dataset or collection | dwc:catalogNumber |
recordNumber | An identifier given to the occurrence at the time it was recorded | dwc:recordNumber |
associatedReferences | A list (concatenated and separated) of identifiers (publication, bibliographic reference, URI) of literature associated with the existence of a particular organism at a particular place at a particular time | dwc:associatedReferences |
verbatimLocality | The original textual description of the spatial region or named place of the record event | dwc:verbatimLocality |
georeferenceRemarks | Notes or comments about the spatial description determination, explaining assumptions made | dwc:georeferenceRemarks |
vernacularName | A list (concatenated and separated) of common or vernacular names | dwc:vernacularName |
locationAccordingTo | Information about the source of the location information | dwc:locationAccordingTo |
georeferencedDate | The date on which the location was georeferenced | dwc:georeferencedDate |
georeferenceSources | A list (concatenated and separated) of maps, gazetteers or other resources used to georeference the Location, described specifically enough to allow anyone in the future to use the same resources | dwc:georeferenceSources |
georeferenceVerificationStatus | A categorical description of the extent to which the georeference has been verified to represent the best possible spatial description (e.g. requires verification) | dwc:georeferenceVerificationStatus |
georeferenceProtocol | A description or reference to the methods used to determine the spatial coordinates and uncertainties | dwc:georeferenceProtocol |
verbatimLatitude | The verbatim original latitude of the location (spatial region or named place) | dwc:verbatimLatitude |
verbatimLongitude | The verbatim original longitude of the location (spatial region or named place) | dwc:verbatimLongitude |
verbatimCoordinateSystem | The spatial coordinate system for the verbatimLatitude and verbatimLongitude of the location | dwc:verbatimCoordinateSystem |
locationRemarks | Comments or notes about the location (spatial region or named place) | dwc:locationRemarks |
measurementType | The nature of the measurement, fact, characteristic or assertion | dwc:measurementType |
measurementValue | The value of the measurement, fact, characteristic or assertion | dwc:measurementValue |
measurementDeterminedBy | A list (concatenated and separated) of names of people, groups or organisations who determined the value of the measurement, fact, characteristic or assertion | dwc:measurementDeterminedBy |
measurementRemarks | Comments or notes accompanying the measurement, fact, characteristic or assertion | dwc:measurementRemarks |
organismRemarks | Comments or notes about the particular organism recorded | dwc:organismRemarks |
language | Language of the resource | dcterms:language |
license | A legal document giving official permission to do something with the resource | dcterms:license |
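As a rough illustration of how a file with the columns listed above might be read, the following Python sketch summarises an occurrence table using only the standard library. The function name and the sample rows are made up for the example, and the actual file name in the Zenodo repository is not assumed here.

```python
import csv
import io
from collections import Counter
from typing import Dict, TextIO

# Three made-up rows, only to show the expected column layout.
SAMPLE_CSV = (
    "occurrenceID,family,scientificName,eventDate\n"
    "example-1,Poaceae,Examplegenus one,2019-03\n"
    "example-2,Poaceae,Examplegenus two,\n"
    "example-3,Asteraceae,Othergenus three,1998-11-05\n"
)


def summarise_occurrences(handle: TextIO) -> Dict[str, object]:
    """Count records, dated records, distinct names and records per family."""
    total = 0
    dated = 0
    names = set()
    by_family: Counter = Counter()
    for row in csv.DictReader(handle):
        total += 1
        if (row.get("eventDate") or "").strip():
            dated += 1
        names.add(row.get("scientificName"))
        by_family[row.get("family") or "unknown"] += 1
    return {
        "records": total,
        "with_date": dated,
        "scientific_names": len(names),
        "by_family": by_family,
    }


summary = summarise_occurrences(io.StringIO(SAMPLE_CSV))
```

To run the same summary on a downloaded copy of the dataset, the `io.StringIO` object can be replaced with `open(path, newline="", encoding="utf-8")`, where `path` points to the local CSV file.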
Biodiversidata is a collaborative association of experts with the aim of assembling a constantly-growing database for Uruguay’s biodiversity. The initiative was launched in 2018 under the direction of Florencia Grattarola as part of her PhD project at the University of Lincoln in partnership with the MacroBiodiversity Lab at Queen’s University Belfast (UK), led by Daniel Pincheira-Donoso. Its open-access platform (https://biodiversidata.org/) aims to make the biodiversity data of Uruguay openly available by integrating a broad range of resources including databases, publications, maps, reports and infographics, derived from the work of the team of expert scientific members. Current funds for developing Biodiversidata are conditional upon Grattarola's PhD project concluding in December 2020. The database presented in this study will continue to be improved and updated with new records periodically (yearly expected); check the Zenodo repository for the latest version: doi.org/10.5281/zenodo.3954406
Florencia Grattarola, Germán Botto, Laura Cappuccio, Inés da Rosa, César Fagúndez, Noelia Gobel, Andrés González, Enrique M. González, Javier González, Daniel Hernández, Gabriel Laufer, Patricia Mai, Raúl Maneyro, María Martínez, Juan A. Martínez-Lanfranco, Daniel E. Naya, Ana L. Rodales, Florencia Rossi, Franco Teixeira de Mello, Lucía Urtado, Lucía Ziegler and Daniel Pincheira-Donoso.
The research leading to these results has received funding from Agencia Nacional de Investigación e Innovación (ANII POS_EXT_2016_1_136663). The project is partially funded by the School of Biological Sciences, Queen’s University Belfast (UK) through a grant to DP-D. FTM wishes to thank PEDECIBA (Programa de Desarrollo de las Ciencias Básicas) and ANII. LU, FR, LC, PM and FTM want to thank CSIC-PAIE (Programa de Apoyo a la Investigación Estudiantil - Comisión Sectorial de Investigación Científica, Universidad de la República). We thank Paula Zermoglio and William Ulate for their critical suggestions and many insightful comments.
FG was responsible for data compilation, standardisation, quality control, management and analysis. FG and DP-D drafted the first version of the manuscript. AG, PM, CF, LU, LC, FR and FTM contributed to the acquisition of data. AG, PM and CF checked the final species list. All authors contributed to the interpretation of the data and finalised the manuscript.
Comma-separated values (CSV) data file containing the 12,470 species occurrence records held in the Biodiversidata database as of 2020-08-28