Biodiversity Data Journal :
Data Paper (Biosciences)
Corresponding author: Janaína Gomes-da-Silva (jgomes_da_silva@yahoo.com.br)
Academic editor: Quentin Groom
Received: 17 Mar 2021 | Accepted: 30 May 2021 | Published: 03 Jun 2021
© 2021 Janaína Gomes-da-Silva, João Lanna, Rafaela Forzza
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Gomes-da-Silva J, Lanna J, Forzza RC (2021) Distribution of endemic angiosperm species in Brazil on a municipality level. Biodiversity Data Journal 9: e66043. https://doi.org/10.3897/BDJ.9.e66043
Herbarium collections and the data they hold are the main sources of plant biodiversity information. These collections contain taxonomic and spatial data on living and extinct species; consequently, they are the fundamental basis for temporal and spatial biogeographical studies of plants. Mega projects focused on providing free digital access to accurate biodiversity data have transformed plant science research, mainly in the past two decades. As a result, researchers today are overwhelmed by the many different datasets in online repositories. There are also several challenges involved in using these data for biogeographical analyses. Analyses performed on the data available in the repositories show that 70-75% of the records have spatial deficiencies and that a high number of records lack coordinates. This shortage of reliable primary biogeographical information seriously impedes biogeographical analyses, conservation assessments and taxonomic revisions and, consequently, hinders evaluations of threats to biodiversity at global, regional and local levels. With the aim of contributing to botanical and biogeographical research, this paper provides georeferenced spatial data for angiosperm species endemic to Brazil. The spatial dataset was built from two reliable online databases, i.e. the Flora do Brasil 2020 floristic database (BFG) and Plantas do Brasil: Resgate Histórico e Herbário Virtual para o Conhecimento e Conservação da Flora Brasileira (REFLORA), both of which are based on records collected over the last two centuries.
We provide three taxonomically edited and georeferenced datasets for basal angiosperms, monocots and eudicots, covering a total of 14,992 species endemic to Brazil. Producing this consolidated dataset involved several months of detailed revision of the coordinates and nomenclatural updating of the names in these datasets. The information provided in this georeferenced dataset, covering two centuries of specimen collections, will contribute to botanical and, in particular, biogeographical studies.
Endemic species, data re-use, flowering plants, occurrence records, primary biodiversity data, South America
Herbarium collections and the data they hold have been one of the main sources of plant biodiversity information through time (
Widespread access to taxonomic and distributional data is producing great advances in botanical and biogeographical research, as well as supporting more accurate evaluations of extinction risks (
Manipulating millions of records is an extremely complicated task. In recent years, workflows, tools and methods have been developed for dealing with taxonomic and geographic errors, simplifying the process (
As manual data cleaning is laborious (
Brazil has the highest biodiversity of vascular plants on the planet (BFG:
The geographical range of a species forms the basis for biogeographical studies. Repositories, such as
Geo-referenced spatial data for angiosperm species endemic to Brazil
The
This georeferenced occurrence dataset for endemic species provides the basis for a wide range of biodiversity studies, for example, spatial studies conducted at various hierarchical levels, i.e. family, genus, species; effects of global change; changes in distributions of species; conservation; and systematics.
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) and FAPERJ - Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro for the postdoctoral fellowship granted to JGS. RCF received a Research Productivity Fellowship from CNPq (proc. 303420/2016-2) and FAPERJ (processes n° E-26/202.778/2018) through Programa Cientista do Nosso Estado.
Brazilian angiosperms dataset
Species list compilation:
The list of species was established in two phases. First, the initial list of names of all endemic species of Angiospermae was generated through the BFG in the Brazilian Flora (
Based on this list of all endemic Brazilian angiosperm species retrieved from the BFG floristic database between August 2018 and October 2019, all occurrence records were downloaded from the
We created a protocol to clean the datasets (Fig.
Subsequently, we conducted manual cleaning procedures on the geographic data of the records. In the first step, we excluded records of specimens with imprecise or vague descriptions of locations (e.g. Negro River, north coast, south coast), incomplete locations (e.g. Amazonia, Bahia, Brazil) or incongruent information concerning locations (e.g. with no administrative unit, location in the ocean). In the second step, we removed the taxonomic duplicates and records of duplicate samples, i.e. those with the same species name, place of occurrence and voucher information; duplicates were identified on the basis of locality, collector name, collector number and the year in which the sample was collected. In the final dataset, each record corresponds to a single herbarium specimen for which the geographical location has been checked and is unique to that locality. After data cleaning, the total number of records dropped from 827,016 to 183,201 occurrence records with complete voucher information.
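The two exclusion steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the protocol actually used by the authors: the field names (`locality`, `recordedBy`, `recordNumber`, `year`) are assumed Darwin Core-style labels, the exact-match rule for vague localities is a simplification, and the sample records are invented placeholders.

```python
from typing import Dict, Iterable, List

# Locality strings treated as too vague or incomplete to georeference.
# The examples come from the text; matching them exactly is an assumption.
VAGUE_LOCALITIES = {
    "negro river", "north coast", "south coast",
    "amazonia", "bahia", "brazil",
}

# Fields the text names as the basis for duplicate removal
# (the key names themselves are assumptions).
DEDUP_KEYS = ("locality", "recordedBy", "recordNumber", "year")


def clean_records(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop records with vague localities, then keep one record per duplicate key."""
    seen = set()
    cleaned = []
    for rec in records:
        locality = (rec.get("locality") or "").strip().lower()
        # Step 1: exclude missing, vague or incomplete locality descriptions.
        if not locality or locality in VAGUE_LOCALITIES:
            continue
        # Step 2: exclude duplicates sharing locality, collector, number and year.
        key = tuple(str(rec.get(k, "")).strip().lower() for k in DEDUP_KEYS)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Invented placeholder records, for illustration only.
sample = [
    {"locality": "Negro River", "recordedBy": "Collector A", "recordNumber": "12", "year": "1980"},
    {"locality": "Locality X", "recordedBy": "Collector A", "recordNumber": "13", "year": "1981"},
    {"locality": "Locality X", "recordedBy": "Collector A", "recordNumber": "13", "year": "1981"},
]
print(len(clean_records(sample)))  # prints 1
```

In practice, vague localities are unlikely to match a fixed list exactly, which is presumably why this part of the cleaning was done manually.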
The use of GPS became more widespread in 1995-1996, but there were still few satellites at that time (
The final checklist is composed of native and endemic angiosperms and includes only vouchers identified to the species level, based on the Brazilian Flora (
The geographic coverage encompasses the national territory of Brazil, which extends from 5° to -34° latitude and from -34° to -73° longitude and covers a total area of approximately 8.5 million km² (IBGE). The dataset comprises all species of Angiospermae found exclusively in Brazil and contains occurrence records from six phytogeographic domains, i.e. Amazonia, Caatinga, Cerrado, the Atlantic Forest, Pampa and Pantanal, in the Chacoan, Parana, South Brazilian and South-eastern Amazonian dominions (Fig.
-34° and 5° Latitude; -73° and -34° Longitude.
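As a rough illustration of how such a bounding box can be used, the sketch below flags coordinates that fall outside the extent given in the geographic coverage (5° to -34° latitude, -34° to -73° longitude). The function name is hypothetical and the test points are approximate values chosen for the example.

```python
# Bounds taken from the geographic coverage stated in the text (decimal degrees).
LAT_MIN, LAT_MAX = -34.0, 5.0
LON_MIN, LON_MAX = -73.0, -34.0


def in_brazil_bbox(lat: float, lon: float) -> bool:
    """Return True if the point lies inside the rectangular extent of Brazil."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


print(in_brazil_bbox(-22.9, -43.2))  # roughly Rio de Janeiro; prints True
print(in_brazil_bbox(40.0, -3.7))    # a point in Europe; prints False
```

A bounding box is only a coarse filter: points in neighbouring countries or in the ocean can still fall inside it, so it complements rather than replaces a check against administrative boundaries.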
To facilitate the search for taxa at different hierarchical levels, the dataset comprises three different worksheets of specimens collected over the past two centuries organised according to APG IV classification (
(1st Worksheet) A total of 649 species of basal angiosperms belonging to five orders, i.e. Canellales, Laurales, Magnoliales, Nymphaeales and Piperales, from 13 families and 50 genera. The number of georeferenced records per order is shown in Fig.
(2nd Worksheet) A total of 3,854 species of monocots belonging to nine orders, i.e. Alismatales, Arecales, Asparagales, Commelinales, Dioscoreales, Liliales, Pandanales, Poales and Zingiberales, from 32 families and 370 genera. The number of georeferenced records per order is shown in Fig.
(3rd Worksheet) A total of 10,489 species of eudicots belonging to 31 orders, i.e. Apiales, Aquifoliales, Asterales, Boraginales, Brassicales, Caryophyllales, Celastrales, Cornales, Cucurbitales, Dilleniales, Dipsacales, Ericales, Escalloniales, Fabales, Gentianales, Geraniales, Gunnerales, Lamiales, Malpighiales, Malvales, Myrtales, Oxalidales, Picramniales, Proteales, Ranunculales, Rosales, Santalales, Sapindales, Solanales, Vitales and Zygophyllales, from 128 families and 1,199 genera. The number of georeferenced records per order is shown in Fig.
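As a quick consistency check, the species counts reported for the three worksheets can be summed and compared with the overall total of 14,992 endemic species given in the abstract. The dictionary layout below is just a convenient way of holding the published figures.

```python
# Species counts per worksheet, as reported in the text.
species_per_worksheet = {
    "basal angiosperms": 649,
    "monocots": 3854,
    "eudicots": 10489,
}

total_species = sum(species_per_worksheet.values())
print(total_species)  # prints 14992, matching the total stated in the abstract
```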
Data containing the geographic distribution of 649 species of basal angiosperms from 13 families.
Column label | Column description |
---|---|
family | The scientific name of the family in which the taxon is classified. |
genus | The scientific name of the genus in which the taxon is classified. |
specificEpithet | Scientific name. |
country | The country where the species occur. |
stateProvince | State of Brazil where species occur. |
municipality | Municipality of Brazil where species occur. |
decimalLatitude | The latitude component (N/S) of the coordinates of the municipality where the species occur, in decimal degrees. |
decimalLongitude | The longitude component (E/W) of the coordinates of the municipality where the species occur, in decimal degrees. |
Data containing the geographic distribution of 10,489 eudicots from 128 families.
Column label | Column description |
---|---|
family | The scientific name of the family in which the taxon is classified. |
genus | The scientific name of the genus in which the taxon is classified. |
specificEpithet | Scientific name. |
country | The country where the species occur. |
stateProvince | State of Brazil where species occur. |
municipality | Municipality of Brazil where species occur. |
decimalLatitude | The latitude component (N/S) of the coordinates of the municipality where the species occur, in decimal degrees. |
decimalLongitude | The longitude component (E/W) of the coordinates of the municipality where the species occur, in decimal degrees. |
Data containing the geographic distribution of 3,854 species of monocots from 32 families.
Column label | Column description |
---|---|
family | The scientific name of the family in which the taxon is classified. |
genus | The scientific name of the genus in which the taxon is classified. |
specificEpithet | Scientific name. |
country | The country where the species occur. |
stateProvince | State of Brazil where species occur. |
municipality | Municipality of Brazil where species occur. |
decimalLatitude | The latitude component (N/S) of the coordinates of the municipality where the species occur, in decimal degrees. |
decimalLongitude | The longitude component (E/W) of the coordinates of the municipality where the species occur, in decimal degrees. |
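Because all three worksheets share the same columns, simple summaries can be computed with the same code for each group. The sketch below counts distinct taxa per municipality using the column labels from the tables above. It assumes a worksheet has been exported to CSV (the export step and the function name are assumptions), and the rows in `SAMPLE_CSV` are invented placeholders rather than records from the dataset.

```python
import csv
import io
from collections import defaultdict


def species_per_municipality(rows):
    """Count distinct (genus, specificEpithet) pairs per (state, municipality)."""
    taxa = defaultdict(set)
    for row in rows:
        place = (row["stateProvince"], row["municipality"])
        taxa[place].add((row["genus"], row["specificEpithet"]))
    return {place: len(names) for place, names in taxa.items()}


# Placeholder rows using the documented column labels; not real data.
SAMPLE_CSV = """family,genus,specificEpithet,country,stateProvince,municipality,decimalLatitude,decimalLongitude
FamilyA,GenusA,epithet1,Brazil,StateX,TownP,-10.0,-50.0
FamilyA,GenusA,epithet1,Brazil,StateX,TownP,-10.0,-50.0
FamilyA,GenusA,epithet2,Brazil,StateX,TownP,-10.0,-50.0
FamilyB,GenusB,epithet3,Brazil,StateY,TownQ,-20.0,-45.0
"""

counts = species_per_municipality(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(counts[("StateX", "TownP")])  # prints 2
print(counts[("StateY", "TownQ")])  # prints 1
```

Keying on the state and municipality together avoids merging municipalities that share a name across different states.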
Despite the digitisation efforts of numerous museums and herbaria, data gaps remain. We strongly encourage and recommend that distributional data be correctly georeferenced in collections in order to increase the quality of the spatial data used in future analyses.
Due to the immeasurable importance of primary occurrence data and the difficulties in georeferencing inaccurate geographical distribution data, we recommend that collectors strive to prioritise and record exact coordinates for their collections (see discussion in
We would like to thank the scientific and technical teams of the Flora do Brasil and REFLORA. The work presented in this paper is part of a postdoctoral study conducted by the first author at the Jardim Botânico do Rio de Janeiro. The authors are grateful to the following Brazilian funding agencies: FAPERJ - Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro (2021) and CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico (2020) for the postdoctoral fellowship granted to JGS. RCF received a Research Productivity Fellowship from CNPq (proc. 303420/2016-2) and FAPERJ (processes n° E-26/202.778/2018) through Programa Cientista do Nosso Estado. This work was supported by funds from Natura. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
Janaína Gomes-da-Silva (conceived the presented idea, dataset preparation, dataset editing, manuscript writing, manuscript editing).
João Lanna (dataset editing).
Rafaela Campostrini Forzza (conceived the presented idea, supervised the findings of this work and the project, manuscript editing).