Literature-based occurrences data of marine species in Venezuela

Abstract

Background
Venezuela has suffered a severe academic and research management crisis and funding opportunities for marine research and data management have been practically absent. This has worsened over the past five years and, as a result, libraries and other institutional spaces have been repeatedly vandalised, with hundreds of records, specimens and historical data stolen, destroyed or burned. To avoid the loss of irreplaceable data on Venezuelan biodiversity, an initiative was promoted to digitise information and create a rich dataset of biodiversity records, with emphasis on the country's marine protected areas, as well as to fill gaps in the distribution and status of marine biodiversity in Venezuela. Nineteen (19) institutions in the country focusing on marine science have consistently produced a wealth of information about Venezuela's marine biodiversity in the form of specimen collections, unpublished sampled data and research theses through the work of hundreds of researchers and students. An inventory of available data sources at these national institutions was conducted under the National Biodiversity Data Mobilization Grant and the Biodiversity Information for Development Programme, with support from the Global Biodiversity Information Facility (GBIF). All recovered and processed datasets were published in the Ocean Biodiversity Information System (OBIS) and GBIF repositories.

New information
This occurrence data collection represents a major contribution to the marine biodiversity inventory in Venezuela. It is based on numerous published papers, reports, books and checklists provided by experts, covering a broad taxonomic collection from which we obtained species occurrences (presences and absences), organised into 59 datasets containing 40,881 records. This represents a 28.49% contribution to the records of Venezuelan marine biodiversity reported to the OBIS (143,513 records in the OBIS until November 2022). The extracted data showed 3,041 marine species, with representatives of each of the six kingdoms: Animalia, Chromista, Bacteria, Plantae, Fungi and Protozoa. The datasets provide information on occurrences since 1822, extending the temporal coverage of the species occurrence inventory for Venezuela, which began in 1879 before this project. The number of records for Venezuela increased by 41.3% compared with the data available before the project. Most of the occurrences (63.47%) were registered in Marine Protected Areas. The data collection included records of non-native species, descriptions of new species and species listed under different IUCN categories.


Introduction
Venezuela is amongst the top ten countries with the greatest biodiversity in the world (Aguilera et al. 2003, Grande 2018). However, due to the enormous impact of human activities, such as tourism, overexploitation of marine resources, physical alteration and pollution, marine environments are at great risk and their biodiversity is highly threatened (Miloslavich et al. 2003). Coastal area management involves assessing changes in the distribution and abundance of coastal and marine species. However, the Venezuelan Integrated Plan for Coastal Management (Plan de Ordenamiento y Gestión Integral de Zonas Costeras) reveals a lack of information related to biodiversity attributes and indicators, information that forms the basis for projecting risks and identifying actions to reduce coastal vulnerability (Minamb 2013, Peralta Brichtova 2021).

In addition, Venezuela is suffering a severe academic and research management crisis and funding opportunities for marine research and data management have been practically absent (Requena 2003, Requena 2012, Bull and Rosales 2020, Van Roekel and De Theije 2020, Garcia Zea 2020, Requena 2021). This has worsened over the past five years and, as a result, libraries and other institutional spaces have been repeatedly vandalised, with hundreds of records, specimens and historical data stolen, destroyed or burned. To preserve the information that will serve assessments, planning and management, an initiative for mobilising marine data was promoted by Fundación Caribe Sur. The "Rescuing the knowledge base of Venezuela's marine biodiversity" project, supported by the Global Biodiversity Information Facility (GBIF) and funded by the European Union via the Biodiversity Information for Development Programme (BID), identified and digitised the Venezuelan marine biodiversity data found in articles and grey literature stored in many national academic institutions.

This article summarises the rescued dataset collections derived from this project, which are hosted in the Ocean Biodiversity Information System (OBIS) and GBIF to date. The resulting data collection is composed of 59 datasets (occurrence and sampling events) with 40,881 records of marine organisms from a broad range of taxonomic categories registered within the Venezuelan maritime area, including some of its islands (Table 1).

Table 1. Data collection from Venezuela used in the compilation, including number of records and references from OBIS/GBIF. Columns: Dataset title; No. records; Resource type; Resource citation.

Design description: Fundación Caribe Sur, supported by the Global Biodiversity Information Facility (GBIF), carried out the project "Rescuing the knowledge base of Venezuela's marine biodiversity". This project convened researchers affiliated to seven national academic institutions and two NGOs (Universidad Simón Bolívar, Universidad Central de Venezuela, Universidad de Carabobo, Universidad de Oriente, Instituto Venezolano de Investigaciones Científicas, Universidad Nacional Experimental Francisco de Miranda, Universidad del Zulia, Fundación Museo del Mar - Museo Marino de Margarita, Fundación Caribe Sur) to safeguard as much as possible of the information on marine biodiversity that has been produced in the country. The project participants rescued data from most Venezuelan marine areas by digitising and mobilising the marine biodiversity information found in each of the national institutions mentioned above. Consequently, the project integrated national researchers into the community of contributors and users of georeferenced biodiversity data of Venezuelan marine environments.

Funding:
The resources to undertake this project were received from the European Union and GBIF under the National Biodiversity Data Mobilisation Grant and the Biodiversity Information for Development Programme (BID) implemented in the Caribbean region (led by GBIF), under Proyecto GBIF-Caribe Sur / ID: BID-CA2020-025-NAC "Rescate de la Data sobre Biodiversidad Marina en Venezuela".

Sampling methods
Sampling description: Data collection, curation and digitisation were performed by a team of 14 researchers affiliated to the most important universities, scientific research centres and NGOs that deal with marine science and marine management in Venezuela. The work contains literature-based sampling information on marine organism occurrences collected from institution libraries, from which theses, research project reports and journal publications were reviewed (Table 2) to obtain data on the taxonomic groups, location of occurrence, collection dates, measurements of habitat features (such as physical and chemical parameters of the environment), biotic measurements (e.g. body size, abundance and biomass) and details regarding the nature of the sampling or observation methods, equipment and sampling effort.

Geographic coverage
Description: The geographic coverage, extracted directly from the literature and checked for any misreported georeferences, spans the entire Venezuelan mainland coast and some of its islands, including diverse marine coastal habitats, such as coral reefs, mangroves, rocky shores, sandy beaches, seagrass beds, coastal lagoons, sandy bottoms, oceanic water column and sea floor. Most occurrences (63.45%) were registered within marine protected areas (MPAs), including seven national parks (Morrocoy, La Restinga, Archipiélago de Los Roques, San Esteban, Mochima, Médanos de Coro and Península de Paria) and four wildlife refuges (Cuare, Boca de Caño, Hueque-Sauca and Isla de Aves) (Table 3, Fig. 1). However, some MPAs (Laguna de Tacarigua, Turuépano and Mariusa) still show important gaps in their biodiversity records. This project recorded new occurrences from areas that traditionally lacked biodiversity information, such as the Orinoco Delta and Atlantic Front, the Paria Peninsula and coastal areas of the western Venezuelan states (Falcón and Zulia).

Taxonomic coverage
Description: The taxonomic structure of the Venezuelan marine biodiversity collection at the time of publication comprises a total of 30 phyla, belonging to the kingdoms Animalia (17), Chromista (6), Plantae (4), Bacteria (1), Fungi (1) and Protozoa (1) (Table 4). The total number of records identified at the species level was 34,615, representing 84.67% of all the records. The remaining 15.33% of the records were identified at the family and genus levels.

Notes:
The data records extracted from the literature include at least the year of collection. They span the period from 1822 to 2022 (Fig. 2). Most occurrences were registered from the 1960s onwards, with the largest number of documented records in the 2000s.

Description: The database provides information on observations since 1822, including a broad taxonomic range of marine organisms compiled from 59 datasets (Table 1) with a total of 40,881 records. Most datasets are structured using the Event Core schema with Occurrence and Extended Measurements or Facts (eMOF) extensions; therefore, they contain not only georeferenced occurrence records, but also sampling protocols and environmental and biotic measurements.
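The relationship between the Event Core and its two extensions can be sketched as follows. This is a minimal illustration, not the project's processing code: all rows are made-up placeholder values, the helper name `flatten` is hypothetical and the linkage assumes the usual arrangement in which occurrences and measurements point back to an event through `eventID`, and measurements optionally point to a single occurrence through `occurrenceID`.

```python
# Hypothetical, made-up rows illustrating an Event Core archive with
# Occurrence and eMOF extensions; none of these values come from the datasets.
events = [
    {"eventID": "ev-1", "eventDate": "1998-05-12",
     "decimalLatitude": 10.85, "decimalLongitude": -68.25},
]
occurrences = [
    {"occurrenceID": "occ-1", "eventID": "ev-1",
     "scientificName": "Example species A", "occurrenceStatus": "present"},
    {"occurrenceID": "occ-2", "eventID": "ev-1",
     "scientificName": "Example species B", "occurrenceStatus": "absent"},
]
emof = [
    # Environmental measurement attached to the event as a whole.
    {"eventID": "ev-1", "occurrenceID": "", "measurementType": "temperature",
     "measurementValue": "27.5", "measurementUnit": "degrees Celsius"},
    # Biotic measurement attached to one occurrence within the event.
    {"eventID": "ev-1", "occurrenceID": "occ-1", "measurementType": "abundance",
     "measurementValue": "12", "measurementUnit": "individuals"},
]


def flatten(events, occurrences, emof):
    """Join each occurrence to its event and to the measurements that apply to it."""
    events_by_id = {event["eventID"]: event for event in events}
    flat = []
    for occ in occurrences:
        event = events_by_id[occ["eventID"]]
        # Keep event-level measurements (empty occurrenceID) and those
        # recorded for this specific occurrence.
        measurements = [
            m for m in emof
            if m["eventID"] == occ["eventID"]
            and m["occurrenceID"] in ("", occ["occurrenceID"])
        ]
        flat.append({**event, **occ, "measurements": measurements})
    return flat


records = flatten(events, occurrences, emof)
```

Each flattened record then carries the event's date and coordinates, the taxon and its presence or absence, and both the environmental and the biotic measurements that apply to it.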

Column label    Column description
identifier    A related resource that is referenced, cited or otherwise pointed to by the described resource.
licence    A legal document giving official permission to do something with the resource.
basisOfRecord    The specific nature of the data record.
occurrenceID    An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique.
occurrenceStatus    A statement about the presence or absence of a Taxon at a Location.
eventDate    The date-time or interval during which an Event occurred. For occurrences, this is the date-time when the event was recorded. Not suitable for a time in a geological context.
year    The four-digit year in which the Event occurred, according to the Common Era Calendar.
scientificNameID    An identifier for the nomenclatural details of a scientific name.
scientificName    The full scientific name, with authorship and date information, if known. When forming part of an Identification, this should be the name in the lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the IdentificationQualifier term.
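The occurrenceID description above suggests building an identifier from other identifiers in the record when no persistent global identifier exists. A minimal sketch of one way to do this is given below; the function name `build_occurrence_id`, the choice of fields, the colon separator and the sample values are all assumptions for illustration and are not taken from the published datasets.

```python
def build_occurrence_id(record,
                        fields=("institutionCode", "collectionCode", "catalogNumber")):
    """Concatenate identifying fields into a single occurrenceID string.

    The field combination and the ':' separator are illustrative choices;
    any combination that is unique across the dataset would serve.
    """
    parts = [str(record.get(field, "")).strip() for field in fields]
    if not all(parts):
        missing = [f for f, p in zip(fields, parts) if not p]
        raise ValueError(f"cannot build occurrenceID, missing: {missing}")
    return ":".join(parts)


# Hypothetical record with placeholder codes.
example = {"institutionCode": "XYZ", "collectionCode": "MAR", "catalogNumber": "0001"}
example_id = build_occurrence_id(example)
```

Raising an error on missing parts, rather than silently emitting a shorter identifier, is a deliberate choice here: two incomplete records could otherwise collapse onto the same occurrenceID.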

Additional information
Some of the datasets compiled for this project have additional columns (Table 6).

Table 6. Additional columns present in some of the datasets compiled.

Column label    Column description
scientificNameAuthorship    The authorship information for the scientificName formatted according to the conventions of the applicable nomenclaturalCode.
institutionCode    The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.
collectionCode    The name, acronym, coden or initialism identifying the collection or dataset from which the record was derived.
catalogNumber    An identifier (preferably unique) for the record within the dataset or collection.
recordedBy    A list (concatenated and separated) of names of people, groups or organisations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first.

Figure 1. Location and aggregation of occurrences reported in this work for the Venezuelan coast and its islands. Dark blue regions represent MPAs: A Ciénaga de los Olivitos National Park; B Médanos de Coro National Park, Laguna Boca de Caño Wildlife Refuge and Hueque-Sauca Wildlife Reserve; C Morrocoy and San Esteban National Parks, Cuare Wildlife Refuge; D Laguna de Tacarigua National Park; E Mochima National Park; F Laguna de La Restinga National Park; G Archipiélago de Los Roques National Park; H Península de Paria and Turuépano National Parks; I Mariusa National Park and Delta del Orinoco Biosphere Reserve.

Figure 2. Historical series for Venezuelan marine species occurrences.

Column label    Column description
kingdom    The full scientific name of the kingdom in which the taxon is classified.
taxonRank    The taxonomic rank of the most specific name in the scientificName.
decimalLatitude    The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.
decimalLongitude    The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.
language    A language of the resource.
waterBody    The name of the water body in which the Location occurs.
country    The name of the country or major administrative unit in which the Location occurs.
countryCode    The standard code for the country in which the Location occurs.
datasetName    The name identifying the data set from which the record was derived.
phylum    The full scientific name of the phylum or division in which the taxon is classified.
class    The full scientific name of the class in which the taxon is classified.
order    The full scientific name of the order in which the taxon is classified.
family    The full scientific name of the family in which the taxon is classified.
genus    The full scientific name of the genus in which the taxon is classified.
genericName    The genus part of the scientificName without authorship.
specificEpithet    The name of the first or species epithet of the scientificName.
continent    The name of the continent in which the Location occurs.
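The legal ranges given for decimalLatitude and decimalLongitude lend themselves to a simple automated check. The sketch below is only an illustration of that idea and not the georeference review actually applied in the project; the function names are hypothetical, and the hemisphere test is a heuristic resting on the assumption that all Venezuelan records should fall north of the Equator and west of the Greenwich Meridian.

```python
def valid_coordinates(lat, lon):
    """Return True if both values parse as numbers within the legal ranges."""
    try:
        lat = float(lat)
        lon = float(lon)
    except (TypeError, ValueError):
        return False
    # NaN fails both comparisons, so it is rejected as well.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def plausible_hemisphere(lat, lon):
    """Heuristic: flag records outside the northern/western hemispheres.

    A positive longitude or negative latitude in a Venezuelan record often
    points to a dropped sign or swapped columns (assumption, not a rule
    taken from the source).
    """
    return valid_coordinates(lat, lon) and float(lat) > 0 and float(lon) < 0


# Made-up coordinates for illustration.
checks = [
    valid_coordinates(10.5, -66.9),     # within the legal ranges
    valid_coordinates(95, 0),           # latitude out of range
    plausible_hemisphere(10.5, 66.9),   # longitude sign likely dropped
]
```

Records failing the first check are invalid under the column definitions; records failing only the second are legal but worth reviewing against the original publication.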

Column label    Column description
individualCount    The number of individuals present at the time of the Occurrence.
lifeStage    The age class or life stage of the Organism(s) at the time the Occurrence was recorded.
preparations    A preparation or preservation method for a specimen.
disposition    The current state of a specimen with respect to the collection identified in collectionCode or collectionID.
associatedReferences    A list (concatenated and separated) of identifiers (publication, bibliographic reference, global unique identifier, URI) of literature associated with the Occurrence.
associatedTaxa    A list (concatenated and separated) of identifiers or names of taxa and the associations of this Occurrence to each of them.
occurrenceRemarks    Comments or notes about the Occurrence.
organismRemarks    Comments or notes about the Organism instance.
eventID    An identifier for the set of information associated with an Event (something that occurs at a place and time). May be a global unique identifier or an identifier specific to the data set.
parentEventID    An identifier for the broader Event that groups this and potentially other Events.
eventTime    The time or interval during which an Event occurred.
month    The integer month in which the Event occurred.
day    The integer day of the month on which the Event occurred.
verbatimEventDate    The verbatim original representation of the date and time information for an Event.

Table 2. Sampling information source.

Table 4. Number of records by phylum represented in this collection.

The data included records of non-native species, new species descriptions (Spongicola liosomatus, Haplophragmoides venezuelanus and Neopateorislopsis chichirivensis) and 78 species listed under different Threatened and Near Threatened IUCN categories (IUCN 2021): five species Critically Endangered (CR), nine Endangered (EN), 43 Vulnerable (VU) and 21 Near Threatened (NT) (Table 5).

Table 5. Considering the IUCN Red List categories.