Biodiversity Data Journal :
Data Paper (Biosciences)
Corresponding author: Iwan Le Berre (iwan.leberre@univ-brest.fr), Jean-Luc Jung (jean-luc.jung@mnhn.fr)
Academic editor: Vesela Evtimova
Received: 22 May 2021 | Accepted: 07 Jul 2021 | Published: 22 Jul 2021
© 2021 Lorraine Coché, Elie Arnaud, Laurent Bouveret, Romain David, Eric Foulquier, Nadège Gandilhon, Etienne Jeannesson, Yvan Le Bras, Emilie Lerigoleur, Pascal Jean Lopez, Bénédicte Madon, Julien Sananikone, Maxime Sèbe, Iwan Le Berre, Jean-Luc Jung
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Coché L, Arnaud E, Bouveret L, David R, Foulquier E, Gandilhon N, Jeannesson E, Le Bras Y, Lerigoleur E, Lopez PJ, Madon B, Sananikone J, Sèbe M, Le Berre I, Jung J-L (2021) Kakila database: Towards a FAIR community approved database of cetacean presence in the waters of the Guadeloupe Archipelago, based on citizen science. Biodiversity Data Journal 9: e69022. https://doi.org/10.3897/BDJ.9.e69022
In the French West Indies, more than 20 species of cetaceans have been observed over the last decades. The recognition of this hotspot of marine mammal biodiversity, observed in the French Exclusive Economic Zone of the West Indies, motivated the French government to create in 2010 a marine protected area (MPA) dedicated to the conservation of marine mammals: the Agoa Sanctuary. Threats that cetacean populations face are multiple, but well-documented. Cetacean conservation can only be achieved if relevant and reliable data are available, starting with occurrence data. In the Guadeloupe Archipelago, in addition to some data collected by the Agoa Sanctuary, occurrence data are mainly available through the contribution of citizen science and of local stakeholders (i.e. non-profit organisations (NPO) and whale-watchers). However, no observation network has been coordinated and no standards exist for cetacean presence data collection and management.
In recent years, several whale watchers and NPOs regularly collected cetacean observation data around the Guadeloupe Archipelago. Our objective was to gather datasets from three Guadeloupean whale watchers, two NPOs and the Agoa Sanctuary, all of which agreed to share their data. These heterogeneous data went through a careful process of curation and standardisation in order to create a new extended database, using a newly-designed metadata set. This aggregated dataset contains a total of 4,704 records of 21 species collected in the Guadeloupe Archipelago from 2000 to 2019. The database was called Kakila ("who is there?" in Guadeloupean Creole). The Kakila database was developed following the FAIR principles with the ultimate objective of ensuring sustainability. All these data were transferred into the PNDB repository (Pôle National de Données de Biodiversité, Biodiversity French Data Hub, https://www.pndb.fr).
In the Agoa Sanctuary and surrounding waters, marine mammals have to interact with increasing anthropogenic pressure from growing human activities. In this context, the Kakila database fulfils the need for an organised system to structure marine mammal occurrences collected by multiple local stakeholders with a common objective: contribute to the knowledge and conservation of cetaceans living in the French Antilles waters. Much needed data analysis will enable us to identify high cetacean presence areas, to document the presence of rarer species and to determine areas of possible negative interactions with anthropogenic activities.
cetaceans, citizen science, observation, database, FAIR, French West Indies
Roughly 40% of the world’s human population live within 100 km of a coast*
The Guadeloupe Archipelago is a hotspot of marine biodiversity where understanding the interactions between cetaceans and human activities is essential. It has also led the French government to create a marine protected area dedicated to marine mammals within the French Exclusive Economic Zone of the West Indies: the Agoa Sanctuary. However, adequate cetacean conservation can only be achieved if relevant and reliable data are available. In the Guadeloupe Archipelago, besides a PhD thesis (
This data paper presents the process of structuring heterogeneous multi-source data in order to build a robust and standardised database of cetacean observations around the Guadeloupe Archipelago (Fig.
Area of study. Perimeter of the Agoa Sanctuary, which corresponds to the French Economic Zone in the West Indies, and location of the Guadeloupe Archipelago (data sources: map base, http://www.caribbeanmarineatlas.net; Agoa protection zone, https://inpn.mnhn.fr).
The FAIRification process of the Kakila database (Table
FAIR principles ( |
FAIRness assessment criteria used for the Kakila database |
FINDABLE |
- Using unique identifiers for each observation occurrence, observer, boat excursion, taxon, collecting organisation and geographic sector.
- Making persistent metadata and datasets thanks to the deposit to the French Pôle National de données de Biodiversité (PNDB, https://www.pndb.fr/) which is a national infrastructure data repository.
- Providing a data dictionary to guarantee the reusability of the database.
- Using the Ecological Metadata Language (EML) internationally recognised standard to describe the database metadata and its associated projects, including standardised search keywords.
- Using a metadata format validator thanks to the MetaShARK (
- Using a versioning system to allow future updates.
- Generating a Darwin Core Archive from the Kakila database. The Darwin Core Standard (DwC) offers a stable, straightforward and flexible framework for compiling biodiversity data, notably occurrences, from varied and variable sources ( |
ACCESSIBLE |
- Storing data in the PNDB repository with respect to the guidelines for quality standards (e.g. use of EML).
- Efficient and rich services for various uses and users provided by the PNDB.
- Working to adapt the Kakila database in order to integrate it in the GBIF. |
INTEROPERABLE |
- Using standard vocabularies for some fields (e.g. Beaufort Wind Scale for the wind speed).
- Using keywords of international thesaurus, such as GEMET/INSPIRE (
- Using a data dictionary including the Darwin Core mapping.
- Associating a Darwin Core archive with the Kakila database. The Darwin Core Standard (DwC) offers a stable, straightforward and flexible framework for compiling biodiversity data from varied and variable sources ( |
REUSABLE |
- Using an open format for the dataset (Tab Separated Values .tsv and OpenDocument .ods for the original database) and open source software to reuse it.
- Including in the EML metadata the provenance for raw and derived data.
- Explaining in this data paper the data processing steps, the data curation protocol, the data quality assurance processes, the methods and tools that permit long term integrity and understandability of data.
- Using a time range clearly mentioned in the EML metadata and in this data paper. The same applies for geographical and taxonomic coverages and the CC-BY licence and rules for large reuse.
- Using a Darwin Core Archive to facilitate the reusability of the Kakila database, because it enables the publication into the GBIF. This compact package (a ZIP file) contains interconnected text files and enables users to share their data using common terminology. |
List of taxa recorded between 2000 and 2020 from the Guadeloupean Archipelago.
Rank of the taxa identified | Family | Scientific name | Common name (in French) | Common name (in English) | code_taxref |
---|---|---|---|---|---|
Infraorder | | Cetacea | Cétacés | Cetaceans | 186224 |
Family | Balaenopteridae | Balaenopteridae | Balénoptéridés - rorquals | Rorquals | 186226 |
Family | Delphinidae | Delphinidae | Delphinidés | Oceanic dolphins | 186227 |
Family | Kogiidae | Kogiidae | Kogiidés - petits cachalots | Kogidae | 351415 |
Family | Physeteridae | Physeteridae | Physétéridés - cachalots | Sperm whales | 186231 |
Family | Ziphiidae | Ziphiidae | Ziphiidés - Hyperoodontidés | Beaked whales | 186232 |
Species | Balaenopteridae | Balaenoptera acutorostrata | Petit Rorqual | Minke whale | 60856 |
Species | Balaenopteridae | Balaenoptera physalus | Rorqual commun | Fin whale | 60861 |
Species | Balaenopteridae | Megaptera novaeangliae | Baleine à bosse | Humpback whale | 60867 |
Species | Balaenopteridae | Balaenoptera edeni | Rorqual de Bryde | Bryde’s whale | 60860 |
Species | Delphinidae | Feresa attenuata | Orque naine ou pygmée | Pygmy killer whale | 60883 |
Species | Delphinidae | Globicephala macrorhynchus | Globicéphale tropical | Short-finned pilot whale | 60887 |
Species | Delphinidae | Lagenodelphis hosei | Dauphin de Fraser | Fraser’s dolphin | 60897 |
Species | Delphinidae | Orcinus orca | Orque Epaulard | Killer whale, Orca | 60905 |
Species | Delphinidae | Peponocephala electra | Péponocéphale ou Dauphin d'Electre | Melon-headed whale, Electra dolphin | 60908 |
Species | Delphinidae | Pseudorca crassidens | Pseudorque | False killer whale | 60911 |
Species | Delphinidae | Stenella coeruleoalba | Dauphin bleu et blanc | Striped dolphin | 60918 |
Species | Delphinidae | Stenella attenuata | Dauphin tacheté pantropical | Pantropical spotted dolphin | 60914 |
Species | Delphinidae | Stenella clymene | Dauphin de Clymène | Clymene dolphin | 60917 |
Species | Delphinidae | Stenella frontalis | Dauphin tacheté de l'Atlantique | Atlantic spotted dolphin | 60921 |
Species | Delphinidae | Stenella longirostris | Dauphin à long bec | Spinner dolphin | 60916 |
Species | Delphinidae | Steno bredanensis | Steno rostré | Rough-toothed dolphin | 60924 |
Species | Delphinidae | Tursiops truncatus | Grand dauphin | Bottlenose dolphin | 60927 |
Species | Kogiidae | Kogia sima | Cachalot nain | Dwarf sperm whale | 79307 |
Species | Physeteridae | Physeter macrocephalus | Grand cachalot | Sperm whale | 60949 |
Species | Ziphiidae | Mesoplodon europaeus | Baleine à bec de Gervais | Gervais’ beaked whale | 60962 |
Species | Ziphiidae | Ziphius cavirostris | Baleine à bec de Cuvier | Cuvier’s beaked whale | 60970 |
Deposit to national and international aggregators. In order to allow a wide dissemination and to improve its accessibility, the Kakila database content has been deposited in the PNDB (Pôle National de Données de Biodiversité, Biodiversity French Data Hub, https://www.pndb.fr) infrastructure data repository. In accordance with DataOne network guidelines, data were structured using rich metadata thanks to the use of the Ecological Metadata Language (EML) v.2.2.0 (
The data were collected around the Guadeloupe Archipelago (Fig.
Sampling consisted, in a first phase, in conducting a preliminary survey of the different NPOs and professional whale-watchers known to record cetacean observation data around the Guadeloupean Archipelago and whose expertise was previously recognised: for example, co-authorship of scientific publications (
Data description: the data consisted in marine mammal species observations collected during daily-boat excursions related to citizen science data acquisition or related to tourism (whale watching) (Figs
Data dictionary - metadata repository - of the Kakila DB. Datasets and Column labels are also presented in the "Data resources" part. The Darwin core data standards are described in
Datasets and Column labels | Definition | Data type | Darwin Core term code | Darwin Core term definition |
---|---|---|---|---|
Dataset "sortie" (Trip) | ||||
code_sortie | Code of the boat trip carried out by an organisation and reported by an observer | Text | eventID | An identifier for the set of information associated with an Event (something that occurs at a place and time). May be a global unique identifier or an identifier specific to the data set. |
date_sortie | Date of the trip. | Date | eventDate | The date-time or interval during which an Event occurred. For occurrences, this is the date-time when the event was recorded. |
code_observateur | Observer Code | Text | ||
heure_depart | Departure time of the trip. | Hour | ||
heure_retour | Return time of the trip. | Hour | ||
duree_sortie | Duration of the trip. | Numeric | ||
etat_mer | Sea state. Parameter value estimated by the observer using the Douglas Scale. | Text | fieldNotes | One of a) an indicator of the existence of, b) a reference to (publication, URI), or c) the text of notes taken in the field about the Event. |
visibilite | Horizontal visibility. Category specifying the maximum distance at which an observer can see and identify an object located close to the horizontal plane on which the observer is located (good - average - bad). | Text | ||
code_vent_beaufort | Wind force estimated by the observer using the Beaufort Scale from 0 to 12 (value or interval). | Numeric | ||
vent_classe | Wind force estimated by the observer classified in 4 classes (no-wind – light wind – moderate wind – strong wind). | Text | ||
sortie_positive | Code 1 if at least one marine mammal was observed and 0 if none was observed during the trip. | Numeric | ||
commentaire_sortie | Miscellaneous comment associated with the boat trip. | Text | eventRemarks | Comments or notes about the Event. |
Dataset "observateur" (observer) | ||||
code_observateur | Observer Code | Text | ||
code_organisme | Code of the organisation having carried out the trip | Text | ||
expertise_observateur | Level of expertise of the observer (beginner, intermediate, expert). The level of expertise is determined on the basis of the number of years of experience with regard to the identification of cetaceans. | Text | identificationRemarks | Comments or notes about the Identification. |
Dataset "observation" (observation) | ||||
code_observation | Observation code combining the code_sortie and an observation number | Text | occurrenceID | An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique. |
code_sortie | Code of the boat trip carried out by an organisation and reported by an observer | Text | eventID | An identifier for the set of information associated with an Event (something that occurs at a place and time). May be a global unique identifier or an identifier specific to the data set. |
code_observateur | Observer Code | Text | ||
code_secteur_geog | Code of the observation site as the initials of the location (city, bay, ...) closest to the observation | Text | ||
latitude | Latitude of the observation expressed in decimal degrees. | Numeric | decimalLatitude | Geographic Latitude (in decimal degree, using the spatial reference system in "Reference system") |
longitude | Longitude of the observation expressed in decimal degrees. | Numeric | decimalLongitude | Geographic Longitude (in decimal degree, using the spatial reference system in "Reference system") |
profondeur | Sea depth at the place of the observation expressed in metres from the surface. It was estimated either from a GPS sonar from the boat or by a calculation from the digital terrain model of the French Antilles available on shom.fr (source: SHOM, France). The method is specified in the comment field. | Numeric | minimumDepthInMetres | The lesser depth of a range of depth below the local surface, in meters. |
heure_observation | Observation time. | Hour | eventTime | The time or interval during which an Event occurred. |
code_taxon | Internal code assigned to the taxon identified | Text | ||
nombre_minimum | Observer's estimation of the minimum number of individuals observed (can be equal to nombre_maximum if the number of individuals has been precisely determined). | Numeric | individualCount | The number of individuals represented present at the time of the Occurrence. |
nombre_maximum | Observer's estimation of the maximum number of individuals observed (can be equal to nombre_minimum if the number of individuals has been precisely determined). | Numeric | ||
presence_juvenile | Presence (1) or absence (0) of juveniles at the time of observation. | Numeric | occurrenceRemarks | Comments or notes about the Occurrence. |
nombre_juvenile | Observer’s estimation of the number of juveniles (to be completed only if presence_juvenile = 1). | Numeric | occurrenceRemarks | Comments or notes about the Occurrence. |
preuve_visuelle | Visual evidence of observation (photography) (1) or lack of visual evidence (0). This is particularly important in the case of observers described as "beginners". | Numeric | ||
commentaire_observation | Miscellaneous comments made by the observer on the observation. | Text | occurrenceRemarks | Comments or notes about the Occurrence. |
Dataset "organisme" (organisation) | ||||
code_organisme | Code of the organisation having carried out the trip | Text | ||
nom_organisme | Name of the organisation responsible for the management of reported observation data. | Text | recordedBy | A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first. |
acronyme_organisme | Acronym of the organisation. | Text | ownerInstitutionCode | The name (or acronym) in use by the institution having ownership of the object(s) or information referred to in the record. |
activite_organisme | Type of activities carried out by the organisation. | Text | ||
Dataset "secteur_geog" (observation site) | ||||
code_secteur_geog | Code of the observation site as the initials of the location (city, bay, ...) closest to the observation | Text | ||
nom_secteur_geog | Name of the observation site as the name of the location (city, bay, ...) closest to the observation. | Text | locationID | An identifier for the set of location information (data associated with dcterms:Location). May be a global unique identifier or an identifier specific to the data set. |
Dataset "taxon" (taxon) | ||||
code_taxon | Internal code assigned to the taxon identified | Text | ||
taxon_rang | Taxonomic rank of the taxon identified. | Text | taxonRank | Taxonomic rank of the taxon identified, using the Taxonomic Rank GBIF Vocabulary |
taxon_famille | Family of the taxon observed. | Text | family | The full scientific name of the family in which the taxon is classified. |
taxon_nom_usage | Common name of the taxon identified. | Text | originalNameUsage | The taxon name, with authorship and date information if known, as it originally appeared when first established under the rules of the associated nomenclaturalCode. The basionym (botany) or basonym (bacteriology) of the scientificName or the senior/earlier homonym for replaced names. |
taxon_nom_scientifique | Scientific name of the taxon identified in the form "genus species". | Text | scientificName | The full scientific name, with authorship and date information if known. When forming part of an Identification, this should be the name in lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the IdentificationQualifier term. |
code_taxref | Code CD_REF of the taxonomic base TAXREF v.14.0 (2020-12-15). | Numeric | ||
code_espece_omm_gde_cca | Internal code used by the different observation bodies (OMMAG, Guadeloupe Evasion Découverte, Cétacés Caraïbes) to describe the species observed. | Text | ||
code_espece_ema | Internal code used by Aventures Marines Company to describe the species observed. | Text | ||
code_espece_agoa | Internal code used by the Agoa Sanctuary to describe the species observed. | Text | ||
uri_taxref | URI designating the taxon on the INPN site composed of a fixed URL "https://inpn.mnhn.fr/espece/cd_nom/" followed by the TAXREF code | Text | taxonID | An identifier for the set of taxon information (data associated with the Taxon class). May be a global unique identifier or an identifier specific to the data set. |
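The Darwin Core mapping in the data dictionary above can be applied mechanically when exporting records. The following is a minimal sketch, assuming records of the "observation" table are available as Python dictionaries keyed by column label. The function name `to_darwin_core` and the example record are hypothetical, and only the one-to-one terms of the "observation" table are included, because several Kakila columns share the term occurrenceRemarks and would need to be merged by a rule that the dictionary does not specify.

```python
# Subset of the Kakila -> Darwin Core mapping listed in the data dictionary
# (one-to-one terms of the "observation" table only).
KAKILA_TO_DWC = {
    "code_observation": "occurrenceID",
    "code_sortie": "eventID",
    "latitude": "decimalLatitude",
    "longitude": "decimalLongitude",
    "profondeur": "minimumDepthInMetres",
    "heure_observation": "eventTime",
    "nombre_minimum": "individualCount",
}


def to_darwin_core(record):
    """Return a new dict keeping only mapped columns, renamed to DwC terms."""
    return {
        dwc_term: record[kakila_col]
        for kakila_col, dwc_term in KAKILA_TO_DWC.items()
        if kakila_col in record
    }


# Hypothetical record, invented for illustration (not taken from Kakila).
example = {
    "code_observation": "S0001_1",
    "code_sortie": "S0001",
    "latitude": 16.2,
    "longitude": -61.8,
    "code_taxon": "T01",
}
print(to_darwin_core(example))
```

Columns without a Darwin Core term in the dictionary, such as code_taxon, are dropped by this sketch rather than guessed at.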
An effort to centralise and harmonise siloed data was made by controlling the join keys (e.g. "code_observation", "code_sortie" etc.) between linked tables using dynamic pivot tables. Content quality controls were also used, such as controlled dropdown menus for many fields to avoid potential input errors. Geolocations, often transformed into decimal degrees, were verified using the Geographic Information System software QGIS 3.10 (long-term release).
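The original coordinate formats are not detailed here. As an illustration of the kind of transformation involved, the sketch below assumes positions noted as degrees, minutes and seconds with a hemisphere letter; the function name `dms_to_decimal` and the example coordinates are hypothetical and are not taken from the Kakila database.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere (N, S, E, W) to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal degrees.
    return -value if hemisphere in ("S", "W") else value


# Illustrative coordinates only (roughly in the Guadeloupe area), not Kakila data.
lat = dms_to_decimal(16, 15, 0, "N")
lon = dms_to_decimal(61, 30, 0, "W")
print(lat, lon)
```

West longitudes come out negative, which is the convention expected by the Darwin Core terms decimalLatitude and decimalLongitude.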
In addition, data were checked for errors: 10% of the entries were randomly selected and checked by two people. One person carried out the random draw from the “observation” table and the other checked the selected lines in the database against the original datasets provided by the data owners. A data entry was invalidated if it contained an error in any field. The error rate, calculated as the number of data entries containing an error divided by the total number of checked data entries, was estimated at 0.073 in the Kakila database.
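The check described above amounts to a random draw followed by a simple proportion. The sketch below is one possible way to express it, not the procedure actually used: the function names, the fixed seed and the toy identifiers are assumptions made for illustration.

```python
import random


def draw_sample(record_ids, fraction=0.10, seed=None):
    """Randomly select a fraction of record identifiers, without replacement."""
    rng = random.Random(seed)
    sample_size = round(len(record_ids) * fraction)
    return rng.sample(list(record_ids), sample_size)


def error_rate(checked):
    """Share of checked entries flagged as erroneous.

    `checked` maps a record identifier to True if any field of that entry
    disagreed with the original dataset, and to False otherwise.
    """
    if not checked:
        raise ValueError("no entries were checked")
    return sum(checked.values()) / len(checked)


# Toy run with invented identifiers and invented check results.
ids = [f"obs_{i}" for i in range(1, 201)]
sample = draw_sample(ids, seed=42)
flags = {rec_id: False for rec_id in sample}
flags[sample[0]] = True  # pretend the second operator found one bad entry
print(len(sample), error_rate(flags))
```

An entry counts as erroneous as soon as one field is wrong, so the rate is computed per entry and not per field.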
The structure of the Kakila database was based on the original structures of the datasets and on the functional dependencies between the data. New fields of the Kakila database were defined and approved by the data providers. Then a data dictionary was defined (Table
The overall structure of the Kakila database was then designed to allow the establishment of relationships between the variables within the database. Kakila contains six main tables (Fig.
- The table “observateur” (observer) lists the volunteers and whale watchers who made the observations, together with a level of expertise (from beginner "débutant" to expert "expert") for each of them.
- The table "organisme" (organisation) lists the data providers, NPOs and whale watchers.
- The table "sortie" (field trip) lists the field trips recorded in the Kakila database (n = 3249), and contains information on the date and duration of trips, observer(s) on board, sea state and visibility.
- The table "observation" (observation) lists the observations of marine mammal species recorded during the corresponding field trip. Place and time of the observation are recorded, as well as the taxon identified (see table "taxon") and the number of individuals observed. The availability of a picture for the observation is stated.
- The table "taxon" (taxa) lists the marine mammal taxa recorded (e.g. species, genus, family ...), including scientific and common names, as well as the TAXREF code.
- The table "secteur_geog" (geographical place) lists the geographical area that observers used to localise their observation in preference to GPS data. The geographical areas were defined using the initials of the name of the closest town or locality on the sea coast and the direction between the observation site and the locality.
The relationship of the six tables is defined by the primary/foreign key fields “code_observateur” (present in tables "observateur" and "sortie"), “code_sortie” (in tables "sortie" and "observation"), “code_taxon” (in tables "observation" and "taxon"), "code_organisme" (in tables "organisme" and "observateur") and "code_secteur_geog" (in tables "observation" and "secteur_geog") (Fig.
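The key relationships listed above can be illustrated with a small relational sketch. Kakila itself is distributed as .tsv and .ods files, so the use of SQLite here, the column types and all inserted rows are assumptions made purely for illustration; only the table names, column labels and key relationships come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organisme (code_organisme TEXT PRIMARY KEY, nom_organisme TEXT);
CREATE TABLE observateur (
    code_observateur TEXT PRIMARY KEY,
    code_organisme TEXT REFERENCES organisme(code_organisme)
);
CREATE TABLE sortie (
    code_sortie TEXT PRIMARY KEY,
    date_sortie TEXT,
    code_observateur TEXT REFERENCES observateur(code_observateur)
);
CREATE TABLE taxon (code_taxon TEXT PRIMARY KEY, taxon_nom_scientifique TEXT);
CREATE TABLE secteur_geog (code_secteur_geog TEXT PRIMARY KEY, nom_secteur_geog TEXT);
CREATE TABLE observation (
    code_observation TEXT PRIMARY KEY,
    code_sortie TEXT REFERENCES sortie(code_sortie),
    code_taxon TEXT REFERENCES taxon(code_taxon),
    code_secteur_geog TEXT REFERENCES secteur_geog(code_secteur_geog)
);
-- Invented rows, for illustration only.
INSERT INTO organisme VALUES ('ORG1', 'Example organisation');
INSERT INTO observateur VALUES ('OBS1', 'ORG1');
INSERT INTO sortie VALUES ('S1', '2019-03-01', 'OBS1');
INSERT INTO taxon VALUES ('T1', 'Megaptera novaeangliae');
INSERT INTO secteur_geog VALUES ('SG1', 'Example bay');
INSERT INTO observation VALUES ('S1_1', 'S1', 'T1', 'SG1');
""")

# Follow each key from an observation out to the five other tables.
rows = conn.execute("""
SELECT o.code_observation, s.date_sortie, t.taxon_nom_scientifique,
       g.nom_secteur_geog, org.nom_organisme
FROM observation AS o
JOIN sortie AS s ON s.code_sortie = o.code_sortie
JOIN observateur AS ob ON ob.code_observateur = s.code_observateur
JOIN organisme AS org ON org.code_organisme = ob.code_organisme
JOIN taxon AS t ON t.code_taxon = o.code_taxon
JOIN secteur_geog AS g ON g.code_secteur_geog = o.code_secteur_geog
""").fetchall()
print(rows)
```

Starting from one observation, the joins recover the trip date, the taxon, the geographic sector and, through the observer, the organisation that carried out the trip.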
Our study focuses on the coastal waters surrounding the Guadeloupean Archipelago (Fig.
Whenever possible, each observation included a taxonomic identification at the species level. Twenty-one species of cetaceans have been observed and identified. Some observations did not allow us to identify the species; in these cases, the identification was done at the family level or at the infraorder level (Table
Rank | Scientific Name | Common Name |
---|---|---|
infraorder | Cetacea | Cetaceans |
Data came from different observation bodies, each with its own period of time. Data were collected between 2012 and 2019 for OMMAG, in 2019 for Cétacés Caraïbes, between 2017 and 2019 for GED, between 2012 and 2016 for Aventures Marines, between 2007 and 2011 for BREACH, between 2012 and 2016 for Agoa and in 2000 for the IFAW survey.
Data are shared under a CC-BY 4.0 licence
Content of BDD_Kakila_v2_20210221_sortie.tsv
Column label | Column description |
---|---|
code_sortie | Code of the boat trip carried out by an organisation and reported by an observer. |
date_sortie | Date of the trip. |
code_observateur | Observer Code. |
heure_depart | Departure time of the trip. |
heure_retour | Return time of the trip. |
duree_sortie | Duration of the trip. |
etat_mer | Sea state. Parameter value estimated by the observer using the Douglas Scale. |
visibilite | Horizontal visibility. Category specifying the maximum distance at which an observer can see and identify an object located close to the horizontal plane on which the observer is located (good - average - bad). |
code_vent_beaufort | Wind force estimated by the observer using the Beaufort Scale from 0 to 12 (value or interval). |
vent_classe | Wind force estimated by the observer classified in 4 classes (no-wind – light wind – moderate wind – strong wind). |
sortie_positive | Code 1 if at least one marine mammal was observed and 0 if none was observed during the trip. |
commentaire_sortie | Comments or notes about the Event. |
Content of BDD_Kakila_v2_20210221_observation.tsv
Column label | Column description |
---|---|
code_observation | Observation code combining the code_sortie and an observation number. |
code_sortie | Code of the boat trip carried out by an organisation and reported by an observer. |
code_observateur | Observer Code. |
code_secteur_geog | Code of the observation site as the initials of the location (city, bay, ...) closest to the observation. |
latitude | Latitude of the observation expressed in decimal degrees. |
longitude | Longitude of the observation expressed in decimal degrees. |
profondeur | Sea depth at the place of the observation expressed in metres from the surface. It was estimated either from a GPS sonar from the boat or by a calculation from the digital terrain model of the French Antilles available on shom.fr (source: SHOM, France). The method is specified in the comment field. |
heure_observation | Observation time. |
code_taxon | Internal code assigned to the taxon identified. |
nombre_minimum | Observer's estimation of the minimum number of individuals observed (can be equal to nombre_maximum if the number of individuals has been precisely determined). |
nombre_maximum | Observer's estimation of the maximum number of individuals observed (can be equal to nombre_minimum if the number of individuals has been precisely determined). |
presence_juvenile | Presence (1) or absence (0) of juveniles at the time of observation. |
nombre_juvenile | Observer’s estimation of the number of juveniles (to be completed only if presence_juvenile = 1). |
preuve_visuelle | Visual evidence of observation (photography) (1) or lack of visual evidence (0). This is particularly important in the case of observers described as "beginners". |
commentaire_observation | Miscellaneous comments made by the observer on the observation. |
Content of BDD_Kakila_v2_20210221_organisme.tsv
Column label | Column description |
---|---|
code_organisme | Code of the organisation having carried out the trip. |
nom_organisme | Name of the organisation responsible for the management of reported observation data. |
acronyme_organisme | Acronym of the organisation. |
activite_organisme | Type of activities carried out by the organisation. |
Content of BDD_Kakila_v2_20210221_secteur_geog.tsv
Column label | Column description |
---|---|
code_secteur_geog | Code of the observation site as the initials of the location (city, bay, ...) closest to the observation. |
nom_secteur_geog | Name of the observation site as the name of the location (city, bay, ...) closest to the observation. |
Content of BDD_Kakila_v2_20210221_observateur.tsv
Column label | Column description |
---|---|
code_observateur | Observer Code. |
code_organisme | Code of the organisation having carried out the trip. |
expertise_observateur | Level of expertise of the observer (beginner, intermediate, expert). The level of expertise is determined on the basis of the number of years of experience with regard to the identification of cetaceans. |
Content of BDD_Kakila_v2_20210221_taxon.tsv
Column label | Column description |
---|---|
code_taxon | Internal code assigned to the taxon identified. |
taxon_rang | Taxonomic rank of the taxon identified. |
taxon_famille | Family of the taxon observed. |
taxon_nom_usage | Common name of the taxon identified. |
taxon_nom_scientifique | Scientific name of the taxon identified in the form "genus species". |
code_taxref | Code CD_REF of the taxonomic base TAXREF v.14.0 (2020-12-15). |
code_espece_omm_gde_cca | Internal code used by the different observation bodies (OMMAG, Guadeloupe Evasion Découverte, Cétacés Caraïbes) to describe the species observed. |
code_espece_ema | Internal code used by Aventures Marines Company to describe the species observed. |
code_espece_agoa | Internal code used by the Agoa Sanctuary to describe the species observed. |
uri_taxref | URI designating the taxon on the INPN site composed of a fixed URL "https://inpn.mnhn.fr/espece/cd_nom/" followed by the TAXREF code. |
Threats that cetacean populations face are multiple, but well-documented (
Providing metadata was eased by a development version of MetaShARK. Since this application was still maturing, some parts of the data description had to be handled manually: converting the file encoding from Windows-1252 to UTF-8 and correcting EML Assembly Line templates when needed.
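The encoding conversion mentioned above is straightforward to script. The sketch below shows one way to do it with the Python standard library; the function name `convert_to_utf8` and the file names are hypothetical, and this is not necessarily how the conversion was carried out for Kakila.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1252"):
    """Re-encode a text file from `source_encoding` to UTF-8.

    Working on bytes leaves the line endings of the original file untouched.
    """
    raw = Path(src).read_bytes()
    Path(dst).write_bytes(raw.decode(source_encoding).encode("utf-8"))


# Round trip on a throwaway file containing accented characters.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example_cp1252.tsv"
    dst = Path(tmp) / "example_utf8.tsv"
    src.write_bytes("Cétacés Caraïbes\n".encode("cp1252"))
    convert_to_utf8(src, dst)
    converted = dst.read_text(encoding="utf-8")
print(converted)
```

Decoding raises an error on bytes that are undefined in Windows-1252, which helps catch files that were not in that encoding to begin with.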
The Kakila database is the first attempt at gathering all available local knowledge on cetacean presence in the Guadeloupe Archipelago. Clearly the long-term strategy to maintain and enrich the Kakila database must focus on careful monitoring of stakeholders' interests, motivations and ultimate expectations. One of its first scientific valorisations will be to help detect and identify key areas of interaction between cetaceans and marine traffic in the Guadeloupe Archipelago in the framework of the TRAFIC project *
The Kakila database exists because volunteers invested their personal time and skills in the study of marine mammals at sea over many years. We are particularly grateful to Caroline Azzinari (BREACH) and to all the volunteers of the OMMAG NPOs who collated the observation data contained in the Kakila database. We also warmly thank the whale watchers who decided to share their own precious data and, in particular, Cedric Millon (whale-watching Cétacés Caraïbes), Claire Freriks (Evasion Marine and Guadeloupe Evasion Découverte) and Jean-Pierre Concaud (Guadeloupe Evasion Découverte).
Ellen Feunteun (Agoa Sanctuary) effectively helped us analyse the data from the Agoa Sanctuary.
This research was funded by the LabEx DRIIHM French programme “Investissements d’Avenir” (ANR-11-LABX-0010), which is managed by the ANR. It was awarded under the OHM Littoral Caraïbe 2019 call for proposals. Lorraine Coché was recruited with the support of the Fondation de France, within the framework of the TRAFIC project, laureate of its research programme "The future of the coastal and sea worlds" (project N° 1940). The last step of the process, the Darwinisation of the Kakila database, was partially funded by the SO-DRIIHM project (ANR-19-DATA-0022) from the Flash call on Open Science.
During his ERINHA AISBL involvement, Romain David was supported by the EOSC-Life European Programme under grant agreement N°824087 and ERINHA-Advance European Programme under grant agreement N° 824061.
The Kakila database was used as a case study during the ecoinfofair2020 workshop (Concarneau, 19-21 October, organised by Yvan Le Bras, PNDB, MNHN). We are grateful to all the workshop participants, who helped in the FAIRification process of Kakila and, in particular, to Guillaume Body (OFB), Claudia Lavalley (UMR TETIS), Sophie Pamerlon (GBIF) and Sarah Valentin (OFB).
This manuscript strongly benefited from constructive comments provided by two reviewers.
See also the OHM Littoral Caraïbe webpage: https://ohm-littoral-caraibe.in2p3.fr/methodologie#trafic