Biodiversity Data Journal :
OMIC Data Paper
Corresponding author: Katerina Skaraki (kskaraki@gmail.com), Christina Pavloudi (christina.pavloudi@embrc.eu)
Academic editor: Lyubomir Penev
Received: 27 Oct 2023 | Accepted: 14 Jan 2024 | Published: 19 Jan 2024
© 2024 Katerina Skaraki, Christina Pavloudi, Thanos Dailianis, Jacques Lagnel, Adriani Pantazidou, Antonios Magoulas, Georgios Kotoulas
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Skaraki K, Pavloudi C, Dailianis T, Lagnel J, Pantazidou A, Magoulas A, Kotoulas G (2024) Microbial diversity in four Mediterranean irciniid sponges. Biodiversity Data Journal 12: e114809. https://doi.org/10.3897/BDJ.12.e114809
This paper describes a dataset of microbial communities from four sponge species: Ircinia oros (Schmidt, 1864), Ircinia variabilis (Schmidt, 1862), Sarcotragus spinosulus Schmidt, 1862 and Sarcotragus fasciculatus (Pallas, 1766). The examined sponges all belong to Demospongiae (Class); Keratosa (Subclass); Dictyoceratida (Order); Irciniidae (Family). Samples were collected by scuba diving at depths of 6-14 m from two sampling sites of rocky formations on the northern coast of Crete (Cretan Sea, eastern Mediterranean) and were subjected to metabarcoding of the V5-V6 region of the 16S rRNA gene.
sponge metagenome, marine metagenome, amplicon sequencing, 454 GS FLX Titanium, pyrosequencing, eastern Mediterranean
Porifera (sponges) is one of the oldest metazoan Phyla (
This dataset includes microbial taxa inhabiting the demosponges Ircinia oros, I. variabilis, Sarcotragus spinosulus and S. fasciculatus and contributes to the ongoing efforts of the Ocean Biogeographic Information System (OBIS), which aims to fill the gaps in our current knowledge of the world's oceans. The dataset has also been published in GBIF (
Sponge samples were collected in 2009-2010 by scuba diving from two sampling sites of rocky formations on the northern coast of Crete (Cretan Sea, eastern Mediterranean). Specimens of Sarcotragus spinosulus, Ircinia oros and I. variabilis were collected from Alkes along with a seawater sample (2 litres), whereas samples of Sarcotragus fasciculatus were collected from Elounda. Samples were transferred to the laboratory in a cooler with ice packs. Upon arrival, sponge samples were preserved in 97% ethanol and stored at -20°C prior to DNA extraction; the seawater sample was filtered through a Whatman 0.45 µm mixed cellulose ester membrane filter, which was stored at -80°C prior to DNA extraction. Sponge samples were identified both morphologically and by molecular markers (18S rRNA and COI).
For each sponge specimen, mesohyl and ectosomal tissue samples (10-15 mg each) were cut and processed separately. DNA from those tissues was extracted using the DNeasy Blood & Tissue™ kit (Qiagen) following the manufacturer’s guidelines. DNA from the filter was extracted according to a freeze–thaw–boiling protocol of
The V5-V6 hypervariable region (~ 280 bp) of the 16S ribosomal RNA (rRNA) gene was amplified using the degenerate universal primers 802F: 5'-GGATTAGATACCCBNGTA-3' (originally designed as reverse primer by
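As an illustration of what "degenerate" means for the 802F primer quoted above, the following minimal Python sketch expands the IUPAC ambiguity codes in the primer into the unambiguous oligonucleotides it represents. The function name `expand_degenerate` is hypothetical and is not part of the authors' workflow; only the primer sequence is taken from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate primer.

    Hypothetical helper for illustration only.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Forward primer exactly as given in the text (5' to 3').
PRIMER_802F = "GGATTAGATACCCBNGTA"

variants = expand_degenerate(PRIMER_802F)
print(len(variants))  # B (3 options) x N (4 options) = 12 sequences
for seq in variants:
    print(seq)
```

The two ambiguous positions (B and N) are the only source of variation, so the primer stands for a pool of twelve 18-mers.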
Amplicon library preparation and sequencing were performed at the DNA Sequencing platform of the Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC, HCMR) using 454 GS FLX Titanium technology (Roche, 454 Life Sciences), according to the manufacturer's recommendations.
The 454 GS FLX Titanium metabarcoding data were processed in the following steps: a) raw data were denoised using the software package AmpliconNoise v.1.29 (
The phyloseq package (version 1.42.0) (
The analysis was carried out through Zorba, the High Performance Computing (HPC) system of IMBBC (
Unclassified reads formed the largest single group (~ 26%). The most abundant phyla were Proteobacteria (~ 19%), Chloroflexota (~ 9%), Cyanobacteria (~ 8%) and Poribacteria (~ 6%). The processed sequences, along with their corresponding taxonomic information and metadata, can be found in the occurrence dataset available from GBIF. In addition, sequences processed with the MGnify pipeline (4.1) and their respective taxonomic information are available under the MGnify study ID MGYS00004687 (available at https://www.ebi.ac.uk/metagenomics/studies/MGYS00004687) (
The dataset targets the prokaryotic taxa associated with the studied sponges.
Archaea, Bacteria
Details for the samples can be found in Suppl. material
The most abundant phyla across all the samples are Proteobacteria (~ 19%), Chloroflexota (~ 9%), Cyanobacteria (~ 8%) and Poribacteria (~ 6%) (Fig.
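Percentages of this kind are relative abundances: read counts summed per phylum and divided by the total number of reads. The sketch below shows that calculation in Python on toy counts. The function name `phylum_percentages` and all numbers are assumptions made for illustration and are not taken from the dataset, which the authors analysed with the phyloseq package.

```python
from collections import defaultdict


def phylum_percentages(otu_rows):
    """Sum read counts per phylum and express them as % of all reads.

    otu_rows: iterable of (phylum, read_count); a phylum of None is
    treated as "Unclassified". Hypothetical helper for illustration.
    """
    totals = defaultdict(int)
    for phylum, count in otu_rows:
        totals[phylum or "Unclassified"] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no reads to summarise")
    return {p: round(100.0 * n / grand_total, 1) for p, n in totals.items()}


# Toy OTU-level counts (NOT real data): one row per OTU.
toy_rows = [
    ("PhylumA", 40),
    ("PhylumB", 25),
    (None, 20),        # OTU without a phylum-level assignment
    ("PhylumA", 10),
    ("PhylumB", 5),
]

shares = phylum_percentages(toy_rows)
for phylum, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{phylum}: {pct}%")
```

Grouping unassigned OTUs under a single "Unclassified" label is what allows them to be reported alongside the named phyla.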
Since the advent of pyrosequencing, research papers describing and discussing findings based on eDNA and DNA metabarcoding data have been a major part of the scientific literature. However, submission of such data into repositories was not possible until very recently, with the development of the Darwin Core (DwC) DNA derived data extension (
It should be noted, however, that certain metabarcoding limitations should be taken into account when preparing DwC DNA occurrence data for submission to GBIF and/or OBIS. Notably, the uncertainty of organism occurrences, along with the less precise taxonomic identifications inferred from sequencing data, should be clearly stated in the dataset descriptions. Moreover, in host-associated datasets, MIxS checklists capture the host species in the sample metadata but not in the run (i.e. sequence) metadata. This dataset is an example: the taxonomy associated with the run accession numbers in ENA is "sponge metagenome", with no further details shown, although the host species is recorded in the sample metadata and can be retrieved by searching ENA with the sample accession number. In addition, standards should be continuously updated to incorporate changes in sequencing technologies and protocols, in order to help researchers document such data in a more FAIR way. If discrepancies between repositories and platforms are minimised or even eliminated in the future, the already high intrinsic value of DNA derived datasets will increase even further.
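The gap between run-level and sample-level metadata can be bridged by joining the two on the sample accession. The following Python sketch illustrates that idea under stated assumptions: the accession values ("SAMPLE_A", "RUN_1", and so on) are placeholders, and the field names `sample_accession`, `run_accession`, `taxonomy` and `host_scientific_name` are hypothetical rather than actual ENA or MIxS field names.

```python
def attach_host(runs, samples):
    """Copy the host species from sample metadata onto each run record.

    runs:    list of dicts, each with a "sample_accession" key
    samples: dict mapping sample accession -> sample metadata dict
    Runs whose sample is unknown get a host of None.
    """
    enriched = []
    for run in runs:
        sample = samples.get(run["sample_accession"], {})
        enriched.append(
            {**run, "host_scientific_name": sample.get("host_scientific_name")}
        )
    return enriched


# Placeholder sample metadata (accessions are NOT real identifiers).
samples = {
    "SAMPLE_A": {"host_scientific_name": "Ircinia oros"},
    "SAMPLE_B": {"host_scientific_name": "Sarcotragus fasciculatus"},
}

# Placeholder run metadata: the taxonomy alone does not reveal the host.
runs = [
    {"run_accession": "RUN_1", "sample_accession": "SAMPLE_A",
     "taxonomy": "sponge metagenome"},
    {"run_accession": "RUN_2", "sample_accession": "SAMPLE_B",
     "taxonomy": "sponge metagenome"},
    {"run_accession": "RUN_3", "sample_accession": "SAMPLE_X",
     "taxonomy": "sponge metagenome"},
]

for record in attach_host(runs, samples):
    print(record["run_accession"], record["host_scientific_name"])
```

Leaving the host as None for unmatched runs, rather than guessing, keeps missing sample metadata visible downstream.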
This research was supported in part through computational resources provided by IMBBC (Institute of Marine Biology, Biotechnology and Aquaculture) of the HCMR (Hellenic Centre of Marine Research). Funding for establishing the IMBBC HPC was received from the MARBIGEN (EU Regpot) project, LifeWatchGreece RI and the CMBR (Center for the study and sustainable exploitation of Marine Biological Resources) RI. In addition, the authors would like to acknowledge the BiCIKL project (Grant No 101007492).