MidMedPol: Polychaetes from midlittoral rocky shores in Greece and Italy (Mediterranean Sea)

Abstract
This paper describes a dataset of polychaetes (Annelida) from 14 midlittoral rocky shore sampling sites in Greece and Italy (Mediterranean Sea). The dataset combines the outcome of four different projects studying the hard substrate midlittoral zone in the Mediterranean between 1984 and 2009. Samples were collected by scraping and collecting the organisms from a framed area. The maximal sampling depth was 1.5 m. In total, 123 polychaete species were recorded, five of which are new records for the respective biogeographic sectors of the Mediterranean. The dataset contains 788 occurrence records, fully annotated with all required metadata. These data contribute to the knowledge of a previously very understudied regional habitat, since at present, comprehensive lists of the midlittoral communities in the Mediterranean are provided through only a few, paper-based, studies. This dataset is one of the first electronic data compilations of the Mediterranean midlittoral zone communities and certainly the most comprehensive of its kind, contributing to the ongoing efforts of the Ocean Biogeographic Information System (OBIS) which aims at filling the gaps in our current knowledge of the world's oceans. It is accessible at http://ipt.vliz.be/resource.do?r=mediterraneanpolychaetaintertidal.


Introduction
The Mediterranean Sea is an enclosed water basin with a very low tidal range of 20-40 cm (Day et al. 1995). Its intertidal zone is accordingly very narrow, and is often referred to as "midlittoral zone" instead of "intertidal zone", following the terminology of Stephenson and Stephenson (1949). Pérès and Picard (1964) subsequently described the hard bottom biocoenoses of the midlittoral zone in the Mediterranean Sea and defined its ecological attributes by using characteristic species. The midlittoral zone can also be created by considerable and steady wave action without the existence of true tides (Stephenson and Stephenson 1949). Such irregular rhythms of immersion/desiccation, which depend on weather conditions, create an extreme environment, allowing only species with certain characteristics to survive.
Despite the ecological importance and easy accessibility from the shore, only a few studies have examined the species communities of the Mediterranean midlittoral zone (e.g. Ben-Eliahu and Safriel 1982, Cardell and Gili 1988, Sardà 1991). Most of these studies are paper-based and the information contained within is not readily accessible in machine-readable formats. Electronically available biogeographic information for the Mediterranean Sea is still fragmented for all subregions and habitats (Arvanitidis et al. 2006), and none of the global biogeographic databases (OBIS, http://www.iobis.org; GBIF, http://data.gbif.org) contain systematically collected data on the Mediterranean midlittoral zone. This study attempts to increase our current knowledge of the rocky midlittoral zone of the Mediterranean Sea by providing species occurrence data of polychaete species, assembled from four independent and previously unpublished datasets. Polychaetes are often used as a representative group of macrobenthic communities because they tend to be the dominant taxon in these communities and hence, they are used as indicators of environmental disturbance (e.g. Giangrande et al. 2005, Olsgard et al. 2003). The present dataset contains georeferenced and fully documented information on 123 species (788 individuals) of polychaetes, recorded from 14 regions/sampling sites in the Aegean Sea and in Italy, from 1984 to 2009 (Table 1). Five species are new records for the respective biogeographic sectors in the Mediterranean region.

Project description
Title: This dataset combines the data of four independent sampling campaigns: (a) the monitoring of midlittoral rocky shores in Crete in the framework of the NaGISA project (Natural Geography in Shore Areas, http://www.coml.org/projects/natural-geography-shore-
Figure: Map of the sampling locations.
impacted by detectable anthropogenic activity, though a sandy beach ca. 500 m from the sampling area in Elounda is subjected to moderate beach tourism and increased leisure boat traffic in the summer months.

Evripos channel: The area is located in the town of Chalkida (Euboea, Eastern Mediterranean) and is characterised by strong hydrodynamic changes caused by strong tidal currents. The midlittoral zone of this channel is an artificial hard bottom habitat (concrete). Three stations were chosen in this area with different levels of hydrodynamism: Evripos_1a with low, Evripos_1b with moderate and Evripos_1c with high hydrodynamic intensity. Evripos_1a is characterised by dense photophilous algal coverage dominated by Corallina elongata. Evripos_1b is covered by photophilous macroalgae (60%) and by the mollusk Mytilus galloprovincialis (40%). Finally, the station Evripos_1c is characterised by high densities of M. galloprovincialis. Despite their urban location, the stations are not noticeably affected by organic discharges since the strong currents prevailing in the area dissipate pollution.
Thermaikos Gulf: Thermaikos Gulf is an embayment in the northern part of the Aegean Sea (Eastern Mediterranean) and is strongly impacted by urban pollution. The midlittoral zone sampled here is an artificial hard bottom habitat (concrete). At this site, three stations were sampled, with pollution intensity increasing from station Thermaikos_2a to Thermaikos_2c. The station Thermaikos_2a is located in Nea Mixaniona and is characterised by low hydrodynamic intensity. The algal coverage at this station is dominated by the macroalga Antithamnion cruciatum. The station Thermaikos_2b is located in Neoi Epivates and receives intense wave action. The substrate of this station is covered by beds of the mollusk Mytilus galloprovincialis. The station Thermaikos_2c is located in front of the Thessaloniki Concert Hall and is sheltered from strong waves. The substrate of this station is covered by the mollusk M. galloprovincialis and the alga Ulva lactuca.
Nea Roda and Porto Karas: Both areas are located in Chalkidiki (North Aegean Sea, Eastern Mediterranean) but differ in terms of wave exposure: Nea Roda is moderately exposed, Porto Karas sheltered. The substrate in Nea Roda consists of granite, in Porto Karas the substrate is artificial (concrete). Mollusks are the dominant taxon in Nea Roda, whereas the midlittoral zone of Porto Karas is characterised by low densities of photophilous macroalgae. Nea Roda is a pristine area, whereas the stations in Porto Karas are located in a typical hotel marina and are subjected to slightly increased levels of organic pollution.

Porto Lagos: The sampling stations are located in a small port in Vistonicos Gulf (North Aegean Sea, Eastern Mediterranean) and are characterised by low-intensity hydrodynamism, low salinity and an artificial substrate (concrete). The midlittoral zone is dominated by the polychaete Ficopomatus enigmaticus, which forms extensive biogenic calcareous layers of 3-4 cm height. Inside the port area, slightly increased levels of organic pollution were detected.
Balestrate and Zingaro: Both areas are located in the Gulf of Castellammare. Balestrate is an outcrop of calcarenitic rocks surrounded by sand and is located in the centre of the Gulf. In this area, Sabellaria alveolata reefs temporarily proliferated between 1984 and 1989 (preceding the sampling activities) in the infralittoral and midlittoral layers as a consequence of a wine distillery outfall. In the midlittoral zone, S. alveolata was associated with Mytilaster spp. beds. Zingaro, now a terrestrial and coastal reserve without influences from major anthropogenic stressors, is a steep calcareous cliff that stretches along the westernmost side of the Gulf. The midlittoral zone is characterised by the presence of vermetid reefs formed by the mollusk Dendropoma petraeum. Both areas are exposed to moderate wave action.
Capo Gallo: Capo Gallo, now a marine protected area, is a steep calcareous cliff located at the northern end of the Gulf of Palermo, not far from the city of Palermo. As in Zingaro, the midlittoral zone is characterised by the presence of vermetid reefs formed by the mollusk Dendropoma petraeum. The area is exposed to the dominant wind direction, resulting in increased wave action at the shore. No major sources of pollution are present in the vicinity.

Sampling methods
Study extent: The data cover several independent sampling events over a time period of 25 years (1984-2009) and originate from 14 sampling sites in Italy and Greece (Mediterranean Sea). Samples were collected from the midlittoral zone from a maximum depth of 1.5 m. Concerning the distribution of polychaetes, this habitat is understudied in the Mediterranean Sea; in fact, the Ocean Biogeographic Information System contains fewer than 300 polychaete distribution records in the depth range of 0-5 m for the entire Mediterranean Sea, and none of these are from the intertidal zone. The present dataset thus provides an important addition to the existing data for this habitat in the region (Fig. 2).
At each site, the high, mid- and low midlittoral zones were determined and five random replicate units were collected from each zone by placing a Plexiglas frame (25x25 cm) on the substrate and scraping the framed area completely. The samples were then collected with a netted shovel into plastic bags, washed through a 0.5 mm mesh sieve and fixed in 99% ethanol. In the laboratory, all samples were identified to the most precise taxonomic level possible, using the most recent literature for the taxon. Animals without a head were considered as fragments and were not identified. The individual taxon counts were directly entered into electronic worksheets (Microsoft Excel), along with all metadata concerning the identification (date, identifier, notes, literature used). Thus, the introduction of additional errors during the transcription of lab notes into an electronic format was avoided.
Samples from Evripos channel, Thermaikos Gulf, Chalkidiki and Porto Lagos were collected from September 1997 until October 1997. At each site, five random replicate units were collected. Two kinds of samplers were used: (a) a metallic frame (20x20 cm) with a 0.5 mm mesh bag attached to its upper part (Chintiroglou and Koukouras 1992); (b) an iron frame (20x20 cm) with plastic threads woven through holes on the sides of the frame, forming a grid. The framed surface of the substrate was scraped and collected into plastic bags with 10% formalin. In the laboratory, the samples were washed through a 1.5 mm and a 0.5 mm mesh sieve and fixed in 5% formalin. All samples were sorted into major taxonomic groups and identified to species level using various identification keys, but only the polychaete species were digitised and included in the present dataset, in order to form a thematic entity. Data from the five replicates were pooled; for these records, the dataset thus contains average abundances.
Samples from Italy were collected in 1984, 1986 and 1989. In Zingaro, samples were collected in spring of 1984, in Capo Gallo in spring, autumn and winter of 1986 and in Balestrate once per season in 1989. The number of replicate units per sample varies between 4 and 13. Samples were collected by scraping the surface of a 20x20 cm square, stored in plastic bags and subsequently fixed in a 5% solution of sea water and formalin. In the laboratory, samples were sieved through a 0.5 mm mesh size and preserved in 75% ethanol. Polychaetes were sorted into families and then identified to species level using various identification keys.
Quality control: All scientific names were standardised against the World Register of Marine Species (WoRMS) using the Taxon Match tool (http://www.marinespecies.org/aphia.php?p=match). If recent taxonomic reviews were available that had not been incorporated into WoRMS at the time of standardisation, nomenclature follows those reviews. Subjective synonyms were kept in the dataset as they had been originally recorded, with a reference to the currently accepted name.
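The campaigns described above used frames of different sizes (25x25 cm in one case, 20x20 cm in the others), so raw counts per replicate refer to different sampled areas. The following is a minimal sketch, not part of the original workflow, of how counts could be converted to densities per square metre and how pooled replicates yield a mean abundance. The function names and the replicate counts are hypothetical.

```python
from statistics import mean


def frame_area_m2(side_cm):
    """Area of a square sampling frame in square metres."""
    side_m = side_cm / 100.0
    return side_m * side_m


def density_per_m2(count, side_cm):
    """Convert a count per frame into individuals per square metre."""
    return count / frame_area_m2(side_cm)


def mean_abundance(replicate_counts):
    """Average abundance over pooled replicate units."""
    return mean(replicate_counts)


# Hypothetical counts of one species in five replicates of a 20x20 cm frame.
replicates = [3, 0, 5, 2, 0]
pooled = mean_abundance(replicates)
print(pooled, density_per_m2(pooled, 20))
```

A 25x25 cm frame covers 0.0625 m² and a 20x20 cm frame covers 0.04 m², so the same count corresponds to a higher density in the smaller frame.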
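The principle of keeping the originally recorded name while pointing to the currently accepted one can be illustrated with a small offline sketch. It does not call the WoRMS Taxon Match tool. The synonym table is a hand-made, illustrative stand-in for its output, and the use of the Darwin Core term acceptedNameUsage for the accepted name is an assumption about how such a reference could be stored.

```python
# Illustrative stand-in for the result of a taxon match: recorded name ->
# currently accepted name. In practice this mapping would come from WoRMS.
ACCEPTED_NAMES = {
    "Mercierella enigmatica": "Ficopomatus enigmaticus",
    "Nereis diversicolor": "Hediste diversicolor",
}


def normalise_whitespace(name):
    """Collapse runs of whitespace and trim the ends of a name string."""
    return " ".join(name.split())


def standardise(recorded_name):
    """Keep the recorded name and attach the currently accepted name."""
    cleaned = normalise_whitespace(recorded_name)
    accepted = ACCEPTED_NAMES.get(cleaned, cleaned)
    return {"scientificName": cleaned, "acceptedNameUsage": accepted}


print(standardise("Mercierella  enigmatica"))
print(standardise("Sabellaria alveolata"))
```

Names absent from the table are treated as already accepted, which is a simplification: an unmatched name would normally be checked by hand.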
Step description: The samples had been obtained independently by three different research teams over a period of 25 years as described in detail above. In an attempt to assemble polychaete occurrence data of the Mediterranean midlittoral zone, the datasets included in this study were obtained from the respective colleagues, cross-checked, annotated, quality-controlled and transformed into a standard electronic format (Fig. 3).

Geographic coverage
Description: Samples were collected at 14 sampling sites in Italy and Greece, Mediterranean Sea, from a maximum depth of 1.5 m (Table 1, Fig. 1). All data were collected from the midlittoral zone, characterised by the low and high water marks at those places where a tide is present, and the characteristics of the ecological zonation where the midlittoral zone is defined mainly by the gradient of emersion/desiccation resulting from wave action.
Figure 3. Overview of all steps leading to the final release of the dataset:
1. sampling, independently performed at the three different institutions (AUTH = Aristotle University of Thessaloniki, UNIPA = University of Palermo, HCMR = Hellenic Centre for Marine Research)
2. identification of polychaete specimens in the laboratory
3. data in paper-based format
4. digitisation
5. data in electronic format (spreadsheets)
6. integration of the three independent datasets into a standardised format, exclusion of records not identified to species level, retrieval of missing information, georeferencing of coordinates through Google Maps, standardisation of taxonomy against the World Register of Marine Species and recent literature, general quality control
7. export of data as a DarwinCore Archive
8. generation of dataset-level metadata
9. publication of the data as a data paper and through an IPT server
10. in the future, further dissemination of data by integration into other databases, personal downloads, archiving, etc.
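The integration and export steps of this workflow (merging the institutional datasets, excluding records not identified to species level, writing a flat occurrence table) can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually used: the records are hypothetical, the species-level check is a crude heuristic on the name string, and the output is only the occurrence table, whereas a full DarwinCore Archive also bundles descriptor and metadata files.

```python
import csv
import io

# Tokens in the second position of a name that indicate an identification
# above species level (heuristic, assumed for this sketch).
OPEN_NOMENCLATURE = {"sp", "spp", "indet"}


def is_species_level(scientific_name):
    """True if the name looks like a binomial rather than a higher taxon."""
    parts = scientific_name.split()
    if len(parts) < 2:
        return False
    return parts[1].lower().rstrip(".") not in OPEN_NOMENCLATURE


def integrate(datasets):
    """Merge per-institution records, keeping species-level records only."""
    merged = []
    for institution, records in datasets.items():
        for record in records:
            if is_species_level(record["scientificName"]):
                merged.append({**record, "institutionCode": institution})
    return merged


def to_csv(records, columns):
    """Serialise records as CSV text with the given column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical input records, for illustration only.
datasets = {
    "AUTH": [{"scientificName": "Ficopomatus enigmaticus", "locality": "Porto Lagos"},
             {"scientificName": "Syllis sp.", "locality": "Porto Lagos"}],
    "UNIPA": [{"scientificName": "Sabellaria alveolata", "locality": "Balestrate"},
              {"scientificName": "Syllidae", "locality": "Balestrate"}],
}
print(to_csv(integrate(datasets), ["scientificName", "locality", "institutionCode"]))
```

Qualifiers such as "cf." are not handled by this heuristic and would need an explicit rule.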

Coordinates:
The present dataset contains the first electronically available quantitative data on midlittoral polychaetes in the entire Mediterranean Sea. Previous studies of the habitat in the region are scarce, often qualitative and not electronically available.
Terebellidae; Terebella lapidaria Linnaeus, 1767; Fauvel 1927
The species richness of the 22 families is very heterogeneous. Syllidae are the family with the highest species richness, comprising 33.3% of the species in the dataset, followed by Nereididae with 12.6% of the found species and Serpulidae with 10.6% (Fig. 4). Only nine families are represented by more than three species, whereas ten families are represented by a single species only.
Species richness at the different sampling sites is very heterogeneous, ranging from a single species found in Porto Karas to 34 species found in Capo Gallo. Likewise, the number of higher taxa differs across locations, e.g. the 24 species recorded in Balestrate belong to 15 different families, whereas the 30 species recorded each in Alykes and Evripos St. 1c belong to only 10 families (Fig. 5).

genus: The full scientific name of the genus in which the taxon is classified.
family: The full scientific name of the family in which the taxon is classified.
order: The full scientific name of the order in which the taxon is classified.
class: The full scientific name of the class in which the taxon is classified.
phylum: The full scientific name of the phylum in which the taxon is classified.
kingdom: The full scientific name of the kingdom in which the taxon is classified.
GeoreferenceSources: A list (concatenated and separated) of maps, gazetteers, or other resources used to georeference the Location, described specifically enough to allow anyone in the future to use the same resources.
coordinateUncertaintyInMeters: The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the sampling location.
habitat: A category or description of the habitat from which the samples were collected.
minimumDepthInMeters: The lesser depth of a range of depth below the local surface, in meters.
maximumDepthInMeters: The greater depth of a range of depth below the local surface, in meters.
samplingProtocol: The description of the method or protocol used for sample collection.
basisOfRecord: The specific nature of the data record, as described in http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm.
preparations: Preparations and preservation methods for a specimen.
individualCount: The number of individuals in a replicate sample unit. In cases where replicates had been pooled, the average abundances are not included under "individualCount" but under "dynamicProperties".
dynamicProperties: Contains "meanAbundance" as its only attribute here. These are the average abundances of those samples where the replicates had been pooled.
recordedBy: A list (concatenated and separated) of names of people responsible for recording the original Occurrence.
identifiedBy: A list (concatenated and separated) of names of people, groups, or organizations who identified the specimen.
dateIdentified: The date on which the specimen was identified.
identificationReferences: A list (concatenated and separated) of references (publication, global unique identifier, URI) used for identifying the specimen.
institutionCode: The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.
institutionID: An identifier for the institution having custody of the object(s) or information referred to in the record.
datasetID: An identifier for the set of data.
datasetName: The name identifying the data set from which the record was derived.
rights: Information about rights held in and over the resource (copyright, intellectual property, etc.).
rightsHolder: A person or organization owning or managing rights over the resource.
id: A unique identifier for the record within the data set or collection, autoincrementing number automatically added by the system.
taxonID: Aphia ID (unique identifier for the taxon within the World Register of Marine Species, www.marinespecies.org)
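Because abundances are stored either under individualCount (single replicates) or as meanAbundance under dynamicProperties (pooled replicates), a consumer of the data has to check both fields. The sketch below shows one way to do so. It assumes that dynamicProperties is encoded as semicolon-separated key=value pairs, which is a common Darwin Core convention but is not confirmed for this dataset. The function names and example values are hypothetical.

```python
def parse_dynamic_properties(value):
    """Parse 'key=value; key=value' text into a dictionary (assumed encoding)."""
    properties = {}
    for part in value.split(";"):
        if "=" in part:
            key, _, val = part.partition("=")
            properties[key.strip()] = val.strip()
    return properties


def abundance(record):
    """Return (value, kind) where kind is 'count' or 'mean', or None if absent."""
    count = (record.get("individualCount") or "").strip()
    if count:
        return float(count), "count"
    properties = parse_dynamic_properties(record.get("dynamicProperties") or "")
    if "meanAbundance" in properties:
        return float(properties["meanAbundance"]), "mean"
    return None


# Hypothetical records, for illustration only.
print(abundance({"individualCount": "4", "dynamicProperties": ""}))
print(abundance({"individualCount": "", "dynamicProperties": "meanAbundance=2.4"}))
```

Returning the kind alongside the value keeps counts and averages from being mixed silently in later analyses.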