Biodiversity Data Journal :
Data Paper (Biosciences)
Corresponding author: Maxim Shashkov (max.carabus@gmail.com), Natalya Ivanova (natalya.dryomys@gmail.com), John Wieczorek (gtuco.btuco@gmail.com)
Academic editor: Vincent Smith
Received: 08 Jul 2021 | Accepted: 18 Nov 2021 | Published: 08 Dec 2021
© 2021 Maxim Shashkov, Natalya Ivanova, John Wieczorek
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Shashkov M, Ivanova N, Wieczorek J (2021) Ecological data in Darwin Core: the case of earthworm surveys. Biodiversity Data Journal 9: e71292. https://doi.org/10.3897/BDJ.9.e71292
This sampling-event dataset provides primary data about species diversity, age structure, abundance (in terms of biomass and density) and seasonal activity of earthworms (Lumbricidae). The study was carried out in old-growth broad-leaved and young forests of two protected areas ("Kaluzhskiye Zaseki" Nature Reserve and Ugra National Park) of Kaluga Oblast (Russia).
The published dataset provides new data about earthworm communities in European Russia. We propose a new schema according to Darwin Core for the standardisation of the soil invertebrates survey data.
sampling-event data, soil biodiversity, "Kaluzhskiye Zaseki" Nature Reserve, Ugra National Park
Earthworms occur in soils almost across the whole world, preferring moist habitats of moderate temperature. They are amongst the major components of terrestrial ecosystems dominating the biomass of soil invertebrates in non-acidic soils (
In our opinion, this situation can be explained by two reasons. The first one is time-consuming and labour-intensive field data collection (
The second barrier to earthworm data exchange and integration is that the data standardisation process is unclear, because earthworms are usually collected according to a sampling-event design. Nowadays, Darwin Core (
Here, we provide the sampling event dataset of long-term earthworm surveys (
We used data collected by the hand-sorting method in our example. During each survey (usually taken during one day), soil samples of fixed size were randomly collected within the sampling plot (in similar tree and herb cover and soil type). Each soil monolith was hand-sorted by layers for earthworms (see details in the Sampling description section). An example of sampling design is shown in Fig.
Thus, our primary data included information for each individual in the soil sample (species, biomass and life stage) and earthworm density (number of individuals) for the survey. During the data standardisation process, we considered three types of events (Fig.
Thus, we included in the dataset occurrences of two levels (Table
Event type | Number of events | Number of associated occurrences | Traits |
---|---|---|---|
The survey | 39 | 271 | Density |
Soil sample | 338 | 0 | - |
Soil sample layer | 628 | 6673 | Individual biomass, life stage |
The event hierarchy we used allowed us to maintain data consistency and completeness. Nevertheless, our method has some bottlenecks. Firstly, it is not common practice to combine events of different levels in one dataset, and each event level should be described in the dataset. This requires a dedicated Darwin Core term, which is currently absent, so we used the general term dwc:dynamicProperties as a temporary solution in this work. Secondly, the event hierarchy includes 338 events (the soil sample level) that have no associated occurrences. These events are empty not because no species were recorded, but because this level serves only to link the survey and soil sample layer event types. However, empty events are not shown on the GBIF dataset page. Moreover, the complete data (including empty events) can be downloaded only via the IPT installation page, not through the GBIF interface. This restricts the reuse of our data.
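The three-level hierarchy described above can be sketched in Python as follows. This is a minimal illustration, not the procedure used to build the published dataset: the identifier pattern (levels joined by colons), the layer labels and the helper names `make_event` and `build_hierarchy` are assumptions made for the example.

```python
import json


def make_event(event_id, parent_id, event_type):
    """Return one Event Core row; the event level is stored in dynamicProperties."""
    return {
        "eventID": event_id,
        "parentEventID": parent_id,
        "dynamicProperties": json.dumps({"event type": event_type}),
    }


def build_hierarchy(survey_id, layers_per_sample):
    """Build survey -> soil sample -> soil sample layer events.

    layers_per_sample maps a sample number to a list of layer labels
    (hypothetical labels, chosen only for this sketch).
    """
    events = [make_event(survey_id, None, "survey")]
    for sample_no, layers in layers_per_sample.items():
        sample_id = f"{survey_id}:{sample_no}"
        # The soil sample event carries no occurrences; it only links levels.
        events.append(make_event(sample_id, survey_id, "soil sample"))
        for layer in layers:
            layer_id = f"{sample_id}:{layer}"
            events.append(make_event(layer_id, sample_id, "soil sample layer"))
    return events


events = build_hierarchy("R1:2012-06", {1: ["litter", "0-10"], 2: ["0-10"]})
for row in events:
    print(row["eventID"], "<-", row["parentEventID"])
```

In this sketch, occurrences with individual traits would reference the layer-level eventID, and density records would reference the survey-level eventID, which is why the intermediate soil sample events remain without occurrences.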
Another data standardisation design might be easier to understand. It would be simpler to use the soil sample as the event and to link samples from one sampling plot via dwc:locationID and different surveys via dwc:parentEventID. This scheme avoids empty events with no related occurrences. However, it cannot be implemented because of technical IPT limitations: different depths cannot be assigned to occurrences within one event because dwc:verbatimDepth, dwc:minimumDepthInMeters and dwc:maximumDepthInMeters belong to the Event Core.
On the other hand, events of different levels made it possible to provide traits at different levels. In our dataset, we provided life stage and biomass for each specimen and density for each survey. This is an essential advantage for ecological data re-analysis.
Overall, our solution is not optimal. This approach is a trade-off between the need to provide as complete data as possible, the current state of the Darwin Core standard and the technical limitations of the IPT. We believe that further development of biodiversity data standards and data publishing protocols will optimise the process of ecological sampling-event data mobilisation and facilitate their reuse.
The study area was located in the central part of the East European Plain. Earthworms were collected in 13 locations of old-growth broad-leaved forests and young birch forests in the "Kaluzhskiye Zaseki" Nature Reserve and Ugra National Park. There were 10 sampling plots in old-growth broad-leaved forests at a late successional stage or subclimax (Fig.
Sampling plot code (dwc:locationID) | Protected area (dwc:locality) | Survey periods | Coordinates | Habitat (dwc:habitat) | Soil type |
---|---|---|---|---|---|
T1 | Ugra National Park | May, June and September 2003, June 2004 | N 53.89400, E 35.86468 | Broad-leaved forest | Luvisol grey forest |
T2 | Ugra National Park | May, June and September 2003, June 2004 | N 53.90408, E 35.83320 | Broad-leaved forest | Luvisol grey forest slightly podzolics |
VZv | Ugra National Park | May, June and September 2003, June 2004 | N 53.88742, E 35.81388 | Broad-leaved forest | Luvisol light grey forest |
Poima | Ugra National Park | September 2003, June 2004 | N 53.92215, E 35.73175 | Black alder forest, small river floodplain | Luvisol alluvial gleic |
Val | Ugra National Park | May, June 2003, June 2004 | N 53.91861, E 35.73266 | Broad-leaved forest, natural levee of oxbow | Luvisol illuvial-ferruginous |
33 kv | Kaluzhskiye Zaseki Nature Reserve (Northern cluster) | August 2004 | N 53.77853, E 35.73524 | Broad-leaved forest | Luvisol sod illuvial-ferruginous contact-gleyic |
43 kv | Kaluzhskiye Zaseki Nature Reserve (Northern cluster) | August 2004 | N 53.76148, E 35.73751 | Broad-leaved forest | Luvisol sod illuvial-ferruginous |
R1 | Kaluzhskiye Zaseki Nature Reserve (Southern cluster) | May 2006, July 2011, May, June, September 2012 | N 53.62363, E 35.87014 | Broad-leaved forest | Phaeozem |
R2(3) | Kaluzhskiye Zaseki Nature Reserve (Southern cluster) | May 2006, July 2011, May, June, September 2012 | N 53.61480, E 35.86794 | Broad-leaved forest | Phaeozem |
R4 | Kaluzhskiye Zaseki Nature Reserve (Southern cluster) | July 2011, May, June, September 2012 | N 53.62309, E 35.86900 | Broad-leaved forest | Luvisol sod-podzolic |
R5 | Kaluzhskiye Zaseki Nature Reserve (Southern cluster) | May, September 2012 | N 53.61943, E 35.87607 | Young birch forest | Luvisol sod-podzolic (with arable layer) |
R6 | Kaluzhskiye Zaseki Nature Reserve (Southern cluster) | May, June, September 2012 | N 53.63121, E 35.88146 | Young birch and willow forest | Luvisol sod-podzolic |
The old-growth forest stands consist of Quercus robur L., Fraxinus excelsior L., Tilia cordata Mill., Ulmus glabra Huds., Acer platanoides L., Acer campestre L., Betula spp. and Populus tremula L. with regrowth of the broad-leaved tree species, except for oak. The herbal layer is dominated by Aegopodium podagraria L., Mercurialis perennis L., Galeobdolon luteum Huds., Pulmonaria obscura Dumort. and nitrophilous fern Matteuccia struthiopteris (L.) Tod.
The second group of investigated forest stands comprises young forests established on abandoned arable fields and pastures. These stands are predominantly composed of Betula spp. and Salix caprea L. The distance from the sampling plots to the edge of old-growth forests was about 30-50 metres.
At each sampling plot, 8-24 randomly located soil samples (25 cm × 25 cm) were dug to a depth of 35 cm for earthworm collection (
Specimen of the most abundant species - Aporrectodea caliginosa. Subadult ontogenetic stage. The photo was taken on 28 March 2016 by Maxim Shashkov on sampling plot R4. Then soil was covered by packed snow and crust of 30-40 depth. The worm was active in the topsoil layer just under the snow.
Kaluga Oblast, Russian Federation
53.615 and 53.922 Latitude; 35.732 and 35.881 Longitude.
Rank | Scientific Name |
---|---|
family | Lumbricidae |
species | Octolasion lacteum Örley, 1881 |
genus | Aporrectodea Örley, 1885 |
species | Aporrectodea rosea (Savigny, 1826) |
species | Aporrectodea caliginosa (Savigny, 1826) |
genus | Lumbricus Linnaeus, 1758 |
species | Lumbricus terrestris Linnaeus, 1758 |
species | Lumbricus rubellus Hoffmeister, 1843 |
species | Lumbricus castaneus (Savigny, 1826) |
species | Eisenia nordenskioldi (Eisen, 1879) |
species | Dendrobaena octaedra (Savigny, 1826) |
The dataset provides three trait types.
Earthworms were assigned to three ontogenetic stages (juvenile, subadult and adult) based on the development of the clitellum, the reproductive gland used for cocoon production by mature earthworms, which generally forms an obvious band around the mid-section segments. Earthworms were considered subadult if they had any signs of tubercula pubertatis but no clitellum, and adult if they had a fully developed clitellum (
Preserved specimens were weighed to determine earthworm biomass using a portable Ohaus SPU 123 balance. This device weighs with a precision of 0.001 g and an accuracy of 0.003 g. All worms were weighed under laboratory conditions in a preserved state. No corrections were made for gut content or dehydration in formaldehyde. Individual biomass ranged from 2 to 5220 mg. The largest worms were specimens of Aporrectodea caliginosa (max. 1630 mg) and Lumbricus terrestris. The total biomass was highest in old-growth forests on Phaeozems (61.4-110.5 g/m2) and Luvisols grey (45.9-104.0 g/m2), as well as the young forest on former pasture (97.3-135.9 g/m2). The lowest values were recorded for the young forest on former arable land (4.4-43.5 g/m2) and the alder forest experiencing seasonal flooding (17.9-25.1 g/m2).
Some worms were damaged during soil excavation with a shovel. A fragment was counted as an individual only when it had an anterior end, but every fragment was included in the biomass. The highest relative density (individuals per square metre) was found in the old-growth forest on Phaeozem (R1) and in the young forest on the former pasture. The lowest values were observed in the young forests on the former arable soil.
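As a worked illustration of the per-square-metre values mentioned above, the sketch below scales survey totals by the total sampled area. It assumes that density and biomass are obtained by dividing the survey total by the number of soil samples multiplied by the area of one 25 cm × 25 cm sample (0.0625 m²); the function name `per_square_metre` and the input numbers are invented for the example and are not taken from the dataset.

```python
SAMPLE_AREA_M2 = 0.25 * 0.25  # one 25 cm x 25 cm soil sample = 0.0625 m2


def per_square_metre(total, n_samples, sample_area_m2=SAMPLE_AREA_M2):
    """Scale a survey total (individuals or grams) to one square metre."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return total / (n_samples * sample_area_m2)


# Hypothetical survey: 8 soil samples, 120 individuals, 30.0 g in total.
density = per_square_metre(120, 8)    # individuals per m2
biomass = per_square_metre(30.0, 8)   # grams per m2
print(density, biomass)
```

With these made-up inputs, the eight samples cover 0.5 m², so the totals are simply doubled.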
See Table 2 for details.
Column label | Column description |
---|---|
eventID(Darwin Core Event, Darwin Core Occurrence Extension) | An identifier for the set of information associated with an Event (survey, soil sample or soil sample layer). https://dwc.tdwg.org/terms/#dwc:eventID 1005 unique values, examples: "R5:2012-09:3", "R5:2012-09:6:3:>10". |
parentEventID(Darwin Core Event) | An identifier for the broader Event that groups this and potentially other Events (survey or soil sample). https://dwc.tdwg.org/terms/#dwc:parentEventID 372 unique values, examples: "R3:2006-05", "P2:VZv:2003-05:7". |
dynamicProperties(Darwin Core Event) | Description of the Event in JSON format. https://dwc.tdwg.org/terms/#dwc:dynamicProperties Example: "{'event type':'soil sample','part of survey':'R1:2012-06'}". |
eventDate(Darwin Core Event) | The date on which an Event occurred (YYYY-MM-DD format). https://dwc.tdwg.org/terms/#dwc:eventDate 22 unique values ranging between '2000-08-20' and '2012-09-25'. |
samplingProtocol(Darwin Core Event) | The description of the method used during an Event. https://dwc.tdwg.org/terms/#dwc:samplingProtocol Constant: "Digging-out and hand-sorting (by layers) of the soil samples of 25 * 25 cm and a depth of ca. 35 cm". |
sampleSizeValue(Darwin Core Event) | A numeric value for a measurement of the size of a sample in a sampling event (number of soil samples for the 'plot survey' event, size of the soil sample for the 'soil sample' event and area of sampling for the 'soil sample layer' event). https://dwc.tdwg.org/terms/#dwc:sampleSizeValue Constant for soil and layer level: "25×25×35" and "0.0625", respectively. |
sampleSizeUnit(Darwin Core Event) | The unit of measurement of the size of a sample in a sampling event. https://dwc.tdwg.org/terms/#dwc:sampleSizeUnit Constant for each level: "soil samples", "centimetres" and "square centimetres" - survey, soil sample and layer, respectively. |
locationID(Darwin Core Event) | An identifier for the sampling plot. https://dwc.tdwg.org/terms/#dwc:locationID 13 unique values, examples: "R2", "VZv", "33kv". |
countryCode(Darwin Core Event) | The standard code for the country in which the Location occurs according to ISO 3166-1-alpha-2. https://dwc.tdwg.org/terms/#dwc:countryCode Constant: "RU". |
country(Darwin Core Event) | The name of the country or major administrative unit in which the Location occurs. https://dwc.tdwg.org/terms/#dwc:country Constant: "Russian Federation". |
stateProvince(Darwin Core Event) | The name of the next smaller administrative region than country in which the Location occurs. https://dwc.tdwg.org/terms/#dwc:stateProvince Constant: "Kaluga Oblast". |
locality(Darwin Core Event) | Protected area name. Three possible values: "Ugra National Park" , "Kaluzhskiye Zaseki Nature Reserve (Southern cluster)" or "Kaluzhskiye Zaseki Nature Reserve (Northern cluster)". https://dwc.tdwg.org/terms/#dwc:locality |
decimalLatitude(Darwin Core Event) | The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location. https://dwc.tdwg.org/terms/#dwc:decimalLatitude Ranged between: 53.6148 and 53.92215. |
decimalLongitude(Darwin Core Event) | The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic centre of a Location. https://dwc.tdwg.org/terms/#dwc:decimalLongitude Ranged between: 35.73175 and 35.88146. |
geodeticDatum(Darwin Core Event) | The ellipsoid, geodetic datum or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based. https://dwc.tdwg.org/terms/#dwc:geodeticDatum Constant: "WGS84". |
coordinateUncertaintyInMeters (Darwin Core Event) | The horizontal distance (in metres) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. https://dwc.tdwg.org/terms/#dwc:coordinateUncertaintyInMeters Constant: 50. |
coordinatePrecision(Darwin Core Event) | A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude. https://dwc.tdwg.org/terms/#dwc:coordinatePrecision Constant: 0.00001. |
minimumDepthInMeters(Darwin Core Event) | The lesser depth of a range of depth below the local surface, in metres. https://dwc.tdwg.org/terms/#dwc:minimumDepthInMeters Values: 0.0, -0.1, -0.2. |
maximumDepthInMeters(Darwin Core Event) | The greater depth of a range of depth below the local surface, in metres. https://dwc.tdwg.org/terms/#dwc:maximumDepthInMeters Values: 0.0 (litter considered above 0), -0.1, -0.2, -0.35. |
habitat (Darwin Core Event) | A description of the habitat in which the Event occurred. https://dwc.tdwg.org/terms/#dwc:habitat 5 unique values, examples: "Broad-leaved forest", "Young birch forest". |
occurrenceID(Darwin Core Occurrence Extension) | An identifier for the Occurrence. https://dwc.tdwg.org/terms/#dwc:occurrenceID 6935 unique values, example: "758-P2:VZv:2003-09:5:2:0-10". |
basisOfRecord(Darwin Core Occurrence Extension) | The specific nature of the data record. https://dwc.tdwg.org/terms/#dwc:basisOfRecord Constant: "PreservedSpecimen". |
occurrenceStatus(Darwin Core Occurrence Extension) | A statement about the presence or absence of a Taxon at a Location. https://dwc.tdwg.org/terms/#dwc:occurrenceStatus Constant: "present". |
scientificName(Darwin Core Occurrence Extension) | The full scientific name according to the GBIF Backbone checklist. https://dwc.tdwg.org/terms/#dwc:scientificName 11 unique values, examples: "Lumbricus Linnaeus, 1758", "Eisenia nordenskioldi (Eisen, 1879)". |
kingdom (Darwin Core Occurrence Extension) | The full scientific name of the kingdom in which the taxon is classified. https://dwc.tdwg.org/terms/#dwc:kingdom Constant: "Animalia". |
taxonRank(Darwin Core Occurrence Extension) | The taxonomic rank of the most specific name in the scientificName. https://dwc.tdwg.org/terms/#dwc:taxonRank Values: "FAMILY", "GENUS", "SPECIES". |
identificationReferences(Darwin Core Occurrence Extension) | Source of reference used in the Identification. https://dwc.tdwg.org/terms/#dwc:identificationReferences Constant: "Vsevolodova-Perel T.S. The earthworms of the fauna of Russia ...". |
lifeStage(Darwin Core Occurrence Extension) | The life stage of the biological individual at the time the Occurrence was recorded. https://dwc.tdwg.org/terms/#dwc:lifeStage Possible values: "Juvenile", "Subadult", "Adult". |
individualCount(Darwin Core Occurrence Extension) | The number of individuals present at the time of the Occurrence (counted for the 'survey' event). https://dwc.tdwg.org/terms/#dwc:individualCount Ranged between 1 and 260. |
organismQuantity(Darwin Core Occurrence Extension) | A value for the quantity of organisms, depends on unit (Quantity Type). https://dwc.tdwg.org/terms/#dwc:organismQuantity |
organismQuantityType(Darwin Core Occurrence Extension) | The type of quantification system used for the quantity of organisms. https://dwc.tdwg.org/terms/#dwc:organismQuantityType Two possible values: "gram" and "individuals/per survey". |
recordedBy(Darwin Core Occurrence Extension) | A person responsible for recording the original Occurrence. https://dwc.tdwg.org/terms/#dwc:recordedBy Constant: "Maxim Shashkov". |
institutionID(Darwin Core Occurrence Extension) | An identifier for the institution having custody of information referred to in the record (https://issp.pbcras.ru/). https://dwc.tdwg.org/terms/#dwc:institutionID Constant: "https://issp.pbcras.ru/". |
institutionCode(Darwin Core Occurrence Extension) | The name of the institution having custody of information referred to in the record. https://dwc.tdwg.org/terms/#dwc:institutionCode Constant: "Institute of Physicochemical and Biological Problems in Soil Science of the Russian Academy of Sciences". |
ownerInstitutionCode(Darwin Core Occurrence Extension) | The name of the institution having ownership of information referred to in the record (Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences). https://dwc.tdwg.org/terms/#dwc:ownerInstitutionCode Constant: "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences". |
identifiedBy(Darwin Core Occurrence Extension) | The person who assigned the Taxon to the subject. https://dwc.tdwg.org/terms/#dwc:identifiedBy Constant: "Maxim Shashkov". |
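The dynamicProperties example in the table above is written with single quotes, which a strict JSON parser rejects. The following Python sketch shows one tolerant way a data user might read the event level from this field. It is only an assumption that values in the downloaded files look like the quoted example; the helper name `parse_dynamic_properties` is hypothetical.

```python
import ast
import json


def parse_dynamic_properties(raw):
    """Parse a dynamicProperties string into a dict.

    Strict JSON is tried first; a single-quoted, Python-style literal
    (as in the table example) is accepted as a fallback.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # literal_eval only evaluates literals, so it is safe for untrusted text.
        value = ast.literal_eval(raw)
    if not isinstance(value, dict):
        raise ValueError("dynamicProperties is not a key-value object")
    return value


props = parse_dynamic_properties(
    "{'event type':'soil sample','part of survey':'R1:2012-06'}"
)
print(props["event type"], props["part of survey"])
```

Filtering events on the parsed "event type" key is one way to separate survey, soil sample and soil sample layer records before joining them to occurrences.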