Vernal pool amphibian breeding ecology monitoring from 1931 to present: A harmonised historical and ongoing observational ecology dataset

Abstract

Background
For 88 years (1931-present), the Mohonk Preserve's Daniel Smiley Research Center has been collecting data on occupancy and reproductive success of amphibian species, as well as associated water quality of 11 vernal pools each spring (February to May). Though sampling effort has varied over the dataset range, the size of the dataset is unprecedented within the field of amphibian ecology. With more than 2,480 individual species sampling dates and more than 151,701 recorded individual occurrences of the nine amphibian species, the described dataset represents the longest and largest time-series of herpetological sampling with paired water quality data.

New information
We describe the novel publication of a paired dataset of amphibian occurrence with environmental indicators spanning nearly 90 years of data collection. As of February 2020, the dataset includes 2,480 sampling dates across eleven vernal pools and 151,701 unique occurrences of egg masses or individuals recorded across nine species of amphibian. The dataset also includes environmental conditions associated with the species occurrences with complete coverage for air temperature and precipitation records and partial coverage for a variety of other weather and water quality measures. Data collection has included species, egg mass and tadpole counts; weather conditions including precipitation, sky and wind codes; water quality measurements including water temperature and pH; and vernal pool assessment including depth and surface vegetation coverage. Collection of data was sporadic from 1931–1991, but data have been collected consistently from 1991 to present. We also began monitoring dissolved oxygen, nitrate concentrations and conductivity of the vernal pools using a YSI Sonde Professional Plus Instrument and turbidity using a turbidity tube in February 2018.
The dataset (and periodic updates), as well as metadata in the EML format, are available in the Environmental Data Initiative Repository under package edi.398.


Introduction
Mohonk Preserve is a nature preserve and land trust located in New Paltz, New York State, USA. As the largest non-profit nature preserve in New York State, Mohonk Preserve protects and manages more than 8,000 acres of the Shawangunk Mountains, a northern section of the Appalachian Mountains. Renowned for its high biodiversity value, the Shawangunks harbour more than 1,400 known plant and animal species, including 57 that are rare and imperilled (Huth 2011, Batcher 2000). Mohonk Preserve's conservation science division is affiliated with the Organization of Biological Field Stations and acts as a NOAA Climate Observation Center. Mohonk Preserve's conservation unit is the Daniel Smiley Research Center (DSRC), which coordinates a research and educational network that includes over 20 researchers from regional and national research institutions and more than 1,000 students and dozens of faculty members from colleges and universities in the Hudson Valley and beyond. DSRC staff and citizen scientists carry out a variety of long-term monitoring projects, including the described monitoring of amphibian breeding in woodland vernal pools. In addition to the ongoing monitoring, the DSRC manages an extensive archive of historical observations including 86 years of natural history observations, 123 years of daily weather data, 60,000 physical items, 9,000 photographs and a research library. Of those physical items, there are over 3,000 herbarium specimens, 139 mammal specimens, 107 bird specimens, 140 butterfly specimens, 400 arthropod specimens and over 14,000 index cards with handwritten and/or typed natural history observations. The data and natural history collections underpin the Mohonk Preserve's land management and stewardship (Fig. 1) and have been crucial to an increasing number of scientific publications (e.g. Cook et al. 2009, Cook et al. 2008, Charifson et al. 2015, Richardson et al. 2016).
The Vernal Pool Monitoring programme began in 1931 with the observations of Daniel Smiley. Smiley documented extensive records of a variety of taxa within the Shawangunk Mountains beginning in the late 1920s (Huth 1996). The majority of his observations are recorded in a card filing system or in one- to two-page reports that are now available in the DSRC archives. He first began monitoring amphibians in 1930 and began regularly monitoring vernal pools in the surrounding regions in the 1950s. Smiley's particular interest in the impacts of acid rain and pesticides provides us with records of water quality extending to some of Smiley's earliest observations (Huth 1996, Weathers et al. 1986).
The presented dataset includes observations at 11 vernal pools on Mohonk Preserve lands: Sleepy Hollow, Ski Loop, Canaan, Long Woodland Swamp, Long Woodland Pool, Talus, Terrace, Bonticou, Oakwood, Hermits and North Mud Pond (Fig. 2). These vernal pools vary in size and are distributed across the Mohonk Preserve landscape at a range of elevations (166–384 m). The current goal of the project is to monitor the seasonality and reproductive ecology of amphibians. As the breeding behaviour of amphibians is highly dependent on environmental quality, particularly water quality, data collection in recent years has strived to provide a holistic environmental context for occurrence records (Semlitsch 2002, Marco et al. 1999, Karraker et al. 2008, Hamer and McDonnell 2008). A major hurdle to making this natural history dataset available to the research community has been the digitisation of the historical data records. The majority of the historical data points were extracted from the notecards (Fig. 3). These cards were scanned, transcribed and formatted for data archiving by volunteer citizen scientists at the DSRC facilities, demonstrating the significant impact of citizen science contributions in natural history data rescue efforts and mitigating data risk factors (Griffin 2017, Brunet and Jones 2011, Mayernik et al. 2020). Due to the size and scope of the DSRC's historical holdings, the digitisation process remains ongoing and, as historical occurrences become available, they will be added to the described dataset periodically. In addition to the digitised historical data, the dataset also comprises ongoing, yearly vernal pool monitoring. Combining the historical data with ongoing monitoring enables researchers to elucidate long-term trends, project impacts of climate change or urbanisation and compare these predictions with future and current data collection.
This dataset significantly adds to our knowledge of amphibian breeding ecology by providing a nearly 90-year time series of amphibian reproductive occurrences paired with environmental variables from a single geographic region with multiple replicate pools. With more than 2,480 individual species sampling dates and more than 151,701 recorded individual occurrences of the nine amphibian species, the described dataset represents the longest and largest time-series of herpetological sampling with paired water quality data. Additionally, the dataset incorporates records from Long Woodland Vernal Swamp from years immediately preceding its drying. These records may be of particular interest to researchers interested in transitional ecology. The dataset also extends into the 1930s, allowing for significant investigations into the impacts of climate change and acid rain on amphibian reproductive ecology and phenology. In this paper, we will describe the data structure and synthesise the collection protocols in the hopes of facilitating future reuse and consistent access to this valuable data resource.

Figure caption: Representative images of two of the sampled vernal pools: Sleepy Hollow (left) and Canaan (right).

Sampling methods
Sampling description: Amphibian occupancy, breeding and water quality of vernal pools on the Preserve have been monitored since 1931. Sampling effort has varied significantly over the dataset range, with stochastic sampling prior to 2017 and post-2017 weekly spring sampling. From 1931–2015, data collected at each vernal pool varied, but often included current weather conditions, water level (%), water temperature (°F), water pH and species/egg mass counts. Observations at each vernal pool occur annually between February and May. From 1931–1991, sampling at vernal pools and the number of vernal pools sampled each year varied. Starting in 1991, each vernal pool was observed at least two times each spring. Starting in 2016, a more rigorous protocol was adopted from the USGS Amphibian Reproductive Monitoring Initiative programme with sampling at 10 vernal pools being completed at least four times per spring season (Muths et al. 2005). The number of observers during each survey varies with weather and logistics, but the goal is weekly sampling of each vernal pool, beginning at the onset of amphibian emergence and continuing until larvae hatch from egg masses.
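For reuse, it can help to tag each record with the sampling regime in force when it was collected and to convert the historical °F readings to °C. The following is a minimal sketch only: the era labels and function names are hypothetical, and assigning eras by calendar year (1991 and 2016 as cut-offs) is an assumption based on the description above, not a field in the published dataset.

```python
from datetime import date


def protocol_era(sample_date: date) -> str:
    """Label the sampling regime for a date (labels are hypothetical).

    Assumption: eras are split by calendar year, following the text:
    varying effort before 1991, at least two visits per spring from
    1991 and at least four visits per spring from 2016.
    """
    year = sample_date.year
    if year < 1931:
        raise ValueError("monitoring began in 1931")
    if year < 1991:
        return "sporadic"
    if year < 2016:
        return "min_two_visits"
    return "min_four_visits"


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature recorded in degrees Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Example usage with an arbitrary date and reading.
era = protocol_era(date(2004, 4, 12))
water_temp_c = fahrenheit_to_celsius(50.0)
```

Keeping the era as an explicit covariate lets an analysis account for the change in effort rather than treating all years as equally sampled.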
In 2016, the new protocol dictated more parameters to be sampled consistently. Upon arrival at each vernal pool, initial observations performed included:
1. weather conditions: sky code, wind code, air temperature (°F) and previous day precipitation occurrence; and
2. qualitative measurements: water level (%), surface ice (%), surface vegetation (%), surface vegetation species, water odour and fairy shrimp (order Anostraca) presence.

Figure caption: Scanned historical occurrence notecards. Dates, locations, occurrence stages and any water quality or environmental indicators were extracted from narrative statements and are included in the described dataset.
The data are recorded by hand on standardised data sheets (Fig. 4) and later entered into a machine-readable format. All amphibian species counts since 2016 have been collected using the Double-Observer Dependent method (Jacobson 1979, Nichols et al. 2000). Observer A walks halfway around the pool and calls out counts to Observer B. Observer B records the counts made by Observer A and also records, without informing Observer A, any counts that Observer A missed. Observers A and B switch roles when the second half of the pool is surveyed. Prior to November 2017, water from the vernal pool was collected in a high-density polyethylene bottle and temperature was immediately recorded on-site with an analogue thermometer. The sample was then brought back to the DSRC where pH was measured. Prior to 13 December 1991, pH was measured using a Sargent-Welch analogue pH meter. Beginning 13 December 1991, pH was measured with a Fisher Accumet digital pH meter with higher resolution and probes were replaced every 6 months. In 2002, the meter was replaced with a new Fisher Accumet pH meter. Starting in February 2018, temperature and pH were collected in situ using a YSI Sonde Professional Plus hand-held multiparameter meter and additional Sonde measurements of percent dissolved oxygen, nitrate concentrations and conductivity were incorporated in the water quality measurements. Starting in March 2019, a turbidity tube was incorporated to measure turbidity.

Figure caption: Current format of the vernal pool monitoring data sheet for ongoing data collection. As additional data are collected, the data object in the Environmental Data Initiative repository (package identifier: edi.398) will be updated accordingly.
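Counts collected with the dependent double-observer design described above can, in principle, be used to estimate observer detection probabilities and a detection-corrected total. The sketch below is illustrative only: the source does not state that such estimates are computed for this dataset, the class and function names are hypothetical, and the closed-form estimators assume each observer's detection probability is the same in both halves of the pool. The example counts are made up.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DoubleObserverCounts:
    """Counts pooled over surveys; names x_ij are illustrative."""
    x11: int  # seen by observer 1 while observer 1 was primary
    x21: int  # extra animals seen by observer 2 while observer 1 was primary
    x22: int  # seen by observer 2 while observer 2 was primary
    x12: int  # extra animals seen by observer 1 while observer 2 was primary


def estimate_detection(c: DoubleObserverCounts) -> Tuple[float, float, float, float]:
    """Return (p1, p2, p_overall, n_hat).

    Model assumed here: E[x11] = N1*p1, E[x21] = N1*(1-p1)*p2,
    E[x22] = N2*p2, E[x12] = N2*(1-p2)*p1. Solving these expectations
    gives the moment estimators below.
    """
    if c.x11 <= 0 or c.x22 <= 0:
        raise ValueError("both primary-observer counts must be positive")
    numerator = c.x11 * c.x22 - c.x12 * c.x21
    p1 = numerator / (c.x11 * c.x22 + c.x22 * c.x21)
    p2 = numerator / (c.x11 * c.x22 + c.x11 * c.x12)
    # Probability that at least one of the two observers detects an animal.
    p_overall = 1.0 - (c.x12 * c.x21) / (c.x11 * c.x22)
    if p_overall <= 0:
        raise ValueError("counts imply non-positive detection probability")
    total_seen = c.x11 + c.x21 + c.x22 + c.x12
    n_hat = total_seen / p_overall
    return p1, p2, p_overall, n_hat


# Made-up counts for one species at one pool.
counts = DoubleObserverCounts(x11=40, x21=5, x22=36, x12=6)
p1, p2, p_overall, n_hat = estimate_detection(counts)
```

Because the secondary observer only records what the primary observer missed, the two sets of counts are not independent, which is why the estimators rely on the observers switching roles.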

Quality control:
Prior to the publication of the dataset, historical records were checked for biological consistency. During field surveys, utmost caution was used when monitoring and sampling from the vernal pools. To avoid disturbance, observers did not wade in the pools and the YSI Sonde was only submerged in areas that did not contain visible animals, egg masses, spermatophores or larvae. Starting in 2017, equipment and boots were disinfected with a 5% bleach solution between vernal pool surveys to avoid transferring disease. The YSI Sonde was calibrated before each vernal pool survey to ensure that water quality measurements were accurate; probes were replaced as needed.

Taxonomic coverage
Description: The described dataset, ongoing as of February 2020, includes 2,480 recorded dates, sampled across the nine amphibian species. Spotted salamanders were surveyed on the greatest number of dates, with 521 unique events, and blue-spotted salamanders were surveyed the least, with 144 unique events (Fig. 6). Across all species, life stages and sampling events, a total of 151,701 individuals were recorded. All the species included in the sampling are native to this region. Three of the sampled Ambystoma species (Jefferson salamander, marbled salamander and blue-spotted salamander) were listed in 2013 as species of special concern in the State of New York as defined in Section 182.2(i) of 6NYCRR Part 182. As defined by the State, species of special concern need increased consideration and monitoring, but current information does not warrant listing these species as threatened or endangered.
The current taxonomic authority of the dataset is the Integrated Taxonomic Information System (ITIS) (Bisby et al. 2006). If the ITIS taxonomic classification of the monitored species changes, the dataset will be updated to reflect those changes. In particular, there has been significant debate about the appropriate genus name for green frogs, wood frogs and bullfrogs. The elevation of Lithobates as a genus name for these species has been questioned by Yuan et al. (2016) and Pauly et al. (2009). Presently, Rana is used by the vast majority of ranid systematists around the world, so we anticipate a transition from Lithobates to Rana in ITIS in the near future and a subsequent update of the data package.
Figure caption: Pie chart showing the percent of dataset occurrence records by species. Jefferson/blue-spotted salamander complex omitted from the total, with 13 total records (0.5%).

Collection of both species occurrence data and environmental data was sporadic from 1931–1991, but spring data collection from 10 pools has been consistent from 1991–present (Fig. 7). If pools dry up or access is no longer possible, additional pools may be added to have coverage of 10 pools each spring.

IP rights notes:
This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work, if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website.

Description: This dataset includes additional information about the 11 sampling locations included in the dataset, including the name used in the weather and water dataset and the species occurrence dataset and associated information about the vernal pool location and sizes. These data were collected in 1999 and we anticipate an update within the next few sampling years (Barbour 1999).

Column label    Column description
Location        Vernal pool location name
Elevation       Elevation of the vernal pool (metres)
MaxDepth        The maximum depth of the vernal pool at full capacity (metres)
Length          The length of the vernal pool at its longest point (metres)
Width           The width of the vernal pool at its widest point (metres)
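
As a rough illustration of how the location table above might be read into typed records, the sketch below parses comma-separated text that uses the listed column labels. It rests on assumptions: the comma-delimited layout, the class and function names and the sample rows (placeholder pools with made-up numbers) are all hypothetical and are not taken from the data package.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, TextIO


@dataclass(frozen=True)
class PoolLocation:
    location: str        # "Location": vernal pool location name
    elevation_m: float   # "Elevation": metres
    max_depth_m: float   # "MaxDepth": metres, at full capacity
    length_m: float      # "Length": metres, at the longest point
    width_m: float       # "Width": metres, at the widest point


def read_pool_locations(handle: TextIO) -> List[PoolLocation]:
    """Parse rows keyed by the column labels listed in the table."""
    reader = csv.DictReader(handle)
    return [
        PoolLocation(
            location=row["Location"].strip(),
            elevation_m=float(row["Elevation"]),
            max_depth_m=float(row["MaxDepth"]),
            length_m=float(row["Length"]),
            width_m=float(row["Width"]),
        )
        for row in reader
    ]


# Placeholder rows with made-up values, only to exercise the parser.
SAMPLE = """Location,Elevation,MaxDepth,Length,Width
Example Pool A,200,0.5,30,12
Example Pool B,350,1.2,45,20
"""

pools = read_pool_locations(io.StringIO(SAMPLE))
```

Carrying the unit in each field name (for example `elevation_m`) keeps the metres convention of the table visible wherever the records are used.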