Dataset from the Snakes (Serpentes, Reptiles) collection of the Museu Paraense Emílio Goeldi, Pará, Brazil

Abstract
Background
We present a dataset with information from the snake collection of the Museu Paraense Emílio Goeldi, known as the “Ophidia Collection”. This collection currently has 26,728 specimens of snakes, including 9 families, 66 genera and 220 species. For the most part, it represents material from the Amazon Region. Specimens are preserved mostly in wet (alcohol) preparation, with some samples preserved in dry form, as is the case for the shells and skeletons of turtles. The dataset is now available for public consultation on the Global Biodiversity Information Facility portal (https://doi.org/10.15468/lt0wet).
New information
The herpetological collection of the Museu Paraense Emílio Goeldi is the largest collection of its kind in the Amazon region, with about 100,000 specimens of amphibians and reptiles (chelonians, alligators, lizards, snakes and amphisbaenians). This collection currently has 26,728 specimens of snakes, including 9 families, 66 genera and 220 species, some of which are endemic to the Amazon rainforest region. Founded in 1866, the Museu Paraense Emílio Goeldi is the second oldest scientific institution still active in Brazil.


Introduction
The Museu Paraense Emílio Goeldi (MPEG) or Goeldi Museum, located in Belém, Pará, Brazil, is a federal research institution within the Brazilian Ministry of Science, Technology and Communication (MCTIC). Although herpetological studies at the Goeldi Museum were initiated by the Swiss naturalist Emílio Goeldi in the late 19th century, the specimens collected during this period did not remain in the Museum. Other important researchers contributed to the formation and enhancement of the herpetological collections, including Emília Snethlage and Gottfried Hagmann. Emília Snethlage was hired at the Goeldi Museum in June 1905, when she began undertaking extensive fieldwork on scientific expeditions in the Amazon to collect specimens. In 1914, she became director of the Goeldi Museum, the first woman to administer a scientific institution in South America (Junghans 2008).
In June 1965, a Division of Herpetology was established at the Goeldi Museum by Osvaldo R. Cunha. The rapid increase in knowledge of the herpetofauna from the 1950s to 1975 was largely associated with the work of Osvaldo R. Cunha and Francisco P. Nascimento, who developed an ambitious field collection programme (Cunha and Nascimento 1978; Cunha and Nascimento 1993; Hoogmoed et al. 2011). This reptile collection initiative included collaboration with inhabitants of several localities, initially covering the eastern region of Pará and later extending to the south of the state and to the west of Maranhão, Brazil. This long-term study resulted in a series of important papers (as listed in Hoogmoed et al. 2011), which became basic reference works for the eastern Amazonian snakes.
Despite this important sampling effort by Cunha and Nascimento, accessibility and time constraints hindered full coverage of the area, such that many municipalities remained unsampled. More recently, detailed field studies of some of these areas, plus the inclusion of new areas, have increased our knowledge of the herpetological fauna of the region (Avila-Pires 1995; Avila-Pires et al. 2009; Rodrigues and Prudente 2011; Silva et al. 2011; Prudente et al. 2013; Prudente et al. 2018). The herpetological collection of the MPEG constitutes an irreplaceable source of information for these studies and represents a historical record of the occurrence of several species in areas that are now totally altered by anthropic actions, especially in the region known as the "arc of deforestation" (Prudente et al. 2018).
Currently, the herpetological collection is the largest collection of its kind in the Amazon region, with about 100,000 specimens of amphibians and reptiles (chelonians, alligators, lizards, snakes and amphisbaenians) (Fig. 1). There are currently three staff researchers and one associate researcher, as well as undergraduate and graduate students from the Postgraduate Program in Zoology (PPGZOOL) of the Federal University of Pará/MPEG and from the recently created graduate programme in Biodiversity and Evolution (PPGBE) at MPEG. In this paper, we describe and synthesise information about Amazonian snake biodiversity as represented in the collection of MPEG, providing a summary of taxonomic coverage and geographical distribution in the hope of facilitating rapid and dynamic access to these records.

Sampling methods
Sampling description: The specimens were preserved mostly in liquid (alcohol) collections, although some individuals were preserved as dry specimens. The collection also includes tissue samples for molecular studies.
Quality control: The snake collection of MPEG has received material from dozens of scientists who used various sampling methods, including time-constrained search, pitfall traps with drift fences and incidental encounters (Santos-Costa and Prudente 2003; Maschio et al. 2016). The validity of species' names was checked using the catalogue The Brazilian Reptiles - List of species (Costa and Bérnils 2018). Synonymies were checked across all mentioned lists and, if incongruences were found, the earliest name on the record was used for disambiguation. If the names used in the earlier lists did not resolve the nomenclatural inconsistency, geographical ranges of the species were checked and used to assign the currently accepted species name. If identification problems persisted after these steps, the authors carried out a more thorough re-identification of the specimens. Given the current data policy of the Goeldi Museum biological collections, collection sites without specific geographic coordinates, identified only by locality names, were georeferenced using the location of the municipal centre of the region of occurrence, in order to guarantee a consistent distribution for Occurrence Data of Sensitive Primary Species.
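The georeferencing rule described above can be illustrated with a minimal Python sketch. It is not the procedure actually run at the Goeldi Museum: the function name `georeference`, the lookup table `MUNICIPAL_CENTRES` and the approximate coordinates for Belém are assumptions made for illustration, and the field names are borrowed from Darwin Core terms.

```python
# Hypothetical lookup table: (state, municipality) -> (latitude, longitude)
# of the municipal centre. The values are approximate and illustrative only.
MUNICIPAL_CENTRES = {
    ("Pará", "Belém"): (-1.4558, -48.5039),
}


def georeference(record, centres=MUNICIPAL_CENTRES):
    """Return a copy of the record with coordinates filled in where possible.

    Records that already carry coordinates are kept unchanged; records with
    only a locality name fall back to the centre of their municipality.
    """
    out = dict(record)
    has_coords = (
        out.get("decimalLatitude") is not None
        and out.get("decimalLongitude") is not None
    )
    if has_coords:
        out["georeferenceRemarks"] = "original coordinates"
        return out

    centre = centres.get((out.get("stateProvince"), out.get("municipality")))
    if centre is None:
        # No coordinates and no known municipal centre: leave the record as is.
        out["georeferenceRemarks"] = "not georeferenced"
        return out

    out["decimalLatitude"], out["decimalLongitude"] = centre
    out["georeferenceRemarks"] = "municipal centre"
    return out
```

Recording the source of each coordinate pair in a remarks field, as assumed here, lets users of the data separate precise localities from those placed at a municipal centre.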

Taxonomic coverage
Description: The snake collection of MPEG includes 26,728 specimens, representing 9 families, 66 genera and 220 species. About 99% of the specimens come from Brazil, with a few records from Colombia (27 specimens), Argentina (2 specimens), Ecuador, French Guiana, Peru and Suriname (one specimen each). For the most part, this material came from the Amazon. There are 15 holotypes, one neotype (Hydrodynastes bicinctus) and 13 paratypes. More than 98% of the specimens are identified to species level.
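Summary figures of this kind can be recomputed from an occurrence table. The Python sketch below shows one possible way to do so, assuming the records are available as CSV with the Darwin Core columns `family`, `genus`, `specificEpithet` and `country`. The four sample rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Invented sample rows, for illustration only (not real collection records).
SAMPLE = """family,genus,specificEpithet,country
Boidae,Boa,constrictor,Brazil
Viperidae,Bothrops,atrox,Brazil
Viperidae,Bothrops,atrox,Colombia
Dipsadidae,Atractus,,Brazil
"""


def summarise(rows):
    """Count specimens, families, genera and species in occurrence rows."""
    rows = list(rows)
    # A record counts as identified to species when both genus and epithet are set.
    to_species = [r for r in rows if r["genus"] and r["specificEpithet"]]
    return {
        "specimens": len(rows),
        "families": len({r["family"] for r in rows if r["family"]}),
        "genera": len({r["genus"] for r in rows if r["genus"]}),
        "species": len({(r["genus"], r["specificEpithet"]) for r in to_species}),
        "pct_to_species": 100.0 * len(to_species) / len(rows) if rows else 0.0,
        "by_country": Counter(r["country"] for r in rows),
    }


summary = summarise(csv.DictReader(io.StringIO(SAMPLE)))
```

Counting species as distinct genus and epithet pairs, instead of epithets alone, avoids merging different species that share an epithet.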

Notes:
The records span the period from 1900 to 2017 (Fig. 5). A rapid increase in knowledge of the eastern Amazonian herpetofauna from the 1950s to 1975 was largely associated with the work of Osvaldo R. Cunha and Francisco P. Nascimento, MPEG researchers who developed a collection programme in the region during that period (Hoogmoed et al. 2011). Despite significant sampling efforts by several collaborators, accessibility and time constraints hindered full coverage of the area, such that many municipalities remained unsampled.

Data publication protocol
Prior to digitising the collection, the preservation status of the specimens was reviewed, and specimens were identified or their existing identifications were checked. The digitising and publication process followed the protocols of previous work in the Goeldi Museum's ichthyology collection (Silva et al. 2017), as illustrated in Fig. 6. First, all biodiversity and biological collection data were digitised in a Microsoft Excel spreadsheet adopting the Specify format (SPECIFY 2018). Next, the data spreadsheet was loaded into the Specify database using the Workbench tool, which was used to check the data for duplicate records and for consistency and standardisation errors (for example, in geographical coordinates and dates). After this data check, the data were imported. Then the data were exported from the Specify software to the Darwin Core Archive format v1.4 (Wieczorek et al. 2012), creating a dataset with metadata.
In the fourth and final step, a collection dataset repository was created using the Integrated Publishing Toolkit (IPT), which was submitted and published in GBIF (http://www.gbif.org).
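The kind of data check performed before import can be sketched in a few lines of Python. This is an illustrative assumption, not the logic of the Specify Workbench: the function `check_records`, the record layout and the catalogue numbers in the usage example are hypothetical, with field names borrowed from Darwin Core terms.

```python
from datetime import date


def check_records(records):
    """Return a list of (row index, problem) pairs found in the records."""
    problems = []
    seen = set()
    for i, rec in enumerate(records):
        # Duplicate check on the catalogue number.
        cat = rec.get("catalogNumber")
        if cat in seen:
            problems.append((i, "duplicate catalogNumber"))
        seen.add(cat)

        # Coordinates, when present, must fall inside valid ranges.
        lat = rec.get("decimalLatitude")
        lon = rec.get("decimalLongitude")
        if lat is not None and not -90 <= lat <= 90:
            problems.append((i, "latitude out of range"))
        if lon is not None and not -180 <= lon <= 180:
            problems.append((i, "longitude out of range"))

        # Dates are expected as ISO 8601 calendar dates, e.g. 1975-06-12.
        try:
            date.fromisoformat(rec.get("eventDate") or "")
        except (ValueError, TypeError):
            problems.append((i, "eventDate is not a valid ISO date"))
    return problems


# Hypothetical records: the second repeats a catalogue number and carries
# an impossible latitude and month.
records = [
    {"catalogNumber": "MPEG 1", "decimalLatitude": -1.45,
     "decimalLongitude": -48.50, "eventDate": "1975-06-12"},
    {"catalogNumber": "MPEG 1", "decimalLatitude": -95.0,
     "decimalLongitude": -48.50, "eventDate": "1975-13-01"},
]
problems = check_records(records)
```

Returning every problem with its row index, instead of stopping at the first error, allows a whole spreadsheet to be corrected in one pass before it is imported.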

Curatorship and storage
The material is identified by comparison with bibliographic sources and with material already present in the collection. The data and metadata are digitised, and specimens deposited in the collection are maintained under air-conditioning at 22°C. The specimens are fixed in formalin for 24 hours and transferred into a 70% ethanol solution for permanent storage. Snakes are injected at 4-5 cm intervals along the whole length of the belly and tail. Moderate pressure at the base of the tail of a freshly killed snake everts its hemipenes, whose morphology is very helpful in taxonomic determinations. Injecting formalin at the tail base also provides pressure to evert the hemipenes and hardens them at the same time. Samples are stored in glass jars or other containers (for example, high-density polyethylene drums), organised in alphabetical order by family, genus and species. Tissue samples are taken from freshly killed specimens, preserved in ethanol and stored in a freezer. Loans, exchanges and donations of material are made through a request to the curator, who evaluates each proposal.