Biodiversity Data Journal: R Package
Corresponding author: Franz-Sebastian Krah (f.krah@mailbox.org)
Academic editor: Scott Chamberlain
Received: 09 Nov 2018 | Accepted: 08 Jan 2019 | Published: 14 Jan 2019
© 2019 Franz-Sebastian Krah, Scott Bates, Andrew Miller
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Krah F, Bates S, Miller A (2019) rMyCoPortal - an R package to interface with the Mycology Collections Portal. Biodiversity Data Journal 7: e31511. https://doi.org/10.3897/BDJ.7.e31511
The understanding of the biodiversity and biogeographical distribution of fungi is still limited. The small number of online databases and the large effort required to access existing data have prevented their use in research articles. The Mycology Collections Portal was established in 2012 to help alleviate these issues and currently serves data online for over 4.3 million fungal records. However, the current process for accessing the data through the web interface is manual, therefore slow, and precludes the extensive use of the existing datasets. Here we introduce the software package rMyCoPortal, which allows users rapid, automated access to the data. rMyCoPortal makes data readily available for further computations and analyses in the open source statistical programming environment R. We will demonstrate the core functions of the package, and how rMyCoPortal can be employed to obtain fungal data that can be used to address basic research questions. rMyCoPortal is a free and open-source R package, available via GitHub.
Keywords: data portal, database, fungaria, fungi, georeferencing, natural history collections, specimens, MyCoPortal, Symbiota, R package
Global climate and land-use change are major threats to life on earth, and studies continue to document how animal and plant distributions and phenology have changed due to these factors (
Open-source data provide an important resource for studying fungal biodiversity (
Metadata statistics of the Mycology Collections data Portal (MyCoPortal), retrieved in November 2018 via http://mycoportal.org/portal/collections/misc/collstats.php. The MyCoPortal compiles fungal specimen metadata that document distributions of fungi.
Collection Statistic | Number
Occurrence records | 4,369,313
Georeferenced | 1,843,633 (42%)
Imaged | 1,913,838 (44%)
Identified to species | 3,302,781 (76%)
Families | 1,693
Genera | 8,314
Species | 113,811
Total taxa (including subsp. and var.) | 120,275
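The percentages in the table follow directly from the record counts. As a quick check, the sketch below recomputes them in base R. The variable names are illustrative only and are not part of rMyCoPortal.

```r
# Counts copied from the collection statistics table (retrieved November 2018).
total_records <- 4369313
counts <- c(
  georeferenced = 1843633,
  imaged        = 1913838,
  to_species    = 3302781
)

# Share of all occurrence records, rounded to whole percent.
pct <- round(100 * counts / total_records)
print(pct)
```

The rounded shares match the 42%, 44% and 76% reported in the table.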
Although the MyCoPortal has been widely used and cited (>43 citations since 2015), its data can be difficult to access for automated analyses because the platform requires users to download data manually through a web interface. This procedure is very time-consuming, especially when working with complex queries and building large datasets. After download, the data then needs to be processed further before basic exploration can be undertaken.
To this end, the first author has developed software that allows rapid and automated access to a large global database of fungal distribution records, eliminating the need to use the existing web interface. rMyCoPortal is written as a package for the popular R open-source statistical software (
The package can be downloaded and installed using the R package devtools (
## Install R package rMyCoPortal
install.packages('devtools')
devtools::install_github('FranzKrah/rMyCoPortal')
library('rMyCoPortal')
# Now Docker needs to be installed.
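Because the last step above depends on Docker, it can be useful to confirm from within R that a Docker executable is visible before running any queries. The following is a minimal sketch using only base R. It checks the search path and nothing else, so it does not verify that the Docker daemon is running, and the variable names are illustrative.

```r
# Look up the Docker executable on the system search path.
# Sys.which() returns an empty string when the program is not found.
docker_path <- Sys.which("docker")
docker_available <- nzchar(docker_path)[[1]]

if (docker_available) {
  message("Docker found at: ", docker_path)
} else {
  message("Docker not found on the search path; install it before querying.")
}
```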
Downloading and using the package does not require a GitHub account. An account is, however, required if the user would like to contribute to the functions of rMyCoPortal or open an issue.
Here, we present the core functions of the rMyCoPortal package. rMyCoPortal makes use of several R packages that allow interaction with web data content, including RSelenium (
At the core of rMyCoPortal is the function mycoportal. Using mycoportal, queries can be made to find all records of a known fungal species. Further, all input specifications (i.e., query modifiers) that are present on the website can be adjusted within said function. Some important modifiers are the inclusion of synonyms or the geographic area. The user may also input a higher taxon, e.g., genus or family. The records are then stored in an S4 class object, which can be directly subjected to a variety of plotting functions. The functions plot_distmap and plot_datamap can be used to visualize species distributions and heatmaps of species diversity, respectively (Fig.
Three data analysis techniques enabled by the rMyCoPortal R package.
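For readers unfamiliar with S4, the idea of storing a query together with its records in one object can be pictured with a small example. The sketch below defines a hypothetical class, FungalRecords, with an accessor n_records. Neither name comes from rMyCoPortal, the class actually used by the package may be organised differently, and the data are made up for illustration.

```r
library(methods)

# Hypothetical class for illustration; NOT the class defined by rMyCoPortal.
setClass(
  "FungalRecords",
  representation(
    query   = "character",   # the taxon that was searched for
    records = "data.frame"   # one row per specimen record
  )
)

# A generic accessor, so downstream code need not touch slots directly.
setGeneric("n_records", function(x) standardGeneric("n_records"))
setMethod("n_records", "FungalRecords", function(x) nrow(x@records))

# Made-up example data.
toy <- new(
  "FungalRecords",
  query = "Amanita muscaria",
  records = data.frame(
    state = c("Oregon", "Oregon", "Maine"),
    lat   = c(44.1, NA, 45.3),
    lon   = c(-121.5, NA, -69.2)
  )
)

n_records(toy)
```

Keeping the query alongside the records means that plotting or summary methods can label their output without the user having to pass the taxon name again.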
The following code demonstrates the usage of basic functions of the rMyCoPortal package:
## Download data for Amanita muscaria
am.rec <- mycoportal(taxon = 'Amanita muscaria')
## Plot species distribution
plot_distmap(x = am.rec, mapdatabase = 'state', interactive = FALSE)
## Plot records heatmap for states of USA
plot_datamap(x = am.rec, mapdatabase = 'state', index = 'rec')
The above code demonstrates the core functionality of the rMyCoPortal package for querying fungal records and also for basic data exploration. Using the mycoportal function, we queried the database for all observations for the mushroom-forming fungus Amanita muscaria (fly agaric) and modelled the current and future projected habitat suitability using the biomod2 R package (Fig.
The data contained in the MyCoPortal database is an important resource for addressing ecological research questions; however, some limitations should be considered. First, the majority of the data within the database comes from North America. Second, currently only 42% of the records are georeferenced with longitude/latitude values (Table
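In practice, the georeferencing limitation means that spatial analyses usually start by dropping records without coordinates. The sketch below shows one way to do this in base R on a made-up data frame. The column names lat and lon are assumptions and may not match the fields returned by the package.

```r
# Made-up records for illustration; column names are assumptions.
recs <- data.frame(
  species = c("Amanita muscaria", "Amanita muscaria",
              "Boletus edulis", "Boletus edulis"),
  lat = c(44.1, NA, 45.3, NA),
  lon = c(-121.5, NA, -69.2, -70.0)
)

# A record counts as georeferenced only if both coordinates are present.
has_coords <- !is.na(recs$lat) & !is.na(recs$lon)
georef <- recs[has_coords, ]

# Share of records that can be placed on a map.
share_georef <- mean(has_coords)
```

Reporting share_georef alongside any map makes it clear how much of the queried data the map actually represents.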
In this paper, we have shown how the R package rMyCoPortal can be used to access the Mycology Collections data Portal. The package allows easy and rapid access to MyCoPortal fungal data, speeding up a process that would otherwise be tedious and slow. Connecting the MyCoPortal database to the R statistical interface opens a wide range of research possibilities, where queried data can be efficiently processed and used to address scientific questions. Further, the rMyCoPortal package has the potential to be modified to access other data portals, such as those for vascular plants (http://swbiodiversity.org/seinet/) or arthropods (http://symbiota4.acis.ufl.edu/scan/portal/), which are also built on the Symbiota platform. We hope this R package inspires scientists to conduct studies on how fungal biodiversity and biogeography respond to global climate change.
The package, together with documentation and vignettes, is available on GitHub: https://github.com/FranzKrah/rMyCoPortal
It is open-source software (published under the GNU General Public License, version 3).
The authors have no conflict to declare.