Capturing biodiversity: linking a cyanobacteria culture collection to the “scratchpads” virtual research environment enhances biodiversity knowledge

Abstract
Background: Currently, cyanobacterial diversity is examined using a polyphasic approach by assessing morphological and molecular data (Komárek 2015). However, the comparison of morphological and genetic data is sometimes hindered by the lack of cultures of several cyanobacterial morphospecies and inadequate morphological data of sequenced strains (Rajaniemi et al. 2005). Furthermore, in order to evaluate the phenotypic plasticity within defined taxa, the variability observed in cultures has to be compared to the range in natural variation (Komárek and Mareš 2012). Thus, new tools are needed to aggregate, link and process data in a meaningful way, in order to properly study and understand cyanodiversity.
New information: An online database on cyanobacteria has been created, namely the Cyanobacteria culture collection (CCC) (http://cyanobacteria.myspecies.info/), using as case studies cyanobacterial strains isolated from lakes of Greece, which are part of the AUTH culture collection (School of Biology, Aristotle University of Thessaloniki). The database hosts, for the first time, information and data such as morphology/morphometry, biogeography, phylogeny, microphotographs, distribution maps, toxicology and biochemical traits of the strains. All these data are structured, managed, and presented online and are publicly accessible through a recently developed tool, namely “Scratchpads”, a taxon-centric virtual research environment that allows users to browse the taxonomic classification and retrieve various kinds of relevant information for each taxon.


Introduction
Biodiversity encompasses the variety of life at all possible levels of biological organisation (from genes to ecosystems) and scales of observation (from local to global). Therefore, studies of biodiversity are predicated on the capacity to bring together information from across a diverse spectrum of scientific fields (Koureas et al. 2016). The Mediterranean area is a known biodiversity hot spot; however, the diversity of microbes is substantially underestimated or unexplored (Coll et al. 2010). The diversity of freshwater cyanobacteria, especially those involved in water blooms, has attracted attention, as studies have shown that prolonged cyanobacterial blooms, dominated by known toxic species, can occur (Gkelis et al. 2014). Furthermore, cyanobacteria are a prolific source of natural products, known from just a handful of genera (Dittmann et al. 2015), and emerging data are providing a genetic basis for this natural product diversity. This is expected to set up an integrated research workflow that will increase the efficiency of biodiscovery pipelines.
Cyanobacteria are a large and morphologically very diverse group of photosynthetic prokaryotes, which occur in almost every illuminated habitat and are quantitatively among the most important organisms on Earth (Whitton 2012). Today, cyanobacterial diversity is examined using a polyphasic approach by assessing morphological and molecular data (Komárek 2015). The comparison of morphological and genetic data is sometimes hindered by the lack of cultures of several cyanobacterial morphospecies and inadequate morphological data of sequenced strains (Rajaniemi et al. 2005). Furthermore, in order to evaluate the phenotypic plasticity within defined taxa, the variability observed in cultures has to be compared to the range in natural variation (Komárek and Mareš 2012).
Biodiversity research is at a pivotal point, with research projects generating data at an ever-increasing rate. Structuring, aggregating, linking and processing these data in a meaningful way is a major challenge (Koureas et al. 2016). The need for efficient informatics tools in biodiversity research is constantly increasing, and this is reflected in the number of different biodiversity information projects (>680) (http://www.tdwg.org/biodiv-projects/) currently running at a local, regional or global level. However, only very few (fewer than five) projects are dedicated to bacteria or algae. To the best of our knowledge, apart from AlgaeBase (Guiry and Guiry 2016), which comprises information on all terrestrial, marine and freshwater algae, there is only one online database listing cyanobacterial genera (Komárek and Hauer 2013); other databases contain only taxonomic information and/or images.
In this paper, we present the "Cyanobacteria culture collection", a database on cyanobacteria hosting, for the first time, information such as morphology/morphometry, biogeography, phylogeny, microphotographs, distribution maps, toxicology, and biochemical traits of cyanobacteria strains isolated from freshwaters of Greece. All these data are structured, managed, and presented online and are publicly available through Scratchpads (Smith et al. 2009).

General description
Purpose: The purpose of this database is to make available data associated with cyanobacteria in Greece. The database features information about different traits (morphological, morphometric, biochemical) of cyanobacteria strains. The dataset represents a long-term and ongoing survey that aims to be useful in future investigations of cyanobacterial diversity, phylogeny and ecology, and in the discovery of new metabolites.

Sampling methods
Study extent: This dataset was primarily developed to summarise our ongoing effort to explore the biodiversity (morphological, genetic, metabolite) of photosynthetic organisms. Thus, the strains comprising the dataset were isolated from freshwaters of Greece during the past 15 years. However, marine cyanobacteria strains isolated from the Aegean Sea and thermophilic strains isolated from thermal springs (unpublished data) are soon to be included.

Sampling description:
The strains were isolated during the years 1999-2015 from 12 different freshwater lakes and reservoirs (Table 1). Strains were isolated on solid and/or liquid growth media using classical microbiological techniques and grown as batch clonal unialgal cultures; all strains were derived from a single colony or trichome. More information on sampling sites and strain isolation is given in Gkelis et al. (2015). The strains were identified to the species or genus level according to Komárek and Anagnostidis (1999), Komárek and Anagnostidis (2005) and Komárek (2013), taking into consideration the current taxonomic status (Komárek 2015).

Geographic coverage
Description: All taxa in the database were isolated from several Greek freshwater bodies. However, the database is constantly being expanded, so strains from other locations across Greece will be present in the database in the near future.

Traits coverage
Information for each strain is given in different tabs, which become available after a particular strain is chosen. Some strains were characterised based on their morphological features and 16S rRNA gene sequences (Gkelis et al. 2005), screened with respect to their ability to produce cyanotoxins (Gkelis et al. 2015) or their antibacterial traits (Lorenzo et al. 2013). This information is contained in the "Descriptions" tab, where all available morphological/morphometrical, toxicity and biochemical data are given (Fig. 2). The "Media" tab contains microphotographs, whereas the "Literature" and "Maps" tabs refer to the relevant literature and the region where the strain was isolated, respectively (Fig. 3). About 12 traits per isolate are currently given in the database.
Figure 1. Preview of the "Cyanobacteria culture collection" database. The taxonomy system is presented as part of the "Cyanobacteria" tab; an overview of the strain Chroococcus minutus AUTH 0599 is shown as an example.
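The tab layout described in this section can be pictured as a simple per-strain record. The following is only a minimal sketch in Python, not the Scratchpads data model: the class `StrainRecord`, its field names and the helper methods are hypothetical, and the values filled in are placeholders rather than data from the collection. Only the tab names and the strain designation are taken from the text.

```python
from dataclasses import dataclass, field

# Tab names as listed in the text; everything else is illustrative.
TABS = ("Descriptions", "Media", "Literature", "Maps")


@dataclass
class StrainRecord:
    """Hypothetical in-memory view of one strain page (not the Scratchpads schema)."""

    taxon: str
    strain_id: str
    descriptions: dict[str, str] = field(default_factory=dict)  # trait name -> value
    media: list[str] = field(default_factory=list)              # microphotograph labels
    literature: list[str] = field(default_factory=list)         # related references
    maps: list[str] = field(default_factory=list)               # isolation regions

    def populated_tabs(self) -> list[str]:
        """Return the tabs that currently hold at least one entry."""
        content = {
            "Descriptions": self.descriptions,
            "Media": self.media,
            "Literature": self.literature,
            "Maps": self.maps,
        }
        return [tab for tab in TABS if content[tab]]

    def trait_count(self) -> int:
        """Number of traits recorded under the Descriptions tab."""
        return len(self.descriptions)


# Strain designation from the text; the entries below are placeholders only.
record = StrainRecord(taxon="Microcystis flos-aquae", strain_id="AUTH 1510")
record.descriptions["cell width"] = "<placeholder, not a measured value>"
record.literature.append("<placeholder reference>")

print(record.populated_tabs())  # ['Descriptions', 'Literature']
print(record.trait_count())     # 1
```

Keeping each tab as its own field mirrors the way the information is presented per strain, and a count such as `trait_count()` corresponds to the "traits per isolate" figure quoted above.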

Collection data
Collection name: Aristotle University of Thessaloniki (AUTH) microalgae collection (Department of Botany, School of Biology)
Figure 2. The "Descriptions" tab, including morphometric (cell width, filament length), toxicity and biochemical trait data for the strain Microcystis flos-aquae AUTH 1510. These data are shown after clicking the desired taxon in the backbone taxonomy.
Figure 3. "Literature" and "Maps" tabs for the strain Chroococcus minutus AUTH 0599.