Towards retrieving the Promethean treasure: a first molecular assessment of the freshwater fish diversity of Georgia

Abstract
In this study, we provide a first estimation of the molecular diversity of the freshwater fishes of Georgia. In addition to field collections, we integrated DNA barcode data obtained from recent works and public databases (BOLD and NCBI GenBank). Currently, the DNA barcode reference library for freshwater fishes of Georgia comprises 352 DNA barcodes for 50 species, 36 genera and 15 families (52% of total Georgian freshwater fish diversity), of which 162 DNA barcodes belonging to 41 species were newly generated as part of this study. A total of 22 species are reported from the Caspian Sea basin and 31 from the Black Sea basin. Amongst the studied taxa, seven species were found with large interspecific divergences (> 2%), while 11 species were found to share DNA barcodes within our dataset. In the course of the study, we found the first evidence of the existence of Gymnocephalus cernua (Linnaeus, 1758) in Georgia and confirmed the second occurrence of the invasive Rhinogobius lindbergi (Berg, 1933). Based on the evaluation of currently-available barcode data for Georgian fishes, we highlight major gaps and research needs to further progress DNA-based biodiversity studies in Georgia. Though this study lays a solid base for DNA-based biodiversity assessment and monitoring approaches, further efforts within the recently started CaBOL (Caucasus Barcode Of Life) project are needed to obtain reference data for the species still lacking DNA barcodes.

Broadly, the Georgian ichthyofauna can be divided into the eastern Black Sea and Caspian Sea basins. Within the political borders of Georgia, however, three biogeographic regions exist according to Abell et al. (2008) and Naseka (2010): A) the Black Sea basin, which covers the whole territory of west Georgia and is separated from east Georgia by the Likhi and Meskheti ridges; B) the Kura basin; and C) the Terek basin in the northern part of the country, which is separated from the Kura basin by the Greater Caucasus Mountain Range (Fig. 1).
Since the late eighteenth century, industrial and economic developments have led to severe environmental changes in the whole Caucasus region (Davtyan 2014, Freyhof et al. 2015), similar to other parts of the world (Smith et al. 1999). Since freshwater ecosystems are particularly sensitive and vulnerable to alterations in (gravel mining, hydropower plants) and adjacent to (land use) the water body (Allan and Flecker 1993), negative impacts on the ichthyofauna have likely occurred, but have never been scientifically evaluated. This is also due to a general data deficiency concerning the whole freshwater realm of the region. Along with taxonomic uncertainties in some species, major gaps still exist in the accurate knowledge of species' distribution ranges within the Georgian inland waters, while no local conservation assessments exist for any freshwater fish species (Kuljanishvili et al. 2020). The highly-threatened group of sturgeons (Acipenser spp.), for example, has never been officially assessed in Georgia and only the international IUCN status is known. The Red List data for Georgian fishes are inherited from the Soviet time and are thus outdated in many aspects. A new regional assessment, based on the IUCN system, is not yet available. Thus, the extent and magnitude of past disturbance or ongoing threats (such as habitat degradation, poaching, pollution and invasive species) to Georgian freshwater fishes remain largely unknown.
Along with traditional faunistic assessments, molecular genetic tools (such as DNA barcoding) have emerged as an important aid to deal with uncertainties related to taxonomy, species boundaries or cryptic diversity and have helped to enable innovative and efficient ways of biomonitoring (Hebert et al. 2003, Waugh 2007, Hajibabaei 2012, Leese et al. 2018). DNA barcoding is a method to identify species via short, standardised and easily-obtainable DNA fragments (Hebert et al. 2003, Bhattacharya et al. 2015, Lakra et al. 2015, Bingpeng et al. 2018). This method is not limited to identifying specimens, but can also help to screen for unrecognised species diversity (Barman et al. 2018).
An important step for DNA barcoding to be useful in biodiversity studies and monitoring is to develop a DNA barcode library for a particular taxon or area. The successful completion of this step, however, requires the integration of traditional taxonomic expert knowledge and DNA technology. Traditional taxonomic expertise (not only in ichthyology), based on academic training, has been largely neglected in Georgia, as well as in many other countries, as part of the phenomenon known as the 'taxonomic impediment' (e.g. Giangrande 2003). DNA-based technologies for biodiversity inventories and research, in contrast, are gathering momentum in Georgia and can still provide new insights into freshwater fish diversity (Japoshvili et al. 2013, Ninua et al. 2018, Levin et al. 2019). As a result, new, not yet fully exploited DNA-based information on Georgian fish diversity and distribution has accumulated in recent years. Therefore, the aim of our work was to contribute to the development of a DNA barcode reference library for Georgian freshwater fishes and to summarise the current state of knowledge. We thus establish the starting point for DNA-based biodiversity evaluation and monitoring efforts of freshwater fishes of the Southern Caucasus region. The presented data will further aid in identifying taxonomically-interesting cases that need to be solved in the future.

Data collection and DNA barcoding
In July 2018 and July 2019, concerted collecting activities (BioBlitzes) were organised by the Ilia State University (ISU, Georgia) and the Zoologisches Forschungsmuseum Alexander Koenig (ZFMK, Germany) in the Kintrishi area in Western Georgia (N41.76 E42.02) and in the Kazbegi region in Northern Georgia (N42.65 E44.64), respectively (Thormann et al. 2019). During these events, fish sampling campaigns (permissions: #5615/01, #21/824 and #3875 -2018/2019 issued by the Ministry of Environmental Protection and Agriculture of Georgia) were conducted in the Kintrishi and Terek River basins via electrofishing (device EFGI-650, http://www.electric-fishing.de) and frame net. A subsample of the collected fishes was anaesthetised with MS-222; a fin-clip was then taken and stored in 99.9% molecular grade ethanol and the specimens were fixed in 5-7% formaldehyde. Alternatively, specimens were directly fixed in 99% molecular grade ethanol. In addition, material collected in different areas (Fig. 1) prior to the above-mentioned activities was included (see Suppl. material 1). All specimens were identified to species level using standard morphological characters (e.g. Kottelat and Freyhof 2007) and tissue samples of selected specimens were submitted to DNA barcoding routines at ZFMK.

Data processing
Data processing and sequence assembly were done with the software Geneious Pro v.7 (Drummond et al. 2011) and the Muscle algorithm (Edgar 2004) was used to align the DNA barcodes after manually screening for indels or stop codons. All newly-generated DNA sequences of acceptable quality (less than 1% ambiguous bases and free of stop codons) were submitted, together with relevant metadata, to the Barcode of Life Datasystem (BOLD, http://v4.boldsystems.org/), where they were automatically assigned Barcode Index Numbers (BINs). They can be accessed via the public dataset "Georgian Freshwater Fishes" (DS-GGBCPIS).
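The acceptance criterion stated above (less than 1% ambiguous bases and no stop codons) can be illustrated with a minimal Python sketch. It is not the Geneious or BOLD workflow used in the study: the function names are hypothetical, the input is assumed to be an unaligned sequence, and, because the reading frame is not stated in the text, a sequence is accepted here if at least one of the three forward frames is free of vertebrate mitochondrial stop codons.

```python
# Illustrative sketch only; all names are hypothetical.
# Stop codons of the vertebrate mitochondrial code (NCBI translation table 2).
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}
UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Share of characters that are not A, C, G or T (e.g. N, R, Y)."""
    if not seq:
        return 1.0  # treat an empty sequence as unusable
    return sum(base not in UNAMBIGUOUS for base in seq.upper()) / len(seq)


def has_stop_codon(seq: str, frame: int) -> bool:
    """True if any complete codon in the given frame (0, 1 or 2) is a stop."""
    s = seq.upper()
    return any(s[i:i + 3] in MITO_STOPS for i in range(frame, len(s) - 2, 3))


def passes_quality(seq: str, max_ambiguous: float = 0.01) -> bool:
    """Accept a barcode with <1% ambiguous bases and a stop-free reading frame.

    Assumption: the frame is unknown, so any stop-free forward frame counts.
    If the frame is known, check only that frame instead.
    """
    if ambiguous_fraction(seq) >= max_ambiguous:
        return False
    return any(not has_stop_codon(seq, frame) for frame in range(3))
```

Checking all three frames is deliberately lenient; a stricter filter would fix the frame from a reference COI sequence.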
In addition to the newly-generated DNA barcodes, we included all BOLD-deposited DNA barcodes that originated from Georgia. Sequences from the BOLD database were included in our dataset if the specimen metadata explicitly stated the origin of the sample and provided geo-referenced data (Suppl. material 1). Subsequently, sequence divergence and relationships between and within taxa were evaluated with the BOLD v4 tools, based on uncorrected p-distance. A Neighbour-Joining tree (based on K2P distances) with 1000 bootstrap replicates was constructed to investigate congruence between morphological identity and genetic relationships. Analyses were performed using MEGA X software (Kumar et al. 2018) and statistical tools provided by BOLD Systems (Ratnasingham and Hebert 2007).
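The two distance measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the code used by BOLD or MEGA X: the function names are hypothetical, the sequences are assumed to be aligned to equal length, and sites with gaps or ambiguous bases are skipped pairwise (an assumption, since the gap treatment is not stated in the text). The Kimura 2-parameter (K2P) distance is computed from the proportions of transitions P and transversions Q as d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)).

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(a: str, b: str):
    """Return (compared sites, transitions, transversions) for two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # assumed pairwise deletion of gaps and ambiguous bases
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return sites, transitions, transversions


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance: proportion of compared sites that differ."""
    sites, ts, tv = count_differences(a, b)
    if sites == 0:
        raise ValueError("no comparable sites")
    return (ts + tv) / sites


def k2p_distance(a: str, b: str) -> float:
    """Kimura 2-parameter distance from transition and transversion proportions."""
    sites, ts, tv = count_differences(a, b)
    if sites == 0:
        raise ValueError("no comparable sites")
    p, q = ts / sites, tv / sites
    term1, term2 = 1 - 2 * p - q, 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

For closely related sequences the two measures are nearly identical; the K2P correction only becomes noticeable at larger divergences.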

Results and discussion
DNA barcodes of 352 individuals representing 50 species, 36 genera and 15 families (52% of Georgian freshwater fish diversity) are currently available for Georgian fishes. Of these, 162 COI sequences were newly generated for this study through the GGBC (Georgian-German Biodiversity Center) initiative, 153 were contributed through the FREDIE (Freshwater Diversity Distribution for Europe) project (https://www.fredie.eu/), 19 sequences stem from the "Russian Freshwater Fishes" project on BOLD and 18 DNA barcodes were mined from GenBank through BOLD. In the final dataset of all 352 barcodes, the COI sequences were, on average, 648 base pairs long (minimum 465 and maximum 658) and contained no stop codons, insertions or deletions. A total of 82 positions out of 658 (13%) were variable, of which 60 positions (9%) were diagnostically informative. On average, nucleotide base frequency (A: 24.47%, C: 27.67%, G: 18.56%, T: 29.29%) and GC content (46.24%) were well within the range known for fishes (see, for example, Bingpeng et al. (2018)). Distance summary statistics are provided in Table 1, showing significant differences in average p-distance amongst the family, genus and species levels.
The Neighbour-Joining tree (Fig. 3) resolved almost all morphological species as unique clusters, in congruence with morphological species identification. However, three of the genera (Chondrostoma, Gobio, Salmo) showed complicated sequence relationships, where maximum within-species distances were larger than the minimum between-species distances amongst the congenerics (Suppl. material 2) and bootstrap support for species-level clusters on the tree was low (Fig. 3).
Table 1. Summary table of K2P genetic distances within the different taxonomic levels derived from 349 specimens analysed. The list of studied species is provided in Suppl. material 1.
Figure 2. Barcode frequency distribution for Georgian fish species in the BOLD System at the time of writing.
Seven species (Fig. 4) showed maximum interspecific divergence larger than 2% (p-distance), pointing them out as interesting subjects for further in-depth studies. All these species await additional studies to clarify their taxonomic positions. The genera Chondrostoma, Rutilus, Alburnus, Neogobius and Oxynoemacheilus are diverse in the Caucasus region (i.e. more than two species according to Kuljanishvili et al. (2020)), yet unambiguous systematics, detailed distributions of the separate species and genetic profiling are still lacking for them. For instance, Kuljanishvili et al. (2020) assume the occurrence of four species of Rutilus in the South Caucasus. However, a study based on the Cytochrome b marker by Levin et al. (2017) suggests the possibility that only a single, genetically highly polymorphic species, R. lacustris, occurs in the Caucasus region. Likewise, the systematics of Caucasian species and genera of Gobiidae and Nemacheilidae are amongst the most confusing, as already pointed out by Kuljanishvili et al. (2020). For these taxa, our COI barcode dataset is insufficient to draw meaningful conclusions; however, large intraspecific distances further indicate possible, as yet undescribed, diversity within these taxa. For instance, regionally monospecific genera with high intraspecific genetic distances, such as Phoxinus and Proterorhinus, are interesting taxa that might be represented by genetically deeply-structured populations (if not cryptic species complexes) associated with the Black vs Caspian Sea basins.
The Rioni loach (Fig. 4) is genetically closer to O. brandtii from the Kura River (7.5% min. K2P distance) than to the presumably also Georgian O. cemali (11.9% min. K2P distance). It is, therefore, highly unlikely that the Rioni loach belongs to one of the two species (Fig. 5). As the taxonomic status of Oxynoemacheilus species in the study area is only partially understood (e.g. O. angorae alasanicus, O. bergi, O. brandti gibbusnazus or O. lenkoranensis; see Kuljanishvili et al. (2020)), further studies with larger series of adult specimens are needed to address these issues and allow taxonomically sound examinations and conclusions.
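The comparisons discussed above (maximum within-species distance against the minimum distance to any other species, a 2% divergence threshold and shared barcodes) can be sketched as a simple "barcode gap" summary. This is an illustrative sketch under stated assumptions, not the BOLD barcode gap analysis itself: the function names and the record format (species label, aligned sequence) are hypothetical, and uncorrected p-distance with pairwise deletion of gaps and ambiguous bases is assumed.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance over sites where both sequences have A, C, G or T."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def barcode_gap(records, threshold=0.02):
    """Summarise each species from (species, aligned_sequence) records.

    max_intra: largest distance between two barcodes of the same species
               (None if the species has a single barcode).
    min_inter: smallest distance to a barcode of any other species
               (None if the dataset contains only one species).
    """
    max_intra, min_inter = {}, {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, float("inf")), d)
    summary = {}
    for sp in {species for species, _ in records}:
        intra, inter = max_intra.get(sp), min_inter.get(sp)
        summary[sp] = {
            "max_intra": intra,
            "min_inter": inter,
            # within-species divergence above the threshold (here 2%)
            "deep_intraspecific": intra is not None and intra > threshold,
            # no barcode gap: within-species spread reaches the nearest neighbour
            "no_gap": intra is not None and inter is not None and intra >= inter,
            # identical barcode found in another species
            "shares_barcode": inter == 0.0,
        }
    return summary
```

Species flagged by `no_gap` or `shares_barcode` cannot be separated reliably by the barcode alone in such a dataset, while `deep_intraspecific` marks candidates for closer taxonomic study.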

DNA barcoding confirmed for the first time an occurrence of Gymnocephalus cernua (Linnaeus, 1758) and contributed the second record of the alien Rhinogobius lindbergi (Berg, 1933) in Georgia. The former species has never been considered to occur in the country (Elanidze 1983, Ninua and Japoshvili 2008, Kuljanishvili et al. 2020), but is abundant in the adjacent northern Caucasus area (Kottelat and Freyhof 2007). Possibly, the species is currently extending its range to the Southern Caucasus, although reports from fishermen are lacking. As the species is a strong invader and has drastically expanded its range over the last decades (e.g. to the North American Great Lakes, where it possibly poses a threat to their endemic fish fauna (Gunderson et al. 1998, Newman 1999)), monitoring its status in Georgia is recommended. Rhinogobius lindbergi was recently first reported from Georgia as an alien species. Our data confirm the finding and further indicate a widespread distribution of this species in eastern Georgia, as previously supposed. The likely introduction pathways or vectors for this species are currently unknown. Direct migration from southern Caspian rivers is probably impossible due to impermeable barriers at the Mingachevir reservoir. Accordingly, R. lindbergi must have been introduced into eastern Georgian rivers by humans, most probably unintentionally.

Conclusions
Georgia, as part of the Caucasus and Irano-Anatolian biodiversity hotspots, is distinguished by its unique biodiversity and rich freshwater resources, which have been strongly impacted by anthropogenic pressure throughout the 20th century until the present. During the Soviet time, large-scale industrial projects presumably had a strong influence on the Georgian biodiversity and especially on the freshwater fauna. An example of this is the construction of the Mingachevir Dam in Azerbaijan, which acted as an insurmountable obstacle for anadromous fishes such as sturgeons, Caspian lampreys and salmon (Kuljanishvili et al. 2020). As a result, these species (in particular, lampreys and sturgeons) lost their spawning areas in the whole upper Kura basin. In addition, from the 1930s to the 1980s, alien species such as gibel carp and topmouth gudgeon were introduced intentionally or accidentally to Georgia (Ninua and Japoshvili 2008, Japoshvili et al. 2013, Kuljanishvili et al. 2020). Although large industrial projects were halted with the collapse of the Soviet Union, the recent extensive development of small and medium-sized hydropower plants in Georgia will presumably have negative impacts on the local freshwater biodiversity. In addition, illegal fishing, range expansion of non-native species, water pollution and habitat modification will alter the population dynamics and distribution of most native freshwater fishes of Georgia, including rare, endemic and especially anadromous species. Given these expectations, intensive study and monitoring of fishes is highly recommended to estimate population changes and species distribution and for subsequent planning of conservation activities and mitigation of irreversible diversity loss. The fastest and perhaps most cost-effective tools in this regard will be methods based on DNA barcoding (Kress et al. 2015).
However, our study shows that the number of reference barcodes available for Georgian fishes is not yet sufficient to implement full-scale fish diversity monitoring programmes. Indeed, the barcodes of nearly 50% of Georgian fish species are not yet available. This is mainly due to the limited financial and human resources to investigate the fish diversity on the one hand and to the poor museum collections available for Georgian fishes on the other. For example, the largest fish collection, kept in the Georgian National Museum, has been damaged so badly (as a result of incorrect preservation) that it is no longer useful for genetic study. Thus, unresolved taxonomy (such as in Oxynoemacheilus or Squalius) and insufficient barcode coverage are currently major gaps that need to be filled in the near future. We hope this is indeed possible, given the relatively-low species number (compared to mega-diverse regions) and the existing progress in fish research in the region. The recently-initiated CaBOL project (Caucasus Barcode of Life) constitutes a chance to close some of these taxonomic gaps over the next few years.

Author contributions
MFG, GE, BJ and LM developed the conception and design of the study, collected material and analysed data. FH, JA and LM supported the research logistics. GE, LM and MFG drafted the manuscript. All authors participated in revising the article and contributed intellectual content. All authors gave final approval of the submitted version.

Conflicts of interest
The authors have declared no conflicts of interest.