A metabarcode based (species) inventory of the northern Adriatic phytoplankton

Abstract

Background
The northern Adriatic is characterised as the coldest and most productive marine area of the Mediterranean, owing to the high nutrient levels introduced by river discharges, the largest of which is the Italian Po River (also the largest freshwater input into the Mediterranean). The northern Adriatic is a very shallow marine ecosystem with ocean current patterns that result in long retention times of plankton in the area. The biodiversity and abundance of northern Adriatic phytoplankton are well studied through many scientific and long-term monitoring reports. These datasets were based on phytoplankton morphological traits traditionally obtained with light microscopy. The most recent comprehensive eastern Adriatic phytoplankton checklist was published more than 20 years ago and is still valuable today. Since phytoplankton taxonomy and systematics are constantly being reviewed (partly due to new molecular methods of species identification that complement classical methodologies), checklists need to be updated and complemented. Today, metabarcoding of molecular markers is gaining importance in biodiversity research and monitoring. Here, we report the use of high-throughput sequencing methods to re-examine taxonomic richness and provide updated knowledge of phytoplankton diversity in the eastern northern Adriatic to complement the standardised light microscopy method.

New information
This study aimed to report an up-to-date list of the phytoplankton taxonomic richness and phylogenetic relationships in the eastern northern Adriatic, based on sequence variability of barcoding genes resolved with advanced molecular tools, namely metabarcoding. Here, metabarcoding is used to complement standardised light microscopy to advance conventional monitoring and research of phytoplankton communities for the purpose of assessing biodiversity and the status of marine environments.
Monthly net sampling over two years targeted six phytoplankton groups, including Bacillariophyceae (diatoms) and Chrysophyceae (golden algae) belonging to Ochrophyta, Dinophyceae (dinoflagellates), Cryptophyceae (cryptophytes), Haptophyta (mostly coccolithophorids) and Chlorophyta with Prasinophyceae (prasinophytes) and Chlorophyceae (protist green algae). Generated sequence data were taxonomically assigned and distributed across two kingdoms, five classes, 32 orders, 49 families and 67 genera. The most diverse group were dinoflagellates, comprising 34 genera (48.3%), followed by diatoms with 23 (35.4%) and coccolithophorids with three genera (4.0%). In terms of genetic diversity, the results differed: the great majority of amplicon sequence variants (ASVs, clustered with one nucleotide tolerance) assigned to species or genus level were dinoflagellates (83.8%), followed by diatoms (13.7%) and Chlorophyta (1.6%). Although many taxa considered common in this area were not detected, metabarcoding revealed five diatom and 20 dinoflagellate genera that were not reported in previous checklists, along with a few species from other targeted groups that have been reported previously. We here describe the first comprehensive 18S metabarcode inventory for the northern Adriatic Sea.



Introduction
The wealth of the entire marine community is directly linked to phytoplankton richness and abundance. The relatively high turnover rate of phytoplankton results in a rapid response to biotic and abiotic environmental changes, making it a valuable indicator for monitoring assessments (Goodwin et al. 2017, Porter and Hajibabaei 2018). Defining phytoplankton community structure in marine environments and its spatial and temporal changes is, therefore, included in the Marine Strategy Framework Directive (MSFD) as an indicator for Good Environmental Status assessments (2008/56/EC).
Defining the environmental status of the aquatic environment by assessing phytoplankton composition and biomass is of great importance, especially in dynamic coastal waters such as the northern Adriatic (NA). Located in the northernmost area of the Mediterranean, the northern Adriatic is a shallow basin, with depths up to 50 m and regular exchange of water with the Mediterranean Sea (Gacic et al. 2010) through a dominant cyclonic circulation along the eastern Adriatic coast (EAC) that brings high-salinity oligotrophic water into the northern Adriatic (Franco and Michelato 1992). Its thermohaline properties are mostly seasonal, mainly shaped by: (1) the cold and dry inland bora wind in winter (Belušić et al. 2013), which triggers cold dense water formation (Mihanovic et al. 2013) with strong vertical mixing and an absence of stratification and (2) large river discharges (Raicich 1996), mostly from the River Po, that shape both salinity and temperature, maintaining strong stratification during the warm part of the year, with low-salinity nutrient-rich water spreading horizontally and extending anticyclonally into the northern Adriatic interior (Giani et al. 2012). As a result of river input, large amounts of inorganic and organic nutrients are introduced (Degobbis et al. 2000, Cozzi et al. 2018), affecting circulation by forming the southward, highly eutrophic West Adriatic Current (WAC), which transfers nutrients along the western Adriatic coast (Zavatarelli et al. 1998). Consequently, the northern Adriatic is characterised as one of the most productive areas of the Mediterranean (D'Ortenzio and D'Alcalà 2009), with diverse physical and chemical gradients that affect seasonal variations of phytoplankton species diversity in this area.
Many long-term studies report phytoplankton diversity and abundance as key indicators of changes occurring in the eastern northern Adriatic (Cabrini et al. 2012, Maric et al. 2012, Godrijan et al. 2013, Mozetič et al. 2019, France et al. 2021). The described phytoplankton taxonomic composition is based on morphological traits and, therefore, some species playing important ecological roles may be overlooked in biodiversity surveys. Many papers report phytoplankton diversity and composition acquired with classical microscopy (Revelante and Gilmartin 1976, Bosak et al. 2009, Fanuko and Valcic 2009, Vilicic et al. 2009, Mozetič et al. 2019). However, the most comprehensive recent phytoplankton checklist for the eastern Adriatic Sea remains that of Vilicic et al. (2002), which we also use in this study as a reference for the biodiversity recovered by light microscopy.
Identification of multiple species from a bulk sample containing entire organisms or from a single environmental sample can be termed DNA metabarcoding (Taberlet et al. 2012).
Recent technological developments in molecular ecology using genetic approaches have expanded our capacity to describe marine plankton community diversity (Caron et al. 2011, Leray and Knowlton 2017) and allow us to better understand phylogenetic relationships and taxonomic structures in environmental samples. Next-generation high-throughput sequencing technologies have become a common research tool for biodiversity evaluation, with the power to detect even the rarest members of a specific community (Sogin et al. 2006), as well as to discriminate between closely related and cryptic species, based on sequence similarity. The results provide complementary rather than identical estimates of phytoplankton community structure when compared to conventional approaches (Keck et al. 2022). Metabarcoding studies of phytoplankton community composition in the northern Adriatic were previously conducted on the eastern Italian coast for seven stations and dates inside and in front of the lagoon of Venice (Armeli Minicante et al. 2019).
Here, we report a phytoplankton species inventory acquired from metabarcoding data at two long-term monitoring stations near Rovinj (Croatia), collected monthly over the course of 2 years. This detailed information on phytoplankton genetic diversity represents an important molecular resource for analysing community structure, tracking seasonal dynamics and, overall, for better interpreting future metabarcoding studies in the eastern northern Adriatic and moving towards metabarcoding-based biodiversity monitoring.

Sampling description
Sampling took place monthly during the years 2020 and 2021 at two stations with a long-term sampling history and a continuous monitoring programme coordinated by the Centre for Marine Research (CIM). Stations RV001 and RV004 (45°4'48''N, 13°36'36''E and 45°3'42.66''N, 13°32'56.976''E) are located 1 and 4 nautical miles off the Croatian coast and are used in routine monitoring for the collection of representative phytoplankton samples for the northern Adriatic Sea (Fig. 1). Samples for metabarcoding analysis were collected at each site using a phytoplankton net with a defined mesh size of 50 µm (Hydrobios, Germany). Samples were collected by vertical net hauls from ∼ 30 m depth to the surface to cover species from the entire water column. Concentrated samples were collected in 500 ml plastic bottles and filtered on 1.2 μm cellulose ester membranes (47 mm, Whatman, UK) until complete filter saturation (total volume ranging from 20 ml up to 150 ml). Filters were stored in small Petri dishes at -80°C, as suggested by Baricevic et al. (2022), until further lab processing.

Figure 1. Map of the two sampling sites located near Rovinj, with a long sampling history, used as reference points for sample collection.
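For mapping or GIS work, the station positions given above in degrees, minutes and seconds are usually needed in decimal degrees. The following minimal sketch shows that conversion for the two stations; the function name `dms_to_decimal` and the dictionary layout are illustrative assumptions and are not part of the original workflow.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Station coordinates as printed in the sampling description.
stations = {
    "RV001": (dms_to_decimal(45, 4, 48, "N"), dms_to_decimal(13, 36, 36, "E")),
    "RV004": (dms_to_decimal(45, 3, 42.66, "N"), dms_to_decimal(13, 32, 56.976, "E")),
}

for name, (lat, lon) in stations.items():
    print(f"{name}: {lat:.5f} N, {lon:.5f} E")
```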

Laboratory protocol
Total DNA was extracted from stored filters using either the Gentra Puregene DNA Isolation Kit (Qiagen, Germany) or the NucleoSpin eDNA Water kit (Macherey-Nagel, Germany), both according to the manufacturer's protocol. Isolated genomic DNA was eluted in 100 µl per sample and the obtained concentrations were quantified by measuring DNA concentration/absorbance spectra using the Nanophotometer spectrophotometer (Implen, Germany).
Samples were stored at 4°C until further processing. DNA metabarcoding library preparation and sequencing were carried out by AllGenetics & Biology SL (www.allgenetics.eu). DNA concentration was quantified for each extract using the Qubit High Sensitivity dsDNA Assay (Thermo Fisher Scientific). For DNA metabarcoding library preparation, the eukaryote-specific primers TAReuk454FWD1 (5′-CCAGCA(G/C)C(C/T)GCGGTAATTCC-3′) and TAReukREV3 (5′-ACTTTCGTTCTTGAT(C/T)(A/G)A-3′) (Stoeck et al. 2010, Piredda et al. 2017) were used, targeting around 430 bp of the V4 hypervariable region of the small subunit ribosomal RNA gene, with the Illumina sequencing primer sequences attached at their 5' ends. In the amplification step, PCRs were conducted in a final volume of 12.5 μl, containing 1.25 μl of template DNA optimised by diluting the starting template DNA, 0.5 μM of the primers, 3.13 μl of Supreme NZYTaq 2x Green Master Mix (NZYTech) and ultrapure water up to 12.5 μl. The reaction mixture was incubated as follows: an initial denaturation step at 95°C for 5 min, followed by 35 cycles of 95°C for 30 s, 48°C for 45 s, 72°C for 45 s and a final extension step at 72°C for 7 min. The oligonucleotide indices were attached in a second amplification step with identical conditions, but only 5 cycles and 60°C as the annealing temperature (Vierna et al. 2017). The library size was verified by running the libraries on 2% agarose gels stained with GreenSafe (NZYTech) and imaging them under UV light. Then, the libraries were purified using the Mag-Bind RXNPure Plus magnetic beads (Omega Biotek), following the instructions provided by the manufacturer. Finished libraries were pooled in equimolar amounts according to the results of a Qubit dsDNA HS Assay (Thermo Fisher Scientific) quantification. These pools also contained a testimonial amount (1 μl) of the PCR negative controls. The pool was sequenced in a fraction of a NovaSeq PE250 run (Illumina), aiming for a total output of 4 gigabases. Illumina paired-end raw data consisted of forward (R1) and reverse (R2) reads and were stored separately, sorted by library, as fastq files.
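The two primers quoted above contain degenerate positions written as `(G/C)`, `(C/T)` and `(A/G)`, so each primer name stands for a small pool of concrete oligonucleotides. As an illustration only, the sketch below expands that notation into every concrete sequence; the helper name `expand_primer` is an assumption introduced here and is not taken from the laboratory protocol.

```python
import itertools
import re

# Primer strings exactly as written in the text; "(X/Y)" marks a degenerate position.
TAREUK454FWD1 = "CCAGCA(G/C)C(C/T)GCGGTAATTCC"
TAREUKREV3 = "ACTTTCGTTCTTGAT(C/T)(A/G)A"


def expand_primer(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a primer with (X/Y) positions."""
    # Each token is either a parenthesised set of alternatives or a fixed stretch.
    tokens = re.findall(r"\(([^)]+)\)|([ACGT]+)", primer)
    choices = [alt.split("/") if alt else [fixed] for alt, fixed in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]


fwd_variants = expand_primer(TAREUK454FWD1)
rev_variants = expand_primer(TAREUKREV3)

for label, variants in (("forward", fwd_variants), ("reverse", rev_variants)):
    print(label, len(variants), "variants")
    for seq in variants:
        print("  ", seq)
```

Each primer has two two-fold degenerate positions, so the expansion yields four concrete sequences per primer.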

Custom Reference database (CIMPhy18) preparation
The CIM Phytoplankton 18S ribosomal RNA reference database (CIMPhy18) was assembled using ARB software v.7.0 (Ludwig et al. 2004) by merging downloaded phytoplankton sequence records from the Silva (Quast et al. 2013) and Protist Ribosomal Reference (PR2) (Guillou et al. 2013) databases. Phytoplankton groups included Dinophyceae (dinoflagellates), Ochrophyta containing Bacillariophyceae (diatoms) and Chrysophyceae (golden algae), Cryptophyceae (cryptophytes), Haptophyta (mostly coccolithophorids) and Chlorophyceae (green micro algae). First, the Silva-aligned database was imported into ARB with all additional data, and the NCBI accession numbers linked to phytoplankton taxa were compared between the Silva and PR2 databases. PR2 accession number entries that were not present in the Silva database were downloaded from the Entrez Molecular Sequence Database System (www.ncbi.nlm.nih.gov/Web/Search/entrezfs.html) in genbank (.gb) format and imported into ARB. Phytoplankton groups/phyla chosen for the reference database assembly were aligned in the ARB Sequence Editor and a reference database phylogenetic tree was constructed. Relevant sequences from the in-house CIM phytoplankton culture collection, representing barcoded sequences from common northern Adriatic phytoplankton taxa, were used to extend the database and to include species not included in the standard release of the databases. To acquire information about the taxonomic completeness of the constructed database, phytoplankton taxa entries were compared to: (1) the eastern Adriatic phytoplankton checklist (Vilicic et al. 2002) and (2) the list of northern Adriatic phytoplankton species from long-term monitoring data. Both lists were obtained from long-term studies with light microscopy. Before comparison, taxonomic data were normalised using the Species matching tool from GBIF (www.gbif.org) to ensure accurate classification and taxonomic uniformity. Sequences for missing taxa were searched in NCBI (www.ncbi.nlm.nih.gov), curated and, if suitable, added to the CIMPhy18 database along with accompanying metadata. Sequences for the reference database were extracted in fasta format and their accompanying taxonomy in a tax file, prepared to be imported into Mothur (Schloss et al. 2009).
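Two steps of the database assembly described above are, in essence, set comparisons: finding PR2 accession numbers absent from Silva, and finding checklist taxa that lack a reference sequence. The sketch below illustrates both under stated assumptions; the function names, the accession identifiers and the genus labels are hypothetical placeholders, not records from CIMPhy18, Silva or PR2.

```python
def accessions_to_download(pr2_accessions: set[str], silva_accessions: set[str]) -> list[str]:
    """Accessions present in PR2 but missing from Silva, sorted for reproducibility."""
    return sorted(pr2_accessions - silva_accessions)


def database_gaps(checklist_genera: set[str], database_genera: set[str]) -> list[str]:
    """Checklist genera that have no entry in the reference database."""
    return sorted(checklist_genera - database_genera)


# Hypothetical placeholder identifiers, for illustration only.
silva = {"ACC0001", "ACC0002", "ACC0003"}
pr2 = {"ACC0002", "ACC0003", "ACC0004", "ACC0005"}
to_fetch = accessions_to_download(pr2, silva)

# Hypothetical genus labels standing in for normalised checklist names.
checklist = {"Genus_A", "Genus_B", "Genus_C"}
database = {"Genus_A", "Genus_C", "Genus_D"}
gaps = database_gaps(checklist, database)

print("Download from NCBI:", to_fetch)
print("No reference sequence for:", gaps)
```

Comparing names this way only makes sense after taxonomic normalisation, which is why the text applies the GBIF matching step before the comparison.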

Bioinformatic pipeline
Illumina paired-end reads for the small subunit region were processed using a custom script for the Mothur pipeline v.1.47.0 (Schloss et al. 2009), based on the MiSeq standard operating procedure (https://mothur.org/wiki/miseq_sop/). Contigs were assembled from fastq files containing forward and reverse Illumina MiSeq reads and trimmed to the overlapping section, with no ambiguous bases allowed; the maximum homopolymer size was 8 bp and the maximum tolerated sequence length was 450 bp. Sequences were screened for chimeras using the VSEARCH command (Rognes et al. 2016), removing chimeric sequences only from the group in which they were flagged (dereplicate=t). Clustered sequences are reported as amplicon sequence variants, ASVs (Callahan et al. 2017), with one nucleotide tolerance (cutoff = 1), eliminating erroneous sequences formed due to sequencing mistakes. In addition, operational taxonomic units (OTUs) were assembled on the basis of fixed sequence identity thresholds of 97% and 99% (97%-OTUs and 99%-OTUs). In both cases, column-format distance matrices were calculated with the default option (one gap) and sequences were assigned to OTUs using the OptiClust algorithm (Westcott and Schloss 2017). The consensus classification for each ASV and OTU clustering unit was performed using a naïve Bayesian classifier (Wang et al. 2007) trained on the constructed CIMPhy18 reference database, with a 90% bootstrap confidence threshold. Two ASV and OTU tables were built: one containing all information regarding the read number present in a cluster (ASVs, 97%-OTUs or 99%-OTUs) and a second including only clustering units represented by five or more reads per cluster (ASVs*, 97%-OTUs* or 99%-OTUs*). Species scientific names and corresponding authors were checked according to AlgaeBase (www.algaebase.org) prior to checklist formation. All downstream analyses were performed using R Statistical Software (v.4.2.3; R Core Team 2023) and data visualisations were made using the 'ggplot2' (Wickham 2009) and 'ggtree' (Yu et al. 2017) R packages and assembled using the 'aplot' (Yu 2022) R package.

Notes: The sequences listed under this taxon were classified into Chrysophyceae; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.
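The contig screening criteria stated in the pipeline description (no ambiguous bases, homopolymers of at most 8 bp, length of at most 450 bp) can be expressed compactly. The sketch below is plain Python written for illustration; it is not the Mothur implementation, and the function names and the toy sequences are assumptions introduced here.

```python
import re

MAX_HOMOPOLYMER = 8   # longest tolerated run of one base, as stated in the text
MAX_LENGTH = 450      # longest tolerated contig, as stated in the text


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated character."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_screen(seq: str) -> bool:
    """Apply the three screening criteria to one assembled contig."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):          # any ambiguous base (N, R, Y, ...) fails
        return False
    if len(seq) > MAX_LENGTH:
        return False
    return longest_homopolymer(seq) <= MAX_HOMOPOLYMER


# Toy contigs, for illustration only.
contigs = {
    "ok": "ACGT" * 100,
    "ambiguous": "ACGTN" * 10,
    "long_run": "A" * 9 + "CGT",
    "too_long": "ACGT" * 113,
}
kept = [name for name, seq in contigs.items() if passes_screen(seq)]
print(kept)
```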

Analysis
From 45 samples collected during two years of monthly sampling at two long-term monitoring stations, a total of 2,396,571 reads (with 789,775 unique reads after filtering and removing gaps, or 21.0%) was obtained. Ultimately, a total of 3,161 ASVs, 1,307 99%-OTUs and 240 97%-OTUs were detected at genus or species level, of which 168, 193 and 81, respectively, were present with five or more reads per cluster (ASVs*, 99%-OTUs* and 97%-OTUs*). In total, 5.3% of ASVs, 14.8% of 99%-OTUs and 59.5% of 97%-OTUs were found with five or more reads (Table 1). In terms of genetic diversity, the most represented phytoplankton groups were Dinophyceae and Bacillariophyceae, independent of the clustering method used, while the other groups (Chlorophyta, Haptophyta, Cryptophyta and Chrysophyceae) were present with low numbers of clustering units, each contributing less than 5% of the total number of either ASVs or OTUs (Table 1).

Table 1. Contribution to genetic diversity for each phytoplankton group and clustering method used.
… were represented with three species each (3.5%) and Cryptophyta with two species, both belonging to the genus Teleaulax (Fig. 2).
As expected, different clustering methods tend to show different taxonomic coverage. As a result, 84 taxa were assigned at species level based on ASV clustering, while OTU clustering retrieved 80 species (99%-OTUs) and 71 species (97%-OTUs). An additional restriction step, which considers the number of reads generated for each ASV or OTU and keeps only those clustering units represented by five or more reads, significantly reduced the number of assigned species. The total number of species recovered was 39 for ASVs*, 46 for 99%-OTUs* and 41 for 97%-OTUs*, which correspond to 46.2%, 57.5% and 57.8% of the species assigned with ASVs, 99%-OTUs and 97%-OTUs, respectively, retained after removing low-abundance sequences/clustering units.
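The restriction step described above (keeping only clustering units with five or more reads) and the resulting retention percentages can be sketched as follows. The function names and the toy read counts are assumptions for illustration; the two worked percentages use counts quoted in the text (168 of 3,161 ASVs and 46 of 80 species for 99%-OTUs).

```python
MIN_READS = 5  # threshold stated in the text for ASVs*, 99%-OTUs* and 97%-OTUs*


def filter_by_reads(read_counts: dict[str, int], min_reads: int = MIN_READS) -> dict[str, int]:
    """Keep only clustering units supported by at least `min_reads` reads."""
    return {unit: n for unit, n in read_counts.items() if n >= min_reads}


def percent_retained(kept: int, total: int) -> float:
    """Share of units (or species) retained, rounded to one decimal place."""
    return round(100 * kept / total, 1)


# Toy read counts per clustering unit, for illustration only.
toy_counts = {"unit_1": 120, "unit_2": 5, "unit_3": 4, "unit_4": 1}
toy_kept = filter_by_reads(toy_counts)
print(sorted(toy_kept))

# Worked arithmetic with counts quoted in the text.
asv_share = percent_retained(168, 3161)
otu99_species_share = percent_retained(46, 80)
print(asv_share, otu99_species_share)
```

A unit with exactly five reads is retained, matching the "five or more reads" wording.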
The most diverse genera belong to diatoms and dinoflagellates (Suppl. material 1). The diatom genera with the highest number of assigned species were Chaetoceros (11) and Leptocylindrus (3), while amongst dinoflagellates these were Protoperidinium (5), Tripos (5) and Alexandrium (3). Of these, the genera Chaetoceros, Protoperidinium and Tripos also included sequences assigned only to genus level, indicating even greater diversity within them.
Molecular methods revealed four diatom genera that have not been reported in the previously mentioned checklists. All of them were assigned to species level, with at least one species per genus (Bellerochea, Meuniera, Mediolabrus, Minidiscus).
Dinoflagellates showed a higher number of previously unreported genera (30) and a higher number of assigned species not mentioned in earlier complete checklists of this area (Figs 3, 4).

Discussion
Molecular assessments facilitate the investigation of eukaryotic biodiversity by producing a large dataset of sequences for target genetic markers. The sensitivity for species detection is not equally distributed across the observable biodiversity range, with significant differences between methodologies (e.g. metabarcodes and light microscopy). Hence, a combination of methodologies promises to generate deeper and more complete insights into biodiversity (Huo et al. 2020). In this study, total eukaryotic phytoplankton diversity was explored via the metabarcoding of 18S rDNA gene amplicons and the results for different clustering methods were analysed and further compared with lists of species known for the eastern northern Adriatic. A reference database with a high level of quality control, incorporating sequences from diverse phytoplankton groups, is needed to investigate phytoplankton diversity with metabarcoding data. For this purpose, a custom reference database, CIMPhy18, was generated to include as many taxonomically annotated barcode sequences as available. Up-to-date curation of such a database is recommended (Rimet et al. 2019).
By applying metabarcoding to a 2-year monthly sample set (which is, to our knowledge, the most extensive sampling regime for a metabarcoding approach to date in the Mediterranean area), we could recover several species not previously listed for the study area. Some of these were unknown during the generation of the previous checklist (Table 2) and were newly described in the meantime, but are today considered common constituents of eastern northern Adriatic plankton diversity, e.g. Bacteriastrum jadranum (Godrijan et al. 2012) and Leptocylindrus aporus (Kuzat et al. 2022). Another new observation is the species Mediolabrus comicus, belonging to the genus Mediolabrus, which was separated from the genus Minidiscus as a new genus (Li et al. 2020). Such observations highlight how molecular approaches can detect cryptic species and serve as important approaches for the (taxonomic) revision of phytoplankton biodiversity (Kermarrec et al. 2013). Species from groups other than diatoms and dinoflagellates, especially Chlorophyta, Chrysophyceae and Cryptophyta, have been poorly reported earlier, probably because their small cell size, within the nano- and picoplankton fractions, makes it difficult to determine them morphologically at species or genus level.
Larger species, such as Noctiluca scintillans, contributed significantly to the overall read numbers: 49.0% of all generated ASVs were assigned to the genus Noctiluca. This is an effect of extended and massive blooms of N. scintillans during our sampling period. These blooms are known to drastically reduce species diversity in a given area. Even though we enlarged the checklist for the phytoplankton of the eastern Adriatic coast with the results reported here, we assume, for methodological reasons, that part of the observable biodiversity still remains unknown. Several predominantly benthic genera that are highly represented in phytoplankton checklists, such as Mastogloia, Licmophora or Synedra (Car et al. 2021), but are less likely to be found in the water column, were not represented in our metabarcoding results, which is probably a result of the sampling methodology, as we avoided net hauls close to the sea floor. As described earlier, DNA extraction efficiency varies between taxa owing to different cell morphologies; for example, robust cell coverings, such as the diatom silica frustule, could lessen extraction efficiency, leaving such taxa undetected in metabarcoding efforts (Maki et al. 2017).
Our results show that different clustering methods can have an impact on the inferred community structure, either through the proportion that different groups contribute to the total community or, more noticeably, by altering the taxonomic coverage of a given community. Discrepancies in genetic community composition between ASVs and OTUs as the basis for taxonomic ranking have been described previously (Jeske and Gallert 2022). Taxonomy is assigned to an OTU through a consensus sequence at the centroid of the cluster; closely related species (such as species from the Pseudo-nitzschia complex) may, therefore, be merged under restrictive clustering conditions, leading to an underestimation of phytoplankton diversity in the environment. On the other hand, ASVs as a denoising method have lately appeared more desirable (Callahan et al. 2017).
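To make the effect of fixed identity thresholds concrete, the sketch below computes how many mismatches a 97% or 99% threshold tolerates over a fragment of about 430 bp (the approximate amplicon length given in the laboratory protocol) and shows two toy sequences that merge at 97% but not at 99%. It assumes pre-aligned, gap-free sequences of equal length and simple percent identity, which is a simplification of the distance calculation and OptiClust clustering used in the pipeline; all function names are illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)


def max_mismatches(length: int, identity_threshold: int) -> int:
    """Largest number of mismatches that still satisfies the identity threshold."""
    return length * (100 - identity_threshold) // 100


def same_unit(a: str, b: str, identity_threshold: int) -> bool:
    """Would the two sequences fall within one unit at this threshold?"""
    return percent_identity(a, b) >= identity_threshold


# Tolerated differences over a ~430 bp fragment.
print(max_mismatches(430, 97), max_mismatches(430, 99))

# Two toy 100 bp sequences differing at two positions (98% identical).
seq_a = "ACGT" * 25
seq_b = "C" + seq_a[1:50] + "T" + seq_a[51:]
print(percent_identity(seq_a, seq_b))
print(same_unit(seq_a, seq_b, 97), same_unit(seq_a, seq_b, 99))
```

Under these assumptions, sequences differing by up to 12 positions over 430 bp would still share a 97%-OTU, which illustrates how closely related species can be merged at coarser thresholds.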
With one nucleotide tolerance, the probability of assigning genotypes that arise from sequencing errors is reduced (Kelly et al. 2019), allowing unconstrained identification of closely related species or detection of intraspecific genetic diversity. Consequently, diversity estimates that include OTUs or ASVs not assigned to known species have to be interpreted with caution. We hence suggest that our results at the level of genus can be used for an estimation of the intrageneric diversity potentially recoverable from the observed environment (Figs 3, 4). However, further research, including the analysis of intrageneric and intraspecific phylogenetic distances for the applied barcode, is necessary to confidently predict biodiversity at the species level for genera with low numbers of available reference barcodes.
By disregarding sequences or sequence clusters built from low-abundance reads (ASVs*, 99%-OTUs* and 97%-OTUs*), we attempted to eliminate potentially remaining erroneous sequences. Removing those sequences can negatively affect the description of community composition and greatly reduce the number of taxa present in extremely low copy numbers due to their reduced amplification ability or low abundance in the community. It has been shown that ASV methods control errors sufficiently down to the level of single-nucleotide differences over the sequenced gene region and, therefore, we assumed that these low-abundance sequences are a product of genetic diversity and not a laboratory bias and that even sequences and sequence clusters with low read numbers should be taken into account when creating species lists.
The metabarcoding technique is a promising tool for revealing phytoplankton communities, although gaps in the reference databases still exist (Weigand et al. 2019). To check the completeness of the CIMPhy18 database, we compared it to published checklists of phytoplankton for the study area. In total, 20 phytoplankton genera known from the area (Vilicic et al. 2002) (eight Bacillariophyceae, four Dinophyceae and eight Prymnesiophyceae) could not be found in our custom database nor in any publicly available database. Such sequence gaps reduce detection power to an extent that depends on the completeness of the reference database.
Reference checklists for the study area always include observations for the entire eastern Adriatic coast. Not all of these species are necessarily present in the northern Adriatic Sea. Furthermore, the checklists integrate over several decades, while the metabarcode analysis presented here integrates over only 2 years of sampling.
Nevertheless, the 2-year, monthly metabarcode analysis of phytoplankton net hauls reported here not only recovered the genetic diversity of species already known from the studied area, but also added species so far not reported for the north-eastern Adriatic Sea. Metabarcoding in combination with light and electron microscopy techniques can significantly improve the assessment of the phytoplankton community (Ruppert et al. 2019, Huo et al. 2020, Pereira et al. 2021) and deliver deeper insights into planktonic biodiversity.

Figure 2. Contribution of phytoplankton groups. Changes in the contribution to genetic diversity of different phytoplankton groups depending on the clustering method used (upper plot) and the number of assigned species of each phytoplankton group depending on the clustering method used.

Figure 3. Dendrogram of diatom genera. Dendrogram showing species as listed in Viličić et al. (2002), as recovered from all published and observed data until 2002, representing a potential diatom diversity observable in the eastern northern Adriatic. Genetic diversity acquired in this study is compared to genera found in the checklist (2002). Species recovered from metabarcoding data, but so far absent from the 2002 checklist, were added to the dendrogram: filled purple dots at the tip points indicate diatom genera obtained with metabarcoding, while orange dots indicate metabarcoded genera that were not present in the checklist. The bubble plot shows the number of morphospecies from the checklist (column Species) and the number of different clustering units assigned per genus depending on the clustering method (columns ASVs, 99%-OTUs, 97%-OTUs) and removal of low-abundance clustering units (columns ASVs*, 99%-OTUs*, 97%-OTUs*). The bar plot on the right shows the contribution of species/clusters for each genus/clustering method. Genera not found in metabarcoding data were left out. The bottom bar plot indicates the number of morphospecies (first column) and clustering units (other columns) generated by each clustering method for the diatom group.

Figure 4. Dendrogram of dinoflagellate genera. Dendrogram showing species as listed in Vilicic et al. (2002), as recovered from all published and observed data until 2002, representing a potential dinoflagellate diversity observable in the eastern northern Adriatic. Genetic diversity acquired in this study is compared to genera found in the checklist (2002). Species recovered from metabarcoding data, but so far absent from the 2002 checklist, were added to the dendrogram: filled purple dots at the tip points indicate dinoflagellate genera obtained with metabarcoding, while orange dots indicate metabarcoded genera that were not present in the checklist. The bubble plot shows the number of morphospecies from the checklist (column Species) and the number of different clustering units assigned per genus depending on the clustering method (columns ASVs, 99%-OTUs, 97%-OTUs) and removal of low-abundance clustering units (columns ASVs*, 99%-OTUs*, 97%-OTUs*). The bar plot on the right shows the contribution of species/clusters for each genus/clustering method. Genera not found in metabarcoding data were left out. The bottom bar plot indicates the number of morphospecies (first column) and clustering units (other columns) generated by each clustering method for the dinoflagellate group.

Phytoplankton checklist 2020-2021, northern Adriatic

Class Chlorophyceae
Notes: The sequences listed under this taxon were classified into Chlorophyceae; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Bathycoccus prasinos W.Eikrem & J.Throndsen, 1990

Phaeocystis sp.

Notes: The sequences listed under this taxon were classified into Dinophyceae; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Fragilidium subglobosum Balech, 1988

Notes: The sequences listed under this taxon were classified into Gonyaulacales; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Notes: The sequences listed under this taxon were classified into Gymnodiniales; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Notes: The sequences listed under this taxon were classified into Noctilucales; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Protoperidinium depressum (Bailey) Balech, 1974

Notes: The sequences listed under this taxon were classified into Bacillariophyceae; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Psammodictyon constrictum (Gregory) D.G.Mann, 1990

Pinnularia appendiculata (C.Agardh) Schaarschmidt, 1881

Notes: The sequences listed under this taxon were classified into Sellaphoraceae; however, no reference sequence with high enough similarity was available for determining the genus of the respective sequences.

Table 2. Taxa not present in the checklist (Vilicic et al. 2002) revealed by metabarcoding, with taxonomy assigned on the basis of ASVs as the clustering method with no read cutoff.