Vascular plants occurrences in Dokdo Islands, Korea, based on herbarium collections and legacy botanical literature

Abstract
Background
The vascular flora of the Dokdo Islands has been reported, based on primary collections made in 2012 and 2013 and legacy botanical literature. The Dokdo Islands are the remotest islands of Korea, located in the East Sea approximately 87 km from Ulleungdo Islands. They comprise two main volcanic islands, Dongdo (east island) and Seodo (west island), and minor islets surrounding the two main islands. This research was conducted to document vascular plant species inhabiting Korea's most inaccessible islands. We present a georeferenced dataset of vascular plant species collected during field studies on the Dokdo Islands over the past seven decades.
New information
The present inventory of the flora of Dokdo lists 108 species belonging to 78 genera and 39 families, including 93 native species and 15 species newly naturalised in the Islands' flora through human activity. The Poaceae and Asteraceae families are the most diverse, with 22 and 15 taxa, respectively. Some of the previously listed taxa were not found on Dokdo, probably because they are rare and the limited survey time did not allow collectors to find them. The spread of introduced species, especially the invasive grass Bromus catharticus Vahl, affects several native species of the Dokdo flora.


Introduction
Biodiversity researchers have identified critical gaps in the spatial, temporal and taxonomic coverage of biodiversity observations, highlighting barriers to effective data collection, open access and analysis (Amano et al. 2016, Wetzel et al. 2018). To bridge these gaps, biodiversity data must suit the demands of multiple groups, including scientists, policymakers and data contributors (Taylor et al. 2017). Several biodiversity data researchers have emphasised taking the lead in developing new measures. Options such as open access publishing under conventional licences and accessibility through major biodiversity platforms, such as GBIF, can be used (Faith et al. 2013). Another solution is offering data providers incentives, such as the option to publish in peer-reviewed data journals (Chavan and Penev 2011). Biodiversity data providers should become better data stewards, with a comprehensive understanding of metadata, best data management practices and plans for data archiving and preservation (Hartter et al. 2013). However, data stewardship takes time and resources, and data providers cannot be data stewards without sufficient resources and support. As the culture of data stewardship evolves, new biodiversity informatics challenges emerge with increasing data volume and precision. Biodiversity data scientists propose that data providers and stakeholders confront current challenges and provide them with detailed recommendations (Ariño et al. 2016).
Geographical location and security level are the main factors causing spatial gaps (Ariño et al. 2016). As biodiversity information is closely related to the temporal and spatial variation in surveying effort, the Wallacean shortfall is especially critical in remote and inaccessible areas (Hortal et al. 2008, Boakes et al. 2010). Sampling certain places better than others is inevitable given the accessibility differences between localities (Rodrigues et al. 2010); therefore, distribution data tend to be heavily biased by historical patterns of collection, collation and biodiversity data accumulation (Rodrigues et al. 2010, Meyer et al. 2015). To effectively bridge spatial gaps, it is essential to understand the causes of data shortage in some regions. For the Banco de Datos de Biodiversidad de Canarias (BIOTA-Canarias), Hortal et al. (2007) stated that a lack of completeness or large gaps in spatial coverage compromise future utility. Previously collected data have limited utility because they lack detail and their geographical coverage is not exhaustive (Soberón et al. 2007). Biodiversity data scientists encourage exhaustive compilation of all available information with sufficient quality and detail (Hortal et al. 2008).
The Dokdo Islands are the most inaccessible islands in Korea, located at 37°14'26.8"N and 131°52'10.4"E, belonging to an administrative district that includes the Ulleung Islands. Since the first botanical survey (Lee 1952), seventy years of sporadic observations have waited to be mobilised into accessible biodiversity data (Jung et al. 2014). This study produces an exhaustive and reliable list of vascular plants from the Dokdo Islands, based on reference herbarium specimens collected in the field and the occurrence data available in the papers.

General description
Purpose: This research focused on the digitisation of plant distribution data on Dokdo Islands acquired by botanists on occasional expeditions to the Islands between 1947 and 2018. These data offer a promising tool to help guide the biodiversity management and conservation of these highly inaccessible island ecosystems.

Project description
Title: Vascular plants occurrences in Dokdo Islands, Korea, based on herbarium collections and legacy botanical literature.

Personnel:
The datasets were digitised by Hui Kim (data manager); Su-Young Jung was the resource creator; and Shin Young Kwon, Hyun Tak Shin and Chin-Sung Chang were the content providers. Chin-Sung Chang checked taxonomic changes and georeferencing. S.Y. Jung conducted the fieldwork over two years, from April 2012 to September 2013, in collaboration with members of the Korea National Arboretum (Jung et al. 2014). S.Y. Jung made preliminary in situ identifications. S.Y. Jung, Hui Kim and Chin-Sung Chang conducted the final species identification.

Study area description:
The small islands of Dokdo are volcanic rocks formed in the Cenozoic era, more specifically 4.6-2.5 million years ago, with a formation mechanism similar to that of underwater islands (Jo et al. 2021, Kim et al. 2013). The Dokdo Volcano rises roughly 2,100 m above the sea floor and has a diameter of more than 10 km (Song et al. 2017). The Islands have a butterfly wing shape, a relatively steep terrain, a peak elevation of 168 m a.s.l. and a surface area of 18.7 hectares (Fig. 1). The Dokdo Islands consist of two main islets, Seodo and Dongdo, with numerous surrounding rocks. Seodo has multiple berths and access points to trekking routes, so flora surveys and collections are possible over a comparatively large area. Since Dongdo is more difficult to access by boat, its surface is challenging to investigate and primary species occurrence data are available from only a few points. The Dokdo Islands have a mean annual temperature of 13.8°C, a mean annual precipitation of 589 mm, an absolute minimum temperature of -6.4°C and an absolute maximum temperature of 28.2°C. According to meteorologists, automatic weather systems underestimate the amount of snowfall, thereby resulting in missing data.

Sampling methods
Study extent: The Dokdo Islands are the most inaccessible islands in Korea, located at 37°14'26"N and 131°52'05"E, belonging to an administrative district that includes the Ulleung Islands.
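Georeferenced occurrence records are usually published with decimal coordinates, whereas the study extent above is given in degrees, minutes and seconds. The following is a minimal sketch of that conversion, assuming coordinates written exactly in the form used in this text; the helper name dms_to_decimal and the pattern are illustrative and are not part of the dataset's own tooling.

```python
import re

# Hypothetical helper: parse a degrees-minutes-seconds string such as
# 37°14'26"N into decimal degrees. Only this exact notation is handled.
DMS_PATTERN = re.compile(
    r"""(?P<deg>\d+)°(?P<min>\d+)'(?P<sec>\d+(?:\.\d+)?)"(?P<hem>[NSEW])"""
)


def dms_to_decimal(text: str) -> float:
    match = DMS_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised DMS coordinate: {text!r}")
    degrees = int(match["deg"])
    minutes = int(match["min"])
    seconds = float(match["sec"])
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -value if match["hem"] in "SW" else value


# Study extent coordinates as written in the text above.
decimal_latitude = dms_to_decimal("37°14'26\"N")
decimal_longitude = dms_to_decimal("131°52'05\"E")
print(round(decimal_latitude, 6), round(decimal_longitude, 6))
```

Decimal seconds, as in 37°14'26.8"N, are accepted by the same pattern.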

Sampling description:
The vascular plant occurrence data treated in this study were compiled from fieldwork conducted from 2012 to 2013 and from legacy botanical articles published from 1947 to 2018. Herbarium surveys were conducted in two herbaria: SNUA (Seoul National University, College of Agriculture, herbarium acronym following Index Herbariorum) and KH (Korea National Arboretum). In addition to the authors' collections, datasets on vascular plant occurrences in Dokdo Islands were digitised from several manuscripts in a heterogeneous format (Lee 1952, Lee and Joo 1958, Lee 1978, Sun et al. 2002, Hyun and Kwon 2006, Lee et al. 2007, Park and Lee 2008, Park et al. 2010, Song and Park 2012, Jung et al. 2014, Kim and Lee 2016, Park et al. 2016, Park et al. 2018, Table 1). References to the published literature, from which data were obtained for the occurrence data compilation, are presented in the bibliography section of the metadata.
Quality control: The Dokdo Islands occurrence dataset was manually digitised from scanned documents of the original papers. The quality control processes of biodiversity data management were based on the principles of data quality by Chapman (2005).
Step description:
1. The content providers carefully reviewed individual floristic publications to manage the irregularity in the format of historical papers. All occurrence records were merged into a spreadsheet, which contained the original species names recorded at the location. In this digitisation stage, obvious typographic errors were corrected. Accepted taxon names and taxonomic classification derived from the local checklist (Chang et al. 2014) were included in the spreadsheet. The result of the above digitisation steps was 838 records with 25 columns containing occurrence data of 108 vascular plant taxa.
2. MS Access was used to create the BRAHMS database layout. All specimen and occurrence information was recorded in the BRAHMS database of the T.B. Lee Herbarium.  Park et al. (2018).
3. When the collection date was written as "several dates," we transcribed the last dates of fieldwork (day, month and year) and provided the full interval date in the eventDate field and the rest of the general information in the verbatimEventDate field. Park and Lee (2008) and  published the floristic list of Dokdo Islands with many vascular plant pictures. As these authors did not provide the collection information, the publication year was used as the year of events.
4. All occurrence records without coordinates were georeferenced, either from the coordinates provided in the paper or from the geographic description of the localities.
Data format: Darwin Core Archive
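The date-handling rule in the step description can be sketched as follows. This is only an illustration under stated assumptions: the function name event_date_fields, its parameters and the example dates are hypothetical, and the ISO 8601 "start/end" interval is one common way of writing an eventDate range rather than a confirmed detail of how this dataset was produced.

```python
from datetime import date
from typing import Optional


def event_date_fields(first: Optional[date], last: Optional[date],
                      publication_year: int, verbatim: str) -> dict:
    """Hypothetical sketch: derive date fields for one occurrence record."""
    if last is None:
        # No collection date in the source paper: use the publication year.
        return {"eventDate": str(publication_year), "year": publication_year,
                "month": None, "day": None, "verbatimEventDate": verbatim}
    start = first or last
    # Write an interval when the fieldwork spanned several dates.
    if start == last:
        event_date = last.isoformat()
    else:
        event_date = f"{start.isoformat()}/{last.isoformat()}"
    # day, month and year are taken from the last date of fieldwork.
    return {"eventDate": event_date, "year": last.year, "month": last.month,
            "day": last.day, "verbatimEventDate": verbatim}


# Illustrative dates only; they are not taken from the dataset.
record = event_date_fields(date(2012, 4, 10), date(2012, 4, 12),
                           2014, "several dates")
print(record["eventDate"], record["day"], record["month"], record["year"])
```

Keeping the original wording in verbatimEventDate means the interpretation in eventDate can be revisited later without returning to the scanned paper.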

Description:
The present project was focused on digitising the data on plant distribution on Dokdo Islands, collected between 1947 and 2018 by botanists taking part in occasional expeditions to the Islands. These data are expected to contribute to the biodiversity management and conservation of these highly inaccessible island ecosystems.

Column label Column description
occurrenceID An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique.
recordedBy A list (concatenated and separated) of names of people, groups or organisations responsible for recording the original Occurrence. The primary collector or observer, especially the one who applies a personal identifier (recordNumber), should be listed first.
type The nature or genre of the resource.
basisOfRecord The specific nature of the data record.
institutionCode The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record.
recordNumber An identifier given to the Occurrence at the time it was recorded. Often serves as a link between field notes and an Occurrence record, such as a specimen collector's number.
day The integer day of the month on which the Event occurred.
month The integer month in which the Event occurred.
year The four-digit year in which the Event occurred, according to the Common Era.
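As a rough illustration of how two of these column definitions might be applied, the sketch below builds an occurrenceID from other identifiers in a record and checks that day, month and year form a real calendar date. The "institutionCode:recordNumber" scheme, the function names and the record number 12345 are assumptions made for the example; the published dataset may construct its identifiers differently.

```python
from datetime import date


def make_occurrence_id(institution_code: str, record_number: str) -> str:
    # Assumed scheme "<institutionCode>:<recordNumber>"; purely illustrative.
    code = institution_code.strip()
    number = record_number.strip()
    if not code or not number:
        raise ValueError("both identifiers are needed to build an occurrenceID")
    return f"{code}:{number}"


def is_valid_event_day(year: int, month: int, day: int) -> bool:
    # The year must be a four-digit value.
    if not 1000 <= year <= 9999:
        return False
    try:
        # date() rejects impossible combinations such as 31 April.
        date(year, month, day)
    except ValueError:
        return False
    return True


print(make_occurrence_id("SNUA", "12345"))
print(is_valid_event_day(2013, 4, 31))
```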

Additional information
During the seventy-year observation period, 108 taxa from 39 families were observed. Almost all were flowering plants (only one fern species and one conifer species were recorded), mostly Magnoliopsida (98%; Fig. 2). The data collected during the last seven decades indicate a continuous expansion of invasive species and an increase in their richness (Fig. 3). For instance, Bromus catharticus Vahl, Sonchus asper (L.) Hill, Senecio vulgaris L., Setaria pumila (Poir.) Roem. & Schult. and Lycopersicon esculentum Mill. are the most rapidly expanding aliens in the last decade, threatening native flora (Table 2, Fig. 3). Increased human visitation has been identified as a major predictor of the spatial distribution of invasive species in the flora of Dokdo Islands, assuming a positive relationship between human activities and alien plant species richness. The major threatening species, especially the invasive grass Bromus catharticus Vahl, affect several native species. Regarding colonisation status, 14% of the species recorded were invasive and 86% were native to the Korean Peninsula and adjacent islands.
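The percentages in the last sentence can be checked against the counts given in the abstract. The short sketch below assumes that the 14% refers to the 15 naturalised species and the 86% to the 93 native species out of 108; that correspondence is an interpretation of the text, not something the paper states explicitly.

```python
# Counts as reported in the abstract of this paper.
total_species = 108
native_species = 93
naturalised_species = 15

# The two groups should account for every species in the inventory.
assert native_species + naturalised_species == total_species

# Shares rounded to whole percentages, as in the text.
naturalised_share = round(100 * naturalised_species / total_species)
native_share = round(100 * native_species / total_species)
print(naturalised_share, native_share)
```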