A crowdsourcing approach to track the expansion of the brown marmorated stinkbug Halyomorpha halys (Stål, 1855) in France

Abstract Background Halyomorpha halys (Stål, 1855), the brown marmorated stinkbug (BMSB) is a highly successful invasive species, native to eastern Asia. It has managed to spread into North America and Europe in recent decades, causing severe damage to various crops. BMSB has been detected in Europe in 2004 and has since expanded in more than 20 countries from Sweden to Greece and Spain to Turkey, the South European Territory of Russia (Krasnodar region) and Abkhazia. In 2012, we set up a citizen science survey to monitor BMSB expansion in France. New information The present crowdsourcing survey was initiated in 2012 and provided a large number of occurrence points of BMSB. These data allowed to track the expansion of the species in France from 2012 to 2019 and brought information about its phenology and distribution in various habitats. The dataset comprises both valid and invalid sightings, thereby allowing us to examine changes in the quality of citizen reports during the course of the survey. Despite a large proportion of misidentifications, the survey provided a large number of valid occurrences. Furthermore, valuable information on hemipterans of Pentatomidae and Coreidae families entering habitations were also gathered. The dataset also illustrates that, although designed for a large public, the Agiir application was mostly used by urban dwellers with very few sightings stemming from professionals of agricultural sectors.


Introduction
In Europe, two species of the family Pentatomidae are listed as invasive, namely Halyomorpha halys and Nezara viridula (McPherson 2018). Halyomorpha halys (Stål, 1855), the brown marmorated stinkbug (BMSB) is native to eastern Asia and is common in the temperate regions of China, Japan and the Republic of Korea (Hamilton et al. 2018). It has been reported in North America in 2001 and in Europe in 2004 where it quickly expanded in more than 20 countries, as well as in the South European Territory of Russia and in Abkhazia (Bergmann et al. 2016, Gapon 2016. BMSB was first reported from France in 2012 in the suburb of Strasbourg (Eastern France) (Callot and Brua 2013) and has since colonised large parts of the country. We developed a citizen science operation at the beginning of the invasion (Streito et al. 2014) and we report here the data (Suppl. material 1) that were gathered between 2012 and 2019. Our crowdsourcing survey is based on a smartphone application named Agiir allowing volunteers to report cases of BMSB observations. The data are stored in a database and further carefully checked by professional entomologists. We here provide all reports, either valid BMSB observations or erroneous sightings in order to provide the entire dataset generated in the course of the survey.

Sampling methods
Sampling description: The survey was based on a smartphone application named Agiir (Alerter Gérer les Insectes Invasifs et/ou Ravageurs) developed at our institute Inrae, aiming at providing information about harmful invasive insect species and collecting citizen reports in France. The system is backed by a website providing species diagnostic tools including pictures and morphological description, allowing BMSB identification along with comprehensive information about its biology (https://ephytia.inrae.fr/fr/P/128/Agiir). Volunteers reported their sightings by means of their smartphone (date, coordinates as decimal degrees and the name of the observer are automatically filled in by the phone application). The application is available for smartphones or touch tablets and is distributed free of charge. It was advertised through various channels including national publication in French (Streito et al. 2014), Inrae institutional communication (website) or entomologist networks, such as "Le monde des insectes" (www.insecte.org). Our survey thus consisted of "passive monitoring" that is an approach where scientists did not actively collect data, but rather relied on citizen contribution. The data were stored in a database, examined and confirmed by entomologists. The survey started on 15 August 2012.
Two additional sources of data were gathered alongside the Agiir system: a) extra-Agiir citizen reports received by phone or email and b) BMSB reports from "naturalist" networks. Naturalists refer to a special category of contributors having amateur or advanced knowledge in the field of entomology. Their observations were directly reported to us by phone or emails or through different entomologist networks, for example, Le monde des insectes (www.insecte.org). JCS collected these reports from 15/08/2012 until the end of 2019 and checked specimen identification, dates and spatial location. It should be noted that Agiir reports, generated by naturalists, could not be distinguished from other reports.
Without duplicated records, the dataset contained a total of 4002 reports, 3538 from Agiir application, 377 citizen direct reports and 87 extra-Agiir naturalist reports.

Species identification
Species identification relied on the picture associated with the records (Figs 1, 2, 3). Valid records were associated with the label "Halyomorpha halys" in the field named "scientific_name". Invalid records corresponded to different situations. A set of pictures illustrating the criteria allowing the identification of Halyomorpha halys and Rhaphigaster nebulosa, two species frequently confused by citizen science contributors. A-G. valid reports of H. halys checked using the following criteria: antennae segment V and IV with white rings facing each other, elongated dark spots on the fore wing membrane, no spine on the underside base of abdomen and light callus along scutellum base; H-M. invalid reports where H. halys is confused with Rhaphigaster nebulosa. R. nebulosa is identified on the basis of the following criteria: antennae segment V and IV with white rings separated by a dark ring, rounded dark marks on the fore wing membrane, no light callus on scutellum base, spine on the underside base of abdomen.

1.
Record associated with no picture or poor quality picture that did not allow reliable identification (picture definition too low, blurry picture, truncated picture or unzoomed picture. These cases were associated with labels "no picture" or "poor picture definition" in the "issues" field. No picture implied no identification. In the case of low-quality pictures, we identified the specimen at the best taxonomic possible level.

2.
Reports on eggs, exuvia or young nymphs for which formal identification is difficult were listed as Pentatomidae in the field "scientific_name". 3.
Some reports were associated with screenshots instead of phone pictures. The problem is that such reports might not be illustrated with the picture of the reported specimen, but a picture considered to be representative of the specimen by the contributor. For example, one report showed a screenshot of the Wikipedia page associated of Leptoglossus occidentalis ( https://fr.wikipedia.org/wiki/Leptoglo ssus_occidentalis). Such highly unreliable reports were associated with the modality "screenshot" in the "issues" field. Some sightings revealed other species, generally other pentatomids, such as the mottled shieldbug Rhaphigaster nebulosa which is commonly mistaken for the BMSB in France (Streito et al. 2014). Other species included Nezara viridula, Dolycoris baccarum, Palomena prasina or Pentatoma rufipes. The western conifer seed bug Leptoglossus occidentalis (Coreidae) was frequently reported as well. In certain cases, volunteers reported several hemipteran species at once, in addition to BMSB or not. These species were reported in the "additionnal_species" field. Finally, duplicated records i.e. those sent on several occasions to the application with identical picture were labelled "duplicated form" in the "issues" field. For each duplicate, the id of the corresponding duplicated record is reported in the "initial_duplicated_form" field.
As for Agiir reports without picture, but with potentially important information (e.g. occurrence in a previously unoccupied region), the taxonomic identification was checked by contacting the contributor in order to get either a picture or a specimen. At the beginning of the BMSB expansion in France i.e. 2012-2013, all Agiir records came from naturalists and were checked by specimen examination. We later validated reports on the basis of pictures.

Environment and phenology
A standardised description of the environment where the specimen was observed is available within the Agiir system. It comprises three modalities: "habitation", "garden" and "crop" and appears in the "environment" field. Direct reports (extra-Agiir) were associated with a more detailed environment description that are reported in the "verbatim_environment" field. The latter description was simplified and reported in the "environment" field. As shown in Figure 5, most of the reports occur inside habitations and contributors were mostly urban dwellers reporting domestic nuisances of BMSB entering houses in October. Very few sightings were gathered from the professionals of agricultural sectors.

Additional information
Each Agiir record was associated with the location of the observation through the longitude and latitude coordinates provided by the Global Positioning System (GPS) device of the smartphone system. Standards coordinates provided in decimal degrees correspond to the World Geodetic System WGS 84 aka European Petroleum Survey Group EPSG 4326 (http s://spatialreference.org/ref/epsg/wgs-84/). Some extra-Agiir reports missed spatial coordinates, but were associated with precise locality names. Coordinates were manually deduced by positioning the occurrence at the centroid of the locality.

Geographic coverage
Description: The survey was initially intended to generate observational data in France, but citizens of other countries could contribute, in particular volunteers from neighbouring francophone countries (e.g. some sightings were associated with longitude/latitude coordinates pointing to Canada). Certain reports corresponded to locations falling in the sea and/or in countries far away from the known BMSB range (e.g. Niger, Benin, Cameroun) and were thus labelled "uncertain location" in the "issues" field. Distribution of occurrences gathered by citizens and naturalists is displayed in Fig. 5, along with geographic expansion of BMSB.

Taxonomic coverage
Description: The study concerns the brown marmorated stink bug Halyomorpha halys (Stål, 1855) (Hemiptera: Pentatomidae). Data about other species were also collected: bugs of the Pentatomidae or Coreidae family, various arthropods (e.g. Coccinellidae, Aphididae, Cerambycidae, Curculionidae, Diptera) and arachnids. The diversity of reported organisms is listed in Fig. 6 and depicted in Figs 1, 2, 3. An occurrence of Erthesina fullo, an invasive species not currently established in France, was recorded. It corresponded to a single specimen observed in a postal parcel. This sighting is assimilated to an interception with no proof of establishment.

Temporal coverage
Notes: Data collection started on 15 August 2012 and is on-going. We here report data collected between 15 August 2012 and 31 December 2019.

Usage licence
Usage licence: Creative Commons Public Domain Waiver (CC-Zero)

Data resources
Data package title: A crowdsourcing approach to track the expansion of the brown marmorated stinkbug Halyomorpha halys (Stål, 1855) in France.
Number of data sets: 1 Figure 6.
Barplot showing the taxonomic diversity of the arthropods reported by citizen volunteers.

Data set name:
A crowdsourcing approach to track the expansion of the brown marmorated stinkbug Halyomorpha halys (Stål, 1855) in France.