Insect collecting bias in Arizona with a preliminary checklist of the beetles from the Sand Tank Mountains

Abstract
Background
The State of Arizona in the south-western United States supports a high diversity of insects. Digitised occurrence records, especially from preserved specimens in natural history collections, are an important and growing resource to understand biodiversity and biogeography. Underlying bias in how insects are collected and what that means for interpreting patterns of insect diversity is largely untested. To explore the effects of insect collecting bias in Arizona, the State was regionalised into specific areas. First, the entire State was divided into broad biogeographic areas by ecoregion. Second, the 81 tallest mountain ranges were mapped on to the State. The distribution of digitised records across these areas was then examined. A case study of surveying the beetles (Insecta, Coleoptera) of the Sand Tank Mountains is presented. The Sand Tanks are a low-elevation range in the Lower Colorado River Basin subregion of the Sonoran Desert from which a single beetle record was published before this study.
New information
The numbers of occurrence records and collecting events are very unevenly distributed throughout Arizona and do not strongly correlate with the geographic size of areas. Species richness is estimated for regions in Arizona using rarefaction and extrapolation. Digitised records from the disproportionately highly collected areas in Arizona represent at best 70% of the total insect diversity within them. We report a total of 141 species of Coleoptera from the Sand Tank Mountains, based on 914 digitised voucher specimens. These specimens add important new records for taxa that were previously unavailable in digitised data and highlight important biogeographic ranges. Possible underlying mechanisms causing bias are discussed and recommendations are made for future targeted collecting of under-sampled regions.
Insect species diversity is apparently at best 70% documented for the State of Arizona with many thousands of species not yet recorded. The Chiricahua Mountains are the most densely sampled region of Arizona and likely contain at least 2,000 species not yet vouchered in online data. Preliminary estimates for species richness of Arizona are at least 21,000 and likely much higher. Limitations to analyses are discussed which highlight the strong need for more insect occurrence data.


Introduction
Insects represent over half of all described species (Mayhew 2007) and perhaps not more than 20% of those that exist have thus far been described (Gaston 1991). The State of Arizona, located in south-western United States along the Mexico border, has high insect diversity and ranks as the State with the most species actively monitored for conservation (Bossart and Carlton 2002). Entomologists from around the country and around the world travel to southern Arizona every year during the monsoon season (late summer and early fall) where popular canyons may have five to ten campsites and road pull-offs occupied by blacklights and collectors scrambling around them until early morning. Despite its insect diversity and popularity as a collecting destination, we are unaware of any empirical studies that assessed total insect species richness within the State or its subregions or explored biases in insect collecting therein.
Biodiversity occurrence records represent an invaluable and irreplaceable data source for understanding biodiversity, evolution and ecology (Page et al. 2015, Guralnick et al. 2016, Johnston et al. 2018, Kharouba et al. 2018, Meineke et al. 2018, Lendemer et al. 2019, Hedrick et al. 2020). The vast majority of these records, at least for insects, presently come from digitised preserved specimens from natural history collections. However, we know that the specimens stored within collections are not evenly distributed throughout space and time and have many implicit biases intertwined with the history and methods used to accumulate them (Hortal et al. 2015, Johnston et al. 2018, Kharouba et al. 2018, Cooper et al. 2019, Whitaker and Kimmig 2020, Laney et al. 2021, Davis et al. 2022). Human observations have been rapidly increasing thanks to popular platforms such as iNaturalist, which have their own slightly different biases, limitations and strengths. We broadly consider fine-scale documentation of individual insects to be "collecting" for the purposes of this paper, though most of our recommendations are focused on traditional preserved-specimen-based collections.
The goals of this study are twofold. First, we present an analysis of digitised insect occurrence data from the State of Arizona and compare the relative levels of sampling for different mountain ranges and ecoregions. Second, we address one example of an underexplored region and provide the first checklist of beetle species from the Sand Tank Mountains of central Arizona. We hope that these data and analyses can inform and bolster future insect collecting to improve our understanding of Arizona's biodiversity.

Arizona regionalisation
Arizona encompasses a wide array of habitat types ranging from extreme deserts to mesic conifer forests. To efficiently classify these regions, different levels of the hierarchical ecoregions defined by Omernik and Griffith (2014) can be used. Level 3 of those ecoregions gives a broad look at the State and is helpful to consider distributions and collecting efforts in broad strokes (Fig. 1a). However, this level of classification does not account for the fine scale habitat and plant community shifts that are seen, especially in the mountainous parts of the State (see Brown (1978), Brown et al. (2007)).
Arizona can also be regionalised by its many mountain ranges. The Madrean Sky Islands are a series of discrete mountain ranges that arise from surrounding grasslands and are variously forested at their higher elevations (Fig. 1a area 12.1.1). These mountains are situated in the only gap of the North American Cordillera between the Rocky Mountain range to the north and the Sierra Madre Occidental range to the south and are a priority in insect conservation and phylogeographic research (Stock and Gress 2006, Ober et al. 2011, Moore et al. 2013, Halbritter et al. 2019, Yanahan and Moore 2019). Beyond the Madrean Sky Islands, the western and southern parts of Arizona are part of the Basin and Range Province of western North America which is characterised by a large number of mountain ranges that have formed as the Earth's crust stretched in this region (Morrison 1991) and which covers the Sonoran and Mojave Deserts (Fig. 1a ( Fleischner et al. 2017) which is a slightly oblique area of plateaus and associated mountains that generally separates north-eastern Colorado Plateau from the southern Basin and Range Province (Fig. 1a area 13.1.1 in centre of the State).
Outlines of the 81 mountain ranges in Arizona with the highest peaks were geographically mapped for use in this study and are shown in Fig. 1b. The shapefiles of Arizona ecoregions and mountain ranges now allow for exploration of digitised insect occurrence records (Fig. 1c,d) to understand underlying patterns in bias and diversity of these areas.

Sand Tank Mountains
The Sand Tank Mountains, located in south central Arizona (Fig. 1b label 78, Fig. 2), cover a moderately large area in the Lower Colorado River Basin region of the Sonoran Desert (Brown 1978). The mountain range is situated with roughly its northern half on the Sonoran Desert National Monument bounded by US Interstate 8 to the north and its southern half on the Barry M. Goldwater Air Force Range. The mountains are, therefore, nearly entirely on public land, though access and collecting largely require permits from the latter two entities. The highest point in the range, Maricopa Peak, only reaches 1234 m in elevation. The mountains are named for a series of tanks or tinajas (natural stone water catchments) that were often largely filled with sand and typically available to wildlife and humans for most of the year (Bryan 1925: 224-228). ... scientific literature. Brown (1978) included the Sand Tank Mountains in a list of lower-elevation Sonoran Desert ranges which had relictual patches of grassland and chaparral species on them. The Sand Tanks are also the location of a notable Jaguar (Panthera onca (Linnaeus, 1758), family Felidae) record from 1930 which represents the south-western known limit of the species in the State and likely the furthest documented excursion of the species into the Sonoran Desert (Babb et al. 2022).
Prior to the study presented here, a total of 27 occurrence records representing 16 insect taxa were available online (GBIF 2022a). This includes only a single record, from a photo voucher on iNaturalist, for the order Coleoptera, which represents nearly 25% of all described species on Earth. We were unable to find any other beetle records from the mountains in the published scientific literature or in our own work in Arizona natural history collections.

Data sources and region delimitation
Occurrence records for insects (Fig. 1) were downloaded from the online aggregator Global Biodiversity Information Facility (GBIF). Records were downloaded from GBIF (GBIF 2022a) by searching for every record that had geographic coordinates, contained 'Arizona' in the stateProvince data field and that belonged to Class Insecta, resulting in a dataset of 712,309 occurrence records. GBIF was chosen as the only data source for this analysis in part because of its versioned DOI for downloads and also because it provided more records than any other portal. The Integrated Digitized Biocollections (iDigBio) portal contains 612,142 records using the same search parameters and the Symbiota Collections of Arthropods Network (SCAN) portal contains 683,645 records, nearly all of which overlap between the portals. The GBIF mediated data are further enhanced by their backbone taxonomy which is a synthetic management classification for the portal (GBIF Secretariat 2022). All records are harmonised to the GBIF taxonomy, which helps to clean misspellings and differently formatted data from the various data contributors, making diversity and species richness estimates more plausible. However, the GBIF taxonomy is influential in another way, since there are so many taxonomic names that are not yet known to GBIF. This may affect as many as 75% of records and names for major insect orders (Waller 2022).
The occurrence records were imported into QGIS 3.24 (QGIS Development Team 2022) and checked against shape files with polygons representing ecoregions from the United States EPA (Omernik and Griffith 2014) and mountain ranges within Arizona. The list of mountain ranges was generated primarily by consulting online resources for mountain climbers. A curated list of mountain ranges and their highpoints (Anonymous 2022) was used as the starting point and each range was verified through a combination of United States Geological Survey (USGS) topographic maps, Google Maps searches and consulting regional gazetteers and atlases. Our working definition of a mountain range for the purposes of biological regionalisation is as follows: a geographically contiguous string of mountains which seem to have a shared geological origin and are separated from other such groups by a lower elevational region which appears to have different geology and/or vegetative cover as assessed via satellite imagery. These mountain ranges typically matched very closely those labelled on topographic maps and gazetteers. Polygons for each mountain range were drawn by hand around geological formations as viewed in satellite imagery; topographic maps from the USGS, personal experience in the field and mountain range and place names in Google Maps were used to ascertain a polygon that represented the footprint around the mountain range. Shifts in geology and vegetative cover were especially helpful to define the periphery of mountain ranges. Our definitions attempted to delimit potentially biologically meaningful entities more than they were an attempt to perfectly outline the underlying geology. Any occurrence found within the footprint of one of the included shapes was annotated as such. A custom script (Suppl.
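The annotation step described above, tagging each occurrence with the polygon it falls inside, can be illustrated with a minimal point-in-polygon sketch. This is not the authors' QGIS workflow or their custom script; it is a plain ray-casting test, and the region name, the rectangular footprint and the coordinates are hypothetical values chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count edges crossed by a ray running east from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def annotate(record: Point, regions: Dict[str, List[Point]]) -> Optional[str]:
    """Return the name of the first region whose footprint contains the record."""
    for name, polygon in regions.items():
        if point_in_polygon(record, polygon):
            return name
    return None


# Hypothetical rectangular footprint (real footprints were hand-drawn polygons)
regions = {
    "Hypothetical Range": [(-112.6, 32.6), (-112.3, 32.6), (-112.3, 32.9), (-112.6, 32.9)],
}

print(annotate((-112.45, 32.75), regions))  # inside the footprint
print(annotate((-111.00, 32.75), regions))  # outside every footprint
```

Treating longitude and latitude as planar coordinates is a simplification that is usually acceptable for footprints of this size; a GIS package would handle projections and polygons with holes more rigorously.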

Evaluating digitised records for collecting bias
For entomological field work, differences in occurrence records likely reflect a combination of biological differences (e.g. increased insect biomass and population densities would increase the number of occurrence records), differences in survey effort (e.g. one area may have been visited by 100 researchers a year and another area by 10 researchers per year) and differences in social practices and research interests (e.g. one person may collect 100 of 200 observed individual insects at a particular event, while another person may collect 5 of 200 observed individual insects at a different event). Insect occurrence records were, therefore, analysed according to three different metrics, namely records, collecting events and species. First, the total number of occurrence records for a given ecoregion or mountain range was tallied as a sum. Second, collecting effort was approximated by pooling records into putative collecting events. All insect records from a particular ecoregion or mountain range that had an identical date (using dwc:day, month and year fields) and collector (dwc:recordedBy field) were considered to belong to a single collecting event. Third, putatively unique insect taxa were totalled for each ecoregion and mountain range by counting unique scientific names (dwc:scientificName field). These names correspond to the taxonomic interpretation according to the GBIF backbone taxonomy. This count may be considered an overestimate because different individuals of the same taxon may have been identified to different ranks (e.g. subspecies, species, genus and family) and be counted multiple times. However, because so many taxa at the species level are not known to the GBIF taxonomy, many differently identified taxa are prone to being 'lumped' into a higher classification level (Waller 2022).
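The three metrics described above can be sketched in a few lines. The field names follow the Darwin Core terms named in the text (day, month, year, recordedBy, scientificName); the `region` key, the collector names and the taxon names are hypothetical stand-ins, and this is only an illustration of the tallying logic, not the authors' script.

```python
from collections import defaultdict

# Hypothetical records already annotated with a region
records = [
    {"region": "Range A", "year": 2022, "month": 4, "day": 29,
     "recordedBy": "Collector 1", "scientificName": "Taxon alpha"},
    {"region": "Range A", "year": 2022, "month": 4, "day": 29,
     "recordedBy": "Collector 1", "scientificName": "Taxon beta"},
    {"region": "Range A", "year": 2022, "month": 7, "day": 2,
     "recordedBy": "Collector 2", "scientificName": "Taxon alpha"},
    {"region": "Range B", "year": 2021, "month": 8, "day": 15,
     "recordedBy": "Collector 1", "scientificName": "Taxon gamma"},
]


def summarise(records):
    """Tally records, putative collecting events and unique names per region."""
    n_records = defaultdict(int)
    events = defaultdict(set)
    taxa = defaultdict(set)
    for r in records:
        region = r["region"]
        n_records[region] += 1
        # Same date and same collector within a region = one putative collecting event
        events[region].add((r["year"], r["month"], r["day"], r["recordedBy"]))
        # Unique name strings, regardless of taxonomic rank
        taxa[region].add(r["scientificName"])
    return {
        region: {
            "records": n_records[region],
            "events": len(events[region]),
            "taxa": len(taxa[region]),
        }
        for region in n_records
    }


for region, summary in summarise(records).items():
    print(region, summary)
```

Because the event key is an exact match on the collector string, differently formatted collector names for the same person would be counted as separate events in this sketch.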
For studies where the goal is to create a verified checklist of names, the original verbatim data from individual providers are included on GBIF, but we deemed the taxon names as filtered by GBIF to be more standardised and at least easily comparable across ecoregions and mountain ranges. All data are made available as supplemental materials for annotated occurrence records (Suppl. material 1), summarised data for ecoregions (Suppl. material 2) and mountain ranges (Suppl. material 3).
Sampling effort to geographic area relationships were explored using linear regressions of both total occurrence records and tabulated collecting events to geographic area of regions (both for ecoregions and mountain ranges). Our study is primarily focused on understanding the scale and bias of insect records as they relate to geographic areas in Arizona and, therefore, presents somewhat simplistic explorations of the data as a first step towards future studies which may employ more complex models to explore specific biological questions. However, we did assess our dataset for normality since different analytical techniques might apply to these data depending on the underlying biological power laws at play (García Martín and Goldenfeld 2006, Packard 2014). The untransformed data were not normal, but the log-transformed data were. Normality assessments and analyses on log-transformed data and plots of species by geographic area of ecoregions and mountain ranges are available in Suppl. material 6.
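As an illustration of the kind of regression described above, the following minimal sketch fits an ordinary least-squares line to log-transformed area and record counts. The function name and the input values are hypothetical, and the sketch is not taken from the supplementary analyses; it only shows how a power-law relationship between area and records appears as a straight line on log-log axes.

```python
import math


def loglog_fit(areas, counts):
    """Ordinary least squares of log10(count) on log10(area).

    Returns (slope, intercept, r_squared). All inputs must be positive,
    so regions with zero records would need separate handling.
    """
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical areas (square km) and record counts for five regions
areas = [120.0, 450.0, 900.0, 2300.0, 5100.0]
counts = [35, 60, 4100, 210, 950]

slope, intercept, r_squared = loglog_fit(areas, counts)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.2f}")
```

A low R² in such a fit would correspond to the weak relationship between sampling and area that the study reports; a single heavily collected region can dominate the fit, which is one reason to inspect the plots as well as the coefficients.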
Possible factors responsible for underlying bias within the occurrence records were assessed using the R package sampbias (Zizka et al. 2020) to examine how spatial distribution of roads, cities, airports and rivers might affect where insects are collected. The analyses were run using all georeferenced insect records for the State using default settings within sampbias which performs a Bayesian analysis to determine the range of posterior probabilities for how each factor biases the underlying dataset. The bias each factor introduces is then compiled into a spatial model for an expected sampling effort given the calculated biases. The resulting bias model was calculated for Arizona and was then visualised along with a heatmap of insect occurrence records for the State.
Species richness within areas was estimated using the R package iNEXT (Chao et al. 2014, Hsieh et al. 2020) to perform species rarefaction and extrapolation analyses. Counts were tabulated for the total number of records for each unique taxon within a region and these abundance data were given to iNEXT and analysed using q = 0 for the appropriate Hill number estimation for abundance data (Chao et al. 2014). Our analyses were primarily focused on exploring relative completeness of species richness sampling found within occurrence data, but future studies primarily interested in modelling precise species richness would likely need to explore records in more detail to discern where there is and is not overlap at different taxonomic scales (e.g. how should records to the genus level be counted if a single species from that genus is already counted from the area?). We analysed taxa as unique name strings as described above for all analyses. We further reanalysed several areas with a more conservative approach where we only used the subset of records that were identified to species (i.e. dwc:taxonRank = SPECIES) to explore how that changed extrapolation of total species richness. None of the rarefaction and extrapolation analyses presented here approaches an asymptote within an estimated doubling of sampling effort and, therefore, has limitations in truly accounting for unobserved taxa in species richness estimates (Willis 2019); nevertheless, the rarefaction curves and estimates are still useful tools to understand uses and limitations of the underlying data.
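To make the idea of extrapolating richness from abundance counts concrete, the sketch below computes the classic Chao1 lower-bound estimator from the numbers of singletons and doubletons. This is a simpler stand-in for illustration, not the iNEXT rarefaction and extrapolation procedure used in the study, and the abundance vector is hypothetical.

```python
def chao1(abundances):
    """Classic Chao1 richness estimate from per-taxon record counts.

    Uses S_obs + f1^2 / (2 * f2), falling back to the bias-corrected form
    S_obs + f1 * (f1 - 1) / 2 when there are no doubletons.
    """
    counts = [c for c in abundances if c > 0]
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # taxa recorded exactly once
    f2 = sum(1 for c in counts if c == 2)  # taxa recorded exactly twice
    if f2 > 0:
        return s_obs + (f1 * f1) / (2 * f2)
    return s_obs + f1 * (f1 - 1) / 2


# Hypothetical counts of records per unique taxon name within one region
abundances = [5, 3, 1, 1, 1, 2, 2]

estimate = chao1(abundances)
observed = len(abundances)
print(f"observed={observed} estimated={estimate} completeness={observed / estimate:.1%}")
```

The more rare taxa (singletons) a dataset contains relative to doubletons, the larger the gap between observed and estimated richness, which is the intuition behind curves that fail to approach an asymptote.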

Checklist of Sand Tank Mountains Coleoptera
Three collecting trips were made to the Sand Tank Mountains to survey for beetles. The first was on 29 April 2022 where blacklighting and night searches with headlamps were performed in a rocky basin near a paved wildlife water catchment basin ( Arnett et al. (2002) to the level of family and genera. Species-level identifications were then performed by using appropriate primary literature or by consulting local taxonomic experts. The final identification resource for each species in the checklist is provided. For taxa identified by experts where a specific source is unknown, we attribute the identification to that person as unpublished data.
The checklist of species was built using the Ecdysis portal checklist tool from all of the digitised specimen records created as part of this project. The curated checklist was then exported for publication and inserted into the ARPHA writing platform for this journal.
Families are presented in alphabetical order and all species are presented alphabetically under their family. A total of 140 new species-level records were identified, anchored by 914 fully digitised pinned and labelled voucher specimens. When combined with the previously available record, the following checklist enumerates 141 species of Coleoptera from the mountain range.
Notes: Identification of this genus requires examination of male terminalia. Our single putatively female specimen was only identified to the subgenus Scymnus (Pullus), of which there are a number of species known from this region.
Notes: This diverse genus is difficult to identify without genitalic dissections and we were unable to identify our specimen to species.
Notes: This genus has limited identification resources available. Our two specimens resemble Mordellina testacea (Blatchley, 1910), a species only reported from the eastern United States.
Notes: A moderate series of this Oxycopis species likely represents an undescribed species which we were unable to associate with any currently known from the western United States.
Notes: This speciose genus is in need of a modern revision. Our single specimen has elytra that lack discernible striae and may be near Tricorynus lentus (Fall, 1905).
Notes: Our specimens somewhat resemble Ahasverus rectus (LeConte, 1854), but differ in several characters from the holotype of that species. We have seen conspecific specimens to ours labelled as "Ahasverus n.sp." in collections and think it is likely that it is, indeed, an undescribed species.

Notes: We were unable to identify our single specimen of this species beyond the level of genus in this speciose group.

Notes on the Sand Tank Mountains Coleoptera
The checklist provided herein substantially increases the entomological knowledge of this mountain range. Our collecting efforts were unfortunately limited in scope, as they did not include sampling during the peak flowering season that typically occurs between late February and April or in the winter, which has a distinct insect fauna that often does not overlap with the taxa found during the warmer times of year. We also were unable to access a number of distinct habitats, including the relictual chaparral plant communities, that likely would have greatly increased our taxon count.
Many of the species reported from this study occur throughout the Lower Colorado River Basin subregion of the Sonoran Desert, but are often poorly represented in natural history collections or in digitised occurrence records. Six species recorded by us had no prior digitised records from the State of Arizona even though they are known in literature (Diclidia greeni, Horistonotus lutzi, Mulsanteus arizonensis, Niptus ventriculus, Oxycopis mariae and Ptinus paulonotatus). Many more represent the second digitised record or the first preserved specimen, as opposed to a human observation, from the State. These records are notable because they provide specific examples of how digitised records fall short of representing the full knowledge of the State's fauna and of how limited the available distributional information is for many species. It is also notable that three collecting events produced likely three undescribed species (Ahasverus sp., Allopoda sp. and Oxycopis sp.). Our specimens of Asbolus mexicanus angularis are the first reported from Maricopa County in Arizona and represent a roughly 50-mile (ca. 75 km) north-east range extension of that species. Many other species we report may represent additional notable range extensions, though the limited knowledge and digitised specimens from the region hinder more in-depth analyses.
The actual number of Coleoptera species that inhabit the Sand Tank Mountains is surely much higher than what is recorded here. Based on our experience in the region, we presume this list is no more than 30% of the actual diversity and recommend that future studies focus on flower-feeding taxa and employ other trapping techniques, such as flight-intercept traps and Lindgren funnels. Rarefaction and extrapolation (Fig. 3) estimate a total richness of 193 species, which would mean we have sampled roughly 72% of the diversity so far. The less severe shortfall implied by this analysis may be due to our employing similar collecting methods on all our trips. The estimate is perhaps a better reflection of the total number of species we could collect given the same techniques, while not accounting for taxa that diversifying techniques would add.
We would define the Coleoptera fauna of the Sand Tank Mountains as typical of the Lower Colorado River Basin of the Sonoran Desert. Many species we collected are typically found in arid low-elevation regions of the State, as exemplified by the 30 species of Tenebrionidae collected, a family that is highly diverse in such habitats. We postulate that the beetle fauna of the Sand Tank Mountains is likely similar to the fauna found throughout most of the low mountain ranges in the south-western portion of Arizona, but direct comparison is hindered by the lack of knowledge of those other mountain ranges.

Collecting bias across ecoregions
Insect records and diversity for the ecoregions of Arizona are summarised in Fig. 4. The number of occurrence records is not very well correlated with the geographic area of the regions (Fig. 4a). When distinct collecting events are compared to geographic area (Fig. 4b), a trend of slightly more even distribution of sampling effort per area is observed. It seems clear that, relative to all the ecoregions in the State, the Madrean Archipelago (label 12.1.1 in figures) is disproportionately highly sampled, while the Arizona/New Mexico Plateau (label 10.1.7 in figures) is comparatively weakly sampled. This lack of correlation means that collecting efforts are not evenly distributed throughout space or between the different ecoregions.

Collecting bias across mountain ranges
Insect records and collecting events for Arizona mountain ranges by geographic area are summarised in Fig. 5. In contrast to the data for ecoregions presented above, mountain ranges show a much less even distribution of collecting records. Both individual occurrence records by area (Fig. 5a) and collecting events by area (Fig. 5b) are highly skewed by a few very disproportionately well-collected mountain ranges and a large number of ranges with almost no sampling.
The most distant outlier by far is the Chiricahua Mountains (label 5 in Fig. 5) which are located in the extreme south-eastern corner of the State and represent 117,396 (40%) out of 296,421 total occurrence records which were mapped to all 81 mountain ranges examined here. This high sampling rate is, in large part, due to an active research station located within the range. The following four mountain ranges were also incredibly highly sampled, though nowhere near the sampling effort seen in the Chiricahuas. The Huachuca Mountains (label 6 in Fig. 5)

Factors driving collecting bias
Analysis of the influence of proximity to roads, cities, airports and rivers is shown in Fig. 6. Proximity to roads was found to be the strongest factor of bias within our dataset, followed by proximity to cities and then airports. Proximity to rivers apparently has almost no effect on sampling bias. The underlying layer of roads was largely made up of paved, government-maintained roads and does not contain all smaller roadways that are often unpaved which provide access to most mountain ranges in Arizona. The analysis estimated both the weight of each biasing factor (Fig. 6a) and how that bias behaved by distance (Fig. 6b).
These biasing factors together generate a model of expected sampling frequency across Arizona. Fig. 7a shows this model rasterised across the State and Fig. 7b shows a heatmap of insect records overlaid on top. These visualisations demonstrate that, while proximity to population centres and roads is important, these factors clearly do not, alone, explain the distribution of insect collecting records across the State. In fact, the most heavily sampled Sky Islands in the south-eastern portion of the State are in areas of low expected collecting effort, while regions lying along major highways between population centres, such as the Interstate 10 corridor in central Arizona, are disproportionately less well collected.
Figure 5. Occurrence records by mountain range from the State of Arizona. Point labels match mountain ranges in Fig. 1b.

Species richness estimates
The disproportionate levels of data amongst mountain ranges discussed above demonstrate that it is too soon to accurately model insect diversity from occurrence records for most ranges. However, the Chiricahua Mountains are so disproportionately highly collected that they offer an important case study into what we can infer about insect diversity from occurrence records. Analysis of species richness for the Chiricahuas (Fig. 8) indicates that we are fairly far from reaching a plateau or accurate assessment of the actual taxonomic diversity. The preliminary estimate for all taxa (Fig. 8a) suggests 9,600 unique taxa are present, while a more conservative estimate of species (Fig. 8b) suggests 6,500 species are present. Perhaps more important than the total numbers, both estimates suggest that current digitised data at best account for 70% of the actual diversity from those mountains.
Figure 6. Biasing factors in relation to digitised insect records for Arizona. Proximity to roads, cities, airports and rivers are shown as inferred via sampbias (Zizka et al. 2020).
a: Inferred posterior weight of each biasing factor. Each is represented by a narrow range, likely due to the size of our occurrence dataset. b: Estimated sampling rate as a function of distance in kilometres from the biasing factor.
Scaling up to ecoregions, species richness is similarly incompletely sampled by current collecting efforts (Fig. 9). The Madrean Archipelago, the proportionately highest sampled region by area (Fig. 4), boasts the largest recorded taxonomic diversity of the six ecoregions with just over 15,300 taxa which falls well short of a preliminary estimate of over 21,700. All ecoregions apparently require more than double the current sampling effort to begin to find a plateau and accurate species richness estimate.
Species richness estimates for the entire State of Arizona again fail to plateau with the available data (Fig. 10). Preliminary species richness estimates are much higher than the observed counts, with all taxa (Fig. 10a) predicting roughly 36,000 total taxa and species-only data (Fig. 10b) predicting just over 21,600 species. As with the Chiricahua Mountains and ecoregions discussed above, both Arizona richness estimates imply that the current data only represent around 70% of predicted diversity and similarly demonstrate that online data will need to be greatly expanded before accurate estimates can be made.
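The recurring figure of about 70% can be checked with simple arithmetic from the observed and estimated counts quoted in the text and in the Fig. 8 caption. The sketch below is only a worked example; the function name is hypothetical, and the Madrean Archipelago values are the rounded figures from the text ("just over 15,300" and "over 21,700"), so the resulting percentages are approximate.

```python
def completeness(observed: int, estimated: int) -> float:
    """Fraction of the estimated richness that is already represented in the data."""
    return observed / estimated


# (observed, estimated) pairs as quoted in the text and the Fig. 8 caption
estimates = {
    "Chiricahua Mountains, all taxa": (6663, 9600),
    "Chiricahua Mountains, species only": (4604, 6500),
    "Madrean Archipelago, all taxa (rounded)": (15300, 21700),
}

for name, (observed, estimated) in estimates.items():
    print(f"{name}: {completeness(observed, estimated):.1%}")
```

All three ratios fall within about one percentage point of 70%, which matches the similarity across spatial scales noted in the text.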

Species richness estimates
All rarefaction and extrapolation estimates of species (or taxonomic) richness failed to plateau and gave very similar results: only about 70% of the full estimated species richness was observed. As all the analyses across the three scales explored here gave these similar proportional results, it seems clear that the estimates are strongly affected by incomplete sampling. We again urge readers to be cautious with the absolute numbers presented here. However, given our knowledge of the Sand Tank Mountains Coleoptera study and its limitations along with the slopes of all rarefaction curves, the species richness estimates presented here seem to be extremely conservative counts and might be useful as a lowest-end predictor of what the true diversity is. We did not assess the diversity of collecting techniques represented in our dataset. This may mean that we are underestimating total insect taxa, even at larger scales, due to inadequate sampling techniques.
Figure 7. Map plotting the model of expected sampling density given calculated bias of roads, cities, airports and rivers within Arizona.
a: Expected sampling effort given model of collecting bias. Darker hues represent higher expected sampling, while lighter hues represent lower expected sampling. b: Expected sampling effort overlaid with heatmap of insect occurrence records showing actual sampling effort compared to the model.
Figure 8. Species richness estimation curve for the Chiricahua Mountains in Arizona by rarefaction and extrapolation.
a: Rarefaction and estimation curve for all taxa from all ranks for the Chiricahua Mountains (6,663 distinct taxon names observed). b: Rarefaction and estimation curve for species level taxa for the Chiricahua Mountains (4,604 distinct taxon names observed).

Additional factors driving collecting bias
The analyses presented here clearly demonstrate that, according to available data, insect collecting has not been done evenly throughout the State. The underlying factors that drive the biases seen in the data are likely numerous and difficult to fully ascertain. We hypothesise that two of the primary drivers are habitat accessibility and social interactions.

Figure 9.
Habitat access for insect collectors is very important and has many facets. Proximity to roads and population centres is clearly important, but it is not the only limiting factor, and not all cities and roads are the same. For example, the Chiricahua Mountains have roads accessible to passenger cars that go to the highest elevations. The Mountains are almost entirely on public lands and there are nearby towns with accommodation and stores, as well as a popular research station. In contrast, all sites visited in the Sand Tank Mountains involved rugged back-country roads requiring high clearance and four-wheel drive vehicles [...] and Range ecoregion is an interesting example of how these factors interact: most of the lands are public, but the terrain is very rugged and roads and population centres are limited, which is likely why there are so few insect records from the region even though there are no major permitting restrictions.

Figure 10.
The Patagonia Picnic Table Effect (Laney et al. 2021), named after the town of Patagonia, Arizona, is a term from the birdwatching community that describes how one sighting of a rare species leads to increased birdwatching effort in the immediate region. The equivalent in the insect world would be one collector finding a very rare or charismatic species, which prompts future collectors to go to the same locality, either to collect that same species for themselves or in the hope that it might also produce a rare species of their own group of interest. Laney et al. (2021) analysed 10 years of birdwatching data and demonstrated that there is an increase in activity following the initial discovery, but no increased likelihood of finding additional rare species in that area compared to any other. It seems clear that, despite its potential lack of utility in rare species documentation, the effect is a social phenomenon in naturalist communities which may also contribute to uneven insect collecting throughout the State.

Recommendations for future collecting
The full scale of insect diversity has been under-documented for the State of Arizona, its constituent ecoregions and even its most popular mountain ranges, at least in available online data. It is important that the entomological community continues to survey for and collect insects everywhere in the State. Continuing and increasing efforts to mobilise specimen data from natural history collections also remains a high priority and will likely help to account for many species which are not currently represented in online data. It is estimated that not more than 5% of specimens in insect collections of the United States have been fully digitised. It is possible that some of the biases found in our dataset will be corrected as this proportion increases, and it will be very interesting to see what happens to species accumulation curves as the data increase. We recommend that gap analyses be done on digitised insect data when the number of records approximately doubles from its current state, since none of the species richness estimation curves reported here approached an asymptote.
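One simple way to check, in a future gap analysis, whether an accumulation curve is nearing an asymptote is to look at the share of records that are singletons (taxa known from a single record). This share equals the slope of the individual-based rarefaction curve at its endpoint and is the complement of Good's sample coverage estimate. The sketch below is illustrative only: the function names are hypothetical, the 1% threshold is an arbitrary assumption rather than a recommended standard, and the records are invented.

```python
from collections import Counter


def singleton_slope(taxon_names):
    """Share of records that are the only record of their taxon (f1 / n)."""
    counts = Counter(taxon_names)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no records supplied")
    f1 = sum(1 for c in counts.values() if c == 1)
    return f1 / n


def approaching_asymptote(taxon_names, threshold=0.01):
    """True when new records rarely add new taxa.

    The threshold is an assumption for illustration, not an
    established cut-off.
    """
    return singleton_slope(taxon_names) < threshold


# Invented example records (not from the Arizona dataset).
poorly_sampled = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
well_sampled = ["A"] * 50 + ["B"] * 50

print(singleton_slope(poorly_sampled), approaching_asymptote(poorly_sampled))
print(singleton_slope(well_sampled), approaching_asymptote(well_sampled))
```

Repeating such a check as the number of digitised records grows would show whether the curves are flattening or whether new records continue to add new taxa at a similar rate.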
We urge collectors to make a concerted effort to go to new places and to consider specifically targeting undercollected regions and mountain ranges. Small, targeted studies can greatly improve our understanding of the Arizona insect fauna; they are likely the best way to increase knowledge of species distributions and may be crucial to understanding the entire State fauna. Our example of the Sand Tank Mountains beetles highlights how a modest collecting effort can still provide new occurrence records for species from the State and report new localities for taxa that are otherwise considered rare in collections. We do not recommend that collectors avoid the classic and popular sites; indeed, we still need to sample those. However, we advocate that entomologists consider dividing their time in the field, spending only part of their effort in the well-known habitats and the next day somewhere new.
The paucity of insect data from so many mountain ranges in the State strongly limits our ability to adequately protect and conserve insect biodiversity. Entomologists and insect collections should partner with local, State and federal land management agencies to increase insect sampling throughout the State. Increasing partnerships and professional connections with tribal nations within the State is also strongly recommended.
Opportunities for occurrence-data-driven estimates of species diversity are in their infancy, even for a biodiversity hotspot that is accessible and popular. Nevertheless, the growing availability of occurrence data is an important resource that should continue to be developed to understand the diversity and distributions of insects.