Analysis of Chagas disease vectors occurrence data: the Argentinean triatomine species database

Abstract
Background
Chagas disease is a neglected tropical disease and Trypanosoma cruzi (its etiological agent) is mainly transmitted by triatomines (Hemiptera: Reduviidae). All triatomine species are considered as potential vectors; thus, their geographic distribution and habitat information should be a fundamental guide for the surveillance and control of Chagas disease. Currently, of the 137 species distributed in the Americas (Justi and Galvão 2017), 17 species are cited for Argentina: Panstrongylus geniculatus, P. guentheri, P. megistus, P. rufotuberculatus, Psammolestes coreodes, Triatoma breyeri, T. delpontei, T. eratyrusiformis, T. garciabesi, T. guasayana, T. infestans, T. limai, T. patagonica, T. platensis, T. rubrofasciata, T. rubrovaria and T. sordida. Almost 20 years have passed since the publication of the “Atlas of the Triatominae” by Carcavallo et al. (1998) and no work has been done to provide an updated complete integration and analysis of the existing information for Argentinean triatomine species. Here we provide a detailed temporal, spatial and ecological analysis of updated occurrence data for triatomines present in Argentina.
New information
This is the first database of the 17 triatomine species present in Argentina (15917 records), with a critical analysis of the temporal, spatial and ecological characteristics of 9788 records. The information spans the last 100 years (1918–2019) and it was mostly obtained from the DataTri database and from the Argentinean Vector Reference Center. As 70% of the occurrences corresponded to the last 20 years, the information was split into two broad periods (pre-2000 and post-2000). Occurrence data for most species show distribution range contractions, which, from the pre-2000 to post-2000 period, became restricted mainly to the dry and humid Chaco ecoregions. Concurrently, the highest species richness foci occurred within those ecoregions. The species T. infestans, T. sordida, T. garciabesi and T. guasayana mostly colonise human dwelling habitats. This study provides the most comprehensive picture available for Argentinean triatomine species and we hope that any knowledge gaps will encourage others to keep this information updated to assist health policy-makers to make decisions based on the best evidence.


Introduction
Chagas disease is a public health problem in the Americas and elsewhere (World Health Organization 2017) that has Trypanosoma cruzi as its etiological agent. Despite the existence of other infection routes (blood transfusions, laboratory accidents, congenital transmission and oral transmission via contaminated food), vectorial transmission by Triatominae (Hemiptera: Reduviidae) is currently the main route of infection. Thus, up-to-date information on the geographic occurrence of triatomine species becomes important and necessary to understand the epidemiological and control aspects of this disease. Although all Neotropical triatomine species (around 150) are considered as potential vectors, about 70 species have been found to be naturally infected with this parasite (Galvão and Justi 2015) and their geographic distribution information should be a fundamental guide for the surveillance and control of Chagas disease. Some studies that integrated the geographic information of Argentinean triatomines (Abalos and Wygodzinsky 1951, Martinez and Cichero 1972, Carcavallo et al. 1998) are still used as bibliographical references, despite being 25-70 years old. Almost 20 years have passed since the publication of the "Atlas of the Triatominae" by Carcavallo et al. (1998) and no work has been done to provide an updated complete integration and analysis of the existing geographic information for Argentinean triatomine species, as has been done for other Latin American countries, such as Brazil (Galvão 2014), Mexico (Salazar-Schettino et al. 2010, Ramsey et al. 2015), Colombia (Guhl et al. 2007), French Guiana (Bérenger et al. 2009), Suriname (Hiwat 2014), Peru (Chávez 2006) and Venezuela (Cazorla-Perfetti and Nieves-Blanco 2010). Currently, of the 137 species distributed in the Americas (Justi and Galvão 2017), 17 species are cited for Argentina: Panstrongylus geniculatus, P. guentheri, P. megistus, P. rufotuberculatus, Psammolestes coreodes, Triatoma breyeri, T. delpontei, T. eratyrusiformis, T. garciabesi, T. guasayana, T. infestans, T. limai, T. patagonica, T. platensis, T. rubrofasciata, T. rubrovaria and T. sordida. For some of these species, their entire geographic range is circumscribed to Argentina, while others extend their geographic distribution to neighbouring countries, such as Bolivia, Brazil, Chile, Paraguay and Uruguay and there are even cases of widely-distributed species, such as P. geniculatus or P. rufotuberculatus, whose distribution range reaches Central America and Mexico, respectively (Carcavallo et al. 1998, Ramsey et al. 2015).
The Wallacean shortfall (Hortal et al. 2015) refers to the lack of knowledge about the geographic distribution of species (Lomolino 2004) and it is intimately connected with temporal and spatial variations in surveying effort (Boakes et al. 2010). At smaller scales (e.g. countries), the Wallacean shortfall becomes more marked, as increasingly precise information about geographic distribution is required (Riddle et al. 2011). Therefore, primary data, i.e. dated records of species occurrences, are the best kind of data, preferable to summaries done at coarser resolutions or that may be missing attributes attached to the original record. In the case of vector species relevant to some vector-borne diseases, there are some initiatives that have compiled vector occurrence data (e.g. mosquitoes (Kraemer et al. 2015) or sandflies (Pigott et al. 2014)), thus providing geographic information that enables policy-makers to make evidence-based decisions. Other vector species are often sparsely recorded (Hay et al. 2013) and there are few globally-comprehensive sets of compiled primary data. Moreover, data on the geographic distribution of vector species are not always publicly available and details of sampling biases or occurrence validation are frequently difficult to find.
In the case of Chagas disease, updated data about the current geographic distribution of triatomines are essential resources for the development of strategies to control vector transmission. Knowledge about triatomine habitats (i.e. domicile, peridomicile, sylvatic), as well as their synanthropic behaviour, is epidemiologically relevant since the risk of contracting Chagas disease by vectorial transmission depends mainly on the presence of the triatomines inside human dwellings. Recently, Ceccarelli et al. (2018) published an updated occurrence database for 135 American triatomine species (DataTri) including Argentinean species records. Additionally, the Argentinean Ministry of Health (AMH), through two of its research institutions ('Centro de Referencia de Vectores' and 'Centro Nacional de Diagnóstico e Investigación en Endemoepidemias', CeReVe and CeNDIE/ANLIS, respectively, after their Spanish acronyms), has also been compiling triatomine occurrence data, based on unpublished entomological reports. Furthermore, triatomine geographic information from a citizen science-based project (GeoVin) is being collected through a mobile application developed by the triatomines laboratory at CEPAVE (CONICET, UNLP).

General description
Purpose: The aim of this work is to develop a detailed temporal, spatial and ecological analysis of updated occurrence data for Argentinean triatomines, obtained from the abovementioned data sources (DataTri, AMH and GeoVin), to achieve the following main goals: (a) provide a temporal description of the collected data, (b) update and refine the current knowledge of geographic distributions for triatomines in Argentina, (c) describe the diversity patterns of Argentinean triatomine species and (d) analyse, categorise and classify triatomine species according to the habitat where they occur.

Project description
Title: Triatomine Database Strengthening ("Fortalecimiento de la Base de Datos de Triatominos")
Personnel: Data gathering and final dataset building were conducted over five years under the responsibility of Soledad Ceccarelli, Agustín Balsalobre, Maria Eugenia Cano, Maria Eugenia Vicente, Paula Medone, Jorge E. Rabinovich and Gerardo A. Marti from the Centro de Estudios Parasitológicos y de Vectores (CEPAVE CONICET-CCT La Plata-UNLP). Delmi Canale, Patricia Lobbia and Raúl Stariolo from the Centro de Referencia de Vectores (CeReVe, Coordinación Nacional de Vectores, Ministerio de Salud de la Nación) provided triatomine occurrence data, based on unpublished entomological reports. Additional help and/or occurrence data were provided by many other colleagues throughout the years.

Study area description:
The geographic area associated to the dataset encompasses southern Argentina and Chile to southern Mexico. The temporal coverage of the dataset spans the period from 1918 to 2019, while the taxonomic coverage includes the 17 species currently cited for Argentina.

Design description:
The main goals of the project were: i) to improve the quality of existing data in line with established standards, ii) to refine the current knowledge of geographic distributions for triatomines in Argentina and iii) to provide a public geodatabase with updated occurrence data for triatomine species present in Argentina, based on accurately georeferenced locations (Chapman and Wieczorek 2006).
Funding: Ministerio de Ciencia y Técnica de la Nación (MinCyT) through the Sistema Nacional de Datos Biológicos (SNDB).

Sampling methods
Study extent: The methods applied to gather, georeference and validate the Argentinean triatomine species occurrences to build the dataset are described in detail in Ceccarelli et al. (2018); however, here we provide a brief overview.
Step description: Data were compiled from three main data sources: (1) DataTri (the American triatomine database) (Ceccarelli et al. 2018), a compilation of triatomine occurrences and complementary ecological data representing the most complete and updated database available on triatomine species at a continental scale. It was assembled by collecting the records of triatomine species published from 1904 to 2017, spanning all American countries with triatomine presence. The georeferenced records were obtained from published literature, personal fieldwork and data provided by colleagues; (2) CeReVe (Centro de Referencia de Vectores, Argentinean Ministry of Health, AMH) and CeNDIE/ANLIS (Centro Nacional de Diagnóstico e Investigación en Endemoepidemias, Reference Center for the Diagnosis of Endemoepidemics, also part of AMH), based on records compiled from entomological surveys. CeReVe has the most extensive collection of triatomine occurrence records in Argentina; every year for the last 20 years, the triatomines collected during the spraying campaigns for vector control organised by the Chagas Disease Control Program in different provinces of Argentina have been taken to CeReVe for taxonomic determination, recorded and curated. When two or more triatomine specimens were collected from different houses at the same locality and on the same date, individuals of the same species were pooled as a single record; (3) GeoVin (www.geovin.com.ar), a citizen science project developed by researchers at CEPAVE and ILPLA (CONICET, Argentinean National Research and Scientific Council, and UNLP, National University of La Plata). This project started in 2018 to gather geographic information on triatomines of Argentina through citizen participation using a public and free application; users report bug findings that are georeferenced using their mobile phone GPS. Occurrence data (photos and geographic coordinates) sent online are stored in a centralised server and then data validation is done by researchers of the Triatomine Laboratory at CEPAVE. Once the data are validated, they are automatically integrated into the GeoVin occurrence dataset. Data from 2018 and 2019 were considered to make up this dataset.
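The pooling rule described for the CeReVe records (specimens of the same species collected at the same locality on the same date become a single record) can be illustrated with a minimal Python sketch. This is not the authors' code; the field names (species, locality, date, house) and the sample values are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


def pool_records(specimens):
    """Pool specimens of the same species, locality and date into one record."""
    pooled = OrderedDict()
    for s in specimens:
        key = (s["species"], s["locality"], s["date"])
        if key not in pooled:
            pooled[key] = {
                "species": s["species"],
                "locality": s["locality"],
                "date": s["date"],
                "houses": set(),  # houses contributing to the pooled record
            }
        pooled[key]["houses"].add(s["house"])
    return list(pooled.values())


# Hypothetical specimens from one locality (invented values).
specimens = [
    {"species": "Triatoma infestans", "locality": "Locality A", "date": "2015-03-10", "house": 1},
    {"species": "Triatoma infestans", "locality": "Locality A", "date": "2015-03-10", "house": 2},
    {"species": "Triatoma guasayana", "locality": "Locality A", "date": "2015-03-10", "house": 3},
    {"species": "Triatoma infestans", "locality": "Locality A", "date": "2016-04-02", "house": 1},
]

records = pool_records(specimens)
for r in records:
    print(r["species"], r["date"], "houses:", len(r["houses"]))
```

Keeping the set of contributing houses is a design choice of this sketch, so that the pooled record does not lose track of how many dwellings it summarises.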

Analysis of the Argentinean triatomine occurrence data
The analysis of the Argentinean triatomine occurrence data was based upon a three-staged approach: i) temporal, ii) geographic and iii) habitat type.
Temporal pattern. To analyse the temporal pattern, the information was taken from each record and classified in 7-year intervals (year represents the year of collection) using only the records with associated information on year or period of years. If records from the published literature did not provide a collection date, the period (pre-2000 or post-2000) to which the publication belonged was assigned.
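As an illustration of the temporal classification, the following Python sketch assigns collection years to 7-year intervals and assigns an undated literature record to a broad period from its publication year. The bin origin (1918, the first year covered by the dataset) is an assumption of this sketch, since the text does not state where the intervals start; the function names are hypothetical.

```python
from collections import Counter

START_YEAR = 1918  # assumption: bins start at the first year covered by the dataset
WIDTH = 7          # interval width in years, as stated in the text


def seven_year_interval(year, start=START_YEAR, width=WIDTH):
    """Return the (first, last) year of the interval containing `year`."""
    if year < start:
        raise ValueError("year precedes the first interval")
    first = start + ((year - start) // width) * width
    return (first, first + width - 1)


def period_of(year):
    """Broad period for a year; the pre-2000 period includes the year 2000."""
    return "pre-2000" if year <= 2000 else "post-2000"


# Invented collection years, for illustration only.
collection_years = [1918, 1924, 1925, 1999, 2000, 2019]
counts = Counter(seven_year_interval(y) for y in collection_years)

for interval in sorted(counts):
    print(interval, counts[interval])
```

For a literature record without a collection date, `period_of` would be called with the publication year instead.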
Geographic distributions and species richness pattern. We analysed the occurrence distributions and richness patterns of triatomine species. First, to obtain an overall view of the geographic information for each triatomine species, its occurrence data were mapped over areas representing ecoregions (Burkart et al. 1999) using QGIS software (QGIS Development Team 2018). The geographic distributions of the species T. garciabesi and T. sordida required particular consideration because their taxonomic classification in certain areas is currently under debate (Panzera et al. 2015); thus, records belonging to both species that were within the area involving eastern Jujuy and Salta Provinces, northern Santiago del Estero Province and western Chaco and Formosa Provinces were considered controversial and were classified as 'T. garciabesi-T. sordida'. Records that we considered as having doubtful geographic information or those that were the result of passive transport (i.e. carried by humans or in wood and brick shipments) were classified as 'questionable information' and denoted with red stars on the geographic maps of each species. Then, to obtain a pattern of species richness for triatomines of Argentina, the occurrence data (points) of each species from the last 20 years were overlaid by a regular grid of 0.25° (longitude-latitude cells) resolution; from this grid, a presence-absence matrix was built containing all species (columns) and sites/grid cells (rows). In this matrix, the species richness of each cell is represented by the number of different species occurring in that cell. The final species richness values per cell were exported as a raster file and then mapped with the QGIS software (QGIS Development Team 2018) to show the biodiversity pattern maps. Species richness estimates were carried out in the R statistical language (R Core Team 2019), using the 'maptools' (Bivand and Lewin-Koh 2018) and 'letsR' (Vilela and Villalobos 2015) R packages. The R code used to calculate species richness estimates from the datasets is available on GitHub (https://github.com/solnqnlp/Triatominae_SpRichness).
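The grid-based richness calculation can be sketched independently of the R workflow. The Python example below is not the code from the repository cited above; it only illustrates the idea of overlaying points on a 0.25° grid, building a presence-absence matrix (cells as rows, species as columns) and summing each row. The grid origin at 0° longitude and latitude and the coordinates are assumptions made for the example.

```python
import math
from collections import defaultdict


def cell_of(lon, lat, res=0.25):
    """Index of the grid cell containing a point (grid origin at 0, 0 is assumed)."""
    return (math.floor(lon / res), math.floor(lat / res))


def presence_absence(occurrences, res=0.25):
    """Build a presence-absence matrix: one row per occupied cell, one column per species."""
    cells = defaultdict(set)
    for species, lon, lat in occurrences:
        cells[cell_of(lon, lat, res)].add(species)
    species_list = sorted({sp for sp, _, _ in occurrences})
    matrix = {
        cell: [int(sp in present) for sp in species_list]
        for cell, present in cells.items()
    }
    return species_list, matrix


def richness(matrix):
    """Species richness of each cell is the row sum of the presence-absence matrix."""
    return {cell: sum(row) for cell, row in matrix.items()}


# Invented occurrence points (species, longitude, latitude).
occurrences = [
    ("Triatoma infestans", -64.10, -27.80),
    ("Triatoma guasayana", -64.20, -27.90),
    ("Triatoma infestans", -64.15, -27.85),
    ("Triatoma sordida", -58.90, -27.40),
]

species_list, matrix = presence_absence(occurrences)
rich = richness(matrix)
```

Repeated points of one species in a cell count once, which is what distinguishes richness from the number of records per cell.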
Habitat type. We classified each triatomine species according to the habitat type where triatomines are found. Although triatomine habitat types are usually classified as belonging to three general environment categories (inside human dwellings or domicile, around human dwellings or peridomicile and natural environment or sylvatic habitat), here we opted for a domiciliation/intrusion level habitat classification, based upon the one proposed by Waleckx et al. (2015). To that end, some particular considerations had to be taken into account: i) only records from the last 20 years were used; ii) the number of records for each species in each basic habitat type (domicile, peridomicile and sylvatic) was assessed from the database; iii) the presence/absence of immature stages in a given habitat (beyond the presence of adults) was also assessed from the database to determine if a species had the capacity to establish colonies in each habitat; iv) if a record had a mix of habitats, it was duplicated as two records (e.g. habitat information expressed as "Domicile-Peridomicile" was considered as two records, one as "Domicile" and the other as "Peridomicile"); and v) 'T. garciabesi-T. sordida' records were included in both the T. garciabesi and T. sordida datasets. Finally, records with no habitat information were considered as not available (NA).
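Considerations iii) and iv) above can be sketched in Python as follows. This is a simplified illustration rather than the authors' procedure: the record fields (species, habitat, nymphs) and the sample records are hypothetical, and the sketch only flags whether nymphs were ever recorded for a species in a habitat, without reproducing the full classification of Waleckx et al. (2015).

```python
def expand_habitats(records):
    """Duplicate mixed-habitat records so that each copy carries a single habitat."""
    expanded = []
    for rec in records:
        habitat = rec.get("habitat")
        # Records without habitat information are kept and marked as NA.
        parts = habitat.split("-") if habitat else ["NA"]
        for part in parts:
            expanded.append({**rec, "habitat": part.strip()})
    return expanded


def colonisation_evidence(records):
    """Map (species, habitat) to True when any record there includes nymphs."""
    evidence = {}
    for rec in expand_habitats(records):
        if rec["habitat"] == "NA":
            continue
        key = (rec["species"], rec["habitat"])
        evidence[key] = evidence.get(key, False) or rec["nymphs"]
    return evidence


# Invented records, for illustration only.
records = [
    {"species": "Triatoma sordida", "habitat": "Domicile-Peridomicile", "nymphs": True},
    {"species": "Panstrongylus geniculatus", "habitat": "Peridomicile", "nymphs": False},
    {"species": "Triatoma platensis", "habitat": None, "nymphs": False},
]

expanded = expand_habitats(records)
evidence = colonisation_evidence(records)
```

One limitation of duplicating a mixed record is that its nymph flag is copied to both habitats, even though the source record may not say in which of the two the nymphs were found.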

Geographic coverage
Description: The geographic area associated to the dataset encompasses southern Argentina and Chile to southern Mexico.

Traits coverage
We compiled a total of 15917 occurrence data for the entire geographic range of the triatomine species present in Argentina; of those, 9788 records corresponded to exclusively Argentinean distributions (Ceccarelli et al. 2020).

Temporal pattern
The compiled information for Argentina spans the period from 1918 to 2019, with 70% of the occurrences corresponding to the last 20 years (Fig. 1, Table 1). Considering that most of the records belong to the latter time interval and in order to describe the information better, the dataset was split into two periods: (a) a pre-2000 period (up to and including 2000) and (b) a post-2000 period (2001 to the present). Records reported for a range of collection years (e.g. 1998-2006) were assigned to the time interval with the largest number of years of collection.
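One way to read the rule of assigning a record to the time interval with the largest number of years of collection is as a maximum-overlap assignment. The Python sketch below applies it to the pre-2000 and post-2000 periods; the same function would work for any list of intervals. The tie-breaking behaviour (the earliest listed interval wins) is an assumption of this sketch, as the text does not say how ties were resolved.

```python
# The two broad periods, as inclusive (first_year, last_year) pairs.
PERIODS = [(1918, 2000), (2001, 2019)]


def overlap_years(first, last, interval):
    """Number of years shared by the range first..last and an inclusive interval."""
    lo, hi = interval
    return max(0, min(last, hi) - max(first, lo) + 1)


def assign_interval(first, last, intervals=PERIODS):
    """Return the interval sharing the most years with the collection range.

    Ties go to the earliest listed interval (an assumption of this sketch).
    """
    return max(intervals, key=lambda iv: overlap_years(first, last, iv))


# A record collected over 1998-2006 shares 3 years with the pre-2000 period
# and 6 years with the post-2000 period.
print(assign_interval(1998, 2006))
```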

Types of data sources reviewed
Most of the records (62.79%) were obtained from DataTri and 37.21% from the other three datasets (AMH-CeNDIE/ANLIS, AMH-CeReVe and GeoVin). The main data sources contributing to the pre-2000 records were public repositories (96.66%), while in the post-2000 period, AMH-CeReVe and data provided by colleagues were the data sources with the greatest contribution (51.82% and 33.68%, respectively) (Table 2).

Table 2.
Types of data sources reviewed. The chart presents the number of records compiled from each information source type and the percentage of contribution from each information source in each period (pre- and post-2000). The "New data sets" rows correspond to sources of information obtained after the publication of DataTri.

Habitat type
According to the type of habitat where collected, 29.96% of the species are found inside human dwellings, 65.02% in the vicinity of human dwellings and 5.01% in natural environments (Table 3). A classification, based on the range of habitat types, includes seven categories (Table 3): (1) Domestic species, characterised by adults, nymphs, eggs and exuviae present within the house (i.e. the entire life cycle of the insect occurs within human dwellings); (2) Domiciliary species, characterised by having their complete cycle inside human dwellings, but maintaining small populations, i.e. showing recent (and relatively poor) adaptation to houses; these populations may disappear from human domiciles without any control intervention; (3) Domiciliary intrusive species, characterised by the presence of adult individuals in human dwellings, probably attracted by light or introduced by passive transport, but without evidence of colonisation (i.e. without the presence of nymphs, eggs or exuviae within the domicile); (4) Peridomestic species, similar to the Domestic species, but completing their entire life cycle around human dwellings; (5) Peridomiciliary species, which complete their entire life cycle in the vicinity of human dwellings; (6) Peridomiciliary intrusive species, characterised by adult individuals reported around human dwellings, but without evidence of colonisation; and (7) Sylvatic species, recorded in natural environments.
Triatoma infestans is the only species categorised as Domestic and Peridomestic, while T. garciabesi, T. guasayana and T. sordida are mainly Domiciliary and Peridomiciliary species. In the case of T. eratyrusiformis, T. patagonica and T. platensis, they are mainly Peridomiciliary species, whereas P. geniculatus and P. guentheri are mainly Peridomiciliary intrusive species. Additionally, even though the relative frequency of records for each species in natural environments is heterogeneous, Ps. coreodes, T. breyeri and T. delpontei are Sylvatic species, with the maximum percentage of nymphal stages recorded in this habitat type (Table 3).

Table 3.
Classification of triatomine species in the Argentinean territory according to the level of domiciliation/intrusion in human dwellings and natural environment. The percentage of records in each habitat type represents the number of records for each species relative to the total records for each habitat type. The percentage of records with presence of nymphal stages represents the number of records of each species with nymphal stages present relative to the total records for each species in each species category.
Column headers: Species categories; Triatomine species; Percentage of records in each habitat type; Percentage of records with presence of nymphal stages.

Geographic distributions and species richness pattern
The occurrence data for triatomine species are distributed over 15 ecoregions, amongst which those with the highest number of species occurrences in both periods are Dry Chaco, Humid Chaco, Espinal, Plains and Plateaus Monte and Hills and Bossoms Monte (Suppl. material 1).
The species with the widest geographic distribution is T. infestans (Fig. 2), with the highest number of records in both periods (Table 1).
Triatoma guasayana is the species with the second highest number of occurrence data (6.29%), followed by T. garciabesi and T. sordida (7.65%, including the T. garciabesi-T. sordida records) (Table 1). The post-2000 occurrence data for T. guasayana are concentrated in central and northern Argentina and mainly restricted to the Dry Chaco ecoregion (Fig. 3). In the case of T. garciabesi, the distribution of occurrence data for both periods included north-western Argentina (mainly Dry Chaco), whereas those of T. sordida included north-eastern Argentina (mainly Humid Chaco). The geographic distribution of T. garciabesi-T. sordida is similar for the pre-2000 and post-2000 periods and shows partial overlap between the distributions of both species (Fig. 3).
Triatoma eratyrusiformis presented similar distribution patterns of occurrence between the pre-2000 and post-2000 periods. For T. patagonica, occurrences in the post-2000 period were restricted to the southern Dry Chaco, Plains and Plateaus Monte and the Espinal ecoregion in southern Buenos Aires Province (Fig. 4). Similarly, occurrence data for T. platensis for the post-2000 period were restricted mainly to the ecoregions Dry Chaco and Plains and Plateaus Monte (Fig. 4). There are fewer records of P. geniculatus and P. guentheri in the post-2000 than in the pre-2000 period, with occurrences restricted to the Humid and Dry Chaco ecoregions, respectively (Fig. 5). Meanwhile, P. guentheri is the species that shows the greatest reduction of its distribution between the pre-2000 and the post-2000 periods (Fig. 5).
No reports of occurrence were found for P. megistus, P. rufotuberculatus, T. limai and T. rubrofasciata in the post-2000 period; thus, Fig. 7 shows only the geographic information corresponding to the pre-2000 period for these species. Panstrongylus megistus has 27 records; P. rufotuberculatus, one record; T. limai, nine records; and T. rubrofasciata, one record (Table 1).
These data show a trend of reduction in species richness of triatomines over the last 20 years in central-western Argentina (an area corresponding mainly to the southern and central Dry Chaco ecoregion) (Fig. 8). The maximum number of species sharing a cell in the pre-2000 period is eight species, while this maximum decreased to seven in the post-2000 period. Furthermore, for the pre-2000 period, high species richness values were found in several ecoregions of Argentina, whereas in the post-2000 period, they occurred mainly in only four areas within the Dry Chaco (Fig. 8b). The species with greatest contribution to overall richness in both periods and in decreasing order are T. infestans, T. platensis, T. guasayana and T. garciabesi (details in Table 1).

Figure 5.
Distribution of occurrence data for P. geniculatus and P. guentheri: (a) pre-2000 records; (b) post-2000 records. Species occurrence data are shown as black dots. Coloured areas represent the ecoregions. Red star is a questionable record for P. guentheri (Carcavallo et al. 1994).

Figure 6.
Coloured areas represent the ecoregions. Red stars are questionable records for T. breyeri (Moreno et al. 2006) and for T. rubrovaria (Carcavallo and Martinez 1968).

Figure 7.
Distribution of occurrence data for triatomine species with only pre-2000 information available.

Data format: Darwin Core Archive
Column label Column description
occurrenceID The globally unique identifier for the occurrence.
dcterms:type The nature of the resource.
dcterms:modified The most recent date-time on which the resource was changed.
dcterms:language A language of the resource.
institutionCode The name of the institution having custody of the resource.
collectionCode The name identifying the dataset from which the record was derived.
basisOfRecord The specific nature of the data record.
catalogNumber A unique identifier for the record within the dataset.
higherClassification A concatenated list of taxa names terminating at the rank immediately superior to the taxon referenced in the taxon record.
kingdom The full scientific name of the kingdom in which the taxon is classified.
phylum The full scientific name of the phylum in which the taxon is classified.
class The full scientific name of the class in which the taxon is classified.
order The full scientific name of the order in which the taxon is classified.
family The full scientific name of the family in which the taxon is classified.
genus The full scientific name of the genus in which the taxon is classified.
year The four-digit year in which the Event occurred.
month The ordinal month in which the Event occurred.
day The integer day of the month on which the Event occurred.
habitat A category of the habitat in which the Event occurred.
samplingProtocol The name of, reference to, or description of the method or protocol used during an Event.
samplingEffort The amount of effort expended during an Event.
higherGeography A concatenated list of geographic names less specific than the information captured in the locality term.
continent The name of the continent in which the Location occurs.
country The name of the country in which the Location occurs.
countryCode The standard code for the country in which the Location occurs.
stateProvince The name of the next smaller administrative region than country in which the Location occurs.
municipality The full, unabbreviated name of the next smaller administrative region than county in which the Location occurs.
locality The specific description of the place.
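To show how these columns might be used, the following Python sketch reads a small occurrence table and groups the Argentinean records into the two periods analysed in this work. It assumes a tab-delimited occurrence file (in a Darwin Core Archive the actual delimiter is declared in the archive's meta.xml); the sample rows, identifiers and function names are invented for illustration and do not come from the dataset.

```python
import csv
import io

# Hypothetical rows using some of the column labels listed above; values are invented.
SAMPLE = (
    "occurrenceID\tgenus\tyear\tcountry\thabitat\n"
    "occ-1\tTriatoma\t1985\tArgentina\tPeridomicile\n"
    "occ-2\tTriatoma\t2012\tArgentina\tDomicile\n"
    "occ-3\tPanstrongylus\t2005\tBrazil\tSylvatic\n"
    "occ-4\tTriatoma\t\tArgentina\t\n"
)


def read_occurrences(handle):
    """Read a tab-delimited occurrence table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def split_by_period(rows, country="Argentina"):
    """Group one country's records into pre-2000, post-2000 and undated."""
    periods = {"pre-2000": [], "post-2000": [], "undated": []}
    for row in rows:
        if row["country"] != country:
            continue
        if not row["year"]:
            periods["undated"].append(row)
        elif int(row["year"]) <= 2000:  # the pre-2000 period includes 2000
            periods["pre-2000"].append(row)
        else:
            periods["post-2000"].append(row)
    return periods


rows = read_occurrences(io.StringIO(SAMPLE))
periods = split_by_period(rows)
for name, members in periods.items():
    print(name, [r["occurrenceID"] for r in members])
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by a file handle opened on the occurrence table extracted from the archive.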

Conclusions and prospects
We compiled a large amount of occurrence data and associated information and integrated it into a single database to better understand the geographic distribution of triatomine species present in Argentina over the last 100 years, as well as the temporal variation of these distributions.
One of our key findings was that the number of records in the post-2000 period almost tripled the number of those from the pre-2000 period. While for the pre-2000 period, public repositories (mainly records from researchers' fieldwork published in scientific journals) were the major data sources providing spatial information, in the last two decades, the balance shifted towards records provided by the CeReVe, as well as focused field campaigns carried out by several research groups, which were the major data sources for the post-2000 period. However, these sources have become increasingly focused on the species with major epidemiological importance, mainly T. infestans, and few records provide information about other species. We propose the following hypotheses that may account for this difference:
Another of the key findings concerns the number of triatomine species present in Argentina. According to Carcavallo et al. (1998) and Ceccarelli et al. (2018), there are 17 triatomine species mentioned for Argentina. However, in this analysis, we found that this number of species is valid only for the pre-2000 period, given that, for at least the last 20 years, there are no records for P. megistus, P. rufotuberculatus, T. limai or T. rubrofasciata.
In the case of P. megistus, the last record found for Argentina corresponds to 1995 (Damborsky et al. 2001); however, the records mentioned for Brazil (Ceccarelli et al. 2018) and Bolivia (Rojas-Cortez 2007) in the post-2000 period suggest that this species might still be present in Argentina. Although only one record was found for P. rufotuberculatus in the Yungas ecoregion (Salomón et al. 1999) for the pre-2000 period, the species has been detected in Tarija (Bolivia) several times (Rojas-Cortez 2007), leading us to consider that the scarcity of data may be due to a lack of sampling in that ecoregion. Similarly, T. limai was mentioned in only two publications for Argentina (Abalos and Wygodzinsky 1951, Carcavallo and Martinez 1968), while Del Ponte (1929) mentioned that it is distributed in Brazil, but without specifying a locality. Finally, given that T. rubrofasciata is a cosmopolitan species suspected of having been transported on ships to some countries, we concluded that it is possible that this species arrived accidentally in Argentina at some point. The only record of T. rubrofasciata is considered as doubtful because Neiva (1914), Larrousse (1924) and Del Ponte (1930) do not agree on the location of the record. Abalos and Wygodzinsky (1951) do not mention T. rubrofasciata in Argentina and Carcavallo et al. (1965) proposed to remove it from the list of Argentinean triatomines, since apparently it was found occasionally and was never able to colonise any environment in Argentina.
Therefore, we consider that the specimens of T. limai, allegedly found in Argentina, do not belong to the species described by del Ponte in 1930 and as there has been no record of T. rubrofasciata in Argentina for more than 100 years, we chose to remove both species from the current list of Argentinean triatomines. On the other hand, we decided to keep P. megistus and P. rufotuberculatus, judging that the small number of occurrences in the last 20 years can be attributed to poor sampling.
In the case of insect vectors, it is common for those species with greater public health importance (e.g. domiciliated species) to receive more attention, as in the case of T. infestans, which makes up 70.98% of all the occurrence data and has the largest geographic distribution in Argentina. The occurrence records for some species are also biased towards certain regions and habitats: occurrence data of most species show distributions restricted mainly to the Dry and Humid Chaco ecoregions in the post-2000 period, with P. guentheri showing the greatest reduction of occurrence distribution, from 10 ecoregions in the pre-2000 period to only one in the post-2000 period. Undoubtedly, this phenomenon involves not only the usual tendency of researchers to survey specific or more accessible places (Sastre and Lobo 2009), but also the fact that some areas receive more attention from vector control programmes (Carbajal de la Fuente and Yadón 2013). This uneven distribution of missing data, influenced by various factors, results in a less accurate representation of the actual occurrence patterns. Consequently, the known distribution of a species provides a view of its distribution that is at best incomplete, if not actually misleading. A critical step towards improving this picture is to reinforce the efforts in unsampled and under-sampled areas, considering not only the domiciliary and peridomiciliary species, but also the sylvatic ones. When compiling the distribution data from public sources, we often found it useful to allocate attention to species and regions depending on data availability, because databasing sylvatic species from an under-sampled region provides more novel information than duplicating records for well-known species (domiciliary species) and regions (endemic areas). Applications based on citizen-science projects, such as GeoVin, aim to promote an increase in reports of sylvatic species, as well as an update of records of species more commonly known for their major vector importance (such as T. infestans) from locations outside those areas where reports are commonly made. We also found that unpublished data, such as the records provided by CeReVe and CeNDIE/ANLIS, by colleagues or by studies that are not focused on geography (e.g. molecular, morphometric etc.), are highly valuable sources of geographic records and we encourage the holders of such data to publish them with geographic coordinates. We also advise publishing those data in public databases (instead of as part of scientific articles only) because online databases of observations from field studies not only facilitate data gathering, but also help to guarantee the persistence and availability of those data.
Finally, on the basis of the information about habitat and the categorisation presented here, we show that, although the greatest proportion of records in all habitat types corresponds to T. infestans, there are also other species, namely T. sordida, T. garciabesi and T. guasayana, that occur as both adults and nymphs inside and around human dwellings. In the case of T. eratyrusiformis, T. patagonica and T. platensis, the records of occurrence within human dwellings include adult individuals only, but these species also occur as nymphs in the vicinity of human dwellings. The categorisation used in this work could contribute to the development of vector risk indices and, combined with the maps of species distributions, could be extremely useful for decision-makers in health organisations, since the risk that these insects transmit T. cruzi and cause Chagas disease depends not only on the presence of the triatomines themselves, but also on their proximity to and colonisation of human dwellings.
In summary, we used an approach, based on occurrence data analysis, to assess temporal, spatial and ecological patterns of the triatomines present in Argentina. Our results provide updated information regarding various aspects of Argentinean triatomines that will improve the ways in which these species are identified, controlled and managed in Argentina. Two major outcomes of this work are, on one hand, the inclusion of 15 species in the updated list of Argentinean triatomines: P. geniculatus, P. guentheri, P. megistus, P. rufotuberculatus, Ps. coreodes, Triatoma breyeri, T. delpontei, T. eratyrusiformis, T. garciabesi, T. guasayana, T. infestans, T. patagonica, T. platensis, T. rubrovaria and T. sordida. On the other hand, the database analysed will add records to those of DataTri, thus becoming the largest open access database of American triatomines with around 30000 records so far, which not only represents a fundamental resource for Chagas disease decision-making, but also contributes to the initiatives related to open access data.
Ongoing investment in national records for vector control is crucial for the continuous growth of the datasets upon which our analyses are based. The methods that we developed and implemented can be easily transferred to and applied in other countries where these types of data are available, thus improving our understanding of existing triatomine information.