Reexamination of Rhopalosiphum (Hemiptera: Aphididae) using linear discriminant analysis to determine the validity of synonymized species, with some new synonymies and distribution data

Abstract Although 17 species of Rhopalosiphum (Hemiptera: Aphididae) are currently recognized, 85 taxonomic names have been proposed historically. Some species are morphologically similar, especially alate individuals, and most synonymies were proposed in catalogues without evidence. This has led to both confusion and difficulty in making accurate species-level identifications. In an attempt to address these issues, we developed a new approach to resolve synonymies based on linear discriminant analysis (LDA) and suggest that this approach may be useful for other taxonomic groups to reassess previously proposed synonymies. We compared 34 valid and synonymized species using 49 measurements and 20 ratios from 1,030 individual aphids. LDA was repeatedly applied to subsets of the data after removing clearly separated groups found in a previous iteration. We found our characters and technique worked well to distinguish among apterae. However, the technique separated well only those alatae with some distinctive traits, while alatae that were morphologically similar were not well separated using LDA. Based on our morphological investigation, we transfer R. arundinariae (Tissot, 1933) to Melanaphis, supported by details of the wing venation and other morphological traits, and propose Melanaphis takahashii Skvarla and Miller as a replacement name for M. arundinariae (Takahashi, 1937); we also synonymize R. momo (Shinji, 1922) with R. nymphaeae (Linnaeus, 1761). Our analyses confirmed many of the proposed synonymies, which will help to stabilize the nomenclature and species concepts within Rhopalosiphum.


Introduction
The difficulty of aphid taxonomy and identification has been recognized as far back as the 18th century by Carl Linnaeus (Walsh 1863) and is driven by multiple, often interconnected factors, including morphology and life history traits. Many aphids are heteroecious (Blackman and Eastop 2000) and host-specific morphology can vary enough that taxonomists have described the same species multiple times based on different host-associated life stages (Stern et al. 1997). Furthermore, aphid morphology can change substantially with abiotic factors such as temperature (e.g. Blackman and Spence 1994) and number of daylight hours (e.g. Hille Ris Lambers 1966), and biotic factors such as colony size (e.g. Watt and Dixon 1981), all of which cloud species delimitation. Lastly, most species diagnoses require measurements of body parts; rarely are there one or more discrete characters that can separate species. Therefore, given the plasticity in aphid morphology and the paucity of discrete characters, aphid taxonomy needs better ways to diagnose species.
Challenges in aphid taxonomy have resulted in different tools and methods to discern species. For example, mitochondrial DNA barcodes have been used to distinguish among aphid species with high success. This high rate of success has led to the formation of regional barcoding databases to aid in the identification of aphid species (e.g. Coeur d'acier et al. 2014). However, DNA barcoding is limited to freshly collected or alcohol-preserved specimens and cannot be used with cleared and slide-mounted specimens, which negates its use when comparing historic specimens.
Both statistical and non-statistical morphological tools have been developed for use with slide-mounted specimens. For example, online interactive keys have been developed to distinguish among large numbers of aphid species, but they lack statistical grounds for identification (e.g. Favret and Miller 2012). Multivariate statistics have also been applied to distinguish among aphid species and ecotypes, specifically linear discriminant analysis (LDA), which requires taxonomic designation a priori and can use multivariate analysis of variance (MANOVA) with discrete and continuous characters to test whether the group centroids differ (see Henderson 2006 for an introduction to multivariate statistics applied to taxonomy). Examples include Favret and Voegtlin (2004) for three species of Cinara, Lozier et al. (2008) for three species of Hyalopterus, Valenzuela et al. (2009) for three species of Rhopalosiphum, and Bašilova (2010) for two species of Cryptomyzus. However, such studies have been limited to just a few species and have not been used to evaluate historic synonymies.
In this paper, we expand upon previous examples using discriminant analysis in aphid taxonomy to test whether multivariate statistics support currently recognized Rhopalosiphum species and their synonymies. Specifically, we use LDA to statistically compare 34 valid and synonymized Rhopalosiphum species to test the validity of historic synonymies. Valid species and synonymies were tested using iterative LDA in which we removed the most distinct species clusters and reanalyzed the remaining species. We fitted each LDA using only valid species and applied the resulting species-specific discriminant functions to synonymized species to determine whether the synonymies are statistically supported. The methods and analyses presented here differ from previous aphid studies in being both novel and repeatable. No previous studies have tested taxonomic hypotheses in this manner. We use open source software and describe the analyses in sufficient detail to make them repeatable, which has been lacking from the literature. Lastly, our analyses are statistically robust with respect to LDA model assumptions, as we thoroughly describe character transformations and missing data and their relationship to model assumptions.
Our example genus of aphids in need of comprehensive review and revision is Rhopalosiphum Koch, 1857 (Aphididae: Aphidinae: Aphidini: Rhopalosiphina) (Koch 1857). Rhopalosiphum currently comprises 17 recognized species, including some of the earliest named aphids (i.e. R. padi (Linnaeus, 1758) and R. nymphaeae (Linnaeus, 1761)) (Fig. 1, Table 1). Most Rhopalosiphum species are heteroecious and typically overwinter on rosaceous trees (Rosaceae: Crataegus Tourn. ex L., Malus Mill., Prunus L.) and feed on grasses, sedges, and related plants (Poales) throughout the summer (Table 2). A number are important grain (e.g. R. maidis (Fitch, 1856) and R. padi) and apple (R. oxyacanthae) pests that transmit more than 25 plant viruses (Chan et al. 1991) and are often intercepted at ports of entry and in aphid-monitoring suction traps (Strażyński 2010, Favret and Miller 2012, Skvarla et al. 2017). Those species that are not economically important have received comparatively little study. For example, R. nigrum Richards, 1960 and R. padiformis Richards, 1962 have not been discussed in scientific literature outside of their original descriptions and catalog entries. Gaps like this complicate the identification of alates collected without host data (e.g. in suction traps) because the majority of taxonomic resources focus on apterae and not alates.
The taxonomic history of Rhopalosiphum is complicated. Many taxa with slightly swollen siphunculi were historically included within Rhopalosiphum, but are now placed in a different tribe, Macrosiphini (e.g. Hyadaphis Börner, Lipaphis Mordvilko, Rhopalomyzus Mordvilko), or other aphidine genera (e.g. Melanaphis van der Goot, Schizaphis Börner) (Richards 1960, Blackman and Eastop 2017, Favret 2019). Börner (1952) was the first to restrict the definition of the genus as it is currently conceived.

Molecular studies (e.g. Kim and Lee 2008) suggest Rhopalosiphum is closely related to the rhopalosiphine genera Melanaphis and Schizaphis. However, no comprehensive molecular or morphological study has tested the monophyly of the genera and species contained therein (Valenzuela et al. 2009). Indeed, Halbert and Voegtlin (1998) suggested that Rhopalosiphum arundinariae (Tissot, 1933) belongs within Melanaphis.
Herein, we confirm many historic synonymies using LDA, propose two new synonymies, and report new distribution data for R. rufulum Richards, 1960, based on material examined for the analyses.

Materials and Methods
Most specimens examined are housed in the National Museum of Natural History Aphidomorpha Collection.
Slides were labelled with individual, sequential numbers (MS 0001-0980) and specimens assigned a number that was appended to the slide number (e.g. MS 0001-1 for the first specimen on the first slide). Specimens were examined using a Zeiss Axio Imager M1 stereomicroscope; micrographs were taken and measurements of various morphological features of adult female apterous and alate specimens made using AxioVision 4.9.1 software (Carl Zeiss AG, Oberkochen, Germany) (Fig. 2, Table 3). Measurements were hand-written in a notebook, then later manually entered into an Excel spreadsheet.  Morphological terms and structures were adapted from Foottit and Richards (1993).

Table 3 (column headings): Measurement number; Measurement description or ratio.
Throughout the text, the term aptera (pl. apterae) refers to wingless adult vivipara (pl. viviparae) and alata (pl. alatae) refers to winged adult vivipara (pl. viviparae). Body measurements were adapted from Foottit et al. (2010). Wing measurements (Fig. 2) were adapted from Favret (2009). All measurements are in micrometers (μm) and were taken from the right side of the body when possible. Species names and statuses follow Favret (2019). Measurement data are available as supplementary files (Suppl. materials 1, 2, 3, 4). See Table 3 for explanation of alphanumeric codes.

Statistical analyses
We used linear discriminant analysis (LDA) in R (R Core Team 2017) using the MASS package (Venables and Ripley 2002) to test the synonymized and valid species name designations. The code we used to analyze the apterae datasets is available in Suppl. material 5. The analysis proceeded in several steps, which were similar for apterae and alatae, the first of which was data cleaning, described below. LDA is one of the most commonly employed discriminant function analyses, used both to identify useful characters distinguishing specimens of valid species and groups of species, and to form discriminant functions that can then be applied to specimens of uncertain taxonomic status to determine whether they cluster with valid species or cluster separately, which would suggest a new species. This LDA approach was applied in a systematic, iterative manner: the initial linear discriminant functions separated the most distinctive species, leaving a large amorphous cluster of the other known species. In the next step, the distinctive valid species identified in that iteration were removed and the method was applied to the remaining valid species. This could be followed by further iterations until all valid species were separated to the extent possible using available morphological characters.
Prior to the analyses, the dataset was checked for incorrectly entered data. This was performed for both the valid and synonymized species by constructing boxplots for each trait, broken down by species (e.g. Fig. 3). This allowed us to identify outlier measurements in the Excel database, which were compared against the hand-recorded measurements and in most cases were the result of inaccurately entered data (e.g. 10002 instead of 100.02 or 278 instead of 728). Incorrectly entered data that were discovered in this manner were corrected before additional analyses. The boxplots also gave a general picture of which species were more variable for which traits.

Figure 3.
Example of box plots created for a character (length of antennal segment I here) that were used to identify outlier measurements that should be double-checked before further analyses. Note the outliers for R. enigmae and R. padi, which were the result of miskeyed data.
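The boxplot screening described above can be approximated numerically. The following Python sketch is an illustration only, not the authors' R code: it flags values outside the usual boxplot whiskers (1.5 times the interquartile range beyond the quartiles) within each species. The function name, the species values, and the whisker multiplier are assumptions chosen for the example.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return indices of values lying outside the boxplot whiskers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical measurements (micrometers) of one trait, grouped by species.
# The value 10002.0 mimics a miskeyed 100.02.
measurements = {
    "R. padi": [98.5, 101.2, 100.0, 99.4, 10002.0, 102.3],
    "R. maidis": [110.1, 108.7, 112.4, 109.9, 111.0],
}

# Indices of suspect entries per species, to be checked against the notebook.
suspect = {sp: flag_outliers(vals) for sp, vals in measurements.items()}
print(suspect)
```

Flagged indices only mark entries to compare against the hand-written records; a flagged value may still be a genuine measurement.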

'Size': as a trait and as an adjustment
We standardized the size measurements of each specimen based on a 'size' variable (but also kept 'size' as a variable in the LDA, as explained below). There are several reasons for this. First, some species are more variable in size than others. If measures are not size-adjusted, the LDA does not work as well; in fact, we implemented the size adjustments because size affected the usefulness of many of the measures, especially for taxa with a lot of size variability. Second, if one does not adjust for size, many of the measures are correlated through size and so are less useful. Third, we wanted measures that made sense morphometrically, and these are often 'relative' measures (e.g. antennae are relatively long for species A, in relation to its size, versus species B). If not adjusted for size, we would have to use ratios for more variables. Standardizing on size was very helpful for many traits and might be useful when developing criteria to separate species in other groups.
Given that it is desirable to adjust for size, how does one best estimate 'size'? There is not a one-size-fits-all solution for this and, while we found one that seems to work well for these taxa, it may be improved upon or altered for other taxa. Our size variable was constructed by combining the body length, head width, and femur length using principal components (PC); head width and femur length were chosen because, among all measured characters, these were most highly correlated with body length. This is a dimension reduction technique, the idea being to create a single variable (the first PC) that best captures the variation in these three correlated measures. In cases where one or two of these measures were missing, the derived principal component measure (henceforth labelled 'relative size') for the specimen was imputed using linear regression based on the rest of the data set. Another method employed for developing a size measure is the use of the geometric mean of the characters. We calculated the geometric means for data where we had all three characters and found the correlation between the geometric mean and first PC to be 0.9963; for this data set the two methods would yield essentially identical results.
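To make the comparison between the two size measures concrete, the sketch below computes first-principal-component scores for three correlated measurements and correlates them with the geometric mean. It is a minimal illustration in Python rather than the authors' R code. The specimen values are invented, and running the principal components on standardized variables (i.e. on the correlation matrix) is an assumption, as the text does not state which matrix was used.

```python
import math
from statistics import geometric_mean, mean, pstdev


def standardize(xs):
    """Center to mean zero and scale to standard deviation one."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    return mean(a * b for a, b in zip(standardize(xs), standardize(ys)))


def first_pc_scores(columns, iterations=200):
    """Scores on the first principal component of the standardized columns,
    with the leading eigenvector found by power iteration."""
    z = [standardize(col) for col in columns]
    p = len(z)
    corr = [[mean(a * b for a, b in zip(z[i], z[j])) for j in range(p)]
            for i in range(p)]
    w = [1.0] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return [sum(w[i] * z[i][k] for i in range(p)) for k in range(len(z[0]))]


# Hypothetical measurements (micrometers) for six specimens.
body_length = [1500, 1720, 1610, 1980, 2100, 1850]
head_width = [310, 345, 330, 390, 410, 372]
femur_length = [420, 480, 455, 560, 590, 520]

size_pc1 = first_pc_scores([body_length, head_width, femur_length])
size_geo = [geometric_mean(t) for t in zip(body_length, head_width, femur_length)]
corr_pc_geo = correlation(size_pc1, size_geo)
print(round(corr_pc_geo, 4))
```

When the three measures are strongly correlated, as in this toy data, the two size estimates are nearly interchangeable, which matches the pattern reported above.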
Our relative size measure was retained as a character in the LDA. It was also used to adjust all other size measures using linear regression, i.e. adjusted measures were residuals from regressing each of the size measures (dependent variable) on the relative size (independent variable). This resulted in smaller and larger individuals of the same species having similar adjusted measurements. Non-size measures (such as wing angles or ratios) were not adjusted using relative size; instead, they were transformed by taking logs, which created measures that were closer to being normally distributed. All measures were then individually standardized (using all samples) to mean zero and standard deviation one, which helps in interpreting the coefficients of the linear discriminant analysis results. Missing values were then imputed by randomly sampling from the corresponding adjusted measure of other individuals of that same species. In the unusual case where this could not be done (e.g. all specimens were incomplete for this trait), the trait value was set to zero (which is the overall mean for each measure), removing its influence on the calculated discriminant functions. If all specimens of a species were naturally lacking a trait, a new character column was created for that trait, with either 0 (not missing) or 1 (missing). The end result was that a naturally missing trait could be used as a character when creating the linear discriminant function and that incomplete or aberrant individuals were not dropped from the analysis because of missing data.
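The adjustment steps in this paragraph can be sketched as follows. This is an illustrative Python outline under simplifying assumptions, not the authors' R code: all function names and data values are hypothetical, the regression uses a single predictor (relative size), and the imputation is shown on an already standardized trait.

```python
import math
import random
from statistics import mean, pstdev


def regression_residuals(y, x):
    """Residuals from an ordinary least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def standardize(values):
    """Scale to mean zero and standard deviation one."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def impute_within_species(values, species, rng):
    """Fill None with a random observed value from the same species,
    or with 0.0 (the overall mean after standardizing) if none exists."""
    filled = []
    for v, sp in zip(values, species):
        if v is None:
            donors = [u for u, s in zip(values, species)
                      if s == sp and u is not None]
            v = rng.choice(donors) if donors else 0.0
        filled.append(v)
    return filled


# Size measure: regress on relative size, keep residuals, then standardize.
relative_size = [-1.2, -0.4, 0.1, 0.6, 0.9]
antenna_length = [820.0, 905.0, 960.0, 1010.0, 1045.0]
adjusted = standardize(regression_residuals(antenna_length, relative_size))

# Non-size measure (a ratio): log-transform, then standardize.
wing_ratio = [1.8, 2.1, 1.9, 2.4, 2.0]
log_ratio = standardize([math.log(v) for v in wing_ratio])

# Imputation on a standardized trait with missing entries.
species = ["A", "A", "B", "B", "C"]
trait = [0.3, None, -0.5, 0.7, None]
filled = impute_within_species(trait, species, random.Random(0))
print(filled)
```

In this toy case the missing value for species A is drawn from its single observed value, while species C, which has no observed values, receives zero and therefore contributes nothing for that trait.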

Linear discriminant analyses
A first LDA was performed on the cleaned dataset for only the valid species (this included the three variables used to create relative size, as well as the adjusted size measure) using 49 measured values and 20 ratios calculated from those values. The first three latent axes were sufficient to explain 80% or more of the variability. We looked at which variables loaded most heavily on each axis and compared that to the boxplots created at an earlier step as a check that the methodology was working as expected. The linear discriminant functions derived from the valid species were then applied to all specimens (valid and synonymized) so the specimens could be mapped to the two-dimensional space created from pairs of the latent axes. Species were suitably coded so they could be distinguished on the plots. Since there was only one individual of R. sanguinarium Baker, 1934 for both apterae and alatae, it was analyzed with the synonymized, rather than the valid, species.
We then identified clusters of specimens comprising a valid species. Sometimes a valid species was well separated from other valid species and sometimes not. For those well-separated valid species, we looked to see if any of the synonymized species occupied the same space and manually outlined the group. When this happened, we concluded that the synonymized species and valid species were likely the same and we removed them from further analyses. If a synonymized species formed a distinct cluster, we concluded that it was not synonymous with any of the valid (or other synonymized) species, considered it a separate species, and also removed it from further analyses.
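The logic of fitting a discriminant function on valid species and then projecting synonymized specimens onto it can be shown with a deliberately small case. The Python sketch below uses Fisher's linear discriminant for two groups and two characters; it is a toy illustration, not the multi-species analysis performed in R with the MASS package, and all specimen values and names are invented. Assigning each projected specimen to the nearest group centroid stands in for the visual inspection of clusters described above.

```python
from statistics import mean


def centroid(rows):
    """Mean of each character across the specimens in a group."""
    return [mean(col) for col in zip(*rows)]


def pooled_scatter(groups):
    """2x2 within-group scatter matrix pooled over all groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in groups:
        c = centroid(rows)
        for r in rows:
            d = [r[0] - c[0], r[1] - c[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s


def fisher_direction(group_a, group_b):
    """Discriminant direction: inverse pooled scatter times centroid difference."""
    s = pooled_scatter([group_a, group_b])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    ca, cb = centroid(group_a), centroid(group_b)
    diff = [ca[0] - cb[0], ca[1] - cb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def score(w, row):
    """Projection of one specimen onto the discriminant axis."""
    return w[0] * row[0] + w[1] * row[1]


def nearest_group(w, row, centroids):
    """Name of the group whose projected centroid is closest to the specimen."""
    s = score(w, row)
    return min(centroids, key=lambda name: abs(s - score(w, centroids[name])))


# Hypothetical standardized characters for two valid species.
species_a = [(1.0, 0.2), (1.3, 0.1), (0.9, -0.1), (1.2, 0.3)]
species_b = [(-1.0, 0.0), (-1.2, 0.2), (-0.8, -0.3), (-1.1, 0.1)]
# Specimens of a synonymized name, not used to fit the function.
synonym = [(1.1, 0.0), (0.95, 0.25)]

w = fisher_direction(species_a, species_b)
centroids = {"A": centroid(species_a), "B": centroid(species_b)}
assigned = [nearest_group(w, row, centroids) for row in synonym]
print(assigned)
```

Here both synonym specimens fall with species A, which is the pattern that would be read as support for a synonymy; a synonymized name whose specimens fell far from every valid centroid would instead suggest a distinct species.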
With the remaining species, we repeated the previously described procedure, producing another LDA using only the valid species that did not separate well in the previous analysis, and using the same independent variables. With fewer species and the same number of independent variables, the software sometimes had difficulty due to insufficient degrees of freedom. When this occurred, we reduced the number of independent variables by removing those that were highly correlated with others in a stepwise manner until there were no independent variables that were highly correlated with other independent variables, and then produced an LDA with the reduced set of independent variables. For alatae, this methodology needed to be repeated a third time to finish separating all the known species.
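The stepwise removal of highly correlated independent variables can be sketched as a simple greedy loop. This Python example is an illustration only: the correlation threshold of 0.9, the rule for which variable of a pair to drop, and the variable names and values are all assumptions, as the text does not specify them.

```python
from statistics import mean, pstdev


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my, sx, sy = mean(xs), mean(ys), pstdev(xs), pstdev(ys)
    return mean((a - mx) * (b - my) for a, b in zip(xs, ys)) / (sx * sy)


def prune_correlated(columns, threshold=0.9):
    """Repeatedly drop one variable from the most correlated pair until no
    pair of remaining variables exceeds the threshold (in absolute value)."""
    kept = dict(columns)
    while True:
        names = list(kept)
        worst = None
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                r = abs(correlation(kept[a], kept[b]))
                if r > threshold and (worst is None or r > worst[0]):
                    worst = (r, a, b)
        if worst is None:
            return list(kept)
        # Assumed rule: drop the later-listed variable of the pair.
        del kept[worst[2]]


# Hypothetical standardized characters; the first two are nearly redundant.
columns = {
    "antenna_III": [1.0, 2.0, 3.0, 4.0, 5.0],
    "antenna_IV": [1.1, 2.0, 3.2, 3.9, 5.1],
    "siphunculus": [2.0, -1.0, 0.5, -0.5, 1.0],
}
kept = prune_correlated(columns)
print(kept)
```

Other drop rules (for example, keeping the variable with fewer missing values) would fit the same loop; the choice affects which characters remain but not the aim of reducing redundancy.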

Notes on Rhopalosiphum species not included in linear discriminant analyses
Rhopalosiphum dryopterae Kan, 1986. Halbert and Voegtlin (1998) and Jensen and Holman (2000) suggested R. dryopterae is a species of Dysaphis. The authors have been unable to locate the type specimens or a copy of the original description by Kan (1986), though Holman, now deceased, based his suggestion on it (Jensen, pers. comm.). We hesitate to move R. dryopterae without having seen specimens or the description, although we concede that it is likely not a species of Rhopalosiphum given the weight of expert opinion, and do not treat it further herein.
Rhopalosiphum esculentum Raychaudhuri and Roychoudhuri, 1978. Remaudière and Remaudière (1997) suggested R. esculentum may be a synonym of Aphis craccivora Koch, 1854. Unfortunately, the description of R. esculentum does not include characters that would definitively place the species in either Aphidina or Rhopalosiphina (i.e. lateral tubercles I and VII dorsal to adjacent spiracles). Raychaudhuri and Roychoudhuri (1978) report the presence of "rhopalosiphine reticulations" on the thorax and abdomen. However, A. craccivora also has reticulations and the lack of accompanying illustrations in the description of R. esculentum leaves it ambiguous as to whether the reticulations are composed of solid lines as in A. craccivora (similar to Fig. 4a, b) or spicules as in Rhopalosiphum (Fig. 4c). Certain characters in the description agree with A. craccivora (e.g. segments I, II, apex of V, and VI dark or dusky, the shape of and imbrications on the siphunculi). Finally, A. craccivora is the only aphid species reported to feed on Manihot esculenta Crantz, the reported host plant (Blackman and Eastop 2017), which is native to South America and grown as a food crop in India, where R. esculentum was described. We, therefore, agree that R. esculentum is likely to be a synonym of A. craccivora and do not include it in subsequent analyses. However, as we have not examined the holotype of R. esculentum to confirm our suspicions, we decline to formally synonymize the two species herein.

Figure 4.
[...] (Tissot, 1933). c: R. enigmae Hottes and Frison, 1931.

Rhopalosiphum momo Shinji, 1922. Halbert and Voegtlin (1998) reported that the description of Rhopalosiphum momo "is suggestive of" R. nymphaeae, but did not formally synonymize the two species. An undated, unattributed translation of the original description is available in the USDA-ARS Systematic Entomology Laboratory library and is reproduced here in full:
(40) Rhopalosiphum momo SHINJI, n. sp. / pp. 791
Characteristics: Body green or pale. Antennae somewhat longer than body, III shorter than IV and V taken together, with 16-18 subcircular sensoria, IV and V subequal in length, the former with one sensorium in the middle and the latter with a subapical one; flagellum of VI about 3 times as long as the base. Antennae as a whole infuscated and each segment has a few hairs. Cornicles with a basal 2/3 part green or pale, the remaining 1/3 part swollen and infuscated.

Date of collection: 2 June 1920
Locality: Miyakonojo, Nakajo (Nagano Pref.)
Rhopalosiphum arundinariae (Tissot, 1933). Halbert and Voegtlin (1998) suggested that R. arundinariae belongs within Melanaphis based on "wing venation [and] cuticular sculpturing" (they also suggested bamboo as a host plant is a Melanaphis but not a Rhopalosiphum characteristic; however, R. chusqueae Pérez Hidalgo and Villalobos Muller, 2012, which feeds on bamboo, was subsequently described and placed unambiguously within Rhopalosiphum based on morphological and molecular characters, thus establishing bamboo-feeding as a Rhopalosiphum characteristic). We agree with Halbert and Voegtlin (1998) and did not include it in the linear discriminant analyses (see Results for additional details).
We were unable to locate apterae of R. cerasifoliae (Fitch, 1855) to include in the analyses. However, we included specimens of the junior synonym R. tahasa Hottes, 1950 in order to understand what might happen when synonymized species do not have a presumed conspecific valid species for comparison.

New synonymies and distribution data
The type material of R. momo is apparently lost, so, left with only the description, we concur with Halbert and Voegtlin (1998) that R. momo is synonymous with R. nymphaeae and formally synonymize them for the following reasons:
1) Rhopalosiphum nymphaeae is the only species in the genus with siphunculi that are green basally and dark and expanded apically, as was described in R. momo.
2) Seven species of Rhopalosiphum utilize Prunus as a primary host (Table 2), five of which occur in Japan: R. maidis, R. nymphaeae, R. oxyacanthae, R. padi, and R. rufiabdominale. Of these candidate species, only R. nymphaeae is not eliminated based on differences in color.
a) The body of R. momo is described as "green or pale". Rhopalosiphum padi and R. rufiabdominale have a distinctive red patch between the siphunculi and R. oxyacanthae has dark green stripes that would probably have been noted in the description if present.
b) The antennae of R. momo are "whol[ly] infuscated". Antennal segment III of R. maidis is pale, rather than concolorous with the other dark segments, and can be pale or dark in R. padi.
3) One potential complication is that Richards (1960), who published the last review of Rhopalosiphum, reported that alate R. nymphaeae have zero secondary rhinaria on antennal segment IV. However, Heie (1986) reports a range of 0-8 secondary rhinaria on antennal segment IV and multiple examples in the USNM collection have 1-4 (although most have zero). The single secondary rhinarium on antennal segment IV reported for R. momo, therefore, does not exclude the possibility that it is synonymous with R. nymphaeae.
Pursuant to Article 57 of the International Code of Zoological Nomenclature (International Commission on Zoological Nomenclature 2000), Melanaphis arundinariae (Takahashi 1937) is considered a junior homonym of Melanaphis arundinariae (Tissot 1933a). We therefore suggest Melanaphis takahashii Skvarla, Kramer, Owen, and Miller, in honor of Ryoichi Takahashi, who originally described the species, as a replacement name for Melanaphis arundinariae (Takahashi, 1937).
While sorting undetermined Rhopalosiphum specimens in the USNM collection, three slides of specimens collected from Acorus in North America were discovered. The specimens match the measurements and brief descriptions of European R. rufulum apterous viviparae (Stroyan 1972, Heie 1986). These specimens represent the first collections of R. rufulum from secondary hosts in North America and extend the known range southeast of previous collections. In Europe, colonies on A. calamus are reported to grow so large that "the plants look bespattered with black mud" (Heie 1986), so, while the species is rarely collected in North America, it is presumably common where the appropriate host plants are present. The collection data are as follows: CANADA: 3 female apterae, locality unknown (label states "at HO-17214"),

Morphometric analyses
In the first LDA of apterae, R. maidis and its synonym R. africana (Theobald, 1914) formed a distinct cluster in the plots of LD1 x LD2 and LD1 x LD3 (Fig. 5). Rhopalosiphum nymphaeae (Linnaeus, 1761) and its synonym R. sparganii (Theobald, 1925) formed a distinct cluster in the LD1 x LD2 plot (Fig. 5a) and R. sp. nov. "ex. Arisaema" formed a distinct cluster in the plot of LD1 x LD3 (Fig. 5b). These five species were removed prior to performing the second LDA.
In the second linear discriminant analysis of apterae, R. padi and its synonyms R. prunifoliae (Fitch, 1855) and R. pseudoavenae (Patch, 1917) formed distinct clusters in both the LD1 x LD2 and LD1 x LD3 plots (Fig. 6); Rhopalosiphum tahasa (Hottes, 1950) also clustered with R. padi, although this may be an artifact of the fact that the species with which it is currently synonymized, R. cerasifoliae, was not included in the analyses. In the LD1 x LD2 plot, Rhopalosiphum enigmae and its synonym R. laconae (+ R. chusqueae), R. musae (Schouteden, 1906) and its synonym R. scirpifolii Gillette & Palmer, 1932, and R. nigrum Richards, 1960 all formed distinct clusters. In the LD1 x LD3 plot, Rhopalosiphum rufiabdominale (Sasaki, 1899) and its three synonyms R. gnaphalii Tissot, 1933, R. subterraneum Mason, 1937 and R. splendens (Theobald, 1915 [...]

Figure 5.
Graphs of the first linear discriminant analysis of apterae.

In the first linear discriminant analysis of alatae, R. maidis formed a cluster with its synonyms R. cookii (Essig, 1911) and R. africana, and R. nymphaeae clustered with its synonym R. prunaria (Walker, 1848), in the plots of LD1 x LD2 and LD1 x LD3 (Fig. 7). These five species were removed prior to performing the second LDA except for the three most outlying R. africana, which were paratypes.
In the second linear discriminant analysis of alatae, R. rufulum (along with one of the outlier R. africana not removed from the analyses) formed a distinct cluster in the LD1 x LD2 plot (Fig. 8). Rhopalosiphum enigmae and its synonym R. laconae formed a distinct cluster in the LD1 x LD3 plot and were removed prior to performing the third LDA. Rhopalosiphum rufiabdominale and its synonyms R. californica (Essig, 1944), R. splendens (Theobald, 1915), and R. subterraneum Mason, 1937 formed a distinct cluster in the LD1 x LD3 plot with minimal overlap with R. oxyacanthae synonyms and so were also removed prior to the third LDA.
None of the remaining species formed distinct clusters in the third linear discriminant analysis (Fig. 9).

Figure 7.
Graphs of the first linear discriminant analysis of alatae.

Figure 8.
Graphs of the second linear discriminant analysis of alatae.

Figure 9.
Graphs of the third linear discriminant analysis of alatae.

Current species and natural history
The synonymization of R. momo with R. nymphaeae and movement of R. arundinariae (Tissot, 1933) to Melanaphis bring the total number of described Rhopalosiphum species to 17, two of which (R. dryopterae and R. esculentum) are questionably assigned to the genus, pending examination of the type material, with an additional four undescribed species known.
The discovery of R. rufulum on Acorus in North America and the presence of an undescribed Rhopalosiphum species on Arisaema in eastern North America echo the sentiment of Skvarla et al. (2018), who found that R. enigmae, once considered a rare species, was present in every cattail stand examined: many rarely collected Rhopalosiphum species are likely abundant in the correct habitats and on the correct secondary hosts, and apparent rarity may be due to inadequate collecting efforts on non-crop plants. Indeed, we expect that additional surveys of semi-aquatic plants, especially monocots, will continue to produce new species of Rhopalosiphum in North America and probably East and Southeast Asia.

Morphological analyses in the molecular era
Determining species boundaries and synonymies using molecular techniques has become the standard by which most modern taxonomy and systematics are measured. Indeed, with the continually lower costs and greater ease with which highly degraded DNA can be extracted and sequenced from historic museum specimens (Gilbert et al. 2007, Bi et al. 2013, Tin et al. 2014, McCormack et al. 2015, Blaimer et al. 2016, Sproul and Maddison 2017), determining species validity and synonymies using old specimens, including type material, has never been easier (e.g. Kirchman et al. 2009, Mutanen et al. 2015, McGuire et al. 2018). However, most aphids and other slide-mounted arthropods present an unusual challenge in that all of the DNA is destroyed when specimens are cleared for mounting (although the authors have seen a few decades-old aphids that were not cleared before being slide-mounted in balsam and have wondered if it would be possible to free them from the mounting medium and extract DNA). Thus, even in this era when molecular techniques dominate, there is still a need for robust morphological comparisons for certain groups, especially aphids, as has been demonstrated here.

LDA methodology applied to apterae confirms most species and their synonymies
In general, the linear discriminant analyses were successful when applied to apterae. The analyses successfully grouped all synonymized species with their associated valid species over two iterations and we concluded that most of the synonymizations we tested are sound. This demonstrates that linear discriminant analyses can be used to test synonymizations when DNA is unavailable and provides a new method to examine and use historic, slide-mounted specimens. Additionally, R. sp. nov. "ex. Arisaema" formed a distinct group when it was included as a valid species. This supports its status as a valid, but undescribed, species and demonstrates that linear discriminant analysis can be used to distinguish potentially undescribed species from valid described species based on morphological similarities.
However, a few notable problems exist. First, R. parvae and R. rufulum consistently clustered together, but never as a distinct group away from other species. This may indicate that they are synonymous, although several factors indicate they are distinct species: they feed on different secondary hosts (Carex and Acorus, respectively), have non-overlapping ranges (Illinois versus New England and adjacent Canada), and have several morphological differences that separate them (Table 5). However, this assessment is based on very few individuals (3 and 10 specimens, respectively). Without additional specimens from a wider geographic range and considering the above evidence, we elected to leave them as separate species, although we acknowledge that the LDA indicates these species concepts should be revisited in the future.
Second, we were unable to include verified apterous R. cerasifoliae in the analyses due to lack of specimens. We included specimens of R. tahasa, which is synonymized with R. cerasifoliae, in the analyses to see if they would still form a distinct cluster without valid R. cerasifoliae to act as a guide in the discriminant function. Instead, the R. tahasa specimens clustered well with R. padi, with which R. tahasa is not synonymized, rather than forming a distinct cluster. This association has not previously been suggested in the literature, but without the inclusion of R. cerasifoliae, it is impossible to determine if R. tahasa should instead be synonymized with R. padi. This issue suggests that it is extremely important to include all valid species when creating the discriminant functions, so that synonymized species can be properly plotted. Additionally, assuming that R. tahasa is synonymous with R. cerasifoliae, this result suggests that valid species analyzed alongside synonymized species that lack their correct counterpart (either because the synonyms are incorrectly synonymized with a different species or because, as in this case, a synonymized species is included without specimens of its associated valid species) may not form distinct clusters.
Table 5. Morphological differences between R. parvae and R. rufulum. Measurement and ratio ranges are followed parenthetically by the mean and number of specimens measured.
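The R. tahasa result discussed above can be illustrated with a deliberately simplified sketch. It uses a nearest-centroid rule as a stand-in for classification on discriminant axes; this is an assumption made for illustration, not the procedure used in the analyses. The species labels and coordinates are hypothetical and carry no biological meaning. The point is only that specimens of a group with no reference specimens in the training data are assigned to whichever included group is nearest.

```python
from math import dist
from statistics import mean


def centroids(training):
    """Map each species to the mean position of its training specimens."""
    return {
        species: tuple(mean(axis) for axis in zip(*points))
        for species, points in training.items()
    }


def assign(specimen, species_centroids):
    """Assign a specimen to the species whose centroid is nearest."""
    return min(species_centroids, key=lambda s: dist(specimen, species_centroids[s]))


# Hypothetical positions on two axes (illustrative values only).
training = {
    "species_A": [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1)],
    "species_B": [(5.0, 5.0), (5.2, 4.9), (4.9, 5.1)],
}

# Specimens of a third group whose reference species is absent from the training data.
unrepresented = [(1.5, 1.4), (1.6, 1.7)]

# Without reference specimens, the third group is absorbed by the nearest species.
without_reference = [assign(p, centroids(training)) for p in unrepresented]

# Adding hypothetical reference specimens lets the group be recovered as its own cluster.
training_full = dict(training, species_C=[(1.4, 1.5), (1.6, 1.6)])
with_reference = [assign(p, centroids(training_full)) for p in unrepresented]

print(without_reference)
print(with_reference)
```

In this toy setting the unrepresented specimens fall into species_A until species_C is added, which mirrors the reasoning for including every valid species when the discriminant functions are built.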

LDA methodology worked less well for classifying alates using morphological traits
The analyses of alatae were less decisive. Intuitively, species with the most distinctive apterae (R. maidis, R. nymphaeae, R. enigmae, and R. rufiabdominale) also had the most distinctive alatae and formed distinct clusters with their associated synonymized species in the first two LDAs. However, the remaining species failed to form distinct clusters, even after a third LDA. Additionally, a number of synonymized species (R. furcata, R. fitchii, R. insertum, R. mactata, R. mali, and R. viridis) were represented only by alate specimens and did not cluster with the species with which they are synonymized. Based on our analyses, we cannot confirm that these synonymies are correct, although we also do not propose that any be raised to valid species pending additional investigation.
Aphidologists have found that Rhopalosiphum alatae captured without host plant data (e.g., in suction traps) are difficult to identify to species due to similar, conserved morphologies. While published keys to alatae are available (e.g., Richards 1960), they can be difficult to use because no key includes all described species and specimens often do not fit neatly within one couplet or another. Rather than revealing new, perhaps subtle, morphological characters that can be used to distinguish alatae, the suboptimal results from the alatae analyses reinforce the perception that they are very similar morphologically and resistant to identification without additional information (e.g., associated apterae, host plant data).

LDA loadings to create taxonomic keys
The loadings (coefficients) for each character on the discriminant functions could conceivably be used to create taxonomic keys. However, we do not suggest following that path for this group for a number of reasons: the measures we used were adjusted by size, based on our data, so the same adjustment would need to be made for all measures (except wing venation angles); the variables were then standardized to mean zero and variance one, so each variable would need to be standardized in the same way (again, the standardization we used was based on our data); and the functions are built largely on continuous characters, rather than naturally dichotomous ones, with all characters contributing to each discriminant function. Nevertheless, because the variables were standardized, their importance (i.e. weighting) to a discriminant function can be evaluated based on the absolute value of the coefficient, and one could set a cut-off above which variables are considered important. We examined this for all the discriminant functions and found that, for any reasonable cut-off, some discriminant functions were largely determined by just a few characters (which would be useful for creating a key), but many were largely determined by at least 12 (which would be less useful). Since the placement of any specimen in the plane is determined by two discriminant functions, many of which have large contributions from many characters, creating a usable key from these results would not be an easy exercise. One possible statistical approach to creating keys from continuous characters that should be explored is the use of regression trees.
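The two ideas in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analyses. The character names, coefficients, cut-off, and measurements are all hypothetical. The standardization uses the sample standard deviation, which is an assumption. Because species labels are categorical, the tree step is shown as a single classification-style split scored by Gini impurity, an assumption about how a tree-based key might be built rather than a method taken from the text.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale one character to mean zero and variance one (z-scores)."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (assumed)
    return [(v - centre) / spread for v in values]


def important_characters(coefficients, cutoff):
    """Return (character, coefficient) pairs with |coefficient| >= cutoff, largest first."""
    kept = [(name, c) for name, c in coefficients.items() if abs(c) >= cutoff]
    return sorted(kept, key=lambda pair: abs(pair[1]), reverse=True)


def gini(labels):
    """Gini impurity of a list of species labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(s) / n) ** 2 for s in set(labels))


def best_split(values, labels):
    """Find the threshold on one continuous character with the lowest weighted Gini impurity.

    Returns (threshold, impurity); this is the single-split step of a classification tree.
    """
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for (lo, _), (hi, _) in zip(pairs, pairs[1:]):
        if lo == hi:
            continue  # no threshold can separate identical values
        threshold = (lo + hi) / 2
        left = [s for v, s in pairs if v <= threshold]
        right = [s for v, s in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical standardized coefficients for one discriminant function.
example_function = {
    "siphunculus_length": 1.8,
    "cauda_length": -1.1,
    "rostral_segment_length": -0.2,
    "antennal_segment_iii_length": 0.1,
}
print(important_characters(example_function, cutoff=1.0))

# Hypothetical measurements of one character for two species.
print(best_split([10, 12, 11, 20, 22, 21], ["A", "A", "A", "B", "B", "B"]))
```

A function dominated by two characters, as in the hypothetical example, would translate readily into a key couplet, whereas a function with a dozen characters above the cut-off would not. The split threshold returned by `best_split` plays the role of a couplet boundary on a single continuous character.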

Summary
Two species of Rhopalosiphum were moved or synonymized, bringing the total number of species in the genus to 17. While LDA has previously been used to distinguish between a limited number of species or ecotypes, its use to confirm previously proposed synonymies using historic slide-mounted specimens that lack DNA is novel and yielded promising results. In particular, the analyses confirmed most synonymizations when apterae were analyzed. However, while the most distinct alate Rhopalosiphum and associated synonymies were recovered in the LDA, many species and associated synonymies were not recovered as distinct. The failure of the analyses to separate some of the alatae using phenotypic traits mirrors problems previously documented for this morphologically similar group.