Introduction

Whether for ecological performance or sexual display, many morphological phenotypes that vary across species are considered the result of ecological or sexual selection of the varieties that portray the highest (relative) fitness (Andersson 1994; Darwin 1859, 1871; Mayr 1942). Indeed, there are many examples where morphological variation can be tied to ecological or sexual behaviours that directly affect fitness, e.g. climbing performance in Anolis lizards (Losos 1990) or eye span in stalk-eyed flies (Wilkinson and Reillo 1994). However, there are also notable exceptions to this general, ‘adaptationist’ pattern (e.g. Blankers et al. 2012; Schulte et al. 2004), and the extent and causes of covariation between morphology and behaviour remain poorly understood.

The intricate relationship between behaviour, morphology, and sexual success or ecological performance is a function of the developmental pathways, genetic architecture, and the landscape of conflicting selection regimes associated with the traits. Whether associations among traits come about through developmental (shared pathways), genetic (pleiotropy or linkage), or evolutionary (shared effects from drift, mutation, and selection) covariation also determines whether these associations are observed within or among individuals and species (Armbruster et al. 2014; Cheverud 1996; Klingenberg 2014). If the genetic loci that underlie morphological and behavioural adaptation are shared (pleiotropy) or closely linked (genetic linkage), indirect selection effects can lead to co-divergence or constrain selection responses (Felsenstein 1988; Gavrilets 2003; Hansen and Houle 2008; Lande and Arnold 1983; Templeton 1981). However, if morphological phenotypes are involved in multiple behaviours, indirect selection acting on morphology following selection on one such behaviour can be counterbalanced by indirect selection or constraints resulting from coupling to other behaviours. The wings in Drosophila are an example of this, as they are used both in courtship song production and in flight (Bennet-Clark and Ewing 1968).

Interacting phenotypes such as morphology and behaviour can also become coupled among rather than within populations. Even in the absence of developmental or genetic linkage between morphological and behavioural traits, (correlated) selection may create covariance among traits if they are functionally related, if they have shared selection responses, or because the selection pressures themselves covary (Armbruster and Schwaegerle 1996; Armbruster et al. 2014; Lande and Arnold 1983). Additionally, across populations, shared ancestry and derived effects from drift and de novo mutations can also generate evolutionary covariance in the absence of genetic linkage (Armbruster and Schwaegerle 1996; Klingenberg 2014).

Here, we examine co-evolution of wing shape morphology and sexual signalling behaviour across species of field crickets (Gryllus). Field crickets differ strikingly in the calling songs that males produce to attract females and the corresponding preferences these females have for the male songs (Alexander 1962; Bailey 2008; Blankers et al. 2015; Gray et al. 2016; Hennig et al. 2016; Otte 1992; Simmons and Ritchie 1996); as a result, male reproductive success is strongly dependent on the calling song structure (Cade and Cade 1992; Wagner Jr. and Reiser 2000; Zuk and Simmons 1997). The crickets produce their songs by rubbing their forewings (stridulation), which play a very limited role in flight (except for steering), and can thus be expected to co-evolve mostly as a result of selection acting on the song structure (Bennet-Clark 1989, 2003; Gerhardt and Huber 2002; Nocke 1971).

The structure of the cricket’s mate calling song can be described in multiple dimensions. These dimensions consist of a spectral component (carrier frequency) and temporal components on short (pulse) and long timescales (chirp): pulse/chirp duration, pause (interval), period (sum of duration and pause) or rate (the inverse of the period), and the duty cycle (duration over period) (Fig. 1a). Across the different song traits, several types of selective regimes have been found for the species studied here (Blankers et al. 2015; Gray et al. 2016; Hennig et al. 2016): The pulse pattern is mostly associated with strongly concave preference functions, closely matching the distributions in the male signal thus suggesting stabilizing selection. Chirp rate is either not under direct selection or under very weak stabilizing selection, whereas chirp duty cycle is under strong directional selection in some but not in other species (where selection is stabilizing or only weakly directional). Carrier frequency is divergent among males, but female preferences are largely overlapping, suggesting that divergence in the pitch of the song is not driven by direct sexual selection, but may diverge due to a genetic correlation with pulse rate (Blankers et al. 2017).

Fig. 1

Song, geographic, and phylogenetic variation in the samples. a Annotated song waveform to illustrate the different song traits. Note that these traits apply to both chirps (G. firmus and G#15) and trills (G. rubens and G. texensis). b Geographic distributions of the study species. Distributions are approximate and based on Walker (2017). c Schematic representation of unpublished phylogenetic data truncated to include only the present study species, a waveform of a 1.2 s recording of each species’ calling song to illustrate variation in temporal parameters, and example spectral data for each species. The colors correspond to the colors in (b). Phylogeny courtesy of D.A. Gray (unpublished data)

The biomechanics of stridulatory behaviour in crickets have been well studied. With each closing movement of the wings, the plectrum on the dorso-posterior edge of one wing (usually the left) excites the teeth on the file located on the ventroposterior side of the other wing; capture-release of consecutive teeth produces vibrations that transfer to the wings resulting in near pure-tone sound pulses (Bennet-Clark 1989; Nocke 1971). The sound then radiates over the wings, mediated by the wing’s resonant structures, the harp and mirror, which have been implicated in modulating the frequency and amplitude of the song (Bennet-Clark 1989, 2003; Mhatre et al. 2012; Montealegre-Z et al. 2009; Nocke 1971; Stephen and Hartley 1995). Several aspects of wing morphology, such as the length of the file, number of teeth, and the area of the harp, have been found to correlate with natural variation in the carrier (or dominant) frequency of the song (Simmons and Ritchie 1996) and even with traits governing the temporal structure of the song (Webb and Roff 1992). However, these results are potentially affected by allometry (i.e. size–shape relationships) and cryptic phenotypic integration, that is, unaccounted covariation among song or morphological traits.

Three more recent studies (Klingenberg et al. 2010; Ower et al. 2017; Pitchers et al. 2014) have exploited the possibilities of geometric morphometrics—a widely celebrated approach to quantifying variation in shape and size of complex morphologies in a robust statistical framework (Adams et al. 2004)—to specifically account for allometry and address integration of song and shape phenotypes. These studies have revealed that there is limited phenotypic and genetic wing shape variation in the dimensions associated with functional modules, e.g. mirror, harp (Klingenberg et al. 2010) and limited covariation between shape or size and song structure (Ower et al. 2017; Pitchers et al. 2014). However, these studies all address variability within species. Hitherto, it is thus unclear whether wing shape varies among cricket species and whether wing shape variation tracks song divergence on macroevolutionary scales.

If there is significant variation in wing shape or size, we may see three possible relationships between song and wing morphology variation: (1) wing morphology and song are unrelated and vary independently. This pattern may arise if different selection pressures are driving variation in wing morphology and song, or if wing morphology, contrary to song structure, evolved mostly neutrally, that is, has a strong phylogenetic signal; (2a) wing shape or size covaries with song structure due to functional or genetic correlations between morphology and behaviour. This scenario is hypothesized for carrier frequency (because of the biomechanical relation between the resonant structures on the wings and the carrier frequency of the sound pulses) and is expected to manifest itself on both individual-level and species-level comparisons; (2b) wing shape or size co-evolves with song structure across species due to shared neutral or selective processes, but shows no covariance at the intraspecific level (evolutionary covariance).

Here, we test these hypotheses in four congeneric species of field crickets: Gryllus firmus, G. rubens, G. texensis, and G#15 [a.k.a. Gryllus staccato (Sakaguchi and Gray 2011)]. These species span a wide range of the southern and eastern United States (Fig. 1b) and show substantial variation in song structure (Fig. 1c). Currently, phylogenetic resources are limited, but a preliminary topology depicting the relationship of these species is shown in Fig. 1c (courtesy of D. A. Gray, unpublished results). One important difference between G. firmus and G#15 on the one side, and G. rubens and G. texensis on the other, is the temporal structure of the song on the long timescale. The former two produce short, regularly spaced groups of pulses, whereas the latter two produce long bouts of pulses (Fig. 1c). These two song types are generally categorized as chirps and trills, respectively (Alexander 1962). The different song types are likely the result of variation in the shape of the preference function for the chirp/trill duty cycle: chirpers are associated with concave preference functions (Hennig et al. 2016) and trillers with linear preferences (Blankers et al. 2015), which impose strong directional selection on the trill duty cycle (Blankers et al. 2017). Importantly, these song types are not the result of physiological or biomechanical constraints, as many species that have chirped calling songs produce trilled aggressive (for male–male encounters) or courtship (for close contact mating behaviour) songs (Alexander 1962). As both chirps and trills represent pulse trains separated by longer pauses (Fig. 1c), we refer to both trills and chirps as “chirps” for the sake of simplicity.

Methods

All individuals used in the experiments were raised in the laboratory. Parental generations were collected in Agua Fria National Monument (AZ, USA; G#15); Gainesville, Lake City, and Live Oak (FL, USA; G. firmus and G. rubens); Austin, Lancaster, and Round Rock (TX, USA; G. texensis). Individuals were kept in 19L containers at an average temperature of 25.3 °C (± 2.73 SD) with gravel, shelter, water and food ad libitum, and artificial light–dark cycling (16:8 L:D; 50W UV lights, at 50 cm distance UV-B: 28 μW/cm2; UV-A: 2,00 mW/cm2; intensity: 19.500).

Song Data

Male crickets were recorded in the dark for a 16 to 24-h period (mean temperature 24.9 °C ± 0.98 SD). Temperatures in the anechoic recording room typically vary by less than 1 °C during one night. Each male was assigned a container at random. The containers, plastic boxes measuring 5 × 5 × 5 cm, were equipped with gravel, egg carton, food, and water. Each container was placed in an anechoic box overnight, with a microphone (TCM 141 Conrad) mounted approximately 8 cm above the container. The room was equipped with 16 such boxes, which were acoustically isolated from each other. Using customized software (LabVIEW, 2007), the microphones were iteratively scanned for 800 ms intervals and a male was recorded for 20 s if it produced sound during that 800 ms interval.

Using custom software (LabVIEW, 2009), we determined the dominant carrier frequency from the spectral peak of the real-time signal (see Fig. 1 for an example of the raw data and the corresponding spectral peak). For analysis of the temporal pattern, the normalized envelope of the song signal was computed after signal rectification by squaring and low-pass filtering at 200 Hz (equivalent to a temporal resolution of 2.5 ms). Temporal parameters were calculated when the envelope crossed a threshold value at 10–15% of the signal envelope. Individual mean values were based on at least two 10 s windows, typically containing around 400 pulses and 2–10 chirps or trills each, from two different recordings.
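
The envelope-and-threshold procedure described above can be illustrated with a minimal sketch. This is not the custom LabVIEW software used in the study: the one-pole low-pass filter, the function names, and the default threshold of 12.5% (a value inside the stated 10–15% range) are assumptions made for illustration only.

```python
import math

def envelope(signal, sample_rate, cutoff_hz=200.0):
    """Rectify by squaring, low-pass filter, and normalize to a peak of 1.

    The filter is a simple one-pole IIR smoother (an assumption; the
    filter type used in the original software is not stated).
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    env, state = [], 0.0
    for x in signal:
        state += alpha * (x * x - state)  # smooth the squared signal
        env.append(state)
    peak = max(env)
    return [e / peak for e in env] if peak > 0 else env

def detect_pulses(env, sample_rate, threshold=0.125):
    """Return (onset_s, offset_s) pairs where the envelope is above threshold."""
    pulses, onset = [], None
    for i, e in enumerate(env):
        if e >= threshold and onset is None:
            onset = i                                   # upward crossing
        elif e < threshold and onset is not None:
            pulses.append((onset / sample_rate, i / sample_rate))
            onset = None                                # downward crossing
    if onset is not None:                               # pulse still open at end
        pulses.append((onset / sample_rate, len(env) / sample_rate))
    return pulses
```

Applied to a recording, `detect_pulses(envelope(samples, rate), rate)` yields onset and offset times from which durations and pauses can be derived.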

We included the following measurements on both the short (pulse) and long (chirp/trill) timescale (Fig. 1a): the duration, the pause (interval), the rate (the inverse of the period, where the period is the sum of the duration and the subsequent pause), and the duty cycle (duration divided by the period). Although a trill (for G. texensis and G. rubens) is different from a chirp, both have durations, rates, and duty cycles (see Fig. 1c). We refer to the measurements on the long timescale uniformly as the chirp rhythm.
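
As a worked illustration of these definitions, the sketch below derives the mean duration, pause, period, rate, and duty cycle from a list of onset and offset times. The function name and the input layout are hypothetical; only the definitions themselves come from the text.

```python
def temporal_parameters(events):
    """Summarize a train of pulses (or chirps).

    events: list of (onset_s, offset_s) tuples sorted by onset time.
    Only events followed by another event are used, so that every
    duration has a matching pause and period = duration + pause.
    """
    if len(events) < 2:
        raise ValueError("need at least two events to define a pause")
    durations, pauses = [], []
    for (on, off), (next_on, _) in zip(events, events[1:]):
        durations.append(off - on)       # sound on
        pauses.append(next_on - off)     # silent interval
    periods = [d + p for d, p in zip(durations, pauses)]
    n = len(periods)
    mean_period = sum(periods) / n
    return {
        "duration": sum(durations) / n,
        "pause": sum(pauses) / n,
        "period": mean_period,
        "rate": 1.0 / mean_period,       # events per second
        "duty_cycle": sum(d / p for d, p in zip(durations, periods)) / n,
    }
```

For example, 10 ms pulses separated by 10 ms pauses give a period of 20 ms, a rate of 50 per second, and a duty cycle of 0.5.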

Wing Morphology Data

Forewings were clipped off with precision (micro) scissors by cutting through the articular membrane (or articular sclerites) where the wing is attached to the thorax and stored at − 20 °C to prevent dehydration and mould. Only the left wing was used in the analyses. The whole wings were fixed between two glass slides and photographed with a Canon EOS 500D (shutter speed 1/60, ISO 400, 15.1 Mpx resolution), which was mounted on a stereomicroscope (×28 zoom). A 1 cm bar was included in the picture to allow for size correction.

We used the same 12 landmarks as in a previous study on G. firmus (Klingenberg et al. 2010). These 12 landmarks captured the main structures related to calling (i.e. functional modules) and were relatively easily located on each individual wing across species (Fig. 2). We defined the following functional modules in the wings based on our landmarks: file (landmarks 2 and 3), harp (landmarks 2, 3, 6, 9, and 10), mirror (landmarks 8, 9, 10, and 12), and plectrum (landmarks 1 and 4). We then used tpsDig2 (Rohlf 2006) to digitize the coordinates of the 12 landmarks. For each wing, landmarks were digitized twice. The coordinates were then Procrustes superimposed in MorphoJ (Klingenberg 2011), followed by a Procrustes ANOVA with the two independent measurements of each wing as the error term to test for measurement error. Global wing shape variation was retained as all principal components with non-zero eigenvalues describing the morphospace captured in the Procrustes coordinates.
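
To make the superimposition step concrete, the sketch below aligns one two-dimensional landmark configuration to a reference by removing position, scale, and rotation. It is an ordinary (pairwise) Procrustes fit written with complex numbers, not the generalized Procrustes procedure implemented in MorphoJ, which iteratively aligns all specimens to a mean shape; the function names are hypothetical and reflections are not handled.

```python
import math

def center_and_scale(points):
    """Return landmarks as complex numbers, centred with unit centroid size."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / size for p in z]

def align(reference, target):
    """Rotate the target onto the reference after centring and scaling."""
    ref = center_and_scale(reference)
    tgt = center_and_scale(target)
    # The least-squares rotation is the unit complex number conj(S)/|S|,
    # where S = sum(conj(ref_k) * tgt_k).
    s = sum(r.conjugate() * t for r, t in zip(ref, tgt))
    rotation = s.conjugate() / abs(s)
    return ref, [t * rotation for t in tgt]

def procrustes_distance(reference, target):
    """Square root of the summed squared landmark distances after alignment."""
    ref, fitted = align(reference, target)
    return math.sqrt(sum(abs(r - f) ** 2 for r, f in zip(ref, fitted)))
```

Two configurations that differ only in position, size, or orientation have a distance of (numerically) zero, so any remaining distance reflects a difference in shape.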

Fig. 2

(Photo credits: Rafael Block)

Landmarks on left wing used in the analysis. Red, numbered dots represent the location of the 12 landmarks superimposed on a photograph of a G. firmus wing; the black bar measures 1 cm and was used to scale the wings. Landmarks are only placed on the dorsal part of the wing and not on the flexible, lateral part above the main wing vein through 2, 6, and 10.

Statistical Analyses

We first tested for global differences in size and shape among species. Size was represented by centroid size, the square root of the sum of squared distances between the landmarks and their centroid, which is statistically independent of shape variation. The geometric shape variation, in turn, is Procrustes superimposed and thus independent of size variation. We partitioned samples in morphospace using the Canonical Variates Analysis (CVA) in MorphoJ (Klingenberg 2011), and by performing a principal component analysis using the ‘prcomp’ function in R (R Development Core Team 2016). Both analyses use the 24 Procrustes superimposed landmark coordinates as variables (12 landmarks each with Procrustes coordinates in the x and y direction). A CVA maximizes the variation among species relative to the variation within species, whereas a PCA is agnostic to species differences. The PCA thus shows whether there is structure in the major axes of shape variation at all, whereas the CVA will identify the direction of shape change that contributes most strongly to interspecific differences.
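
The definition of centroid size translates directly into a few lines of code. In this sketch the function names are hypothetical, and the conversion from pixels to millimetres via the 1 cm scale bar is an assumed workflow rather than a documented step of the original analysis.

```python
import math

def scale_from_bar(bar_length_pixels, bar_length_mm=10.0):
    """Millimetres per pixel, given the pixel length of a 1 cm scale bar."""
    return bar_length_mm / bar_length_pixels

def centroid_size(landmarks, units_per_pixel=1.0):
    """Square root of the summed squared distances of landmarks to their centroid."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    sum_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in landmarks)
    return math.sqrt(sum_sq) * units_per_pixel
```

For the four corners of a unit square, each corner lies at a squared distance of 0.5 from the centroid, so the centroid size is the square root of 2.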

We then addressed whether interspecific variation in wing shape (1) covaries with song structure, and (2) is mostly coupled due to phylogenetic differentiation, or (3) mostly coupled due to function (carrier frequency), or (4) mostly coupled to song divergence driven by sexual selection (specifically, pulse rate and chirp duty cycle). We first tested whether individual variation in multidimensional song space and morphospace were partitioned similarly among species using (partial) Mantel tests (Legendre and Legendre 2012) from the ‘vegan’ package (Oksanen et al. 2016) in R (R Development Core Team 2016), using the matrix of phylogenetic distances as a covariate. To further explore statistical associations between wing shape or size and song structure we performed a MANOVA (using the lm() function in R) with song as a multivariate dependent variable, and species identity, shape, and their interaction as fixed effects. We followed up with ANOVAs for each separate song trait (with species and multivariate shape as predictors) and did post-hoc comparisons for individual PCs. We used the R base ‘stats’ package for analyses of variance and calculated partial R² using the etasq() function from the ‘heplots’ package (Fox et al. 2009). All analyses included recording temperature as a covariate.
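
The logic of a Mantel test can be sketched as follows. This illustrates the simple (non-partial) test with a one-sided permutation p-value; it is not the ‘vegan’ implementation used in the study and does not include the phylogenetic covariate of the partial test. Function names, the number of permutations, and the seed are illustrative choices.

```python
import math
import random

def _upper(matrix):
    """Flatten the upper triangle (without the diagonal) of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def _pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(d1, d2, permutations=999, seed=0):
    """Correlate two distance matrices and test by permuting the objects of d1."""
    rng = random.Random(seed)
    n = len(d1)
    y = _upper(d2)
    observed = _pearson(_upper(d1), y)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns together so the matrix stays a valid
        # distance matrix for the relabelled objects.
        permuted = [[d1[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if _pearson(_upper(permuted), y) >= observed:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent observations, which is why an ordinary correlation test on the flattened matrices would be inappropriate.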

Results

We collected wing shape data for 44 G#15, 79 G. firmus, 72 G. rubens, and 63 G. texensis individuals with two independent measurements per individual. Of the 258 individuals, song data were available for 31 G. firmus, 17 G#15, 27 G. rubens, and 27 G. texensis individuals.

Wing Shape and Size Differ Among Species

Wing size (centroid size) differed significantly among species (Procrustes ANOVA: F3 = 6555.19, P < 0.0001; R² = 0.66) and so did wing shape (F60 = 858.59, P < 0.0001; R² = 0.43), with R² values indicating that 66 and 43% of the variation in wing size and shape, respectively, was partitioned among species. Measurement error did not significantly affect variation in size or shape (size: F3 = 0.05.19, P = 0.9953; shape: F80 = 6555.19, P = 1.0000).

The directions of greatest phenotypic variation among species were correlated with variation in the relative harp and mirror size, although CV1 and CV2 also correlated with several landmarks that did not encompass resonant structures of the wing (Fig. 3). Overall, the primary axis of interspecific phenotypic divergence (CV1) corresponded to an elongation of the wing, but with slightly reduced harp size; individuals with higher scores for this axis had more flattened and longer wings with smaller harps compared to individuals with lower scores (Fig. 3).

Fig. 3

Canonical variate analysis of wing shape. Variation in wing shape is shown in the direction of the first two CV axes, describing the maximized variation among species. The schematic outline drawings show variation in wing shape (black and grey lines respectively delineate the extreme positive and negative values on CV1 and CV2). The outlines represent the dorsal section of a left wing with the same orientation as the picture in Fig. 2

Wing Size and Shape Correlate Only Weakly with Calling Song Structure

Song structure is strongly divergent among species (Table 1). A principal component analysis (PCA) separated individuals along a pulse rate/carrier frequency axis (PC1, higher scores equalled lower pulse rates and carrier frequencies) and a chirp rate/chirp duty cycle axis (PC2, higher scores equalled higher chirp rates, but lower chirp duty cycles; Fig. 4a; Table 1). For shape morphology, levels of variation were substantially lower and species clusters in multivariate parameter space were less well-defined (Fig. 4b). Individuals with high scores for the primary PCA dimension describing shape variation had, similar to individuals with high scores for CV1, elongated wings with smaller harps (Fig. 4c). The secondary axis of shape variation represented a harp:mirror size ratio, with higher values indicating a higher harp:mirror ratio.

Table 1 Variation in song structure
Fig. 4

Principal component analysis of song and wing shape data. a Distribution of samples in phenotypic space described by the first two principal components. Within the boxes the trait variation corresponding to the upper right and lower left of the phenotypic space is shown; cf carrier frequency, pr pulse rate, pdc pulse duty cycle, cr chirp rate, cdc chirp duty cycle. b Distribution of samples in morphospace (PC1 and PC2). The insets in b show the morphometric variation along the first two principal components (extreme positive and negative values on the PCs indicated by black and grey lines, respectively)

Interestingly, species distributions in two-dimensional space (along the first two principal components) were similar for song and shape (Fig. 4a vs. b). Although species clusters were clearly more distinct and variation was of higher magnitude when using calling song data (note that both song and shape data sets were scaled to have unit variance prior to PCA), the distributions were remarkably similar. This observation was supported by a Mantel test revealing a significant correlation between Euclidean distances in song and shape among species after correcting for shared ancestry using a cophenetic distance matrix based on the phylogeny (r = 0.73, P = 0.0417).

We then tested whether variation in morphology was associated with variation in song traits. We first tested for a global association using a MANOVA including all 5 song PCs (Table 1) as response variables and all 20 shape PCs and species as well as their interaction as predictor variables. We retained only 20 shape PCs of the total 24 PCs, because the remaining four had eigenvalues that were not significantly different from zero (and thus effectively describe no phenotypic variation). Both predictor variables independently correlated with variation in song structure (species: Wilks λ = 4.00 × 10⁻⁶, F15,42 = 232.3, P < 0.0001; shape: Wilks λ = 4.00 × 10⁻³, F100,78 = 1.50, P < 0.0315). There was no significant association between wing size and overall song structure (centroid size fixed effect: Wilks λ = 0.06, F15,85 = 1.1, P = 0.3707).

We predicted that, if shape divergence tracks song divergence because of functional correlations, wing shape would most strongly covary with carrier frequency. Alternatively, if shape divergence tracks song divergence due to non-functional coupling, the major phenotypic dimensions of song divergence (pulse rate and chirp duty cycle) would covary with wing shape. Only pulse rate was significantly associated with variation in wing shape and the nature of this relationship varied among species (Table 2). Post-hoc single PC ANOVAs corrected for multiple hypothesis testing revealed that only shape variation described by PC3 (15.2% of total shape variation) correlated with variation in pulse rate (effect of PC3 on pulse rate corrected for species effects: partial R² = 0.14, F1,1 = 14.5, P = 0.0003). Variation in wing shape was not correlated with carrier frequency (Table 2), but we found a borderline significant association between centroid size and carrier frequency (partial R² = 0.03, F1,1 = 4.00, P = 0.0486; Table 3), which is not significant after correcting for multiple hypothesis testing.

Table 2 ANOVA tables for the association between shape and independent song traits
Table 3 ANOVA tables for the association between size and independent song traits

In the previous analyses, we established that some aspects of shape morphology correlated with song structure variation within species. One potential caveat in this analysis is that when the 20 non-zero shape PCs are all included in the model, covariation between lower rank PCs and song structure may drive statistically significant correlations despite these PCs representing only a small fraction of the morphological variation. Alternatively, variation in arbitrary (i.e. non-functional) domains of the wing might render biologically meaningful associations between specific PCs and song traits insignificant due to statistical noise. Therefore, to test whether either of the two major axes of shape variation, together representing about 52% of the variance and describing changes in functionally relevant morphological structures, correlated with song structure, we fitted additional univariate models. Correcting for multiple comparisons, we found no song traits to be dependent on PC1 or PC2 of the morphospace (Table 4; Fig. 5f–o).

Table 4 ANOVA table for the association between the first two principal components of shape variation and independent song traits
Fig. 5

Independent song trait correlations with size and shape. The panels show variation in the five song traits offset against variation in centroid size and the first two PCs of the morphospace

Discussion

To understand the co-evolutionary dynamics of behavioural and morphological variation, it is important to distinguish between the alternative evolutionary mechanisms at play. Here we identify significant wing shape divergence among four species of North American field crickets and ask whether wing shape variation is related to divergence in their sexually selected calling songs that form a major reproductive barrier between closely related species pairs. We compare observed patterns of covariation between wing shape and calling song structure with expectations under different scenarios (no covariance, phylogenetic covariance, functional covariance, evolutionary covariance). Wing shape co-evolves with song structure among species but shows very limited covariation with song within species. There was no association between carrier frequency variation and wing shape or size within species, rendering functional constraints an unlikely driving force of co-evolution. Rather, our data suggest that the evolution of multivariate wing shape and multivariate song structure are broadly linked due to shared (ancestral) effects from neutral and selective processes. These findings are significant in that they decouple the functional aspects of wing morphology in crickets from wing shape evolution and provide an interesting case of morphology-behaviour co-evolution across multiple species.

We show for the first time that closely related cricket species show substantial variation in wing shape (Figs. 3, 4b). Previous studies highlighted that there is ample wing shape variation within cricket species and between laboratory and geographic populations (Klingenberg et al. 2010; Ower et al. 2017; Pitchers et al. 2014). We here show that the major axes of phenotypic variation between individuals also tease apart the species (Fig. 4), suggesting a continuum of intraspecific and interspecific variation. The variation in wing shape among species is strongly related to song divergence among species (Fig. 4; Table 2), but we corroborate previous findings that covariation between morphological dimensions of wing shape and acoustic dimensions of the song is limited at best.

We expected that, if functional morphological differentiation was the driving force behind wing shape variation, the aspects of the song that are closely linked to the biomechanical and morphological properties of the wings would track divergence in the wing shape. The major candidate was the carrier frequency, and the lack of coupling between frequency variation and variation in shape leads us to reject the hypothesis. Our results show that, in line with findings for the field cricket Teleogryllus commodus (Pitchers et al. 2014) and the sagebrush cricket Cyphoderris strepitans (Ower et al. 2017), we currently lack evidence for a consistent statistical association between wing shape and song pitch in crickets. Similarly, the relationship between wing size and the pitch of the song remains ambiguous given the findings in this and previous studies. Note that this does not contradict the functional role of the resonant structures in modulating sounds at a given frequency (Bennet-Clark 1999, 2003; Montealegre-Z et al. 2011). Rather, our results imply that the variation in carrier frequency within and across species does not predict the variation in wing shape captured by the 12 landmarks, and indicate that the evolution of wing shape across species might proceed independently of the functional role of the wings.

From a sexual selection standpoint, there has been great interest in the relation between carrier frequency and male body size in crickets (Gerhardt and Huber 2002), which is closely correlated with wing size (Simmons and Ritchie 1996; Webb and Roff 1992). However, empirically the relationship between wing size, body size, and song frequency is somewhat contentious. On the one hand, there is a general expectation of covariation between body size and carrier frequency so that males can advertise (by means of an honest signal) their body size to females (Bennet-Clark 1999; Gerhardt and Huber 2002). On the other hand, although there have been reports of a correlation between harp area and carrier frequency (Simmons and Ritchie 1996) as well as an association between body size, resonator area and carrier frequency (Webb and Roff 1992), there are also several studies that fail to find a correlation between body size and carrier frequency (Verburgt and Ferguson 2010). Both this study (note that we found a weak association between centroid size and carrier frequency which was borderline significant, but not after correcting for multiple hypothesis testing) and a related study in populations of Teleogryllus commodus (Pitchers et al. 2014) add to a growing body of work that shows there is no straightforward opportunity for males to advertise their body size through the pitch of their song.

The only univariate song trait that was weakly correlated (within and among species) to wing shape variation was pulse rate. This is not the first study to report correlations between song rhythm and aspects of wing morphology (Ower et al. 2017; Webb and Roff 1992). Functional co-dependency or shared developmental pathways are unlikely candidates to explain an association between pulse rate and wing shape based on the current knowledge of the mechanisms driving temporal song rhythms (Gerhardt and Huber 2002; Hennig 1990; Schöneich and Hedwig 2011). There are tentative explanations that connect wing morphology and the pulse or chirp pattern functionally due to mechanistic correlations between the file length and wing movement (Symes et al. 2015) or between wing shape and plectrum–file engagement (Ower et al. 2017), although it seems improbable that this would happen at the limited level of variation in wing morphology observed within species. Intraspecific covariance can also result from pleiotropy or genetic linkage (Cheverud 1996; Klingenberg 2014). Our current data can neither reject nor confirm a shared or linked genetic basis for wing shape and song traits. If wing shape is most strongly associated with pulse rate, which is a major discriminator in female preference behaviour in these species (Blankers et al. 2015; Hennig et al. 2016), due to genetic covariance, the effects of indirect selection on wing shape might be quite strong. However, environmental variation and intraspecific stasis in morphology and some aspects of song rhythm might have limited the potential to detect covariance between wing shape and other song traits within species.

The alternative hypotheses involved either a purely neutral model, in which variation in wing shape tracks the phylogenetic relationships, or a model where song structure and wing shape are coupled through shared evolutionary processes. To some extent, our data support a phylogenetic signal in both morphological and song data: In morphospace and multidimensional song space, the sister species G. texensis and G. rubens are closely positioned, with overlapping distributions in morphospace (Fig. 4); G#15 and G. firmus, which are more distantly related to the sister species pair, are also further differentiated phenotypically. For song variation in these species, however, it is known that selection strongly drives within and among species variation on the level of univariate song traits (most notably, pulse rate and chirp duty cycle) and on the level of multivariate song variation (Blankers et al. 2015, 2017; Hennig et al. 2016). Additionally, the strong relationship between relative position in multidimensional song space and morphospace (Fig. 4) is not necessarily expected under drift, because G#15 is phylogenetically equidistant from G. texensis and G. rubens, and all three are (approximately) equidistant from G. firmus. Under random drift, there are many different possible orientations of sample distributions in morphospace, so the strong similarity with multidimensional song divergence suggests that non-neutral processes are also driving the observed patterns of covariation.

There are two methodological factors that may have influenced our ability to pick up on an association between wing shape and song structure. We lacked phylogenetic data for the individual samples, preventing us from testing the association in a proper phylogenetic context (e.g. using independent contrasts or phylogenetic generalised least squares). However, the preliminary phylogeny as well as accounting for relatedness in the association test of Euclidean song and shape distances strongly suggests that both song divergence and wing shape divergence are not merely a function of phylogenetic distances. Another potential limitation of this study is that we only included measurements from the left wing. Asymmetry in the wings, e.g. the relative size of resonant structures (Montealegre-Z et al. 2011; Simmons and Ritchie 1996), combined with the fact that the left and right wing have a different ‘role’ during stridulation, may have introduced a bias in our analyses. However, asymmetry in cricket wings is generally limited (Pitchers et al. 2014) and both wings act as resonators during song production (Bennet-Clark 1989, 2003; Nocke 1971). In addition, in G. bimaculatus the resonant frequency of the left wing is more similar to the carrier frequency of the song (Montealegre-Z et al. 2011). Together, these factors likely alleviate any effects on our findings due to focussing on one wing only.

In summary, we have shown that field cricket wings harbour interspecific shape and size variation and that morphology and sexual behaviour co-evolve on larger evolutionary timescales. The multivariate morphological and behavioural phenotypes codiverge and multivariate correlations between song and shape remain significant after accounting for phylogenetic effects. However, the lack of intraspecific covariance suggests that this codivergence is likely not strongly driven by functional, developmental, or genetic integration.