Introduction

In both vertebrates and invertebrates, melanin is the pigment most frequently used in the integument, giving either blackish (eumelanins) or reddish (pheomelanins) coloration. Melanins are more frequently used than carotenoids or porphyrins, probably because they have multiple key functions. Melanins have physical and biological properties that enhance resistance to abrasion of the external body surface (Bonser 1995; Burtt 1979) and that protect against solar radiation (Clusella Trullas et al. 2007), oxidative stress (McGraw 2005) and pathogens (Mackintosh 2001). Genes involved in melanogenesis can have numerous pleiotropic effects on other phenotypic traits, implying that coloration can evolve as an indirect response to selection exerted on genetically correlated traits (Ducrest et al. 2008; Galván and Alonso-Alvarez 2008). Finally, melanin-based coloration itself can play a role in foraging (Galeotti and Sacchi 2003; Roulin and Wink 2004) and anti-predatory strategies (Jones et al. 1977; Johannesson and Ekendahl 2002), for instance by improving camouflage (Hoekstra et al. 2005).

In many animals, individuals from the same population vary in the degree of melanin-based coloration. In genetically colour-polymorphic species, the evolutionary stability of colour variation requires that alternative dark and pale melanic morphs achieve the same fitness in the long term (Losey et al. 1997; Bond and Kamil 1998; Roulin 2004). In contrast, in species in which melanin-based coloration evolved under directional sexual selection, dark individuals may be superior to paler conspecifics if the expression of melanin-based coloration has a significant condition-dependent component. Assuming that the production of melanic coloration entails significant costs, only the best individuals could afford to produce the darkest version of a colour trait. Evidence for a trade-off between resource allocation to melanin-based sexually selected traits and other physiological processes, such as immunocompetence and resistance to oxidative stress, has been found in some species (Fitze and Richner 2002; Horth 2003; Fargallo et al. 2007; Roulin et al. 2008) but not in others (Bize et al. 2006; Buchanan et al. 2001; McGraw and Hill 2000; McGraw et al. 2002; Roulin et al. 1998; Siefferman and Hill 2005). Thus, the level of condition-dependent expression of melanin-based coloration may vary across species and between traits. Furthermore, in some species white rather than black coloration reflects quality (e.g. Hanssen et al. 2006), suggesting that pale coloration is sometimes more costly to produce than dark coloration (Hanssen et al. 2008). As a consequence, the signalling function of melanin-based traits remains controversial. This situation is further complicated by the fact that, within a given species, different melanin-based colour traits may have different functions, with some parts of the body or some colour traits being sexually selected and others naturally selected. Thus, ideally more than one melanin-based trait should be considered in each study species, an approach that is rarely adopted.

If selection exerted on melanic coloration is species-, population- or trait-specific, the degree of coloration may not be correlated with fitness components in any consistent way across vertebrates. We tested this by performing meta-analyses of studies on melanism and fitness carried out in birds while controlling for phylogeny. We considered birds because they represent the only taxon with a sufficiently large number of published papers on the relationship between melanin-based coloration and fitness parameters, i.e. laying date, clutch and brood sizes as well as survival. Because selection exerted on coloration is likely to differ in both sign and magnitude between sexes or groups of species, we performed one global and several specific meta-analyses, so-called meta-regression (Borenstein et al. 2009), on different subsets of the data. Since the intensity of sexual and natural selection probably differs between males and females, a first group of meta-analyses was performed on data sets including only males and then only females. Then, we investigated whether covariation between fitness components and melanin-based coloration differs between sexually dimorphic and non-dimorphic species, because sexual dimorphism is usually thought to be the result of directional sexual selection. We performed separate analyses in passerines versus non-passerines because passerines are traditionally used in sexual selection studies. We also considered colour polymorphic versus monomorphic species, because in colour polymorphic species dark and pale individuals are predicted to achieve the same fitness (Roulin 2004), whereas in monomorphic species coloration may be associated with fitness parameters. Finally, we conducted separate analyses for species in which colour traits vary between individuals in size versus in colour intensity. We restricted our meta-analyses to eumelanin-based coloration (i.e. black and grey coloration) because few studies have yet been published on pheomelanin-based coloration (i.e. reddish-brown coloration).

Material and methods

Data collection

We performed a large-scale search of studies that tested the above predictions using a number of methods. In particular, we (1) used an extensive collection of studies on melanin-based traits gathered over the last 10 years, (2) searched for recent articles containing the keywords ‘melanin’ or ‘melanism’ in the Web of Science, considering publications until 12 July 2010, and finally (3) looked for citations in reviews of the subject (Ducrest et al. 2008; Jawor and Breitwisch 2003; Nakagawa et al. 2007; Roulin 2004) and in all papers found using the above methods. We did not include in the meta-analyses studies for which effect sizes could not be calculated because neither raw data nor appropriate statistical analyses were available (e.g. Bókony et al. 2008; Cooke et al. 1995; Hatch 1991; Roulin and Altwegg 2007; Roulin et al. 2003). Finally, data on the tawny owl (Strix aluco), barn owl (Tyto alba) and Alpine swift (Apus melba) were obtained from our own unpublished data. Although tawny owls vary in the degree of reddishness, Gasparini et al. (2009) showed that feathers of redder individuals contain more eumelanin than feathers of pale conspecifics. We included 28 published studies and four unpublished data sets in the meta-analyses on reproductive parameters and 15 studies on survival. For most of the species used in the present study, the authors mentioned that melanin-based coloration plays a role in mate choice (e.g. Mundy et al. 2004). Thus, the melanin-based traits considered may not be neutral with respect to sexual and natural selection. Because the data set on adult survival was not available for both males and females, we did not investigate sex differences in effect sizes in the survival analyses. Additionally, we report the sign of the relationship between adult survival and melanin-based coloration for the studies not included in the meta-analysis. The data used to carry out the meta-analyses are reported in Online Resource 1.

Meta-analyses

For each study, the sign and magnitude of the correlation between melanin-based coloration and laying date, clutch size, brood size and survival were given by the parameter ‘effect size’ r calculated following standard methodology (Rosenthal 1991). We defined an effect size as positive when individuals with larger or darker melanin-based colour traits had an earlier laying date, larger clutch size, larger brood size and higher survival rate than individuals with smaller or paler melanin-based colour traits. When authors provided Spearman’s rank or Pearson’s correlations, these values were directly used as effect sizes. In other cases, effect sizes were calculated from the available statistics (e.g. t, χ², F, Z, r², P values) and sample sizes using standard formulas (Rosenthal 1991). When statistics were derived from analyses of variance (ANOVA) with more than two treatments, we applied an ordered heterogeneity (OH) test (Rice and Gaines 1994a, b). For the few studies where values of statistical analyses were missing or could not be used (Online Resource 1), we recalculated appropriate statistics from raw data reported in the text, tables or figures using standard formulas and R 2.9.2 (Sokal and Rohlf 1995).
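As an illustration of the conversions mentioned above, the following minimal Python sketch implements the textbook formulas commonly used to turn a test statistic into a correlation-type effect size r. The text only refers to ‘standard formulas’, so the particular formulas chosen here, the function names and the example numbers are assumptions made for illustration and are not taken from the study.

```python
import math

# Hypothetical helpers (names are illustrative, not from the study).
# Each converts a reported test statistic into an effect size r.

def r_from_t(t, df):
    """r = sqrt(t^2 / (t^2 + df)), carrying the sign of t."""
    return math.copysign(math.sqrt(t * t / (t * t + df)), t)

def r_from_chi2(chi2, n):
    """For a chi-square with 1 df: r = sqrt(chi2 / n).
    The sign has to be assigned from the direction reported in the study."""
    return math.sqrt(chi2 / n)

def r_from_f(f, df_error):
    """For an F with 1 numerator df: r = sqrt(F / (F + df_error)).
    The sign has to be assigned from the direction reported in the study."""
    return math.sqrt(f / (f + df_error))

def r_from_z(z, n):
    """For a standard normal deviate: r = Z / sqrt(n)."""
    return z / math.sqrt(n)

if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(r_from_t(2.0, 12))
    print(r_from_chi2(4.0, 100))
```

The sign convention still has to be applied by hand afterwards, so that a positive r means that darker or larger-patched individuals lay earlier, produce larger clutches or broods, or survive better.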

Mean effect sizes and homogeneity of effect sizes within data sets were estimated using the software phyloMeta v1.0beta (http://lajeunesse.nescent.org/software.html). This software accounts for potential phylogenetic bias in a data set by performing meta-analyses in which effect sizes are weighted by the phylogenetic relationships among the species included in the analysis (Lajeunesse 2009). Effect sizes entered into the meta-analyses were first transformed into Zr values to correct for the asymptotic behaviour of large values of r (Rosenberg et al. 2000; Sheldon and West 2004). We considered one mean Zr value per species and fitness component (i.e. laying date, clutch size, brood size and survival). When the same species was studied in several populations, over several years, or for females and males separately, we calculated an average Zr value weighted by sample size, as is usually done in meta-analyses (West et al. 2005). Sample sizes were summed when studies were conducted on different populations but averaged when a single study reported several statistics from the same population (e.g. if a study was carried out in several years). Meta-analyses were computed using random-effects models (Møller and Jennions 2002; Rosenberg et al. 2000; Rosenthal 1991; West et al. 2005). Results were back-transformed into r values for illustrative purposes.
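The transformation and pooling steps described above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: Zr is taken to be Fisher's transformation, Zr = atanh(r); the function names and example values are hypothetical; and the phylogenetic weighting performed by phyloMeta is not reproduced here.

```python
import math

def fisher_z(r):
    """Fisher's transformation: Zr = 0.5 * ln((1 + r) / (1 - r)) = atanh(r)."""
    return math.atanh(r)

def back_transform(z):
    """Inverse of Fisher's transformation, returning r."""
    return math.tanh(z)

def weighted_mean_zr(rs, ns):
    """Sample-size-weighted mean Zr for several estimates from one species.

    rs: effect sizes r (e.g. one per population, year or sex)
    ns: corresponding sample sizes, used directly as weights
    """
    zs = [fisher_z(r) for r in rs]
    return sum(n * z for n, z in zip(ns, zs)) / sum(ns)

if __name__ == "__main__":
    # Made-up estimates for one species, for illustration only.
    mean_z = weighted_mean_zr([0.10, 0.30], [40, 120])
    print(mean_z, back_transform(mean_z))
```

Averaging on the Zr scale and back-transforming only at the end avoids the bias that arises from averaging r values directly when some of them are large.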

We calculated the homogeneity of effect sizes (Q_H) to determine whether sampling error can explain the observed variation among our collection of effect sizes (Lajeunesse 2009). In particular, a significant Q_H indicated that the relationship between eumelanin-based coloration and fitness was heterogeneous across species or a group of species. In this case, four moderator variables (i.e. sexual dimorphism, colour polymorphism, passerine vs. non-passerine and intensity vs. patch size of melanism) were successively used as grouping variables in a new meta-analysis, so-called meta-regression (Lajeunesse 2009). We also performed two separate meta-analyses on male and female data sets, as effect sizes may differ between sexes of the same species (see Online Resource 1). Differences between sub-classes (i.e. males vs. females, sexual mono- vs. dimorphism, colour mono- vs. polymorphism, passerine vs. non-passerine, and eumelanic traits varying in size vs. colour intensity) were considered significant when 95% confidence intervals did not overlap. Since seven meta-analyses were conducted on the same data set, type I errors due to multiple testing were controlled using a Bonferroni correction, whereby the criterion α = 0.05 was lowered to α = 0.007 (i.e. all reported P values are uncorrected and considered significant when smaller than 0.007). Finally, models with and without moderator variables were compared using the AIC to determine which moderator better fits our data set (Lajeunesse 2009).
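To make the homogeneity test and the corrected significance threshold concrete, here is a minimal Python sketch of the conventional, non-phylogenetic form of the homogeneity statistic computed on Zr values. It assumes inverse-variance weights w = n − 3 (the variance of Zr being 1/(n − 3)); the function names and numbers are hypothetical, and the statistic reported by phyloMeta additionally accounts for phylogeny, so its values will differ.

```python
def q_homogeneity(zs, ns):
    """Weighted mean Zr, homogeneity statistic Q and its degrees of freedom.

    zs: Zr values, one per species
    ns: sample sizes; weights are w = n - 3 (assumed inverse variance of Zr)
    Under homogeneity, Q is compared with a chi-square with k - 1 df.
    """
    ws = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return z_bar, q, len(zs) - 1

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

if __name__ == "__main__":
    # Made-up Zr values and sample sizes, for illustration only.
    print(q_homogeneity([0.10, 0.30, -0.20], [50, 80, 30]))
    # Seven analyses on the same data set: 0.05 / 7, i.e. about 0.007.
    print(bonferroni_alpha(0.05, 7))
```

A Q value that is large relative to its degrees of freedom indicates more variation among effect sizes than sampling error alone would produce, which is what motivates adding moderator variables.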

The tree topology used to correct for phylogenetic signals was compiled from previous studies. The relationships among orders and families were extracted from recent avian phylogenies (Brown et al. 2007; Ericson et al. 2006; Hackett et al. 2008), and relationships within orders and families were retrieved from phylogenetic studies conducted in the respective taxa (Alström et al. 2006; Fain and Houde 2004; Johansson et al. 2008; Kimball and Braun 2008; Lovette and Bermingham 2002; Treplin et al. 2008). To estimate branch lengths, cytochrome b sequences were retrieved from the GenBank database for each species with available data. When data were not available for a species, we used data from a congener (Anser albifrons, Phalacrocorax pelagicus, Poecile rufescens; Online Resource 2). Sequences were aligned using the ClustalW algorithm (Thompson et al. 1994). We used 921 bp to estimate branch lengths for the tree encompassing all species included in this study using PAUP 4 beta 10 (Swofford 2003) based on a GTR + I + G model of nucleotide substitution. Based on this tree, ultrametric branch lengths were obtained using the penalized likelihood approach of Sanderson (2002) implemented in the ape package (Paradis et al. 2004) in the software R 2.9.2. Subtrees for each trait were obtained by pruning species for which no measurements were available in Mesquite 2.71 (Maddison and Maddison 2009). The species tree considered in the present study can be found in Online Resource 2.

We used different methods to detect potential publication bias, which can arise if the likelihood of publishing results depends on the strength or direction (negative or positive correlation) of scientific findings. First, we used the ‘funnel plot’ method to graph effect sizes against sample sizes (Møller and Jennions 2001). In the absence of publication bias, we expect a funnel shape due to a decrease in sampling error with increased sampling effort. In the presence of publication bias, however, we expect significant negative or positive correlations between effect sizes and sample sizes due to a deficit of studies with effect sizes smaller or larger than the true effect size, respectively. The strength and direction of the publication bias can then be quantitatively assessed using a Spearman’s rank correlation. Second, we evaluated the potential consequences of publication bias on our results using the ‘trim and fill’ method (Rosenthal 1991). This method estimates the number (L_0) of ‘missing’ unpublished studies in our data set due to publication bias, simulates them and adds them to the data set in order to recalculate a corrected effect size and its significance (Møller and Jennions 2001).
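The rank-correlation check on the funnel plot can be sketched as follows. This minimal Python example computes Spearman's rank correlation between effect sizes and sample sizes as Pearson's correlation on average ranks; the function names and data are hypothetical, and the sketch returns only the coefficient, not its P value.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

if __name__ == "__main__":
    # Made-up species-level effect sizes and sample sizes.
    effect_sizes = [0.45, 0.30, 0.12, 0.05, -0.02]
    sample_sizes = [12, 25, 60, 140, 300]
    print(spearman(effect_sizes, sample_sizes))
```

A strongly negative coefficient, as in this made-up example, is the pattern expected when large effects come mainly from small studies.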

Results

Across bird species, the mean effect sizes for the relationship between eumelanin-based coloration and laying date, clutch size, brood size and survival were not significantly different from zero (Table 1). By contrast, tests for homogeneity among effect sizes revealed significant results (Table 1), suggesting that associations between eumelanin-based coloration and fitness parameters vary among species. Therefore, we determined the effect of potential moderator variables.

Table 1 Mean effect sizes (r) and homogeneity of effect sizes (Q_H) of studies investigating the relationship between eumelanin-based coloration and laying date, clutch size, brood size and survival in birds

In general, mean effect sizes remained non-significant when data sets were analyzed in relation to sex or using moderators including sexual dimorphism, colour polymorphism, membership of the order Passeriformes or type of colour trait (Table 1). We found only two significant effects: in sexually dimorphic species darker eumelanic individuals produced more eggs, and in species for which colour traits vary in size individuals displaying larger black patches produced more young. Moreover, the association between eumelanin-based coloration and clutch size was significantly larger in sexually dimorphic than in sexually monomorphic species, further suggesting that in sexually dimorphic birds dark coloration is directionally selected (Table 1). For all other data sets, confidence intervals of the mean effect sizes overlapped between males and females, sexually dimorphic and non-dimorphic species, colour polymorphic and monomorphic species, Passeriformes and non-Passeriformes species, as well as between species for which colour traits vary in size versus those which vary in colour intensity (Table 1). Finally, the use of moderators only slightly improved AIC values, and effect sizes remained significantly heterogeneous within most of the data sets (Table 1). This indicates that associations between eumelanin-based coloration and fitness parameters are species- or trait-specific.

Overall, only the data set on brood size presented publication bias. In particular, sample sizes were significantly correlated with mean effect sizes in the global data set (Fig. 1, r_s = −0.69, P = 0.001), as well as in the data sets focusing on males (r_s = −0.72, P = 0.002), sexually dimorphic species (r_s = −0.77, P = 0.003), colour monomorphic species (r_s = −0.77, P = 0.012), Passeriformes species (r_s = −0.40, P = 0.012) and species that vary in the intensity of black patches (r_s = −0.63, P = 0.018). This bias tends to overestimate the corresponding mean effect sizes, since significant correlations between brood size and melanin-based traits were more likely to be published when based on small rather than large data sets. However, the non-significant results remained robust, as the trim and fill method indicated that the additional studies that might be necessary to correct for publication bias would not change the non-significance of these mean effect sizes (for all data sets with non-significant results: L_0 = between 1 and 7; adjusted mean r = between −0.009 and 0.067; all P > 0.05). For all other data sets used in the meta-analyses, correlations between sample sizes and effect sizes were not significant (Fig. 1, all P > 0.08).

Fig. 1 Relationship between effect size and sample size for studies investigating the relationship between eumelanin-based coloration and brood size (empty triangles), clutch size (filled triangles), laying date (empty squares) and survival (filled squares). Each point corresponds to the mean effect size for both sexes (or one sex if data are missing for the other sex) of one species

Discussion

In meta-analyses controlling for phylogeny, we found that across bird species variation in melanin-based coloration was not significantly associated with laying date, clutch size, brood size or survival. In some species the degree of coloration was positively correlated with reproductive and survival parameters, while in other species the opposite pattern was detected (Table 1 and Online Resource 1). We nevertheless found that in sexually dimorphic species, and in species for which the melanic trait varies between individuals in size rather than in colour intensity, clutch size and brood size covaried positively with the degree of melanin-based coloration. Thus, a greater extent of melanin-based coloration reflects absolute individual quality (i.e. fitness components) in some species, particularly in sexually dimorphic species (e.g. Roulin et al. 2010) and in those that vary in the size of black patches (e.g. Jensen et al. 2004), while a lightly coloured plumage can sometimes signal absolute quality in other species (e.g. Schroeder et al. 2009).

A major goal of our study was to investigate whether a dark or pale melanic coloration reflects individual quality that translates into higher reproductive success and/or survival. Our meta-analyses can thus be considered a first step towards understanding how the degree of melanin-based coloration is selected across birds. To evaluate the possibility that the results may depend on the selected species, we performed meta-analyses on non-sexually dimorphic versus sexually dimorphic species (i.e. a category of birds in which selection is considered to be directional), in non-passerines versus passerines (i.e. a category of birds traditionally considered in the context of sexual selection), in colour monomorphic versus polymorphic species (i.e. a category of birds in which selection is considered to be balanced) and in species in which variation in coloration is due to the size of the eumelanic patch versus colour intensity. The finding that, across birds, variation in melanin-based coloration was not significantly positively associated with fitness components indicates that dark individuals have not had a general selective advantage over paler conspecifics, or vice versa. Nevertheless, we discovered that in sexually dimorphic species darker individuals achieve higher reproductive success. This suggests that in species in which members of one sex evolved towards a darker coloration than members of the other sex, a dark coloration signals absolute quality. The same pattern was detected in species for which colour traits vary in size but not in intensity. This finding is important because it suggests that the size of a black patch, rather than the density of pigments deposited per unit of body surface, signals quality. This proposition should be tackled further and, in particular, researchers should quantify the size and the intensity of a colour trait separately. It now remains to investigate the pattern of selection in species in which the expression of melanin-based coloration is condition-dependent. Assuming that a darker coloration is more costly to produce, dark individuals might achieve higher fitness than pale conspecifics (e.g. Fargallo et al. 2007). Unfortunately, the quantitative genetics of melanin-based coloration has rarely been studied (e.g. Fargallo et al. 2007; Roulin et al. 2010).

Why is a dark or pale coloration not systematically positively selected? Melanin is the most common pigment in animal integuments. Given that melanin plays key physical and biological protective roles and that in some species melanin-based ornaments signal absolute individual quality due to the cost entailed by their production (Fitze and Richner 2002; Fargallo et al. 2007) and their sensitivity to environmental factors (Jensen et al. 2003), we could have predicted that across birds the darkest version of a colour trait is favoured under natural or sexual selection. At least two mechanisms can explain why lighter coloured individuals are positively selected in some species while in other species the darkest individuals have a selective advantage. First, camouflage may be best achieved with a light or dark coloration in different species (e.g. Götmark 1987; Hoekstra et al. 2005). The situation might be complex, as natural selection can favour a drab plumage to enhance crypsis, while sexual selection might select for the most conspicuous individuals as a signal of the ability to avoid predators (Götmark 1993). Comparative analyses should be carried out to identify the ecological factors associated with the evolution of dark and pale melanic plumage. Second, genes involved in melanogenesis can pleiotropically regulate many physiological and behavioural functions in a complex way (Ducrest et al. 2008). The cost/benefit balance of these multiple pleiotropic effects probably differs between species, implying that different degrees of melanin-based coloration may be indirectly selected in different species. There might be situations where the pleiotropic effects of genes involved in melanogenesis are beneficial but other situations where they entail costs. For instance, the melanocortin system triggers the production of eumelanin pigments but also has immunostimulatory effects (Ducrest et al. 2008). Given the costs of immunity (Bonneaud et al. 2003), the optimal level of immunity may differ between populations or species, implying that parasites may indirectly select for different degrees of melanin-based coloration.

For these reasons, our finding that the sign and magnitude of selection exerted on melanic coloration differ between species is not so surprising. The absence of a general pattern of selection on melanin-based coloration is a particular incentive to identify the ecological factors that make species or colour traits so different. For instance, detailed observations in the cosmopolitan barn owl showed that different degrees of melanin-based coloration are selected in different parts of the world (Roulin et al. 2009; Antoniazza et al. 2010). This suggests that selection can favour either dark or pale colour traits depending on ecological factors. Furthermore, in some species individuals become darker with age (Potti and Montalvo 1991), whereas in other species they become lighter coloured (Anderson et al. 2009). In raptors, Ferguson-Lees and Christie (2001) reported 110 species out of 315 (35%) for which some parts of the plumage become lighter coloured with age and 222 species out of 317 (70%) for which the plumage becomes darker with age. Although darkening is more prevalent than whitening, it seems that in a non-negligible number of species a light melanic plumage is favoured over a dark plumage.