Introduction

One of the most pervasive patterns in animal behavior is extensive intraspecific variation in the size of groups occupied by individuals. Group size can vary by several orders of magnitude in some species (Jarman 1974; Brown et al. 1990; Avilés 1997; Jovani et al. 2008a, 2016). Natural variation in group size often provides an opportunity to examine how life-history and behavioral traits and their fitness consequences change in response to potential competition and/or cooperation among group members. Behavioral ecologists have commonly taken advantage of group size variation, using it to study, for example, vigilance behavior (Pulliam 1973; Elgar 1989; Roberts 1996; Carter et al. 2009; Beauchamp et al. 2011), social foraging (Clark and Mangel 1984; Giraldeau and Caraco 2000; Barta and Giraldeau 2001), lekking (Höglund and Alatalo 1995), disease and parasite transmission (Hoogland 1979; Brown and Brown 1986, 2004; Wilder et al. 2011; Rifkin et al. 2012; Nunn et al. 2015), and the evolution of sociality (Pulliam and Millikan 1982; Hass and Valenzuela 2002; Johnson et al. 2002; Williams et al. 2003; Silk 2007b; Markham et al. 2015).

Variation in group size is particularly apparent in colonially breeding animals: for example, some colonial seabirds exhibit colony sizes ranging from 1 to 120,000 nests (Forbes et al. 2000; Jovani et al. 2008a); the tricolored blackbird (Agelaius tricolor) occupies colonies ranging from 50 to 200,000 nests (Neff 1937); and intraspecific colony-size variation in spiders spans 1 to >50,000 breeding individuals (Avilés 1997). Beginning in the 1970s (Lubin 1974; Hoogland and Sherman 1976; Snapp 1976) and continuing to the present (Kenyon et al. 2007; Serrano and Tella 2007; Spottiswoode 2007, 2009; Gager et al. 2016), natural variation in colony size has been used to evaluate how many of the hypothesized costs and benefits of living in groups (such as increased likelihood of parasite transmission, intensified competition for resources, better avoidance of predators, or enhanced ability to find food; Alexander 1974) change with social environment. Field-based studies testing for correlations between colony size and behavior have contributed to a more general understanding of how varying levels of breeding gregariousness affect the expression and magnitude of the costs and benefits of group life.

Yet, despite the recognition that colony-size variation is extreme in many colonially breeding species, surprisingly little attention has been paid to what causes this variation in the first place. To date, hardly any empirical studies have convincingly demonstrated a clear ecological or evolutionary mechanism that explains the persistence of colony-size variation in animals. Because annual fitness often appears to be greatest for individuals occupying colonies of particular sizes (Avilés and Trufiño 1998; Hötker 2000; Brown and Brown 2001; Schwager 2005; Bilde et al. 2007; Serrano and Tella 2007), why animals with lower fitness in the other colony sizes are not removed by natural selection (with the size range thereby restricted to a narrower set of colony sizes) remains one of behavioral ecology’s paradoxes. Colony-size variation within a population can remain stable, with little directional change in the distribution of colony sizes over periods as long as 30 years in some species (Brown et al. 2013). Colony-size frequency distributions are also highly repeatable across a species’ range (Jovani et al. 2012), and Cipriani and Jaffe (2005) suggested that these distributions might reflect the overall strength of natural selection for grouping.

Hypotheses for what might cause and maintain variation in colony size were first proposed over 25 years ago, but there was little information to evaluate them at that time (Brown et al. 1990). We now have more refined hypotheses and relevant empirical and theoretical work that can address the colony-size paradox. This review focuses, first, on the ecology of colony-size variation, asking principally how ecological conditions in and around colony sites influence the number or characteristics of individuals settling there and on what basis individuals seem to choose colony sizes. I then examine the fitness consequences of variable colony sizes, the extent to which fitness asymmetries that fluctuate in time and space may select for or against particular colony sizes, and the extent to which colony-size variation may stem from constraints on settlement options, reliability of information on site quality, and movement decisions. My review does not address the evolution of coloniality per se or the costs and benefits of colonial nesting, except as relevant to colony-size variation; these topics are treated well elsewhere (Alexander 1974; Wittenberger and Hunt 1985; Siegel-Causey and Kharitonov 1990; Brown and Brown 1996, 2001; Danchin and Wagner 1997; Safran et al. 2007). The focus here is primarily on colonial vertebrates (especially birds) and colonial spiders that breed at spatially fixed sites, in part because coloniality has been best studied in these taxa. I do not include eusocial insects or other cooperatively breeding species that live primarily in kin-structured groups. The extensive literature on group size in mobile (e.g., primates) or nonbreeding animals (e.g., foraging bird flocks) is referenced only when directly applicable to coloniality.

Ecological correlates of colony-size variation

Resource patchiness

A patchy distribution of resources is often associated generally with variation in colony size (Brown and Rannala 1995; Schwager 2005; Safran et al. 2007; Thompson et al. 2007; Votier et al. 2007; Spottiswoode 2009). These resource patches could include the amount of appropriate habitat for nesting (either of a required type of substrate or sites offering less access to predators) and concentrations of heterogeneously distributed food sources. If individuals are distributed among colonies in proportion to the fraction of the total resource available there (Fretwell and Lucas 1970), we would predict relatively similar fitness among individuals in all habitat patches (Sachs et al. 2007). In some ways, this is one of the most parsimonious hypotheses for why colonies vary in size. Most of the limited empirical evidence for resource patchiness driving colony-size variation is indirect.
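The ideal-free prediction can be illustrated with a minimal sketch. The resource amounts, the population size, and the function names below are hypothetical, and the sketch assumes that a site's resource is divided evenly among its residents, which is a simplification of the Fretwell and Lucas (1970) model.

```python
def ideal_free_colony_sizes(resources, n_individuals):
    """Expected colony sizes if individuals match each site's share of the resource."""
    total = sum(resources)
    return [n_individuals * r / total for r in resources]


def per_capita_resource(resources, colony_sizes):
    """Resource available per individual at each site (even division assumed)."""
    return [r / n for r, n in zip(resources, colony_sizes)]


# Hypothetical resource amounts (arbitrary units) at three colony sites.
site_resources = [10.0, 40.0, 150.0]
sizes = ideal_free_colony_sizes(site_resources, n_individuals=1000)
shares = per_capita_resource(site_resources, sizes)

print(sizes)   # colony sizes differ 15-fold across sites
print(shares)  # per capita share is the same at every site
```

Under these assumptions, colony sizes track resource amounts while the per capita share (and hence, by hypothesis, fitness) is equal across sites.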

Nesting substrate

Many waterbirds, especially pelagic species, may form colonies because the land masses that are appropriate for nesting are limited to a relatively few islands and coastlines. If a shortage of sites forces individuals into colonies, more birds should be found at sites that can accommodate more nests (Lack 1968; Wittenberger 1981; Forbes et al. 2000; Nuechterlein et al. 2003; Sachs et al. 2007; Votier et al. 2007). Testing this hypothesis requires measuring total nesting substrate size, which often can be difficult because of uncertainty about what constitutes a suitable substrate, especially given the heterogeneities in nesting-site quality even within a colony for some animals (Velando and Freire 2001; Herring and Ackerman 2011; Minias 2014; Minias et al. 2016).

The only study in seabirds that directly measured the extent of physical nesting substrate at a site did not find that nesting sites were limited, either regionally or locally (within colonies; Olsthoorn and Nelson 1990). Substrate availability was thus unrelated to colony size. In contrast, colony-size variation was driven largely by quantity of nesting substrate in purple martins (Progne subis), which breed in eastern North America only in artificial bird houses installed in people’s backyards. Nesting substrate could be objectively quantified by how many bird houses were present in a particular locale, and martin colony size varied directly with the number of nesting cavities present (Davis and Brown 1999). In addition, some coral-dwelling fishes, although not colonial breeders per se, exhibit group sizes that directly reflect the size of the habitat patches the groups occupy (Thompson et al. 2007).

Food resources

More attention has been given to availability of food and to what degree local patchiness of food resources affects colony-size variation. Some spiders aggregate into colonies in areas of high local food availability, while solitary individuals occur in areas with less food; experiments showed that when natural food availability was manipulated by excluding prey species from the vicinity of a colony’s web, colony size declined in response (Rypstra 1985; Smith 1985). These examples provide compelling evidence that colony-size variation can be a direct response to heterogeneity in the distribution and abundance of food resources.

The spider studies are exceptions, however, because measuring food availability directly is difficult for most colonial species (especially birds). Colony residents often forage over wide areas and consume many different kinds of food, each requiring a different sampling method (Hunt and Schneider 1987; Cooper and Whitmore 1990). Even when prey availability near colonies can be quantified, results may be counterintuitive: colony sizes of great skuas (Stercorarius skua) increased with the numbers of other seabirds of different species (on which skuas prey) nesting nearby (Votier et al. 2007). Yet, skuas in the largest colonies consumed proportionately fewer seabirds than did skuas in the smallest colonies, indicating that skua colony size was unlikely to be a direct response to seabird-prey availability.

An alternative approach has been to measure the extent of foraging habitat near a colony and assume that more foraging habitat translates into greater food availability. Some of these studies have shown positive correlations between the extent of foraging habitat within a given area and colony size (Fasola and Barbieri 1978; Gibbs et al. 1987; Farinha and Leitao 1996; Gibbs and Kinkel 1997; Baxter and Fairweather 1998; Griffin and Thomas 2000; Ambrosini et al. 2002; Ainley et al. 2003), while others have not (Brown et al. 2013). Those species with positive relationships would be good candidates for directly testing whether colony size reflects ideal-free matching to food resources.

Interpreting how food availability influences colony size is complicated by the fact that a colony often depletes the local food resources over time (Ashmole 1963; Furness and Birkhead 1984; Birt et al. 1987; Lewis et al. 2001; Forero et al. 2002; Oppel et al. 2015). The extent of such food depletion may vary with colony size in some species, depending on the typical foraging range for individuals (Brown and Brown 1996; Jovani et al. 2016). If residents in larger colonies suffer net food-related costs, why colony-size variation persists in these species remains particularly puzzling, especially when nesting substrates are not limiting.

Phenotypic sorting

Another hypothesis to explain colony-size variation is that individuals sort among groups based on phenotypic characteristics that confer differential success depending on the colony size an individual occupies (Brown et al. 1990). Certain inherent characteristics of individuals make some better suited for large groups and others for small groups. At the time of my first colony-size review a quarter of a century ago, almost no data were available to know whether individuals might sort among colony sizes in such a way. That situation has been partly remedied by recent studies on the characteristics of individuals occupying colonies (or other sorts of groups) of different sizes (Table 1; Brown 1982; Ranta and Lindström 1990; Höglund et al. 1993; Ranta et al. 1993).

Table 1 Traits in colonial species that varied based on the colony size an individual occupied

In studying sorting of individuals among colonies, it is important to focus on traits that are inherent to individuals (e.g., age, morphology) or that characterize them when they make their colony choice (e.g., body condition at settlement), and not traits that are determined by post-settlement residency in a colony of a particular size. For example, deposition of egg-yolk androgens in eggs of several colonial birds varies with colony size (Table 1; Gil et al. 2007). Androgen provisioning may improve the competitive ability of the offspring and thus might reflect an adaptive life-history decision, beneficial especially in highly competitive, crowded environments. However, aggressive interactions among egg-laying females are known to directly influence yolk androgen levels (Whittingham and Schwabl 2002; Mazuc et al. 2003; Pilz and Smith 2004), and the frequency of aggressive interactions tends to increase with colony size (e.g., Hoogland and Sherman 1976; Brown and Brown 1996). Variation in yolk androgen levels (Table 1) therefore could be partly a consequence of simply being resident in a large vs. small colony where females fight more or less, respectively.

The available evidence for colonially breeding species shows a wide range of phenotypic traits that tend to vary with colony size or in other ways differ among individuals settling in different colonies. Although the number of relevant studies is still too limited to reveal quantitative patterns (Table 1), the phenotypic traits on which colony-size sorting occurs fall into two major groups based on (i) life-history traits and (ii) behavioral or cognitive characteristics.

Life-history traits

Life-history traits on which individuals may sort include egg size, body size and condition, plumage ornamentation, testis size, investment in incubation, and overall fecundity (Bukacińska et al. 1993; Brown and Brown 1996, 2003; Neff et al. 2004; Spottiswoode 2007; Acker et al. 2015). In some cases, individuals settling in larger colonies were in poorer condition and perhaps less competitive, or invested less in reproduction, while in other species, individuals in better condition chose larger colonies (Table 1). Body size decreased among individuals in larger colonies in at least two species, while in others, the extent of a plumage ornament, testis size, and circulating testosterone levels increased with colony size (Table 1). Cell-mediated immune response decreased with colony size in sociable weavers (Philetairus socius), and this trait was heritable (Spottiswoode 2009). Experiments that manipulated colony resources found that life-history allocations of sociable weavers in different sized colonies were not plastic responses to local conditions (Spottiswoode 2009), showing that colonies indeed consisted of collections of individuals with different life-history strategies. Colony-level fecundity was inversely associated with colony-level breeder survival in cliff swallows (Petrochelidon pyrrhonota), implying that classic life-history tradeoffs might be manifested in different colonies (Brown et al. 2015).

Also in cliff swallows, sorting by colony size was delayed, being related to the relative stress level of an individual (as measured by baseline corticosterone) exhibited in its breeding colony the previous year (Brown et al. 2005). Individuals with above-average baseline corticosterone in year t were more likely to choose and settle in larger colonies in year t + 1, while those with lower corticosterone levels in year t chose smaller colonies the next year. In sociable weavers, colony size may interact with testosterone levels, leading to a facultative expression of a morphological trait: birds increased or reduced their plumage bib size when moving to a larger or smaller colony, respectively (Acker et al. 2015).

Behavioral traits

Colony residents may also sort via behavioral and cognitive traits (Table 1). Recent evidence indicates that the pervasive individual variation in behavior seen in natural populations of the same species may reflect different adaptive solutions to ecological problems (Réale et al. 2007; Bergmüller et al. 2010). We now know that certain kinds of behavior tend to co-occur in the same individual more often than one might expect at random, leading to the characterization of distinct behavioral syndromes that consist of predictable behavioral patterns and are often represented as “personality” types (Dall et al. 2004; Sih et al. 2004). Studies of personality are now being done on colonial species as diverse as spiders and swallows (Table 1). In these species, larger colonies seemed to consist of a greater proportion of less-aggressive, more socially “tolerant” individuals who were also less bold in approaching novel or aversive stimuli (Pruitt et al. 2011; Dardenne et al. 2013). In each case, the link between residents’ personalities and group size could have reflected behavioral sorting of individuals among groups, although the possibility that colony size affected the expression of aggression or neophobia could not be ruled out.

An intriguing pattern, reported for the barn swallow (Hirundo rustica), suggests that individuals might also sort among colonies based on brain size (Møller 2010). Barn swallows in larger colonies had larger head volumes, which predict brain mass in that species. This suggests that individuals with potentially greater cognitive abilities might occupy more socially complex environments such as larger colonies (Møller 2010), as also reported in other group-living species (Bond et al. 2003; Sallet et al. 2011). One consequence might be more innovative problem solving in these groups (Liker and Bókony 2009; Mirville et al. 2016), where the greater number of residents with bigger brains leads to a higher probability that one or more individuals will exhibit novel behavior (such as discovering a new method or location of foraging) that other residents can imitate. That more complex social situations tend to select for bigger (or different) brains has been hypothesized to apply across species (Dunbar and Shultz 2007; Silk 2007a; O’Connell and Hofmann 2012; cf. Benson-Amram et al. 2016), but Møller’s (2010) results suggest we should also examine this possibility within species. Animals that live in colonies of different sizes would be good candidates for such a study.

Why sort?

Various adaptive hypotheses can potentially explain the patterns of phenotypic sorting observed in colonial birds. Most are based on the presumption that individuals with different phenotypic characteristics are likely to experience the associated costs and benefits of group size to different extents as a consequence of their varying phenotypes. For example, in cases where plumage ornamentation, testis size, and circulating testosterone levels increase with colony size (Table 1), these traits may enable individuals in larger colonies to better cope with the increased competition for both intra- and extra-pair matings (Gladstone 1979; Wagner 1993; Brown and Brown 1996) that are automatic consequences of large groups. But because these traits may also be expensive to produce or confer immunosuppressive costs (Folstad and Karter 1992; Sheldon and Verhulst 1996; Braude et al. 1999), individuals who are unable to produce them can settle in smaller colonies where competition for matings is lower and thus the reduced expression of the traits is unlikely to be as detrimental in mate competition. Hormone-based sorting could confer other fitness advantages; for example, birds predisposed to certain baseline corticosterone levels can modulate the negative effects of the stress hormones by adjusting the level of social stress (i.e., colony size) to which they are exposed (Brown and Brown 2005).

For life-history traits, fine-scale differences in, for example, investment in fecundity versus survival could serve to reduce the costs or increase the benefits to individuals of living in colonies of particular sizes (Spottiswoode 2009). When higher costs of ectoparasitism or resource competition lower annual reproductive success (e.g., in larger colonies), individuals may invest less in reproduction and more in personal survival (which itself might be enhanced by better avoidance of predators in larger colonies). In contrast, when the expectation of annual adult survival is lower but reproduction is more certain (e.g., in small colonies), individuals that invest less in survival and more in reproduction might be at an advantage. This scenario would result in sorting by favoring individuals with inherent predispositions for certain life-history strategies in different social environments.

In colonial spiders, even though inherently less-aggressive individuals have lower prey capture success, they also suffer fewer costs than the more-aggressive individuals in larger colonies because the less-aggressive ones are less likely to fight with neighbors. Thus the net benefit of living in a large colony would be greater for less-aggressive spiders than for the more-aggressive ones (Pruitt et al. 2011; Pruitt and Riechert 2011). In contrast, in smaller colonies where foraging in general is less successful, the more-aggressive individuals can take better advantage of their greater ability to capture prey while also minimizing their costs of social life because they have fewer neighbors to fight with. Consequently, the inherently more-aggressive animals benefit more from small colonies than do the less-aggressive ones.

Foraging success may be highest in spider colonies that contain both aggressive and nonaggressive phenotypes, illustrating that colony composition in addition to size could affect the expectation for a given individual (Grinsted et al. 2013; Harwood and Avilés 2013; Keiser and Pruitt 2014; and see Hodgkin et al. 2014). Increasing evidence indicates that animal groups in general often consist of mixes of individuals with different personality types, and the relative proportion of the different types may have profound effects on group dynamics and ultimately individual fitness (Bergmüller and Taborsky 2010; Kurvers et al. 2010; Pruitt and Riechert 2011; Bengston and Jandt 2014). The personalities of colony members may thus provide new insight into both the causes and consequences of living in colonies of different sizes (e.g., Kazama and Watanuki 2010; Pruitt et al. 2013).

While we are accumulating obvious instances of phenotypic variability among colonies of different sizes (Table 1), we do not have enough information to know if there are certain types or classes of phenotypic traits that tend most often to be associated with colony-size variation. In addition, some of the data have been collected incidental to other work, and in only three species (barn swallow, cliff swallow, sociable weaver) have a wide variety of phenotypic traits been measured specifically in relation to colony size. Even in these species, how (or if) phenotypic sorting directly affects fitness of individuals in colonies of different sizes has not been established (see next section). Much remains to be learned about the extent to which colony size reflects nonrandom subsets of individuals.

Colony-size effects on fitness

One reason that variation in colony size is so interesting evolutionarily is the observation that measures of fitness of colony residents often differ widely among those occupying colonies of different sizes, yet colony-size variation persists in populations. This paradox has been recognized for some time (Brown et al. 1990), and potential explanations are starting to emerge. Collecting complete data on fitness of individuals in colonies of different sizes has been challenging, however, and to some degree, our understanding of the fitness effects of colony-size variation is still hindered by lack of information on all components of fitness and/or sampling of enough colony sizes to construct fitness functions.

The majority of studies that have estimated fitness components for residents of different sized colonies, especially in birds, have focused on annual fecundity, as measured by the number of offspring reared to fledging (reviewed in Brown and Brown 2001). Four main patterns have been found: reproductive success (i) increases with colony size, (ii) decreases with colony size, (iii) peaks at intermediate colony sizes, or (iv) does not vary systematically with colony size. Studies most commonly have reported patterns (i) and (iv), and pattern (iii) seems to be least frequent, at least in birds (Brown and Brown 2001). No evidence suggests that these conclusions are appreciably different today than at the time of the 2001 review, based on more recent studies of annual fecundity in relation to both colony size (e.g., Forero et al. 2002; Václav and Hoi 2002; Acquarone et al. 2003; Nuechterlein et al. 2003; Votier et al. 2007; Magrath et al. 2009; Calabuig et al. 2010; Altwegg et al. 2014) and group size more generally (Silk 2007b; Ebensperger et al. 2012). Where we have advanced is in learning more about variation in other fitness components such as adult and first-year survival and in developing hypotheses for why fitness varies with colony size.

Heritability of colony-size choice

To use fitness variation to infer how selection might favor residents occupying particular colony sizes, the choice of colony size by a given individual must be repeatable over its lifetime and to some degree heritable. Evidence is now accumulating that suggests moderately high heritabilities (≥0.40) for colony-size choice in cliff swallows (Brown and Brown 2000; Roche et al. 2011), barn swallows (Møller 2002), and lesser kestrels (Falco naumanni; Serrano and Tella 2007). Individuals of other species exhibit consistency across their lifetimes in choosing particular colony sizes, although heritability has not been formally measured (Brown et al. 2003; Neff et al. 2004). In addition, genetic differences between colonial and noncolonial individuals in mute swans (Cygnus olor; Bacon and Andersen-Harild 1987), functional differences in the brain between social and asocial finches (Goodson et al. 2005, 2009), a heritable basis for cooperative breeding in western bluebirds (Sialia mexicana; Charmantier et al. 2007), rapid evolution of schooling behavior in fish (Magurran et al. 1995), and a single locus controlling whether ant colonies are monogyne or polygyne (Krieger and Ross 2001) all suggest that level of sociality in animals can at times be under genetic control or at least correlate with genetic differences among individuals. For these reasons, selection should theoretically be able to act on colony size, and I make that implicit assumption in the following sections.
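One standard quantitative-genetic way to express such a heritability is the slope of a regression of offspring trait values on midparent values. The sketch below illustrates only that general idea: the data and function name are hypothetical, the cited studies may have used different designs or transformations, and the slope estimates heritability only if parents and offspring do not resemble each other for non-genetic reasons (e.g., a shared environment).

```python
def midparent_offspring_slope(midparent, offspring):
    """Least-squares slope of offspring values regressed on midparent values."""
    n = len(midparent)
    mean_x = sum(midparent) / n
    mean_y = sum(offspring) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(midparent, offspring))
    sxx = sum((x - mean_x) ** 2 for x in midparent)
    return sxy / sxx


# Hypothetical colony sizes chosen by parents (midparent mean) and by their offspring.
midparent_size = [100.0, 200.0, 300.0, 400.0]
offspring_size = [150.0, 190.0, 230.0, 270.0]

h2_estimate = midparent_offspring_slope(midparent_size, offspring_size)
print(h2_estimate)  # slope of about 0.4 for these made-up values
```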

Unequal fitness across colony sizes: life-history tradeoffs

The colony-size paradox might exist because we have incomplete estimates of fitness. In most empirical studies, only one component of fitness, usually annual reproductive success, was measured (Brown and Brown 2001), and tradeoffs between the components of fitness (e.g., survival or fecundity) could diminish some of the differences reported among colony sizes (Minias et al. 2015). If adult survival is higher for individuals in the colony sizes where annual fecundity is lower, lifetime reproductive success might be relatively similar across the size range. For example, in a colonial spider (Anelosimus eximius), the probability of offspring survival increased with colony size, while the likelihood of a female reproducing decreased with colony size (Avilés and Trufiño 1998). In another colonial spider (Stegodyphus dumicola), survival also increased with colony size, while female fecundity and other fitness components decreased with colony size (Bilde et al. 2007). In both instances, focusing on only one component of fitness would have yielded misleading inferences about the effect of colony size on fitness. Interestingly, in both cases, lifetime fitness, as estimated from all fitness components, suggested that individuals occupying intermediate-sized colonies had the greatest fitness advantage (Avilés and Trufiño 1998; Bilde et al. 2007). Even in these species, the colony-size paradox held, with spider colonies either smaller or larger than the intermediate optimum being common.
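The arithmetic behind a survival-fecundity tradeoff can be sketched under strong simplifying assumptions: annual fecundity and annual survival are constant over a breeder's life, so a breeder alive in its first season expects 1 / (1 - s) breeding seasons in total. The values and names below are hypothetical and are not estimates from any of the cited studies.

```python
def expected_lifetime_success(annual_fecundity, annual_survival):
    """Expected lifetime offspring for a breeder with constant annual rates.

    With constant survival s between seasons, the expected number of
    breeding seasons is 1 / (1 - s), so lifetime success is f / (1 - s).
    """
    if not 0 <= annual_survival < 1:
        raise ValueError("annual_survival must be in [0, 1)")
    return annual_fecundity / (1.0 - annual_survival)


# Hypothetical colony types: A has higher fecundity, B has higher survival.
colony_a = expected_lifetime_success(annual_fecundity=3.0, annual_survival=0.4)
colony_b = expected_lifetime_success(annual_fecundity=2.0, annual_survival=0.6)

print(colony_a, colony_b)  # both come to about 5 offspring per lifetime
```

In this made-up case, a study measuring only annual fecundity would conclude that colony type A is superior, even though expected lifetime success is the same in both.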

Only a few field studies of birds have integrated both annual survival of breeders and annual reproductive success into a single fitness measure. With per capita growth rate of a colony as the measure of fitness, Schwager (2005) found that sociable weavers showed a fitness curve in relation to colony size resembling disruptive selection: highest fitness for birds in the smallest and largest colonies and lowest in intermediate-sized colonies. In the lesser kestrel, juvenile survival declined with colony size, while the number of offspring increased with colony size (Di Maggio et al. 2016). These fitness components suggested that an intermediate colony size might afford the highest fitness, although positive effects of colony size on adult survival also existed that were not integrated into the fitness estimates. Crude measures of lifetime reproductive success in cliff swallows, incorporating estimates of breeder survival, first-year survival, and number of offspring reared, suggested an advantage for birds in the largest colonies, with fitness in all others about the same (Brown and Brown 1996). In all of these species—spiders, weavers, kestrels, and swallows—the colony-size paradox still applied even when fitness components were measured relatively comprehensively.

Unequal fitness across colony sizes: spatiotemporally fluctuating selection

Another potential explanation for the observation of unequal fitness across colony sizes is that selection on colony size varies in direction and form among years or across regions, alternately favoring different sizes in different contexts, and in this way maintains long-term variation (Møller 2002; Serrano and Tella 2007; Rubenstein 2011; Brown et al. 2013, 2016; Le Coeur et al. 2015). Spatiotemporally fluctuating selection is one way that stasis in particular traits can be achieved (Siepielski et al. 2009, 2011; Bell 2010) and is consistent with the many short-term studies indicating that directional selection is more common than stabilizing selection, yet long-term change in traits is apparently rare (Kingsolver and Diamond 2011; Kingsolver et al. 2012; Morrissey and Hadfield 2012).

When ecological conditions, driven perhaps in part by climatic variability or anthropogenic activity (Lusseau et al. 2004; Siepielski et al. 2009; Altwegg et al. 2014; Millet et al. 2015), vary among years and cause changes in populations of a colonial species’ prey, predators, or parasites, the costs and benefits associated with colony size might also regularly change among years or locations. When this is the case (and individuals cannot predict this variability at the time of settlement), selection on colony size could theoretically fluctuate in time and/or space among directional (of either sign), stabilizing, and disruptive forms, and in each case, different colony sizes would be favored (Brown et al. 2016). The wide range in colony sizes could thus be maintained over the long term, because no one size confers a fitness advantage to individuals that persists for very long.

Testing this hypothesis requires (ideally) data on each fitness component for individuals in all colony sizes in multiple years or study areas, with each breeding season being the unit of analysis (and thus requiring adequate sample sizes for each year). This is a tall order that most studies of colonial animals have not been able to achieve. Many of those on annual fecundity in birds (Brown and Brown 2001) were not designed to study temporal variability, having been done either during a single season or over only a few years. Other studies that were longer term have demonstrated yearly variation in fitness components associated with reproductive success (Burger 1982; Picman et al. 1988; Danchin et al. 1998; Stokes and Boersma 2000; Serrano et al. 2004; Acker et al. 2015) but did not report whether the effect of colony size on these components varied with year.
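A year-by-year analysis of this kind is often framed as estimating linear and quadratic selection gradients from a regression of relative fitness on the standardized trait. The sketch below is one simplified way to do that and is not the method of any cited study: the data, the year labels, and the function names are hypothetical, colony size is standardized without transformation, and both gradients are taken from a single quadratic fit. A positive or negative linear gradient (beta) suggests directional selection; a negative quadratic gradient (gamma) is consistent with stabilizing selection and a positive one with disruptive selection.

```python
from statistics import mean, pstdev


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(z, w):
    """Least-squares coefficients (a, b, c) of w = a + b*z + c*z**2."""
    s = [sum(x ** k for x in z) for k in range(5)]            # power sums of z
    t = [sum((x ** k) * y for x, y in zip(z, w)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = _det3(normal)
    coeffs = []
    for j in range(3):                                        # Cramer's rule
        m = [row[:] for row in normal]
        for i in range(3):
            m[i][j] = t[i]
        coeffs.append(_det3(m) / d)
    return tuple(coeffs)


def selection_gradients(colony_sizes, fitness):
    """Return (beta, gamma) for one year from a quadratic fit."""
    mu, sd = mean(colony_sizes), pstdev(colony_sizes)
    z = [(x - mu) / sd for x in colony_sizes]                 # standardized colony size
    w_bar = mean(fitness)
    w = [y / w_bar for y in fitness]                          # relative fitness
    _, b, c = fit_quadratic(z, w)
    return b, 2 * c                                           # gamma is twice the quadratic coefficient


# Hypothetical fitness estimates for the same five colony sizes in three years.
colony_sizes = [100, 200, 300, 400, 500]
fitness_by_year = {
    "year_1": [1.0, 1.5, 2.0, 2.5, 3.0],   # larger colonies favored
    "year_2": [3.0, 2.5, 2.0, 1.5, 1.0],   # smaller colonies favored
    "year_3": [1.0, 2.0, 2.5, 2.0, 1.0],   # intermediate colonies favored
}
gradients = {yr: selection_gradients(colony_sizes, f) for yr, f in fitness_by_year.items()}
for yr, (beta, gamma) in gradients.items():
    print(yr, round(beta, 3), round(gamma, 3))
```

With real data, each year would need enough colonies to estimate both gradients with useful precision, which is the sample-size constraint noted above.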

Two exceptions, however, support the notion that fecundity selection on colony size might fluctuate temporally. In California gulls (Larus californicus), where nest density can be considered a proxy for colony size, fledging success declined significantly with density over the first 6 years of the study and increased significantly with density over the last 7 years (Jehl 1994). In red-necked grebes (Podiceps grisegena), solitary nests had a higher probability of survival than nests in colonies during 1 year, the pattern was reversed in another year, and in 2 years there was no effect of colony size on nest survival (Nuechterlein et al. 2003).

Data on temporal variability in colony-specific survival are also sparse. Most studies incorporating survival in fitness estimates either pooled all years (Schwager 2005) or were done on short-lived species during single seasons (Avilés and Trufiño 1998; Bilde et al. 2007). Work on lesser kestrels suggested that yearly effects on adult survival might change among colony sizes, although colony-size differences were mostly nonsignificant and the rank order of colony-size success was largely the same among different years (Serrano et al. 2005). Another kestrel study also showed temporal differences in survival among years, but interactions between colony size and year were either not examined or did not prove significant (Di Maggio et al. 2016). Sociable weavers, on the other hand, did show apparent colony-size effects on survival that varied over 4 years (Altwegg et al. 2014); however, the form of the interaction was not specified, probably in part because documenting possible selection on colony size was not the primary objective of the research.

The most complete study of temporally fluctuating selection on colony size comes from cliff swallows, in which survival selection on both first-year birds and breeding adults fluctuated among years (Brown et al. 2016). Over a 30-year period, colony size was under both stabilizing and directional selection in different years, with birds in larger colonies favored in cooler and wetter seasons and birds in smaller colonies in hotter and drier ones. Oscillating selection on colony size likely reflected annual differences in food availability and the consequent importance of information transfer, and in the level of ectoparasitism. The results help explain the colony-size paradox, showing that the colony sizes that are least successful in 1 year may be the most successful in a later year. Perhaps as a result, the distribution of colony sizes in that population has shown no long-term directional change (Brown et al. 2013).

We also know relatively little about the extent of spatial variability in fitness of colonial animals. Among birds, the most relevant study is that on fieldfares (Turdus pilaris), in which high predation on adults selected for smaller colonies in southern Sweden, whereas reduced predation on adults in northern Sweden favored larger colonies where nest predation was lower (Wiklund and Andersson 1994). This example showed clear geographic variation in selection on colony size. However, even though Wiklund and Andersson (1994) had data for 10 years, whether the spatial variation between northern and southern Sweden differed among years was not reported.

Among spiders, fitness components in relation to colony size in S. dumicola seemed to differ to some degree between two Namibian study areas approximately 200 km apart (Bilde et al. 2007), but whether the payoffs for these two areas also varied temporally could not be investigated because the two areas were studied only during a single breeding season. Another spider, A. eximius, showed elevational differences in colony size in Ecuador, with smaller colonies at higher elevations and larger ones at lower elevations, implying differences in selection on colony size in the different places, although colony-size-specific fitness was not reported (Purcell and Avilés 2007). In a related species (Anelosimus studiosus) within a relatively small area in Tennessee, Jones and Reichert (2008) found that individuals in larger (“multi-female”) colonies had higher average fitness at cooler sites and that spiders in smaller (“single-female”) colonies had higher fitness at warmer sites, providing support for spatially fluctuating selection on colony size over a small geographic scale and consistent with the observed distribution of more large colonies at cooler sites and more small ones at warmer locales (Jones et al. 2007).

The hypothesis that selection on colony size fluctuates in time and space seems to me the most promising of the adaptive, fitness-based explanations for variation in colony size (Brown et al. 2016). Unfortunately, testing it requires comprehensive data on various fitness components over multiple years in which enough individuals and colonies are sampled each season to yield sufficient statistical power for estimating time-by-colony-size interactions. In the relatively few studies to examine annual survival in colonial vertebrates, for example, small sample sizes or model complexity have often required pooling years in order to achieve any estimates (Brown et al. 2003; Gager et al. 2016) or, where survival was estimated by year, sample size did not permit fully exploring whether year interacted with colony size to predict variation in survival (Serrano et al. 2005).

Even when sample sizes are large (e.g., Roche et al. 2013), studying the effect of colony size by year is challenging. If each colony size is considered a distinct group (or state in multi-state statistical models), parameters proliferate to the point of becoming unwieldy (Royale 2009) and may require collapsing states in order to achieve actual parameter estimates (Brown et al. 2016). Following long-lived vertebrates over their lifetimes to measure both survival and reproductive success is logistically difficult because doing so requires conducting long-term field studies, which face numerous obstacles (Franklin 1989; Tilman 1989; Clutton-Brock and Sheldon 2010a, b; Birkhead 2014). When animals switch colonies among years, locating them in space and monitoring them becomes even more challenging. Yet, until more information on spatiotemporal variation in survival and fecundity for different colony sizes is obtained, the paradox of unequal fitness among colony sizes will remain, and we cannot evaluate whether in general spatiotemporally fluctuating selection maintains colony-size variation. Collecting such data should be a priority in future studies of colony size.

Equal fitness across colony sizes: what does it tell us?

Fitness components do not always vary significantly with colony size (Brown and Brown 2001). Those results are the least difficult to reconcile with colony-size variation. Equal fitness across colony sizes would be predicted by the hypothesis of ideal-free matching of the number of conspecifics in a colony to local resource availability. In one of the only examples of colony size being a direct reflection of local nesting-site availability, in purple martins, annual fecundity did not vary significantly across the size range (Davis and Brown 1999). In addition, if colony residents sort themselves among group sizes based on their life-history predispositions, morphology, neuroendocrine parameters, or behavioral temperament (Table 1; see earlier section) in ways that maximize their performance, fitness among colony sizes might tend to be more similar than if individuals were assorting randomly.

However, most studies showing little difference in survival or fecundity with colony size suffer the same limitations discussed above for the cases of unequal fitness among colonies. Virtually all such studies have examined only one component of fitness (usually nesting success), and whether other life-history traits have similar fitness functions with colony size is unknown. Many are of short duration (1–2 years), and we do not know if the result of equal fecundity across colonies applies in each season. When sample sizes (i.e., the number of colonies studied) are small, which is typically the case, low statistical power may be responsible for not detecting colony-size effects. Or, when studies rule out directional selection on colony size, their statistical power may often still be too low to detect nonlinear patterns (stabilizing or disruptive selection on colony size). Thus, the current examples in which fitness is similar across colony sizes cannot conclusively support or refute the hypotheses that variation in colony size is attributable to ideal-free sorting with respect to resource patches, life-history tradeoffs among fitness components, spatiotemporally fluctuating selection, or nonfitness-based explanations of group size choice (see next section).

Settlement constraints and information limitations

Another class of hypotheses to explain variation in colony size posits the variation to be a consequence of constraints driven by subsequent settlement of other individuals, dispersal limitations, and incomplete or inaccurate information on site suitability. In these cases, while a given individual may try to make a colony-size choice that maximizes its fitness (based possibly on extent or quality of colony resources available or its own phenotypic predisposition for a particular group size), constraints prevent this from occurring. The result may be a distribution of animals among colony sizes in ways where fitness is not equal among colony sizes, and in these cases the colony-size paradox could be explained.

Constraints on achieving a particular colony size

We recognized over 30 years ago that colony size is not a variable over which a given individual has complete control: at the time of settlement, the colony size probably can be predicted relatively well by an incoming settler, but other individuals may later join, or depart from, an existing colony and change its size (Brown and Brown 1996). This reality was first emphasized by Sibly (1983) and Pulliam and Caraco (1984), who assumed that intermediate-sized groups were often best. They pointed out that an incoming individual, faced with the prospect of settling as a solitary or joining an existing group of optimal size, would most likely join the group, despite lowering the fitness of all group members (by making the group larger) because the joiner’s fitness would still be higher than if it had settled as a solitary. It was thus argued that groups are typically larger than what would be optimal in terms of fitness expectations (Sibly 1983; Pulliam and Caraco 1984; Zemel and Lubin 1995).
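The joiner's logic described above can be made concrete with a small numerical sketch. It is an illustration only, not the formulation used by Sibly (1983) or Pulliam and Caraco (1984): the hump-shaped fitness function, its parameters, and the function names are all hypothetical, chosen simply so that an intermediate group size is best.

```python
def fitness(n, optimum=10, width=12.0):
    """Hypothetical per-capita fitness: highest at `optimum`, lower on either side."""
    return max(0.0, 1.0 - ((n - optimum) / width) ** 2)


def stable_group_size(optimum=10, width=12.0, max_size=1000):
    """Add one joiner at a time for as long as joining beats settling alone."""
    solitary = fitness(1, optimum, width)
    n = 1
    # A prospective settler joins only if its fitness in the enlarged group
    # (n + 1 members) would exceed its fitness as a solitary.
    while n < max_size and fitness(n + 1, optimum, width) > solitary:
        n += 1
    return n


if __name__ == "__main__":
    print("optimal size:", 10)
    print("size at which joining stops:", stable_group_size())
```

Under these assumed parameters, joining continues well past the size at which per-capita fitness peaks, because each newcomer compares the enlarged group only with the solitary alternative.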

Although originally developed to explain variation in foraging groups, the notion that groups are routinely larger than the optimum might also hold for breeding colonies (Kramer 1985; Jones 1987). However, it can potentially apply only to species in which an intermediate colony size is theoretically best, which at present is known to be only a small subset of colonial species (Avilés and Trufiño 1998; Brunton 1999; Brown and Brown 2001; Bilde et al. 2007; Markham et al. 2015). This hypothesis cannot explain continued persistence of small colonies in species in which fitness increases linearly with colony size nor can it explain continued persistence of large colonies when fitness decreases linearly with colony size; in the latter case, individuals should always settle as solitaries in preference to joining a large colony. In addition, arguments about whether settlers have constraints on which group to join depend heavily on the shape of the fitness function associated with colony size (Sibly 1983; Giraldeau and Gillis 1985; Giraldeau and Caraco 1993), how fitness varies over time (Griesser et al. 2011), the degree of relatedness among potential colony members (Higashi and Yamamura 1993; Rannala and Brown 1994; Giraldeau and Caraco 2000; Tóth et al. 2009), the extent of collective movement (either into or out of a colony) by multiple individuals (Kramer 1985; Kharitonov and Siegel-Causey 1988; Brown and Brown 1996; Szabó and Szép 2010), and whether a single optimal group size exists in most populations, given phenotypic differences among individuals (Ranta 1993; see Phenotypic Sorting). Furthermore, the original Sibly (1983) model that optimal group sizes are unstable and lead to larger groups predicts a relatively narrow distribution of group sizes (Gerard et al. 2002; Ma et al. 2011), the opposite of what is typically observed in the field.

I am not aware of any empirical study that has directly tested the hypothesis that breeding colonies are routinely larger than what might be considered optimal or that incoming settlers are faced with these sorts of constraints on choosing colony sizes. Some modeling now suggests that when animals have relatively good information about the quality of the resource patches (i.e., colony sites) in the environment, optimal group sizes may in fact be relatively stable (Beauchamp and Fernández-Juricic 2005) and vary in size (Martinez and Marschall 1999). The only field study partially supporting the Sibly (1983) and Pulliam and Caraco (1984) hypothesis was on Montagu’s harriers (Circus pygargus), where colony settlement patterns seemed to support individuals’ decisions to choose large colonies up to some threshold, after which they began colonizing new sites (Soutullo et al. 2006). However, whether fitness varied among the different settlement options as predicted (sensu Pulliam and Caraco 1984) was unclear. In northern bobwhite quail (Colinus virginianus), nonbreeding group sizes appeared to remain near an intermediate size where fitness was highest, suggesting that the optimal group size was relatively stable in that species (Williams et al. 2003). Nonbreeding quail, of course, have more options for moving among groups and potentially regulating their group size than do colonially nesting species where fixed nests preclude much inter-colony movement.

Even if a range of colony-size optima exists, though, colony size can sometimes change to become larger or smaller than what individuals might “prefer,” either through new settlement by later arrivals or by extensive breeding failure of established residents (Brown and Brown 1996). This inevitably introduces some noise into the estimated fitness function associated with colony size, and also suggests that we should focus especially on colony sizes chosen by settlers specifically at the time they make their decision if we wish to directly correlate individuals’ choices with particular colony sizes and fitness outcomes.

Information use and settlement cues

The animal coloniality literature has been dominated in recent years by discussion of what cues individuals use to assess breeding habitat, the form and extent of public information about site quality that is available to potential settlers, and how these information-based settlement decisions might lead to the evolution of coloniality (Danchin and Wagner 1997; Valone and Templeton 2002; Safran et al. 2007; Danchin et al. 2008; Evans et al. 2016). Much of this work focuses primarily on what drives formation of colonies, in what I consider unsuccessful attempts to achieve a universal theory of coloniality; what causes or maintains variation in colony size has proven to be a sometimes neglected component of this approach. It is assumed that fitness expectations are the underlying driver of the decision rules or information employed to make settlement choices. If individuals can accurately assess future fitness based on environmental or social context (sensu Danchin and Wagner 1997; Reed et al. 1999; Nocera et al. 2006; Safran 2007; Evans et al. 2016), their decision rules are simply a mechanistic explanation for individuals’ sorting among sites based on resource patchiness or social benefits deriving from group composition or size (Safran et al. 2007). Under some circumstances, relatively simple decision rules can lead to adaptive choice of breeding habitat where success is maximized (Danchin et al. 1998). However, when the information used or the environmental cues are unreliable or for other reasons do not correlate with fitness (Giraldeau et al. 2002), individuals will occupy some colony sizes where their success is lower than that of animals in other colony sizes, consistent with the colony-size paradox.

Direct assessment of resources

One way that colonial animals are thought to select breeding sites is by directly assessing the extent of local resources (e.g., food, nesting substrate) available in a given habitat patch. Species in which colony size represents an ideal-free match to resource availability would be candidates to use this sort of information (see Davis and Brown 1999). It seems reasonable that individuals might assess, for example, extent of foraging habitat (by traveling through it) or amount of nesting substrate (by examining several potential nest sites) in relation to the number of conspecifics already present at a site, and use that to decide whether to settle there.

The best evidence for this sort of habitat assessment comes from studies on colonial spiders, in which food availability around a colony was experimentally reduced, and spiders responded by decreasing their colony sizes (Rypstra 1985; Smith 1985). However, it is unknown if this sort of information is widely gathered or used by colonial species. A limitation is that acquiring the relevant information may require extensive environmental sampling of multiple habitats, analogous to problems an optimal forager has in deciding on foraging-patch use (Bell 1991; Reed et al. 1999), and individuals may waste valuable time sampling during a finite reproductive period. Typically, settlement decisions must be made relatively early in a breeding season when colony choice occurs. Only if resource availability is predictable across time can animals use habitat assessments at the start of the season to reliably predict what will be available later when young are actually being raised. Time constraints on sampling or unreliable information on, for example, the extent of food resources available for provisioning of offspring or the abundance of local predators may result in nonadaptive fitness variation among colonies of different sizes, manifested in brood condition or survival.

Indirect assessment: conspecific presence

In part because of the difficulties inherent in directly assessing resources or other factors (e.g., predation, parasitism) that affect reproduction at potential colony sites, animals might instead rely on indirect cues. Two of the most widely discussed are the number of individuals present (or a proxy, such as the number of existing nests from a past year) and the actual reproductive success of individuals at an active colony site (Shields et al. 1988; Danchin and Wagner 1997; Safran 2004, 2007; Evans et al. 2016). Some colonial species have been shown to be attracted primarily by the presence of conspecifics at a site and to use this to select colonies (Ward et al. 2011), often seeming to prefer the sites with the largest number of individuals present (Brown and Rannala 1995; Serrano et al. 2001, 2003, 2004; Dittmann et al. 2005). Presumably, in these cases, naïve individuals can rely on the presence of others (e.g., the early arrivals who might be older or more experienced) to assess site suitability, and this provides more accurate information than attempting to evaluate sites themselves. Immigrant or first-year individuals in particular, who have no prior knowledge of the relative suitability of sites in an area, have been shown to cue primarily on conspecific presence in some colonial species (Reed et al. 1999; Serrano et al. 2001; Calabuig et al. 2010).

However, conspecific presence per se is not always a reliable indicator of success (Safran et al. 2007), in part because total reliance on this alone may lead to continued growth of a few colonies to sizes at which individual success is depressed because of intense competition for food or nest sites (that outstrips local availability), interference among neighbors, or heightened transmission of parasites and disease (Wittenberger and Hunt 1985; Brown and Brown 1996, 2001). In other cases, potentially suitable sites will be ignored simply because no conspecifics are using them (Forbes and Kaiser 1994; Kildaw et al. 2005). Most empirical studies of conspecific attraction implicitly assume all individuals use the same cue and typically favor sites with the most individuals present (Stamps 1988; Shields et al. 1988; Safran 2004; Serrano et al. 2004), but if sorting among colony sizes occurs based on individuals’ phenotypic attributes (Table 1), we might expect different degrees of conspecific attraction and even conspecific repulsion in some individuals.

Indirect assessment: conspecific reproductive success

An alternative means of assessing colony sites is directly observing the reproductive success of others (“prospecting”) and using that information to choose colony sites the following year (Danchin and Wagner 1997; Danchin et al. 1998; Reed et al. 1999; Dittmann et al. 2005; Evans et al. 2016). The time lag introduces an element of uncertainty, and thus use of conspecifics’ reproductive success requires that individual fitness at physical colony sites be autocorrelated from one year to the next, probably because local resources are temporally predictable (Switzer 1993; Brown and Rannala 1995; Doligez et al. 2003), and that individuals exhibit relatively high philopatry to the same general locale between years. Re-occupancy of sites the next year and extent of colony growth at a site between years correlate positively with the past year’s overall reproductive success in some colonial species (Danchin et al. 1998; Brown et al. 2000; Frederikssen and Bregnballe 2001; Sergio and Penteriani 2005; Aparicio et al. 2007) but not in others (Safran 2004; Serrano et al. 2004; Sachs et al. 2007; Igual et al. 2007; Parejo et al. 2006). So far, models of colony site occupancy based on assessment of conspecifics’ success have focused only on the fecundity component of fitness, presumably because that is more easily monitored by a prospective settler (e.g., by seeing the number of active nests or the number of dependent offspring in nests). Using annual survival of others as a proxy for site suitability would seem to be much more difficult to do, although the age distribution of residents at a site might offer indirect information on survival prospects for residents there.

In either case, whether colonies are chosen by conspecific attraction or by monitoring success of others the previous year, these mechanisms do not explain why colony-size variation occurs in the first place. Both in fact predict that large colonies should continue to grow, either because they are large or because successful sites will accumulate settlers, and that small colonies will be rare and generally unsuccessful. Site choice based solely on conspecific attraction or conspecific reproductive success does not generate the extreme variation in colony size associated with many species (Johst and Brandl 1997; Safran et al. 2007). Information cues, to the degree that they are unreliable and do not accurately reflect suitability of a given site at the time of settlement, can best explain only why fitness might differ among sites of roughly similar size.

Dispersal, philopatry, and collective movement

A promising approach to understanding variation in colony size is to consider the set of colonies in a habitat as a metapopulation and investigate colony occupancy dynamics as largely a function of dispersal and philopatry (Johst and Brandl 1997; Matthiopoulos et al. 2005; Schwager 2005). Because re-occupancy of a colony site the next year (philopatry) often confers advantages simply through familiarity with a local area (Serrano et al. 2001; Safran 2004; Hoogland et al. 2006; Brown et al. 2008), dispersing to new sites has an inherent cost. In some circumstances, the cost of dispersing to a new colony may be so great that individuals continue to exhibit philopatry at a site that is no longer suitable, effectively trapping an entire population in a subset of the available colony sites (Matthiopoulos et al. 2005; Schippers et al. 2011).

For example, when distances between sites are variable and dispersal costs vary with distance, colonization of some sites can be prevented, and variability in colony size may result, even when colony sites are homogenous in quality (Matthiopoulos et al. 2005; Schwager 2005). In these cases, we might see lower fitness for residents of some of the larger colonies, potentially explaining the colony-size paradox for those species. Simulations using various levels of philopatry and variable distances and configurations among sites reveal extensive temporal size variability for individual colony sites (Matthiopoulos et al. 2005) that is consistent with empirical colony-size distributions in some species (Brown et al. 2013) and leads to population-level variation in colony size. Additional work (Johst and Brandl 1997) has modeled different levels of density dependence in driving dispersal among colony sites, and variation in colony size occurs under certain levels of density dependence, especially when there is extensive temporal variability in the number of individuals that can be supported at a site (Ives and Klopfer 1997; Schwager 2005). These results are particularly interesting because they show mechanistically how colony-size variation can be generated using rather simple dispersal rules. However, they do not specify what the costs or benefits underlying different degrees of dispersal might be, and those are likely influenced by the fitness effects associated with colony size or particular sites reviewed earlier.

A more explicit attempt to model how between-site movement might predict observed variation in colony size was Russell and Rosales’ (2010) important study of conditions favoring site-switching cascades by individuals at different colonies. That animals often have imperfect information on site suitability is not disputed (Bell 1991; Giraldeau et al. 2002), and Russell and Rosales suggest that imperfect decision-making can result in mass movement of individuals between locations from one breeding season to the next (i.e., site-switching cascades). The movement can take any number of forms: density-independent or density-dependent dispersal, random walks, or diffusion processes and does not require that dispersal among sites necessarily be fitness-based.

Simulations (Russell and Rosales 2010) show that colonies can often develop through an interaction between settlers’ being attracted to sites that are starting to fill up and a component of randomness introduced by the occasional “mistakes” made by certain individuals in colonizing a previously unoccupied (and thus presumably suboptimal) site. One or two such mistakes early in the site colonization period can cause an almost-empty site to gain a small colony that then quickly begins to attract additional settlers, especially when such sites are in fact suitable (Kildaw et al. 2005; Calabuig et al. 2010). As the tendency of individuals to actively choose sites (based on undetermined but presumably imperfect cues) increases, and when sites have unequal capacities to accommodate settlers, colony dynamics resemble real-life populations (Brown et al. 2013). Some sites will have large numbers of birds and are used perennially, others have none, and still others show wide oscillations in size as individuals switch en masse between sites among years (Russell and Rosales 2010). Like other theoretical studies (Johst and Brandl 1997; Giraldeau et al. 2002; Matthiopoulos et al. 2005; Schwager 2005), Russell and Rosales’ (2010) results show how colony-size variation can be generated, and if settlement decisions do not correlate with fitness (because of imperfect information), they may explain the colony-size paradox.

Another advantage of Russell and Rosales’ (2010) approach to studying colony dynamics is that it permits specifying how purely random settlement would influence variation in colony size. Ecologists increasingly recognize that habitat occupancy can sometimes be described by a largely stochastic settlement process (Haila et al. 1996; Campbell et al. 2010), and at the very least, this provides a convenient “null” model to which to compare observed distributions. Colony-size variation in the spider, Nephila clavipes, was thought to reflect a primarily stochastic distribution of individuals in appropriate habitat, based on a fit to a zero-truncated Poisson distribution of colony sizes (Farr 1977). However, other (simulation-based) models of purely random settlement (without use of information-based cues or site-switching cascades) led to colony sizes being similar among sites, and the only variation in occupancy was small and attributable to white noise (Russell and Rosales 2010). Because variation is more extreme in most colonial animals (Brown et al. 1990, 2013; Avilés 1997; Moss et al. 2002; Jovani et al. 2008a; Griesser et al. 2011), it seems unlikely that colony-size distributions in most species reflect purely stochastic movement among sites.
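The contrast between attraction-driven and purely random settlement can be illustrated with a toy simulation. This is a minimal sketch and not a reimplementation of Russell and Rosales' (2010) model: the settlement rule, the parameter names (`error_rate`, `capacities`), and the use of the coefficient of variation as a summary are all assumptions made for illustration.

```python
import random


def simulate_settlement(n_settlers, capacities, error_rate, seed=0):
    """Settle individuals one at a time among sites with fixed capacities.

    With probability `error_rate` a settler ignores conspecifics and picks any
    open site at random (a "mistake"); otherwise it picks among occupied open
    sites with probability proportional to the number already present.
    Setting error_rate=1.0 gives purely random settlement.
    """
    rng = random.Random(seed)
    sizes = [0] * len(capacities)
    for _ in range(n_settlers):
        open_sites = [i for i, cap in enumerate(capacities) if sizes[i] < cap]
        if not open_sites:
            break  # every site is full
        occupied = [i for i in open_sites if sizes[i] > 0]
        if not occupied or rng.random() < error_rate:
            choice = rng.choice(open_sites)
        else:
            weights = [sizes[i] for i in occupied]
            choice = rng.choices(occupied, weights=weights, k=1)[0]
        sizes[choice] += 1
    return sizes


def coefficient_of_variation(sizes):
    """Standard deviation of colony size divided by mean colony size."""
    mean = sum(sizes) / len(sizes)
    if mean == 0:
        return 0.0
    variance = sum((s - mean) ** 2 for s in sizes) / len(sizes)
    return variance ** 0.5 / mean


if __name__ == "__main__":
    attracted = simulate_settlement(500, [200] * 20, error_rate=0.02, seed=1)
    random_only = simulate_settlement(500, [200] * 20, error_rate=1.0, seed=1)
    print(sorted(attracted, reverse=True), coefficient_of_variation(attracted))
    print(sorted(random_only, reverse=True), coefficient_of_variation(random_only))
```

In this sketch, random settlement spreads individuals fairly evenly among sites, whereas attraction combined with occasional mistakes concentrates them in a few sites and leaves others empty.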

Describing colony-size variation

Empirical colony-size distributions themselves can often provide some insight into ecological conditions associated with particular colony sizes. For example, some simulations suggest that predation can select for fewer small groups and more large groups than might be expected at random (Cipriani and Jaffe 2005), while others find the reverse (de Cara et al. 2002). However, quantitative analysis of colony-size distributions is complicated by the fact that the distribution in many species is heavily right-skewed when plotted as traditional histograms, with relatively few of the largest colonies creating a long tail (Jovani and Tella 2007; Jovani et al. 2008a, b). This can obscure whether there are threshold colony sizes, above or below which the distributions consist of more or fewer colonies than might be expected given certain assumptions. A solution to this problem is to fit single versus truncated power laws to colony-size distributions in order to identify the cutoff (or threshold) colony size, the point at which we would predict that associated ecological conditions, environmental selection pressures, or other density-dependent processes might suddenly change (Bonabeau et al. 1999; Sjöberg et al. 2000).

Truncated power laws are a type of logarithmic scaling that consists of two separate power functions with different slopes that describe portions of the colony-size distribution. They have been applied to a wide array of animal groups, fit empirical group size distributions fairly well (Sjöberg et al. 2000; Lusseau et al. 2004; Jovani et al. 2008a), and can identify cutoff (threshold) colony sizes that may be biologically important (de Cara et al. 2002; Jovani et al. 2008a; Ma et al. 2011). In lesser kestrels, truncated power laws that were best fits to the yearly distributions of colony sizes suggested that large colonies declined first and most precipitously in a population undergoing instability as a result of a reduced food supply, and that colonies of 10 nests in size represented a threshold at which despotic behavior by residents began to regulate colony size, resulting in a markedly reduced frequency of colonies larger than that 10-nest size (Jovani et al. 2008b).

As an example of this approach (Fig. 1), truncated power laws were fit to the colony-size distributions for cliff swallows at my Nebraska study site. A colony size of 1000 nests was identified as the cutoff size, at which the two power laws differed in slopes. The extent to which the truncated power law was a better fit to the data than a single power function was evaluated statistically (e.g., with the Akaike Information Criterion). In 22 of 30 years, the truncated power law with a cutoff colony size of 1000 nests was a better fit than one that used the same function for all colony sizes (Fig. 1a). In only a few years (Fig. 1b) was the truncated power law not a better fit. This sort of analysis—especially when repeated for multiple years—thus indicates that, in general, something begins to constrain formation of colonies >1000 nests in most seasons. Fewer such large colonies occur than we might expect based on the frequency distributions for the smaller colonies. In cliff swallows, this probably reflects physical limitations on the number of colony sites with a local food supply that can support so many birds and their offspring (Brown et al. 2013). Such a threshold colony size would be difficult to identify from traditional frequency histograms without using a truncated power law.

Fig. 1

Examples of fitting a truncated power law to the observed distribution of colony sizes in Nebraska cliff swallows in 2 years, (a) 1986 and (b) 1992. Multiplicative bins of the form $(X^{n}, X^{n+1} - 1)$, with $n = 0, 1, 2, 3, \ldots$ and $X = 2$, were used along the x-axis to create log-log plots. The y-axis of the log-log plots depicts the number of colonies divided by the bin size, and the x-axis depicts the midpoint of each colony-size bin, $10^{(\log(\text{minimum colony size of the bin}) + \log(\text{maximum colony size of the bin}))/2}$ (Jovani et al. 2008a). Twelve bins were used, and the cutoff colony size (across all years combined) was established as approximately 1000 nests following the methods of Sjöberg et al. (2000). Least-squares regression was fit to the entire log-log plot (a single power function) and as separate regressions to the bins on each side of the cutoff colony size (the truncated power law function). In (a), the truncated power law was a better fit (as evaluated with the Akaike Information Criterion), indicating that proportionately fewer colonies >1000 nests existed, while in (b), the truncated power law was not a better fit, indicating either a general lack of colonies >1000 nests or that such colonies occurred at the same frequency as might be expected from the entire distribution
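
The procedure described in the caption of Fig. 1 can be sketched roughly in Python as follows. This is a minimal illustration under stated assumptions, not the analysis actually used for the cliff swallow data: the function names are hypothetical, "bin size" is taken to mean the number of integer colony sizes in a bin, empty bins are simply skipped, the cutoff is treated as fixed in advance, and the Akaike Information Criterion is computed in its least-squares form with 2 parameters for the single power function and 4 for the truncated one.

```python
import math


def log_binned_density(colony_sizes, base=2, n_bins=12):
    """Bin colony sizes into [base^n, base^(n+1) - 1]; return (midpoint, density) pairs."""
    points = []
    for n in range(n_bins):
        lo, hi = base ** n, base ** (n + 1) - 1
        count = sum(1 for size in colony_sizes if lo <= size <= hi)
        if count == 0:
            continue  # assumption: empty bins are dropped (log of 0 is undefined)
        width = hi - lo + 1  # assumption: bin size = number of integer sizes in the bin
        midpoint = 10 ** ((math.log10(lo) + math.log10(hi)) / 2)  # geometric midpoint
        points.append((midpoint, count / width))
    return points


def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, residual sum of squares)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss


def aic(rss, n, k):
    """Least-squares AIC, n * ln(RSS / n) + 2k; RSS is floored to avoid log(0)."""
    return n * math.log(max(rss, 1e-12) / n) + 2 * k


def compare_fits(points, cutoff):
    """Compare one power function with two power functions split at `cutoff`."""
    log_x = [math.log10(x) for x, _ in points]
    log_y = [math.log10(y) for _, y in points]
    n = len(points)
    _, _, rss_single = fit_line(log_x, log_y)
    log_cutoff = math.log10(cutoff)
    below = [(x, y) for x, y in zip(log_x, log_y) if x <= log_cutoff]
    above = [(x, y) for x, y in zip(log_x, log_y) if x > log_cutoff]
    if len(below) < 2 or len(above) < 2:
        raise ValueError("need at least two bins on each side of the cutoff")
    slope_below, _, rss_below = fit_line(*zip(*below))
    slope_above, _, rss_above = fit_line(*zip(*above))
    return {
        "aic_single": aic(rss_single, n, 2),
        "aic_truncated": aic(rss_below + rss_above, n, 4),
        "slope_below": slope_below,
        "slope_above": slope_above,
    }
```

Given a list of colony sizes for one year, `compare_fits(log_binned_density(sizes), cutoff=1000)` returns the two criterion values and the two slopes. A lower value for the truncated fit would correspond to the situation in panel (a), and a lower or equal value for the single fit to the situation in panel (b).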

In addition, Jovani and Mavor (2011) point out that colony-size distributions per se may not accurately describe the distribution of individuals among group sizes. In some species, relatively few enormous colonies may contain a very large fraction of the total individuals within the population. Because selection operates on individuals and not on colonies, the colony-size distribution itself may not provide the best index of the degree to which particular colony-size-specific selection pressures are acting in a population as a whole. We perhaps should move toward always reporting the number of individuals in colonies of different sizes in addition to the colony-size distribution per se (Jovani and Mavor 2011; Brown et al. 2013).
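
The difference between the two ways of summarizing the same data can be made concrete with a short calculation. The Python sketch below is illustrative only: the function name and the use of multiplicative size classes are assumptions, and the number of nests is used as a stand-in for the number of individuals.

```python
def share_by_size_class(colony_sizes, base=2):
    """Return (lo, hi, fraction of colonies, fraction of nests) per size class.

    Size classes are multiplicative bins [base^n, base^(n+1) - 1]; this choice
    is an assumption made for illustration.
    """
    if any(size < 1 for size in colony_sizes):
        raise ValueError("colony sizes must be at least 1")
    total_colonies = len(colony_sizes)
    total_nests = sum(colony_sizes)
    classes = {}
    for size in colony_sizes:
        n = 0
        while base ** (n + 1) <= size:  # find n with base^n <= size < base^(n+1)
            n += 1
        colonies, nests = classes.get(n, (0, 0))
        classes[n] = (colonies + 1, nests + size)
    return [
        (base ** n, base ** (n + 1) - 1, c / total_colonies, s / total_nests)
        for n, (c, s) in sorted(classes.items())
    ]
```

For a hypothetical population of nine colonies of 10 nests and one colony of 910 nests, the small size class holds 90% of the colonies but only 9% of the nests, whereas the single large colony holds the remaining 91%.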

Conclusions and prospectus

Variation in colony size remains poorly understood. Despite the recognition that individuals’ fitness often varies widely among different colony sizes, few satisfactory explanations exist for why these fitness differences occur, and thus how colony-size variation is maintained over the long term remains puzzling. In general, we lack the data to test the major hypotheses. Furthermore, most of the research explicitly addressing colony-size variation has been done only on a few colonial spiders and a handful of bird species.

These limitations notwithstanding, a few generalizations are possible. Colonies of different sizes often differ in the phenotypic composition of their residents and do not represent homogeneous subsets of the population. Thus, not all individuals are equally likely to settle in colonies of a given size. There is enough indirect evidence on yearly variability in fitness components of colonial animals to suggest temporally fluctuating directional or stabilizing selection on colony size as a viable hypothesis to explain size variation, especially given apparent heritability of colony-size choice in some species. Imperfect information on site quality and costs of dispersing between sites can (theoretically) cause collective movements of individuals among colony sites within a metapopulation and in that way generate variation in colony size independent of any fitness associations with colony size. Fitting power laws to colony-size distributions may reveal at what colony size(s) the biology of the organisms changes in a major way, providing clues to where empirical investigations should focus.

Variation in colony size provides behavioral ecologists with opportunities for investigating numerous poorly addressed or unanswered questions, and the results will be important not only to understanding colony-size variation but also to broader behavioral, ecological, and evolutionary issues. I highlight the following areas that seem most fruitful.

1. How heritable is colony-size choice among colonial animals? While studies of a few bird species suggest moderate heritability of group size preference (Brown and Brown 2000; Møller 2002; Serrano and Tella 2007; Roche et al. 2011), the genetic basis of colony-size choice is unstudied for the vast majority of species. If heritability of size choice is widespread, it will suggest that complex social behavior can be genetically influenced (e.g., Krieger and Ross 2001; Charmantier et al. 2007) and thus subject to selection and potentially rapid evolution.

2. To what extent does colony-size variation reflect heterogeneity in resource availability? Although colony-size variation is often assumed to reflect such heterogeneity, little empirical evidence shows that this is the case. Addressing this question can not only provide insights into the evolution of coloniality in response to resource patchiness and into why colony-size variation might occur, but also assist in identifying the critical habitat (e.g., that which will reliably support the largest colonies) important for long-term persistence of threatened colonial species (Cook and Toft 2005; Rodríguez et al. 2006; Di Maggio et al. 2016).

3. On the basis of what phenotypic traits do colonial birds sort among colony sizes? While sorting does seem to occur in a variety of species, this question has been addressed systematically in only a few of them. Determining how phenotypic traits in general associate with group size will offer insight into how selection may vary with the social or ecological environment, and perhaps more importantly, suggest mechanisms for the maintenance of phenotypic variation within populations. Colony size may provide a framework for studying the persistence of different behavioral syndromes or personalities in populations, an area currently of intense interest among behavioral biologists (Dall et al. 2004; Sih et al. 2004; Réale et al. 2007; Martins et al. 2012).

4. How does temporal or spatial variation in life-history components such as survival and fecundity interact with colony size to alternately favor different colony sizes through fluctuating selection? Long-term studies, in which sufficient yearly data are collected to estimate within-season selection on colony size (Brown et al. 2016), will reveal whether this promising hypothesis can explain the variation in colony size that remains stable over long time periods. In addition, analyses of this sort will address whether fluctuating selection can maintain long-term stasis in certain traits, a currently controversial question in evolutionary biology (Siepielski et al. 2009, 2011; Bell 2010; Kingsolver et al. 2012; Morrissey and Hadfield 2012).

5. What can colony-size distributions themselves tell us about population-level phenomena such as dispersal, philopatry, and decision-making? Theoretical models now enable us to use the observed distribution of colony sizes in a population to make inferences about the extent of dispersal and philopatry within metapopulations, the extent to which animals base their movement decisions on the behavior of others or simply move randomly in space, and the threshold colony sizes at which different population-level processes (e.g., density dependence) change. Understanding factors causing individuals to remain in (i.e., not disperse from) colonies at sites that may no longer be suitable (Kenyon et al. 2007; Schippers et al. 2011) may provide insight into population declines in some threatened species. Greater attention thus should be given to existing colony-size distributions (Jovani et al. 2008a, 2016) and to analyzing them in ways that provide the foundation for more in-depth behavioral or ecological studies on colonial animals.

Studies of colony-size variation in general may benefit from a greater use of experiments. Only a few studies have manipulated colony size in the field and measured fitness-related or behavioral responses (Brown 1988; Minias et al. 2015). While there may be ethical concerns in preventing animals from settling or in reducing a colony’s size by destroying large numbers of nests (especially in vertebrates), some colonial species are amenable to living in captivity, and in these, colony size can be manipulated. For example, colonies of spiders can be brought into the laboratory and changed to particular sizes, and individual responses to different colony sizes can then be measured (Pruitt and Riechert 2011; Grinsted et al. 2013; Keiser and Pruitt 2014). In species in which colony size cannot be manipulated, it may be possible to modify local resources (e.g., by changing food availability; Rypstra 1985; Smith 1985; Spottiswoode 2009) or to remove certain individuals (Pruitt and Pinter-Wollman 2015), and observe how colony size, composition, or performance varies in response. Many opportunities exist for developing creative new approaches to studying colony-size variation.