Introduction

Behavioural flexibility is the ability of an organism to directly respond and adjust its behaviour to environmental circumstances (Coppens et al. 2010; Wright et al. 2010), which provides animals with a crucial tool for adapting to changing environments (Sol and Lefebvre 2000; Reader and Laland 2003). For example, species that frequently adopt novel foraging behaviours (i.e. feeding innovations) in the wild have been found more successful at establishing themselves in new environments (Sol and Lefebvre 2000; Sol et al. 2002, 2005), and individuals from an invasive population showed more willingness to taste unfamiliar food than conspecifics from a resident population (Martin and Fitzgerald 2005). Cities represent one type of environment where animals may face novel situations particularly often, as these heavily fragmented built-up areas expose animals to several kinds of pollution, unnatural lighting and noise, disturbance by domestic animals and humans and potential new resources such as garbage (reviewed by Sol et al. 2013). These habitat characteristics may make innovative and explorative tendencies advantageous when their benefits outweigh their costs (Sih 2013; Brosnan and Hopper 2014); thus, urbanization is hypothesized to select for such traits via genetic adaptations and/or ontogenetic plasticity (Carrete and Tella 2011; Maklakov et al. 2011; Sih 2013; Sol et al. 2013).

The evaluation of this simple idea is not easy because authors often refer to different phenomena when studying behavioural flexibility. For example, responses to novelty such as tasting novel foods are sometimes treated as an expression of behavioural flexibility (e.g. Martin and Fitzgerald 2005) and sometimes as a distinct, although related phenomenon (e.g. Greenberg 2003). Also, while animal innovations are often considered an expression of behavioural flexibility at the interspecific level (Reader and Laland 2003), some authors argue that at the intraspecific level, behavioural flexibility and innovation are two partially different (Ramsey et al. 2007) or even opposing aspects (Griffin et al. 2013a). Here, we consider both novelty responses and innovations as important components of behavioural flexibility, and we adhere to the broadly accepted definition that animal innovation is the acquisition of new, learned behaviours (Reader and Laland 2003) that can be reliably studied by problem-solving tests (Griffin and Guez 2014). Studies up to now have yielded controversial results about the relationship between urbanization and these aspects of behavioural flexibility.

First, several interspecific comparative studies investigated general measures of behavioural flexibility, such as the number of foraging innovations reported in birds. Whereas one study found that urban species have higher rates of innovation (Møller 2009), another found no such relationship in a different set of species (Kark et al. 2007). Although several studies showed that species with larger brains exhibit a higher frequency of foraging innovations in birds and primates (Lefebvre et al. 1997; Reader and Laland 2002; Sol et al. 2005; Overington et al. 2009) and higher introduction success (birds, Sol et al. 2005, 2012b; mammals, Sol et al. 2008; reptiles, Amiel et al. 2011), the relationship between brain size and urbanization is ambiguous. Some findings suggested that large brains are associated with successful adaptation to urbanization (Carrete and Tella 2011; Maklakov et al. 2011), while others found no consistent relationship between brain size and the rate of urbanization either at the interspecific level in birds (Kark et al. 2007; Evans et al. 2011) or within species in mammals (Snell-Rood and Wick 2013).

Second, a handful of experimental studies focused on various aspects of behavioural flexibility in relation to urbanization. One such aspect is a “taste for novelty” (Martin and Fitzgerald 2005) which is expected to be adaptive for urban animals, but results regarding this behavioural trait are also mixed. For instance, in an invasive Australian bird, the common myna (Acridotheres tristis), urban individuals were less fearful of novel objects than suburban individuals, but they were similarly neophobic towards novel food (Sol et al. 2011). In studies of non-invasive populations of house sparrows (Passer domesticus) and blackbirds (Turdus merula), urban individuals tended to show either similar or higher neophobia and/or lower neophilia than rural conspecifics (Echeverría and Vassallo 2008; Liker and Bókony 2009; Bókony et al. 2012a; Miranda et al. 2013).

Another crucial component of behavioural flexibility is the animals’ ability for problem solving such as the application of novel foraging techniques; diversity of the latter is strongly related to interspecific variation in brain size (Overington et al. 2009). Problem solving can help animals to access novel food sources (Lefebvre 1995) and to modify their environment in preferable ways, e.g. to increase the attractiveness of a display site (Keagy et al. 2009) or reduce parasite load (Suárez-Rodríguez et al. 2013). We are aware of only two studies in which problem solving was compared between conspecifics from differently urbanized populations, and both cases indicated that individuals from more urbanized habitats were more successful (Liker and Bókony 2009; Sol et al. 2011). However, within one suburban population of zenaida doves (Zenaida aurita), individuals on territories with higher human disturbance tended to be slower at learning a problem-solving task (Boogert et al. 2010), hinting that some aspects of urbanization might hinder animal innovations (rural birds were not tested in this study).

These findings suggest that there is no simple general relationship between behavioural flexibility and urbanization. In this study, we focus on problem solving and compare the performance of individuals from differently urbanized habitats in four different foraging tasks. Since previous experimental studies were restricted to a relatively narrow spectrum of the urbanization gradient (suburban-urban and suburban-rural comparisons; Liker and Bókony 2009; Sol et al. 2011), here we aimed to sample the entire range of habitats occupied by a species, focusing on the two extremes (i.e. least and most urbanized). We used the house sparrow as subject, a bird with a long history of commensalism with humans that can be found worldwide from built-up, busy city centres to remote, semi-natural rural sites (Anderson 2006). House sparrows are among the species with the highest number of foraging innovations reported in wild birds; the number of foraging innovations described for this species is about ten times as high as the average number in the Passeridae family (see supplementary material of Overington et al. 2009). For example, sparrows are able to learn to open automatic doors by hovering in front of the sensor to get access to restaurants (Overington et al. 2009), and they can also learn to solve novel food-extracting tasks in captivity (Liker and Bókony 2009; Bókony et al. 2014). The present study is part of a more general project that explores the causes and consequences of behavioural flexibility in birds (Bókony et al. 2014; Vincze et al. 2014); here, we use the problem-solving data of the same set of birds to analyze the relationship between urbanization and innovativeness. Specifically, we investigated two behavioural aspects involved in animal innovations: problem-solving success (i.e. coming up with a solution in a novel situation) and learning efficiency (i.e. the ability to repeat and improve performance based on previous problem-solving experiences).
We also tested whether repeatability of performance differed between urban and rural individuals to see if the former are consistently better than the latter.

Methods

Captures and housing of birds and protocols of the problem-solving tests were detailed by Bókony et al. (2014); however, we describe these procedures here as well to provide a readily accessible explanation of the methods.

Capturing and housing of birds

We captured 104 house sparrows (50 males and 54 females) with mist nets (Ecotone, Gdynia, Poland) from differently urbanized habitats in Hungary from January to March in 2012, including farms or edges of small villages (rural sites) and densely built inner city sites (urban sites; see Supplementary Table 1). Over 8 weeks, we captured 10–14 birds in total each week from two sites of the same habitat type, alternating urban sites and rural sites weekly. Each site was involved in capturing birds only once, except one rural site where we captured nine and four birds on two occasions, respectively (Supplementary Table 1). At capture, we measured each bird’s body mass (with a Pesola spring balance with an accuracy of 0.1 g) and tarsus length (with a dial calliper with an accuracy of 0.1 mm) and ringed them with a numbered metal ring.

Table 1 The effects of urbanization and other traits on problem-solving latencies in four foraging tasks

Birds were housed indoors in individual metal cages (53 × 27 × 41 cm), each equipped with two perches and a shelter (a vertical plastic sheet hanging from the top of the cage that hid one of the perches). They were provided ad libitum food (wheat, millet, oat and sunflower seeds) and tap water supplemented by vitamin droplets throughout the study except for the duration of problem-solving tests and preceding fasting periods. Individual housing and visual separation during tests were necessary for assessing individual performance (see below); however, birds were always in acoustic contact with each other. Birds were released after the study in weekly groups either at the site of capture or in a village near Veszprém. Sample size varied between tests (Table 1) because seven birds (6.7 %) died during the procedures described here, mostly within a few days after capture. This rate of mortality is within the range (1.7–38.5 %) reported by other studies of captive sparrows (e.g. Gonzalez et al. 2001; Poston et al. 2005; Pap et al. 2008, 2011; Liker and Bókony 2009; Bókony et al. 2012a). Although we did not see any sign of disease, birds that died might have been physiologically stressed already at capture as indicated by haematological measures (Bókony et al. 2014).

Problem-solving tests

After 1–2 days of habituation, birds participated in four problem-solving tests as follows. On days 3–5, birds were deprived of food each morning between 8:00 and 9:30, and then, they were presented with a food-extracting task (i.e. closed feeder) between 9:30 and 11:00. Each task involved a different feeder; each feeder was placed open into the cage as the sole source of food during the day preceding the respective test to familiarize the birds with each feeder (i.e. from 11:00 till next morning 8:00). Our observations suggested that birds readily used the feeders during these habituation periods. The first feeder (Fig. 1a) was an 8.5 × 8.5 × 2.5-cm transparent plastic box with a 2.5-cm hole on the top; this hole was uncovered during familiarization but covered by a transparent plastic card, fixed by two toothpicks, during the test. To reach the seeds, birds had to pull out one or both sticks and toss the card away, or pull the card upwards until it came off the sticks. The second feeder (Fig. 1b) was a 7.5-cm diameter, 3.5-cm-high transparent plastic dish that was covered by white bakery paper on the top, fixed by sticky tape on the sides, during the second test. Birds had to pierce the paper to reach the food. The third feeder (Fig. 1c) was an 11-cm-high commercial bird feeder with a slot cut into it at about 8-cm height; during the third test, a small transparent plastic card was placed into this slot to keep the seeds from falling down. To have the seeds flow out, birds had to remove the card by pulling it out with their beak; some birds achieved this by heavily shaking the feeder.

Fig. 1 Feeders used in the problem-solving tasks. a Feeder 1 closed on the left, open on the right. b Feeder 2, opened by two different individuals. c Feeder 3 closed on the left, open on the right. d Feeder 4 closed on the left, open on the right

After the third test, birds were weighed and placed into another room with similar housing conditions as before, where they were allowed to habituate for 2 days. They received food from the fourth feeder (Fig. 1d), an 8.5 × 8.5 × 2.5-cm white plastic box with one transparent side and a lid on the top; the lid was held open during habituation. On days 8–10, we used this feeder to present a subset of birds (N = 72) with the fourth task repeatedly in ten trials (the rest of the birds were not tested in this task because they participated in another experiment). During the trials, the lid was closed, and birds had to insert their beak and head under the lid and push it up to reach the food. In contrast to the first three tasks, feeder 4 did not remain open after the bird first fed from it; instead, it had to be opened every time to peck a seed. There were four 30-min trials every day (excepting the last day) between 9:00 and 16:00, each preceded by 60-min fasting. After each trial, all birds were allowed to feed freely for 15 min by fixing the lid again in open position. After the 10th trial, the birds received nine more trials to ensure that most of them learned to use the feeder for another experiment (BP et al. unpubl. data); however, we do not report analyses of these latter trials here because their protocol was not the same for all 72 birds (note that including these nine later trials in the analyses does not alter our conclusions). After the last trial, we measured each bird’s body mass for the third time.

The four food-extracting tasks were presented in the same order to all birds (Sol et al. 2011; Bókony et al. 2012a). During all tests and fasting periods, birds were visually isolated from one another by opaque plastic sheets to prevent social learning. The birds were observed during the problem-solving tests through a one-way window. The observer scanned the birds every 3 min and recorded their behaviour as one of the following categories: hiding behind the shelter, resting non-hidden in the cage, hopping, flapping, staying on or next to the feeder, attempting to solve the problem (i.e. manipulating the feeder with the beak), feeding, drinking, preening, engaging in stereotypical movements (this occurred in only 3 % of behavioural records). From the behavioural observations for each bird in each test, we calculated the proportion of time spent in shelter, as the number of records in which the bird was hiding divided by the number of all behavioural records for that bird in that test; and the frequency of attempts as the number of records in which the bird was manipulating the feeder divided by the number of all behavioural records until the bird started to feed (expressed as percentages). The first three tasks were also recorded by videocamera; from these recordings, we measured the latency to feed as the time (in seconds) elapsed from the bird leaving the shelter after the start of test until it began to feed.
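The two scan-sampling measures defined above can be illustrated with a minimal Python sketch. The function names, the behaviour labels and the example record list are hypothetical and serve illustration only. The sketch also assumes that "records until the bird started to feed" excludes the first feeding record itself, which the text does not specify.

```python
def shelter_proportion(records):
    """Percentage of all scan records in which the bird was hiding."""
    if not records:
        raise ValueError("no behavioural records")
    return 100.0 * sum(r == "hiding" for r in records) / len(records)


def attempt_frequency(records):
    """Percentage of pre-feeding scan records in which the bird manipulated the feeder."""
    # Assumption: keep records up to, but not including, the first "feeding" record.
    if "feeding" in records:
        records = records[: records.index("feeding")]
    if not records:
        raise ValueError("no records before feeding")
    return 100.0 * sum(r == "attempting" for r in records) / len(records)


# Hypothetical scan sample for one bird in one test (one record every 3 min).
scans = ["hiding", "hiding", "hopping", "attempting",
         "attempting", "feeding", "feeding", "resting"]
print(shelter_proportion(scans))  # 25.0
print(attempt_frequency(scans))   # 40.0
```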

We expressed problem-solving latency in tasks 1–3 as the latency to feed from leaving the shelter, which usually happened soon after the test start (median for the three tasks, 1.17 min). However, we repeated the analyses substituting solving time for problem-solving latency, i.e. the time between the first attempt to open the feeder (median 6.78 min; in tasks 1–3, 2, 16 and 4 birds, respectively, did not attempt to solve at all) and the subsequent successful solution of the task (Supplementary Table 2). In task 4, we expressed problem-solving latency as the number of trials (i.e. the number of 30-min test periods) needed to first open the feeder. Following Keagy et al. (2011), we calculated the rank of problem-solving latency of all individuals in each task; then, we averaged these ranks over the four tests for each individual (N = 72) to quantify their mean success relative to other birds (hereafter “overall performance”). During rank calculations, unsuccessful birds were assigned the worst rank in each task.
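As an illustration of the rank-based "overall performance" measure, the following Python sketch ranks latencies within each task and averages the ranks per bird. The function names and the example latencies are hypothetical. How ties among solvers were handled is not stated in the text, so the sketch assumes that tied solvers share the average of the ranks they span and that all unsuccessful birds receive a rank equal to the number of birds.

```python
def rank_latencies(latencies):
    """Rank latencies within one task (1 = fastest).

    `None` marks a bird that did not solve the task; every such bird gets
    the worst rank (the number of birds). Tied solvers share the average
    of the ranks they span (an assumption of this sketch).
    """
    n = len(latencies)
    solved = sorted(v for v in latencies if v is not None)
    ranks = []
    for v in latencies:
        if v is None:
            ranks.append(float(n))
        else:
            first = solved.index(v) + 1   # first rank position of this value
            count = solved.count(v)       # number of solvers tied at this value
            ranks.append(first + (count - 1) / 2)
    return ranks


def overall_performance(tasks):
    """Mean rank per bird; `tasks` is a list of per-task latency lists."""
    per_task = [rank_latencies(t) for t in tasks]
    return [sum(r) / len(r) for r in zip(*per_task)]


# Hypothetical latencies for three birds in two tasks (None = unsolved).
tasks = [[120, 300, None], [None, 45, 45]]
print(overall_performance(tasks))  # [2.0, 1.75, 2.25]
```

Lower mean ranks indicate better overall performance under this convention.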

Urbanization score

Urbanization of the capture sites was quantified based on four habitat features: building density, vegetation cover, presence of roads and human population density. First, the digital aerial photograph of each site was scored by a single observer (SP) using the method of Liker et al. (2008). We divided a 1-km² area around the site of capture into 10 × 10 cells, and each cell was assigned a score for the cover of vegetation (0, absent; 1, <50 %; 2, >50 %), density of buildings (0, absent; 1, <50 %; 2, >50 %) and presence of paved roads (0, absent; 1, present). From these cell scores, we calculated five habitat characteristics for each site (mean vegetation density, mean building density, number of cells with roads, number of cells with high density of vegetation and buildings, respectively). Then, following Bókony et al. (2010), we collected data on the density of residential human population for each settlement from the Hungarian Central Statistical Office. For the two sites in Budapest, we used the data for the respective districts of the capital, whereas for farm sites, we ascertained population density by either asking the residents or consulting the farms’ website. Then, we included the above five habitat characteristics and human population density in a principal component analysis (PCA), which resulted in a single axis with an eigenvalue >1 that explained 92.75 % of total variance and correlated strongly negatively with mean vegetation density (r = −0.98) and number of cells with high vegetation density (r = −0.99), and strongly positively with mean building density (r = 0.98), number of cells with high building density (r = 0.98), number of cells with roads (r = 0.98) and human population density (r = 0.86). We used the PCA scores along this first axis as urbanization score for each site (Supplementary Table 1).
For some analyses, we divided the capture sites as “urban” (positive urbanization score) and “rural” (negative urbanization score); these categories matched the administrative titles of all but one settlement in our sample (site Ajka is located in the suburbs of a small town and received a negative urbanization score close to zero). In addition, we repeated all analyses using this latter urban-rural categorization which yielded the same conclusions as those that used urbanization scores (results available from the authors on request).
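The step from cell scores to the five habitat characteristics can be sketched as follows. This is a minimal Python illustration with hypothetical function and key names and invented cell scores; it assumes that "high density" corresponds to a cell score of 2. The subsequent PCA is not reproduced here.

```python
def habitat_characteristics(vegetation, buildings, roads):
    """Summarize per-cell scores of one site into five habitat characteristics.

    vegetation, buildings: lists of scores 0/1/2 (one per cell)
    roads: list of scores 0/1 (one per cell)
    """
    n = len(vegetation)
    if n == 0 or not (n == len(buildings) == len(roads)):
        raise ValueError("score lists must be non-empty and of equal length")
    return {
        "mean_vegetation_density": sum(vegetation) / n,
        "mean_building_density": sum(buildings) / n,
        "cells_with_roads": sum(1 for r in roads if r == 1),
        # Assumption: "high density" means a cell score of 2 (>50 % cover).
        "cells_high_vegetation": sum(1 for v in vegetation if v == 2),
        "cells_high_buildings": sum(1 for b in buildings if b == 2),
    }


# Hypothetical site with 10 x 10 = 100 cells.
vegetation = [2] * 60 + [1] * 30 + [0] * 10
buildings = [0] * 60 + [1] * 25 + [2] * 15
roads = [1] * 40 + [0] * 60
site = habitat_characteristics(vegetation, buildings, roads)
print(site["mean_vegetation_density"], site["cells_with_roads"])  # 1.5 40
```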

Statistical analyses

To infer whether our sample of house sparrows was representative of the urbanization gradient, we tested whether birds from more urbanized sites had lower body mass, as shown repeatedly for house sparrows (Liker et al. 2008; Bókony et al. 2012b). We used a linear mixed effects (LME) model including body mass at capture as dependent variable, urbanization score, date and time of capture as covariates, sex of the bird and identity of measurer as fixed factors and capture site as random factor. We used similar models to test if body condition varied with urbanization at any time during the study. To quantify body condition as body mass relative to body size, we calculated the scaled mass index following Bókony et al. (2012b) as body mass × (19/tarsus length)^1.71, henceforth referred to as body mass index.
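For concreteness, the scaled mass index can be computed as in the short Python sketch below. The function and parameter names are hypothetical; the constants 19 and 1.71 are those given in the text, and reading them as a reference tarsus length (mm) and a scaling exponent is our interpretation.

```python
def scaled_mass_index(body_mass_g, tarsus_mm, ref_tarsus_mm=19.0, exponent=1.71):
    """Body mass standardized to a reference tarsus length.

    Implements body mass x (19 / tarsus length) ** 1.71 with the
    constants exposed as parameters (their names are hypothetical).
    """
    if tarsus_mm <= 0:
        raise ValueError("tarsus length must be positive")
    return body_mass_g * (ref_tarsus_mm / tarsus_mm) ** exponent


# A bird whose tarsus equals the reference length keeps its measured mass.
print(scaled_mass_index(28.0, 19.0))  # 28.0
```

Birds with a tarsus shorter than 19 mm receive an index above their measured mass, and birds with a longer tarsus an index below it.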

We tested whether individual performance was repeatable across different tasks by calculating the intra-class correlation coefficient (ICC; Lessells and Boag 1987) for the four rank-transformed problem-solving latencies. To compare repeatability between urban and rural birds, we estimated the 84 % confidence interval (CI) of ICC for each group, since the absence of overlap between two 84 % CIs is equivalent to a 95 % CI around the difference not including zero (Julious 2004).
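The repeatability estimate can be sketched with the one-way ANOVA formulation of Lessells and Boag (1987), in which the among-individual variance component is divided by the total variance. The Python code below is a minimal illustration with a hypothetical function name and invented data; it does not compute the 84 % confidence intervals used for the group comparison.

```python
def repeatability(groups):
    """Intra-class correlation from a one-way ANOVA.

    `groups` holds one list of measurements (e.g. task ranks) per individual.
    """
    a = len(groups)                              # number of individuals
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total
    means = [sum(g) / len(g) for g in groups]
    ss_among = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_among = ss_among / (a - 1)
    ms_within = ss_within / (total - a)
    # n0 corrects for unequal group sizes; it equals n for balanced data.
    n0 = (total - sum(n * n for n in sizes) / total) / (a - 1)
    var_among = (ms_among - ms_within) / n0
    return var_among / (var_among + ms_within)


# Hypothetical ranks of three birds in two tasks each.
r = repeatability([[1, 2], [3, 4], [5, 6]])
print(round(r, 3))  # 0.882
```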

We analyzed the relationships of problem-solving latencies with urbanization in each task using Cox’s proportional hazards models, a non-parametric survival-analysis method (Sol et al. 2011). Birds that did not solve a task were assigned a value one unit higher than the maximal latency for the respective task (i.e. 5490 s in tasks 1–3, and 11 trials in task 4) and were treated as censored observations in the analyses. Initially, each model included the following predictor variables: urbanization score, sex, morphological measurements taken at capture, number of days spent in captivity before the first test, and frequency of attempts and proportion of time spent in shelter in the given test (for test 4, we used the average of values measured over the trials until the bird succeeded), and the two-way interaction terms of urbanization with the rest of the predictors. Because Cox’s analyses model the probability of not solving as a function of test time, positive parameter estimates mean shorter latencies (i.e. faster decrease of probability of not solving during the test) whereas negative parameter estimates mean longer latencies.
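Preparing the latencies for a survival analysis can be sketched as below. This Python fragment only illustrates the censoring rule and the sign convention described above; it does not fit a Cox model, and the function names and example data are hypothetical.

```python
import math


def to_survival_record(latency, censor_value):
    """Return (time, event) for one bird in one task.

    latency: observed latency, or None if the bird never solved the task
    censor_value: value assigned to non-solvers (5490 s in tasks 1-3,
                  11 trials in task 4, as stated in the text)
    event: 1 = solved (observed), 0 = not solved (censored)
    """
    if latency is None:
        return (censor_value, 0)
    if latency >= censor_value:
        raise ValueError("observed latency must be below the censoring value")
    return (latency, 1)


def hazard_ratio(coefficient):
    """exp(b): values above 1 mean faster solving, below 1 slower solving."""
    return math.exp(coefficient)


# Hypothetical task 4 data: number of trials needed to first open the feeder.
trials_to_solve = [1, 3, None, 10]
records = [to_survival_record(t, censor_value=11) for t in trials_to_solve]
print(records)  # [(1, 1), (3, 1), (11, 0), (10, 1)]
```

A positive parameter estimate thus corresponds to a hazard ratio above 1, i.e. a higher instantaneous probability of solving and a shorter latency.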

To analyze the relationship between urbanization and overall performance, we used an LME model with the mean rank of problem-solving latencies as dependent variable, capture site as random factor, and the following predictors: urbanization score, sex, morphological measurements taken at capture, number of days spent in captivity before the first test, average frequency of attempts and average proportion of time spent in shelter during all tests, and the two-way interaction terms of urbanization with the rest of the predictors.

To investigate the efficiency of learning, we compared the time needed to solve task 4 in the first successful trial with the time needed to solve it in the second successful trial. Since attempt latency (i.e. time elapsed from the test start until the bird first contacted the feeder) decreased significantly from the first to the second successful trial (paired t test t58 = 2.78, P = 0.007), to control for this effect, we calculated solving time as the time between the first attempt to open the feeder and the subsequent successful solution of the task (i.e. eating seed from the feeder) in each trial. The change in solving time from the first to the second successful trial was used as a proxy for learning efficiency, i.e. individuals that reduced their solving time to a greater extent were considered more effective in recalling and processing the information obtained during their first successful trial (Thornton and Samson 2012). We analyzed this variable using an LME model similar to the model used for overall performance (above). We did not investigate learning after the second successful trial because there was little variance from the third successful trial, i.e. most of the birds started to feed immediately after the start of the trials (for further details, see Bókony et al. 2014).
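The learning-efficiency proxy can be illustrated with the following Python sketch. The function names and the example times are hypothetical, and the sign convention (second minus first trial, so that negative values indicate faster solving) is an assumption of this sketch rather than something stated in the text.

```python
def solving_time(first_attempt_s, success_s):
    """Time from the first attempt to the successful solution within one trial."""
    if success_s < first_attempt_s:
        raise ValueError("success cannot precede the first attempt")
    return success_s - first_attempt_s


def learning_change(first_trial, second_trial):
    """Change in solving time from the first to the second successful trial.

    Each trial is a (first_attempt_s, success_s) pair measured from trial start.
    Assumption: negative values mean the bird solved faster the second time.
    """
    return solving_time(*second_trial) - solving_time(*first_trial)


# Hypothetical bird: 700 s solving time in the first successful trial,
# 200 s in the second.
print(learning_change((200, 900), (60, 260)))  # -500
```

Subtracting the time of the first attempt removes the attempt latency, so a bird that merely approaches the feeder sooner is not counted as having learned the task better.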

In the same set of birds, Bókony et al. (2014) found that problem-solving success and learning efficiency showed task-specific relationships with various physiological traits. To ensure that these physiological variables did not affect our conclusions about urbanization, we tested whether the results reported here on the effects of urbanization change qualitatively when the relevant physiological variables are taken into account in the models. We also ran LME models with capture site as random factor to test whether the physiological traits showed consistent variation along the urban gradient. Since we found no such effects (Supplementary Tables 3, 4 and 5), below we present all results on urbanization without including physiological variables because the latter were available only for various subsets of birds.

In all analyses (LME and Cox’s models), we eliminated the non-significant effects from the initial models stepwise, removing the variable or interaction associated with the largest P value in each step, and we report the final models that contain significant effects only (P < 0.05; note that we never omitted urbanization, nor its significant interactions if it had any). All statistical analyses were performed with R version 2.15.2 (R Development Core Team 2012), and the tests are two-tailed with 95 % confidence level.
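The stepwise procedure can be sketched generically as follows. This is a simplified Python illustration, not the R code used for the analyses: `fit` is a hypothetical callable that refits the model for a given list of terms and returns a P value per term, and the toy P values are invented. The sketch protects only the urbanization main effect and does not handle interaction terms.

```python
def backward_eliminate(terms, fit, protected=("urbanization",), alpha=0.05):
    """Drop the least significant removable term until none remains.

    terms: iterable of term names in the initial model
    fit: callable taking a list of terms and returning {term: P value}
    protected: terms that are never removed
    """
    terms = list(terms)
    while True:
        p_values = fit(terms)
        removable = {t: p for t, p in p_values.items()
                     if t not in protected and p >= alpha}
        if not removable:
            return terms
        # Remove the term with the largest P value, then refit.
        terms.remove(max(removable, key=removable.get))


# Toy example: P values are fixed here, whereas a real refit would change them.
toy_p = {"urbanization": 0.40, "sex": 0.62,
         "attempt_frequency": 0.001, "body_mass": 0.20}
final = backward_eliminate(toy_p, lambda terms: {t: toy_p[t] for t in terms})
print(final)  # ['urbanization', 'attempt_frequency']
```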

Results

Repeatability of individual performance rank was somewhat higher in urban (ICC = 0.15, 84 % CI = 0.04–0.30, F 30,93 = 1.71, P = 0.027) than in rural birds (ICC = 0.10, 84 % CI = 0.003–0.22, F 40,123 = 1.43, P = 0.073), but the large overlap of the confidence intervals indicated a similar level of individual consistency in the two groups.

Out of the four tasks, three were solved by the majority of birds (72–80 %) whereas a significantly lower proportion of individuals (23 %) succeeded in task 2, i.e. when they had to pierce the paper cover of the feeder (see Bókony et al. 2014 for further details). In this latter, difficult task there was a significant interaction between urbanization and body mass at capture: urban birds with large body mass were faster than the rest of the birds (Fig. 2, Table 1; see also Supplementary Table 2 for analyses using solving time instead of problem-solving latency and Supplementary Table 3 for analyses including physiological variables). Urban and rural birds were similarly successful in the three easier tasks (Table 1; Supplementary Tables 2 and 3), in overall performance (Table 2; Supplementary Tables 2 and 4) and in learning efficiency in task 4 (Table 2; Supplementary Table 4).

Fig. 2 Problem solving in task 2 in relation to habitat urbanization and body mass in house sparrows. The figure shows changes over test time in the probability of not solving; a steeper decrease in probability indicates faster problem solving. For illustrative purposes, birds were divided as “large” and “small” if their body mass was larger or smaller than average, respectively

Table 2 Final LME models of overall performance (mean rank of problem-solving latencies in the four tasks) and learning efficiency (the change in solving time from the first to the second successful trial) in relation to urbanization

In all tasks, birds that attempted to access food more frequently solved the problem faster (Tables 1 and 2; Supplementary Tables 2–4); attempt frequency was not related to urbanization in any test (Supplementary Table 6). In task 2, the three-way interaction of urbanization, attempt frequency and body mass had no significant effect on solving latency (P = 0.602), and neither attempt frequency nor attempt latency was significantly explained by the interaction of urbanization and body mass (all P > 0.151). Females were significantly faster than males in task 2 (Table 1; Supplementary Table 3), although this difference was no longer significant when we used solving time instead of problem-solving latency (Supplementary Table 2). Problem-solving latencies were unrelated to the proportion of time spent in shelter and morphological measurements in all tests (P > 0.358), and birds from different habitats spent similar time in shelter in all tests (P > 0.131). Sparrows from more urbanized habitats had smaller body weight, but body mass index did not vary systematically with urbanization at any time of the study (Supplementary Table 7), and other physiological variables were also unrelated to urbanization (Supplementary Table 5).

Discussion

Our results suggest that there is no strong, clear-cut relationship between urbanization and problem-solving success in house sparrows, as birds from urban and rural habitats performed similarly in most tasks and had similar learning efficiency. In the most difficult task, sparrows originating from urban sites outperformed rural birds if they had large body mass, and this result was consistent between different analyses (i.e. using different measures of problem-solving success and controlling for physiological variables). This latter result supports previous studies in the sense that whenever a difference was found between differently urbanized populations, individuals from the more urbanized sites tended to be more successful (Liker and Bókony 2009; Sol et al. 2011). Furthermore, our findings also corroborate the notion outlined in the Introduction that the relationship between urbanization and behavioural flexibility is not as steady as it has often been assumed (Sol et al. 2013).

Several potential explanations can be proposed for the lack of significant difference between urban and rural sparrows we found in most tasks. First, the house sparrow probably originated from a single geographic area where they became human commensal and evolved alongside humans ever since (Sætre et al. 2012), and they only rarely occupy wild habitats far from human settlements (Anderson 2006). This means that all populations of house sparrows, whether urban or rural, may possess some common genetic adaptations to anthropogenic environments, which might help them in novel situations like exploiting new food sources, and this could have caused their similar performance in our study. Nevertheless, the species is very sedentary, with adults rarely moving more than a few hundred meters (Heij and Moeliker 1990; Liker et al. 2009) and natal dispersal occurring within relatively short distances (Anderson 2006; Kekkonen et al. 2011). Accordingly, studies of contemporary populations found small-scale genetic structure across the rural-urban gradient (Vangestel et al. 2011), significant genetic differentiation across nearby rural populations (Hole et al. 2002) and high levels of inbreeding in an island metapopulation (Jensen et al. 2007), altogether indicating that gene flow is restricted beyond distances of a few kilometres. This may promote local adaptations along the urbanization gradient. Furthermore, early experiences can differ drastically between urban and rural sparrows (Peach et al. 2008; Seress et al. 2012) which can affect their learning of foraging behaviour (Katsnelson et al. 2011). Given this potential for habitat differences, the similar performance of urban and rural birds we found in most tests is intriguing. Thus, as a second line of possible explanations, we may speculate that urbanization could exert several selection pressures simultaneously, out of which some might promote innovative adaptations while others might counter the latter. 
For example, cities provide not only new resources to exploit but also new types of danger such as poisons, vehicles and non-native predators that could make novelty-avoidance beneficial to some extent. Such a tendency for higher neophobia was found in some urban birds (Echeverría and Vassallo 2008; Bókony et al. 2012a; Miranda et al. 2013; but see Sol et al. 2011), and this might hinder the evolution or expression of innovative behaviours (Brosnan and Hopper 2014). The abundance of easily accessible food such as trash bins and bird feeders could also have a similar effect by making animals rely on such sources instead of learning new ways of foraging; this has been proposed as an explanation for the reduced learning performance of doves in territories with more people (Boogert et al. 2010). This hypothesis is in contrast with the idea that manipulating human-provided food sources may facilitate innovations by making urban birds more flexible in their motor repertoire (Griffin et al. 2014). Such potentially opposing effects of urbanization would merit further exploration.

Task 2 was solved by a relatively small proportion of birds, suggesting that it was more difficult than the rest of the tasks. There are several possible reasons for this. First, because the order of tasks was fixed, performance in the second task may have been influenced by the first task such that our protocol evoked some reversal learning (Griffin et al. 2013a), i.e. the experiences acquired in task 1 had to be unlearned or at least ignored in task 2. Although it seems unlikely that such an order effect alone was responsible for the low success in task 2 (given the high success in subsequent tasks), it could have contributed to it. Second, task 2 may have been cognitively challenging irrespective of task order, e.g. it probably required enhanced inhibitory control of inappropriate attempts (i.e. birds had to direct their attempts to the paper top where food was not visible) and, in contrast with the other three tasks, provided no obstacle movement cues (Overington et al. 2011; Bókony et al. 2014). A third, non-exclusive explanation is that task 2 was difficult because it may have required physical force and/or specific motor skills for piercing the paper (Griffin et al. 2014). Whatever the reason for the difficulty of task 2, our finding suggests that large urban birds were more successful in this task than the rest of the birds. We propose that this interactive effect of urbanization and body mass may be explained by opposing effects of urbanization on performance. House sparrows in more urbanized sites are known to suffer reduced growth rates as nestlings, probably due to insufficient diet (Peach et al. 2008; Seress et al. 2012), which has a life-long limiting effect on their body size and weight (Liker et al. 2008; Bókony et al. 2012b), as also reflected by the smaller body mass of urban birds in the present study. Poor nutritional conditions are also known to negatively affect cognitive performance in the long term (e.g. Cooper and Zubek 1958; Lucas et al. 1998; Arnold et al. 2007; Bellisle 2007). Therefore, even if urban environments promote innovation-facilitating cognitive abilities such as reversal learning (Griffin et al. 2013a), inadequate nestling diet might constrain the development of these skills. While this is a possible scenario, it has to be noted that problem-solving performance does not necessarily reflect cognitive capacity (Rowe and Healy 2014; Thornton et al. 2014). Alternatively, urbanization may select for physical skills such as motor diversity (Griffin et al. 2014), but the reduced mass of many urban birds could prevent them from applying their motor skills with enough physical force in some situations. In either case, we may speculate that only a few urban individuals can realize their full potential for solving difficult problems, whereas rural birds are neither constrained nor selected (nor have the necessary experience) for such performance. It is important to note that, although both cognitive capacity and physical force can be influenced by physiological condition, actual health state is unlikely to have confounded our present results because adult house sparrows show little variation in health indices along the urbanization gradient (Bókony et al. 2012b; see also Supplementary Table 5 in the present study), and the effects of urbanization were not influenced by the physiological variables that predicted performance in the same set of birds (Supplementary Tables 3–4).

The most consistent predictor of problem-solving success was the frequency of problem-solving attempts, in agreement with the repeated finding that individuals making more or longer attempts are more likely to innovate in several species (Morand-Ferron et al. 2011; Benson-Amram and Holekamp 2012; Sol et al. 2012a; Thornton and Samson 2012; Griffin et al. 2014). This suggests that perseverance and/or task-directed motivation may play a major role in foraging innovations (reviewed by Griffin and Guez 2014). This phenomenon has often been attributed to individual differences in neophobia or exploration (Bouchard et al. 2007; Overington et al. 2011; Sol et al. 2012a), although evidence for the relationship between neophobia and innovation is equivocal so far (Boogert et al. 2008; Cole et al. 2011; Benson-Amram and Holekamp 2012; Thornton and Samson 2012; reviewed by Griffin and Guez 2014), and experimental studies indicate that exploration is not likely to causally affect learning (Matzel et al. 2011). In this study, we did not quantify neophobia as our previous work showed that this trait is not correlated with problem-solving success in house sparrows (Liker and Bókony 2009); instead, we tried to minimize any effect of novelty by familiarizing the birds with the feeders before tests. Nevertheless, urban and rural sparrows did not differ in the frequency of attempts; thus, individual variation in novelty-avoidance is unlikely to have confounded our results.

Despite the heterogeneity in the set of variables that predicted problem-solving success in different situations (also see Bókony et al. 2014), we found that birds successful in one task tended to be successful in other tasks. The proportion of variation explained by between-individual differences was small, which fits well with the overall picture that problem-solving propensity shows individual consistency only in certain contexts or species (Cole et al. 2011; Griffin et al. 2013b) but not in others (Boogert et al. 2010; Morand-Ferron et al. 2011), and the evidence for a “general intelligence factor” at the individual level remains equivocal in animals (reviewed by Thornton and Lukas 2012). We found that the repeatability of problem-solving latencies did not differ significantly between habitats, which parallels our previous result that urban and rural sparrows showed only slight differences in individual consistency across various foraging situations involving novel objects and novel food (Bókony et al. 2012a). However, a meta-analysis of various animal behaviours suggested that repeatability is higher in the field than in the lab (Bell et al. 2009); therefore, it may be worth comparing individual consistency between urban and rural conspecifics in more natural experimental settings.
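For readers less familiar with the term, repeatability as discussed above is commonly expressed as the share of total variance attributable to differences between individuals. The expression below is a generic sketch of that variance-ratio definition; the symbols are introduced here for illustration only and are not taken from the analyses of the present study, whose exact estimation procedure may differ.

```latex
% Generic variance-ratio definition of repeatability (illustrative notation,
% not the authors' own):
%   \sigma^2_{\mathrm{between}} : variance among individuals
%   \sigma^2_{\mathrm{within}}  : variance among repeated measures of the same individual
\[
  R \;=\; \frac{\sigma^2_{\mathrm{between}}}
               {\sigma^2_{\mathrm{between}} + \sigma^2_{\mathrm{within}}},
  \qquad 0 \le R \le 1 .
\]
```

Under this definition, a small value of R means that most of the variation in problem-solving latency lies within rather than between individuals, which is the sense in which the between-individual component is described as small.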

In conclusion, our study suggests that there is no marked, consistent difference in problem-solving success between urban and rural house sparrows. While urban birds may be better at exploiting some aspects of novel environments, such habitat differences may be influenced by other factors like the possible consequences of developmental constraints in urban sparrows. These findings support the conclusion emerging from recent research that the relationship between habitat urbanization and behavioural flexibility, including the ability for problem solving, seems context-dependent. Understanding the causes of this heterogeneity is an inspiring challenge for future research.