Background

Cardiovascular diseases (CVD) are the leading cause of death in Europe1 and their social, medical and economic burden will likely increase over the coming decades.2 One of the major risk factors for CVD is physical inactivity.3 Walking and cycling, as a means of active transportation and commuting,4 are the main determinants of physical activity levels when considering population health approaches.5,6 Worldwide, it is estimated that the failure to spend 15–30 min a day briskly walking is responsible for 6–10 % of the burden of major noncommunicable diseases.7

To tackle this problem, researchers, health professionals, and policy-makers have identified a range of population approaches to support physical activity accumulation throughout the day. Population approaches to disease prevention attempt to shift the whole distribution of a given risk factor in a population, so that the effect of prevention is not limited to high-risk individuals.8 This approach focuses on contextual factors, such as physical and social environments, and is often conceptualized within a socioecological framework.9 Physical activity behaviors are increasingly studied within socioecological frameworks, with one of the possible areas for investigation being the urban environment.10–14

The development of reliable tools and methods to audit the physical qualities of the urban environment that likely impact physical activity remains an ongoing challenge for public health researchers. More than 30 field audit instruments have been developed in recent years,15 which can be further classified into subjective (questionnaires and interviews) and objective tools (direct observation with checklists, official datasets, etc.).11,16,17 The choice between subjective and objective tools depends on the objective of the study and the availability of research resources.16

Direct observation is one type of objective measure, whereby trained observers complete audits or checklists that assess different aspects of the urban environment. These can measure a single characteristic of the environment, such as the distance to specific destinations; capture information about the general environment (e.g., density of buildings or traffic);11 or assess multiple items that represent different aspects of the urban environment. The study of specific urban domains for walking (e.g., street quality, pedestrian safety) and cycling (e.g., cycle lanes, cycle storage) has been one of the main focuses when developing audit tools to study the relationship between urban environment and physical activity.16

For logistical reasons, especially for studies in large geographic areas, there has been increasing interest in virtually measuring attributes of the urban environment thought to be associated with physical activity. Furthermore, many of these measures lend themselves well to using open-access mapping technologies, such as Google Earth, Google Street View, or Microsoft Visual Earth.18 Compared with physical audits, virtual audits may provide a faster, easier, cheaper, safer, and more reliable method to assess the urban environment.18 To date, Google Street View is the most available and accessible form of omnidirectional imagery, providing coverage for most European, US, and Australasian urban areas.

Experiences of using open-source mapping technologies have emerged mostly in the USA,15,19–24 but also in Canada,25,26 Australia,27 New Zealand,28 the UK,29 the Netherlands,30 and Belgium.31 For historical reasons, European cities32 have different urban forms from cities in North America or Australia, and therefore the study of the urban environment in Europe has its own challenges. As few European studies have assessed the validity of applying a streetscape audit tool virtually, an assessment in another European context is warranted. Moreover, within the European context, Mediterranean cities are characterized by being more compact, with less urban sprawl.33

The objectives of this study are the following: (1) to test if the Madrid Systematic Pedestrian and Cycling Environment Scan (M-SPACES) can differentiate walking and cycling environments across different urban-form areas in Madrid, Spain; (2) to assess the validity of the M-SPACES in a virtual urban setting using Google Street View; and (3) to assess the reliability (intra- and inter-rater) of the M-SPACES audit tool.

Methods

Setting

This study is part of the Heart Healthy Hoods (HHH) project (http://hhhproject.eu/). The HHH project examines the association between the social and physical features of the urban environment and the cardiovascular health of adults living in the city of Madrid, Spain.

In 2011, Madrid city had an estimated population of 3,198,645 citizens.34 Madrid is structured in 21 districts, each of which is subdivided into neighborhoods. Neighborhoods are in turn divided into units of ∼1000–1500 residents, called census sections. A total of 36 census sections were selected for this study, providing an estimated population of 49,260 residents.34 Three areas, each comprising 12 census sections, were selected in Madrid based on variation (low, medium, high) in residential population per square mile and on homogeneity in terms of sociodemographic characteristics (all three areas had average scores in terms of education, immigration and aging).34 Kulldorff's spatial scan statistic software was used to obtain clusters of 12 census sections (Fig. 1).35 These 36 census sections included a total of 500 street segments, each defined as the street line from one intersection to another. These areas were located in the districts of Carabanchel, Ciudad Lineal, and Chamartín, respectively. Population density was used as a proxy for urban form; in addition, population density has been positively related to walking in a recent systematic review.36

FIG. 1 Madrid city with the study areas selected.

Measurement

Development of M-SPACES

The M-SPACES is an observational audit tool of urban attributes associated with walking and cycling along a street network. The original tool was developed by Australian researchers to represent physical environments that may promote or inhibit walking or cycling.37 Using stakeholder interviews and a Delphi study, four factors were identified that likely support physical activity behavior within the neighborhood environment: function, safety, aesthetics, and destinations.37 Each factor consisted of different elements, which were further reduced to items. The same Delphi group applied weights according to importance for supporting neighborhood walking and cycling.

As part of the URBAN study,38 the SPACES tool was further refined by adjusting the item weights for the New Zealand context.28 Starting from the NZ-SPACES tool, small adjustments were made so that the M-SPACES measure could be applied to Madrid, as some features of the audit tool could not be differentiated in this urban context. We summed the item weights for “negotiation of footpath” and “type of footpath” into a single item called “type of footpath”; similarly, we aggregated “footpath smoothness” and “footpath smoothness/condition” into a variable called “footpath smoothness.” We also modified the “Destinations” item by adding the number of destinations present in the street segment, as most of Madrid’s segments presented many destinations, and we considered it important to discriminate by the number of destinations. These adjustments did not affect the final weights of the elements and factors. Final items, elements, factors and their weights are shown in Table 1. The M-SPACES audit tool can be found in the online supplementary material.

TABLE 1 M-SPACES tool: factors, elements, and items for walking and cycling

M-SPACES Training

Two field researchers underwent familiarization training, in which they practiced physical and virtual audits using the M-SPACES. Both researchers piloted the tool with supervision for 2 h in Ciudad Lineal. Training audits were completed following the SPACES protocol outlined in the manual.39

On-Field and Virtual Measurements

Two researchers conducted physical and virtual audits of the 36 census sections previously identified using the M-SPACES audit tool (Fig. 2). Between February and May 2014 the field researchers completed the M-SPACES physical audit by walking together along both sides of the street segments. Virtual audits of the streetscapes were completed using Google Street View software. Images of the 36 census sections were recorded by Google between May 2008 and February 2014 (Fig. 3); 152 segments (30.4 %) were recorded before 2010. The order of measurement of the three areas was not randomized: we started with the medium-density area (Ciudad Lineal), followed by the high-density (Chamartín) and the low-density (Carabanchel) areas. Within each area, we randomly selected half of the census sections to be audited physically first and then virtually; the remaining half were audited in the reverse order, to reduce the effect of repeatedly measuring the same streetscape.
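The counterbalanced audit order described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure the researchers ran: the section identifiers, the function name `assign_audit_order`, and the seed are hypothetical.

```python
import random


def assign_audit_order(section_ids, seed=None):
    """Randomly split census sections into two equal halves.

    One half is audited physically first, the other virtually first.
    """
    rng = random.Random(seed)
    shuffled = list(section_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "physical_first": sorted(shuffled[:half]),
        "virtual_first": sorted(shuffled[half:]),
    }


# Hypothetical identifiers for the 12 census sections of one study area.
sections = [f"section_{i:02d}" for i in range(1, 13)]
order = assign_audit_order(sections, seed=2014)
print("Physical audit first:", order["physical_first"])
print("Virtual audit first: ", order["virtual_first"])
```

Running the assignment separately for each of the three areas would keep the two audit orders balanced within every area.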

FIG. 2 Description of the M-SPACES assessment by two raters across the study areas.

FIG. 3 Picture of a street feature as it was observed in physical (left) and virtual audit (right).

The time taken to complete the audit of each street segment was recorded for both virtual and on-field measurements. In addition, travel time to each of the areas by public transportation was calculated using Google Maps software. For this calculation, Instituto de Salud Carlos III (Avenida Monforte de Lemos 5, 28029, Madrid, Spain) was used as the departure location for all areas, and the closest point of each area as the arrival location. Monday 8:00 A.M. was chosen as the departure time.

Inter-Rater and Intra-Rater Reliability

Inter-rater agreement was assessed between the two researchers for both physical and virtual audits. To measure intra-rater reliability, physical and virtual audit data collected in April 2013 by one of the researchers in the medium-density area (Ciudad Lineal) were compared with the same researcher's 2014 data. The adjustments to the M-SPACES audit tool (i.e., grouping and refining items) were made after this first assessment; therefore “Destinations” was excluded from the intra-rater agreement analysis.

Statistical Analysis

To test whether the M-SPACES scores differed by urban form, we performed an analysis of variance (ANOVA) of mean scores by area. The level of significance was set at p < 0.05.
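As a rough illustration of the comparison of mean scores by area, the sketch below computes the F statistic of a one-way ANOVA by hand. The scores are made-up values, not study data, and the function name `one_way_anova_f` is hypothetical; the analyses in this study were run in Stata.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up segment scores for three areas (illustrative only, not study data).
low_density = [1.0, 2.0, 3.0]
medium_density = [2.0, 3.0, 4.0]
high_density = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([low_density, medium_density, high_density])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p value is then read from the F distribution with the two degrees of freedom, a step that statistical software performs automatically.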

A two-way mixed model intraclass correlation coefficient (ICC) was used to measure the following: (a) agreement between physical and virtual audits, (b) inter-rater agreement between the two researchers, and (c) intra-rater agreement comparing M-SPACES assessments in 2013 and 2014.40,41 The ICC expresses the proportion of total variability attributable to differences between street segments:

$$ \mathrm{ICC}=\frac{k\cdot \mathrm{ISS}-\mathrm{TSS}}{\left(k-1\right)\cdot \mathrm{TSS}} $$

Where “k” refers to the number of street segments, “TSS” total sum of squares, and “ISS” inter-group sum of squares.
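The formula can be illustrated with a small calculation. This is a minimal sketch of the expression as printed, not a reimplementation of the two-way mixed model computed in Stata. It takes `k` as the number of scores recorded per street segment (for example, one physical and one virtual score), which is an assumption about how the formula is applied; the function name and the example scores are hypothetical.

```python
def icc_from_sums_of_squares(groups):
    """ICC = (k * ISS - TSS) / ((k - 1) * TSS) for equally sized groups.

    Each group holds the k scores recorded for one street segment.
    """
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("every group needs the same number (>= 2) of scores")

    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)

    # Total sum of squares around the grand mean.
    tss = sum((v - grand_mean) ** 2 for v in values)
    if tss == 0:
        raise ValueError("ICC is undefined when all scores are identical")

    # Inter-group sum of squares: group means around the grand mean.
    iss = sum(k * (sum(g) / k - grand_mean) ** 2 for g in groups)
    return (k * iss - tss) / ((k - 1) * tss)


# Made-up (physical, virtual) scores for four segments, not study data.
pairs = [(0.2, 0.3), (0.5, 0.5), (0.7, 0.6), (0.9, 1.0)]
print(round(icc_from_sums_of_squares(pairs), 3))
```

Under this reading the coefficient equals 1 when the two scores agree perfectly for every segment and falls as disagreement within segments grows.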

Following the Landis and Koch classification, the cutoff ranges for ICC values used were as follows: 0.0–0.20 (weak agreement), 0.21–0.40 (poor agreement), 0.41–0.60 (moderate agreement), 0.61–0.80 (substantial agreement), and 0.81–1.00 (almost perfect agreement).42 All analyses were conducted using Stata SE version 12.1 (StataCorp., College Station, TX, USA).
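The cutoff ranges listed above can be written as a small lookup. This is a sketch only: the function name `classify_icc` is hypothetical, and rounding to two decimals before comparing is an assumption made because the ranges are stated to two decimal places.

```python
def classify_icc(icc):
    """Map an ICC value in [0, 1] to the agreement labels used in the text."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("the cutoff ranges only cover ICC values from 0 to 1")

    value = round(icc, 2)  # ranges are stated to two decimals
    if value <= 0.20:
        return "weak"
    if value <= 0.40:
        return "poor"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


for example in (0.08, 0.36, 0.51, 0.86):
    print(example, classify_icc(example))
```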

Mapping

Geographic information systems (GIS) were adopted with a twofold aim. On the one hand, they constitute the underlying technology allowing for the integration of all collected information together with contextual information such as street network, land parcel and building polygons. On the other hand, the establishment of a geo-referenced database on the subject will make future analysis and modeling possible. All data layers, both collected and downloaded from official sources, were first projected and referenced to a common system. Absolute differences between on-field and virtual audits of M-SPACES total scores for walking and cycling were then joined to the attribute table of the street segment data sets by means of a relational join in order to represent them on a map. All GIS-related operations were undertaken with ArcGIS software (ESRI, Redlands, CA, USA).
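The join of absolute differences to the street segment attribute table can be pictured with plain Python dictionaries. This is a stand-in for the idea only and does not use any ArcGIS function; the key `segment_id`, the field `abs_diff`, and all values are hypothetical.

```python
def join_abs_differences(segments, physical_scores, virtual_scores):
    """Attach |physical - virtual| to each segment record, keyed by segment id."""
    joined = []
    for segment in segments:
        segment_id = segment["segment_id"]
        row = dict(segment)  # copy so the input table is left unchanged
        if segment_id in physical_scores and segment_id in virtual_scores:
            row["abs_diff"] = abs(physical_scores[segment_id] - virtual_scores[segment_id])
        else:
            row["abs_diff"] = None  # segment was not audited with both methods
        joined.append(row)
    return joined


# Hypothetical attribute table and total scores (not study data).
segments = [
    {"segment_id": "A", "area": "Chamartín"},
    {"segment_id": "B", "area": "Carabanchel"},
]
physical = {"A": 0.50, "B": 0.75}
virtual = {"A": 0.25}

for row in join_abs_differences(segments, physical, virtual):
    print(row)
```

Segments missing either score keep an empty value rather than being dropped, so unaudited segments remain visible on the map.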

Results

A total of 454 street segments (90.8 %) included in the three study areas were measured both physically and virtually. Of the 46 street segments that were not audited, 20 were excluded because Google Street View imagery was unavailable (i.e., pedestrian streets that were inaccessible to the vehicles from which images are captured). The remaining excluded street segments (n = 26) were too short to measure, and their characteristics were subsequently absorbed into the closest street segment.

Differentiation of Areas with M-SPACES in Madrid

Data presented in Table 2 show the scores for the M-SPACES factors across the three residential density areas. Function scores for walking were highest in the medium-density area (Ciudad Lineal), while function scores for cycling were highest in the lowest-density area (Carabanchel). Safety factor scores for walking and cycling were highest in the highest-density area (Chamartín). Chamartín also had the highest scores for destinations to walk and cycle to. Destinations yielded the greatest difference between the areas: Chamartín's mean score for destinations present (0.55) was almost double Carabanchel’s (0.29). All differences were statistically significant (p < 0.05).

TABLE 2 Differences in M-SPACES factors between the three study areas

Google Street View Validity

Physical and virtual audits showed substantial (ICC = 0.61–0.80) or almost perfect agreement (ICC > 0.80) for 6/11 elements for walking and 5/10 elements for cycling, and also for total scores (Table 3). Walking infrastructure, traffic safety, streetscape aesthetics, and destinations demonstrated almost perfect agreement (ICC = 0.86, 0.89, 0.80, and 0.85, respectively). However, walking surface and cycling infrastructure showed poor agreement (ICC = 0.36 and 0.39, respectively), while cycling surface, streets lane, and subjective walking and cycling assessment showed moderate agreement (ICC = 0.51, 0.41, 0.55, and 0.53, respectively). Figure 4 shows the differences in walking and cycling total scores between physical and virtual audits.

TABLE 3 Levels of agreement between elements assessed in physical and virtual audits (n = 454 street segments)

FIG. 4 Absolute differences between physical and virtual audits for walking (left) and cycling (right) total M-SPACES score.

Intra-Rater Reliability

Most elements demonstrated moderate or substantial intra-rater agreement (ICC = 0.40–0.80) (Table 4). Walking infrastructure, streets lane, traffic safety, and view aesthetics had almost perfect agreement (ICC > 0.80). Intra-rater agreement for the virtual auditing was higher for aesthetics (ICC = 0.81) than the intra-rater agreement for aesthetics in the physical audits (ICC = 0.45).

TABLE 4 Intra-rater agreement between elements assessed by one rater twice (2013 and 2014) (n = 152 street segments)

Large differences existed for view aesthetics (physical audit: ICC = 0.15; virtual audit: ICC = 0.88). Path safety, cycling surface (only for the physical measurement) and cycling infrastructure had an ICC = 0.00, and thus were regarded as being highly unreliable.

Inter-Rater Reliability

In general, inter-rater agreement was low (Table 5). The agreement between the two observers was better for the physical audits when compared with the virtual audits. Both physical and virtual walking and cycling surface elements showed weak inter-rater agreement (ICC < 0.20). Apart from virtual cycling infrastructure (e.g., cycle storage) (ICC = 0.47), physical and virtual audits of walking and cycling infrastructure showed substantial agreement (ICC = 0.60–0.80). Aesthetics and subjective assessments tended to have lower agreement (ICC < 0.40) than the streetscape aesthetics. Virtual path safety inter-rater agreement was weak (ICC = 0.08), especially when compared with the inter-rater agreement using the physical audit (ICC = 0.32).

TABLE 5 Inter-rater agreement between physical and virtual elements by two raters in three study areas (n = 454 street segments)

Time Taken to Complete the Audits

Overall, the time taken to measure each street segment was shorter for on-field auditing (2.45 min per segment) than for virtual auditing (2.76 min per segment), and this was consistently shown across the three areas. In Chamartín, virtual auditing took 2.87 min per street segment compared with 2.42 min for the physical assessment. In Ciudad Lineal it took a mean time of 2.84 min virtually and 2.33 min physically per street segment. In Carabanchel, virtual auditing was faster (2.62 min per street segment), and physical audits were slower (2.57 min per street segment) compared with Ciudad Lineal and Chamartín. Travel time from Instituto de Salud Carlos III to the study areas was estimated as 36 min to Chamartín, 40 min to Ciudad Lineal and 60 min to Carabanchel.
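To see how travel time weighs against the per-segment difference, the figures above can be combined in a simple break-even calculation. This is only a sketch under a loud assumption: a single one-way trip per area is counted, whereas return journeys and repeated visits would increase the travel cost. The function name `break_even_segments` is hypothetical.

```python
def break_even_segments(travel_min, virtual_min, physical_min):
    """Number of segments above which a physical audit, travel included,
    takes less total time than a virtual audit."""
    extra_per_segment = virtual_min - physical_min
    if extra_per_segment <= 0:
        # Virtual auditing is not slower per segment, so it always wins.
        return float("inf")
    return travel_min / extra_per_segment


# (travel time, virtual min/segment, physical min/segment) as reported above.
areas = {
    "Chamartín": (36, 2.87, 2.42),
    "Ciudad Lineal": (40, 2.84, 2.33),
    "Carabanchel": (60, 2.62, 2.57),
}

for name, (travel, virtual, physical) in areas.items():
    n = break_even_segments(travel, virtual, physical)
    print(f"{name}: about {n:.0f} segments per trip")
```

Under this assumption, the one-way travel time alone offsets the slower virtual pace for roughly 80 segments per trip in Chamartín and for far more in Carabanchel, where the per-segment times were nearly equal.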

Discussion

These findings showed that M-SPACES is a useful tool to measure walking and cycling environments in the street segments of a European city like Madrid. The original SPACES tool and its variants are based on a conceptual framework, which makes them easy to use for comparing different urban forms, and they have been used and validated in several other settings.28,43,44 Google Street View was, for most features of the streetscape, a valid instrument to assess the physical urban environment using M-SPACES when the physical and virtual audits were compared. Furthermore, most elements had substantial (neighborhood permeability) or almost perfect agreement (walking infrastructure, traffic safety, streetscape aesthetics, and destinations). Nevertheless, some features did not reach acceptable agreement when virtual and physical measures were compared. For some of these elements (walking and cycling surface and streets lane), supplementary information from other secondary spatial sources (e.g., council databases) could be used to improve reliability. However, the differences shown for cycling infrastructure (cycle storage) may be a temporal issue, whereby new cycling infrastructure (e.g., bicycle parking) in Madrid was not present when the Google Street View imagery was taken. In summary, our findings are compatible with other studies45 that have found that the more subjective characteristics (e.g., sidewalk conditions) have lower agreement between virtual and physical audits. However, in our findings view aesthetics and streetscape aesthetics showed acceptable agreement. Furthermore, the low prevalence of some elements (e.g., cycle storage) could explain low ICC values, because the ICC is sensitive to low variability.41

It was not anticipated that the mean time spent auditing the street segments using Google Street View would be longer than for physical audits; however, the time taken to travel to the areas by public transportation also has to be taken into account. By contrast, other studies conducted in Australia, New Zealand, and the UK found that virtual audits were faster than physical audits.28,30 A possible explanation is that the greater complexity of the street structures in Mediterranean and continental European cities may eliminate the time advantage of undertaking virtual audits.32 In line with this, in this study the time spent auditing street segments in the low-density area (where street segments are expected to be longer) was similar when measuring the environment physically or virtually. Despite this, virtually assessing urban environments has other advantages, as virtual audits: (1) are not affected by daylight restrictions or weather changes; (2) require limited resources (computer and internet connection); (3) are less financially costly than physical audits;28 (4) are useful for regional and international comparisons; (5) may be a good alternative when measuring unsafe areas;21 and (6) provide an alternative when visiting the study area physically is not possible. Also, metadata provided by Google Street View allow researchers to match environmental conditions.

Our intra-rater reliability results are comparable with previous studies.30,31 These results, besides demonstrating the intra-rater reliability of M-SPACES both physically and virtually, showed that in most cases the urban environment in the area assessed did not change between 2013 and 2014. Lower levels of agreement were found for cycling infrastructure and surface, as well as path safety, which may be due to the low variability of these elements in the study area and to some new elements (e.g., cycle storage) that were not present in 2013.

Inter-rater agreement between the two observers was low. Only walking infrastructure, streets lane (physical audit), streetscape aesthetics, and destinations had acceptable levels of agreement. One possible explanation for the low agreement observed might be systematic differences in auditor training. Even though both researchers received training with the SPACES manual, this manual was not translated and adapted to the Madrid context, which may have resulted in different understandings of the response options in the audit tool. Potentially for these reasons, our findings are not consistent with the results of Kelly et al., who found high levels of agreement between four auditors when measuring the streetscape with Google Street View.22 In their work, all auditors systematically received 4 h of training; our auditors read the SPACES manual39 and piloted the tool in Ciudad Lineal for 2 h. Griew et al. found that inter-rater agreement differed between types of neighborhood (industrial, residential, etc.).29 However, we did not find any differences in inter-rater agreement between our three areas that could help us to understand this low agreement between observers.

Limitations and Strengths

We acknowledge that this study has several limitations. The M-SPACES tool was designed to measure the streetscape, and therefore did not assess access to parks, pedestrian- or cycling-only paths, recreation centers, or other facilities, and it is possible that these residential features are associated with walking behavior. Images from Google Street View were taken between 2008 and 2014 (30.4 % before 2010), so there may be some areas for which the images are out of date. As agreement was acceptable for both the intra-rater and the physical-virtual comparisons, we do not believe that the timing of the Google Street View pictures was a major problem in this study; however, this may be an important topic to take into account when using virtual assessments to study the relationship between urban environment and health outcomes. Google Street View was the web-based tool used to measure the streets; when used in conjunction with other tools, such as Google Earth or MS Visual Oblique, it may provide additional environmental context, but we did not explore these additional datasets. Nearly 10 % of the street segments (46 of 500) could not be audited; 20 of these had no Google Street View imagery, almost all of them being pedestrian streets. It is important to note that pedestrian streets are very relevant to developing walkable neighborhoods in European cities, so it is important to create specific tools to assess pedestrian streets, as well as ways to measure them virtually (e.g., not 3D satellite images). Low variability affects ICC agreement values, so the assessment of some aspects of the streets (especially cycling infrastructure and surface characteristics) may need to be completed with other concordance methodologies. Despite this, we regarded Google Street View as being the most accessible and most appropriate web tool for measuring fine-grained streetscape elements.23

Inter-rater agreement was low, and in the future, we would recommend developing systematic training adapted to the study area. The question for Destinations was adapted between the 2013 and 2014 auditing, and therefore the agreement of this factor was not included in the intra-rater analysis; future work needs to explore the reliability of this factor for the M-SPACES.

The present study also has several strengths. As far as we know, no other study has used such a large set of street segments to test the accuracy of virtual audits. Previous studies have used between 48 and 369 street segments.18 This is also the first study in Europe to test the validity of the virtual assessment of an audit tool that measures street characteristics that may influence both walking and cycling, and it compared areas based on residential density, a factor potentially related to physical activity. Moreover, to our knowledge, this is the first study to examine the agreement between virtual and on-field tools in a Mediterranean context, where urban form patterns differ greatly from the more sprawled North American or Australasian cities.

Conclusions

The M-SPACES audit tool is able to discriminate between different population-density areas. Google Street View provided a valid way of measuring most aspects of the residential environment in a European city like Madrid, especially neighborhood permeability, walking infrastructure, traffic safety, streetscape aesthetics, and destinations. However, for some features (e.g., street lane), the audits may need to be completed with other secondary spatial databases. Characteristics of the streets that may inhibit or promote cycling had lower correlation between on-field and virtual audits. Inter-rater agreement was, in general, weak; therefore, intensive observer training and the use of complementary objective techniques may be required. Intra-auditor agreement was substantially better when measuring urban environments virtually.