1 Introduction

Understanding the drivers of residential water use is pivotal to water demand planning and management. However, in complex urban environments this task can be challenging, since heterogeneity in household and dwelling characteristics may lead to considerable variations in water demand and its determinants among different groups of consumers and across urban areas (Abrams et al. 2012; House-Peters and Chang 2011; Mieno and Braden 2011).

To address this variability, studies of residential water demand typically use data at the household scale or at an aggregated scale such as the census area level. To take into account the heterogeneity across consumer groups, household-scale data is preferred, especially when econometric models are used and the estimation of price elasticity is desirable (Arbués et al. 2010, 2004; Höglund 1999). In practice, however, because household-scale data is often unavailable or costly to obtain, empirical studies have mainly relied on small random samples of household data. As a result, this group of studies has typically failed to show the spatial variation of water demand and the influence of neighbourhood characteristics on water use (Arbués et al. 2004; Pint 1999). In contrast, studies that used aggregated data have managed to address the spatial variations of water demand, but they have typically overlooked the variations across households (Chang et al. 2010; House-Peters et al. 2010; Wentz and Gober 2007).

To bridge this gap, this study proposes a multi-scale analysis approach based on the integration of water consumption, land use and demographic data. In this approach, urban databases and geographical information systems (GIS) are first used to link water consumption and property information and so build a large sample of household-scale data, theoretically as large as the number of all dwellings within a city. This large sample can be used to evaluate the effects of water pricing, property characteristics, and weather conditions across different customer groups. Afterward, the household-scale data is aggregated to an appropriate spatial scale, such as the census area unit, in order to include the socioeconomic characteristics of households in the demand analysis and evaluate the spatial variation of demand over the urban areas. In this way, the study can take advantage of both individual-scale and aggregated data analysis.

This study uses the multi-scale analysis approach to evaluate the determinants of water demand in Auckland, New Zealand. Auckland is the largest city in New Zealand. The city formerly comprised seven territorial authority areas (i.e. Rodney District, North Shore City, Waitakere City, Auckland City, Manukau City, Papakura District, and Franklin District). In 2010, however, these areas were amalgamated into a single authority, the Auckland Council. Auckland has experienced fast growth in both population and housing stock over recent decades. The population of Auckland increased by 22% between 2001 and 2013, reaching around 1.4 million people in 2013 (Statistics-NZ 2015). Under the pressure of this growth, the city has experienced considerable changes in urban structure. For example, the dwelling density in the Auckland region increased from 86 to 102 dwellings per square kilometre between 2001 and 2013 (Goodyear and Fabian 2014). The section size of properties has also decreased over the past decades (LINZ 2015). The variations in household and housing characteristics in Auckland are remarkable. The average household size in Auckland is around three people. However, this number can increase to five people in some parts of southern Auckland, where multifamily households are more common (Statistics-NZ 2015). At the time of the 2013 Census, single-unit housing (i.e. separate dwellings) made up about three quarters of occupied private dwellings in Auckland, while joined dwellings (i.e. flats and apartments) accounted for 25% (Goodyear and Fabian 2014).

The present study focuses only on water use in single-unit housing (i.e. single-family detached houses); apartment water consumption in Auckland was discussed in other studies (Ghavidelfar et al. 2016a, b, c). This separation is necessary since there are substantial differences in water use and its determinants across housing types (e.g. single houses, low-rise and high-rise apartments). These distinctions can be attributed to the differences in the socioeconomic characteristics of households and the level of outdoor usage (e.g. gardens and swimming pools) between single houses and apartments (Fox et al. 2009; Ghavidelfar et al. 2016b, c; Russac et al. 1991).

Through the integration of household-level water use data with property information and micro-census data, this study developed a large random sample of 31,400 separate houses over 291 census area units in Auckland. This large dataset is used to quantify and test the effects of household socioeconomic characteristics (household income and household size), dwelling characteristics (number of bedrooms, section size, swimming pool), urban structure (housing density), weather (rainfall, temperature), and water pricing on water demand. All of these variables have been frequently reported as influential factors in empirical water demand studies (Al-Zahrani and Abo-Monasar 2015; Ashoori et al. 2016; Babel et al. 2007; House-Peters and Chang 2011).

Over the last decades, GIS has been widely used in urban planning and management (Bathrellos et al. 2012, 2013, 2016). In recent years, with advances in database technology, data accessibility and computing power, the use of GIS for data integration has become more feasible in water demand studies (Dziedzic et al. 2015; Polebitski and Palmer 2010). In an early attempt at data integration, as a pilot study, Troy and Holloway (2004) linked water demand and property information in six census areas in Adelaide, Australia, to examine the water consumption patterns for different types of residential dwellings and areas. Shandas and Parandvash (2010) integrated water consumption, land use and demographic data at the parcel level to examine the relationship between land-use planning and water demand. Polebitski and Palmer (2010) integrated utility billing data with census demographic and property appraisal data at the census tract level in order to forecast residential water use in Seattle, Washington. Within a GIS environment, Panagopoulos et al. (2012) also combined different spatial data on the urban growth pattern (e.g. distance from the city centre, distance from the coastline, topographic slope, land use/land cover, existing water supply and sewerage system) in order to identify and model the major determinants of future growth of urban water demand in the city of Mytilene, Greece. In a recent study, Dziedzic et al. (2015) integrated water billing records, demographic census information, and property information in Ontario, Canada. Through this data integration and subsequent cluster analysis, they identified the pattern of water demand over different areas and groups of customers for the purpose of conservation planning. They emphasized the importance of data integration for using the full potential of the rich data available within organizations.

In contrast, multi-scale analysis of water demand is relatively new in the field of demand studies. In a recent study, Ouyang et al. (2014) used water demand at three different scales (i.e. household, census tract and city scales) to identify the determinants of water demand and examine whether spatial scale may lead to ecological fallacy problems in residential water use research. They showed that the results of demand studies at different scales are comparable. To the best of the authors' knowledge, data integration has never been used for the multi-scale analysis of water demand through the development of a large sample of household-level data.

Thus, the main contribution of this study is to demonstrate how data integration can be used in contemporary water demand studies to address the complexity of water demand in urban environments. In this way, the proposed multi-scale analysis approach can help to make use of the full potential of the large datasets produced by data integration to thoroughly assess the variation of water consumption across different groups of consumers and urban areas. In addition, from a broader perspective, the proposed data integration approach can help to understand the water demand of different housing types (e.g. in terms of section size), since it combines water consumption with property information. This can help water planners and policy makers to better understand the implications of housing intensification (i.e. the transition from large single houses to more intensified multi-unit housing), which is an ongoing phenomenon in many major cities around the world (Ghavidelfar et al. 2016b). This information can help to evaluate more reliably the effects of urban development policy on future water demand.

2 Data Integration

This study integrates water consumption, dwelling, weather, water pricing, and census socioeconomic data for the purpose of water demand analysis. The water consumption, dwelling, weather, and water pricing information is available at both the household scale and the census area unit level (i.e. after aggregating the data). However, the socioeconomic data is available only at the census area unit level.

The water consumption data was provided by Watercare Services Limited, an Auckland Council Organization, on a monthly basis for the period 2008–2014. The property information was obtained from the publicly available databases of Auckland-Council (2015) and Land Information New Zealand (2015). The property data is available in GIS format and provides information on the housing type, section size, building size, assessed value, age, and address of each property.

The weather data, which included monthly average air temperature and rainfall, was obtained from New Zealand’s National Climate Database (CliFlo 2015) for the period 2008 to 2014. The water and wastewater charges for six districts of Auckland, from 2008 to 2014, were also provided by Watercare. The socioeconomic information on households was collected from the Statistics New Zealand Data Lab (Statistics-NZ 2015) for the 2006 and 2013 censuses. The Data Lab provided access to the census microdata, from which it is possible to estimate household and housing information (e.g. household income, household size, education level, number of bedrooms, etc.) for different types of housing. More details about the datasets can be found in Ghavidelfar et al. (2016b, c).

The data integration was carried out using geographical information systems (GIS). The water consumption and property data were arranged in GIS and linked together using the addresses and geographical coordinates. Through this integration, water consumption and property information for around 350,000 dwellings, including single-unit and multi-unit housing (i.e. flats and apartments), became available for the water demand analysis.
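
The linking step can be illustrated with a minimal sketch. It assumes that both sources carry a free-text address field and joins on a normalised address only; the study also used geographical coordinates, which are omitted here. All field names and records below are hypothetical.

```python
# Hypothetical records: field names and values are illustrative only.
water_records = [
    {"address": "12 Example Road", "meter_id": "M001", "litres_2013": 182500},
    {"address": "7 Sample Street", "meter_id": "M002", "litres_2013": 146000},
    {"address": "3 Unmatched Lane", "meter_id": "M003", "litres_2013": 90000},
]
property_records = [
    {"address": "12 example road ", "housing_type": "single-unit", "section_m2": 650},
    {"address": "7 Sample Street", "housing_type": "multi-unit", "section_m2": 120},
]


def normalise(address):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(address.lower().split())


def link_by_address(water, properties):
    """Inner join of water and property records on the normalised address."""
    by_address = {normalise(p["address"]): p for p in properties}
    linked = []
    for w in water:
        match = by_address.get(normalise(w["address"]))
        if match is not None:
            merged = dict(match)
            merged.update(w)  # the water record's address spelling is kept
            linked.append(merged)
    return linked


linked = link_by_address(water_records, property_records)
```

Records without a counterpart in the property database (the third water record here) simply drop out of the linked sample, which is one reason the integrated dataset can be smaller than either source.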

The present article focuses only on the evaluation of water demand in single-unit housing. Around 75% of houses in Auckland are single-unit. Thus, after filtering the database by property type, around 260,000 single-unit houses remained for the rest of the analysis. From this data, the houses with replaced meters (i.e. houses with more than one meter record) were excluded from the analysis. This is because in these houses the records from erroneous old meters usually overlap the new meter records for a period of time, and thus may cause errors in the estimation of historical water consumption. After this data filtering, around 130,000 single-unit houses remained available for the demand study.
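
The two filters described above (property type, then meter count) could be expressed as in the following sketch. The tuple layout and identifiers are assumptions made for illustration, not the structure of the actual database.

```python
from collections import defaultdict

# Hypothetical linked records: (property_id, meter_id, housing_type).
records = [
    ("P1", "M001", "single-unit"),
    ("P2", "M002", "single-unit"),
    ("P2", "M009", "single-unit"),  # replaced meter: two meter records
    ("P3", "M003", "multi-unit"),
]


def single_unit_one_meter(rows):
    """Keep single-unit properties that have exactly one meter record."""
    meters = defaultdict(set)
    for prop_id, meter_id, housing_type in rows:
        if housing_type == "single-unit":
            meters[prop_id].add(meter_id)
    return sorted(p for p, m in meters.items() if len(m) == 1)


kept = single_unit_one_meter(records)
```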

This study selected a random sample of 31,400 properties from the developed dataset in order to check the data for completeness and quality. Using high-resolution aerial images, this study visually inspected all the properties in the sample, mainly to complete some unreported property characteristics such as the presence of a swimming pool. This random sample is large enough to reliably represent the total population of single-unit dwellings (i.e. there was no statistically significant difference between the average water consumption estimated from the random sample and that from all meters) and to fully cover all suburbs of Auckland, showing the spatial variation of water use.
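
A representativeness check of this kind might look like the sketch below. The text does not state which test was applied, so the z statistic used here is an assumption, and the population is synthetic.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic "population" of daily household use in litres (illustrative only).
population = [max(50.0, rng.gauss(500.0, 150.0)) for _ in range(5000)]


def sample_vs_population(values, n, rng):
    """Draw a simple random sample and return it with a z statistic
    comparing the sample mean against the mean of all values."""
    sample = rng.sample(values, n)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = (statistics.mean(sample) - statistics.mean(values)) / standard_error
    return sample, z


sample, z = sample_vs_population(population, 500, rng)
# |z| below about 1.96 would indicate no significant difference at the 5% level.
```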

Using GIS, the water pricing and weather information were also assigned to this random sample of single-unit houses based on the geographical location of the houses. This dataset is used to carry out the household-scale demand analysis. Then, the dataset is aggregated at the census area unit level to include the census socioeconomic variables in the demand study as well.
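
The aggregation step amounts to averaging household-scale variables within each census area unit. A minimal sketch, with hypothetical area codes and values, is:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical household rows: (census_area_unit, daily_use_litres, section_m2).
households = [
    ("CAU_A", 480.0, 600.0),
    ("CAU_A", 520.0, 700.0),
    ("CAU_B", 610.0, 800.0),
]


def aggregate_by_area(rows):
    """Average household-scale variables within each census area unit."""
    grouped = defaultdict(list)
    for area, use, section in rows:
        grouped[area].append((use, section))
    return {
        area: {
            "mean_use": mean(u for u, _ in vals),
            "mean_section": mean(s for _, s in vals),
            "n_houses": len(vals),
        }
        for area, vals in grouped.items()
    }


area_table = aggregate_by_area(households)
```

Census variables, which exist only at the area level, can then be attached to each row of `area_table` by area code.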

Through this data integration, the developed database provided a unique opportunity to investigate the determinants of water demand at different scales.

3 Water Demand Models

This study utilizes regression methods specific to panel data to analyse water demand in Auckland from 2008 to 2014. A panel data set contains repeated observations of the same units (e.g. households, census area units), collected over a number of periods (Hill et al. 2010; Verbeek 2004). Panel data models incorporate both the temporal and the spatial variations of water use in the modelling. Thus, they can generate better parameter estimates than traditional regression approaches (Arbués et al. 2003; Polebitski and Palmer 2010; Weber 1989). More details about the panel data models can be found in Ghavidelfar et al. (2016c) along with other papers (Arbués et al. 2004, 2010; Fenrick and Getachew 2012; Kenney et al. 2008; Martinez-Espiñeira 2002; Nauges and Thomas 2000; Polebitski and Palmer 2010).

The panel data models are developed using both household-scale and census area unit-scale data. At the household scale, the dependent variable is annual average daily water consumption. To calculate this, the annual water consumption of each household (obtained by adding the monthly data) was divided by the number of days in the year. The developed dataset included the water consumption of around 31,400 individual houses over 6 years (i.e. August 2008 to July 2014). Water consumption was estimated on an annual basis because the water price in Auckland changed annually (i.e. in July each year); annual data can therefore better reflect the overall effects of price changes across the years. The independent variables in the household-scale model are the price of water, average air temperature, annual rainfall and housing characteristics.
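
The calculation of the dependent variable can be sketched as follows, assuming that each analysis year runs from 1 August to 31 July (inferred from the August 2008 to July 2014 period) and that twelve monthly volumes are available per household. The function names and the example volumes are hypothetical.

```python
from datetime import date


def days_in_billing_year(start_year):
    """Days from 1 August of start_year to 31 July of the following year."""
    return (date(start_year + 1, 8, 1) - date(start_year, 8, 1)).days


def average_daily_use(monthly_litres, start_year):
    """Annual average daily use: the sum of 12 monthly volumes over the days in the year."""
    if len(monthly_litres) != 12:
        raise ValueError("expected 12 monthly values")
    return sum(monthly_litres) / days_in_billing_year(start_year)


# Hypothetical household using 15,000 litres in every month of 2012/13.
daily = average_daily_use([15000.0] * 12, 2012)
```

Dividing by the actual number of days keeps years that contain 29 February comparable with the others.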

This study investigates the effects of both the volumetric and the fixed charges for water and wastewater. Since the wastewater charge in Auckland is calculated from metered water use, the study summed the water and wastewater charges. This helps to evaluate the overall effect of the volumetric and fixed charges.
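
As a small illustration of how the price variables might be assembled, the sketch below sums the two tariffs component by component and converts a nominal price to base-year dollars with a price index, as the study does when expressing prices in real 2013 terms. The tariff and index values are hypothetical, not Auckland figures.

```python
def combined_charges(water, wastewater):
    """Sum water and wastewater tariffs component by component."""
    return {
        "volumetric": water["volumetric"] + wastewater["volumetric"],
        "fixed": water["fixed"] + wastewater["fixed"],
    }


def to_real_terms(nominal, index_year, index_base):
    """Express a nominal price in base-year dollars using a price index."""
    return nominal * index_base / index_year


# Hypothetical tariffs ($ per kilolitre and $ per year) and index values.
combined = combined_charges(
    {"volumetric": 1.25, "fixed": 0.0},
    {"volumetric": 2.25, "fixed": 190.0},
)
real_volumetric = to_real_terms(combined["volumetric"], index_year=100.0, index_base=105.0)
```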

Taking advantage of the household-scale data, the study evaluated the effects of water pricing, along with other variables, across different groups of customers. To this end, the individual houses were clustered into groups based on housing value, as a proxy for household income, and water consumption. The k-means algorithm (Everitt et al. 2011) was used for the clustering. Using cluster analysis, three groups of households were distinguished in Auckland (i.e. high income, middle income and low income). In addition, the houses with swimming pools were separated to estimate the price elasticity of water demand in this group of high users.
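
For readers unfamiliar with k-means, the following is a minimal sketch of the standard Lloyd iteration on two features. It assumes both features have been rescaled to the 0 to 1 range and uses a simple deterministic initialisation; the data points are hypothetical and the software actually used in the study is not specified in the text.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with a deterministic start (k points spread through the sorted data)."""
    ordered = sorted(points)
    step = len(ordered) // k
    centroids = [ordered[i * step + step // 2] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical (scaled housing value, scaled water use) pairs.
houses = [
    (0.20, 0.80), (0.25, 0.85), (0.30, 0.75),  # low value, high use
    (0.50, 0.30), (0.55, 0.35), (0.45, 0.25),  # middle value, low use
    (0.90, 0.90), (0.95, 0.80), (0.85, 0.85),  # high value, high use
]
labels, centroids = kmeans(houses, k=3)
```

Because k-means relies on Euclidean distance, the two features need to be on comparable scales before clustering; otherwise housing value in dollars would dominate water use in litres.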

At the census area unit level, as at the household level, the dependent variable is the average daily water use. At this level, the census variables were also added to the model. This study collected a wide range of census variables including household size, household income, ownership of property, percentage of one-person households, number of bedrooms, household education level, and age of population. However, some of these census variables were strongly correlated with each other or with the property characteristics estimated from the household-level data through aggregation. Thus, in order to avoid multicollinearity among these variables, only household size, household income and number of bedrooms were selected for inclusion in the models. These three variables had a high correlation with water demand and have been frequently reported as influential factors in water demand studies (House-Peters and Chang 2011). A yearly estimate of the census variables was used for the panel data analysis.
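
One simple way to screen candidate variables for multicollinearity is to flag strongly correlated pairs, as sketched below. The 0.8 threshold and the data are assumptions made for illustration; the text does not report the criterion that was applied.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_pairs(variables, threshold=0.8):
    """Return variable pairs whose absolute correlation reaches the threshold."""
    names = sorted(variables)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(variables[a], variables[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical census-area values for three candidate variables.
candidates = {
    "household_size": [2.5, 3.0, 3.5, 4.0, 4.5],
    "pct_one_person": [30.0, 24.0, 19.0, 15.0, 10.0],
    "household_income": [60.0, 95.0, 70.0, 85.0, 65.0],
}
flagged = correlated_pairs(candidates)
```

In this toy example only household size and the share of one-person households are flagged, so one of the two would be dropped before estimation.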

Besides the census variables, as in the household-scale models, water price (i.e. volumetric and fixed prices), average air temperature and annual rainfall were included in the model. Average section size of property, estimated from the household-scale data, and density of dwellings were also included in the models. The study also included two dummy variables representing the low-income and high-income census area units in Auckland. The dummy variables were estimated through cluster analysis, where the k-means method distinguished three groups of consumers at the census area level based on housing value, as a proxy for income, and average daily water consumption. Based on the pseudo F-statistic, three is the optimal number of clusters, maximizing both within-group similarity and between-group difference. Table 1 provides a list of the variables used for demand analysis at the household and census area unit scales. The prices and income were deflated into real 2013 terms using the consumer price index (CPI) (Statistics-NZ 2015).
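
The pseudo F-statistic (also known as the Calinski-Harabasz index) is the ratio of between-cluster to within-cluster variance, each divided by its degrees of freedom; the number of clusters with the largest value is preferred. The sketch below computes it for a single feature to keep the arithmetic visible, whereas the study clustered on two features. The data are hypothetical.

```python
def pseudo_f(values, labels):
    """Pseudo F = [B / (k - 1)] / [W / (n - k)] for one-dimensional data."""
    n = len(values)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    k = len(groups)
    grand_mean = sum(values) / n
    # B: between-cluster sum of squares; W: within-cluster sum of squares.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    return (between / (k - 1)) / (within / (n - k))


values = [1, 2, 3, 11, 12, 13, 21, 22, 23]
f_three = pseudo_f(values, [0, 0, 0, 1, 1, 1, 2, 2, 2])
f_two = pseudo_f(values, [0, 0, 0, 0, 0, 0, 1, 1, 1])
```

Here the three-cluster partition scores far higher than the two-cluster one, which is the pattern that would lead to choosing three clusters.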

Table 1 List of variables available for the multi-scale demand analysis

4 Results and Discussion

4.1 Water Demand Models at Household Scale

The study developed five panel data models at the household scale. The first model used the entire sample of single-unit houses, while models 2, 3 and 4 used the grouped data for low, middle and high-income houses, respectively. The last model used a sample of houses with swimming pools.

The study examined pooled, fixed and random effects models to select the best panel data method. For all five models, the results of the pooling tests (partial F-test) showed that the panel models (i.e. fixed and random effects models) are an improvement on the pooled model. To choose between the fixed and random effects models, the Hausman test was carried out for all datasets. The test results revealed that the random effects model is not valid for the household-scale datasets; thus the fixed effects model is the appropriate estimator, as it produces consistent parameter estimates. One drawback of the fixed effects model is that it cannot provide parameter estimates for time-invariant variables such as housing characteristics (i.e. HValue, BFootP, DumPool, SecSize), which generally do not change over time. This feature does not mean, however, that the model omits the time-invariant variables. In fact, the fixed effects model controls for these variables, along with other unobserved household characteristics, to provide unbiased parameter estimates for the remaining variables (Kenney et al. 2004).
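
The reason time-invariant variables drop out can be seen from the within transformation that underlies the fixed effects estimator: each variable is expressed as a deviation from its household mean, so anything constant within a household becomes zero. The sketch below uses a hypothetical two-house panel with a single regressor and a true slope of −0.05; it is an illustration of the mechanism, not the estimation code used in the study.

```python
from collections import defaultdict


def within_transform(rows, unit_key, columns):
    """Subtract each unit's own mean from its observations."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for r in rows:
        counts[r[unit_key]] += 1
        for c in columns:
            sums[r[unit_key]][c] += r[c]
    return [
        {c: r[c] - sums[r[unit_key]][c] / counts[r[unit_key]] for c in columns}
        for r in rows
    ]


def slope_through_origin(x, y):
    """Least-squares slope for demeaned data (no intercept needed)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical panel: two houses with different intercepts and section sizes.
rows = []
for house, alpha, section in (("A", 6.0, 600.0), ("B", 6.5, 900.0)):
    for ln_price in (0.0, 0.1, 0.2):
        rows.append({"house": house, "ln_price": ln_price,
                     "ln_use": alpha - 0.05 * ln_price, "sec_size": section})

demeaned = within_transform(rows, "house", ["ln_price", "ln_use", "sec_size"])
beta = slope_through_origin([d["ln_price"] for d in demeaned],
                            [d["ln_use"] for d in demeaned])
```

After the transformation every `sec_size` value is zero, so no coefficient can be estimated for it, while the price coefficient is recovered despite the different household intercepts.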

Table 2 shows the results of all developed models. The time trend was included in all models to accommodate the nonlinearities in the underlying data. All the variables (except FPrice, which contains zero values) were also transformed by natural logarithm, so the coefficients can be interpreted as elasticities.

Table 2 Fixed effects water demand models at the household-scale

The results of the study showed that the price elasticity of water demand was negative and significant for all models, varying from −0.02 to −0.05. This result is within the range of values obtained by a number of previous studies (Abrams et al. 2012; Arbués et al. 2003, 2004). The models showed that the pricing response of households with higher incomes and swimming pools is slightly greater than that of low or middle-income households. This difference can be attributed to the higher outdoor water use among households with higher incomes and swimming pools. In general, outdoor use is assumed to exhibit higher price sensitivity (Arbués et al. 2003; Polebitski and Palmer 2010).

The heterogeneity associated with outdoor water use among different groups of users also affected the household response to average temperature and total rainfall. The results of the study revealed that households with higher incomes and swimming pools showed a greater response to the temperature and rainfall variables. In contrast, low and middle-income households, which have lower outdoor water consumption, showed a lower response to the weather variables. This finding is also in agreement with other studies (Balling et al. 2008).

The study revealed that although price has a negative relationship with consumption, its effect on water demand is limited for all groups of customers, regardless of their water consumption levels and household and housing characteristics. Table 3 compares the water consumption and housing characteristics of the four groups of consumers studied. Based on the results, both high water users (i.e. low-income, high-income and houses with pools) and low water users (mid-income) responded weakly to the pricing signal.

Table 3 Water consumption and housing characteristics of different groups of consumers

The low price elasticity in Auckland can be attributed to the fact that the water bill generally comprises a small share of total household expenditure. In addition, the current water/wastewater pricing scheme with flat volumetric rates may not provide enough incentive to reduce water consumption, specifically among higher user groups. The study also showed that the fixed price had a very small and insignificant effect on water consumption in all models. In general, the only effect of the fixed charge on water consumption would be through its effect on reducing disposable income. Since water costs usually comprise a small share of household expenditure, it is not surprising that the effect of the fixed price is insignificant (Mieno and Braden 2011).

The time trend was also negative and statistically significant in all models, indicating a declining trend in water use in all groups of consumers.

4.2 Water Demand Models at Census Area Unit Scale

As in the household-scale analysis, the study examined pooled, fixed and random effects models to select the best panel data method. The result of the partial F-test showed that the panel models are an improvement over the pooled model. The Hausman test also revealed that the random effects model is more efficient than the fixed effects model and can produce consistent parameter estimates. Table 4 shows the results of the random effects model. The variables were transformed by natural logarithm, so the coefficients are elasticities.
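
The logic of the Hausman test can be sketched for a single coefficient: the statistic compares the fixed and random effects estimates relative to the difference in their variances and is referred to a chi-square distribution with one degree of freedom. The study tested the full models, so this scalar version is a simplification, and the numbers below are hypothetical rather than taken from the tables.

```python
import math


def hausman_single(b_fe, var_fe, b_re, var_re):
    """Hausman statistic and p-value for one coefficient (chi-square, 1 d.f.)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("variance difference must be positive")
    h = (b_fe - b_re) ** 2 / diff_var
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Hypothetical estimates and variances for a price coefficient.
h, p = hausman_single(b_fe=-0.050, var_fe=0.0004, b_re=-0.030, var_re=0.0003)
```

A small p-value indicates that the two estimators differ systematically, which favours fixed effects (as at the household scale); a large p-value means the more efficient random effects estimator can be retained (as at the census area unit scale).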

Table 4 Random effects water demand model at census area unit scale

Similar to the household-scale fixed effects models, the random effects model also provided satisfactory results: all variables were highly significant (except section size and fixed water charge) and had the expected sign. The coefficient of determination (R2) of the model was 0.77, implying high explanatory power.

In general, the census area unit model produced results comparable to those of the household-scale models for the water price and weather variables. The random effects model estimated a volumetric price elasticity of −0.03, which was small but statistically significant. The fixed price was statistically insignificant. The model also showed that temperature has a positive effect and rainfall a negative effect on water demand. These results confirm the finding of Ouyang et al. (2014) that the scale of the data does not significantly affect the results of demand models.

Besides the water price and weather variables, the model at the census area unit scale evaluated the effects of socioeconomic and urban structure variables on water demand.

The results of the study showed that household size has a positive impact on water consumption: a 10% increase in the average number of people in a household would result in a 3.6% increase in household water consumption. This result is in agreement with many other water demand studies, which argue that, due to economies of scale in the use of water, the increase in water consumption is less than proportional to the increase in household size (Arbués et al. 2003, 2004; Hoffmann et al. 2006; Schleich and Hillenbrand 2009).
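
The interpretation of a log-log coefficient as an elasticity can be made concrete with a short calculation. The sketch below assumes a household size elasticity of 0.36, inferred from the 10% and 3.6% figures above, and uses the volumetric price elasticity of −0.03 reported for this model. The exact change implied by a constant-elasticity model differs slightly from the linear approximation (elasticity times percentage change).

```python
def percent_change(elasticity, percent_increase):
    """Exact % change in demand implied by a constant-elasticity (log-log) model."""
    return ((1.0 + percent_increase / 100.0) ** elasticity - 1.0) * 100.0


# Household size: elasticity of 0.36 inferred from the text (assumption).
size_effect = percent_change(0.36, 10.0)
# Volumetric price: elasticity of -0.03 from the census area unit model.
price_effect = percent_change(-0.03, 10.0)
```

For changes of this magnitude the exact and approximate figures are close: roughly 3.5% against 3.6% for household size, and a reduction of a little under 0.3% for a 10% price rise.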

The income variable had a positive impact on water consumption. This is in line with many other demand studies (Kenney et al. 2008; Schleich and Hillenbrand 2009; Syme et al. 2004; Worthington and Hoffman 2008). In general, higher-income households are associated with larger water consumption since they are likely to own more water-using capital stock, such as larger lawns and gardens, and swimming pools (Hoffmann et al. 2006; Mieno and Braden 2011; Schleich and Hillenbrand 2009).

The study also showed that the number of bedrooms in the property, as a proxy for the size of the dwelling, has a positive impact on household water consumption. This is because increasing house size typically results in more bathrooms and a higher chance of leaks (Polebitski and Palmer 2010).

This study also evaluated the effects of housing density and section size, two important factors associated with urban structure, on water demand. These variables generally influence the amount of outdoor water use (Abrams et al. 2012; Jorgensen et al. 2009). In general, dwelling density, which is associated with smaller lot size and garden size, has a negative impact on water consumption, while section size has a positive impact (Balling et al. 2008; Chang et al. 2010; Domene and Saurí 2006; Polebitski and Palmer 2010; Shandas and Parandvash 2010). The results of the study showed that dwelling density in Auckland had a statistically significant negative impact on water consumption. However, this impact was limited: a 10% increase in housing density was associated with only a 0.3% decrease in water consumption. The relationship between section size and water use was also insignificant. These results imply that the effect of compact development, through building higher-density single-unit houses with smaller section sizes, on water demand in Auckland would be limited.

Finally, the two dummy variables estimated through cluster analysis were highly significant, implying that water demand differs across low, middle and high-income suburbs. Figure 1 shows these three groups of census area units in Auckland. The first group comprises the low-income areas, mainly clustered in Manukau City. The second group comprises the mid-income suburbs, which are distributed all over Auckland, and the third group comprises the high-income suburbs, mainly clustered in Auckland City and North Shore City.

Fig. 1 Three clusters of census area units in Auckland

Table 5 compares water consumption, housing and household characteristics across the three groups of census area units.

Table 5 Water consumption, housing and households characteristics across different groups

Similar to the household-scale demand analysis, the results of the study showed that the low-income and the high-income suburbs had higher per household water use than the middle-income area units. This difference can generally be attributed to the higher outdoor water demand in the high-income suburbs (e.g. the percentage of houses with pools in the high-income areas is 13.4, compared with 4.8 in the middle-income areas), and to the higher indoor water use in the low-income area units (e.g. the household size in the low-income areas is 4.1, compared with 3.1 in the middle-income areas).

Although the low-income suburbs had the highest per household water consumption, mainly due to the larger household size, the per capita water consumption of this group of consumers is as low as that of the mid-income area units (Table 3). In contrast, the high-income area units had the highest per capita water consumption, with an annual average of 196 litres per person per day. The seasonal variation of water demand is also considerable in the high-income suburbs, where water consumption increases by around 20% in summer. Figure 2 shows the seasonal variation of water demand across the three groups of suburbs in terms of per capita water use.

Fig. 2 Monthly variation of per capita water consumption across three groups of census area units (average of 6 years data; error bars with 1 standard deviation are shown for each dataset)

Estimates of per capita water consumption are generally required by water utilities for the purpose of water planning and forecasting. The multi-scale analysis approach made this information available at the census area unit level, an appropriate scale for management purposes, by including the census household demographic information in the datasets. In this way, the per capita water consumption can be estimated by dividing the average per household water consumption by the average household size for each census area unit.
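
The per capita calculation described above is a simple ratio, sketched below. The household sizes of 4.1 and 3.1 are the low-income and middle-income values quoted earlier in this section; the per household consumption figures are hypothetical and chosen only to show how a larger household can offset a higher household total.

```python
def per_capita_use(mean_household_use, mean_household_size):
    """Litres per person per day for one census area unit."""
    return mean_household_use / mean_household_size


# Household sizes from the text; per household use values are hypothetical.
areas = {
    "low_income_example": {"use": 615.0, "size": 4.1},
    "mid_income_example": {"use": 465.0, "size": 3.1},
}
per_capita = {name: per_capita_use(a["use"], a["size"]) for name, a in areas.items()}
```

In this illustration both areas come out at about 150 litres per person per day even though the per household totals differ by roughly a third.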

5 Conclusions

This study pioneered a new approach to the multi-scale analysis of water demand through the integration of water consumption, land use and demographic data. Water demand studies typically use data at the household scale or at an aggregated scale. Household-scale data is useful for evaluating the heterogeneity of responses to the determinants of water demand, particularly water pricing, among different groups of customers, whereas aggregated data can be useful for evaluating the spatial pattern of water consumption and the effects of urban structure (e.g. density) on demand. This study took advantage of both scales by carrying out the water demand analysis, using panel data models, at both the household and census area unit scales. First, the study integrated the water consumption and property data. Having developed a large sample of more than 31,000 individual houses, the study estimated the price and weather elasticities for low, middle and high-income households and for houses with swimming pools. The results of the study showed that the price elasticity of water demand for the groups of high users (i.e. households with high incomes and swimming pools) is slightly higher. However, in general the price elasticity of water demand in Auckland was low for all groups of consumers, implying that the price of water would have limited effects on water demand. The analysis also showed that households with higher incomes and swimming pools are more sensitive to weather conditions since they have more outdoor water use.

The household-level data was then aggregated at the census area unit level to include the census socioeconomic information. The study revealed that household income, household size and number of bedrooms are positively correlated with household water consumption. Although dwelling density had a negative correlation with water use, its impact was limited. The section size of the property also had an insignificant correlation with water consumption. These results imply that the effect of compact development on water demand in Auckland would be limited. The results of the aggregated model for the water pricing and weather variables were also in agreement with the household-scale models.

With advances in database technology, data accessibility, computing power, and spatial GIS tools, it is becoming more feasible to integrate disaggregated water consumption, land use and demographic data and to make use of their full potential in water demand studies. This data integration through multi-scale analysis allows the visualization and evaluation of demand information that was not previously possible. It provides planners with greater insight into how water is consumed spatially and how specific land uses, demographics and weather affect consumption across space and time. This information can help water utilities to plan the water supply system in an optimal manner to meet demand and also to better target specific groups of consumers or urban areas (e.g. high water users) for conservation planning and demand management.