Introduction

The incidence of forest fires has increased in recent decades in the Mediterranean (Pereira et al. 2006; Velez 2006; Pausas et al. 2008). In Portugal, burned area reached a total of about 3.8 × 10^6 ha in the period from 1975 to 2007, i.e., equivalent to nearly 40% of the country area. Wildfires have a substantial economic, social, and environmental impact and became a public calamity and an ecological disaster affecting a considerable area in Portugal (Gomes 2006). Forest managers and policy-makers thus face the challenge of developing effective fire prevention policies. This context highlights the need to characterize the current fire regime and understand the ecological and human drivers of observed patterns in areas burned.

Several factors may explain the ignition and spread of forest fires, such as fuel characteristics (Rothermel 1972, 1983; Albini 1976), climate, ignition sources, and topography (Agee 1993; Barton 1994; Viegas and Viegas 1994; Mermoz et al. 2005; Pereira et al. 2005). Fuel characteristics are a function of vegetation structure and composition in addition to anthropogenic factors. Topography, climate, and socioeconomic factors determine the mix available at any given site (Rothermel 1983; Cardille et al. 2001; Lloret et al. 2002; Badia-Perpinyà and Pallares-Barbera 2006; Sebastián-López et al. 2008). Topography further affects fire behavior, via its direct influence on flame geometry and, indirectly, through its effect on weather (Rothermel 1983; Kushla and Ripple 1997). Climate, cover type, and topographical data are frequently used to develop fire risk indices (Pereira et al. 2005; Carreiras and Pereira 2006). Recent characterizations of forest fires in Portugal underlined the impact of climate variables, e.g., the number of days with extreme fire hazard weather, on the number and size of fires (Viegas and Viegas 1994; Pereira et al. 2005; Gomes and Radovanovic 2008). Pereira et al. (2006) further claimed that more than 2/3 of the inter-annual variation of the area burned can be explained by changes in weather conditions. Other studies have analyzed the impact of species composition and of fuel reduction activities on fire intensity and spread (Fernandes 2001; Fernandes et al. 2005; Fernandes and Rigolot 2007).

Yet another essential driver for ignition is the influence of human activity, which increases the risk of fire in the vicinity of road networks and urban areas (Aranha and Gonçalo 2001; Díaz-Delgado et al. 2004; Koutsias et al. 2002; Pereira and Santos 2003; Wittenberg and Malkinson 2009; Mermoz et al. 2005; Carreiras and Pereira 2006). Nevertheless, due to human presence, it is easier to detect and suppress fires in these areas (Pereira and Santos 2003; Sebastián-López et al. 2008). Traditional agricultural practices that encompass the use of fire (e.g., pasture renewal) may also lead to the occurrence of forest fires (Gomes 2006; Sebastián-López et al. 2008). In Portugal, socioeconomic and demographic trends leading to large-scale abandonment of rural areas have contributed to increased fire severity (Moreira et al. 2001).

The approach to wildfire characterization in Portugal discussed in this paper extended the scope of former studies by recognizing land-use changes over time, including socioeconomic variables, and applying weighted generalized linear models rather than classical generalized linear models. It was instrumental for analyzing changes in burned area, number of fires, and fire distribution in Portugal and for examining the influence of topography, land-use, socioeconomic, and climate variables on fire occurrence probability. Specifically, it modeled the proportion of burned area to derive relationships between ecological and socioeconomic features and fire occurrence. The proposed approach combined the use of geographic information systems and statistical analysis techniques. Feature maps were overlaid with perimeters of forest fires, and the proportion of burned area was modeled using weighted regression analysis. This study also extended the approach proposed by González and Pukkala (2007) to further consider socioeconomic factors. Moreover, using a weighted rather than a classical generalized linear model to develop multiple regression analysis (Nelder and McCullagh 1989) was instrumental in taking into account the relative importance of the area occupied by each land class.

Data and methods

Mainland Portugal (Fig. 1) extends over approximately 89,000 km² located between 37°N and 42°N latitude and between 6°W and 10°W longitude. Altitude ranges from sea level to ca. 2,000 m, with the high elevations concentrated in central and northern Portugal. Mean annual temperature and precipitation follow a gradient of increasing temperature and decreasing rainfall from northwest to southeast. Mean annual temperature ranges from 7 to 18°C and annual rainfall from 400 mm to 2,800 mm. Forestry is a key element in the Portuguese landscape pattern, and forests and woodlands extend over one-third of the country. Further, shrub lands extend over about 25% of the country’s area (DGRF 2006). Despite the ecological diversity that results from climatic influences ranging from Mediterranean to Atlantic or continental, over 80% of the forest area is occupied by four species: maritime pine (Pinus pinaster), eucalypt (Eucalyptus globulus), cork oak (Quercus suber), and holm oak (Quercus rotundifolia). The agricultural area extends over about 30% of the country’s area (DGRF 2006).

Fig. 1 Land-use occupation using LOM90

The descriptive analysis of wildfire occurrences in Portugal was based on historical fire information from a 33-year period (1975 to 2007). Burned area mapping was obtained every year in this period, by semi-automated classification of high-resolution remote sensing data (i.e., Landsat Multi-Spectral Scanner (MSS), Landsat Thematic Mapper (TM), and Landsat Enhanced TM+). Mapping by the Remote Sensing Laboratory of Instituto Superior de Agronomia identified 35,198 fire perimeters with burned areas equal to or greater than 5 ha in the period 1975–2007.

In order to further analyze variations in the number and size of wildfires, this temporal horizon was classified according to the availability of up-to-date land cover maps into three 5-year sub-periods (1987–1991, 1990–1994, and 2000–2004). The areas burned in each 5-year sub-period were included as map layers in the geographic information system (GIS) database. This time frame was a compromise between having a larger sample size and minimizing the time impact on land cover changes (Moreira et al. 2009).

Land cover maps included 1990 and 2000 Corine Land Cover (CLC maps) produced by the Remote Sensing Group from Instituto Geográfico Português, at a scale of 1:100,000. The former (CLC 90) was produced using Landsat TM images from 1985 to 1987, while the latter (CLC 00) was produced using Landsat TM Images from 2000. In both cases, the standard classification encompassed 3 levels and 42 classes at the highest level. Thus, the CLC maps provided information on cover type distribution in 1987 and 2000. The set of cover type maps used in this study further included a map (LOM90) produced by Instituto Geográfico Português using cartographic information on a scale of 1:25,000 from aerial photography mostly dated from 1990. LOM90 land cover mapping was more detailed than CLC’s and provided the information about cover type distribution in 1990. One important difference between CLC and LOM information is that the former did not separate eucalyptus forests from hardwoods, whereas LOM did.

In order to study the relationships between ecological and socioeconomic features and fire occurrence in the three sub-periods, cover types were classified into 10 classes (Table 1). For modeling purposes, four other environmental and socioeconomic variables were considered: altitude, slope, proximity to roads, and population density (Table 1). The selection of variables and their ordinal-level segmentation was based on extensive preliminary data analysis.

Table 1 Description of variables used to characterize fire occurrence in Portugal

The country’s digital terrain model (DTM) data were used to provide information about the distribution of altitude and slope. This DTM was obtained from elevation vector data used in the production of the Ortophoto-cartographic Series at a scale of 1:10,000 by the Instituto Geográfico Português. The distribution of the country’s area per classes of altitude and slope used a 225-m grid size because the minimum fire patch area was 5 hectares. The Spatial Analyst module of ARCGIS 9.2 was used to derive the altitude and the slope map layers. ARCGIS 9.2 was also used to create a 1 km buffer around roads in the Estradas de Portugal, S.A. map, and to derive the GIS layer with the distribution of the study area per road proximity classes. The country population data (number of inhabitants living in each parish) were obtained from Instituto Nacional de Estatística (Population Census from 1991 and 2001). These data were used to calculate population density per square kilometer and to derive the population density class map. It was assumed that population density and road proximity did not change over the 5 years in each sub-period (Fig. 2).
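The two calculations mentioned above can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the parish figures are made up for the example and do not come from the study data.

```python
# Illustrative arithmetic only; the parish figures below are hypothetical.
CELL_SIDE_M = 225      # grid size stated in the text
M2_PER_HA = 10_000     # square metres in one hectare


def cell_area_ha(side_m: float) -> float:
    """Area of a square grid cell, in hectares."""
    return side_m * side_m / M2_PER_HA


def population_density(inhabitants: int, area_km2: float) -> float:
    """Inhabitants per square kilometre for one parish."""
    return inhabitants / area_km2


# A 225 m x 225 m cell covers just over the 5 ha minimum fire patch.
print(cell_area_ha(CELL_SIDE_M))        # 5.0625

# Hypothetical parish: 1,200 inhabitants over 60 km2.
print(population_density(1200, 60.0))   # 20.0
```

A cell of this size is therefore the coarsest square grid that still resolves the smallest mapped fire perimeter.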

Land cover type, altitude, slope, proximity to roads, and population density GIS layers for each of the 5-year sub-periods were overlaid using ARCGIS to produce three maps where each polygon represents a contiguous homogeneous area (hereafter designated as stratum) (Fig. 3). The five classification criteria allow a maximum of 720 homogeneous strata. In the second sub-period (1990–1994), the cover type map was more detailed and 716 strata were present; 543 and 558 strata were present in the first (1987–1991) and third (2000–2004) sub-periods, respectively. The polygon coverage for each sub-period was overlaid with the burned area layer for the corresponding 5-year period. This provided the data needed to estimate the proportion of each stratum that was burned during the sub-period.
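The aggregation step described above can be sketched as follows. The sketch assumes that the GIS overlay has already been exported as one record per polygon, and that a stratum is identified by its combination of the five class labels. The record layout, class labels, and areas are hypothetical.

```python
from collections import defaultdict

# Hypothetical overlay output: one record per polygon with its five class
# labels, its area (ha), and the area inside fire perimeters (ha).
polygons = [
    {"cover": "Shrubs", "alt": ">400 m", "slope": ">5%", "road": ">1 km",
     "pop": "<25", "area": 120.0, "burned": 30.0},
    {"cover": "Shrubs", "alt": ">400 m", "slope": ">5%", "road": ">1 km",
     "pop": "<25", "area": 80.0, "burned": 20.0},
    {"cover": "PermCrops", "alt": "<200 m", "slope": "<5%", "road": "<1 km",
     "pop": ">100", "area": 100.0, "burned": 0.0},
]

KEYS = ("cover", "alt", "slope", "road", "pop")


def burned_proportion_by_stratum(records):
    """Sum area and burned area per class combination and return p = burned / area."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for rec in records:
        key = tuple(rec[k] for k in KEYS)
        totals[key][0] += rec["area"]
        totals[key][1] += rec["burned"]
    return {key: {"area": area, "p": burned / area}
            for key, (area, burned) in totals.items()}


strata = burned_proportion_by_stratum(polygons)
for key, values in strata.items():
    print(key, values)
```

Keeping the stratum area alongside the proportion matters because that area is what the weighted models later use as the weight.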

Fig. 2 Population density in 2001 (Census 2001)

Fig. 3 Fire perimeters between 1987 and 1991 in Portugal (left). A zoom over a burned area is shown on the right, together with the independent variables: a land-use classes, b altitude classes, c slope classes, d road proximity classes, e population density classes, f perimeters of fire events, and g layer indicating forest classes enclosed within the fire perimeters

The relationships between ecological and socioeconomic features and fire occurrence in the three sub-periods were analyzed with weighted generalized linear models (WGLMs). This approach takes as response variables transformations of the proportion of burned area in each stratum i (\( p_{i} \)) and as predictors the levels of the covariates (land cover type, altitude, slope, proximity to roads, and population density). The main objective of WGLMs is to develop multiple regression analysis using the weighted least squares method (Nelder and McCullagh 1989), where the weights take into account the relative importance of the area of each stratum. The weighting also helps to meet the requirements of multiple regression. WGLMs are adequate for addressing situations where the variance is not constant and/or the errors are not normally distributed. This is often the case for response variables expressed as proportions (Nelder and McCullagh 1989). This study tested both logit and arcsin transformations of \( p_{i} \):

$$ y_{i} \equiv \log \left( \frac{p_{i}}{1 - p_{i}} \right) = X_{i}^{t} \beta + \varepsilon_{i} \qquad (1) $$
$$ \mu_{i} \equiv \arcsin \left( \sqrt{p_{i}} \right) = X_{i}^{t} \beta + \varepsilon_{i} \qquad (2) $$

where the random errors \( \varepsilon_{i} \) may be considered independent Gaussian random variables with mean zero and variance \( \sigma^{2} \), \( \varepsilon_{i} \sim N(0,\sigma^{2}) \), i = 1,…,n, and β is the regression coefficient vector associated with the covariate vector \( X_{i} \).
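To make the weighted fit of Eq. 1 concrete, the following pure-Python sketch applies the logit transformation and solves the weighted least squares normal equations, with stratum areas as weights. It is an illustration under stated assumptions, not the authors' R code. The four strata, their dummy covariates, and their areas are hypothetical, and every proportion is assumed to lie strictly between 0 and 1 because the logit is undefined otherwise.

```python
import math


def logit(p):
    """Transformation of Eq. 1; requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def weighted_least_squares(X, y, w):
    """Return beta minimising sum_i w_i * (y_i - X_i^t beta)^2."""
    k = len(X[0])
    xtwx = [[sum(wi * xi[a] * xi[b] for xi, wi in zip(X, w))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * xi[a] * yi for xi, yi, wi in zip(X, y, w))
            for a in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical strata. Columns: intercept, shrub dummy, high-altitude dummy.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
p = [0.02, 0.10, 0.05, 0.20]            # proportion of each stratum burned
area = [5000.0, 1200.0, 800.0, 300.0]   # stratum area (ha), used as weight

y = [logit(pi) for pi in p]
beta = weighted_least_squares(X, y, area)
print([round(b, 3) for b in beta])
```

With area weights, a large stratum pulls the fitted coefficients more strongly than a small one, which is the motivation the text gives for preferring the weighted model.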

The models were estimated using the backward logistic regression, and the parameters were estimated using the method of maximum likelihood in the R software version 2.7 (R Development Core Team 2008). This estimation considered all the covariates and the relevant interactions between covariates.

Both the coefficient of determination, pseudo \( R^{2} \), and the Akaike Information Criterion (AIC) were used to select the best model for each 5-year sub-period. After this selection, the goodness of fit of each model was tested using deviance statistics. The size of the discrepancy between the fitted values produced by the model and the values of the data is a measure of the inadequacy of the model. Deviance statistics measure this discrepancy in a WGLM in order to assess goodness of fit. If L denotes the likelihood and D the deviance of a model involving p parameters, the deviance may be simply defined as minus twice the log likelihood, D = −2 log L. These statistics follow a Chi-square distribution, and the corresponding P-values were used to test the hypothesis of model adequacy. To further assess the goodness of fit, model residuals were analyzed using Normal Q-Q plots and Cook’s distances. Finally, a collinearity diagnostic was conducted to check whether the covariates were correlated.
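The two selection quantities can be written down directly. The sketch below uses the deviance as defined in the text, D = −2 log L, and the standard form of the AIC, 2k − 2 log L for a model with k parameters. The candidate models and their log-likelihoods are made-up values for illustration only.

```python
def deviance(log_likelihood: float) -> float:
    """D = -2 log L, as defined in the text."""
    return -2.0 * log_likelihood


def aic(log_likelihood: float, n_params: int) -> float:
    """Standard AIC: 2k - 2 log L (lower is better)."""
    return 2.0 * n_params + deviance(log_likelihood)


# Hypothetical candidates: (log-likelihood, number of parameters).
candidates = {
    "all covariates": (-120.5, 12),
    "without slope": (-121.0, 9),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print("preferred by AIC:", best)
```

In this made-up case the smaller model is preferred: dropping three parameters costs only half a unit of log-likelihood, so the penalty term outweighs the loss of fit.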

Results

In the 33-year period (1975 to 2007), there were 35,194 wildfires greater than 5 ha. In total, they burned about 3.8 × 10^6 ha. The analysis of yearly data (Fig. 4) shows that the burned area ranged from 15,500 ha in 1977 to 440,000 ha in 2003. In the year with the largest burned area (2003), a single fire perimeter extended over about 58,000 ha.

In the first sub-period (1987–1991), there were 7,672 wildfires and the total burned area extended over 647,312 ha. Only 232 wildfire perimeters were larger than 500 ha, accounting for 43% of the total area burned in the sub-period (Table 2). In 1987, there was a large wildfire that burned nearly 13,000 ha. In 1988, only 656 wildfires were recorded, burning about 4% of the total burned area in this sub-period. In 1989, the burned area and the number of fires accounted for about 32% and 30%, respectively, of the totals for this sub-period.

Table 2 Number of fire events and total area burned, during the sub-periods 1987–1991, 1990–1994, and 2000–2004

In the second sub-period (1990–1994), a lower number of wildfires (5,706) were recorded and the total burned area extended over 442,745 ha. The average area burned per wildfire (77 ha) was also the lowest among all three sub-periods. In this sub-period, 149 wildfires extended over 500 ha, accounting for 44% of the burned area (Table 2). Yet none extended over 10,000 ha. In 1993, only 462 wildfire events were recorded, the lowest yearly number in the sub-period.

During the third sub-period (2000–2004), both the number of wildfires (7,383) and the total burned area (930,128 ha) increased substantially. In this sub-period, only 264 wildfires extended over 500 ha and yet they accounted for 65% of the burned area (Table 2). Moreover, four wildfire perimeters extended over an area greater than 20,000 ha. They occurred in 2003 and 2004, and they represented 15% of the burned area in the third sub-period. Further, the area burned in 2003 accounted for about 47% of the area burned in this 5-year sub-period (Fig. 3).

Descriptive statistics of wildfires historical data provided information about changes in the number and in the size of wildfires during the study period. Yet this information had to be combined with other spatial variables such as topography, land cover, proximity to roads, and population density to help explain those changes and to identify fire occurrence patterns at the landscape level to further identify the areas that are most susceptible to fire. The ten most burned combinations in the three periods show that shrubs were the most vulnerable land cover (Table 3).

Table 3 The ten most burned combinations, in the three periods

In both the first (1987–1991) and the second (1990–1994) sub-periods, the highest percentage of burned area occurred in the stratum with shrubs at altitudes over 400 m, located at more than a kilometer from roads and with population density lower than 25 inhabitants per km². In the third sub-period (2000–2004), the stratum with the highest percentage of area burned was hardwoods, at altitudes between 200 and 400 m in areas with low population density and more than a kilometer from a road. This stratum occupies an area of approximately 28,500 ha. In the case of other strata, the relative importance of the percentage of area burned was approximately the same in all three sub-periods.

Models using the logit transformation performed better than those using the arcsin transformation and were retained for the rest of this study. The coefficients of determination (\( R^{2} \)) of the weighted generalized linear models (WGLM) for the three sub-periods reached values between 0.87 and 0.89 in the case of the logit transformation of \( p_{i} \) (Table 4). Interactions between covariates were not significant and were not included in the final models. The goodness of fit of the three models was tested using deviance statistics. The values were 0.316, 0.317, and 0.242 in the case of the first, second, and third sub-periods, respectively. The comparison between these deviance statistics and the corresponding p-values (1 in all three sub-periods) for Chi-square distributions with 523, 698, and 542 degrees of freedom leads to the acceptance of the null hypothesis that the models fit the data well in all three sub-periods. The adequacy of the models was further checked by analyzing the residuals using both Normal Q-Q plots and Cook’s distances. This analysis further confirmed the goodness of fit of the three models. The former showed that residuals are normally distributed, while the latter showed that there are no large residuals that may distort the accuracy of the regression (all distances are lower than 0.05).

Table 4 Regression parameter estimates and fit statistics for models for periods 1987–1991, 1990–1994, and 2000–2004

The generalized variance inflation factors (GVIF) ranged from approximately 1.001 to 1.01, showing that the covariates were not correlated and that multicollinearity was not a problem (Table 5).

Table 5 Values of the generalized variance inflation factors (GVIF) for the collinearity diagnostics for models for periods 1987–1991, 1990–1994, and 2000–2004
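For readers unfamiliar with GVIF, the sketch below computes it for one model term under the usual determinant-ratio definition, det(R11) · det(R22) / det(R). Here R is the correlation matrix of the design columns without the intercept, R11 is the block for the columns of the term, and R22 is the block for all other columns. That the study used this definition is an assumption, and the dummy columns are hypothetical.

```python
import math


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    n = len(columns[0])
    centered = [[v - sum(col) / n for v in col] for col in columns]
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    k = len(columns)
    return [[sum(x * y for x, y in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]


def submatrix(m, idx):
    return [[m[i][j] for j in idx] for i in idx]


def gvif(columns, term_idx):
    """GVIF of the term whose design columns are listed in term_idx."""
    r = correlation_matrix(columns)
    others = [i for i in range(len(columns)) if i not in term_idx]
    return det(submatrix(r, term_idx)) * det(submatrix(r, others)) / det(r)


# Hypothetical dummy columns for two covariates observed on four strata.
road_far = [0, 0, 1, 1]
high_alt = [0, 1, 0, 1]
print(gvif([road_far, high_alt], [0]))
```

A value of 1 means the term is uncorrelated with the remaining covariates, so values such as those reported in Table 5 indicate almost no variance inflation.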

The weighted generalized linear models (WGLM) confirmed that in the first sub-period, areas occupied by shrubs were more likely to burn. Mixed stands (HardSoftEuc), Softwoods, and Hardwoods were the second, third, and fourth cover types that most impacted the proportion of area burned. As expected, the no fuel (NoFuel) and the permanent crops (PermCrops) land cover classes had a negative impact on the proportion of burned area. In the case of the first and third sub-periods, the land cover type map CLC 90 did not provide data about individual forest species and the Euc and SoftEuc classes could not be analyzed separately (i.e., both eucalyptus and hardwoods are pooled in the same category). As for altitude, the regression coefficients indicated that higher altitude values were associated with a higher proportion of burned area. The proximity-to-roads covariate behaved similarly, i.e., larger distances led to an increase in the proportion of area burned. Conversely, an increase in population density led to a decrease in the proportion of area burned.

The model developed for the third sub-period showed patterns similar to those of the first sub-period. It showed that areas occupied by shrubs were more likely to burn. Regarding the proximity to roads, the regression coefficients indicated that larger distances led to an increase in the proportion of area burned, whereas an increase in population density led to a decrease in the proportion of area burned (Fig. 4).

Fig. 4 Annual burned area and number of fires recorded in Portugal during the period 1975–2007

Compared to the other two periods, the model developed for the second sub-period included a more detailed description of land cover categories with a separation of different forest types (e.g., Euc and SoftEuc). In this sub-period, the shrub lands were still the class that most impacted the proportion of area burned, i.e., they increased the proportion of area burned. Furthermore, the areas occupied by conifers and eucalyptus were also very susceptible, whereas mixed forests including hardwoods were less affected by fire. In contrast, the no fuel, annual crops, and permanent crops land cover classes had a negative impact on the proportion of area burned. In this period, areas with slopes greater than 5% were more susceptible to burning. In general, the higher the altitude and the distance to roads (accessibility), the higher the proportion of area burned. As for the other periods, an increase in population density led to a decrease in the proportion of area burned (Table 4).

Discussion and conclusions

An approach for characterizing fires in Portugal is presented. The analysis of historical fire data allowed modeling the variation in the distribution of fire size, changes in burned area, and number of fires over the study period. The combination of wildfire historical data with ecological and socioeconomic variables (namely topography, land use, proximity to roads, and population density) helped explain changes in the number and size of wildfires. It further contributed to identifying fire occurrence patterns at the landscape level. The results may help managers and policy-makers to develop effective fire prevention policies.

A novelty of the proposed approach is the use of weighted generalized linear techniques (WGLM). The use of WGLM rather than classical generalized linear models allows taking into account the relative importance of the area associated with specific ecological and socioeconomic features when assessing its impact on the proportion of area burned. The proposed approach also extends the scope of former studies (e.g., González and Pukkala 2007; Sebastián-López et al. 2008) by recognizing land-use changes and by including socioeconomic variables in the analysis. This is instrumental to acknowledge key relationships between ecological and socioeconomic features and the proportion of area burned (Wittenberg and Malkinson 2009).

The analysis of the three sub-periods under study showed an overall increase in area burned for all the classes of fuel (cover type). This increase is mainly due to the large fires of 2003 in the third sub-period that devastated around 440 × 10^3 hectares. The second sub-period was characterized by a smaller area burned and had a relatively small number of fires compared with the other two.

Modeling results underlined the impact of land cover type on the proportion of area burned. Clearly, shrubs were the cover type with the highest impact on the proportion of area burned. This has also been indicated by other authors (Moreira et al. 2001, 2009; Catry et al. 2009a, b; Nunes et al. 2005; Pereira et al. 2006; González and Pukkala 2007; Wittenberg and Malkinson 2009) and can be explained by a combination of a higher rate of fire spread in this fuel type (because of both fuel properties, namely the presence of resins and flammable essential oils, and its widespread occurrence on steeper slopes), a higher frequency of ignitions (e.g., to create pastures), and a lower fire fighting priority (Moreira et al. 2009). In contrast, annual and permanent crops were much less fire prone. This is in agreement with results presented by Sebastián-López et al. (2008). Our study confirmed that hardwoods, either as pure or as mixed stands, decrease the fire risk in forested areas when compared to pine and eucalyptus stands. Moreira et al. (2009) explained these findings by differences in fuel load, moisture content, and flammability. Wittenberg and Malkinson (2009) also showed that pine stands are more flammable than oak stands. In short, this study confirmed that for most land-use classes, fire behaves selectively, with a marked preference for shrub lands in terms of both fire number and fire size, which is consistent with the findings of other studies (e.g., Cumming 2001; Nunes et al. 2005; Bajocco and Ricotta 2008).

Slope is an important factor impacting fire behavior because it accelerates the rate of spread (Agee 1993). This probably explains the higher frequencies of fire found on steeper slopes. In relation to altitude, recent work (Catry et al. 2009a, b) showed that fire ignitions are more likely at higher altitudes, probably as a consequence of pastoral burns or a higher frequency of lightning. Again, this is consistent with our results of a higher proportion of burned area for strata at higher elevation.

The statistical analysis further showed that socioeconomic features such as the proximity to roads and population density impacted the proportion of area burned. Population density has been found to be the main driver of fire ignitions in Portugal (Catry et al. 2009a, b; Sebastián-López et al. 2008). Yet although fire ignitions in Portugal, as in other regions where fires are human-caused, are much more likely close to roads (Catry et al. 2009a, b; Wittenberg and Malkinson 2009), recent analyses (Romero-Calcerrada et al. 2010; Moreira et al. 2001) suggested that ignitions that resulted in large fires occur further away from roads. This is consistent with our results. The proportion of burned area increases with the distance from the road network, as accessibility (e.g., for fire fighters) is lower (Cardille et al. 2001; Vasconcelos et al. 2001; Sebastián-López et al. 2008; Badia-Perpinyà and Pallares-Barbera 2006). Conversely, the proportion of burned area decreased as population density increased, because lower densities may delay fire detection and increase the time before initial suppression operations.

In summary, the presented methodology was suitable for studying the relationship between ecological and socioeconomic features and fire occurrence. Further, it provided insight needed to develop fire prevention policies. This study showed that emphasis has to be placed on fuel management, as land cover has a substantial impact on the proportion of area burned. It confirmed that forest management may play a key role in successful fire prevention. Further research is needed in Portugal to translate these findings into information about specific management practices to diminish wildfire occurrence probability and wildfire damage.