1 Introduction

The most urbanized province of Turkey in terms of economy and infrastructure is Istanbul. With a population exceeding 15 million, it accounts for almost 20% of Turkey’s total population (TUIK 2019). Of great concern, however, is the rising number of fire incidents in Istanbul, exacerbated by the population growth and urban infrastructure expansion that accompany urbanization. According to the 2016 annual report of the Istanbul Metropolitan Municipality Department of Fire Brigade (IMM 2016), 58,666 fire incidents occurred, with an average response time of more than five minutes.

From this perspective, this research is motivated by the critical need to mitigate fire impacts and improve fire response within the urban environment of Turkey, particularly Istanbul. There is an urgent need to explore tools and methods that have emerged with advances in information technology in order to enhance the urban planning process for disaster mitigation (Erden 2012). To be more effective, urban planning now involves more complex decision-making and multiple stakeholders in a participatory, rational and communicative way that complements modern planning theories and supports sustainability.

Multi-criteria decision-making (MCDM) techniques were first developed in the 1960s to resolve difficulties in harmonizing diverse opinions among interest groups and in evaluating complex problems that have many criteria and conflicting objectives (Afshari et al. 2016). The key benefit of integrating MCDM with geographical information systems (GIS) is the ability of GIS to manage, analyse and present geographic data in the form of maps to support decision-making. The map outputs can serve as a basis for discussing and reviewing decision problems, which may clarify the problem and lead to the optimal selection among decision alternatives (Malczewski and Rinner 2015). Regardless of the MCDM method chosen, the common goal is to evaluate and select among many available alternatives based on a large number and variety of criteria in order to obtain the most viable planning outcome. This is achieved through a systematic analysis procedure that overcomes the limitations of unstructured individual or group decision-making (Afshari et al. 2016; Bhushan and Rai 2007; Vaidya and Kumar 2006). Within the context of urban planning, information on the state and dynamics of urban regions is multi-varied (i.e. consisting of scales, time, dependences, values, beliefs and objectives) and concerns diverse interest groups with particular political dimensions. Therefore, to be relevant, the planning process must consider the decision-making objectives and the feasibility of agreed planning decisions by contextualizing the factors that structure the decision, so as to avoid unnecessary conflict (Prévil and Thériault 2003). In this view, methodologies and tools are required to manage and analyse this kind of complex, spatial information for urban emergency facility planning and fire risk mitigation, as in this study.

This research therefore proposes the use of GIS-based MCDM techniques for the optimal selection of new urban fire station and emergency facility sites in Istanbul, which are required to reduce the fire response time to within five minutes. The research is significant for mitigating fire impacts and improving fire response. GIS-based MCDM models can improve the quality of decision-making by merging spatial data and expert value judgements (Malczewski 2006). In this paper, a recently developed MCDM technique called the best–worst method (BWM) (Rezaei 2015) is integrated with GIS to optimally site new fire station facilities in the Istanbul region. Locating suitable areas for fire stations is an intricate procedure that requires careful analysis of many influencing and conflicting criteria. Six criteria were identified from a range of social/demographic, built environment, spatial/accessibility and risk and safety variables based on relevant literature, expert opinion and international norms. Thereafter, the BWM was applied through a group decision-making (GDM) approach to determine the subjective weights of the criteria from the preference evaluations of two groups of decision-makers (DMs) composed of academic-related professionals and fire brigade practitioners. These criteria weights were ultimately used as input in GIS to resolve the selection problem, producing two raster suitability maps, one for each DM group. The GIS capabilities supported the decision-making process through analysis and visualization of the developed BWM model.

The research outcome makes an important contribution to the literature and field of urban emergency facility location, as few studies have applied the proposed model. Further, the BWM model result is tested by comparison with the AHP in terms of the determined criteria weights, reliability and consistency. This study additionally explores the BWM preference evaluations of two DM groups comprising academicians/professionals and fire brigade personnel who have the requisite knowledge and experience in emergency management and planning. The findings provide one of the first investigations of, and new insights into, the different perspectives and opinions of these DM groups about the importance of the criteria and how this affects the overall planning outcome for siting fire stations and emergency facilities. Understanding the link between the preference evaluations of these two DM groups will help to gain a deeper understanding of GDM processes and their intersections with group consensus and social perspectives.

The current paper is organized into the remaining sections as follows. The next section provides a review of the literature on emergency management and facility planning issues and the background on BWM’s relevant applications. Section 3 outlines the proposed methodology, GDM process and its implementation, followed by the BWM results, analysis and discussion of the two groups of DMs compared with the AHP for validation, GIS analysis and production of suitability maps for comparison in Sect. 4. Finally, the study conclusions are clarified in Sect. 5.

2 Background and related works

The selection of suitable areas for emergency facilities is affected by many influencing factors and involves a finite number of alternative locations, thereby representing a typical multi-attribute decision-making (MADM) problem, which falls under the class of MCDM methods. For such MADM problems, the method for determining the criterion weights and the method for ranking the alternatives are crucial in realizing optimal outcomes. Generally, criterion weight determination methods are categorized into subjective and objective weighting methods (Wu et al. 2019). Subjective weighting methods, such as the analytic hierarchy process (AHP) (Saaty 1980), evaluate the weights based on preference judgements from associated experts. The AHP is based on pairwise comparisons that take a long time to evaluate and whose pairwise comparison matrix can lack consistency (Wu et al. 2019).

Recently, a new MCDM technique called the best–worst method (BWM) was developed by Rezaei (2015, 2016). It uses fewer pairwise comparisons, and therefore takes less time to implement, and has better consistency than existing subjective weighting methods such as AHP (Salimi and Rezaei 2018; Wu et al. 2019). BWM evaluates several alternatives to select the best alternative(s) on the basis of pairwise comparisons between each of two reference criteria (the best and the worst, identified by the decision-maker) and the other criteria (Rezaei 2015). As demonstrated by Rezaei (2015), the BWM technique can solve multi-criteria problems efficiently and reliably and requires fewer comparison data: BWM needs \(2n - 3\) comparisons, whereas AHP, for example, requires \(n\left( {n - 1} \right)/2\) comparisons. Therefore, the BWM does not need complete pairwise comparison matrices, is easy to implement and takes less time than alternative MCDM methods (Suhi et al. 2019).
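To make the difference in elicitation effort concrete, the two comparison counts quoted above can be tabulated for a few problem sizes. The sketch below is illustrative only; the function names are hypothetical and do not come from the cited works.

```python
def bwm_comparisons(n: int) -> int:
    """Pairwise comparisons required by BWM for n criteria: 2n - 3."""
    return 2 * n - 3


def ahp_comparisons(n: int) -> int:
    """Pairwise comparisons required by a full AHP matrix: n(n - 1)/2."""
    return n * (n - 1) // 2


# Print n, the BWM count and the AHP count for a few problem sizes.
for n in (4, 6, 9):
    print(n, bwm_comparisons(n), ahp_comparisons(n))
```

For a problem with six criteria, as in this study, this gives 9 comparisons for BWM against 15 for AHP, and the gap widens as the number of criteria grows.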

Considering these advantages over other MCDM methods such as AHP, ANP and SMART, scholars have recently applied the BWM in facility location and site selection research. Pamucar et al. (2017) proposed a hybrid GIS-MCDM model in Serbia that applied rough BWM and rough MAIRCA methods to determine the criteria weights for input into the GIS, resulting in a final map for selecting the most suitable locations for wind farms. Another hybrid BWM-EDAS model was applied by Özmen and Aydoğan (2020) to the logistics centre location problem in Kayseri, Turkey. BWM was used to calculate the weights of each criterion and sub-criterion, and evaluation based on distance from average solution (EDAS) was used in the final stage to rank alternative logistics centre areas according to these criteria. The main evaluation factors for optimal logistics centre location included land, market, social, transportation, environmental effects, costs, risk and safety, national stability and international operation and management. Kheybari et al. (2019) suggested the BWM for selecting suitable locations for a bioethanol facility in Iran. The researchers grouped the many criteria that affect the optimal location of a biofuel facility under only three main criteria using the sustainability approach, namely social, economic and environmental, thus simplifying the problem resolution and model application for optimal site selection. Other related works were conducted by Zolfani et al. (2019) for hotel site selection in Iran based on a proposed BWM and Weighted Aggregated Sum Product Assessment (WASPAS) approach; You et al. (2016) for selection of a culture centre in China by integrating modifications of BWM with ELECTRE III; and Stević et al. (2017) for selection of rail wagons for internal regional transportation across Bosnia and Herzegovina by proposing a rough BWM and rough simple additive weighting (SAW) approach.

As reviewed above, few studies to date have applied the BWM to facility location and site selection. The present study therefore contributes to the literature by proposing a hybrid GIS-based BWM model with two decision-maker (DM) groups to select new fire station facility areas more reliably and efficiently for the case study region of Istanbul province, Turkey, not previously considered in Erden and Coskun (2010) or Nyimbili and Erden (2020). Further, with reference to the prior work by Erden and Coskun (2010), in which combined AHP and GIS methods were applied to fire station location in Istanbul city, this study improves the methodology by overcoming the drawbacks of AHP.

3 Research methodology

The urban emergency facility location problem of selecting new fire station sites is spatial in nature and requires an elaborate process of handling multiple and conflicting criteria, analysing data and evaluating many alternatives to achieve optimal results. The most influential factors affecting the decision-making process were identified and abstracted from social/demographic, built environment, risk and safety and spatial considerations as part of the criteria screening procedure.

The model proposed in this study integrates GIS with the BWM. Through the series of steps outlined in the subsequent subsections, the BWM was applied using a group decision-making (GDM) approach to determine the subjective weights of the identified criteria from the preference judgements of two different groups of decision-makers (DMs) comprising academic-related professionals and fire brigade personnel. Survey questionnaires based on the BWM were designed and completed by a total of nineteen DMs and experts using a 9-point scale. An objective method based on the consistency index of the individual decision matrices determined from the BWM was used to derive the weights of the experts (Koksalmis and Kabak 2019). The weight coefficients of the criteria for individual DMs were calculated using the BWM optimization model, and the arithmetic mean was then used to aggregate the weight for each criterion. These criteria weights were used as input into GIS and applied to processed and re-classified criteria map layers through a weighted sum analysis to produce two final raster suitability maps, one for each DM group, showing the optimal areas for the new fire station and emergency facilities. Integrating GIS into the BWM model provided the requisite support for the spatial decision-making problem of siting new fire stations because of the inherent capabilities of GIS to handle spatial data and its powerful visualization and analysis tools. Figure 1 presents the general approach proposed in this study.

Fig. 1 Proposed group decision-making (GDM) framework based on the BWM-GIS model

3.1 Case study

The most densely populated and industrialized province of Turkey is Istanbul. With a population of 15,067,724 as of 2018, the province covers an estimated total area of 5343.02 km² (TUIK 2019). Owing to population growth and infrastructural expansion, the number of fire incidents in Istanbul has been rising, as recorded in the annual report of the Istanbul Metropolitan Municipality (IMM 2016), with the time taken to reach the incident locations exceeding five minutes.

Considering this situation, this research proposes optimally planned new areas for fire station and emergency facilities to reduce the fire response time to five minutes or less, the level accepted by international standards (Dong et al. 2018; NFPA 2010; Wang 2019; Yao et al. 2019). The research outcome underscores the importance of mitigating the impacts of fire risk and improving fire response activities as part of disaster and emergency management (DEM). To achieve the research objectives, this study covered the whole extent of Istanbul province, comprising about 960 sub-districts, as depicted in Fig. 2.

Fig. 2 Map of Istanbul province showing existing fire stations and population density

The population density ranges up to 1333.2 people per hectare (ha) with a larger concentration of existing fire and emergency facilities located in the densely populated metropolitan areas of Istanbul. Within these urbanized regions, the high population density is associated with a high risk of fires resulting from increased socio-economic, tourism, infrastructure expansion and commercial activities. The fire departments are engaged in the key function of fire prevention and suppression to minimize loss of life and property. Additional services include ambulance, emergency rescue and recovery activities such as in the event of automobile accidents, disasters that include floods, earthquakes, landslides, etc.

3.2 BWM

The best–worst method (BWM), recently introduced by Rezaei (2015), is a subjective weighting, pairwise comparison-based multi-criteria decision-making (MCDM) technique. BWM was selected for this research because of its advantages over other MCDM methods: it requires fewer comparison data, yields better consistency in pairwise comparisons and hence more reliable weights, and is easy for decision-makers to understand and revise for increased consistency (Rezaei 2015, 2016; Rezaei et al. 2018). Since its introduction, BWM has been applied in many studies covering a wide range of domains such as location and site selection (Kheybari et al. 2019; Özmen and Aydoğan 2020), mining (Ajrina et al. 2018), energy (van de Kaa et al. 2019; Zhao et al. 2018), manufacturing (Moktadir et al. 2018), supply chain (Liu et al. 2018a), transportation (Stević et al. 2017), environment (Liu et al. 2018b), water management (Chitsaz and Azarnivand 2017; Nie et al. 2018a, b), risk assessment (Torabi et al. 2016), emergency routes evaluation (Mei et al. 2018), search problem (Sotoudeh-Anvari et al. 2018), technology selection (Ren et al. 2017; Ren 2018), piping selection (Safarzadeh et al. 2018) and materials selection (Zolfani and Chatterjee 2019).

In this study, the BWM was applied to evaluate the criteria weights (\(w_{1} , w_{2} , \ldots , w_{n}\)) for optimally planning locations of the new fire station and emergency facility areas for fire response mitigation in Istanbul province, using the following outlined steps (Rezaei 2015, 2016):

  1. Identify and select the set of decision criteria for analysis \(\left\{ c_{1}, c_{2}, \ldots, c_{n} \right\}\).

  2. Determine the best \(\left( B \right)\) and the worst \(\left( W \right)\) criteria.

  3. Determine the preference of the best criterion \(B\) over all the other criteria via pairwise comparison. In this step, decision-makers (DMs) express their preference using a number between 1 and 9 (a 9-point scale with two extremes, where for a pairwise comparison of \(i\) and \(j\), 1 implies that \(i\) is equally as important as \(j\), and 9 implies that \(i\) is extremely more important than \(j\)).

    $$ A_{B} = \left( a_{B1}, a_{B2}, \ldots, a_{Bn} \right), $$
    (1)

    is the resulting best-to-others vector of comparisons, where \(a_{Bj}\) indicates the preference of the best criterion \(B\) over criterion \(j\) and, clearly, \(a_{BB} = 1\).

  4. Determine the preference of all the other criteria over the worst criterion \(W\). In this step, DMs express their preference using the same 9-point scale.

    $$ A_{W} = \left( a_{1W}, a_{2W}, \ldots, a_{nW} \right)^{T}, $$
    (2)

    is the resulting others-to-worst vector of comparisons, where \(a_{jW}\) indicates the preference of criterion \(j\) over the worst criterion \(W\) and, clearly, \(a_{WW} = 1\).

  5. Compute the optimal weights (\(w_{1}, w_{2}, \ldots, w_{n}\)). The optimal weights are those for which, for each pair \(w_{B}/w_{j}\) and \(w_{j}/w_{W}\), \(w_{B}/w_{j} = a_{Bj}\) and \(w_{j}/w_{W} = a_{jW}\). To satisfy these conditions for all \(j\), a solution is sought in which the maximum of the absolute differences \(\left| w_{B}/w_{j} - a_{Bj} \right|\) and \(\left| w_{j}/w_{W} - a_{jW} \right|\) over all \(j\) is minimized. Based on this concept of minimizing the maximum deviation, the optimal weights are determined from the following mathematical model:

    $$ \begin{gathered} \min \max_{j} \left\{ \left| w_{B} / w_{j} - a_{Bj} \right|, \left| w_{j} / w_{W} - a_{jW} \right| \right\} \hfill \\ {\text{subject to}} \hfill \\ \sum_{j} w_{j} = 1 \hfill \\ w_{j} \ge 0,\quad {\text{for}}\;{\text{all}}\;j \hfill \\ \end{gathered} $$
    (3)

The model in Eq. (3) can be rewritten in the following equivalent form, in which the auxiliary variable \(\xi\) bounds every deviation:

$$ \begin{gathered} \min \,\xi \hfill \\ {\text{subject to}} \hfill \\ \left| w_{B} / w_{j} - a_{Bj} \right| \le \xi ,\quad {\text{for}}\;{\text{all}}\;j \hfill \\ \left| w_{j} / w_{W} - a_{jW} \right| \le \xi ,\quad {\text{for}}\;{\text{all}}\;j \hfill \\ \sum_{j} w_{j} = 1 \hfill \\ w_{j} \ge 0,\quad {\text{for}}\;{\text{all}}\;j \hfill \\ \end{gathered} $$
(4)

Solving the problem in Eq. (4) yields the optimal weights (\(w_{1} , w_{2} , \ldots , w_{n}\)) and ξ*. To check the consistency of the BWM pairwise comparisons, the following formula is applied:

$$ {\text{Consistency Ratio}} = \xi^{*} /{\text{ Consistency Index}} $$
(5)

The consistency index uses maximum possible values of ξ* and can be retrieved from Table 1.

Table 1 Consistency index (CI) table

The smaller the value of ξ*, the lower the consistency ratio and the more reliable the comparisons; conversely, a larger ξ* gives a higher consistency ratio and less reliable comparisons.
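The optimization and the consistency check described above can be sketched in a few lines of Python. This is a minimal illustration and not the solver used in the study: it finds ξ* for the min-max model of Eqs. (3) and (4) by bisection on ξ, and it computes the consistency index from the quadratic reported by Rezaei (2015), which is an assumption here because the body of Table 1 is not reproduced in this text. All function names and the input judgements are hypothetical. When the judgements are inconsistent, the model can admit more than one optimal weight vector; the sketch simply returns one of them.

```python
import math


def consistency_index(a_bw):
    """Assumed from Rezaei (2015): maximum xi for a given a_BW, i.e. the
    smaller root of xi^2 - (1 + 2*a_BW)*xi + (a_BW^2 - a_BW) = 0."""
    b = 1 + 2 * a_bw
    return (b - math.sqrt(b * b - 4 * (a_bw * a_bw - a_bw))) / 2


def solve_bwm(a_b, a_w, best, worst, iters=60):
    """Bisection on xi for the min-max BWM model.

    a_b[j] = a_Bj (best-to-others), a_w[j] = a_jW (others-to-worst).
    With w_W fixed to 1, each w_j interacts only with w_B, so a given xi
    is feasible exactly when the intervals allowed for w_B intersect.
    Returns (normalized weights, xi_star).
    """
    n = len(a_b)
    others = [j for j in range(n) if j not in (best, worst)]
    a_bw = a_b[worst]

    def wb_range(xi):
        # Range of w_B compatible with every constraint at this xi.
        lo, hi = max(a_bw - xi, 1e-12), a_bw + xi
        for j in others:
            lo = max(lo, max(a_w[j] - xi, 0.0) * max(a_b[j] - xi, 0.0))
            hi = min(hi, (a_w[j] + xi) * (a_b[j] + xi))
        return lo, hi

    low, high = 0.0, float(max(max(a_b), max(a_w)))  # high is always feasible
    for _ in range(iters):
        mid = (low + high) / 2
        lo, hi = wb_range(mid)
        if lo <= hi:
            high = mid  # feasible: try a smaller xi
        else:
            low = mid
    xi = high
    lo, hi = wb_range(xi)
    w = [0.0] * n
    w[worst] = 1.0
    w[best] = (lo + hi) / 2
    for j in others:
        lo_j = max(a_w[j] - xi, w[best] / (a_b[j] + xi))
        hi_j = a_w[j] + xi
        if a_b[j] - xi > 0:
            hi_j = min(hi_j, w[best] / (a_b[j] - xi))
        w[j] = (lo_j + hi_j) / 2
    total = sum(w)
    return [v / total for v in w], xi


# Hypothetical judgements for three criteria (not data from the study).
a_b = [1, 2, 8]  # best criterion is index 0
a_w = [8, 4, 1]  # worst criterion is index 2
weights, xi_star = solve_bwm(a_b, a_w, best=0, worst=2)
ratio = xi_star / consistency_index(a_b[2])
print(weights, xi_star, ratio)
```

Because these example judgements are fully consistent (2 × 4 = 8), the weights are proportional to 8 : 4 : 1 and ξ* is numerically zero, so the consistency ratio is also zero. When \(a_{BW} = 1\) the consistency index is zero and the ratio is undefined, so that case would need separate handling.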

3.3 Criteria selection

Screening of the criteria for selecting suitable fire station sites is a crucial step in the proposed model. The most influential factors affecting the selection process were identified across social/demographic, built environment, spatial/accessibility and risk/safety variables. After a thorough evaluation and analysis, six specific criteria were selected based on a review of a project report on fire station locations in Istanbul (IMM 1989) and on the suggestion of Johnston (1999) and Gay and Siegel (1987) that the criteria should comprise population density, distances between fire stations and distinct hazards (Erden and Coskun 2010). From these influencing factors and other considerations, within the framework of achieving a comprehensive planning process, the criteria selected and adopted are described in Table 2.

Table 2 Criteria selection assessment

Once the criteria were determined, the subjective evaluation of the defined criteria for optimal fire station locations was undertaken by decision-makers (DMs) as part of the BWM procedure through a group decision-making (GDM) process, as elaborated in the next subsection.

3.4 Group decision-making (GDM) and decision-maker (DM) selection

Group decision-making (GDM), or collaborative decision-making, is classified into two main approaches: process-oriented and content-oriented. Process-oriented approaches relate to the process of making group decisions, with the main goal being the generation of new ideas to understand and structure the problem, while content-oriented approaches focus on the content of the problem and attempt to find an optimal solution within given social or group constraints or objectives (Kabak and Ervural 2017). There are three classes of content-oriented approaches, namely implicit multi-attribute evaluation (or social choice theory), explicit multi-attribute evaluation and game theory approaches. Among these three categories, this research is concerned with explicit multiple attribute evaluation, which refers to multiple attribute decision-making (MADM) with multiple decision-makers and is therefore called multi-expert or multiple attribute group decision-making (MAGDM). The term MADM is used synonymously with MCDM, with “multiple attributes” and “multiple criteria” both describing the multiple and conflicting criteria that are characteristic of decision environments (Kabak and Ervural 2017). MADM becomes MAGDM when multiple DMs are involved and when the criteria, as well as the alternatives in relation to the criteria, are explicitly provided (Kabak and Ervural 2017). In this regard, the research proposed a MAGDM process aimed at arriving at a solution satisfactory to the group for determining optimum weights for suitable fire station facility locations. The study was based on the input of two DM groups drawn from academic-related professionals and fire brigade personnel with relevant knowledge and experience in emergency and planning activities.

The conceptual framework for the MAGDM process proposed for use in this study was based on a literature analysis and adaptation from Kabak and Ervural (2017). Three main stages of the MAGDM process included: structuring and construction, assessment and selection/ranking stage, as earlier illustrated in Fig. 1.

Within the structuring and construction stage, the MAGDM problem was structured by first understanding and defining the decision objective which was to locate suitable areas for the fire station and emergency facilities in Istanbul. This was followed by selecting two groups of DMs for criteria assessment that included academic-related professionals and fire brigade practitioners who have the requisite knowledge and experience within the scope of emergency planning.

The assessment stage, applying the criteria-based assessment approach, was conducted where the criteria were explicitly described and an agreed set of criteria was used by the DMs. From a survey of literature and various information sources, the evaluation criteria and constraints for the decision goal were determined. Applying the Delphi technique, questionnaires were designed to acquire the preference evaluations of the DM groups. The criteria weights were thereafter calculated according to their importance regarding the decision problem via pairwise comparisons of the best–worst method (BWM) technique, representing the preferences and judgements of the DMs. A quantitative method using the consistency index of each DM was applied to derive the objective weights of the respective DMs (Koksalmis and Kabak 2019). The DMs’ individual preferences were computed, initially, and then aggregated.

Finally, in the selection/ranking stage, the collective preference ordering (of the criteria weights) was calculated from the results of the assessment stage, both for all the DMs together and for each of the two DM groups separately. A simple arithmetic mean was used as the aggregation operator and for ranking in this process.
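The aggregation and ranking step can be illustrated with a short sketch. It assumes that each DM contributes one vector of criteria weights and that the group weight of a criterion is the plain arithmetic mean across DMs; the function names, the tie-breaking rule and the example values are hypothetical and are not taken from the study's data.

```python
def aggregate_weights(dm_weights):
    """Arithmetic mean of each criterion's weight across decision-makers."""
    m = len(dm_weights)
    n = len(dm_weights[0])
    return [sum(dm[j] for dm in dm_weights) / m for j in range(n)]


def rank_criteria(weights):
    """Rank 1 goes to the largest weight; ties are broken by position
    (an assumption made only for this sketch)."""
    order = sorted(range(len(weights)), key=lambda j: -weights[j])
    ranks = [0] * len(weights)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


# Hypothetical weights from three DMs over three criteria.
dm_weights = [
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
    [0.4, 0.4, 0.2],
]
group = aggregate_weights(dm_weights)
print(group, rank_criteria(group))
```

Because every individual weight vector sums to one, the averaged group vector also sums to one and can be used directly as the set of criteria weights.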

After the preference orderings of the DMs were evaluated, collectively and for each group separately, a consensus process was undertaken. This process was necessary to determine whether these results could be consolidated as the final weights for the fire station site selection goal and the design criteria for developing the BWM assessment model. For this purpose, an overall measure of the degree of consensus in the DMs’ rankings was computed using Kendall’s coefficient of concordance, W (Kendall 1938). Kendall’s W is a measure of correlation/association for evaluating the degree of agreement between three or more sets of ranks on a given number of subjects/objects (Kendall 1938; Sheskin 2003). The value of W lies between 0 and + 1 (inclusive of both). A value of \(W = 1\) indicates complete agreement among the sets of ranks, and a value of \(W = 0\) indicates no pattern of agreement (Kendall 1938; Lee and Chan 2008; Sheskin 2003). The criteria weights determined from each DM’s preference evaluations were converted into ranks, so that the DMs supplied the sets of ranks required for computing Kendall’s W. Kendall’s W for the rankings of the evaluated criteria weights was calculated to determine the level of agreement among the DMs in each of the two DM groups. Kendall’s W is the ratio of the variance of the sums of the ranks for the subjects or objects to the maximum possible value of that variance (for the relevant values of \(m\) and \(n\)) (Sheskin 2003), as summarized in Eq. (6).

$$ W = \left( {\frac{S}{{m^{2} n \left( {n^{2} - 1} \right)/12}} } \right) $$
(6)

where \(S\) is the sum of the squared deviations of the rank sums \(R_{j}\) from their mean (and hence proportional to the variance of the \(\sum R_{j}\) values), \(m\) is the number of decision-makers (DMs) and \(n\) is the number of subjects they rank (the criteria, in this study).

The closer the value of \(W\) is to 1, the higher the degree of agreement among the DMs in how they rank the six criteria under evaluation.
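Eq. (6) is straightforward to compute once each DM's weights have been converted to ranks. The sketch below is a minimal illustration under two stated assumptions: \(S\) is taken as the sum of squared deviations of the rank sums from their mean, which is the form under which Eq. (6) ranges from 0 to 1, and there are no tied ranks (no tie correction is applied). The function name and the example ranks are hypothetical.

```python
def kendalls_w(rank_sets):
    """Kendall's W for m rank sets over n items (no tie correction).

    rank_sets[i][j] is the rank that decision-maker i gives to item j.
    """
    m = len(rank_sets)
    n = len(rank_sets[0])
    rank_sums = [sum(ranks[j] for ranks in rank_sets) for j in range(n)]
    mean_sum = m * (n + 1) / 2  # mean of the rank sums
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    return s / (m ** 2 * n * (n ** 2 - 1) / 12)


# Hypothetical ranks from three DMs over four criteria.
identical = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(kendalls_w(identical))
```

With identical rankings the rank sums are 3, 6, 9 and 12, so \(S = 45\), the denominator is also 45 and \(W = 1\), which matches the interpretation of complete agreement.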

Applying the BWM model, the two groups of DMs were interviewed to obtain their preference judgements on the decision problem of selecting suitable fire station locations and to determine the related criteria weights. The two DM groups, both with the requisite knowledge and experience in emergency planning services, were chosen in order to incorporate both academic/research professionals and actual practitioners and thereby obtain, objectively, both an insider’s knowledge and an outside perspective. In the survey, face-to-face interviews with the DMs were conducted using the adapted Delphi method. The respondents were asked to indicate which criteria they deemed the most and least important, for use in the BWM to determine the weights and associated preference rankings. Additional statistical tests, namely the one-sample t test, one-way ANOVA and Tukey’s HSD test, were used to analyse and make inferences about the DM group preferences. A total of nineteen experts were interviewed, and information about their profile, background and expertise is given for each DM group in the subsequent subsections.

3.4.1 Academic-related professionals

A pool of ten academic-related professionals was selected for the survey used in the BWM weight determination process. The background of these experts spanned the planning, geomatics and industrial engineering fields, with over 10 years of related experience in academia, research and industry. All the DM experts have been involved in disaster and emergency planning activities and projects in Istanbul. Two of the DMs hold Federal Emergency Management Agency (FEMA) certifications and are therefore well recognized as emergency planning experts. Summary information on the experts’ profiles and their corresponding weights is presented in Table 3.

Table 3 Summary profile and weights of academician DMs

3.4.2 Fire brigade personnel

Nine DMs working in the fire service stations and departments of Istanbul were selected to be interviewed for the BWM weight determination procedure. The fire brigade practitioners, all with more than 10 years of related experience, comprised personnel working as research and planning coordinators, sectional heads, supervisors and firefighters. A summary of the background information about the DMs and their respective weights is provided in Table 4.

Table 4 Summary profile and weights of fire brigade personnel DMs

3.5 Data collection

In this study, the criteria were determined and subjectively evaluated by two groups of experts through a group decision-making (GDM) process using the BWM technique to calculate the final optimal weights, which were used as input into ESRI ArcGIS software. This resulted in the production of raster suitability maps for siting additional fire station facilities across the whole of Istanbul province. Six criteria were determined using the approach outlined in Sect. 3.3. Most of the related data were acquired as vector-based polygon data layers and converted to raster format at a cell resolution of 50 m × 50 m for further processing and analysis in a GIS environment, leading to the generation of the final raster suitability maps after applying the weighted sum analysis function. Table 5 shows the criteria, layer format, description and source of the data collected for GIS modelling in ESRI ArcGIS 10.3 software.

Table 5 Criteria data source, description and processing
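The weighted sum analysis applied to the raster layers amounts to a cell-by-cell weighted combination of the re-classified criteria layers. The sketch below illustrates the arithmetic in plain Python on tiny grids; it is a stand-in for the GIS operation and not the ArcGIS tool itself. The function name, the 1 to 5 re-classification scale, the grid values and the weights are all assumptions made for illustration.

```python
def weighted_sum(layers, weights):
    """Cell-by-cell weighted sum of equally sized raster layers.

    layers is a list of 2-D grids (lists of rows) already re-classified
    to a common suitability scale; weights holds one weight per layer.
    """
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights))
         for c in range(cols)]
        for r in range(rows)
    ]


# Two hypothetical 2 x 2 criteria layers on an assumed 1-5 scale.
layer_a = [[1, 5], [3, 2]]
layer_b = [[5, 1], [3, 4]]
suitability = weighted_sum([layer_a, layer_b], [0.6, 0.4])
print(suitability)
```

Cells with higher values in the output grid are the more suitable locations under the chosen weights, so running the same layers with each DM group's weights gives one suitability surface per group.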

4 Results and discussion

This section discusses the research results: the determination of the criteria weights; the related statistical tests, including Kendall’s W for consensus measurement; a comparative analysis using the analytic hierarchy process (AHP) to validate the BWM model; and the GIS analyses, including the production and comparison of two suitability maps, one for each DM group, that visualize the optimal locations for establishing fire station facilities in Istanbul.

4.1 BWM results

The best–worst method (BWM) results of the questionnaire responses from both the academicians and fire brigade personnel decision-maker (DM) groups were analysed in terms of the pairwise comparisons, preference rankings and the weights of the criteria. Excerpts of pairwise comparisons for each case of one of the experts from the academician (DM1) and fire brigade personnel (DM11) groups are shown in Tables 6 and 7.

Table 6 Excerpt from BWM pairwise evaluation of academic expert (DM1)
Table 7 Excerpt from BWM pairwise evaluation of fire brigade personnel (DM11)

From the perspective of expert DM1 (from the academicians’ group), the density of hazardous materials (DHM) criterion was selected as the best, while the distance from earthquake-prone areas (DER) was considered as the worst criterion (Table 6). Table 7 reflects the viewpoint of DM11 (from the fire brigade personnel group), where the high population density (HPD) criterion was chosen as the best, while the distance to existing fire stations (DEF) was viewed as the worst criterion. Frequencies of the most and least important criteria preference ratings from the BWM weight coefficients calculated for each of the academicians and fire brigade personnel DM groups are shown in Table 8.

Table 8 Frequencies of the most and least important criteria preference ratings from BWM weights for each DM group
Table 9 Summary statistics of aggregated final BWM weights for each DM group

Table 8 shows the number of times each criterion was rated by the DMs as the most and least important within each DM group. In the academician DM group, the most and least important criteria were DHM (chosen by 7 of the 10 DMs) and DEF (chosen by 4 DMs), respectively, while in the fire brigade personnel DM group, HPD (chosen by 5 of the 9 DMs) was the most important criterion and DER (chosen by 4 DMs) the least important. The BWM weight coefficients calculated for each of the individual DMs in both the academician and fire brigade practitioner DM groups are illustrated in Fig. 3.

Fig. 3 BWM criteria weights for each individual DM’s preference in both DM groups

In the academician DM group, the highest weight coefficient value of 0.4793 was computed for the DHM criterion by DM9 and the lowest value of 0.0348 was calculated for the HPD criterion by DM5, whereas, in the fire brigade practitioner DM group, the HPD criterion weight value of 0.4762 by DM14 was the highest and 0.0354 was the lowest weight value determined for the DER criterion by DM13. These preferences from the respondents were used as input for the BWM model to compute the final weights. Table 9 shows the summary statistics of the aggregated final BWM weights (tabulated in the mean column) for each of the academician and fire brigade practitioner DM groups. Additionally, the criteria weight rankings, minimum, maximum and standard deviation (s.d.) values were computed.
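The summary statistics and rank order described above can be illustrated with a minimal sketch. The weights below are hypothetical placeholders, not the values in Table 9, and a plain arithmetic mean over DMs is assumed for the aggregation; the names `dm_weights` and `summarize` are illustrative only.

```python
from statistics import mean, stdev

CRITERIA = ["C1", "C2", "C3", "C4", "C5", "C6"]

# Hypothetical BWM weights (NOT the study's data): one row per DM, one column
# per criterion; each row sums to 1.
dm_weights = [
    [0.20, 0.15, 0.10, 0.40, 0.10, 0.05],
    [0.25, 0.15, 0.15, 0.30, 0.10, 0.05],
    [0.15, 0.23, 0.10, 0.35, 0.12, 0.05],
]


def summarize(weights):
    """Return mean, s.d., minimum and maximum of each criterion's weights."""
    columns = zip(*weights)  # transpose: one tuple of weights per criterion
    return {
        name: {"mean": mean(col), "sd": stdev(col), "min": min(col), "max": max(col)}
        for name, col in zip(CRITERIA, columns)
    }


summary = summarize(dm_weights)
# Rank the criteria from the largest to the smallest mean weight.
ranking = sorted(CRITERIA, key=lambda name: summary[name]["mean"], reverse=True)
print(" > ".join(ranking))
```

If the experts themselves carry unequal weights, the arithmetic mean in `summarize` would be replaced by a weighted mean.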

The consistency ratio (CR) was also calculated for each of the DM groups and determined to be acceptable (less than 0.25). From the final weight ranking results, the criteria were ranked in the following order for the academician DM group: C4 ≻ C1 ≻ C2 ≻ C3 ≻ C5 ≻ C6 and for the fire brigade practitioner DM group: C1 ≻ C4 ≻ C2 ≻ C5 ≻ C3 ≻ C6. The academician DM group viewed DHM (C4) and DER (C6) as the most and least important criteria, respectively, while the fire brigade practitioner DM group considered HPD (C1) as the most important criterion for optimal fire station facility location. In both DM groups, DER (C6) was perceived as the least important criterion. Figure 4 compares the box plots with the associated BWM weight (mean) value distributions for each of the DM groups.

Fig. 4 Box plots of BWM criteria weight distributions for each DM group

To establish more rigorously whether the criteria weight values within each DM group differed significantly, a one-sample t test was carried out.

4.1.1 One-sample t test

To determine whether there was a significant difference in the weight comparisons across the criteria, a one-sample t test was performed. Assuming that each criterion was equally important, each of the six criteria would be assigned a weight of 1/6 ≈ 0.1667. This value was taken as the population mean and used as the test value in the one-sample t test to ascertain whether a significant difference existed between the weights of the criteria and 0.1667. The results in Table 10 indicated that, for the academician DM group, the C1 (HPD), C2 (PMR) and C3 (DEF) criteria weights were not significantly different from the equal weight of 0.1667, since the corresponding significance (sig.) or p values were greater than 0.05 (alpha value). Thus, the null hypothesis (H0), stating that the mean difference is zero, could not be rejected.

Table 10 One-sample t test

There was, however, a significant difference from the equal weight value for the remaining C4 (DHM), C5 (WBD) and C6 (DER) criteria weights, since the respective significance values were less than 0.05 (alpha) and therefore the null hypothesis (H0) was rejected. Within the fire brigade personnel DM group, the one-sample t test results showed that all six criteria weights were significantly different from the equal weight value of 0.1667, since the associated p values for each criterion were less than 0.05 (alpha). These results provided new insight into the level of importance attached to the criteria for suitable fire station location studies. The criteria are not equally weighted (i.e. not homogeneous); every criterion therefore has a different level of importance, reflected in the weight value assigned to it.
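As a minimal sketch of the one-sample t test against the equal weight 1/6, the statistic can be computed directly from a criterion's weights. The ten weights below are hypothetical, not those behind Table 10, and the decision uses the standard two-tailed critical value for 9 degrees of freedom (2.262 at alpha = 0.05) rather than an exact p value.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # sample s.d. (n - 1 denominator)
    return (mean(sample) - mu0) / standard_error, n - 1


# Hypothetical weights given to one criterion by ten DMs (NOT the study's data).
weights = [0.41, 0.35, 0.44, 0.30, 0.38, 0.29, 0.45, 0.33, 0.48, 0.37]
EQUAL_WEIGHT = 1 / 6          # test value: six equally important criteria
T_CRIT_DF9 = 2.262            # two-tailed critical value, alpha = 0.05, df = 9

t_stat, df = one_sample_t(weights, EQUAL_WEIGHT)
significant = abs(t_stat) > T_CRIT_DF9
print(f"t = {t_stat:.2f}, df = {df}, significant = {significant}")
```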

Having established using the one-sample t test that there were significant differences in the weight values for each DM group, a one-way analysis of variance (ANOVA) statistical test was conducted to determine whether there was an overall significant difference between any of the criteria weights across the DM groups.

4.1.2 One-way ANOVA

A one-way analysis of variance (ANOVA) test was performed to check if there was an overall significant difference between the evaluated weights of the fire station selection criteria within both the academician and fire brigade personnel DM groups. The null hypothesis (H0) of the comparisons stated that there was no significant difference between any of the weights of criteria for each DM group and the alternate hypothesis (H1), to be tested, stated that there was a significant difference between the criteria. The results in Table 11 for both DM groups show that there was a significant difference in criteria weights.

Table 11 One-way ANOVA

The null hypothesis (H0) was therefore rejected in favour of the alternative hypothesis (H1) as the significance value (sig.) or p value in both DM groups was less than 0.05 (alpha value). This analysis tells us that there was an overall significant difference in at least two criteria mean values but does not, however, show us exactly where the differences lie within the criteria comparisons. To locate and uncover these specific differences between the criteria means in both DM groups, a post hoc statistical test called Tukey’s honest significant difference (HSD) test was applied.
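The F statistic behind a one-way ANOVA can be sketched as the ratio of between-criteria to within-criteria mean squares. The groups below are hypothetical weights for three criteria, not the data summarized in Table 11, and the comparison uses the standard critical value F(2, 6) = 5.14 at alpha = 0.05.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical weights for three criteria from three DMs each (NOT study data).
groups = [
    [0.20, 0.25, 0.15],
    [0.40, 0.30, 0.35],
    [0.05, 0.05, 0.08],
]
F_CRIT_2_6 = 5.14  # critical value of F(2, 6) at alpha = 0.05

f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}, reject H0: {f_stat > F_CRIT_2_6}")
```

A significant F only signals that at least one pair of means differs, which is why a post hoc test is needed to locate the pairs.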

4.1.3 Tukey’s HSD test

The Tukey’s honest significant difference (HSD) post hoc test was applied after the determination of the overall significance of results using ANOVA. Tukey’s HSD test was used to find out which specific criteria means were different from each other by comparing all the possible pairs of mean values in both DM groups. Only the results of the Tukey test that indicated a significant difference in the paired comparisons of the criteria means are shown in Table 12.

Table 12 Tukey’s HSD test

Within the academician DM group, a significant difference was observed in the paired comparisons between criterion C4 (DHM) and each of the other five criteria. Since the significance value (p value) was less than 0.05 (alpha value) for all five paired comparisons, the null hypothesis (H0) was rejected in favour of the alternative hypothesis (H1) that the criteria means in each of the five pairs were significantly different. The results showed that the C4 (DHM) criterion was assigned a weight value that was substantially higher than, and significantly different from, those of all the other criteria, indicating its stronger influence and importance. For the fire brigade personnel DM group, there was a significant difference in the weight values when the C1 (HPD) and C4 (DHM) criteria were each paired with the C2 (PMR), C3 (DEF), C5 (WBD) and C6 (DER) criteria, as indicated by significance values of less than 0.05 (alpha value). There was, however, no significant difference between the C1 (HPD) and C4 (DHM) criteria. These findings indicated that both the C1 (HPD) and C4 (DHM) criteria were assigned significantly higher weight values and had the strongest influence and importance over all the other criteria, but when compared with each other, neither had a significant influence over the other.

4.1.4 Kendall’s coefficient of concordance, W

The final BWM weights were computed for both the academician and fire brigade personnel DM groups, from which the criteria rankings were obtained (Table 9). The level of agreement among the DMs in each group, together with the mean rankings of the criteria weights, was evaluated using Kendall’s coefficient of concordance, W. The null hypothesis (H0) that the distributions of the weights for the criteria were the same was rejected at a significance level of 0.05 for both DM groups, and the results of the analysis are shown in Table 13.

Table 13 Kendall’s coefficient of concordance, W

The minimum (min.) column of Table 13 under the academician DM group showed that the criteria C2 (PMR), C3 (DEF) and C4 (DHM) were all ranked as the most important by at least one of the DMs and similarly, criteria C1 (HPD), C3 (DEF), C5 (WBD) and C6 (DER) each received the lowest rank from one of the ten DMs represented. Under the mean ranks column, criterion C4 (DHM) was ranked the best overall with rank 1.60, while the least importance was assigned to criterion C6 (DER) at 4.10. This result was consistent with the final criteria weight rank order from Table 9. The Kendall’s coefficient of concordance, W, was determined to be 0.264 and because the measure was less than 0.6 this indicated a low level of agreement (Kahraman et al. 2009) among the DMs in the academician group. The low consensus degree was attributed to the heterogeneity of the expert group that had diverging opinions.

The results from the fire brigade personnel DM group showed that criteria C1 (HPD) and C4 (DHM) were each ranked with the highest level of importance by at least one of the DMs, while each of the remaining criteria was ranked the least important by at least one DM. The overall mean ranking showed that criteria C1 (HPD) and C3 (DEF) were ranked as the most and least important, respectively, with corresponding rank values of 1.44 and 5.06. The least important criterion by mean rank, C3 (DEF), deviated slightly from the final weight ranking given in Table 9, where C6 (DER) was the least important. Kendall’s W was evaluated as 0.700, indicating a good level of agreement among the fire brigade personnel DMs since the consensus level was more than 0.7 (Kahraman et al. 2009). The high degree of consensus resulted from the homogeneity of the set of DMs within the fire brigade personnel, who shared similar preferences.
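For illustration, Kendall’s W can be computed from the DMs’ rank orders with the standard formula W = 12S / (m²(n³ − n)), where m is the number of DMs, n the number of criteria and S the sum of squared deviations of the rank totals from their mean. The sketch below assumes rankings without ties and uses hypothetical rankings, not those behind Table 13.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for untied rankings.

    rankings: one list per DM, giving the rank (1 = most important) that
    the DM assigned to each criterion, in a fixed criterion order.
    """
    m = len(rankings)       # number of DMs
    n = len(rankings[0])    # number of criteria
    rank_totals = [sum(dm[i] for dm in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((total - mean_total) ** 2 for total in rank_totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of C1..C6 by four DMs (NOT the study's data).
rankings = [
    [1, 3, 4, 2, 5, 6],
    [2, 3, 5, 1, 4, 6],
    [1, 4, 3, 2, 6, 5],
    [1, 2, 5, 3, 4, 6],
]
print(f"W = {kendalls_w(rankings):.3f}")
```

W ranges from 0 (no agreement) to 1 (identical rankings from every DM), which is what makes fixed thresholds such as 0.6 or 0.7 usable as consensus levels.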

4.2 Comparative analysis using AHP

The analytical hierarchy process (AHP) (Saaty 1980) method was used in this study for comparison and validation of the best–worst method (BWM) results. For this analysis, the respondents in the two decision-maker (DM) groups of academicians and fire brigade personnel were also asked to fill in a questionnaire designed based on the AHP method. The AHP model results were compared with BWM across the two DM groups and analysed by their respective pairwise comparisons, final aggregated weight coefficients of the criteria and their corresponding rankings, as well as the mean rankings from individual DM’s preference evaluations.

4.2.1 Aggregated criteria weight and ranking comparison

In the AHP model, the preferences (pairwise comparisons) of individual DMs from the academicians and fire brigade personnel groups were aggregated by taking the geometric mean to arrive at their corresponding final group weights of the criteria. Table 14 shows the summary of the obtained aggregated weight results from the AHP compared with the BWM results, for each DM group.

Table 14 Comparison of BWM with AHP weights
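The geometric-mean aggregation of individual AHP judgements can be sketched as follows. The two 3 × 3 pairwise comparison matrices are hypothetical, and deriving the weights from normalized row geometric means is an assumption made for brevity; the study may instead have used the principal eigenvector.

```python
from math import prod


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1 / len(values))


def aggregate_judgements(matrices):
    """Element-wise geometric mean of the DMs' pairwise comparison matrices."""
    n = len(matrices[0])
    return [
        [geometric_mean(m[i][j] for m in matrices) for j in range(n)]
        for i in range(n)
    ]


def row_geometric_weights(matrix):
    """Approximate AHP weights: normalized geometric mean of each row."""
    row_means = [geometric_mean(row) for row in matrix]
    total = sum(row_means)
    return [value / total for value in row_means]


# Hypothetical reciprocal matrices for two DMs and three criteria.
dm_a = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
dm_b = [[1, 8, 4], [1 / 8, 1, 1 / 2], [1 / 4, 2, 1]]

group_matrix = aggregate_judgements([dm_a, dm_b])
group_weights = row_geometric_weights(group_matrix)
print([round(w, 3) for w in group_weights])
```

The geometric mean is the usual choice here because it keeps the aggregated matrix reciprocal: the group entry for (i, j) remains the inverse of the entry for (j, i).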

The consistency ratio (CR) calculated from the AHP showed an acceptable level of consistency in the pairwise comparisons, being less than 0.1 for both the academician (CR = 0.007) and fire brigade personnel (CR = 0.015) DM groups. Using the AHP method, the number of pairwise comparisons required to calculate the weight coefficients of the criteria was 15, given by \(n(n - 1)/2\), where \(n = 6\) is the number of criteria under consideration in this study. In contrast, the BWM required only 9 pairwise comparisons, given by \(2n - 3\). This demonstrated that the BWM provided a simpler and faster means of computing the weights than the AHP. Moreover, the BWM uses only integers, which eased the calculations, whereas the AHP uses both integers and fractional numbers, which made the computations somewhat more difficult. In addition, the respondents from both DM groups found the BWM easier to understand than the AHP and could therefore more easily revise their preferences to improve consistency.
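The two comparison counts follow directly from the formulas quoted above, and a short sketch shows how the gap widens as criteria are added. The function names are illustrative only.

```python
def ahp_comparisons(n):
    """Pairwise comparisons needed by AHP for n criteria: n(n - 1)/2."""
    return n * (n - 1) // 2


def bwm_comparisons(n):
    """Pairwise comparisons needed by BWM for n criteria: 2n - 3."""
    return 2 * n - 3


for n in (4, 6, 8, 10):
    print(n, ahp_comparisons(n), bwm_comparisons(n))
```

The AHP count grows quadratically with the number of criteria while the BWM count grows linearly, so the saving of six comparisons at n = 6 becomes larger for bigger criteria sets.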

From the AHP weights, the criteria could be ranked in the following sequence for the academician DM group: C4 ≻ C1 ≻ C2 ≻ C5 ≻ C3 ≻ C6 and for the fire brigade personnel DM group: C1 ≻ C4 ≻ C2 ≻ C5 ≻ C3 ≻ C6. For the academician DM group, the AHP rank order closely matches the BWM results, except that criteria C3 and C5, ranked 4th and 5th, respectively, in the BWM model, exchange positions in the AHP. In the AHP, the academician DM group also perceived DHM (C4) and DER (C6) as the most and least influential criteria, respectively, whereas the fire brigade personnel DM group considered HPD (C1) the most important criterion. The AHP criteria rank order for the fire brigade DM group was identical to, and therefore consistent with, that from the BWM model, and the DER (C6) criterion was again viewed as the least important in both DM groups.

4.2.2 Comparison of mean rankings

The mean criteria rankings of the AHP model based on the individual DM criteria preference rankings from both the academician and fire brigade personnel DM groups were computed, and the obtained results are presented in Table 15.

Table 15 Mean rankings of AHP based on individual DM preference evaluations

In the AHP academician DM group, at least one of the DMs ranked the criteria C1 (HPD), C2 (PMR), C3 (DEF) and C4 (DHM) as the most important based on individual DM preference evaluations. Compared with the BWM result from Table 13, only three criteria: C2 (PMR), C3 (DEF) and C4 (DHM), were ranked as the most important at least once by the DMs. Similar to the BWM model, criteria C1 (HPD), C3 (DEF), C5 (WBD) and C6 (DER) were assigned the lowest rank in the academician DM group. Consistent with the AHP criteria weight rank order (Table 14) and the mean criteria ranking from BWM (Table 13), criterion C4 (DHM) was ranked the best overall with a mean rank of 1.60, whereas the lowest level of importance was attached to criterion C6 (DER) at 4.4 for the academician DM group.

From the AHP model, at least one of the DMs in the fire brigade personnel DM group ranked criteria C1 (HPD) and C4 (DHM) with the highest importance level, similar to the result from BWM, whereas criteria C3 (DEF) and C6 (DER) were ranked as the least important. In contrast with the AHP model, the lowest importance ranking by at least one of the DMs in the BWM model included two extra criteria, C2 (PMR) and C5 (WBD). In the overall mean rankings of the AHP model, criteria C1 (HPD) and C6 (DER) were ranked as the most and least important, respectively, with associated mean ranking values of 1.33 and 5.44. For the fire brigade personnel DM group, the most important criterion was therefore consistent with both the BWM mean rankings (Table 13) and the final weight ranking (Table 9). The least important criterion agreed with the final weight ranking (C6 in Table 9) but differed from the BWM mean rankings (C3 in Table 13).

After the criteria identification and screening, the final weights from both the academician and fire brigade DM groups were evaluated through the proposed group decision-making (GDM) procedures discussed in the previous subsections, using the BWM. A degree of consensus was evaluated among the DM groups using the Kendall’s W from Sect. 4.1.4. Additional statistical tests that included the one-sample t test, one-way ANOVA and Tukey’s HSD test showed significant differences in the criteria weight values clarifying the distinctions in the levels of importance and influence of criteria across the two DM groups. The BWM model results for each of the two DM groups were determined to be consistent and reliable as shown in the acceptable consistency ratios (CRs), comparison analyses by aggregated weights, their respective rankings and validation of the model conducted using AHP.

4.3 GIS analysis

The calculated weights from the BWM model were subsequently applied to the criteria for each of the two DM groups, in the form of processed criteria map layers in GIS, to spatially identify optimal areas for locating fire station facilities within a 5-min response time in the case study region of Istanbul. The results were visualized as two raster suitability maps. ESRI ArcGIS 10.3 software was used to process and analyse each of the criterion map layers in vector data format before their conversion to raster data at a defined cell resolution of 50 m × 50 m. The value ranges of the criteria were determined in terms of their corresponding thresholds of suitability for locating fire station facilities based on expert knowledge and input. A reclassification procedure using the natural breaks (Jenks) classification method was applied to the criteria raster map layers, assigning new class values ranging from 1 to 5 that represent suitability from very low to very high, as shown in Table 16.

Table 16 Suitability map class value representations
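The reclassification step can be sketched as a lookup of each cell value against a set of class breaks. The break values and the small raster below are hypothetical; in the study the breaks come from the natural breaks (Jenks) optimization in ArcGIS, which this sketch does not reproduce.

```python
from bisect import bisect_left


def reclassify(raster, upper_bounds):
    """Map raw cell values to classes 1..5.

    upper_bounds: ascending upper limits of classes 1 to 4; any value above
    the last limit falls into class 5.
    """
    return [[bisect_left(upper_bounds, value) + 1 for value in row] for row in raster]


# Hypothetical raw criterion values and class breaks (NOT the study's data).
raw_layer = [
    [120, 950, 4300],
    [15000, 2600, 60],
]
class_breaks = [500, 2000, 5000, 10000]

classes = reclassify(raw_layer, class_breaks)
print(classes)
```

For a criterion where larger raw values indicate lower suitability, the class values would be reversed after this step (for example, 6 minus the class).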

The six reclassified criteria raster map layers using the suitability index values ranging from 1 to 5 are presented in Fig. 5.

Fig. 5 Reclassified raster criteria map layers: a HPD, b PMR, c DEF, d DHM, e WBD and f DER

The BWM weights for each of the two DM groups were then multiplied by their respective reclassified criteria raster map layers using a weighted sum overlay analysis, followed by a final reclassification operation in ArcGIS. This produced two separate composite raster suitability maps for new optimal fire station areas, one representing the academician DM group (Fig. 6) and the other the fire brigade DM group (Fig. 7).

Fig. 6 Academician DM group: suitability raster map for the new fire station and emergency facility areas of Istanbul

Fig. 7 Fire brigade personnel DM group: suitability raster map for the new fire station and emergency facility areas of Istanbul
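The weighted sum overlay behind the maps in Figs. 6 and 7 can be sketched cell by cell: each reclassified layer is multiplied by its criterion weight and the products are summed. The sketch uses two small hypothetical layers and weights instead of the six criteria layers, and plain nested lists instead of ArcGIS rasters.

```python
def weighted_sum(layers, weights):
    """Cell-by-cell weighted sum of equally sized reclassified raster layers."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [
            sum(weight * layer[r][c] for layer, weight in zip(layers, weights))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical reclassified layers (classes 1..5) and weights summing to 1.
layer_a = [[5, 1], [3, 3]]
layer_b = [[5, 5], [1, 3]]
criterion_weights = [0.6, 0.4]

suitability = weighted_sum([layer_a, layer_b], criterion_weights)
print(suitability)
```

Because the weights sum to 1 and every input class lies between 1 and 5, the composite score also lies between 1 and 5, which is why a final reclassification back into five classes is straightforward.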

4.3.1 Comparison of suitability maps

The raster suitability maps for locating new fire station facilities were generated as shown in Figs. 6 and 7, based on the criteria weight inputs computed from the BWM preference evaluation procedure for both the academician and fire brigade personnel DM groups. The raster maps provided the necessary support in the group decision-making (GDM) process to spatially resolve the emergency facility location problem and visually present the model outputs, reflecting the level of importance that each DM group attached to the criteria under evaluation. A visual comparison of the two suitability maps indicated a similarity in the classified suitability areas, pointing to a high level of correlation and consistency between the BWM criteria weight evaluations of the academician and fire brigade personnel DM groups. The noticeable variation between the two raster maps was the slightly larger extent of areas classified as medium (class 3), high (class 4) and very high (class 5) suitability for the academician DM group than for the fire brigade personnel DM group; these classes are depicted in yellow, orange and red, respectively. The red colour chosen to represent very high (class 5) suitability was also intended to reflect the vulnerability of high population areas exposed to fire risk.

To thoroughly assess the spatial variability between the suitability maps, the two maps were compared statistically. A cell-by-cell comparison was applied to assess the similarity of each pair of pixels between the raster suitability maps, computed by dividing the number of equal pixels in both maps (361,456 pixels) by the total number of pixels (459,075 pixels). The overall measure of similarity was determined to be 0.787 or 78.7% (361,456 pixels/459,075 pixels), indicating a very high level of agreement between the maps. The 78.7% of pixels that fell in the same class in both maps corresponded to a total area of 903.64 km².
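The reported agreement and area can be checked with a short sketch that uses the pixel counts stated above and the 50 m × 50 m cell size. The helper `count_equal_cells` and its toy input are illustrative only.

```python
def count_equal_cells(map_a, map_b):
    """Return (number of cells with the same class, total number of cells)."""
    pairs = [
        (a, b)
        for row_a, row_b in zip(map_a, map_b)
        for a, b in zip(row_a, row_b)
    ]
    equal = sum(1 for a, b in pairs if a == b)
    return equal, len(pairs)


# Pixel counts reported in the text; cell size of 50 m x 50 m.
EQUAL_CELLS = 361_456
TOTAL_CELLS = 459_075
CELL_AREA_KM2 = 50 * 50 / 1_000_000  # 0.0025 km^2 per cell

overall_agreement = EQUAL_CELLS / TOTAL_CELLS
agreeing_area_km2 = EQUAL_CELLS * CELL_AREA_KM2
print(f"agreement = {overall_agreement:.3f}, area = {agreeing_area_km2:.2f} km^2")

# Toy example of the cell-by-cell comparison on two 2 x 2 class maps.
print(count_equal_cells([[1, 2], [3, 4]], [[1, 2], [3, 5]]))
```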

A more robust similarity measure, the Kappa statistic, was also used. It accounts for the bias attributed to the overestimation of prevalent class categories by treating the classes as randomly distributed over the maps. The Kappa statistic, K, assesses the similarity between the observed and predicted results and is a function of two statistics: similarity of quantity, Kappa Histo (KHisto, which compares the fraction of all pixels assigned to each class over the entire map), and similarity of location, Kappa Location (KLoc, which compares the spatial distribution of the classes over the two maps) (Hagen 2002; Pontius 2000). Therefore, for the map comparison analysis, the Kappa statistic, K, given by KHisto × KLoc, was calculated based on the contingency table (also referred to as a confusion matrix) shown in Table 17.

Table 17 Contingency table

The contingency table, expressed in cross-tabular matrix form and consisting of the number of pixels and total map class areas, detailed how the distribution of classes in the suitability map for the academician DM group related to that of the fire brigade DM group suitability map, together with the corresponding total class area covered in km².

The overall Kappa, KLocation and KHisto were determined to be 0.66 (66.0%), 0.768 (76.8%) and 0.859 (85.9%), respectively, for the two comparison maps. The Kappa statistics for each suitability map class are presented in Table 18.

Table 18 Kappa statistics for each suitability map class
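A minimal sketch of how Kappa, KLocation and KHisto follow from a contingency table is given below, using the standard definitions in which P(O) is the observed agreement, P(E) the agreement expected by chance and P(max) the maximum agreement attainable given the class totals of each map. The 3 × 3 table is hypothetical, not Table 17.

```python
def kappa_statistics(table):
    """Return (kappa, k_location, k_histo) for a square contingency table."""
    k = len(table)
    total = sum(sum(row) for row in table)
    row_share = [sum(table[i]) / total for i in range(k)]
    col_share = [sum(table[i][j] for i in range(k)) / total for j in range(k)]

    p_observed = sum(table[i][i] for i in range(k)) / total
    p_expected = sum(r * c for r, c in zip(row_share, col_share))
    p_max = sum(min(r, c) for r, c in zip(row_share, col_share))

    kappa = (p_observed - p_expected) / (1 - p_expected)
    k_location = (p_observed - p_expected) / (p_max - p_expected)
    k_histo = (p_max - p_expected) / (1 - p_expected)
    return kappa, k_location, k_histo


# Hypothetical pixel counts: rows = classes in map A, columns = classes in map B.
table = [
    [40, 10, 0],
    [5, 20, 5],
    [0, 5, 15],
]
kappa, k_location, k_histo = kappa_statistics(table)
print(f"K = {kappa:.3f}, KLocation = {k_location:.3f}, KHisto = {k_histo:.3f}")

# Consistency check on the values reported in the text: KLocation x KHisto.
print(round(0.768 * 0.859, 2))
```

By construction Kappa equals KLocation multiplied by KHisto, and the reported values are consistent with this identity: 0.768 × 0.859 rounds to 0.66.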

The Kappa statistical measures calculated indicated a high correlation and similarity between the academician and fire brigade DM group suitability maps, confirming the observations earlier inferred from the visual map interpretation.

5 Conclusions

This study presented a group decision-making (GDM) approach based on the integration of a novel best–worst method (BWM) and GIS for planning suitable areas for the new urban fire station and emergency facilities in Istanbul. The suggested model incorporated the perspectives of two groups of decision-makers (DMs) comprising academic-affiliated professionals and fire brigade practitioners with related experience in emergency planning and disaster management activities. The study aimed to explore the different points of view of DMs in connection with the decision problem from an insider perspective for the case of fire brigade practitioners and an outsider viewpoint for the case of academic professionals.

From a survey of literature and expert input, criteria for suitability assessment of new fire station sites in Istanbul were comprehensively determined from social, environmental, spatial, risk and safety considerations. Utilizing a new MCDM method called BWM, the relevant weights and preference rankings were derived based on pairwise comparisons of the best and worst criterion for each DM across the two DM groups. To check the reliability of the final weight results, a consistency ratio (CR) for each DM group was calculated and determined to be acceptable. From the individual decision matrices, the objective weights of the experts were derived using their respective consistency indices, with higher weights assigned to the experts with higher consistency values. The results were also compared with the AHP method for validating the BWM model by considering the pairwise comparisons, computed weights, rankings from the weights and mean rankings. The BWM model proved to be a much simpler and faster approach for weight computation, as it required only 9 pairwise comparisons compared with the 15 required by the AHP. The final BWM weights, rankings from weights and mean rankings showed consistent and reliable results with the AHP.

From the results of the BWM weights, statistical tests that included the one-sample t test, one-way ANOVA and Tukey’s HSD test for each DM group, it can be inferred that the academician DM group strongly views the criterion, C4 (DHM) as the most important for the fire station and emergency facility selection studies. This was apparent from the higher weight assignment and dominance that this criterion, C4 (DHM) had over all the other criteria. From the fire brigade practitioners’ perspective, criterion C1 (HPD) was viewed as the most important. This was not an unusual phenomenon considering that the practitioners interact more with affected people in communities and view saving people’s lives through rescue and emergency operations as their main objective and focus. Both DM groups considered criterion C6 (DER) as the least important.

The degree of consensus among the DMs in each group was evaluated using Kendall’s coefficient of concordance, W, which indicated a low level of agreement in the academician DM group and a high level of agreement in the fire brigade personnel DM group. The higher level of agreement in the fire brigade personnel DM group can be explained by its composition: like-minded individuals working in similar environments and functions who shared the same perceptions, values and belief systems relating to fire station and emergency facility planning considerations. The academician DM group, on the other hand, incorporated experts of different training backgrounds, experiences and specializations with more divergent views on the importance levels of criteria for emergency facility planning.

The planning decision outcome of establishing suitable sites for new urban fire station facilities could be visualized in the form of raster suitability maps via GIS, depicting each DM group’s influence on the overall result through the application of the aggregated BWM weights to the criteria. The two raster suitability maps were compared visually and through the use of the Kappa statistic (calculated value of 0.66), which showed high correlation and similarity. This indicated a high level of agreement between the two DM groups and demonstrated the utility and usability of the produced maps for requisite emergency facility planning and service management by relevant authorities and policy-makers.

The proposed group decision-making framework using a BWM-GIS hybrid model ensures that the fire station and emergency facilities are sufficiently planned for achieving sustainable development of urban environments. The quality of the overall planning decision process is therefore enhanced through explicitly promoting inclusiveness by stakeholder participation and improving the resilience of communities at risk by mitigating social, environmental and economic impacts of fires.