1 Introduction

Humanitarian disasters occur more frequently than ever before, and the demand for humanitarian assistance is expected to continue to increase (Guha-Sapir et al. 2015). Facilitated by the development of information technologies, humanitarian supply chains (HSCs) have become more transparent (Comes and Van de Walle 2016). However, the improved information about HSCs has not yet been translated into better response (Prasad et al. 2016), and some authors have argued that this is owing to a lack of flexibility (Scholten et al. 2010).

HSC literature often refers to network flexibility in terms of distribution (ability to provide access to aid) or responsiveness (ability to adapt to changing needs) (Fabbe-Costes and Jahre 2009; Pettit et al. 2013). However, a detailed definition of what ‘network flexibility’ is in humanitarian contexts and what it depends on is lacking in HSC literature. Consequently, there is no systematic approach for measuring network flexibility in HSCs.

Our paper aims at synthesizing the literature on ‘network flexibility’ and adapting the concept to the humanitarian response context. Based on the resulting definition, we propose a flexibility measurement framework for HSCs that can be implemented by practitioners. Our research makes a threefold contribution:

  (i) we define HSC network flexibility in the humanitarian response to sudden onset disasters;

  (ii) we develop a framework to measure flexibility in HSCs;

  (iii) we apply our theoretical framework to the 2015 Nepal earthquake case.

We follow an exploratory mixed-method research design in our study: we develop a theoretical framework informed by field research after the 2015 Nepal earthquake. Then, we apply the framework to the Nepal case and conduct a quantitative analysis based on multi-criteria decision analysis. For the sake of brevity, we use ‘flexibility’ instead of ‘network flexibility’ throughout the paper.

The remainder of this paper is structured as follows: we review definitions of flexibility in HSCs in Sect. 2. Then, we describe our research design in Sect. 3. The development of the flexibility measurement framework is explained in Sect. 4. The application of our framework to the Nepal case and the related results are described in Sect. 5. We discuss these results and their implications for theory and practice in Sect. 6. Finally, we conclude with limitations of our study and directions for future research in Sect. 7.

2 Literature review

Flexibility has attracted much interest in commercial SC (CSC) management (Maria Jesus Saenz et al. 2015; Esmaeilikia et al. 2016). We focus here specifically on the humanitarian context, but we complement our review with CSC findings where the HSC literature is sparse.

In CSC literature, flexibility is “the ability to respond to variations with little penalty in time, effort, cost or performance” (Christopher and Peck 2004). More recently, researchers have highlighted that flexibility is key to resilient and agile supply chain design (Kamalahmadi and Mellat-Parast 2015; Garcia-Herreros et al. 2014). Improving flexibility prepares SCs to adapt to an environment with foreseen and unforeseen changes (Husdal 2010; Sheffi and Rice 2005).

2.1 Flexibility in humanitarian supply chains

In the chaotic environment of a response, HSCs are exposed to uncertainties that may cause serious disruptions (Perry 2007). Flexibility has therefore been recognized as one of the success factors of HSCs (Manoj et al. 2015; Bozorgi-Amiri and Asvadi 2015; Abounacer et al. 2014; Najafi et al. 2013; Yushimito et al. 2012; Afshar and Haghani 2012; Shen et al. 2009; Abidi et al. 2013). Flexibility impacts organizational structures, information systems, and logistics processes (Scholten et al. 2010). However, only a few researchers propose definitions for ‘flexibility in HSCs’ (Scholten et al. 2010; Santarelli et al. 2013, 2015). Scholten et al. (2010) define flexibility in HSCs through its dimensions, such as resources and distribution, while Santarelli et al. (2013, 2015) define it as an ability to adapt to changing external conditions.

Combining these aspects, we define HSC flexibility as a multi-dimensional ability to efficiently adapt to changing external and internal conditions in disasters to maintain or improve HSC performance. Our definition supports HSC resilience and agility by improving the SC’s response to changes and disruptions.

Very few HSC papers address how to consider flexibility when establishing HSCs. For instance, Oloruntoba and Gray (2006) note that in order to deal with the unpredictable and turbulent contexts of response, HSCs require flexibility; however, the question of where and how remains unanswered. Without sufficient flexibility, HSCs are prone to disruptions and hence, the quantitative and simulation models proposed in the literature (Anaya-Arenas et al. 2014) cannot improve the work in the field. In this regard, measuring the state of flexibility is the first step toward benchmarking and, subsequently, further planning.

2.2 Measuring flexibility in humanitarian supply chains

Monitoring and measuring tools are designed to show the status of system features. These tools help to assess the current state and thus, enable developing plans for improvements. Two approaches can be distinguished: indirect and direct assessment (Santarelli et al. 2015; Abidi et al. 2014). There are several performance measurement tools in HSC literature that follow the first approach (Van Wassenhove 2006; Oloruntoba and Gray 2006; Jahre et al. 2007; Beamon and Balcik 2008; Perry 2007; Santarelli et al. 2013; Day 2014). These tools mainly measure the speed, effectiveness, responsiveness, and/or efficiency of HSC through key performance indicators (KPIs) such as delivery time, the number of saved lives, the quantity of distributed relief items, and operations’ costs.

However, the development of a systematic framework for measuring flexibility that follows the second approach has rarely been discussed. Direct assessment is important because it enables more targeted (and detailed) improvement plans compared to indirect assessment with KPIs. Table 1 summarizes key points in contributions related to the second approach.

  • Criteria: Beamon and Balcik (2008) and Charles et al. (2010) suggest measuring flexibility through 4 criteria: volume, delivery, mix, and product. If we compare these criteria with contributions in CSC literature, we find further criteria that have not been used for measuring flexibility in HSCs. For instance, Moon et al. (2012) discuss asset flexibility and its effects on material flow. Gong (2008) investigates information systems flexibility. Slack (2005) mentions fleet flexibility as one of the key performance indicators in commercial logistics.

  • Aggregation: Only Charles et al. (2010) suggest the evaluation of all four flexibility criteria through an aggregation grid. The other papers neither propose any aggregation method (despite having multiple criteria) nor explain how they aggregate their evaluation results. For instance, Beamon and Balcik (2008) propose that among the introduced criteria, it is sufficient to measure only one flexibility criterion given the case scope and context.

  • Planning: Beamon and Balcik (2008) and Charles et al. (2010) suggest working on flexibility improvement proposals as the future research direction.

  • Research design: Despite the demand for empirically grounded research in HSC literature (Baharmand et al. 2015; Pedraza-Martinez and Van Wassenhove 2016), to the best of our knowledge, no paper uses empirical evidence to develop, inform, and/or implement a flexibility measurement framework for HSCs. Few papers apply their framework to real cases (Charles et al. 2010; Santarelli et al. 2013, 2015).

Table 1 Overview of how flexibility is directly measured in HSC literature

Table 2 summarizes our analysis of flexibility measurement research in both CSC and HSC literature. This table shows domains and criteria that are relevant for measuring flexibility in HSCs. The list is not exhaustive and only includes the most common domains and criteria in the literature.

Table 2 The outcome of literature analysis regarding flexibility domains and their criteria in SC

In order to assess the level of flexibility through each criterion consistently, metrics are required. An evaluation grid is also necessary to enable measuring each criterion by its associated metrics (Charles et al. 2010). The level of flexibility in each criterion can then be expressed by the corresponding score or grade from the evaluation grid.

2.3 Literature gaps and research statement

Flexibility in HSC is a prerequisite for dealing with the complexities and disruptions typical for sudden onset disaster response. However, flexibility is often assumed as a given and HSC literature lacks a specific, transparent analysis of flexibility with field driven insights. Without any applicable tool for measuring network flexibility in HSCs, practitioners are left with their intuition and experience to improve it.

In this paper, we translate our analysis of literature into a flexibility measurement framework which we use to support analysis of our field research. Our ambition is to develop a framework that can be used by practitioners. It should help to assess the flexibility of HSCs in the humanitarian response and to improve it by enabling targeted planning. To show this, we implement our framework on a real case, the 2015 Nepal earthquake response, and discuss the implications of results for both theory and practice.

3 Research design

Our paper aims at developing a comparable quantitative measure of HSC flexibility which can help to identify improvement strategies. Our research design follows Bourne et al. (2000), who propose four steps for developing a performance measurement system: (a) system design; (b) implementation of measures; (c) use of measures to assess the implementation strategy; and (d) use of measures to challenge strategy. We modified these steps by adapting their sub-parts and/or allocating tasks to them where necessary. The proposed steps and corresponding research methods in our research design are shown in Fig. 1. In designing the system, step (a), we develop an initial catalogue of flexibility domains and related criteria as well as the required metrics (cf. Sect. 4.1). In implementing the system, step (b), we design our Nepal field research (interview guidelines and observation protocols) and, after collecting data, use content analysis to revise the initial catalogue of flexibility criteria for the Nepal case (cf. Sect. 4.2). In step (c), we use the empirical data to apply our flexibility measurement framework to the Nepal case. We carry out a critical evaluation of the flexibility assessment by combining the weights resulting from the fuzzy analytic hierarchy process (AHP) with the fuzzy technique for order preference by similarity to ideal solution (TOPSIS) (cf. Sect. 4.3). Finally, in step (d), we provide humanitarian organizations (HOs) with instructions on how to develop improvement plans (cf. Sect. 4.4). We explain these four steps in detail in Sect. 4.

Fig. 1 Framework design and incorporated methodologies inspired by Bourne et al. (2000)

4 Flexibility measurement framework for humanitarian supply chains

4.1 Designing the measurement system

Our proposed framework uses literature-driven metrics for each criterion in Table 2. While metrics for volume, mix, local sourcing, assets, and fleet are shown in Table 3, the remaining metrics are provided in “Appendix A”. An evaluation grid has also been prepared for these metrics. This grid works with linguistic variables: very poor, poor, medium, good, and very good. Special care has been taken to keep the evaluation grid as robust and reproducible as possible.

Table 3 Some examples of metrics for flexibility criteria and related measurement grid

4.2 Implementing the measurement system

This step comprises two parts: on-site data collection through field research, and analysis of the empirical data by content analysis.

4.2.1 On-site data collection

In order to collect relevant data, we propose conducting field research in the targeted humanitarian context. Field evidence strengthens the applicability of the framework because it enables researchers to provide more convincing insights for practitioners (Van de Walle and Comes 2015).

Developing practical data collection guidelines and protocols is of great importance for effective field research (Holguín-Veras et al. 2014). Developing such protocols requires reviewing literature related to the research objectives and questions. Our suggested methods for data collection in the field are in-depth semi-structured interviews and observations, which can be complemented by remote interviews. Using multiple methods helps to collect relevant information given the constraints in disaster contexts. For interviews, we suggest open questions because they yield much information within research constraints, such as time or the number of key informants (Salvadó et al. 2015). Open questions also make it possible to continue with more detailed questions when an interesting theme emerges during the interview (Baharmand et al. 2016). In addition, we suggest eliciting quantitative data through targeted questions.

4.2.2 Data analysis

Our analysis approach has two sub-steps: (1) content analysis to verify criteria, and (2) assessment of metrics. Figure 2 illustrates the link between the qualitative field work and the framework that we are proposing in this paper.

  • Verifying domains and criteria Because the characteristics of HSCs depend strongly on the targeted context (Pedraza-Martinez and Van Wassenhove 2016), the criteria have to be verified. To this end, our approach aims at identifying those flexibility criteria which are frequently referred to during field interviews. We use the literature-driven criteria as keywords and count the number of references to them in the transcripts. Observations and field notes can effectively complement the related findings. In addition, a deeper analysis of the keyword ‘flexibility’ in the field data helps to find associated concepts that may not be mentioned in the literature. The verified list of flexibility criteria eventually consists of: first, criteria common to the literature and the field data; and second, concepts associated with flexibility whose relevance can be confirmed through deeper analysis of the empirical data.

  • Qualitative assessment of metrics In this sub-step, we assess the coded categories through the literature-driven metrics (cf. Table 3) to provide the linguistic inputs for the fuzzy TOPSIS approach of Kahraman et al. (2007). This part is composed of several iterations between the collected data, the assessment metrics, and checks of the findings with interviewees to reduce biases as far as possible. The aim is to relate the status of each HO in every criterion to a corresponding linguistic variable (very poor, poor, medium, good, or very good).
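As an illustration, the keyword counting used in the verification sub-step could be sketched as follows. This is a minimal sketch under stated assumptions, not the coding procedure itself: the keyword lists in `CRITERIA_KEYWORDS`, the function names, and the simple whole-phrase matching are all hypothetical, and the share computed here is a share of references rather than a text coverage percentage.

```python
import re
from collections import Counter

# Hypothetical keyword lists per criterion (assumption); a real study would
# derive them from the literature-driven catalogue of criteria.
CRITERIA_KEYWORDS = {
    "volume": ["volume", "quantity"],
    "delivery": ["delivery", "lead time"],
    "fleet": ["fleet", "truck"],
}


def count_references(transcripts, criteria_keywords=CRITERIA_KEYWORDS):
    """Count case-insensitive whole-phrase keyword matches per criterion."""
    counts = Counter({criterion: 0 for criterion in criteria_keywords})
    for text in transcripts:
        lowered = text.lower()
        for criterion, keywords in criteria_keywords.items():
            for keyword in keywords:
                pattern = r"\b" + re.escape(keyword) + r"\b"
                counts[criterion] += len(re.findall(pattern, lowered))
    return counts


def reference_share(counts):
    """Share of all counted references that falls on each criterion."""
    total = sum(counts.values())
    return {c: (n / total if total else 0.0) for c, n in counts.items()}
```

Criteria with a high share are candidates for the verified list, while criteria that are never mentioned would need support from observations or field notes before being retained.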

Fig. 2 Schematic illustration of how our field study informed model building

4.3 Flexibility assessment

To ensure that HOs can benefit from a flexibility assessment tool, we develop a quantitative framework that enables benchmarking and continuous evaluation. In this step, we explain how to conduct such an assessment with the collected and analyzed data.

4.3.1 Fuzzy analysis

Because domains and criteria differ in scope, they may not have the same impact on the overall flexibility. Therefore, the relative importance of each criterion has to be determined. We suggest using fuzzy AHP to determine the weights of the verified criteria through pairwise comparison. Figure 3 illustrates the fuzzy analysis in our research.

Fig. 3 Combination of fuzzy AHP (Chang 1999) and fuzzy TOPSIS (Kahraman et al. 2007) for implementing the flexibility measurement framework

AHP is widely used to determine weights across a wide range of disciplines (Oguztimur 2011). The strength of AHP lies in its ability to structure a complex, multi-domain, multi-criteria problem hierarchically. AHP scales the weights of attributes at each level of the hierarchy with respect to a goal using the experts’ experience and knowledge in a pairwise comparison of criteria.

There are several papers on fuzzy AHP; our research follows the approach of Chang (1999) due to its simplicity and ease of use. In this approach, values are converted to triangular fuzzy numbers (TFNs) and then the geometric mean is calculated. Afterwards, the fuzzy weights are normalized. We follow Paksoy et al. (2012) in the definition of linguistic values and their equivalent TFNs (see Table 4). All other calculations follow the original fuzzy AHP by Buckley et al. (1986). For a review of the complete formulations, we refer to Buckley et al. (1986), Chang (1999), and Paksoy et al. (2012).

Table 4 Linguistic values, their relative importance and equivalent TFN (Paksoy et al. 2012)
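The geometric-mean steps described above can be illustrated with a short sketch. It assumes TFNs written as `(l, m, u)` tuples, centroid defuzzification, and placeholder TFN values in the usage example; none of these are taken from Table 4, and the function names are hypothetical. For the exact formulations, the cited sources remain the reference.

```python
import math


def reciprocal(tfn):
    """Reciprocal of a TFN (l, m, u), used for the lower triangle of the matrix."""
    l, m, u = tfn
    return (1.0 / u, 1.0 / m, 1.0 / l)


def fuzzy_geometric_mean(row):
    """Component-wise geometric mean of the TFNs in one matrix row."""
    n = len(row)
    return tuple(math.prod(t[k] for t in row) ** (1.0 / n) for k in range(3))


def fuzzy_ahp_weights(matrix):
    """Return (fuzzy weights, normalized crisp weights) for a TFN pairwise matrix."""
    means = [fuzzy_geometric_mean(row) for row in matrix]
    total_l = sum(m[0] for m in means)
    total_m = sum(m[1] for m in means)
    total_u = sum(m[2] for m in means)
    # Fuzzy division: lower bounds are divided by the upper total and vice versa.
    fuzzy = [(l / total_u, m / total_m, u / total_l) for l, m, u in means]
    # Assumption: defuzzify by the centroid, then normalize so weights sum to 1.
    crisp = [sum(w) / 3.0 for w in fuzzy]
    crisp_total = sum(crisp)
    return fuzzy, [c / crisp_total for c in crisp]


# Usage with placeholder TFNs (assumption, not the Table 4 scale).
EQUAL = (1.0, 1.0, 1.0)
MODERATE = (2.0, 3.0, 4.0)
example = [[EQUAL, MODERATE], [reciprocal(MODERATE), EQUAL]]
fuzzy_weights, crisp_weights = fuzzy_ahp_weights(example)
```

In this two-criterion example the first criterion is judged moderately more important than the second, so it receives the larger normalized weight.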

Having linguistic variables (cf. Sect. 4.2.2) and normalized fuzzy weights, we use fuzzy TOPSIS for the assessment. For the fuzzy TOPSIS, we propose using the approach of Kahraman et al. (2007) since it is widely appreciated as effective and simple (Beskese et al. 2015).

As depicted in Fig. 3, the fuzzy TOPSIS approach of Kahraman et al. (2007) includes six steps. First, linguistic variables are substituted by their corresponding TFNs, as shown in Table 5, for each alternative. Second, the fuzzy decision matrix is constructed using these TFNs and then normalized by a linear scale transformation. Third, a weighted normalized fuzzy decision matrix is constructed by multiplying the outcomes of fuzzy AHP with the normalized fuzzy decision matrix. Fourth, fuzzy positive and negative ideal solutions are determined. Fifth, distances of each alternative from these ideal solutions are calculated. Finally, the closeness coefficients to the ideals are obtained. For a complete list of formulas, see Kahraman et al. (2007). The result of this step is a ranking of HOs with respect to their flexibility level.

Table 5 Linguistic variables for evaluation (Kahraman et al. 2007)
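The six steps above can be sketched in a few lines of code. This is an illustrative sketch under several assumptions rather than the formulas of Kahraman et al. (2007): the TFNs in `SCALE` are placeholders and not the values of Table 5, the weights are crisp, all criteria are treated as benefit criteria, the ideal solutions are taken as the largest upper and smallest lower weighted values per criterion, and distances use the vertex method. All names are hypothetical.

```python
import math

# Placeholder TFNs on a 0-10 scale (assumption; not the values of Table 5).
SCALE = {
    "very poor": (0.0, 0.0, 2.0),
    "poor": (1.0, 3.0, 5.0),
    "medium": (3.0, 5.0, 7.0),
    "good": (5.0, 7.0, 9.0),
    "very good": (8.0, 10.0, 10.0),
}


def vertex_distance(a, b):
    """Vertex-method distance between two TFNs."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / 3.0)


def fuzzy_topsis(ratings, weights, scale=SCALE):
    """Return the closeness coefficient of each alternative.

    ratings: {alternative: [linguistic label per criterion]}
    weights: crisp criterion weights (assumption), e.g. from a fuzzy AHP.
    """
    names = list(ratings)
    criteria = range(len(weights))
    # Step 1: substitute linguistic labels by TFNs.
    matrix = [[scale[label] for label in ratings[name]] for name in names]
    # Steps 2-3: linear scale normalization (benefit criteria), then weighting.
    col_max = [max(row[j][2] for row in matrix) for j in criteria]
    weighted = [
        [tuple(weights[j] * x / col_max[j] for x in row[j]) for j in criteria]
        for row in matrix
    ]
    # Step 4: ideal solutions per criterion (one common choice, an assumption).
    fpis = [(max(row[j][2] for row in weighted),) * 3 for j in criteria]
    fnis = [(min(row[j][0] for row in weighted),) * 3 for j in criteria]
    # Steps 5-6: distances to the ideals and closeness coefficients.
    closeness = {}
    for name, row in zip(names, weighted):
        d_plus = sum(vertex_distance(row[j], fpis[j]) for j in criteria)
        d_minus = sum(vertex_distance(row[j], fnis[j]) for j in criteria)
        closeness[name] = d_minus / (d_plus + d_minus)
    return closeness
```

Sorting the alternatives by their closeness coefficient in descending order gives the ranking; a value closer to 1 means the alternative lies nearer to the positive ideal solution.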

4.4 Improving the flexibility

The final step consists of designing and developing improvement plans, with the aim of achieving the desired level of flexibility. To design the improvement plans, HOs can follow two approaches.

First, they can focus on those criteria with the highest weights that received the lowest scores. This enables HOs to carry out improvements based on their current status and available resources. Then, they can evaluate their status again with the provided metrics (cf. Sect. 4.1) to check whether they have achieved a satisfactory flexibility level.
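One possible way to operationalize this first approach is to rank criteria by the product of their weight and their distance from the best score. The sketch below is only an illustration of that idea: the product rule, the numeric mapping in `SCORE`, and the example weights and ratings are assumptions, not part of the framework.

```python
# Hypothetical numeric mapping of the linguistic variables (assumption).
SCORE = {"very poor": 1, "poor": 2, "medium": 3, "good": 4, "very good": 5}


def improvement_priorities(weights, ratings, score=SCORE):
    """Order criteria so that high-weight, low-score criteria come first."""
    best = max(score.values())
    gaps = {c: weights[c] * (best - score[ratings[c]]) for c in weights}
    return sorted(gaps, key=lambda c: gaps[c], reverse=True)


# Usage with made-up weights and ratings for three criteria.
example_weights = {"delivery": 0.5, "fleets": 0.3, "assets": 0.2}
example_ratings = {"delivery": "good", "fleets": "poor", "assets": "very poor"}
priorities = improvement_priorities(example_weights, example_ratings)
```

Here the fleets criterion comes first: its weight is lower than that of delivery, but its score is much further from the best level.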

The second approach targets a given profile of desired flexibility. A profile represents a set of pre-selected criteria (see Fig. 4). Our way of shaping these profiles is inspired by the work of Charles et al. (2010) on agility. In our framework, the levels of flexibility are ‘not flexible’, ‘hardly flexible’, ‘semi flexible’, ‘flexible’, and ‘highly flexible’.

In order to develop practical plans, we propose complementing theoretical work with best practices and field-driven insights. Developing plans based only on literature findings has three main problems: lack of relevant literature with respect to flexibility in HSCs, critical differences between CSCs and HSCs, and lack of confidence regarding the applicability of a solution without evidence. We suggest developing improvement plans based on best practices from those HOs with better flexibility rankings. These practices can be observed in the field or elicited through interviews.

5 Flexibility in the 2015 Nepal earthquake case

In this section, we illustrate the application of our framework in the Nepal case. We decided to focus on the downstream network of HSCs.

5.1 Case overview

Two major earthquakes hit Nepal on April 25th and May 12th 2015. These earthquakes affected roughly 5.5 million people in 14 districts (out of the country’s 75 districts). They left nearly 9,000 casualties and approximately 7.1 billion dollars in economic damages (GoN 2015).

The official request for international assistance was placed quickly after the first earthquake. Over time, emergency relief and humanitarian assistance to the affected population were provided by various HOs from over 60 countries, including United Nations (UN) and other international agencies (GoN 2015).

5.2 On-site data collection

We conducted our field research from June 21st to 29th 2015, approximately 2 months after the first earthquake. Representatives who took part in our research were from the following national and international HOs: International Federation of Red Cross and Red Crescent Societies (IFRC); two UN agencies; and eight international non-governmental organizations (iNGOs). All of these organizations deployed their response teams to Nepal a few days after the earthquake to distribute relief items. Some of these organizations planned to remain active during the recovery (for instance Cordaid). All interviewees had served in previous operations.

Our approach for on-site data collection is inspired by multi-disciplinary field research in sudden-onset disasters (Chan and Comes 2014; Holguín-Veras et al. 2014). Data collection followed the inductive approach and was designed to collect information regarding logistics challenges and bottlenecks. The framework that informed our field study was derived from a previous literature review regarding HSC challenges (Baharmand et al. 2015) and their relation to flexibility in the downstream network. Data collection was done through:

  (i) sixteen in-depth semi-structured interviews with representatives from eleven HOs active in relief operations, with follow-up questions through email;

  (ii) field observations in Kathmandu, Rasuwa, and Nuwakot districts;

  (iii) reviewing field notes, online resources (Reliefweb, LogCluster, HumData), cluster meeting minutes, and local newspapers.

We arranged most of our interviews before arrival through pre-existing contacts at iNGOs and the UN. The specific focus of selection was on logisticians. We also used online community platforms, especially LinkedIn and Reliefweb, to set up interviews. Additionally, we used snowballing to identify new participants. To avoid the limitations inherent in interviews, we used cross-validation whenever possible.

Our interview protocol was composed of questions regarding the structure of the relief distribution network, needs assessment, decision-making processes, collaboration with other HOs, and information sharing. Our questions also targeted HSC characteristics, e.g. flexibility, to identify the strengths and weaknesses of downstream networks.

We used open questions, asking interviewees to describe their downstream network and related problems in detail. Interviews lasted less than one and a half hours and, with the consent of the interviewees, all conversations were recorded. We also asked interviewees for other relevant documents, maps, sheets, photos, and reports that they could share with us.

In addition to Kathmandu, which served as the central logistics hub, we conducted exploratory visits to field offices in remote and hard-to-reach areas (Rasuwa and Nuwakot districts). We spent 4 days observing interactions between humanitarians and local communities in these severely affected areas, with a specific focus on flexibility. We had interviews with two district managers mainly regarding relief operation performance. Our observation protocol also included logistics issues, needs assessment techniques, and collaboration among HOs.

We also collected documents including reports, maps, white papers, and meeting minutes from online sources. These resources provided us with complementary information regarding relevant quantities, locations, and best practices in relief operations. Furthermore, we collected articles from two English-language newspapers (The Kathmandu Post and The Himalayan Times) that were published during our stay.

5.3 Data analysis

For verification and qualitative assessment (cf. Sect. 4.2.2), we analyzed our empirical data by content analysis (Elo and Kyngäs 2008). First, we transcribed the recorded interviews and then added the supporting materials (field notes, operation reports, cluster meeting minutes, etc.). A coding sheet was used to assess and classify (Ritchie et al. 2013) the collected data regarding the flexibility domains/criteria (Table 2). We used the software NVivo 11 for coding.

5.3.1 Verifying criteria: implications for flexibility theory

We first categorized findings with respect to flexibility domains and criteria. In most of our interviews, participants acknowledged the importance of flexibility, but they were struggling to indicate potential improvements. This was the main motivation from practice to measure HSC flexibility.

“We need more flexibility, and we are trying to.” (23.06.2015, WVI, Kathmandu)

“You can talk about that [flexibility] easily but how can we reach it? Who knows?” (26.06.2015, UMN, Kathmandu)

The importance and attention attributed to the flexibility criteria varied widely. Table 6 shows the list of flexibility criteria and their coverage in our interviews. According to this table, volume and delivery were mentioned most often, while decision support systems (DSS) and local partners were referred to least. This uneven distribution can, at least partly, be explained by the focus on operational decision-makers to whom we had access in our field research. Furthermore, Table 7 shows a summary of insights from our field study with respect to each criterion. Details are provided in “Appendix B”.

Table 6 Number of references (percentage of coverage) in analyzed documents regarding the flexibility criteria
Table 7 Summary of field findings related to flexibility criteria

We did not include two categories of criteria in our investigations. The first category included trans-shipment flexibility and access/routing flexibility, which were not applicable in the context of the Nepal case (downstream vs. upstream focus). The second category contained policy/strategy and donor flexibility, which were not in the scope of our field study (strategic vs. operational focus).

5.3.2 Qualitative assessment of metrics

Using the criteria metrics (cf. “Appendix A”), we assessed the classifications and scored HOs with the corresponding linguistic variables (see Table 5). Then, the results were checked with the relevant interviewee. In cases where our result and the interviewee’s viewpoint did not agree, we repeated the assessment and checked the results with the interviewee again. In the worst case, this iteration was carried out twice.

When we could not find related quantitative data in our classifications to compare with the metrics, we reached out to interviewees by email and asked them to provide the corresponding inputs. Subsequently, we had sufficient information to construct the fuzzy decision matrix for further fuzzy analysis (cf. Table 13).

5.4 Flexibility assessment: fuzzy analysis

The data for the pairwise comparison matrices was gathered from a survey of eight humanitarian logisticians with experience in four or more response operations. At the time of this research, they were working at UNWFP, Oxfam, IRW, WVI, Cordaid, UMN, Humedica, and Handicap International. We sent a questionnaire with pairwise comparisons to them via email with a brief explanation of included terms and concepts. The response rate was 100%.

Having received the pairwise matrices, we constructed the fuzzy comparison matrices by replacing linguistic values with their equivalent TFNs. Since all consistency ratio (CR) values were lower than 0.1, we concluded that our data was consistent. Thus, we continued to the next step: using the fuzzy comparison matrices to calculate the weights of domains and related criteria. Tables 8, 9, 10, 11 and 12 show our results regarding the constructed fuzzy matrices, ratio weights, and overall weights.

Table 8 Fuzzy pairwise matrix for flexibility domains and their weights
Table 9 Fuzzy pairwise matrix for product flexibility criteria and their weights
Table 10 Fuzzy pairwise matrix for distribution flexibility criteria and their weights
Table 11 Fuzzy pairwise matrix for information systems flexibility criteria and their weights
Table 12 Fuzzy pairwise matrix for resource flexibility criteria and their weights
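For readers unfamiliar with the consistency check mentioned above, the following sketch shows the standard AHP consistency ratio for a crisp pairwise matrix, CR = CI / RI with CI = (λmax − n) / (n − 1). It assumes that the check is applied to a crisp matrix (for example, the middle values of the TFNs), which is an assumption about the procedure; the random index values are Saaty's commonly used ones, and the function names are hypothetical.

```python
# Saaty's random index values for matrices of order 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def principal_eigenvalue(matrix, iterations=100):
    """Approximate the largest eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    vector = [1.0 / n] * n
    eigenvalue = float(n)
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        # The vector is kept normalized to sum 1, so the sum of the product
        # converges to the principal eigenvalue.
        eigenvalue = sum(product)
        vector = [p / eigenvalue for p in product]
    return eigenvalue


def consistency_ratio(matrix):
    """Consistency ratio of a crisp pairwise comparison matrix (order <= 9)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # matrices of order 1 or 2 are always consistent
    ci = (principal_eigenvalue(matrix) - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Usage: a perfectly consistent matrix and a circular (inconsistent) one.
consistent = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]]
circular = [[1.0, 3.0, 1 / 3], [1 / 3, 1.0, 3.0], [3.0, 1 / 3, 1.0]]
```

A matrix passes the usual threshold when `consistency_ratio(matrix) < 0.1`; the circular example fails it because its judgments contradict each other.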

According to Table 8, the dominant flexibility domains from the experts’ viewpoints were distribution and information. Pairwise comparison results for the criteria and their overall weights (Tables 9, 10, 11 and 12) showed that the delivery, IT support, and fleet criteria had the most influence on flexibility with 23.3, 14.2, and 12.6%, respectively. Product volume with 11.7% and DSS with 11.1% were the next most important criteria. Product mix was among the remaining five criteria with less influence, together with information database, local sources, local partners, and human resources.

This finding is not in line with the CSC literature, where product mix is often referred to as one of the main criteria of flexibility (Siham et al. 2015; Sillanpää 2015; Nudurupati et al. 2011). The implications of the results from our fuzzy AHP analysis are discussed further in Sect. 6.1.

After determining the weights of domains and criteria with the fuzzy AHP and constructing the fuzzy decision matrix (Table 13), we normalized the decision matrix and weighted it (Table 14). Following the formulas of Kahraman et al. (2007), we calculated the distances of each HO from the fuzzy positive and negative ideal solutions. Table 15 shows the final results: gap and satisfaction degrees. These degrees indicate, respectively, how far an alternative is from, and how close it is to, the desired satisfaction level of 1 (Sun 2010).

Table 13 Fuzzy decision matrix by linguistic variables
Table 14 Weighted normalized fuzzy decision matrix

Despite the importance of distribution and information for flexibility (see Table 8), our analysis shows that the studied HOs had low flexibility levels in these domains, as shown in Table 13. For instance, with respect to the information domain, we observed that information was treated as a product that HOs used for multiple purposes, including attracting funding. Thus, each HO had an interest in creating its own information products. However, decision makers often had to deal with conflicting and redundant information across multiple products. Moreover, some HOs could not participate in cluster meetings due to a lack of available human resources. We also did not observe any common tool that HOs used for sharing assessment information, a point confirmed by duplicated needs assessment efforts. Similar evidence (cf. “Appendix B”) in our analysis showed that the studied HOs had low levels of flexibility in most criteria, as also later confirmed by interviewees.

According to Table 15, the flexibility satisfaction degrees (SDs) of all but one of the studied HOs are below 0.5. This means that the majority of the studied downstream networks were far from flexible, and implies that the studied HOs need to improve their flexibility significantly if they want to experience fewer disruptions. Furthermore, the difference between the highest and lowest SDs is remarkable: nearly 0.25. This shows the range of flexibility practices among the studied HOs. At the same time, more than 50% of the studied HOs have fairly similar flexibility status (5 HOs with an SD between 0.42 and 0.47). These results are further discussed in Sect. 6.

Table 15 Flexibility satisfaction and gaps degrees of studied HOs

5.5 Improving the flexibility

Having the list of validated criteria for flexibility and their corresponding weights, we developed an aggregation grid, shown in Fig. 4. Within this aggregation grid, all conditions of a level must be met for an HO to be assigned that level. For instance, if an HO has a medium score in delivery, information technology, fleets, volume, and decision support systems but a poor score in assets, it is rated ‘semi flexible’ rather than ‘flexible’.

Fig. 4 Aggregation grid for flexibility in HSC downstream
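The non-compensatory logic of the aggregation grid can be illustrated with a short sketch. Because the content of Fig. 4 is not reproduced here, the level profiles, the three-step score scale, and all names below are assumptions chosen only to be consistent with the example in the text (medium scores in five criteria but a poor score in assets yields a semi-flexible rating).

```python
# Assumed ordinal score scale; the actual grid may use different labels.
SCORE_ORDER = {"poor": 0, "medium": 1, "good": 2}

# Hypothetical level profiles, strictest first: {criterion: minimum score}.
FIVE_CORE = {
    "delivery": "medium",
    "IT support": "medium",
    "fleets": "medium",
    "volume": "medium",
    "DSS": "medium",
}
LEVEL_PROFILES = [
    ("Flexible", {**FIVE_CORE, "assets": "medium"}),
    ("Semi flexible", FIVE_CORE),
]
FALLBACK_LEVEL = "Hardly flexible"


def meets(profile, scores):
    """True only if every criterion in the profile reaches its minimum score."""
    return all(
        SCORE_ORDER[scores.get(criterion, "poor")] >= SCORE_ORDER[minimum]
        for criterion, minimum in profile.items()
    )


def classify(scores):
    """Return the strictest level whose conditions are all met."""
    for level, profile in LEVEL_PROFILES:
        if meets(profile, scores):
            return level
    return FALLBACK_LEVEL


# The example from the text: medium in five criteria, poor in assets.
example = {
    "delivery": "medium",
    "IT support": "medium",
    "fleets": "medium",
    "volume": "medium",
    "DSS": "medium",
    "assets": "poor",
}
print(classify(example))
```

Because every condition of a level must hold, a strong score in one criterion cannot offset a weak score in another, which is what distinguishes this grid from a weighted average.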

Comparing the flexibility levels in Fig. 4 with our findings in Table 15, one can notice that the majority of the studied HOs had ‘Hardly flexible’ downstream SCs during the Nepal response. The only ‘Semi flexible’ downstream SC belonged to UNWFP, which had a better flexibility SD than the other studied HOs. Hence, we continue the interpretation by focusing on UNWFP’s best practices, which can help other HOs to develop improvement plans. Given the organizational size of UNWFP and its access to resources (monetary and non-monetary), other HOs may need to consider relevant constraints when reviewing UNWFP’s practices.

The following best practices are mainly derived from comparing our key field observations for each criterion with the relevant literature. We also selected those practices that, according to interviews, improved relief operations in Nepal in comparison to previous operations.

  • Effective use of information and communication technology (ICT) tools for different processes including dispatching, warehousing, etc. (Van de Walle and Turoff 2007; Perego et al. 2011).

  • Having access to a wide range of fleets. Other HOs can enable such access by establishing partnerships with logistics service providers (LSPs) that provide different fleet types (Vega and Roussat 2015; Baharmand et al. 2017).

  • The number of UNWFP’s staging areas (SAs) and their geographical distribution. We noticed that UNWFP was the only HO with more than one SA. These SAs were located in different affected areas, enabling quick access and remarkable storage capacity (Glenn Richey Jr et al. 2009).

  • The use of DSS for operational decisions in the field, which has been shown to improve performance in critical situations (Baharmand et al. 2015).

To understand why UNWFP chose a different approach (and what constraints other HOs had), a deeper analysis of empirical data is required. Our observations showed that UNWFP invested partly in mobile storage units, IT tools, an information database, partnerships, pre-stocking of highly demanded relief items, local procurement, and collaboration with local partners. According to interviewees, such efforts were not observed in any of the other HOs due to policy issues and a lack of sufficient monetary resources.

In addition, it is necessary to ask what UNWFP can do to improve. Further investigation of the scores in Table 14 shows that UNWFP’s local procurement and collaboration with local partners can be improved. UNWFP tried to import pre-stocked items from its regional warehouses, but problems with customs delayed the delivery. In this regard, some other HOs faced fewer problems, mainly because they procured relief items from local markets or through local partners that had surplus resources (Baharmand et al. 2016). Other improvement targets are delivery and DSS tools. Although UNWFP established contracts with locals to supply the required fleets, the number of available helicopters was not sufficient for timely delivery during the immediate response. Also, according to our interviews, the Logistics Cluster (run by UNWFP) faced some challenges in adapting its delivery scheduling DSS tool to the Nepal context. These two challenges decreased the timeliness of deliveries for UNWFP and the other HOs that were working under the Logistics Cluster.

6 Discussion and implications

The Nepal case showed that the relevance and impact of a criterion for measuring flexibility depend strongly on the context. In our study, the list of criteria included volume, mix, local sourcing, assets, fleets, delivery, IT support, information database, DSS, human resources, and local partners. Having compared our list with the commercial literature (for instance Siham et al. 2015 or Sillanpää 2015), we noticed that verifying the elements of our measurement framework before implementation is of great importance.

The chance to study HSC flexibility in the Nepal response offered some interesting insights into specific challenges. For instance, some severely affected locations with high demand were remote and hard to reach. This constrained access (including by helicopter) and significantly challenged HOs’ flexibility in criteria such as delivery. Furthermore, due to governmental policies that restricted the import of some relief items, HOs had to procure items from local markets while production volumes were limited. For the former, HOs managed to adopt ad-hoc relief transportation modes (changing from trucks to trails of porters); for the latter, they looked for substitute relief items in the same cluster that could be procured in high volume from local markets (different shelter kits).

Our finding regarding the low levels of flexibility explains why many HSC disruptions occurred in the Nepal response. It points to limited possibilities to respond efficiently to environmental, political, and operational challenges. This is also in line with our observation that most HOs in the Nepal response invested more effort in responding to disruptions after they happened. Lacking flexible SCs, HOs often had to re-organize, re-plan, and re-schedule relief operations, which resulted in delays.

We also observed that HSCs’ flexibility evolved considerably between the immediate response and the early recovery in Nepal. Our interviews confirmed that issues in some criteria, like volume, assets, and fleets, were resolved (partially or completely) in the early recovery. This was due to capacities that the corresponding HOs were able to establish.

To find out how HOs can enhance their SCs’ flexibility, our framework offers practical solutions. Comparing the top HO from our fuzzy TOPSIS analysis, UNWFP, with the others helped us to recommend best practices for enhancing flexibility. Among them, sharing assets, access to a good variety of fleets, and using IT support are promising. Such comparisons can give HOs insights to challenge their strategies and/or adopt new ones in future responses. Meanwhile, we note that the specific settings of the Nepal case (for instance, topography) have to be considered before generalizing the practices to other cases.

Although improving flexibility in some domains may not be the most efficient way to decrease disruptions, it is an effective one. Measures that can be considered to improve flexibility include preparing alternative delivery plans, improving access to information and establishing information sharing platforms, enabling multi-modal transportation, sharing assets, and using IT solutions for more accurate demand estimation. Resilience and agility in HSCs cannot be reached without medium/high levels of flexibility (Charles et al. 2010; Heckmann et al. 2015). With respect to our framework, this means having at least medium scores in the delivery, IT, fleets, volume, and DSS flexibility criteria.

6.1 Implications for theory

Our study supports HSC literature regarding the impact of different domains and criteria on flexibility. For instance, it validates the positive influence of product mix (Vaillancourt 2016) and using IT (Kabra and Ramesh 2016) on HSCs’ flexibility. Similarly, delivery and decision support systems, which ranked first and fifth in our fuzzy AHP analysis respectively, had been previously referred to as important factors for flexibility (Charles et al. 2010).

Our research makes the following contributions to theory.

  • We provide a clear definition of HSC flexibility in the response given the characteristics of disaster settings.

  • We introduce criteria for measuring flexibility in the context of HSCs that were not considered in previous studies. For instance, we incorporated criteria related to information systems and resources in the HSC context for the first time.

  • We show that differences between CSCs and HSCs, as well as the scope of focus (upstream vs. downstream), impact flexibility criteria.

Surprisingly, our finding regarding the low levels of flexibility in the majority of the studied HSCs is not consistent with the literature. Santarelli et al. (2015) and Scholten et al. (2010) explain that HOs have high levels of flexibility in their SCs, which we could not confirm in the Nepal case. The divergence can be explained by differences in the set of flexibility measurement criteria, in the corresponding weights, or by the specific topography of Nepal. Hence, two other implications for theory can be suggested.

  • Our study shows that the effective use of weights helps to account for the practitioners’ preferences in the measurement system.

  • Due to differences between distinct contexts (humanitarian vs. commercial, response vs. recovery, case A vs. case B, etc.), generalization of one framework to another context is hardly possible, if not impossible. This implies the need for studying the adaptation of measurement frameworks to other contexts.

6.2 Implications for practice

Our measurement framework has three main implications for practice.

  • It helps HOs to assess their flexibility level without following all steps of the fuzzy analysis. They just need to use the evaluation and aggregation grids to measure their performance within each criterion (cf. “Appendix A”) and then assess their level. However, special care is required since the criteria for the Nepal case may not be applicable in other contexts.

  • Our non-compensatory approach can help HOs to design their improvement plans with respect to their desired flexibility level. To get the most out of our framework for the downstream of HSCs, our proposed aggregation grid can be of great help. Each desired flexibility level is associated with a criteria profile. Once practitioners decide what their desired level is, they can use our grid to find out which criteria they have to concentrate on. After designing the improvement plan with our grid, we suggest incorporating other sources (best practices from the literature or from other HOs) to develop more detailed plans.

    For instance, we showed that improving flexibility in delivery, IT support, fleets, and volume significantly impacts the level of flexibility in the HSC downstream. In this regard, reviewing best practices for IT support in humanitarian contexts confirms that the inclusion of ICT tools in different processes, including planning, controlling, tracking, and monitoring, improves HSC network performance (Tchouakeu et al. 2013). Similarly, using logistics service providers (LSPs) for managing fleets and other logistics activities enables better risk management and enhances relief distribution performance (Baharmand et al. 2017).

  • Our framework can act as a prerequisite for implementing network design concepts for which flexibility is among the key drivers, such as resilience and agility. Considering the results of the framework before recommending models or simulations for the design of the downstream network leads to more convincing solutions.
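The second implication above, identifying which criteria to concentrate on for a desired flexibility level, can be sketched as a simple comparison between an HO's current scores and a target criteria profile. The target below follows the statement in Sect. 6 that at least medium scores are needed in delivery, IT, fleets, volume, and DSS; the score scale, the function name, and the current scores are hypothetical and serve only as an illustration.

```python
# Assumed ordinal score scale; the actual evaluation grid may differ.
SCORE_ORDER = {"poor": 0, "medium": 1, "good": 2}


def criteria_to_improve(current, target_profile):
    """Return {criterion: (current score, required score)} for unmet criteria."""
    shortfalls = {}
    for criterion, required in target_profile.items():
        have = current.get(criterion, "poor")  # unscored criteria count as poor
        if SCORE_ORDER[have] < SCORE_ORDER[required]:
            shortfalls[criterion] = (have, required)
    return shortfalls


# Target profile for the desired level (minimum score per criterion).
target = {
    "delivery": "medium",
    "IT support": "medium",
    "fleets": "medium",
    "volume": "medium",
    "DSS": "medium",
}

# Hypothetical current scores of a placeholder HO (not field data).
current = {
    "delivery": "poor",
    "IT support": "medium",
    "fleets": "good",
    "volume": "poor",
    "DSS": "medium",
}

plan = criteria_to_improve(current, target)
for criterion, (have, required) in plan.items():
    print(f"{criterion}: {have} -> at least {required}")
```

In this illustration the good fleets score does not reduce the need to improve delivery and volume, which reflects the non-compensatory character of the approach.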

7 Conclusions

In this paper, we aim at supporting humanitarian organizations (HOs) to improve their ability to deal systematically with disruptions in their supply chains (SCs). Planning towards agility and resilience in SCs helps to deal with potential disruptions, irrespective of when and where they occur.

Network flexibility is often referred to as the primary driver of resilience and agility in the commercial literature. However, in the humanitarian literature, network flexibility has not been adequately discussed. Furthermore, to the best of our knowledge, no framework has been developed to measure the network flexibility of humanitarian SCs (HSCs) with a concrete application case. Without tools for measuring network flexibility in HSCs, practitioners are left with their intuition and experience to improve it.

Given the aforementioned gaps in the humanitarian literature, our paper has a threefold contribution. First, we define network flexibility in the context of humanitarian response after a sudden onset disaster. Second, we propose a framework for measuring network flexibility in HSCs that covers a wide range of flexibility domains and criteria. Our framework uses linguistic variables, follows simple computational approaches, and is based on experts’ viewpoints. Therefore, it is easy to understand and implement. We suggest conducting field research as the main methodology for collecting relevant data for both verification and application of the framework. Third, we examine the proposed framework in the light of our findings from field research after the 2015 Nepal earthquake.

Our results show that delivery, IT support, fleets, and product volume have the most influence on the overall flexibility level. Hence, dedicating more effort to improving them can increase HSC flexibility significantly. We also found low levels of flexibility in the downstream networks of the majority of the studied HOs. This finding explains why several disruptions happened in relief distribution during the Nepal response. To improve network flexibility, we developed an aggregation grid based on the impact of each criterion in our study. Using the suggested grid through the two proposed approaches helps HOs to improve the network flexibility of their SCs effectively.

Our field study in Nepal imposed some limitations on this research. First, Nepal’s topography is specific compared to that of other earthquake-affected countries like Japan, and this brought several logistics challenges to relief delivery. Nepal is located in the Himalayas, and thus some affected areas could not be reached by trucks or even helicopters. Besides, high-capacity transportation infrastructure, such as highways, was not available. Instead of effective and timely relief shipment with large 15-mTon trucks, HOs had to revert to 3-mTon 4 × 4 trucks and tractors. Although road blockage can be common after disasters, the narrow roadways and mountainous context of Nepal considerably constrained relief support by air and ground. This means that contextual characteristics should be considered carefully for practical implications of our framework. Hence, further empirical research on the application of our proposed framework to other cases across disasters is required. Second, due to the timeframe of our field research, we could not effectively investigate the temporal evolution of flexibility within the response, which we leave for future work.

With respect to flexibility criteria, there are two criteria for which we could not collect relevant information in our field research despite their importance: donors’ flexibility and policy flexibility. Moreover, our Nepal field research focused only on the downstream network. Therefore, routing/access and trans-shipment flexibility were not applicable, and complementary research on the upstream network is needed. Also, evaluation of the framework with practitioners and implementation of the improvement system have to be further investigated. Another important research direction is to study the impact of flexibility on HSC resilience and/or overall HSC performance.