1 Introduction

Beamon (1998) suggests that a supply chain may be defined as an integrated process of various business entities interacting with each other. Supply chain network design (SCND), on the other hand, is a modeling module that aims at optimizing an organization’s resources. In a classical SCND problem, attributes such as cost, location, and functionality are lumped into an objective function that is optimized subject to operational constraints.

The aim of this work is to reduce the manufacturing or production time (Pt), which is an important component of the overall supply chain lead-time. Lead-time in supply chain operations is an area receiving much attention, with numerous works highlighting the importance of its reduction. However, there is a gap in the research for works that look specifically at Pt reduction. The work of Towill (1996) discusses the importance of reducing Pt as a competitive weapon in supply chains. Reducing Pt cuts the overall supply chain lead-time. This, in turn, allows firms to be more flexible to demand changes. The advantages of reducing Pt are many. First, with a shorter lead-time, firms are able to order closer to the selling season or consumption point. To illustrate, the work of Blackburn (1991) shines light on the operations of the retail giant Wal-Mart. According to the investigation, if the retailer orders 16 weeks instead of 26 weeks before the selling season, the error in forecasting market demand drops from 40 to 20% (Blackburn 1991). Second, reducing lead-time can prove important in industries that are time sensitive. For instance, in the aerospace industry, aircraft manufacturers pay heavy penalties for late deliveries. Third, since this work looks specifically at Pt, an important benefit of reducing Pt is the attainment of excess capacity: the reduction in Pt frees production resources and hence creates excess capacity from which more output can be produced. Hence, the objective of this work is the reduction of Pt by crashing production time. In this work, crashing cost is the cost incurred for shortening production duration by some time, S. However, crashing Pt requires a deeper understanding of the production cost.

Another important contribution of this work is the formulation of heuristics to search the nonlinear and discrete solution space. Given the surge in papers that present nonlinear and binary mathematical models (Table 1), works that analyze the accuracy of heuristics are necessary. The literature shows a number of works that introduce nonlinear binary models but the body of work lacks work that contests a set of heuristics to solve the corresponding problems. In contrast, this paper devises seven heuristics and analyses the effectiveness and accuracy of each.

Table 1 Relevant SCND literature

The paper is organized in the following manner. Previous relevant works are presented and reviewed in Sect. 2. Section 3 presents important definitions, a general description of the model, and a discussion of the challenges present in it. Section 4 presents the model assumptions and the model itself. Section 5 explains the challenges of solving the nonlinear binary model at hand and introduces all seven solution heuristics to be used (gradient and binary search heuristics). Section 6 presents comparisons between the different heuristics, validates the accuracy of each of the heuristics, and presents the overall results of the model. At the end of the section, parameter analysis is performed to assess important variations of the model. Section 7 discusses the benefits, limitations, and future prospects of the research.

2 Literature review

The field of SCND has seen a significant number of works. The earliest work in this area can be traced to Geoffrion and Graves (1974). They introduce a multi-commodity logistics network model for optimizing the flow of finished products from plants to distribution centers. Arntzen et al. (1995) provide a deterministic model for supply chain network designs. Table 1 gives a snapshot of the literature, with emphasis on recent publications in SCND. Most of the papers in the table meet the general description of the supply chain with multiple echelons focusing on suppliers, manufacturers, and/or distributors. Overall, the state-of-the-art tends to be populated with models that look at transportation and inventory costs. However, few papers look at the individual components of the production cost. Papers devising solutions that search a discrete solution space are many. However, papers that deal with nonlinear and binary variables are a minority.

Relevant to this work, Cakravastia et al. (2002) present a mixed integer programming (MIP) model to analyze different production scenarios for ultimately performing supplier selection. The work devises a crashing cost where suppliers have two choices: crash production or normal production. The work presents a crashing cost per unit, which can be analogous to the crashing cost presented in this paper. However, the work of Cakravastia et al. (2002) fails to dig into the relationship between crashing and normal costs. The work of Hoque and Goyal (2006) deals with crashing lead-time; here, the crashing pertains to preparation to a series of elements such as time, supplier lead-time, and other elements. This is quite different from the objective of the paper, which looks specifically and exclusively at production time. Jian et al. (2015) look at the crashing cost of production; however, their work focuses on balancing the trade-off between demand forecasting risk and production cost. Yang (2010) looks at crashing the lead-time and not specifically at production time. The work utilizes a polynomial function to model crashing, which is nonlinear. This work also utilizes a nonlinear convex function to model crashing. The work of Diaby et al. (2013) looks at shortening cycle time by reducing setup cost in a capacitated production setting. They present an exponential function, which depicts the capital investment required to shorten production setup time. However, the work looks at the shortening of setup time and not the whole production cycle. Moving to other works, the trade-off between normal costs and crashing costs in the context of SCND is absent. The work of Esmaeilikia et al. (2016) differentiates between regular time and overtime costs. However, the work does not address the crashing of production time. The work of Mizgier (2017) looks at direct and indirect costs. However, these costs are associated with risk. 
For instance, that work defines the crashing cost as the cost of property damage to a firm due to a given hazard.

In terms of solution procedures, Hammami and Frein (2013) present a model for a global multi-echelon supply chain with lead-time constraints. They devise a mathematical model with nonlinearity in the constraints. The authors linearize the constraints and solve the model using CPLEX. Kaya and Urek (2016) present a closed-loop supply chain that integrates a combination of three sub problems: location, pricing, and inventory. The modeling results in a nonlinear problem. The authors utilize a gradient-based search method to solve the model. Jayaraman and Ross (2003) introduce a global distribution system design that utilizes a simulated annealing methodology. Merz and Freisleben (2002) present a greedy heuristic and two local search algorithms, 1-opt local search and k-opt local search. Their work is the basis of the heuristic used in this paper with the incorporation of a feasibility check criterion. Many works (Badria et al. 2017; Diabat and Al-Salem 2015; Diaby et al. 2013; Esmaeilikia et al. 2016; Fahimnia et al. 2015; Govindan et al. 2014; Govindan and Fattahi 2017; Hasani et al. 2015; Jayaraman and Ross 2003; Kaya and Urek 2016; Keyvanshokooh et al. 2016; Mangla et al. 2016; Marti et al. 2015; Petridis 2015; Pan and Nagi 2013; Pishvaee and Torabi 2010; Pham and Yenradee 2017; Rezapour et al. 2017; Sadjady and Davoudpour 2012; Santos et al. 2005; Sarrafha et al. 2015; Vahdani and Mohammadi 2015; Varsei and Polyakovskiy 2017; Yildiz et al. 2016) utilize integer-based search heuristics (designed for binary variables).

Historically in the SCND literature, components of the production cost have been lumped into one cost. Table 1 summarizes the literature review we have carried out. Column five illustrates a gap in works that look at production cost components. To remedy this, this work looks at two very important components of the production cost: crashing and normal costs. Crashing costs encompass variable costs that are geared towards the shortening of Pt. On the other hand, normal production costs include classical costs such as rent, electricity, and labor. Seeing the production cost in terms of crashing and normal costs allows one to infer a relationship between the crashing of Pt and the resulting overall production cost (Fig. 1). This nonlinear relation highlights the contribution of this work, since it is a novel way of looking at Pt crashing. Figure 3 illustrates the classical crashing cost behavior coming from the Project Management discipline (Stevenson et al. 2007). In the figure, we use the data presented later in Table 3.

Fig. 1 Crashing and normal production costs

3 Problem description

Before describing the problem at hand, it is important to introduce some definitions. The shortening length (S) is the time by which a production run is shortened. Production time (Pt) is the time required to produce the planned production quantity (Qo). Qo is the initial quantity a plant produces during a production cycle if there is no shortening (S). Production cycle (PC) is the original production run time before crashing. Whenever a production run is shortened, the production cycle remains the same while an excess capacity forms. The demand, D, is the quantity of units demanded during a production cycle. Here, the demand remains constant for each time period regardless of crashing. Figure 2 visually illustrates the production cycle before crashing, where a production cycle is exactly equal to the production time (Pto). After crashing, the production cycle remains the same but the production time is shortened (Fig. 2). The demand rate remains the same and it is per production cycle. Notably, the crashing creates an excess capacity, from which a plant can increase output as shown in Fig. 2. In the context of the overall supply chain, excess capacity at one plant might alter product assignments. For instance, if a product is assigned to two different plants but one plant crashes production, the output at the other plant might be reduced or even stopped altogether.

Fig. 2 Pt before and after crashing

For the purpose of this study, normal costs include facility costs (property rental/mortgage/financing costs), electricity costs, labor costs, machining costs (rental costs or depreciation), and supervision costs. Crashing costs include the resources needed to shorten Pt. They include incremental labor costs due to additional hiring, part-time costs, overtime costs, excess machine costs (additional machining), and acquisition costs of superior production process technologies. Reducing the production time (Pt) results in additional production costs, but these costs are offset by a reduction in the overall production costs (crashing plus normal) as the overhead costs are spread over larger production volumes (i.e., more lot sizes per time period). Figure 1 illustrates this, where the total production cost (normal + crashing) is convex and hence has a unique minimum.

The normal production cost (NC) is a function of Pt and is proportional to Pt. Normal costs (NC) can be written as follows.

$$ NC = CPt = C(Pto - S) $$
(3.1)

C is an overhead cost parameter that is dependent on workforce costs, property costs, machine rental costs, and production lot size. Pto is the length of the production time if no crashing is applied.

The crashing cost of production (CC) is the cost of the additional resources (e.g., additional machines, additional labor, and/or acquisition of superior production process technology) required to shorten Pt by time S. CC is a function of S. This is analogous to other works in the literature. We see a similar phenomenon in Project Management and in particular project crashing (see Fig. 3). When we shorten (i.e., crash) a given activity within a project, the cost of crashing increases exponentially. Elsewhere, the work of Diaby et al. (2013) presents an exponential function to depict the capital investment required to shorten production setup time. Based on their work, the CC can be written as follows.

$$ CC = \alpha e^{\beta S} $$
(3.2)

α is a costing parameter that is dependent on the production environment. It is a positive number (α ≥ 0). β is the exponential factor, which is a positive real number between zero and one (0 ≤ β ≤ 1). Both terms were necessary in the work of Diaby et al. (2013) and are used in this work as they construct mathematical functions that mimic the classical behavior of the crashing cost (Figs. 1, 3).
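
As a minimal numerical sketch of Eqs. (3.1) and (3.2), the following Python snippet evaluates the normal, crashing, and total production costs over a range of shortening lengths. The parameter values (C, α, β, Pto) are illustrative assumptions and are not taken from the data used in this paper; the function names are hypothetical.

```python
import math


def normal_cost(C, Pto, S):
    """Normal production cost, Eq. (3.1): NC = C * (Pto - S)."""
    return C * (Pto - S)


def crashing_cost(alpha, beta, S):
    """Crashing cost, Eq. (3.2): CC = alpha * exp(beta * S)."""
    return alpha * math.exp(beta * S)


def total_cost(C, alpha, beta, Pto, S):
    """Total production cost as the sum of normal and crashing costs."""
    return normal_cost(C, Pto, S) + crashing_cost(alpha, beta, S)


# Assumed, illustrative parameter values (not from the paper's data set).
C, alpha, beta, Pto = 50.0, 10.0, 0.5, 10.0

for S in range(0, 9, 2):
    nc = normal_cost(C, Pto, S)
    cc = crashing_cost(alpha, beta, S)
    print(f"S={S}: NC={nc:.2f}, CC={cc:.2f}, total={nc + cc:.2f}")
```

With these assumed values the total cost first falls and then rises as S grows, which is the convex shape described for Fig. 1.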

Fig. 3 Cumulative cost of crashing activities in Project Management (taken from Stevenson et al. 2007)

4 Model

The model is a multi-period and multiproduct system. The supply chain has assignable costs (i.e., binary variables). The production at each plant is communicated in terms of lot sizes while the production time, Pt, is a variable. The objective of the work is to reduce the overall production lead-time which, in turn, can reduce the overall supply chain lead-time. The following assumptions are necessary for the conceptualization of the model.

Assumptions:

  1. Inventory is held at suppliers and the cost of carrying inventory is visible to all supply chain partners.

  2. No inventory, or negligible inventory, is held at the plants.

  3. Suppliers procure raw material or components at a procurement cost.

  4. Parts arrive from the suppliers to the plants just before the start of each production cycle, so no inventory needs to be held at plants.

  5. Products are shipped from plants to distributors at the end of each production cycle.

  6. Demand is constant and is per production cycle for each period.

  7. Plants with excess capacity (due to crashing) can increase output.

  8. Distributors are the end customers for our network and hence operational parameters (e.g. inventory) at the distributors are outside the boundaries of our model.

  9. Before production cycle one, all initial conditions are based on Pto.

  10. This is a multiproduct supply chain model, which means that if a product is not assigned to a plant, the plant is still open for production of other products. It is common practice for plants to work on more than one product.

  11. The normal costs of production are incurred per given time period for a set quantity of units. The increase of production throughput (i.e., increase of production quantity), due to the shortening of Pt, results in unitary cost reduction. This assumption is grounded in most manufacturing settings where economy of scale brings unit production savings (Alzaman et al. 2018; Stevenson et al. 2007).

  12. More than one production run can occur in a production cycle if the shortening of Pt is achieved.

The model is as follows.

Sets:

  • I = Group of suppliers.

  • J = Group of Manufacturing Plants.

  • K = Set of Distributors.

  • P = Set of all product types.

  • R = Set of part types that are predecessors to P.

  • T = Set of production cycles.

Parameters:

  • Ptoj,p,t = Standard production run time at plant j, for product p, if no crashing is applied, during production cycle t; j ∈ J, p ∈ P, t ∈ T.

  • Qj,p,t = Production batch size (lot size) given Ptoj,p,t, for product p at plant j during production cycle t; j ∈ J, p ∈ P, t ∈ T.

  • PCri,r,t = Procurement cost at supplier i, for one unit of part r, during production cycle t; i ∈ I, r ∈ R, t ∈ T.

  • ICi,r,t = Inventory holding cost for holding inventory at supplier i, for one unit of part r, per production cycle t; i ∈ I, r ∈ R, t ∈ T.

  • Fixj,p,t = Fixed cost for assigning product p to plant j, during production cycle t; j ∈ J, p ∈ P, t ∈ T.

  • Tri,j,r,t = Unit transportation cost of transporting a part r from supplier i to plant j during production cycle t; i ∈ I, j ∈ J, r ∈ R, t ∈ T.

  • Tpj,k,p,t = Unit transportation cost of transporting a product p from plant j to distributor k during production cycle t; j ∈ J, k ∈ K, p ∈ P, t ∈ T.

  • SCapi,r,t = Allowable capacity of part r that can be held at supplier i, during production cycle t; i ∈ I, r ∈ R, t ∈ T.

  • PCapj,p,t = Allowable capacity of product p that can be produced at plant j, during production cycle t; j ∈ J, p ∈ P, t ∈ T.

  • Dk,p,t = Number of lot sizes of product p demanded at distributor k during production cycle t; k ∈ K, p ∈ P, t ∈ T.

  • Ii,r,0 = Initial Inventory level of parts r, at supplier i, at the beginning of production cycle 1; i ∈ I, r ∈ R.

  • nr,p = Number of parts r required to manufacture one unit of product p; r ∈ R, p ∈ P.

  • Cj,p = An overhead cost parameter for product p at plant j; j ∈ J, p ∈ P.

  • αj,p = A costing parameter that is dependent on the production setting at plant j for product p. It is a positive number (α ≥ 0); j ∈ J, p ∈ P.

  • βj,p = An exponential factor, which is a positive real number between zero and one at plant j for product p; j ∈ J, p ∈ P.

Decision variables:

  • Xj,p,t = Number of lots produced for product p at plant j, during production cycle t; j ∈ J, p ∈ P, t ∈ T.

  • XPi,r,t = Number of parts r procured or produced at supplier i, during production cycle t; i ∈ I, r ∈ R, t ∈ T.

  • Zj,p,t = Binary variable that takes the value of 1, when output is assigned for product p at plant j, during production cycle t, or zero otherwise; j ∈ J, p ∈ P, t ∈ T.

  • XTri,j,r,t = Number of parts transported from supplier i to plant j for part r, during production cycle t; i ∈ I, j ∈ J, r ∈ R, t ∈ T.

  • XTpj,k,p,t = Number of products transported from plant j to distributor k for product p, during production cycle t; j ∈ J, k ∈ K, p ∈ P, t ∈ T.

  • XHi,r,t = Number of parts, r, held at supplier i, during production cycle t; i ∈ I, r ∈ R, t ∈ T

  • Sj,p,t = Time shortened from original Ptoj,p,t for plant j, for product p, at production cycle t; j ∈ J, p ∈ P, t ∈ T

  • Ptj,p,t = Production run time at plant j, for product p, as a result of crashing the production run by Sj,p,t in production cycle t; j ∈ J, p ∈ P, t ∈ T.

  • PCFj,p,t = Production cost as a function of period shortened, S, for product p and lot size Qj,p,t manufactured at plant j during production cycle t; j ∈ J, p ∈ P, t ∈ T.

Model:

$$ \begin{aligned} Min Z & = \mathop \sum \limits_{i \in I} \mathop \sum \limits_{r \in R} \mathop \sum \limits_{t \in T} PCr_{i,r,t} XP_{i,r,t} + \mathop \sum \limits_{i \in I} \mathop \sum \limits_{j \in J} \mathop \sum \limits_{r \in R} \mathop \sum \limits_{t \in T} Tr_{i,j,r,t} XTr_{i,j,r,t} + \mathop \sum \limits_{i \in I} \mathop \sum \limits_{r \in R} \mathop \sum \limits_{t \in T} IC_{i,r,t} XH_{i,r,t} \\ & \quad + \mathop \sum \limits_{j \in J} \mathop \sum \limits_{p \in P} \mathop \sum \limits_{t \in T} X_{j,p,t} PCF_{j,p,t} + \mathop \sum \limits_{j \in J} \mathop \sum \limits_{p \in P} \mathop \sum \limits_{t \in T} Z_{j,p,t} Fix_{j,p,t} + \mathop \sum \limits_{j \in J} \mathop \sum \limits_{k \in K} \mathop \sum \limits_{p \in P} \mathop \sum \limits_{t \in T} Tp_{j,k,p,t} XTp_{j,k,p,t} \\ \end{aligned} $$
(4.1)

The objective function (4.1) minimizes the total operational costs. It includes (in order from left to right) the procurement costs at suppliers, transportation costs from suppliers to plants, inventory holding costs at suppliers, production costs (crashing and normal) at plants, fixed costs for assigning products to plants, and transportation costs from plants to distributors for all product types. Importantly, the objective function minimizes the total of normal and crashing production costs and, in turn, optimizes production time at a given plant.

The production time, Ptj,p,t, can be written as:

$$ Pt_{j,p,t} = Pto_{j,p,t} - S_{j,p,t} $$
(4.2)

The production cost term in the function returns a lot-size production cost that is a function of the shortening of production time, Sj,p,t. The PCF is the total production cost, encompassing both the crashing and normal production costs.

$$ PCF_{j,p,t} = \alpha_{j,p} e^{{\beta_{j,p} S_{j,p,t} }} - C_{j,p} S_{j,p,t} + C_{j,p} Pto_{j,p,t} $$
(4.3)

Subject to

Constraint (4.4) ensures that parts held at supplier i do not exceed the capacity limit.

$$ XH_{i,r,t} \le SCap_{i,r,t} \quad \forall i,\forall r,\forall t $$
(4.4)

Constraint (4.5) balances the parts procured, and held at supplier i, with the parts to be transported to plants.

$$ XP_{i,r,t} - \mathop \sum \limits_{j \in J} XTr_{i,j,r,t} - XH_{i,r,t} + XH_{i,r,t - 1} = 0 \quad \forall i,\forall r,\forall t; \qquad XH_{i,r,0} = I_{i,r,0} $$
(4.5)
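
To make the balance in constraint (4.5) concrete, the sketch below rolls the held inventory forward for a single supplier and part. The function name and the numerical values are hypothetical, and shipments to all plants are assumed to be already summed per production cycle.

```python
def holding_levels(initial_inventory, procured, shipped):
    """Roll inventory forward using XH_t = XH_{t-1} + XP_t - sum_j XTr_t.

    initial_inventory: I_{i,r,0}, the stock before production cycle 1
    procured:          XP_{i,r,t} for t = 1..T
    shipped:           sum over plants j of XTr_{i,j,r,t} for t = 1..T
    """
    levels = []
    previous = initial_inventory
    for xp, xtr in zip(procured, shipped):
        current = previous + xp - xtr
        if current < 0:
            # XH must stay non-negative, see constraint (4.10).
            raise ValueError("shipments exceed available parts")
        levels.append(current)
        previous = current
    return levels


# Assumed example: 20 parts in stock, three production cycles.
levels = holding_levels(20, procured=[100, 80, 120], shipped=[90, 100, 110])
print(levels)  # [30, 10, 20]
```

Each resulting level would additionally have to respect the supplier capacity in constraint (4.4).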

Constraint (4.6) balances the outbound supply from suppliers with the demanded parts required for production at plants (also ensures that no transportation is performed for plants that are inactive).

$$ \mathop \sum \limits_{i \in I} XTr_{i,j,r,t} = X_{j,p,t} Z_{j,p,t} Q_{j,p,t} n_{r,p} \quad \forall j,\forall p,\forall r,\forall t $$
(4.6)

Constraint (4.7) ensures that demanded products at a given distributor k are met.

$$ \mathop \sum \limits_{j} \frac{{XTp_{j,k,p,t} }}{{Q_{j,p,t} }} = D_{k,p,t } \quad \forall k,\forall p,\forall t $$
(4.7)

Constraint (4.8) balances the number of products transported to distributors with the number of products made at plants.

$$ X_{j,p,t} Q_{j,p,t} Z_{j,p,t} - \mathop \sum \limits_{k} XTp_{j,k,p,t} = 0\quad \forall j,\forall p,\forall t $$
(4.8)

Constraint (4.9) restricts the number of products made at a given plant to the plant’s capacity plus excess capacity resulting from crashing.

$$ Q_{j,p,t} X_{j,p,t} \le Z_{j,p,t} \left( {PCap_{j,p,t} + S_{j,p,t} \frac{{Q_{j,p,t} }}{{Pt_{j,p,t} }}} \right)\quad \forall j,\forall p,\forall t $$
(4.9)
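
The right-hand side of constraint (4.9) can be illustrated with a short numerical sketch. The helper names and the values below are assumptions chosen for illustration only; the term S·Q/Pt is read here as the extra units that fit into the freed time at the crashed production rate.

```python
def effective_capacity(pcap, q, pto, s, z):
    """Right-hand side of constraint (4.9): Z * (PCap + S * Q / Pt)."""
    pt = pto - s  # Eq. (4.2)
    if pt <= 0:
        raise ValueError("S must be smaller than Pto")
    return z * (pcap + s * q / pt)


def capacity_satisfied(x, pcap, q, pto, s, z):
    """Check Q * X <= Z * (PCap + S * Q / Pt) for one plant, product, cycle."""
    return q * x <= effective_capacity(pcap, q, pto, s, z)


# Assumed values: capacity of 100 units, lots of 100 units, Pto = 10, S = 2.
print(effective_capacity(pcap=100, q=100, pto=10, s=2, z=1))      # 125.0
print(capacity_satisfied(x=1, pcap=100, q=100, pto=10, s=2, z=1))  # True
print(capacity_satisfied(x=2, pcap=100, q=100, pto=10, s=2, z=1))  # False
```

Setting Z to zero drives the capacity to zero, so an unassigned plant cannot produce the product.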

Constraint (4.10) ensures non-negativity and binary representations.

$$ \begin{aligned} & XTr_{i,j,r,t} ,XTp_{j,k,p,t} ,XH_{i,r,t} ,X_{j,p,t} ,S_{j,p,t} ,Pt_{j,p,t} \ge 0;\\ & Z_{j,p,t} \in \left\{ {0,1} \right\} \quad \forall i, \forall r, \forall j,\forall p,\forall t \end{aligned}$$
(4.10)

5 Methods

The contribution of this work is emphasized in the following:

  • The inclusion of crashing/normal production costs and the conceptualization of the tradeoff between production time crashing and production costs.

  • The solving of a nonlinear binary model and the proving of convexity.

  • The design of accurate solution heuristics (minimizing the solution gap with respect to the lower bound).

To understand the contribution of this paper, close attention needs to be paid to the special characteristics of the model. The work of Dua (2015) presents a generic formulation of a binary nonlinear mathematical programming problem as P1 (Grossmann 2002).

As shown in Fig. 4, P1 is a function of x and y, where x is a vector of continuous variables, y is a vector of binary variables, h is an nh dimensional vector of equality constraints, g is an ng dimensional vector of inequality constraints, and f is the scalar objective function. Solving P1 is NP-hard (Dua 2015). The objective function (4.1) in this work has similar characteristics but is complicated by the fact that two of the decision variables (Xj,p,t and Sj,p,t) are multiplied by each other. The production cost term present in Eq. (4.1) is nonlinear, as PCFj,p,t is an exponential function of the decision variable Sj,p,t. Moreover, the plant assignments are represented by binary variables, Zj,p,t. Further, nonlinearities are present in constraints (4.6), (4.8), and (4.9). Given the complexity of the problem at hand, heuristics need to be devised to solve the model. If the binary variables are fixed (i.e., assigned), then the resulting objective function contains only continuous variables and can be solved by a gradient search method. Fixing the binary variables also linearizes constraints (4.6), (4.8), and (4.9). The near-optimal solution can then be found by iteratively searching the solution space using search heuristics. The remainder of the section presents the convexity proof (Sect. 5.1), gradient search (Sect. 5.2), local search procedures (Sect. 5.3), genetic search (Sect. 5.4), and finally the simulated annealing search (Sect. 5.5).

Fig. 4 Problem P1

5.1 Convexity proof

Proposition

The objective function (4.1) is convex.

The total cost, PCF, of normal and crashing costs is PCF = CC + NC. Using definitions and equations introduced in Sect. 3, PCF can be written as:

$$ PCF = \alpha e^{\beta S} + C\left( {Pto - S} \right) $$
(5.1)

A twice-differentiable function f(x) is convex if its second derivative satisfies \( \frac{{d^{2} f(x)}}{{dx^{2} }} \ge 0 \) for all values of x (Shah et al. 2007; Mohri et al. 2018). Taking the second derivative of PCF with respect to the only variable in the equation, S, gives the following expression:

$$ \alpha \beta^{2} e^{\beta S} \ge 0 $$

Since S, α, and β can only take positive values by definition (see Sect. 3), then the whole expression must always be positive and hence the function is convex.
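
For reference, the stationary point implied by Eq. (5.1) can be worked out explicitly. The closed form below is not stated in the original derivation; it is a sketch that assumes α > 0, β > 0, and C > αβ, so that the minimizer is positive, and it applies only when the result also satisfies S* < Pto.

$$ \frac{dPCF}{dS} = \alpha \beta e^{\beta S} - C, \qquad \frac{d^{2} PCF}{dS^{2}} = \alpha \beta^{2} e^{\beta S} \ge 0 $$

$$ \frac{dPCF}{dS} = 0 \;\Rightarrow\; e^{\beta S^{*}} = \frac{C}{\alpha \beta} \;\Rightarrow\; S^{*} = \frac{1}{\beta} \ln \left( \frac{C}{\alpha \beta} \right) $$

As an illustration with assumed values α = 10, β = 0.5, and C = 50, the expression gives S* = 2 ln(10) ≈ 4.61.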

The PCF term in the objective function is also multiplied by X, which is a decision variable. Both second derivatives, \( \frac{{d^{2} f(S)}}{{dS^{2} }} \) and \( \frac{{d^{2} f(X)}}{{dX^{2} }} \), are greater than or equal to zero, rendering the overall expression convex. Given that the sum of convex functions is also convex and that Z is held constant for each iterate in the solution procedure (to be discussed in Sects. 5.3 to 5.5), the whole objective function is convex.

5.2 Gradient search method

Since the objective function (4.1) is convex, a gradient search method can fetch the optimal values of Sj,p,t and solve the model effectively. If f is differentiable at x, a vector \( d \in \Re^{n} \) is a descent direction for f at x if:

$$ - \nabla f(x)^{T} d > 0 $$

According to the definition of the derivative (Kolda et al. 2003):

$$ f(x + \alpha d) = f(x) + \alpha \nabla f(x)^{T} d + o(\alpha ) $$

If d is a descent direction, and α > 0 is sufficiently small, then \( x^{k + 1} = x^{k} + \alpha^{k} d^{k} \) reduces the value of the objective f. This observation forms the basis of line search methods. At the iterate \( x^{k} \), a descent direction \( d^{k} \) is chosen and a search is conducted along this direction for a point \( x^{k + 1} = x^{k} + \alpha^{k} d^{k} \) (with \( \alpha^{k} \) > 0) that has a smaller objective value. Choosing the correct α is important to guarantee faster convergence (Kolda et al. 2003).

For a minimization problem, the search moves along the negative gradient, so gradient methods are specified in the form:

$$ y^{k + 1} = y^{k} - \alpha^{k} \nabla f\left( {y^{k} } \right) $$

\( \alpha^{k} \) is the step size, a positive value chosen to minimize \( f\left( {y^{k} - \alpha^{k} \nabla f\left( {y^{k} } \right)} \right) \):

$$ f\left( {y^{k} - \alpha^{k*} \nabla f\left( {y^{k} } \right)} \right) = \mathop {\hbox{min} }\limits_{{\alpha^{k} }} f\left( {y^{k} - \alpha^{k} \nabla f\left( {y^{k} } \right)} \right) $$

The expression \( f\left( {y^{k} - \alpha^{k} \nabla f\left( {y^{k} } \right)} \right) \) can alternatively be thought of as \( f(x) \) evaluated at \( y^{k} - \alpha \left( {\frac{\partial f}{{\partial y^{k} }}} \right) \). The expression is a function of constants with the exception of \( y_{i}^{k} \). When the \( y_{i}^{k} \) values are fixed at each iteration, \( f(x) \) becomes a function of just a single variable, α (Hillier and Lieberman 1995). The gradient search procedure starts with y equal to zero (in this work, y is S: Sj,p,t) and solves the model linearly to produce f(y). Then the stepsize is calculated by finding the stepsize (α) that minimizes f(y + αd), where \( d = - \nabla f(y) \) is the descent direction. As the stepsize approaches zero, the value of f(y + αd) approaches that of f(y) and the minimum is intuitively achieved. This minimum is a global minimum, since the objective function has been proven convex. Figure 5 illustrates the steps in the solution methodology, where the initial solution y is a zero vector. The procedure stops when the difference between f(y + αd) and f(y) is less than a tolerance (set to 0.03% of the objective value).
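
The gradient search described above can be sketched for a single PCF term as follows. This is an illustrative sketch and not the implementation used in this work: it replaces the exact step-size minimization with a backtracking (Armijo) line search, projects S onto the interval [0, Pto], and uses assumed parameter values. All function names are hypothetical.

```python
import math


def pcf(S, alpha, beta, C, Pto):
    """Total production cost, Eq. (5.1)."""
    return alpha * math.exp(beta * S) + C * (Pto - S)


def pcf_grad(S, alpha, beta, C):
    """First derivative of Eq. (5.1) with respect to S."""
    return alpha * beta * math.exp(beta * S) - C


def gradient_search(alpha, beta, C, Pto, rel_tol=3e-4, max_iter=500):
    """Projected gradient descent on S, starting from S = 0."""
    S = 0.0
    f = pcf(S, alpha, beta, C, Pto)
    for _ in range(max_iter):
        grad = pcf_grad(S, alpha, beta, C)
        step, accepted = 1.0, False
        while step > 1e-12:
            # Move along the negative gradient and keep S within [0, Pto].
            S_new = min(max(S - step * grad, 0.0), Pto)
            f_new = pcf(S_new, alpha, beta, C, Pto)
            # Armijo sufficient-decrease test for the projected step.
            if S_new != S and f_new <= f + 1e-4 * grad * (S_new - S):
                accepted = True
                break
            step *= 0.5  # backtrack
        if not accepted:
            break
        improvement = f - f_new
        S, f = S_new, f_new
        # Stop once the improvement falls below 0.03% of the objective value.
        if improvement <= rel_tol * abs(f):
            break
    return S, f


# Assumed, illustrative parameters.
S_opt, f_opt = gradient_search(alpha=10.0, beta=0.5, C=50.0, Pto=10.0)
S_star = math.log(50.0 / (10.0 * 0.5)) / 0.5  # stationary point of Eq. (5.1)
print(S_opt, f_opt, S_star)
```

Because the stopping rule is a relative tolerance on the objective, the returned S is close to, but not exactly at, the stationary point.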

Fig. 5 Gradient-based solution methodology (y is Sj,p,t)

5.3 Local search heuristics

The paper uses a number of local search heuristics to deal with the set of binary variables in the model, Zj,p,t. Starting with an initial solution, the procedure iteratively searches for the best solution vector Zj,p,t. Then at each iteration, the gradient search method is invoked to compute the cost of the solution. Local search (LS) procedures are improvement heuristics that search in the neighborhood of the current solution for a better one until no further improvement can be made.

Five local searches are contested:

  1. 1-opt-first procedure

  2. k-opt-first procedure

  3. 1-opt-best procedure

  4. k-opt-best procedure

  5. Hybrid k-1-opt procedure

The 1-opt-first procedure (Merz and Freisleben 2002) starts from a feasible solution and searches the neighborhood for solutions with a Hamming distance of 1 to the current solution (i.e., by flipping a single bit). The gain is calculated as the difference between the objective value before the move and the objective value after it. If no better neighbor solution can be found in a predetermined number of tries, the search immediately stops and returns the best solution found. The 1-opt-best procedure differs from the 1-opt-first procedure in that it contests a group of flips and records the best move among them. The binary solution vector in our case is Zj,p,t as defined previously. Similarly, there are two types of k-opt move strategies. The first is the first improvement move strategy, which scans solutions in the neighborhood N(z) according to a pre-specified order, and a solution Z becomes the incumbent as soon as it improves the objective value. The second is the best improvement move strategy, where a solution Z is chosen if it has the best cost in the entire candidate set N(z).
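
The difference between the first improvement and best improvement strategies can be sketched on a toy binary vector. In this work the cost of a vector Z comes from the gradient search; here a simple sum of assumed fixed costs stands in for it, and the feasibility rule (at least two active assignments) is likewise an assumption made for illustration.

```python
def flip(z, i):
    """Return a copy of z with bit i flipped (Hamming distance 1)."""
    return z[:i] + (1 - z[i],) + z[i + 1:]


def one_opt_first(z, cost, feasible, max_tries=None):
    """Accept the first feasible single-bit flip that lowers the cost."""
    current = tuple(z)
    current_cost = cost(current)
    tries = len(current) if max_tries is None else max_tries
    improved = True
    while improved:
        improved = False
        for i in range(min(tries, len(current))):
            candidate = flip(current, i)
            if feasible(candidate) and cost(candidate) < current_cost:
                current, current_cost = candidate, cost(candidate)
                improved = True
                break  # restart the scan from the new incumbent
    return current, current_cost


def one_opt_best(z, cost, feasible):
    """Evaluate every feasible single-bit flip and take the best one."""
    current = tuple(z)
    current_cost = cost(current)
    while True:
        candidates = [flip(current, i) for i in range(len(current))]
        candidates = [c for c in candidates if feasible(c)]
        if not candidates:
            break
        best = min(candidates, key=cost)
        if cost(best) >= current_cost:
            break  # no improving neighbor left
        current, current_cost = best, cost(best)
    return current, current_cost


# Toy stand-ins (assumptions): fixed costs per assignment, at least two active.
fixed = (4, 1, 3, 2)
toy_cost = lambda z: sum(f * b for f, b in zip(fixed, z))
toy_feasible = lambda z: sum(z) >= 2

print(one_opt_first((1, 1, 1, 1), toy_cost, toy_feasible))  # ((0, 0, 1, 1), 5)
print(one_opt_best((1, 1, 1, 1), toy_cost, toy_feasible))   # ((0, 1, 0, 1), 3)
```

On this toy instance the first improvement variant stops at a worse local optimum than the best improvement variant, at the price of fewer cost evaluations per move.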

The hybrid solution procedure combines both the 1-opt-best and k-opt-best methodologies. It is inspired by the preliminary results obtained from the 1-opt-best and k-opt-best procedures. For the k-opt solution procedure, when a large number of bits are flipped simultaneously, the chance of breaching feasibility increases. We see this in Table 2, where computational time can be very high even for small supply chain networks. The 1-opt methodology only flips one bit at a time and hence has a lesser chance of breaching feasibility. As shown in Fig. 6, the k-opt and 1-opt moving strategies are alternated and each respective gain is recorded. The best move that brings about the fittest solution, among a group of k-opt and 1-opt moves, is then chosen. In this way, the best move could be a k-opt or a 1-opt move, depending on the solution topography and the proximity of the infeasibility threshold, and would correspond to the highest gain while meeting the feasibility criterion. nni is the stopping criterion for all five searches; it counts the number of non-improvements in the solution. This will be discussed in Sect. 6.
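
A minimal sketch of the hybrid idea is given below: at each iteration the feasible 1-opt and k-opt moves are pooled, the move with the highest gain is taken, and the search stops once the non-improvement count nni reaches its limit. The exhaustive enumeration of k-bit flips, the value k = 2, and the toy cost and feasibility functions are assumptions suitable only for very small vectors.

```python
from itertools import combinations


def flip_bits(z, positions):
    """Return a copy of z with every bit in `positions` flipped."""
    chosen = set(positions)
    return tuple(1 - b if i in chosen else b for i, b in enumerate(z))


def hybrid_k1_opt(z, cost, feasible, k=2, nni_limit=4):
    """Pick the best feasible move among all 1-opt and k-opt neighbors."""
    current = tuple(z)
    current_cost = cost(current)
    n = len(current)
    nni = 0  # number of consecutive non-improving iterations
    while nni < nni_limit:
        moves = [flip_bits(current, (i,)) for i in range(n)]
        moves += [flip_bits(current, c) for c in combinations(range(n), k)]
        moves = [m for m in moves if feasible(m)]  # feasibility check
        best = min(moves, key=cost, default=None)
        if best is not None and cost(best) < current_cost:
            current, current_cost = best, cost(best)
            nni = 0
        else:
            nni += 1
    return current, current_cost


# Toy stand-ins (assumptions): fixed costs per assignment, at least two active.
fixed = (4, 1, 3, 2)
toy_cost = lambda z: sum(f * b for f, b in zip(fixed, z))
toy_feasible = lambda z: sum(z) >= 2

print(hybrid_k1_opt((1, 1, 1, 1), toy_cost, toy_feasible))  # ((0, 1, 0, 1), 3)
```

With exhaustive enumeration the non-improving iterations are repeats of the same neighborhood; a sampled neighborhood, which larger instances would need, is where the nni counter does real work.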

Table 2 Comparison between the seven heuristics

Fig. 6 Hybrid solution procedure

5.4 Genetic search

Along with the local search heuristics, the paper utilizes a genetic-based heuristic (GBH). Holland (1975) originally developed the method over the course of the 1960s and 1970s. The heuristic begins by defining the optimization variables and the cost function. GBH defines a chromosome, or an array of variables to be optimized. The chromosome has N variables (an N-dimensional optimization problem) given by v1, v2, v3, …, vN, which in our case is Zj,p,t. It is necessary to define the cost function, which generates an output from a set of input variables. The cost function is a function of the chromosomes. The heuristic starts with a group of chromosomes known as the ‘population’, stored as an Npop × Nbits matrix. The matrix thus holds Npop chromosomes (i.e., the population), each with Nbits bits. The Npop × Nbits matrix is filled with random ones and zeros. Natural selection is then applied to improve the fitness of the population of solutions. In a minimization problem such as the current one, survival of the fittest translates into discarding the chromosomes with the highest cost. The overall solution procedure is highlighted in Fig. 7.

Fig. 7 Genetic-based heuristic schematic

As illustrated in Fig. 7, we start with a population of random feasible solutions (i.e., chromosomes). The number of chromosomes is set to 30. Figure 8 illustrates preliminary runs in which we increase the number of chromosomes from 10 to 200. The figure shows 30 chromosomes to be a good cutoff point (the Df curve becomes less steep after 30 chromosomes). Here, Df is the solution gap [see Eq. (6.1)].

Fig. 8 Average Df (%) of 100 runs at each respective number of chromosomes

Next, feasibility is ensured in step 1.4 (see Fig. 7). Then, we sort the chromosomes from best to worst fit. Afterward, we introduce mutations to the population at a rate of 1% for each generation. The procedure’s last step is the stopping criterion. Using a non-improvement count as a stopping criterion is quite common in the literature, especially for local search methods. However, for genetic search procedures, the number of generations is usually the stopping criterion of choice (Michalewicz 1996). In our case, we studied the behavior of the genetic procedure in depth, running hundreds of iterations to identify the best stopping criterion based on accuracy and computational time. In Fig. 9, we vary the number of generations in the genetic search from 1 to 10. For each number of generations, we conduct 100 runs to assure good representation. We find that the steepest decrease in the solution gap occurs between one and four generations. After four generations, the decrease in Df is smaller while the computational time continues to rise. Hence, we choose four generations as the stopping criterion.

Fig. 9 Results from varying the number of generations in the genetic procedure
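
The following is a minimal sketch of a genetic search with the settings reported above (30 chromosomes, a 1% mutation rate, and a stop after four generations). The single-point crossover, the retention of the better half of the population, the fallback to a parent when a child is infeasible, and the toy cost and feasibility functions are assumptions introduced for illustration; they are not taken from Fig. 7.

```python
import random

def genetic_search(cost, feasible, n_bits, n_pop=30, n_gen=4, mut_rate=0.01, seed=0):
    """Genetic search over bit strings; returns the cheapest chromosome found."""
    rng = random.Random(seed)

    def random_feasible():
        # Rejection sampling: draw random bit strings until one is feasible.
        while True:
            chrom = [rng.randint(0, 1) for _ in range(n_bits)]
            if feasible(chrom):
                return chrom

    pop = [random_feasible() for _ in range(n_pop)]
    for _ in range(n_gen):
        pop.sort(key=cost)            # best (lowest cost) to worst
        keep = pop[: n_pop // 2]      # discard the chromosomes with the highest cost
        children = []
        while len(keep) + len(children) < n_pop:
            a, b = rng.sample(keep, 2)
            cut = rng.randrange(1, n_bits)   # assumed single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit of the child with probability mut_rate.
            child = [1 - g if rng.random() < mut_rate else g for g in child]
            if not feasible(child):
                child = list(a)              # assumed repair: fall back to a feasible parent
            children.append(child)
        pop = keep + children
    return min(pop, key=cost)

# Hypothetical toy instance: select at least two of six options at minimum cost.
toy_costs = [5, 3, 8, 2, 7, 4]
toy_cost = lambda z: sum(c for c, bit in zip(toy_costs, z) if bit)
toy_feasible = lambda z: sum(z) >= 2
best = genetic_search(toy_cost, toy_feasible, n_bits=6)
```

In this sketch mutation is applied to the children only, so the retained chromosomes pass unchanged into the next generation; this is one of several reasonable choices.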

5.5 Simulated annealing search procedure

The last of the heuristics is the simulated annealing (SA) procedure, which is based on multiple annealing processes starting from an initial annealing temperature. The annealing temperature starts at a high value and is then gradually lowered (Katayama and Narihisa 2001). Each iteration of the search process moves from the current trial solution to an immediate neighbor in the local neighborhood of the solution, which is a candidate to become the next trial solution. A candidate is accepted if it improves the objective function. If it does not improve the objective function, it is subjected to a probability criterion: a random number is compared to the acceptance probability, and the candidate is accepted if the probability is greater than the random number and rejected otherwise. The SA procedure is shown in Fig. 10. As with the other solution procedures, the search stops when nni reaches four.

Fig. 10 Simulated annealing search procedure
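
The acceptance rule can be sketched as follows. The text does not specify the form of the acceptance probability or the cooling schedule, so the Metropolis probability exp(-delta / temp) and the geometric cooling factor used here are assumptions, as are the function names and the toy instance. The sketch counts an iteration as a non-improvement when it does not produce a new best solution.

```python
import math
import random

def simulated_annealing(z0, cost, feasible, t0=10.0, cooling=0.9, nni_limit=4, seed=0):
    """Single-bit-flip simulated annealing; returns the best solution found."""
    rng = random.Random(seed)
    current, current_cost = list(z0), cost(z0)
    best, best_cost = list(current), current_cost
    temp, nni = t0, 0
    while nni < nni_limit:
        # Immediate neighbor: flip one randomly chosen bit.
        i = rng.randrange(len(current))
        cand = list(current)
        cand[i] = 1 - cand[i]
        if feasible(cand):
            delta = cost(cand) - current_cost
            # Improving moves are always accepted; otherwise the move is accepted
            # when the (assumed) probability exp(-delta / temp) exceeds a random number.
            if delta < 0 or math.exp(-delta / temp) > rng.random():
                current, current_cost = cand, current_cost + delta
        if current_cost < best_cost:
            best, best_cost = list(current), current_cost
            nni = 0
        else:
            nni += 1          # no new best solution in this iteration
        temp *= cooling       # assumed geometric cooling
    return best, best_cost

# Hypothetical toy instance: select at least two of six options at minimum cost.
toy_costs = [5, 3, 8, 2, 7, 4]
toy_cost = lambda z: sum(c for c, bit in zip(toy_costs, z) if bit)
toy_feasible = lambda z: sum(z) >= 2
sa_solution, sa_value = simulated_annealing([1] * 6, toy_cost, toy_feasible)
```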

6 Results

In order to measure the accuracy and the performance of each of the heuristics, a lower bound is constructed on the objective function (4.1). It is obtained by relaxing the binary variables Zj,p,t, associated with the assignable costs, to continuous variables with a lower bound of zero and an upper bound of one (0 ≤ Z ≤ 1). Once the binary variables are relaxed, the model is solved using the gradient procedure and the solution becomes the lower bound to the objective function (4.1). Then, the original model is solved using all seven heuristics discussed, and the solutions are compared against the lower bound. The gap between the solution of a heuristic and the lower bound, Df, is calculated as follows.

$$ Df = \frac{Hv - Lv}{Lv} \times 100 $$
(6.1)
  • Hv = Objective value of respective heuristic.

  • Lv = Objective value of lower bound.
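
As a minimal illustration of Eq. (6.1), the gap can be computed per run and then averaged over repeated runs, as is done for the figures in this section. The function names and the sample values are hypothetical.

```python
from statistics import mean

def solution_gap(hv, lv):
    """Df in percent, Eq. (6.1): gap between heuristic value hv and lower bound lv."""
    if lv == 0:
        raise ValueError("the lower bound must be nonzero")
    return (hv - lv) / lv * 100.0

def average_gap(heuristic_values, lv):
    """Average Df over several runs that share the same lower bound."""
    return mean(solution_gap(hv, lv) for hv in heuristic_values)

# Hypothetical objective values from three runs against a lower bound of 100.
gaps = [solution_gap(hv, 100.0) for hv in (105.0, 110.0, 115.0)]
avg = average_gap((105.0, 110.0, 115.0), 100.0)
```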

To introduce the results, we first discuss the stopping criterion (Sect. 6.1), and then we vary the supply chain configurations (Sect. 6.2). We analyze the effect of crashing in Sect. 6.3 and compare the heuristics on larger runs in Sect. 6.4.

6.1 Stopping criterion

The input data for the supply chain are simulated to give a higher number of instances, and the results are shown in Table 2. The stopping criterion is the same for all solution procedures except the genetic search, which stops after four generations (see Sect. 5.4): it is nni, the maximum number of consecutive non-improving moves allowed. An infeasible move is considered a non-improving move as it does not improve the solution. Therefore, the time elapsed until the stopping criterion is met measures how efficient and successful a solution procedure is in reaching the solution without breaching feasibility. nni is a limit set by the user to signal arrival at the minimum solution. This makes for a fair comparison between the heuristics since arriving at the solution is tied to nni. To set the value of nni, we carried out over 200 runs in which we varied nni from 1 to 14. Figure 11 shows the corresponding solution gaps (averaged over 20 runs each). Initially, when we increase the limit (from 1 to 3), the solution gap (Df) drops drastically. The drop then levels off above three. In other words, increasing the limit initially produces high gains, which then stagnate. In addition, we see higher computational times at higher limits (see Fig. 12). In all, at nni = 4, the benefits (accuracy and efficiency) of increasing nni diminish. For this reason, we chose to run the model at nni = 4.

Fig. 11 Effect of non-improvement limits on the solution

Fig. 12 Computational times for the six different heuristics

6.2 Varying supply chain configuration

In Table 2, we note the time to convergence for each of the seven solution procedures. To assure consistency and fair representation, the coding, compilation, and execution are done using a MATLAB platform on a fixed computational system (Intel(R) Core(TM) i5-2400 CPU @ 3.10 GHz). Table 2 presents the simulation of twenty-four runs. The runs cover small, medium, and large supply chain networks to ensure better representation of practical implications. Local search heuristics, specifically k-opt (first and best), tend to do well and outperform the simulated annealing and genetic heuristics for small supply chain networks. However, for larger networks, the genetic heuristic tends to do best in terms of accuracy. In terms of computational times, both the hybrid and k-opt heuristics tend to be faster than the genetic search, but they are about two points behind it in terms of accuracy (Table 2).

6.3 Effect of crashing

Setting the heuristics aside, we start with a network of two suppliers, seven potential plants for product assignments, and two distributors (Fig. 13) to infer the monetary impact of crashing the production time (Pt). A leading firm in the aerospace industry provided us with the data. However, the data have been masked at the request of the firm to protect the privacy of its operations (Table 3). The production cost parameters α, β, and C are extrapolated from the crashing data in Table 4 using exponential data fitting, for which the fitting error is negligible (0.001). Notably, the scaling of the data does not affect the results, especially since our aim is comparative analysis.

Fig. 13 Example of a supply chain network solved by the model

Table 3 Example of base data for the production cost parameters (α, β, K)
Table 4 Production settings

All the results presented in this paper compare the case where Pt is crashed with the null case (where no crashing is allowed). This is to answer the main question of this work: does the crashing of Pt minimize the overall production costs, reduce supply chain costs, and/or increase supply chain throughput? To answer this question, two models are compared: one where crashing Pt is not possible and one where it is. If there is no crashing of Pt, the problem becomes linear as the Sj,p,t terms in the model become zero. Hence, no gradient search is required to solve the model. Solving this model produces an objective function value that we call Objvnull, while the objective value of the model where crashing is possible is termed Objvcrash. Accordingly, the percent reduction in the objective function, Acc, can be expressed as follows.

$$ Acc = 100 \times \frac{{Objv_{null} - Objv_{crash} }}{{Objv_{null} }} $$

Based on the costing parameters (α, β, and C) shown in Fig. 13, the reduction in the objective function, Acc, is approximately 25%. This validates the model since the two plants with the lower cost parameters (α, β, and C) are chosen in the model.
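
As a worked illustration of the Acc expression, suppose hypothetically that the null model yields an objective value of 1000 and the crashing model yields 750. These values are chosen only to reproduce a reduction of the reported order and are not taken from the data.

$$ Acc = 100 \times \frac{1000 - 750}{1000} = 25 $$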

In Table 5, the values of the costing parameters are varied (using MATLAB simulations) to produce eight different instances. Here, Acc ranges between 12 and 27%. Further, the work devises a measure for supply chain throughput, which is the total time a product (or the components to be assembled into a given product) spends in inventory and production, PDT. Given this definition, the percent increase of throughput due to the allowance of crashing is approximately 24% on average. Moreover, savings in inventory costs result, since products move faster through the supply chain and spend less time in inventory. The average percent savings in inventory cost, PDI, is approximately 23%. The average time shortened at the plants that are open for production is reported in Table 5 as PTS, which is around 58 h on average. It is important to note that the time shortened provides a plant with excess capacity and opens up the plant for more output. To elaborate, the allowance for crashing enables plants with an advantageous cost structure to fulfill more of the requested demand. For instance, in a three-plant network, each plant might fulfill one-third of the overall demand. With crashing allowed, a plant that has an advantageous cost structure (α, β, C) would receive more than a third of the demand and hence bring the overall supply chain production cost down.

Table 5 Eight instances of two inventory locations, seven plants, and two distributors

Crucially, the runs show that the model is highly sensitive to the values of the production cost parameters (α, β, C) and the inventory cost parameter (IC). Hence, it is imperative to carry out a large volume of runs to represent different variations of those parameters. Notably, these parameters are user dependent and reflect the costing structure of the operational environment at hand. Therefore, a higher number of runs allows us to cover a variety of implications. The runs should simulate the high and low production parameter values presented in Table 6. In Table 4, real production data are used to infer realistic cost parameters. Feasible variations are applied to the values in Table 4 and extrapolated to form limits on the highs and lows (Table 6) while remaining feasible. Table 7 lists the parameter variant of each run, where the first column designates the magnitude of the parameters. Each parameter is tested at two levels: a high value and a low value. For α (designated as A in the first column of Table 7), the high value is 700 while the low value is 600 (see Table 6). If a run uses the high value, a value of one is placed next to A, while a value of zero is placed if the low value is used. Similarly, B represents β, C represents C, and H represents the inventory cost (ICi,r,t) parameter.

Table 6 High and low values of production and inventory cost parameters
Table 7 Runs for variant production cost parameters
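
The high/low coding of the runs can be illustrated with a short sketch that enumerates the combinations of the four flags. Whether Table 7 contains all sixteen combinations is not stated here, so the full enumeration is an assumption, as are the names used. Only the α levels (600 and 700) are taken from the text; the levels of the other parameters are left out rather than guessed.

```python
from itertools import product

# A = alpha, B = beta, C = C, H = inventory cost; 1 marks the high value, 0 the low value.
FACTORS = ("A", "B", "C", "H")
ALPHA_LEVELS = {0: 600, 1: 700}  # low and high alpha as quoted in the text

def design_runs():
    """All high/low combinations of the four parameters as dicts of 0/1 flags."""
    return [dict(zip(FACTORS, levels)) for levels in product((0, 1), repeat=len(FACTORS))]

runs = design_runs()
# Translate the A flag of each run into the corresponding alpha value.
alpha_values = [ALPHA_LEVELS[run["A"]] for run in runs]
```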

Analyzing Table 7, one can infer the impact of a lower crashing production cost structure on the overall reduction in the objective function. With a lower crashing cost structure, the objective function reductions are magnified. Hence, lower values of α, β, and C result in higher production cost savings when crashing. However, β tends to be the most influential of the three. This is quite logical since β is an exponential term. With higher production cost parameters (α, β, C), the cost reductions shrink due to the high costs of the additional resources required to crash Pt. However, the numbers still show around 10% overall supply chain cost savings, which is still quite significant. With favorable values of the production cost parameters, the cost reductions in the objective function reach 29%. It is important to note that these cost avoidances are open to interpretation since the size of the supply chain network and the magnitude of the production costs relative to overall supply chain costs are highly variable. Nevertheless, these reductions are significant from the perspective of plant owners/managers and, more importantly, bring other benefits to the supply chain such as lead-time reduction, excess capacity, increased throughput, and higher inventory turnover. For instance, the percent reduction in the production lead-time can reach 38% (Table 7), which implies that products are moving faster through the supply chain. This can be quite important in cases where lead-time is vital, and even more so when customers might penalize producers for late deliveries. In addition, the cost savings in inventory become quite pronounced when ICi,p,t is set at a high value. This could be the case when dealing with products with high holding costs (i.e., space costs, opportunity cost, and obsolescence costs). For this case, the cost reductions in inventory can reach as high as 35%.

To demonstrate other benefits of crashing Pt, Table 7 devises new measures. PDH is the percent difference between the average time a product spends in production when crashing is allowed and the average time when crashing is not allowed, measured at the plant with the highest crashing duration (S). PDL is the same measure for the plant with the lowest crashing duration. These two measures exemplify the competition between plants in terms of crashing opportunities. PDH demonstrates the ability of plants that have advantageous costing parameters (or have access to superior manufacturing processes that can bring the cost of crashing down) to aggressively reduce Pt. Importantly, these plants are able to win more production quota. Moreover, these two measures show the benefits of crashing from the plant owner's/manager's point of view, regardless of the magnitude of the supply chain operations.

6.4 Large scale runs

Given the complexity of the work and, more importantly, its vulnerability to variation, we need to compare the heuristics on more instances. To assure consistency and fair representation, the coding, compilation, and execution are done using a MATLAB platform on a fixed computational system (Intel(R) Core(TM) i5-2400 CPU @ 3.10 GHz). In Fig. 14, we compare one hundred runs for a supply chain network of medium size (7-15-17). The figure presents a box plot of the 100 runs. Here, the genetic search tends to give the lowest average Df, approximately 6.5%, followed by the k-opt-best, hybrid, and k-opt-first heuristics. The genetic search also gives the lowest variation and has no outliers. However, its computational time averages around 65 s, which is the highest among the heuristics. The 1-opt procedures tend to give the lowest time. The hybrid heuristic performs well in terms of both solution gap and computational time.

Fig. 14 Heuristic performance for a (7-15-7) supply chain network (Df is to the left)

7 Conclusion

The main contribution of this paper is the inclusion of crashing cost in supply chain network design (SCND). The vast majority of papers in SCND do not consider the opportunity of crashing production time (Pt). There is a gap in the research for works that minimize the overall supply chain costs via the crashing of Pt. This work integrates the cost of crashing with other prominent supply chain costs such as inventory, production, and transportation. The practical implication of this paper is the ability to shorten Pt in a manner that reduces the overall production costs and increases the throughput of the supply chain. However, the benefits of reducing the overall production cost and decreasing lead-time are highly dependent on the characteristics of the production cost parameters discussed. They also depend on the topography of the supply chain. Nevertheless, regardless of the topography of the supply chain or the production environment, the work brings many benefits such as lead-time shortening, throughput improvement, and capacity enhancements. In addition, practical considerations such as penalties for late deliveries might lead companies to invest more aggressively in optimizing their production resources by crashing Pt. Further, shortening Pt might prove important for companies that are adopting new process technologies or are still en route to optimality.

Importantly, a strong contribution of the work lies in the heuristics used to solve the model. Nonlinear binary models are difficult to solve (Dua 2015; Katayama and Narihisa 2001). Binary quadratic programming is said to be NP-hard (Katayama and Narihisa 2001). Despite the complexity of the model, near-optimal solutions are obtained within a small tolerance of the lower bound. This work pushes the envelope in terms of complexity, as the objective function has terms that include the multiplication of two different decision variables and the constraints are nonlinear. To solve the model, a combination of gradient and search heuristics is employed. The work presents seven different heuristics to solve the model. The results presented convey the complexity of the solution, especially for larger networks. The work compares the seven solution heuristics and shows the effectiveness of a genetic-based heuristic in arriving at the solution. This is especially true when supply chain networks are large.

This work paves the road ahead for other works to follow. For instance, future works can integrate a penalty cost where companies incurring charges for late deliveries can lessen the financial impact by crashing Pt. In addition, companies in highly competitive environments where arriving first to the market is pivotal might benefit from a model that integrates the opportunity cost of time. Pt in this work has been shortened without attaching a dollar value for the time saved. For instance, the time reduction of producing a given product can be seen as an opportunity to beat a competitor to market and hence can have a dollar value attached to it. This can be the subject of future research.