1 Introduction

Today’s competitive world places a tremendous pressure on companies to use their resources efficiently and satisfy customer requirements on time. Achieving these goals requires one to deal with a series of production planning problems. Among these problems, the capacitated lot sizing problem (CLSP) has received a lot of attention from researchers. CLSP is a typical example of big bucket models where the length of a period is long enough to produce more than one product. The CLSP considers only the quantities and timings of the production and ignores the sequence of the products within a period [1]. Different from big bucket models, small bucket models (i.e. Discrete Lot Sizing and Scheduling Problem, Proportional Lot Sizing and Scheduling, Continous Setup LotSizing and Scheduling) also deal with the sequencing decisions. In recent years, a new model combining the properties of the big bucket and small bucket models, called CLSP with Setup Carryover (CLSPC) has emerged. Like big bucket models, the CLSPC allows producing more than one product per period and like small bucket models it carries over the setup state of a product from one period to the other. From the solution of the problem, it is possible to obtain the production quantities along with the semi-sequencing (i.e. first and last products produced in a period) of the products in a period. The Capacitated Lot Sizing Problem with Setup Carryover is the extension of the Capacitated Lot Sizing Problem (CLSP) which is known for its computational intractability. Florian et al. [2] have shown that the general case of the single-item CLSP is NP Hard. Trigeiro et al. [3] state that when set-up times are introduced in the multi-item CLSP, even the feasibility problem becomes NP Complete. Naturally, it becomes even harder to solve the CLSPC. Since it is computationally very difficult to find optimal solutions to large-scale problems, nearly all earlier studies employed heuristic approaches. The heuristic approaches dealing with this problem can be classified as mathematical programming based approaches (Dillenberger et al. [4], Sox and Gao [5], Suerie and Stadtler [6], Sahling et al.[7], Helber and Sahling [8], Lang and Shen [9], Xiao et al. [10]), metaheuristics (Gopalakrashinan et al. [11], Karimi et al. [12], Nascimento and Toledo [13]), greedy heuristics (Haase [14], Haase [15], Karimi and Ghomi [16], Gupta and Magnusson [17] and hybrid approaches (Gören et al. [1], Seanner et al. [18]). Table 1 chronologically presents the studies conducted during the last 20 years based on the solution approach proposed, type, properties and complexity of the lot sizing problems. It should be noted that the complexity is given in terms of the number of binary variables in the mathematical model of the lot sizing problem considered in the study. Unlike most conventional methods and some of the metaheuristics (i.e. tabu search, simulated annealing) which conduct a single directional search, GAs perform multiple directional searches using a set of candidate solutions, require no domain knowledge and use stochastic transition rules to guide the search (Gen and Cheng [19]). Different applications of GAs can be found in literature such as maritime transportation (Kang et al. [20]), vehicle routing (Ombuki et al. [21]) or grouping (Korkmaz [22]) problems. However, it is well known that pure GAs can locate the promising regions for global optima in a search space, but in a large and complex solution space, they have difficulty in finding the exact minimum/maximum of these optima. Therefore, in this study, to further improve the performance of GAs in solving the CLSPC, a new hybrid approach is proposed. This study can be seen as the continuation of our earlier study Gören et al. [1] in which Fix-and-Optimize heuristic is embedded into the loop of GA to refine the solutions during the generations of GA. Different from our earlier study, this study suggests using a sequential hybridization scheme in which the GAs and then Fix-and-Optimize heuristic are run sequentially. By doing so, we hope that the GAs can locate the promising region where good quality solutions exist and then the Fix-and-Optimize heuristic further searches this region for the best solution. Another issue affecting the performance of the meta-heuristics is the choice of the initial solution/solutions (Mohammadi and Ghomi [23]). If the initial solution/solutions is/are good enough, the probability of finding better solutions will increase and the convergence to the near-optimal or optimal solution will be more quickly. In this study, to further improve the performance of the proposed heuristic, we used the initialization scheme introduced in Gören et al. [1] which creates half of the initial population randomly and the other half using problem specific information. Encouraged from success of this initialization scheme in our previous study, in this study we aimed at using this scheme more effectively. For this purpose, we carried out a comprehensive experimental analysis to determine the most appopriate portion of the random and problem-specific part for each problem size.

Table 1 Proposed heuristic approaches for solving the capacitated lot sizing problem with setup carryover

The remainder of the study is organized as follows. In the following section, the capacitated lot sizing problem with setup carryover is defined. The details of the proposed hybrid approach to solve this problem is presented in Section 3. Section 4 presents the results of experimental study comparing the proposed approach to the recent results in literature. Finally, the last section summarizes the findings of this study and presents some future research directions.

The Capacitated Lot Sizing Problem with Setup Carryover (CLSPC)

In this study, to solve the CLSPC the model proposed by Suerie and Stadtler [6] has been employed under the following assumptions:

  • The planning horizon T is fixed and divided into time buckets (1,…..,T).

  • Several products requiring a unique set-up state are produced on each resource in each period (property of a big bucket model).

  • The resource consumption to produce a product j on a resource is fixed.

  • Setups incur setup costs and consume setup time. Setup costs and setup times are sequence independent.

  • At most one setup state can be carried over from one period to the next on the resource, such that no setup activity is necessary in the second period.

  • Single item production is possible (i.e. the conservation of one setup state for the same product over two consecutive bucket boundaries).

  • A setup state is not lost if there is no production on the resource within a bucket.

The mixed integer programming model (MIP) with the inventory and lot size representation is given with the sets, indices, parameters and variables in the following.

Sets and indices:

j::

items jP = {1, 2, 3,...P}

t::

periods tT = {1, 2, 3,...T}

Parameters:

s c j :

setup cost for item j

h j t :

unit holding cost for item j in period t

C t :

amount of resource available in period t

a j :

time to process one unit of item j

s t j :

setup time of item j

M :

a large number

d j t :

demand for item j in period t

Decision Variables:

I j t :

Inventory level for item j at the end of period t,

X j t :

Production amount of item jin period t,

Y j t :

Binary setup variable (=1, if a setup for item j is performed in period t, =0 otherwise),

W j t :

Binary linkage variable which indicates whether a setup state for item j is carried from period (t-1) to (t) (=1) or not (=0),

Q t :

Single item variable which indicates that resource is occupied solely by item i in period t (=1) or not (=0).

$$ Min\;\sum\limits_{j=1}^{P} {\sum\limits_{t=1}^{T} {\left(sc_{j} Y_{jt} +h_{j} I_{jt}\right)}} $$
(1)
$$s.t. $$
$$ I_{j,t-1} +X_{jt} -I_{jt} =d_{jt} \quad \forall j\in P;\;\forall t\in T $$
(2)
$$ \sum\limits_{j\in K} {\left(a_{j} X_{jt} +st_{j} Y_{jt} \right)} \le C_{t} \,\,\;\forall t\in T $$
(3)
$$ \sum\limits_{j\in K} {W_{jt} } \le 1\quad \;\forall t\in \{2,....,T\} $$
(4)
$$ W_{jt} \le Y_{jt-1} +W_{jt-1} \quad \forall j\in P,\;\forall t\in \{2,....,T\} $$
(5)
$$ W_{jt+1} +W_{jt} \le 1+Q_{t} \quad \forall \;t\in \{1,....,T-1\} $$
(6)
$$ Y_{jt} +Q_{t} \le 1\quad \forall \;t\in \{1,....,T-1\} $$
(7)
$$ X_{jt} \le M(Y_{jt} +W_{jt} )\;\forall j\in P;\forall \;t\in \{1,....,T\} $$
(8)
$$ Q_{t} \ge 0\quad \forall \;t\in \{1,....,T-1\} $$
(9)
$$ W_{jt} \in \left\{ {0,1} \right\}\quad \left(W_{j1} =0\right), \;\;Y_{jt} \in \{0,1\}\;\; \forall j\in P;\forall t\in T $$
(10)
$$ X_{jt} ,I_{jt} \ge 0 {\kern30pt} \forall j\in P;\,\forall t\in T $$
(11)

The minimization of inventory holding and setup costs is given in the objective function (1). The inventory balance equations are stated in constraints (2). Constraints (3) show the capacity constraints. Constraints (4) state that at most one setup state can be preserved from one period to the next. Constraints (5) ensure that a setup can be carried over to period t only if either item j is setup in period t-1 or the setup state is already carried over from period t-2 to t-1. A setup state can only be preserved over two bucket boundaries, if Q t =1 in constraints (6), which is only possible if there is no setup in this period (7). The upper bounds on the production quantities are expressed in constraints (8). Finally, the non-negativity conditions on production and inventory quantities and the binary nature of setup variables are given by (9) to (11). It is assumed that there are no setup carryovers in the first period as stated in constraints (10).

2 The proposed hybrid approach

The main idea of the proposed sequential hybrid approach is to run GAs for a predetermined number of generations and use the overall best solution as the initial solution for the Fix-and-Optimize heuristic. The Fix-and-Optimize heuristic improves the best solution coming from the GAs throughout the iterations. In doing so, we hope that the GAs will locate a promising region where good quality solutions exist and then the Fix-and-Optimize heuristic will refine this region to obtain the best solution.

The setup (Y j t ) and setup carryover (W j t ) variables given in the MIP model are the main elements in the search of the GAs. The values of these binary variables are stated in each chromosome. Fixing these values to zeros and ones in the MIP model results in a linear programming (LP) model from which the objective function value is easily calculated. Therefore, each solution visited in the search space is a chromosome and its fitness is the sum of the objective value of the LP model and fixed setup costs.

2.1 Fix-and-optimize heuristic

Fix-and-Optimize heuristic is a MIP based heuristic in which a sequence of MIP models is solved over all real-valued decision variables and a subset of the binary variables. Note that the number of binary setup and setup carryover variables determines the numerical effort required to solve the MIP model. In recent years, the Fix-and-Optimize heuristic has been widely used in lot sizing area. Sahling et al. [7] employ the Fix-and-Optimize heuristic to solve the multi level capacitated lot sizing problem with setup times and setup carryover and they test product, resource and process oriented decomposition strategies. Lang and Shen [9] develop Fix-and-Optimize and Relax-and-Fix heuristics to address the single machine capacitated lot sizing problem with item substitution options. A new heuristic which combines the principles of Variable Neighborhood Decomposition Search and Fix-and-Optimize heuristic is presented in Seeanner et al. [18] to solve the multi level lot sizing and scheduling problems. Two different Fix-and-Optimize heuristics are presented in Xiao et al. [10] to deal with the CLSP with sequence-dependent setup times, time windows, machine eligibility and preference constraints.

The idea in the Fix-and-Optimize heuristic is to systematically solve a series of sub-problems that are derived from the model stated in Section 2. In each iteration of the algorithm, one sub-problem is solved by setting most of the binary setup and carryover variables to fixed values. This reduction leads to a limited number of non-fixed binary variables which are optimized in a given sub-problem. Then the problem is solved using a standard MIP solver. In the next iteration, a new sub-problem with a different subset of fixed binary variables is solved. The solution with respect to the binary variables is a fixed parameter for the next MIPs that optimize other binary variables [8]. Therefore, in each sub-problem the complete set of real-valued decision variables are considered. The optimization of the model is done within the MIP solver and therefore the approach is flexible [7]. The definition of the sub-problem and algorithm are given in detail in the following sections.

Definition of sub-problems

Simply, a sub-problem can be derived from the model given in Section 2 by adding the following constraints.

$$ Y_{kt} =\bar{{Y}}_{kt} \quad \forall (k,t)\in PT_{Y,s}^{fix} $$
(12)
$$ W_{kt} =\bar{{W}}_{kt} \quad \forall (k,t)\in PT_{W,s}^{fix} $$
(13)

The explanations of the parameters used above can be found in Table 2. Fix-and-Optimize heuristic starts with an initial solution. This initial solution yields an initial objective value which is shown by Z new. After initialization, the algorithm iterates through the ordered set of sub-problems according to time decomposition either once or until it reaches a local optimum.

Table 2 Additional notation for the definition of a sub-problem

Time decomposition:With respect to the setup and setup carryover variables, each sub problem k is defined for a time window of five consecutive periods. For example, in one iteration of the Fix-and-Optimize heuristic, three sub-problems for a problem with 15 periods. (The reader can refer to Gören et al. [1] for further details on time decomposition).

The algorithm

The basic structure of the algorithm is outlined in Fig. 1. The algorithm needs an initial solution to start and goes through the sub-problems defined by time decomposition either once (= max) or until it reaches a local optimum.

Fig. 1
figure 1

The outline of the Fix-and-Optimize heuristic

Each solution to a sub-problem yields an objective value of Z which is at least as good as the current best solution. So, a new solution is accepted only if it yields an objective value lower than the current best solution.

It should be noted that a capacity infeasible solution is never considered as a candidate for the best solution.

2.2 Genetic algorithms

Chromosome representation

A matrix is used for representing the chromosomes (see Gören et al. [1]). The setup variables (Y j t ) and setup carryover variables (W j t ) are stated in the rows of the chromosomes. The setup and setup carryover variables for each product in each period are stated in the columns of the matrix. Thus, a chromosome with the length of the number of products (P) multiplied by the number of periods (T) is obtained.

Proposed Initialization scheme

The initialization scheme generates the chromosomes in two steps. First, setup variables which take place in the first row of the chromosomes are generated. Next, setup carryover variables in the second row of the chromosomes are created (Gören et al. [1]). In creating the first rows of chromosomes, i.e. setup variables, some of them which we call Smart Part are created by utilizing the information gained from the LP relaxation of the CLSP with setup times, and the others called Random Part are created randomly. The intention is that using problem specific information for creating the smart part of the initial population will direct the search towards the search spaces where feasible and good quality solutions exist. The procedure for creating the Smart and Random parts of initial population is explained in Fig. 2. Note that λ shows the number of chromosomes that are generated using problem specific information. The results of an experimental study evaluating different levels of this parameter are given in Section 4. β is another threshold parameter which is used for determining the values of genes in the chromosomes of the Random Part. Based on pilot experiments, this parameter is set to 0.5.

Fig. 2
figure 2

The procedure of creating the initial population

Once the setup variables are generated, the products are prioritized based on descending order of setup costs, next the setup carryover variables are generated following the logic given in the pseudo-code below. It must be noted that the setup carryover variables are generated starting from the second period (see the constraint (10) in the MIP model). A setup carryover for a product in the current period can only be generated if there is a setup and no setup carryover in the previous period. As can be seen in the pseudo-code below, if there is a setup variable in the current period, it is eliminated to attain the feasibility.

i = 1 to POP

   t = 2 to T

    j = 1 to P (Products are descending order based on setup costs)

     IF Y j t−1 = 1 and W j t−1 = 0,

     THEN

     Set W j t = 1

        IF Y j t = 1

        THEN

        Set Y j t = 0

        ENDIF

     ENDIF

     BREAK (go to the next period)

Genetic operators

The modified roulette wheel selection operator defined in Gören et al. [1] is used for selecting the chromosomes for recombination. The single point crossover and single bit flip mutation operators are used as the crossover and mutation operators, respectively. Infeasible chromosomes that violate the constraints related to the semi-sequencing of the products in a period (constraints 4, 5, 6, 7) are repaired and they are put back into the population. Moreover, to reduce the chance of selecting chromosomes that violate the capacity constraint stated in (3), a very high penalty cost is added to the fitness function. Elitism is used as a survival scheme and the search in GAs is terminated when the total number of generations exceeds a maximum number.

3 Computational results

Computational experiments involve two sets of experiments. First, an extensive experimental study was carried out to decide the ratio of smart part to random part in the initial solution by employing three groups of problems (i.e. small, medium and large size). Second, using these best performing ratios for each problem size, the performance of the proposed hybrid approach was compared to that of pure GAs and the latest results reported in the literature, Suerie and Stadtler [6] and Gören et al. [1]. All computations were carried out on a PC with Dual Core, 2 GHz microprocessor and 2 GB RAM. The pure and hybrid GAs were coded in Visual C ++ 2008 Express Edition and all problems were solved using Concert Technology of Cplex 11.2.

3.1 Benchmark problems

Trigeiro et al. [3] proposed an algorithm (referred to as TTM, Trigeiro-Thomas-McClain) to solve the CLSP and generated various test instances to test the performance of this algorithm. These test instances were generated in three phases. Phase one involved 70 problem instances which were used for fine-tuning of the parameters of the TTM algorithm. In phase two, 141 problem instances were used to analyze different problem characteristics, and in phase three, 540 problem instances were generated to test the algorithm. Table 3 gives the classification of these instances [6]. It should be noted that our experimental study does not include the problem instances used in phase one since no results have been reported in the literature for these instances to carry out a comparative study.

Table 3 Classification of TTM-test set [6]

Based on problem complexity, these nine problem classes have been placed into three groups, small, medium and large (See Table 4). Note that problem complexity is measured based on the number of binary variables (number of periods*number of products).

Table 4 Problem sizes

3.2 Experiments for the initialization strategy

The performances of initialization strategies were investigated on three groups of problems (small, medium, large). Five problem instances from each problem class (see Table 3) were solved using the pure GAs with the initial population created by the proposed initialization strategies. As stated earlier, the initial population consists of two parts: random and smart part. One of the design issues in these computational studies is to evaluate the effects of the ratio of smart part to random part on solution quality. The experimental design used in these computations is given in Table 5. The other design parameter is the threshold value (β) used in creating the random part of the initial population. Based on some pilot experiments, we set the threshold value (β) to 0.5 in creating the random part. Throughout all experiments, the same parameters were used (i.e., combination of population size and number of generations, the crossover rate and the mutation rate were set to 20/500, 0.9 and 0.011, respectively) and the results of 10 runs using different random seeds were reported. Therefore, 20 instances for small size problems (200 runs in total), 10 instances for medium size problems (100 runs in total) and 15 instances for large size problems (150 runs in total) were solved. It should be noted that only feasible solutions are taken into consideration while calculating the average gap.

Table 5 Experimental design for the ratio of random part to smart part

The gap is calculated based on the formula given in the following where the lower bounds are taken from the study of Suerie and Stadtler [6].

$$ Gap=100\ast \frac{(heuristic\;solution-lower\;bound)}{lower\;bound} $$
(14)

The results of computational experiments are summarized in Table 6. It should be noted that the computational time is not taken into consideration in comparisons since nearly the same computational time is observed across all problem sizes (i.e. small, medium, large).

Table 6 Results of Comparative Experiments

In Table 6, the results given in parenthesis show the number of infeasible solutions obtained throughout the runs. It should be noted that while calculating the average gap only feasible solutions are taken into consideration.

As seen from the table, using pure random initial population results in large number of infeasible solutions, i.e., 150 infeasible solutions out of 200 runs and 95 infeasible solutions out of 100 runs, for small and medium size problems, respectively. Note that it is not even possible to find a feasible solution for large size problems.

As stated in Section 3, the initialization scheme employed in this study suggests creating a smart part in the initial population by utilizing the information obtained from the LP relaxation of the CLSP which is the basis of the CLSPC. The purpose is to use this information to generate problem specific chromosomes so that the search will be directed towards the search spaces where feasible and good quality solutions exist. As seen from Table 6, the results of experimental studies support our expectation and when some problem specific information is added into the initial population generation scheme, GA finds feasible solutions in nearly all runs for all problem sizes. It is also quite clear from Table 6 that for all problem sizes, generating some portion of the initial population randomly creates diversity in the initial population and this further improves the performance of GAs in solving the CLSPC. When both randomness and problem specific information are included in the initial population, the ratios of random part to smart part giving minimum average gap are found as RS02, RS and RS04 for small, medium and large size problems, respectively. Having observed the same behavior for all problem sizes we can state that keeping the portion of the smart part bigger than the portion of the random part improves the performance of the proposed initialization scheme.

Based on these results, it can be stated that the initial population generation method using problem specific information scheme has high potential to reduce the number of infeasible solutions in each generation and hence it can ensure a feasible solution at each run of the GAs. Other insights gained as a result of these experimental studies are that generating some part of the initial population randomly and keeping the smart part larger than random part further improve the performance of the proposed initialization scheme.

3.3 Comparative experiments

This section presents the results of the computational studies comparing the performance of the pure GAs and proposed sequential hybrid approach to the recent results reported in the literature. It should be noted that the problem instances generated in phase two and three (i.e. total of 681 instances) were used to evaluate the performance of the proposed sequential hybrid approach and pure GAs (PGA). The experiments were repeated 10 times for every instance and the solutions with the best and average performance were recorded. We also carried out some preliminary experiments and ANOVA to determine the most appopriate GAs parameters (see Table 7).

Table 7 Parameter settings for each problem size

To obtain compatible test results for each problem class, the computational times for the proposed hybrid approach were set to the average computational time (i.e. the time calculated based on the computational times of 10 runs) of the PGA. For instance, the proposed hybrid approach was run for 58.75 seconds for each run to solve the problem instances in Class 1 (see Table 8). Moreover, as a result of some preliminary tests the computational time to solve each problem using the Fix-and-Optimize heuristic was limited to 2 seconds. Lastly, based on the experimental studies presented in earlier section, the ratio of random part to small part in the initial population of GAs was set to RS02, RS and RS04, for small, medium and large size problems, respectively.

Table 8 Computational results

When compared to the PGA, the solution quality of the proposed sequential hybrid approach which uses the Fix-and-Optimize heuristic after GAs is quite remarkable for every problem class. As seen in Table 8, the average gaps obtained by the hybrid approach are much smaller than those of the PGA. Table 8 also presents the average gaps reported in Gören et al. [1] and Suerie and Stadtler [6]. Considering the average of ten runs the proposed sequential hybrid approach outperforms the time decomposition heuristic of Suerie and Stadtler (2003) for the problem instances in Classes 7, 8, and 9. Since problem classes 8 and 9 include large size problem instances we might state that the performance of proposed hybrid approach improves as problem size increases. However, regarding the computational time, the proposed hybrid approach requires much more computational effort than the time decomposition heuristic.

When compared to the results obtained in Gören et al. [1], the solution quality of the proposed approach both with respect to average of ten runs and the best of ten runs shows a better performance in almost every problem class. This could be attributed to the two factors. As mentioned earlier, while in Gören et al. [1] Fix-and-Optimize heuristic is embedded into the GA, in this study, the GAs and Fix-and-Optimize heuristic are run sequentially. Besides employing a new hybridization scheme, in this study, we carried out a comprehensive experimental analysis to determine the best performing ratios of random to small part in generation of the initial population. Hence, unlike Gören et al. [1] in which half of the initial population was created randomly while the other half was created smartly, in this study, these best performing ratios were used when generating the initial population. The general conclusion which can be drawn from this comparative experimental study is that apart from requiring a long computational time to find good quality solutions, hybridising the pure GAs with the MIP-based heuristic in a sequential way significantly improves its performance.

4 Conclusion and future research directions

Lot sizing problems determine when and how many items (i.e. lot size) of a particular product to produce for a given horizon. Lot sizing problem is one of the most challenging production planning problems and solving this problem optimally has an important effect on the efficiency of the production and inventory systems.

In this study, we proposed a sequential hybrid approach for solving the CLSP with the extension of the setup carryover, CLSPC. The proposed approach combines Fix-and-optimize heuristic with GAs and the hybridization scheme is a sequential scheme where Fix-and-Optimize heuristic approach is executed after GAs. To evaluate the performances of pure GAs and proposed hybrid approach, we carried out comparative experiments using benchmark problems reported in the literature. The results show that the performance of pure GAs improves notably when hybridized with the Fix-and-Optimize heuristic and the solution quality of the proposed approach is superior compared to the recent results in the literature.

As a future research direction, the proposed hybrid approach can be extended to solve the multi-level capacitated lot sizing problems. This study focuses on a deterministic lot sizing problem where demand is assumed to be known. Another future research direction can be to solve the stochastic version of this problem using the proposed hybrid method. The recent trend in lot sizing field is to build multi objective problems which integrate lot sizing, distribution and scheduling decisions. Solving these multiobjective problems using evolutionary approaches such as GAs can be another interesting future research direction.