1 Introduction

In the current context of sustainable development, the cultivation practices favoring minimization of environmental degradation through alternative ways that avoid the intensive use of chemical products to combat pests besides increasing soil usage are being intensively studied. These practices, if adopted, can prevent an environmental crisis on the planet and make agricultural products healthier.

In this sense, one of the central focuses in crop production discussed of late is the use of measures designed to target sustainable and ecological planning, in the light of environmental degradation that has occurred in recent years. For this reason, the planning of agriculture activities, such as crop rotation, has been gaining prominence in the studies aimed at sustainable cultivation, since it is one of the means of cultivation whose practical principles facilitate ecological and productive agriculture. This practice, once conducted by the rural farmers, brings numerous benefits, since the control of pests, pathogens and weeds are performed biologically, thus decreasing the harmful effects of pesticides to the environment and leading to actions promoting soil recovery, thus making it forever fertile, as stated by Snapp et al. (2005).

In this paper, we present a nonlinear, more specifically, a bi-objective binary quadratic optimization model inspired by Santos et al. (2010), Santos et al. (2011), Santos et al. (2015a) and Aliano Filho et al. (2014). The first three papers referred to presented versions of an integer linear programming model, considering binary variables, constraints of demand, adjacent plots, green manure and fallow, whose aim was to maximize a single objective function (representing the total time of occupation for the planting area or the total profit of the plantation). These papers were focussed on exact methodologies based on branch-and-bound and column generation. In the fourth paper, a different approach was employed. The authors proposed several metaheuristics (evolutionary algorithms, simulated annealing and hybrids) to solve the crop rotation problems presented in Santos et al. (2010), Santos et al. (2011). The objective function, to maximize, represents the total profit in the planning horizon. One of the main features of this paper is to consider a formulation inspired on those works, but here we consider two conflicting objective functions.

The mathematical model considered in this paper deals with two conflicting objectives. The first one, of environmental nature, is defined to minimize the possibility of pest dissemination among the crops in the planting area, along the planning horizon. A quadratic function is proposed to measure the proliferation of pests which depends, essentially, of the crops’ botanical families and of the distance between the respective plots. This minimization ensures a decrease in the use of chemical pesticides to combat pests and weeds. Moreover, the maintenance of several crops in the same farm maintains the soil fertile, with high concentration of nitrogen and organic matter, thus reducing the amount of artificial fertilizers. The second objective is purely economic and is defined to maximize the total profit of the plantation along the planning horizon.

To determine Pareto optimal solutions for the bi-objective binary quadratic optimization problem, the weighted sum method was used. The mono-objective problems of this approach were initially solved using an integer quadratic programming solver. In a second phase, a linearization strategy for the original model was applied, thus obtaining a mixed binary linear programming formulation. The respective mono-objective problems were solved using an integer linear programming solver. The high number of integer variables and the computational difficulties arising when running the solvers caused the authors to develop a genetic algorithm (GA) to solve the mono-objective problems of the weighted sum method. Two constructive heuristics were used, with the purpose of improving the performance of the GA. Computational results have shown that this methodology is efficient, in the sense that it obtains potentially Pareto optimal solutions for the bi-objective problem in a CPU time below the exact methods, specially for the high dimension instances. This enables us to assume that the developed heuristic algorithm is an excellent tool for supporting decision-making in this field.

This work introduces the following innovative aspects:

  • The adaptation of a bi-objective binary mathematical programming model for sustainable cultivation;

  • A comparative computational study between integer quadratic and linear versions of a mono-objective problem used to determine Pareto optimal solutions for the problem of sustainable cultivation;

  • The development and computational testing of a GA, embedding two constructive heuristics which determines potentially Pareto optimal solutions for the problem of sustainable cultivation.

The remainder of this paper is organized as follows. Section 2 briefly presents the problem of sustainable cultivation and correlated studies in the operational research area. Section 3 presents the mathematical modeling of the problem, including a quadratic and a linearized model, all employed to obtain Pareto optimal solutions for the problem. Section 4 describes the GA proposed, including the constructive heuristics necessary for this algorithm. In Sect. 5, 27 real-size instances are also introduced to computationally validate the metaheuristic and the mathematical models. Finally, Sect. 6 describes the conclusions of this work.

2 Sustainable agriculture and crop rotation

According to Tilman et al. (2002), the global demand for food in 2052 will double its figures in 2002. As predicted by these authors, food production designed to guarantee sustainability is a huge challenge for agro-industrial management. New studies and new incentives are required to find a way to improve food production, while preserving environmental integrity and public health.

Parra (2002) presented a work that clarifies the advantages of using the crop rotation technique and shows the benefits of this technique to the natural control of pests, soil protection and water resources. As stated by this author, a sustainable agriculture committed to the environment and natural resources must intensively use crops planted sequentially in the same field. It is an important practice for any sustainable agricultural system because different crops improve the physical, chemical, and biological characteristics of the soil. This practice retains soil fertility and prevents the appearance of disease, insect infestation, and weeds.

According to Havlin et al. (1990), West and Post (2002) and Costa et al. (2014), maintaining and increasing soil productivity depends, essentially, on the management practices of the crops. Usually, they cultivate organic leaf matter and increase nitrogen and carbon fixation by the atmosphere in the soil, all vital components in good production. In this context, crop rotation is an excellent technique to increase soil productivity.

Planning a rotation with diverse crops is not an easy task. Mathematical modeling with operational research techniques can be useful tools in decision-making. Santos et al. (2010) considered the problem of crop rotation in which one must meet known crop demands while respecting ecologically-based production constraints. The authors proposed a binary linear optimization model, where each variable is associated with a feasible crop rotation plan. Due to the large number of variables of the model, a column generation method was used to solve it. Santos et al. (2011) proposed a binary linear optimization model to determine a schedule of crop rotation for each plot, thus maximizing land occupation time. In addition to the restrictions considered in the previous work, planting constraints for adjacent plots and for crop sequences in the rotations were included. The authors proposed a column generation heuristic to deal with the large number of model variables. In Santos et al. (2015b) a binary programming formulation for this last problem. but considering decisions on plot sizes and schedules was proposed. A branch-price-and-cut algorithm was developed and extensive computational experiments over a set of instances based on real-life data were performed.

Alfandari et al. (2011) proposed a Mixed-Integer Linear Programming model for a multi-period crop rotation optimization problems with demand constraints and incompatibility constraints between cultivation and fallow state on a plot. The model, whose objective was maximize the control agricultural space, was applied to a case study on Madagascan farms.

Frequently, optimization is based on a single monetary criterion, for instance profit maximization. Some authors, as Dury et al. (2012), pointed out the limitations of an approach that focuses exclusively on maximizing returns. Therefore, the growing environmental concerns led researchers to explicitly segment other objectives beyond profitability, such as minimization of total labor, the equipment employed, herbicide usage, losses, pesticide exposure, erosion; and maximization of irrigated areas, organic matters’ rate of change, gross margin, among others.

In this new perspective, models with multiple objectives have increasingly been used in agriculture applications, as mentioned by Huang et al. (2011). The multi-objective methodologies are becoming popular due to the flexibility that allows decision-makers to consider all the criteria and objectives simultaneously (Kumar et al. 2017). Dogliotti et al. (2005) developed an approach to support farmers with temporal complex interactions in crop rotations and spatial heterogeneity, revealing a trade-off between the economic and environmental objectives. These authors proposed a mixed integer linear programming model to allocate production activities at a farm with land units differing in soil quality, while minimizing the socioeconomic and environmental objectives, subject to constraints at the farm level.

Hayashi (2000) presented a review of multicriteria analysis applied to agricultural resource management. Methods for selecting multi-attribute discrete alternatives and solving multi-objective planning problems are revised. Annetts and Audsley (2002) proposed multi-objective linear programming models that consider a wide range of farming situations, which permits optimization of profit or environmental outcome, or in fact both. The objective is to identify the best cropping and machinery options which are both profitable and result in improvements to the environment.

The recent paper by Boyabatlı et al. (2019) examines crop rotation in a sustainable context taking into account revenue uncertainty and allocation of multiple crops to the plots. The authors have shown that the cultivation of unique crop over the entire planning horizon, as employed in industrial agriculture, leads to a considerable profit loss and soil wear. The paper by Nechi et al. (2019) develops a model of goal programming with several criteria, where stakeholders’ preferences are explicitly integrated within a group decision-making process based on consensus and tradeoffs.

In the last 15 years, evolutionary algorithms have been used to support multi-objective cropping plan decisions. Aliano Filho et al. (2014) addressed a crop rotation problem, modeled as a binary linear programming problem. The objective is to maximize the profit from rotation, while incorporating the same ecological constraints of the model presented in Santos et al. (2015b). The authors proposed metaheuristics to solve the problem, as well as greedy constructive heuristics, which inspired the methods presented in this paper.

According to Rossing and Hammer (2006), the main advantage of using genetic algorithms is to produce a set of compromise solutions along the Pareto frontier entailing a low computation effort. These authors used a multi-objective evolutionary algorithm to solve a crop choice model, and applied it in North East Australia. The concept of sustainability of the cultivating systems is considered in terms of resources used besides other economic aspects, but does not take into account the total profit of the agricultural activity. Sarker and Ray (2009) formulated a crop-planning problem as a multi-objective optimization model and provided useful insights into solutions that are generated by using three different evolutionary strategies. The authors also considered a bi-objective formulation with continuous variables for crop rotation, where the first objective is to maximize the total gross margin (from cultivated plus imported crops) in a single crop year. The second one is to minimize the total working capital required. These are two conflicting economic objectives.

In our paper, we highlight some innovative aspects, such as the adaptation of a crop programming model [inspired in the works Santos et al. (2010, 2011, 2015b) and Aliano Filho et al. (2014)] and introduce two objective functions to balance sustainability and profit. The term “sustainability” is employed in this context because the mathematical model promotes the choice of crops to entry in the planting schedule taking into account environmental degradation, through reduction in intensive pesticide use and the maintenance of soil natural resources.

The methodologies developed provide the manager with a set of several solutions and correspond to distinct scenarios that facilitate his decision-making. We will show that the multi-objective optimization tool enables one to analyze a reduction in environmental impact, along with a modest decrease in profits.

In addition, linked to computational difficulties of the proposed mono-objective (quadratic and linear) models for the weighted sum method, a comparative computational study with different weights given to each function, is performed. To complete this analysis, we conducted a comparison between the exact and heuristic solution methodologies. These aspects show that this study addresses some points that were not covered by the specific literature of this field.

The problem of sustainable cultivation studied can encompass two distinct production scenarios. In the first scenario—cereal cultivation applied in areas of high production—where there are fewer varieties, we highlight soybeans, corn, wheat, rice, and beans as the most common crops. In the second situation, small farmers cultivate on a small scale, in response to a demand from the region, whose focus is placed on vegetables. In this case, the number of cultivated varieties (N) and plots (K) are larger (N can vary from 10 to 20 varieties and K varies from 10 to 15 plots), and the varieties are lettuces, potatoes, tomatoes, peppers, watermelons, and pumpkins, among others.

In summarizing, the problem of sustainable cultivation consists in determining efficient solutions or Pareto optimal solutions to the bi-objective problem of scheduling different varieties of plants/crops in a given set of plots. The planting schedule is characterized by the crops that should be cultivated, the dates and plots where they should be planted, with the purpose of minimizing the possibility of pest proliferation among the crops in the whole planting area and to maximize the profit obtainable in a specified planning horizon. Constraints for crop planting of the same botanical family in alternated plots, for non-overlapping planting and forcing all the plots to share the same cycle duration are also considered.

3 Mathematical modeling

This section is devoted to the quadratic and the linear bi-objective binary mathematical programming models developed for the problem of sustainable cultivation. The constraints associated with non-overlapping of planting constraints and cultivating consecutive plants in the same plot, were inspired by Santos et al. (2010, 2011, 2015b). The objective function regarding the possibility of pest proliferation is originally proposed by the authors in this paper. The additional components of the formulation considered in this work are based on Aliano Filho et al. (2014).

The indexes and parameters for the models are defined as follows:

Indexes

  • i and \(\bar{i}\): related to the crops/cultures;

  • j and \(\bar{j}\): related to the plots;

  • t: related to the planning horizon periods;

  • p: related to the botanical family of plants.

Parameters

  • N: number of crops available for planting (\(N \ge 2\));

  • K: number of available plots;

  • T: duration of the planning horizon;

  • \(N_f\): number of botanical families of plants (\(N_f \ge 2\));

  • \(F_p\): set of crops belonging to the family p, \(p=1,\ldots ,N_f\);

  • \(C_i\): duration of crop life cycle i;

  • \(l_i\): profitability of crop i per hectare;

  • \(area_j\): area of plot j in hectares;

  • \(P_{i\bar{i}}\): probability of pest infestations between the crops i and \(\bar{i}\). Pest infestation has a higher probability for similar crops and lower probability when the crops are botanically different. Therefore, we propose the following probability of infestation:

    $$\begin{aligned} P_{i\bar{i}}=\left\{ \begin{array}{lll} 0.9, &{}\quad \text {if }i\text { and }\bar{i}\text { are from the same botanical families,} \\ 0.5, &{}\quad \text {if }i\text { and }\bar{i}\text { are from botanical families of degree between 2 and 3,} \\ 0.1, &{}\quad \text {if }i\text { and }\bar{i}\text { are botanical families with degree equal greater than 4,} \end{array} \right. \end{aligned}$$

    where the degree between two crops i and \(\bar{i}\) (\(i \ne \bar{i}\)) is defined as the difference (in module) in the index of the botanical families to which they belong. The decision about three distinct and spaced values for the probability of pest proliferation between two crops came from previous computational experiments performed with the aim of strongly penalizing plantation of similar crops in nearby plots, thus promoting the cultivation of very distinct crops in nearby plots.

  • \(V_{j\bar{j}}\): the probability of pest infestation from plot j to plot \(\bar{j}\), \(j\ne \bar{j}\), and vice-versa. We propose the following formula to calculate \(V_{j\bar{j}}\):

    $$\begin{aligned} V_{j\bar{j}}=\frac{1}{1+d_{j\bar{j}}}, \end{aligned}$$

    in which \(d_{j\bar{j}}\) is the shortest distance between plots j and \(\bar{j}\). Note that, if \(d_{j\bar{j}} \rightarrow 0\) then \(V_{j\bar{j}} \rightarrow 1\); if \(d_{j\bar{j}} \rightarrow \infty \) then \(V_{j\bar{j}} \rightarrow 0\). This means that neighboring plots are more likely to be simultaneously infected.

The decision variables are defined as follows:

$$\begin{aligned} x_{ijt}=\left\{ \begin{array}{lll} 1,&{}\quad \text {if crop }i\text { is planted in plot }j\text { in the period } t, \\ 0, &{}\quad \text {otherwise,} \end{array} \right. \end{aligned}$$

for all \(i=1,\ldots , N\), \(j=1,\ldots ,K\) and \(t=1,\ldots ,T\).

A bi-objective quadratic binary formulation for the problem of sustainable cultivation is presented as follows:

$$\begin{aligned} \text {minimize}&z_1=\sum _{i=1}^{N}\sum _{\bar{i}=1}^{N}\sum _{j=1}^{K-1}\sum _{\bar{j}=j+1}^{K}\sum _{t=1}^{T} P_{i \bar{i}} \cdot V_{j \bar{j}} \cdot \left( \sum _{r=0}^{C_i-1} x_{ij(t-r)} \right) \cdot \left( \sum _{r=0}^{C_{\bar{i}}-1} x_{\bar{i}\bar{j}(t-r)} \right) \quad \end{aligned}$$
(1)
$$\begin{aligned} \text {maximize}&z_2 = \sum _{i=1}^{N}\sum _{j=1}^{K}\sum _{t=1}^{T} l_{i} \cdot area_{j} \cdot x_{ijt}\end{aligned}$$
(2)
$$\begin{aligned} \mathrm {subject\,\, to}&\sum _{i=1}^{N}\sum _{r=0}^{C_i-1}x_{ij(t-r)}\le 1, \quad j=1,\ldots , K,\quad t=1,\ldots , T \end{aligned}$$
(3)
$$\begin{aligned}&\sum _{i \in F_p}\sum _{r=0}^{C_i}x_{ij(t-r)}\le 1, \quad p=1,\ldots , N_f, \quad j=1,\ldots , K,\quad t=1,\ldots , T \end{aligned}$$
(4)
$$\begin{aligned}&\sum _{i=1}^{N}\sum _{t=1}^{T} C_i \cdot x_{ijt} = T, \quad j=1,\ldots , K \end{aligned}$$
(5)
$$\begin{aligned}&x_{ijt} \in \{0,1\}, \quad i=1,\ldots , N, \quad j=1,\ldots , K, \quad t=1,\ldots , T, \end{aligned}$$
(6)

where \(t-r \le 0\) is replaced by \(t-r+T\).

The objective function (1), to be minimized, represents the possibility of pest proliferation throughout the planting area, when considering the whole planning horizon. The closer the crops of the same botanical family of plants are, the greater the proliferation. Maximization of objective function (2) aims to optimize the profit of the planting schedule. Note that these objectives conflict with each other. On the one hand, if this last function is maximized, only the crops that provide greater profit will be selected for planting. Hence, the variability of the crops chosen to be planted will be reduced. On the other hand, if only \(z_1\) is minimized, crops of different botanical families offering a lower profit should be chosen, thus resulting in a lower overall profit.

Constraints (3) prevent planting overlapping, that is, constraints which enable a new crop to be inserted in a plot before concluding the \(C_i\) periods of cultivation of the previous crop (i) in the plot. Constraints (4) prevent crops of the same botanical family from being cultivated in the same plot in consecutive periods. Constraints (5) ensure that all the T periods of the planning horizon are used in each plot. Finally, constraints (6) impose the domain of all the decision variables of this model.

It is important to note that some previous papers, such as Santos et al. (2010, 2011, 2015b) and Aliano Filho et al. (2014), consider fallow practice as an important requirement in the elaboration of a crop rotation schedule. However, we do not require this practice. From the economic point of view, it is of no interest for farmers to leave an area fallow for a certain period, but rather to always ensure it is occupied with a profitable crop.

To obtain Pareto optimal solutions for this bi-objective problem, we balanced \(z_1\) and \(z_2\) by attributing weights to these functions and minimizing the resulting sum. In this way we solved the mono-objective problem:

$$\begin{aligned} \begin{array}{lll} \text {minimize}&{}&{}z=\uplambda \cdot \upbeta _1 \cdot z_1 + (1-\uplambda ) \cdot \upbeta _2 \cdot (-z_2) \\ \text {subject to}&{}&{} \text {constraints }(3){\text {--}}(6), \end{array} \end{aligned}$$
(7)

where \(\uplambda \in [0,1]\). According to Miettinen (2012), for each value of \(\uplambda > 0\), the solution to this mono-objective problem is efficient for the problem (1)–(6). The constants \(\upbeta _1\) and \(\upbeta _2\) selected in the range [0, 1] normalize the objectives. Such constants may be determined by solving the problems:

$$\begin{aligned} \mathbf {x}_1^*=\text {arg min} \{z_1:\text {s.t. } (3){\text {--}}(6) \} \quad \text {and} \quad \mathbf {x}_2^*=\text {arg max} \{z_2: \text {s.t. } (3){\text {--}}(6)\}, \end{aligned}$$
(8)

and, calculating:

$$\begin{aligned} \mathop {\upbeta }\nolimits _1=\frac{1}{z_1(\mathbf {x}_2^*)-z_1(\mathbf {x}_1^*)} \quad \text {and} \quad \mathop {\upbeta }\nolimits _2=\frac{1}{z_2(\mathbf {x}_1^*)-z_2(\mathbf {x}_2^*)}. \end{aligned}$$
(9)

As observed, the models (1)–(6) and (7) are nonlinear. More precisely, the function \(z_1\) is a quadratic one. The inconvenience of these models is that there is no guarantee that they are convex, thus causing difficulties for the integer quadratic programming solvers. In an attempt to circumvent this problem, we adopted an integer programming modeling technique to linearize the models, which consists in replacing the product \(\left( \sum _{r=0}^{C_i-1} x_{ij(t-r)} \right) \cdot \left( \sum _{r=0}^{C_{\bar{i}}-1} x_{\bar{i} \bar{j} (t-r)} \right) \) by \(z_{i\bar{i}j\bar{j}t}\) and imposing the linear constraints:

$$\begin{aligned} \sum _{r=0}^{C_i-1} x_{ij(t-r)} + \sum _{r=0}^{C_{\bar{i}}-1} x_{\bar{i} \bar{j} (t-r)} -z_{i\bar{i}j\bar{j}t}\le & {} 1, \end{aligned}$$
(10)
$$\begin{aligned} z_{i\bar{i}j\bar{j}t}\le & {} \sum _{r=0}^{C_i-1} x_{ij(t-r)},\end{aligned}$$
(11)
$$\begin{aligned} z_{i\bar{i}j\bar{j}t}\le & {} \sum _{r=0}^{C_{\bar{i}}-1} x_{\bar{i} \bar{j} (t-r)}, \end{aligned}$$
(12)

for all \(i=1,\ldots ,N\), \(\bar{i}=1,\ldots ,N\), \(j=1,\ldots ,K-1\), \(\bar{j}=j+1,\ldots , K\), \(t=1,\ldots ,T\) where \(z_{i\bar{i}j\bar{j}t} \ge 0\) are the new variables.

The linearized version of model (1)–(6) is presented below:

$$\begin{aligned} \text {minimize}&z^l_1=\sum _{i=1}^{N}\sum _{\bar{i}=1}^{N}\sum _{j=1}^{K-1}\sum _{\bar{j}=j+1}^{K}\sum _{t=1}^{T} P_{i \bar{i}} \cdot V_{j \bar{j}} \cdot z_{i\bar{i}j\bar{j}t} \end{aligned}$$
(13)
$$\begin{aligned} \text {maximize}&z_2 = \sum _{i=1}^{N}\sum _{j=1}^{K}\sum _{t=1}^{T} l_{i} \cdot area_{j} \cdot x_{ijt}\end{aligned}$$
(14)
$$\begin{aligned} \mathrm {subject\,\, to}&\sum _{i=1}^{N}\sum _{r=0}^{C_i-1}x_{ij(t-r)}\le 1, \quad j=1,\ldots , K,\quad t=1,\ldots , T \end{aligned}$$
(15)
$$\begin{aligned}&\sum _{i \in F_p}\sum _{r=0}^{C_i}x_{ij(t-r)}\le 1, \quad p=1,\ldots , N_f, \quad j=1,\ldots , K,\quad t=1,\ldots , T \qquad \end{aligned}$$
(16)
$$\begin{aligned}&\sum _{i=1}^{N}\sum _{t=1}^{T} C_i\cdot x_{ijt} = T, \quad j=1,\ldots , K \end{aligned}$$
(17)
$$\begin{aligned}&\text {constraints }(10){\text {--}}(12) \end{aligned}$$
(18)
$$\begin{aligned}&x_{ijt} \in \{0,1\}, \quad z_{i\bar{i}j\bar{j}t} \ge 0, \end{aligned}$$
(19)
$$\begin{aligned}&i=1,\ldots , N, \quad \bar{i}=1,\ldots , N, \quad j=1,\ldots , K-1, \quad \nonumber \\&\bar{j}=j+1,\ldots K, \quad t=1,\ldots , T, \end{aligned}$$
(20)

where \(z_1^l\) is \(z_1\) linearized and \(t-r \le 0\) is replaced by \(t-r+T\). The linear reformulation of the weighted sum model is given by:

$$\begin{aligned} \begin{array}{lll} \text {minimize}&{}&{}z=\uplambda \cdot \upbeta _1 \cdot z^l_1 + (1-\uplambda ) \cdot \upbeta _2 \cdot (-z_2) \\ \text {subject to}&{}&{} \text {constraints }(15){\text {--}}(20). \end{array} \end{aligned}$$
(21)

In Sect. 5, we compare the performance of standard software applied to Problem (7) and its linearized version (21) using three values of \(\uplambda \). The processing time of these exact methods corresponding to each of these weights is verified. The Pareto optimal solutions determined are also commented on. However, these exact methods were unable to tackle large instances. In fact, for several instances, the solver did not even find a feasible integer solution to the quadratic Problem (7) and to the linear Problem (21), in one hour’s execution.

4 The genetic algorithm

In this section a genetic algorithm (GA) with two constructive heuristics for the problem of sustainable cultivation is presented. We opted for the metaheuristic GA because it is simple to implement, it has incurred low computational costs and provided good results for combinatorial problems according to Han and Kim (2000), Deb (2001), Rossing and Hammer (2006), Sarker and Ray (2009) and de Oliveira Florentino et al. (2018), among other studies.

The core features of the proposed GA are the solution encoding and mainly the constructive heuristics developed for the initial population, all presented with additional details of this method in the following subsections.

4.1 Solution encoding

In the GA for the problem studied in this work, an individual corresponds to a solution, feasible or otherwise, to the problem. The strategy of encoding that follows was inspired in Aliano Filho et al. (2014) and de Oliveira Florentino et al. (2018). Each individual \(\mathbf {I}=(i_{kt}) \in \mathbb {N}^{K \times T}\) is a matrix, and its entry \(i_{kt} \in \{1,\ldots ,N\}\) represents the crop index that is planted in lot k and in period t, \(k=1,\ldots ,K\) and \(t =1,\ldots ,T\). This codification has the advantage of being simple to implement and provides all the information required. To facilitate ones understanding of solution encoding, we present an illustrative example.

Example 1

(Illustration of a feasible solution) The data for the crops are shown in “Appendix”, Table 11. Let \(N=20\), \(K=9\) and \(T=12\). A possible solution to this problem could be given by the matrix \(\mathbf {I} \in \mathbb {N}^{9 \times 12}\) below:

$$\begin{aligned} \mathbf {I}= \left[ \begin{array}{cccccccccccc} 11 &{}\quad 6 &{}\quad 6 &{}\quad 8 &{}\quad 8 &{}\quad 18 &{}\quad 18 &{}\quad 15 &{}\quad 15 &{}\quad 17 &{}\quad 17 &{}\quad 11 \\ 8 &{}\quad 16 &{}\quad 1 &{}\quad 1 &{}\quad 13 &{}\quad 13 &{}\quad 13 &{}\quad 9 &{}\quad 9 &{}\quad 9 &{}\quad 20 &{}\quad 8 \\ 7 &{}\quad 7 &{}\quad 17 &{}\quad 17 &{}\quad 11 &{}\quad 11 &{}\quad 2 &{}\quad 2 &{}\quad 2 &{}\quad 4 &{}\quad 4 &{}\quad 7 \\ 18 &{}\quad 13 &{}\quad 13 &{}\quad 13 &{}\quad 6 &{}\quad 6 &{}\quad 11 &{}\quad 11 &{}\quad 4 &{}\quad 4 &{}\quad 20 &{}\quad 18 \\ 8 &{}\quad 19 &{}\quad 19 &{}\quad 19 &{}\quad 7 &{}\quad 7 &{}\quad 7 &{}\quad 6 &{}\quad 6 &{}\quad 10 &{}\quad 20 &{}\quad 8 \\ 20 &{}\quad 11 &{}\quad 11 &{}\quad 8 &{}\quad 8 &{}\quad 5 &{}\quad 5 &{}\quad 18 &{}\quad 18 &{}\quad 4 &{}\quad 4 &{}\quad 16 \\ 9 &{}\quad 9 &{}\quad 9 &{}\quad 4 &{}\quad 4 &{}\quad 13 &{}\quad 13 &{}\quad 13 &{}\quad 6 &{}\quad 6 &{}\quad 10 &{}\quad 20 \\ 9 &{}\quad 2 &{}\quad 2 &{}\quad 2 &{}\quad 6 &{}\quad 6 &{}\quad 16 &{}\quad 10 &{}\quad 15 &{}\quad 15 &{}\quad 9 &{}\quad 9 \\ 15 &{}\quad 11 &{}\quad 11 &{}\quad 4 &{}\quad 4 &{}\quad 1 &{}\quad 1 &{}\quad 7 &{}\quad 7 &{}\quad 7 &{}\quad 20 &{}\quad 15 \end{array} \right] . \end{aligned}$$

In this solution, the planting of crop \(i = 6\) is programmed in plot \(k = 1\) in period \(t = 2\) and it stays in this plot until it is harvested in the period \(t = 3\), when the crop \(i = 8\) enters in period \(t = 4\) and remains until period \(t = 5\). Finally, in period \(t = 12\), crop \(i = 11\) is planted and occupies the space until its planting period has expired, that is, at \(t = 13\) (which corresponds to the first period, according to the mathematical formulation).

4.2 Initial population

The initial population of the GA is carefully created. The basic idea consists in generating individuals with enough diversity to allow the algorithm to combine the varied characteristics of the individuals and produce diverse potentially Pareto optimal solutions to the problem.

This initial population is generated by two different heuristics, named MINPEST and MAXPROFIT. We try to create solutions with distinct characteristics in order to optimize the two conflicting functions, \(z_1\) and \(z_2\). One of the constructive heuristic attempts to produce solutions that have a low probability of pest proliferation among the crops, by mixing different plants. The other constructive heuristic attempts to obtain high profit-making solutions, by forcing only the more profitable crops to enter the schedule. These two ways of building solutions are necessary for this GA, because the objectives conflict with each other. According to the weight \(\uplambda \) given to the objective \(z_1\), the GA will determine a good solution with respect to the function \(z=\uplambda \cdot \upbeta _1 \cdot z_1 + (1-\uplambda )\cdot \upbeta _2\cdot z_2\).

Let n be the total number of individuals of this populationFootnote 1. It is calculated as follows:

  • \(\frac{n}{2}\) by the MINPEST heuristic;

  • \(\frac{n}{2}\) by the MAXPROFIT heuristic.

The two constructive heuristics, MINPEST and MAXPROFIT, are presented in the following subsections.

4.2.1 MINPEST heuristic

This heuristic is based on the heuristic in Aliano Filho et al. (2014), which builds a solution to Problem (21) with \(\uplambda =0.0\). This heuristic aims to ensure the sequential insertion of different botanical families in the same plot. Each individual satisfies the constraints of the problem, except the period continuity at \(t = T\) and \(t = 1\). It is possible that the crops of these periods may belong to the same family. Should that happen and a shift of elements (in the same lot) is performed, then this inadmissibility may occur in intermediate periods of the planning horizon. Later, we will consider a way to overcome this drawback.

The pseudocode of this heuristic is in Algorithm 4.1. Initially, for each lot k we sort a permutation \(\mathcal {P}_k\) of crops’ indexes. We define the sequence of crops \(\mathbf {v}_k\) containing the planting schedule for plot k and start to run along vector \(\mathcal {P}_k\) and seek the first element whose crop i is not from the same botanical family of the last crop inserted in \(\mathbf {v}_k\). This ensures that the constraints of consecutive planting are always respected. After finding one such crop, it will be inserted in the schedule and \(\mathbf {v}_k\) is updated. The construction is made until the period of the new crop to be inserted in \(\mathbf {v}_k\), added to the number of periods of the crops that were already inserted in \(\mathbf {v}_k\) is equal to T. Should any crop in \(\mathcal {P}_k\) satisfy this, a new permutation is generated.

With the indexes of the crops to be planted in plot k in the T periods, we perform the procedure referred to as prolongation, described as follows. Let the row k of the individual \(\mathbf {I}\) be denoted by \(\mathbf {I}_k\). This row is built from the vector \(\mathbf {v}_k\). Each crop i is repeated as many times as the duration of its period of cultivation, \(C_i\). This is done until all the crops of \(\mathbf {v}_k\) are considered. In the end, \(\mathbf {I}_k\) has length T. Therefore, the final individual \(\mathbf {I}\) is the string concatenation (in rows) of \(\mathbf {I}_k\).

This heuristic produces solutions that always start their cultivation in period \(t = 1\) and end in period \(t = T\). In general, the solution does not necessarily possess this characteristic, which can be seen in individual \(\mathbf {I}\) from Example 1. To produce greater variability in the population solutions, we perform a procedure named shift in each row \(\mathbf {I}_k\) of \(\mathbf {I}\), with a probability of 70%. If it occurs, we shift the elements of \(\mathbf {I}_k\) in a positions to the right of their original positions. Here, a is an integer randomly generated from 1 to T (uniform distribution).

Example 2

(Construction of an individual with the MINPEST heuristic) We will build the scheduling for lot \(k=1\), \(\mathbf {I}_1\), considering the data of the crops displayed in Table 11, \(N=10\) and \(T=12\).

Step 1:

We randomly generated \(\mathcal {P}_1=(5,10,8,7,1,9,2,6,4,3)\).

Step 2:

Let \(i = 5\) be the first crop to be inserted in the schedule. The calendar for planting crop 5 has \(C_5 = 2\) duration periods.

Step 3:

Let \(i = 10\) be the second crop of the permutation. It is inserted in the schedule as it is not from the same botanical family of the previous inserted crop. Then, we are in period \(C_5+C_{10}=2+1=3\).

Step 4:

The crop \(i = 8\) is the third crop inserted in this plot. We already have \(C_5+C_{10}+C_8=2+1+2=5\) periods occupied.

Step 5:

The next crop of the permutation is \(i = 7\). It cannot be inserted in the current period as the last crop planted belongs to the same family. The next crop analyzed is \(i = 1\), which is not from the same botanical family. Then, the period occupied is updated to \(C_5+C_{10}+C_8+C_1=2+1+2+2=7\).

Step 6:

We analyze crop \(i = 9\), and it is introduced into the schedule. The current planting period is \(C_5+C_{10}+C_8+C_1+C_9=2+1+2+2+3=10\).

Step 7:

Analysis of the next crop, \(i = 2\), tells us it cannot be inserted in the schedule, because the duration of its cycle is greater than the number of periods needed to complete the cycle, \(T = 12\). In sequence, we analyze crop \(i = 6\) which possesses precisely the 2 periods required to finish the schedule.

Step 8:

Using the prolongation operator, we have the vector \(\mathbf {I}_1\):

$$\begin{aligned} \mathbf {I}_1= \left[ \begin{array}{cccccccccccc} 5&\quad 5&\quad 10&\quad 8&\quad 8&\quad 1&\quad 1&\quad 9&\quad 9&\quad 9&\quad 6&\quad 6 \end{array} \right] . \end{aligned}$$

(Note that it corresponds to an unfeasible solution, since the first and the last crops of the schedule are both from family 3).

Step 9:

If the operator shift acts in this vector with \(a = 3\) positions, we have a new (unfeasible) schedule for this plot given by:

$$\begin{aligned} \mathbf {I}_1= \left[ \begin{array}{cccccccccccc} 9&\quad 6&\quad 6&\quad 5&\quad 5&\quad 10&\quad 8&\quad 8&\quad 1&\quad 1&\quad 9&\quad 9 \end{array} \right] . \end{aligned}$$

4.2.2 MAXPROFIT Heuristic

This is a constructive heuristic we have developed in this study. It is specialized in producing solutions with higher profitability. A parameter \(\alpha \) is introduced to control the “strength” and “pressure” that one wants to ascribe to the profit of a planting schedule.

Consider a set of crops that the agriculture decision-maker wants to insert in the schedule of plot k, with cardinality \(\alpha \ge 2\) (if \(\alpha < 2\), there is no a feasible solution regarding the continuity constraints). The more profitable crops from different botanical families must be chosen to compose this set and they are randomly permuted, thus producing the sequence of crops \(\mathcal {C}_k\). Then, the MINPEST heuristic is employed to generate an individual of the GA, i.e., we use the set \(\mathcal {C}_k\) instead of \(\mathcal {P}_k\) for each plot. A smaller value of \(\alpha \) produces a solution with a greater profit. In the same way, we apply the shift and prolongation operators to each plot. The idea supporting this heuristic is the production of more profitable solutions (which also implies an increase in pest infestations) and, consequently, the introduction of diversity in the population of the GA.

Example 3

(Construction of an individual with the MAXPROFIT heuristic) Suppose that \(\alpha =4\), and that the most profitable crops from Table 11, belonging to families 1 and 2, are used to build the planting schedule in \(K=5\) plots. We will only illustrate the construction of one plot. Let the sequence of these crops be \(\mathcal {C}_1=(3,1,2,4)\).

figure a
Step 1:

crop \(i=3\) is inserted in the schedule followed by crop \(i = 1\). The next crop, \(i = 2\), cannot be planted, because it is from the same botanical family as the previous one. Then, crop \(i = 4\) is inserted. The sum of periods of the three crops results in \(5 < 12 = T\) periods.

Step 2:

The insertion of crops continues, starting with the first element of \(\mathcal {C}_1\), which in turn cannot be inserted at this moment. The crops \(i = 1\) and \(i = 4\) are the next to be allocated to this schedule. The duration of the planting cycle at this stage is \(9 < 12 = T\).

Step 3:

Starting the scanning once more through the sequence, crops \(i = 1\) and \(i = 3\) are the last crops to be inserted. Following application of the prolongation operator, we have the vector \(\mathbf {I}_1\):

$$\begin{aligned} \mathbf {I}_1= \left[ \begin{array}{cccccccccccc} 3&\quad 1&\quad 1&\quad 4&\quad 4&\quad 1&\quad 1&\quad 4&\quad 4&\quad 1&\quad 1&\quad 3 \end{array} \right] , \end{aligned}$$

which is unfeasible, because crop 3 is planted in sequence (periods 1 and 12).

Step 4:

Using the shift operator with \(a = 4\), we have the following schedule of this plot:

$$\begin{aligned} \mathbf {I}_1= \left[ \begin{array}{cccccccccccc} 4&\quad 1&\quad 1&\quad 3&\quad 3&\quad 1&\quad 1&\quad 4&\quad 4&\quad 1&\quad 1&\quad 4 \end{array} \right] , \end{aligned}$$

which is unfeasible for the problem, due to the planting of crop 3.

For the construction of the remaining plots using set \(\mathcal {C}_1\), we consider other random permutations of the elements from \(\mathcal {C}_1\). Our concern is to always examine the feasibility of consecutive crops in the intermediate periods, that is, if they are from different families.

The pseudocode of this procedure is similar to the Algorithm 4.1, except for line 5, where the list of crops to be inserted is \(\mathcal {C}_k\).

4.3 Fitness evaluation

The fitness \(f(\mathbf {I})\) of each individual \(\mathbf {I}\) in the population is given by

$$\begin{aligned} f(\mathbf {I})=\uplambda \cdot \mathop {\upbeta }\nolimits _1 \cdot z_1 + (1-\uplambda ) \cdot \mathop {\upbeta }\nolimits _2 \cdot (-z_2) + 10^{3} \cdot v(\mathbf {I}), \end{aligned}$$
(22)

where \(\upbeta _1\) and \(\upbeta _2\) are calculated through the Expression (8), \(0.0< \uplambda < 1.0\).

The function \(v(\mathbf {I})\) provides the number of violations that corresponds to \(\mathbf {I}\), with regard to the constraints of consecutive planting. Violations occur because the MINPEST and MAXPROFIT heuristics can determine unfeasible solutions as to consecutive planting. The first and last crop of the schedule can be from the same botanical family. Also, following the action of shift operator, this inadmissibility can be passed on to intermediate periods. The number of plots in which this occurs is the number of violations \(v(\mathbf {I})\) to be considered in the fitness of the individual. Therefore, a feasible individual \(\mathbf {I}\) has a null violation, \(v(\mathbf {I})=0\), and an unfeasible one has \(v(\mathbf {I})>0\) and tends to be extinct from the population in subsequent generations.

4.4 Selection

The process of selecting the individuals in the population consists of choosing which elements will generate descendants for the next generation. Therefore, \(\lfloor \upgamma _1 \cdot n\rfloor \) individuals are separated to perform the crossover, another genetic operator of this method. The parameter \(\upgamma _1\), called the selection rate, is defined by the user. The selection consists of performing tournaments with two randomly chosen elements of the population, and the one with the lowest fitness is chosen (the best one). The procedure ends when the \(\lfloor \upgamma _1 \cdot n\rfloor \) individuals are chosen, thus building, the intermediate population.

4.5 Crossover

The main objective of this operator is to produce new solutions from the previously built ones, by exploring the feasible space of the problem and finding more promising solutions with respect to fitness value, i.e., solutions with lower fitness value. This operator randomly chooses \(\lfloor \upgamma _1 \cdot n\rfloor /2\) pairs of individuals from the intermediate population. Each pair \((\mathbf {F},\mathbf {M})\) is considered a pair of parents father and a mother, respectively, and their genes (the rows of the respective matrix) are swapped between the two parents thus resulting in two offspring (\(\mathbf {C}_1\) and \(\mathbf {C}_2\)), each bearing some genetic information from both \(\mathbf {F}\) and \(\mathbf {M}\). In the crossover, a random binary vector \(b = (b_k)\) of length K is considered. If \(b_k = 0\), then the child \(\mathbf {C}_1\) has its kth row copied from the father. If \(b_k = 1\), then the kth row comes from the mother. The child \(\mathbf {C}_2\) is built inversely, that is, if \(b_k = 0\) the information comes from the mother and, otherwise, it comes from the father. This strategy, besides incurring low computational cost, is able to generate different and promising solutions to the problem.

Figure 1 schematically illustrates this uniform operator.

Fig. 1
figure 1

Illustration of the uniform crossover

4.6 Mutation

The mutation acts on \(\lfloor \upgamma _2 \cdot n \rfloor \) (where \( \upgamma _2 \) is the mutation rate) randomly choosing individuals of the population. The mutation avoids the premature convergence of the algorithm. The following function represents the probability for this operator to act in a given generation g of the GA

$$\begin{aligned} \upphi (g)=\frac{1}{1+20e^{-g/40}}. \end{aligned}$$
(23)

When \( g \rightarrow G\) (where G is the maximum number of generations), \(\upphi (g) \rightarrow 1\). This means that mutation occurs more strongly in the final generations of the GA, when the individuals of the population tend to be similar. For each individual \(\mathbf {I}\) that mutates, we consider a probability of 50% to replace each row \(\mathbf {I}_k\) of \(\mathbf {I}\). If the replacement occurs, the MINPEST heuristic generates another schedule for \(\mathbf {I}_k\).

4.7 Migration

Similar to the mutation process, we proposed an operator called migration. The purpose is again to ensure diversity in the population, and, consequently, to determine the most promising solutions. A mechanism to replace the \(\lfloor \upgamma _3 \cdot n \rfloor \) elements of the population (where \(\upgamma _3\) is a given parameter) is used in three moments throughout the course of the GA, more precisely, in generations \(g=0.5 G\), \(g=0.7G\) and \(g=0.9G\). In these generations, 50% of the new individuals (immigrants) are generated by the MINPEST heuristic and the remaining 50% by the MAXPROFIT heuristic.

4.8 Update and Elitism

After the selection, crossover, mutation, and migration, the population increases by the factor of \(1+\upgamma _1\). All the individuals are evaluated, according to Expression (22), and the n best are taken to the next generation. Elitism consists in preserving the best element (called Elite and denoted by \(\mathcal {E}\)) before the operators act. \(\mathcal {E}\) is included in the population before updating at the end of any generation, and constantly updated at each generation. This means that the GA, at least, preserves the best solution achieved in previous generations. Should a better individual than current \(\mathcal {E}\) appear in the population, the elite individual is updated. The GA method ends its search after G generations.

It is important to note that this GA determines one solution for the problem according to the weight \(\uplambda \) given to the objective function \(z_1\) (and \(1-\uplambda \) given to \(z_2\)). This means that the method determines a potentially Pareto optimal solution of Problem (1)–(6). The approximation of all Pareto optimal solutions would be a computationally expensive task, since the problem can have a large number of such solutions. In the next section, we will demonstrate that each solution demands moderate to high CPU time.

Consequently, we propose to determine only three of these solutions that will give a reasonable representation of different solutions for the Problem (1)–(6). For two of the three solutions, we individually optimize the objectives, thus obtaining the so-called potentially lexicographic solutions (Miettinen 2012), when \(\uplambda =0.0\) and \(\uplambda =1.0\). The third one is obtained by using a combination of the same weight for the objectives (the so called mixed potentially Pareto optimal solution, when \(\uplambda =0.5\)). The three well-spaced values for \(\uplambda \) show the conflicting character of the objectives of our problem. A similar procedure was followed in de Oliveira Florentino et al. (2018) with an algorithm whose characteristics, in terms of computational requirements, are similar.

The GA’s pseudocode is presented in Algorithm 4.2.

figure b

In the following section, we present the computational results performed to validate the mathematical models with the corresponding exact methodologies, and the heuristic solution method proposed.

5 Computational experiments

The computational experiments were implemented in the MATLAB (2017) software version 2017 in an Intel Core i7 with 8GB of RAM. For comparison purposes, the exact methods were implemented with IBM ILOG CPLEX version 12.8 (CPLEX 2017). Table 9 in the Appendix presents some real characteristics of the 27 instances generated to validate the methodologies. Such instances were semi-randomly generated by combining different values of NK and T, many of which correspond to real situations. In the same table, we present the number of variables and constraints of the quadratic model (1)–(6) and for its linearized version (13)–(20). The performance of the CPLEX solvers running with these two formulations will be presented in the next tables. The GA parameters, presented in Table 10, were determined by preliminary computational experiments and they were the best cost-benefit choices.

The shape of the plots is illustrated in Fig. 6. Each plot is a square whose sides are 200 meters in length. Therefore, the distance between the plots 1 and 3 is 200 meters, and between plots 5 and 7 is zero, because they are adjacent. For instances with \(K < 12\) plots the K first plots were considered, as shown in Fig. 6. The remaining data of the crops, such as planting cycle duration (in periods), the botanical family to which the plant belongs and profitability (per \(\hbox {m}^2\)) are presented in Table 11.

5.1 Results

As mentioned above, in the experiments we used three distinct values for the weight in the function \(z_1\): \(\uplambda = 1.0\), \(\uplambda = 0.5\) and \(\uplambda = 0.0\). With each of the 27 instances considered, the GA was run 20 times.

Tables 1, 2, 3, 4, 5 and 6, indicate the values of the objective functions determined for the quadratic and linear cases for each \(\uplambda \) value, along with all the 27 instances generated \((z_{CPLEX})\). In addition, we provided the average, maximum and minimum values of the GA objective values in the 20 runs. The relative Gap (in %) between the objective values from CPLEX and GA, was measured by using the average values of the GA figures. Hence, this gap was calculated on the basis of the averages of the objective value obtained in GA runs \((\bar{z}_{GA})\) as follows:

$$\begin{aligned} \text {Gap}=\frac{\bar{z}_{GA}-z_{CPLEX}}{\bar{z}_{GA}} \times 100. \end{aligned}$$

We imposed a time limit of 3600 seconds for CPLEX to determine the optimal solution to Problem (7) (quadratic formulation) or Problem (21) (linearized formulation). The symbol (*) indicates that CPLEX did not determine any feasible integer solution in 3600 seconds.

Table 1 displays the \(z_1\) figures obtained with \(\uplambda =1.0\). CPLEX determined a solution for only seven instances with the linearized formulation. With the quadratic formulation, the CPLEX determined a solution for 24 instances and did not find any feasible integer solution for three large instances. Note that, with the non-convex quadratic function (1), the CPLEX solver can stop in a local optimum or in the global optimum.

Table 1 Computational results of the \(z_1\) objective function using \(\uplambda =1.0\)
Table 2 Computational results of the \(z_1\) objective function using \(\uplambda =0.5\)

The GA solved all the 27 instances. For most instances, this method produced a better solution with respect to the objective \(z_1\) as compared with CPLEX. For example, with instance 10, the GA provided a solution with a 49% and 61% lower proliferation on average, with regard to those determined by CPLEX with the linearized and the quadratic formulations, respectively.

As for the difference, for instance, among the instances and the \(z_1\) value determined, we should highlight certain observations. In the column that presents the results from the GA, note that the differences among instances 1, 10 and 19 consist in the number of crops N, respectively, equal to 5, 10 and 20. As for the \(z_1\) values observed, they decrease as N decreases. This is accountable as there is a greater variability among the crops which leads to more freedom in combining them in the same planting area. On the other hand, if the instances 1, 2 and 3 are compared, we see that they differ only in the number of periods, and the \(z_1\) values increase with/by T. The same happens when K is modified. Note that in instance 9, with \(N = 5\) crops, \(K = 12\) plots, \(T = 24\) periods, the highest \(z_1\) value was attained.

Table 2 illustrates the \(z_1\) values with \(\uplambda =0.5\). This means that the (normalized) total profit carries 50% weight in the approximation of the Pareto optimal solutions. The linear and quadratic formulations were solved once more for seven and 24 instances, respectively.

In inserting an objective that is diametrically opposed to minimization of \(z_1\), the values of this function increase. For example, in the case of instance 26, this increase was 38% with respect to the quadratic formulation (from 8.930 to 12.331). GA performed well considering the average of its objective values as compared to those determined from the linear formulation. For 15 out of the 23 instances solved by CPLEX with the quadratic formulation, the gap was also negative. This means that the GA can generate more promising solutions for this problem with larger dimension instances, where the exact methods fail.

Table 3 indicates the \(z_1\) values with \(\uplambda =0.0\), that is, only the total profit was optimized. In this way, pest dissemination in those solutions is expected to be greater than in the previous cases. When considering the solutions of the instance 10 produced by GA, the possibility of pest dissemination leaps from 0.04 (\(\uplambda =1.0\)) to 0.22 (\(\uplambda = 0.5\)) and, ultimately to 9.96 (\(\uplambda = 0.0\)), on average. The same instances that were not solved with \(\uplambda = 1.0\) and \(\uplambda = 0.5\) by the CPLEX, when considering the linear formulation, are not solved with the aim of optimizing \(z_2\), \(\uplambda =0.0\). The main justification factor, characteristic of these instances, is the dimension. As for the quadratic formulation, the CPLEX solved 25 instances within the processing time limit. However, the GA did not achieve the same performance in this objective when compared to the results obtained with a greater \(\uplambda \). Our metaheuristic was worse in two of the seven instances solved by CPLEX using the linear formulation, and in 13 out of 25 solved with the quadratic formulation.

Table 3 Computational results of the objective function \(z_1\) using \(\uplambda =0.0\)

Table 4 presents the \(z_2 \times 10^{-3}\) values obtained with \(\uplambda =1.0\). Here the profit was not optimized. The GA was worse for all instances solved by CPLEX, using the linear formulation, and overcame CPLEX only once, for an instance with a quadratic formulation. As can be observed, on average, the \(z_2\) value obtained by the CPLEX is higher than the one determined by GA.

Table 4 Computational results of the objective function \(z_2 \times 10^{-3}\) using \(\uplambda =1.0\)

Table 5 shows the \(z_2 \times 10^{-3}\) values for \(\uplambda = 0.5\). As expected, a relative increase is noted in this function, compared with the results shown in Table 4. The GA was able to solve all the instances but its solutions continue to be worse with regard to \(z_2\) , when compared with the solutions from CPLEX.

Table 5 Computational results of the objective function \(z_2 \times 10^{-3} \) using \(\uplambda =0.5\)

Table 6 displays the results for the \(z_2 \times 10^{-3}\) values with \(\uplambda =0.0\). CPLEX solved seven and 25 instances using the quadratic and linear formulations respectively. Regardless of the objective optimized, the dimension of the problem has a preponderant role. We also note that GA, in the 20 runs, generated solutions with little or no significant difference as to the minimum and maximum values. The potentially Pareto optimal solutions determined to assure maximum profit use an alternation of only a few more profitable crops, thus resulting in high pest infestation. From a heuristic perspective, the composition of a solution involving a high profit becomes difficult to obtain and computationally expensive. In that case it is more convenient to determine these solutions by employing the exact methods, in cases when they are successful.

Table 6 Computational results of the objective function \(z_2\times 10^{-3}\) using \(\uplambda =0.0\)

Table 7 shows the CPU computational times (in seconds) of each method, for all instances and for the three \(\uplambda \) values. CPLEX, when applied with the linear formulation and the objective of minimizing pest dissemination, took one hour for five of the seven instances. In addition, these are feasible or local optima solutions, because CPLEX finalized the search before attaining a stopping criterium different from the time limit. When applied to the quadratic formulation, we have a shorter CPU time, despite having determined only local optimum solutions (known by comparing quadratic CPLEX with GA z values in Table 4). Instances 23 and 26 were the only ones for which CPLEX used the maximum allowed processing time. When \(\uplambda \) is equal to 0.5, the CPLEX with the quadratic formulation found local optimal solutions in a relatively low computational time, when compared to the linearized version (with two global optima corresponding to instances 1 and 4). In the case of \(\uplambda = 0.0\), this difference in processing time between the linear and the quadratic version is more noticeable. Therefore, the computational difficulty in determining Pareto optimal solutions by the exact methodologies for this problem depends strongly on the weight of the objective functions involved.

Table 7 Computational CPU time (in seconds) of the CPLEX and GA (on average) to solve each instance
Table 8 Number of different crops used in the planting schedule (on average)

As for the GA algorithm, we did not note any interference in its performance with \(\uplambda \)—only when N, K or T change. The CPU time is practically proportional to these three parameters. A comparative analysis between CPLEX and GA, shows that the exact methods surpass the metaheuristic with the \(z_2\) objective but lose their efficacy when \(\uplambda \) increases. For large instances, GA found potentially Pareto optimal solutions about ten times faster, on average, than CPLEX.

Table 8 shows the average total number of different crops used in the planting schedule throughout the planning horizon. For \(\uplambda = 0.0\), the exact methods determined solutions with the two most profitable crops, alternating them with each other (see figures in this table). When increasing \(\uplambda \), there is a greater opportunity to insert different crops to minimize pest infestation, thus enriching the variability of species in the schedule. The GA algorithm determines a planting calendar with a greater crop variability. This justifies why this method determines solutions with less likelihood of pest proliferation compared with CPLEX. For example, the instance 26 has \(N = 20\) available crops, from which only six were used in the solution provided by CPLEX while the GA adopted 17.40 different crops, on average, for solutions to this instance.

In the next subsection, we analyze the characteristics of the typical solutions from CPLEX and GA in function of \(\uplambda \), taking an example from instance 23.

5.2 Analysis of different potentially Pareto optimal solutions

In Figs. 2, 3 and 4 we present potentially Pareto optimal solutions for this problem obtained from CPLEX and GA. The crop data are from instance 23. The numbers presented in these mosaics are the indexes of the crops scheduled to be planted (see Table 11). For example, in Fig. 2a, crop 2 whose duration \(C_2\) is 3 periods, is planted in plot 1 in periods 2 to 4.

Figure 2 presents two typical solutions when trying to minimize \(z_1\). Figure 2a shows a typical solution from GA, whose associated objective vector is \((z_1,z_2)=(1.1674,2476.40)^T\). Fig. 2b is the solution from CPLEX (obtained from the quadratic formulation with 4000 seconds of running time) whose objective vector is \((z_1,z_2)=(1.845,3654.50)^T\). The solution from GA compared with the solution from CPLEX is 58% better with respect to \(z_1\) but 47% worse with respect to \(z_2\).

These solutions have a different structure. Note that the GA solution used only one crop of 1 period ’s duration (crop 20), while the CPLEX solution used crops 3, 10, 16, 20, all involving 7 periods of duration. Crop 14 is the one that has the longest duration period among the 20 crops and was planted six times and once, respectively, in the solutions from GA and CPLEX. To minimize \(z_1\), GA uses the strategy of planting longer duration crops, while CPLEX uses shorter period ones.

Fig. 2
figure 2

Solutions obtained by CPLEX and by GA using \(\uplambda =1.0\)

Fig. 3
figure 3

Solutions obtained by CPLEX and GA using \(\uplambda =0.0\)

Fig. 4
figure 4

Solutions obtained by CPLEX and GA using \(\uplambda =0.5\)

Figure 3 presents the solutions using \(\uplambda = 0.0\). Figure 3a shows a typical solution provided by the GA, with an objective vector of \((z_1,z_2)=(34.44,5340.40)^T\). Figure 3b shows a solution obtained by CPLEX, whose objective vector is \((z_1,z_2)=(36.12,7631.50)^T\). An analysis and comparison of these solutions led us to observe that the CPLEX solution is 43% better with respect to objective \(z_2\), but is 4.8% worse with respect to objective \(z_1\). It is interesting to note that both solutions use two crops alternately in all plots. However, the GA opts to choose crops 1 and 3 (which provide greater profit and are not from the same family) while the CPLEX determined a solution that uses crops 3 and 10. Although crop 1 ensures a greater profit, its duration is longer when compared to crop 10. CPLEX seeks to plant crops that have a higher per period profit, while GA tries to use the crops with a higher total profit. In considering the CPLEX solutions obtained from two different cultivating strategies (with \(\uplambda = 1.0\) and \(\uplambda = 0.0\)) we notice that the profit doubles, but pest proliferation increases almost 20 times.

Figure 4 shows typical solutions with \(\uplambda = 0.5\). Figures 4a, b are associated with the solutions of GA and CPLEX in which the associated objective vectors are \((z_1,z_2)=(2.534,4102.00)^T\) and \((z_1,z_2)=(7.722,5825.00)^T\), respectively. GA produces a solution which is 67% with respect to objective \(z_1\), but 42% worse compared with objective \(z_2\). A remarkable characteristic of the CPLEX solution is the fact that it only uses alternatively crops 3, 10 and 16. GA mixes 16 different types of crops, and this explains the smaller value of the solution from GA for pest proliferation.

In CPLEX solutions, the number of equal crops per period is considerable. For example, in period 1, crop 10 is cultivated in five different plots, while crops 3 and 16 are cultivated in two plots each. Something different occurs with the GA solution. Notably, we have a greater mixture of different crops. Moreover, in analyzing the CPLEX solutions with \(\uplambda =1.0\) and \(\uplambda =0.5\), we note that, in the second case the profit is 60% greater than with \(\uplambda =1.0\). However, the value for pest proliferation quadruples.

Fig. 5
figure 5

Other non-dominated points determined for instance 23

It is important to highlight that it would be possible to obtain other Pareto optimal solutions, for instance, using multi-objective exact methods of scalarization. A simple way consists in assigning other weights to the objective functions. An alternative method consists in using the \(\upvarepsilon \)-constraint method.

To apply the \(\upvarepsilon \)-constraint method, firstly, we find the lexicographic solutions, thus determining the range of \(z_2\) function in the objective space. Let \(z_2 \in [z_2^-,z_2^+]\), where \(z_2^-\) and \(z_2^+\) are, respectively, the minimum and maximum profit determined. The next step is to solve the problem of minimizing the possibility of pest proliferation, by imposing a minimal profit of \(\upvarepsilon \) in the range \([z_2^-,z_2^+]\). When we solve this problem, a Pareto optimal solution can be determined along with the respective non-dominated point (in the objective space). The \(\upvarepsilon \) values can be attributions of the agricultural manager and decision-maker. Figure 5 shows some additional aspects of the Pareto frontier of this problem, by considering instance 23 and performing five and 10 assignments for \(\upvarepsilon \). In these experiments, the \(\upvarepsilon \) values were equally distributed in the range \([z_2^-,z_2^+]=[3654.00,7631.50]\), while bearing in mind the quadratic formulation solved by CPLEX. We emphasize in black the non-dominated points already determined with \(\uplambda =0.0\), 0.5 and 1.0. The number of Pareto optimal solutions provided depends on the data of the instance and the CPU time made available by the manager.

These experiments clearly demonstrate that the methodologies, both the heuristic and the exact ones, employed to solve the problem of sustainable cultivation are powerful tools to assist decision-making addressing complex problems such as this one.

6 Conclusions

This work proposes a bi-objective quadratic mathematical programming model to schedule crop cultivation, using a crop rotation approach and constraints to sustainable cultivation. Methods to solve the problem and assist agricultural managers in their decision-making are presented and implemented. The mathematical models with binary variables (the quadratic and linearized version) determine a calendar of crop cultivation in each plot and within a pre-specified planning horizon. Two conflicting objectives are taken into account: (1) minimize the possibility of pest proliferation among the crops and (2) maximize the total profit of the planting schedule. Therefore, these objectives can be balanced, to produce different Pareto optimal solutions that establish a compromise between these two objectives. A multi-objective analysis enables several scenarios of cultivation schedules and safer decision-making.

The paper presents exact techniques of mathematical programming (based on the quadratic model and on the linearized version) to determine Pareto (local) optimal solutions for this problem. We illustrate, through computational tests with semi-random instances of real dimensions, that the exact methods have a limited application. Out of the 27 instances considered, CPLEX solved only seven instances when considering the linearized version, and 24 with the quadratic one, by imposing a maximum limit on the computation time (one hour at the most, as proposed by the agricultural managers). In addition, since the quadratic objective function is not convex, CPLEX may end its search in a local optimum, thus providing a Pareto sub-optimal solution.

To overcome these unfavorable aspects and solve large dimension instances, this work proposes a genetic algorithm. Two constructive heuristics, together with this algorithm are also presented. The experiments showed that this metaheuristic can determine good quality potentially Pareto optimal solutions at a low computational time. Tests with the largest dimension instances (where the exact methods failed to find even a feasible solution) were solved by this algorithm in a computational time below 2 minutes per instance. The results showed that GA determined solutions with a value as to possible pest dissemination 74% lower than the value provided by CPLEX, besides using, on average, one-fifth of the computational time.

In short, the bi-objective methodologies presented aim to establish strategies for different sustainable vegetable production. They preserve a compromise with the economic factor and are therefore applicable in real situations. Consequently, the methodologies can be used to assist the decision-makers in choosing different alternatives of agricultural production for their farms. Finally, this study contributes to the advance of knowledge in the field of Operational Research in a sustainable environment.