1 Introduction

Natural gas is an essential energy source for the future. Its multiple benefits include low greenhouse gas emissions and low costs, which makes it competitive in most sectors among other energy sources.Footnote 1 Global projections of natural gas reserve levels are also an indication of the relevant role that natural gas will play in supporting growth in markets in the future (El Kafazi and Bannari, 2019; Su et al., 2019).

Global trends in natural gas indicate that consumption, production, reserves, and dependencies will continue to increase steadily for the foreseeable future (Jiao et al., 2021; Hari et al., 2021; Vedik et al., 2021). These rising expectations may involve the need for optimization and decision-making tools suitable to handle larger and more complex projects in national and international fields.

There are several researches in the gas distribution industrial field. Traditionally, studies on the design of pipelines for a gas network are focused on optimizing the installation cost of the network for distribution systems. In this context, there are several works in the literature addressing the problem via linear programming (Hansen et al., 1991), dynamic programming (Rothfarb et al., 1970), genetic algorithms (Simpson et al., 1994; Demissie et al., 2017; El-Mahdy et al., 2010), ant colony optimization algorithm (Zecchin et al., 2006; Mohajeri et al., 2012b). However, these studies are limited because they only seek to minimize the network cost, thus disregarding relevant design aspects, such as the minimum gas pressure in the network, reliability, and robustness.

In this way, this paper proposes a Variable Neighborhood Search (VNS)-based multiobjective optimization tool dedicated to the design of gas network sizing. This optimization approach estimates solutions while minimizing the network installation cost and maximizing the minimum allowable gas pressure. In general, the original contributions of this work include:

  • A pipe sizing multiobjective optimization problem modeling is performed, which aims to minimize the cost and maximize the minimum pressure;

  • A multiobjective VNS algorithm dedicated to the problem, including an intelligent constructive heuristic;

  • A random instance generator for the problem, based on the database of the TSPLIB library;

  • A strategy to support multicriteria decision-making, involving the following criteria: installation cost, minimum pressure, feasibility rate, average cost of failure, and sensitivity.

This multicriteria investigation considers uncertainty scenarios in the evolution of demand over a given time horizon, which indeed aids in the process of defining an appropriate final solution for the problem. Results have shown very promising concerning previous studies of the literature.

This paper is organized as follows. Section 2 stands for a brief review of the literature. Section 3 presents theoretical concepts about the problem of natural gas network design and states the proposed multiobjective optimization problem. Section 4 discusses the proposed integrated tool. Section 5 describes the design and analysis of the experiments adopted in this paper. The results are presented in Sect. 6, and the final observations are provided in Sect. 7.

2 Literature Review

The problem of natural gas network design was first addressed in the literature in 1970 (Rothfarb et al., 1970). Since then, many other works have been published in this field. We conducted an investigative study concerning a large sample of such works so as to contrast some features of our work against the ones from the literature. These characteristics include network topology (radial or meshed), instance size (small instances usually have less than 50 nodes), nonlinear-based model, multiobjective optimization, and multicriteria analysis. Table 1 presents a brief view of this investigation and shows that our proposal explores a great set of promising features for gas network design.

According to the previous review, several limitations could be identified in the literature: many works address only radial networks (Rothfarb et al., 1970; Simpson et al., 1994; Boyd et al., 1994; Surry et al., 1995; de Wolf and Smeers, 1996; Demissie et al., 2017; Zecchin et al., 2006; Mohajeri et al., 2012b, a; Torkinejad et al., 2019; Su et al., 2019); some approaches are limited to small networks, less than 50 nodes (Rothfarb et al., 1970; Mohajeri et al., 2012b, a; Torkinejad et al., 2019; Demissie et al., 2017); others works consider only linear or quadratic approximations of the problem (Hansen et al., 1991; de Wolf and Smeers, 1996); many proposals only involve minimizing the network installation cost (Rothfarb et al., 1970; Hansen et al., 1991; Simpson et al., 1994; Boyd et al., 1994; de Wolf and Smeers, 1996; Duarte et al., 2006; Zecchin et al., 2006; El-Mahdy et al., 2010; Mohajeri et al., 2012b, a; Torkinejad et al., 2019).

Few papers in the literature perform the multi-objective optimization of natural gas pipelines. In Demissie et al. (2017), the objective is to minimize the power consumption of the compressor stations and meanwhile maximize the amount of gas delivered to customers. And in Su et al. (2019), it is proposed to minimize power demand and risk of gas supply shortage. However, none of the articles takes into account our objectives and multicriteria analysis under conditions of uncertainty.

In this context, this paper presents the following highlights in comparison with the studies in the literature:

  • Consider the analysis of meshed networks, which are assumed more complex to address;

  • Deal with test instances which have more than 50 nodes;

  • Deal with nonlinear equations;

  • A multiobjective optimization is performed, which aims to minimize the cost and maximize the minimum pressure;

  • A multicriteria analysis is carried out to assess the behavior of the network 10 years ahead.

Table 1 Comparison of the main features of the proposed methodology against previous works from the literature

3 Design of Natural Gas Networks

The design of a natural gas network consists mainly of two stages: hydraulic and mechanical (Ottoni and Batista, 2020). The hydraulic part involves issues such as the demand to be met, network topology, maximum and minimum pressure criteria for the proper operation of the network, consumer flow, and gas characteristics. On the other hand, the mechanical phase of the project involves other problems, such as the specification of the material and the thickness of the pipe wall that is adequate to withstand internal and external pressure (Ottoni and Batista, 2020; Arya, 2022).

The problem addressed in this work will involve the design of a natural gas network in the hydraulic construction stage, assuming that the mechanical phase has already been completed. For this, it is necessary to define the network topology and its demand points (Arya, 2022; Hari et al., 2021).

The next step in designing a natural gas network is the choice of pipelines that will make up the network. For this choice, there are relevant points that must be considered. In general, the larger the diameter of a pipe, the greater the amount of gas that can be transported through it; in this way, the network can safely meet the existing demands (Ottoni and Batista, 2020; Arya, 2022). However, the diameters of existing pipelines on the market are fixed, and their cost is proportional to their size. Therefore, more robust networks generally tend to be more expensive.

In this way, a combinatorial optimization problem is obtained, in which we aim to minimize the installation cost and maximize the minimum pressure on the nodes of a network. The constraints of this problem require that the pipe diameters are sufficient to meet the necessary demand so that, in each internal node of the network, a minimum pressure is available.

3.1 Proposed Optimization Problem

In this paper, it is desired to define a set of pipes that will compose the network in order to both minimize the installation cost and maximize the minimum gas pressure in the network. The types of pipelines considered in this project were selected according to those available on the market, and their cost is proportional to their diameter and length.

The optimization problem considered in this work can be defined as:

$$\begin{aligned}&\min _x F_1(x) \triangleq \quad \sum _{k \in K} {\sum _{i \in I} {L_i C_k x_{ik}}} \end{aligned}$$
(1a)
$$\begin{aligned}&\max _x F_2(x) \triangleq \quad \min _{j \in \{1,...,n\}} P_j(x) \end{aligned}$$
(1b)

subject to:

$$\begin{aligned}&P_j (x) \ge P_\text {min}, \quad j=1,...,n. \end{aligned}$$
(2)
$$\begin{aligned}&x_{ik} \in \{0,1\}, \quad i=1,...,m. \quad k=1,...,|K|. \end{aligned}$$
(3)

where K is the set of types of diameters available on the market; I is the set of m predefined pipelines in the network; \(L_i\) is the length of the i-th pipeline of the network; \(C_k\) is the cost per unit length of the k-th type of diameter; \(P_j\) is the pressure obtained at the j-th node in the network; \(P_\text {min}\) is the minimum pressure required on the j-th node; \(x_{ik}\) is a binary variable, which is 1 if the k-th diameter is assigned to pipe i and 0 otherwise.

Equations (1a) and (1b) refer to the multiobjective problem of minimizing cost and maximizing minimum pressure, respectively. Equation (2) expresses the minimum pressure constraints of the problem, which can be verified by solving nonlinear gas flow equations in the network, as shown in Ramos and Batista (2020).

The unrestricted multiobjective optimization problem is represented in (4aand 4b), where the original problem constraints are dealt with via a penalty function:

$$\begin{aligned}&\min _x F_1(x) \triangleq \sum _{k \in K} {\sum _{i \in I} {L_i C_k x_{ik}}} + \eta (x) C_\text {res} \sum _{i \in I}{L_i} \end{aligned}$$
(4a)
$$\begin{aligned}&\min _x F_2(x) \triangleq - \min _{j \in \{1,...,n\}} P_j(x) + \eta (x)C_\text {res} \sum _{i \in I}{L_i} \end{aligned}$$
(4b)

where \(\eta (x)\) is the number of nodes in the solution x that violates the minimum acceptable pressure; \(C_\text {res}= C_\text {max}-C_\text {min}\), in which \(C_ {max}\) and \(C_\text {min}\) are the cost, per unit of length, of the largest and smallest diameter available, respectively.

4 Proposed Integrated Tool

In this section, we present the integrated tool proposed in this paper. The methodology follows the flowchart shown in Fig. 1. Initially, a data set is generated by an instance generator developed in this work. Of course, the data set of an instance can also come from a real-world problem. The proposed optimizer then approximates a set of non-dominated solutions, i.e., a Pareto-front. Finally, multicriteria decision-making is carried out through the estimated Pareto-front.

Fig. 1
figure 1

Flowchart of the proposed methodology

In the following, we present the algorithms proposed in Duarte et al. (2015) and then the algorithms suggested as an innovation of this paper.

4.1 Multiobjective VNS (MOVNS)

This research proposes a multiobjective version of the Variable Neighborhood Search (VNS) algorithm presented in Mladenović and Hansen (1997). This method is based on the systematic change of neighborhood while investigating solutions in the search space. In this work, we explore a recent MOVNS algorithm proposed specifically for multiobjective combinatorial problems Duarte et al. (2015). The algorithm has some variations, such as the reduced VNS (RVNS), general VNS (GVNS), and variable neighborhood descent (VND). These algorithms are presented next.

4.1.1 Multiobjective Reduced VNS (MORVNS)

The reduced VNS (RVNS) is one of the versions of the multiobjective VNS presented in Duarte et al. (2015). Its procedure is based on two actions: (i) disturbing the solution to leave possible local optima and (ii) changing the neighborhood until a stop criterion is reached. Algorithm 1 shows these steps in the multiobjective version of the RVNS proposed in Duarte et al. (2015). It is important to note that in the mono-objective algorithm, a solution is characterized by a single vector, while in the multiobjective version there is a set of efficient points (non-dominated solutions), represented in Algorithm 1 by the set E. The other parameters are as follows: number of neighborhood structures (\( k_\text {max} \)) and the runtime limit of the algorithm (\( t_\text {max} \)).

figure a

4.1.2 Multiobjective General VNS (MOGVNS)

The MOGVNS algorithm can be divided into three consecutive steps: (i) perturbation of the solution to leave possible local optima, (ii) VND for solution refinement, and (iii) change of neighborhood until a stopping criterion is reached. Algorithm 2 shows these steps in the multiobjective version of GVNS proposed in Duarte et al. (2015).

As represented in Algorithm 2, the input parameters are as follows: the set of solutions E, number of neighborhood structures (\( k_\text {max} \)), number of objective functions (r), number of neighborhoods for local search (\( k '_\text {max} \)), and the runtime limit for the algorithm (\( t_\text {max} \)).

figure b

4.1.3 Local Search: Multiobjective VND (MOVND)

In the local search approach, called MOVND (Algorithm 3), the neighborhoods are explored for each objective i separately, through a specific execution of VND-i (Algorithm 4). The input parameters are a set of solutions (E), the number of neighborhoods for the local search (\( k '_\text {max} \)), and the total number of objectives (r).

figure c

An iteration begins with the random selection of an incumbent solution \( x'\) from the set of unexplored points of E (line 6). The set S represents the points used, grouped by objective function (\( S_1, S_2, ..., S_r \)) and updated after each execution of VND-i (line 8). The improvements in the set of points are analyzed by the function MO-ObjectiveChange, which is presented next. In this way, the method is executed until all objectives are optimized.

Algorithm 4 shows how the local search VND-i is performed concerning the objective i. Note that, as in the original VND method, the transition between neighborhoods is carried out in a deterministic manner, where each neighborhood structure is indicated by k (with k \( \in \) \( \{1, ..., k '_\text {max}\} \)).

A new point \(x'\) is defined as the best solution in the neighborhood k of the vector x in relation to the objective i (line 5). In case of improvement, the x solution is replaced by \( x '\) and a new neighborhood inspection is made from \( k = 1 \). The main difference between Algorithm 4 and the original VND is that a set of non-dominated points E is updated through searches and, at the end of the procedure, non-dominated solutions are returned (line 6). This concept of dominance is defined as: given two solutions \( x_1 \) and \( x_2 \), \( x_1 \) is said to dominate \( x_2 \) if, and only if, \( F_i (x_1) \le F_i (x_2) \) for all \( i = 1, ..., r \), and there is at least one objective i where \( F_i (x_1) < F_i (x_2) \) (Miettinen, 2012). In this definition, \( F_i (\cdot ) \) represents a given optimization objective and r the total number of objectives considered.

figure d

4.1.4 MO-NeighborhoodChange

Algorithm 5 explains how the MO-NeighborhoodChange and MO-ObjectiveChange procedures work. As with the mono-objective version, a neighborhood k is changed to \( k + 1 \) when there is no further improvement in the current search. Improvements are seen in the MO-Improvement function. If MO-Improvement returns true, that is, there has been an improvement, then the neighborhood structure is reinitialized (line 3) and the solutions are updated. The Update function (line 4) is responsible for this update, which returns the non-dominated solutions of the union of E and \( E '\). If MO-Improvement returns false, it moves on to the next neighborhood structure.

figure e

The MO-Improvement function (Algorithm 6) is responsible for verifying whether there has been an improvement between the solution sets E and \( E '\). An improvement is assumed to be true if at least one solution \( x '\ in E' \) is not dominated by any of the points in E. This leads to the update of the approximated Pareto front.

figure f

4.1.5 MO-Shake

The multiobjective version of the perturbation procedure, MO-Shake, is presented in Algorithm 7. Its main objective is to perturb the set of points to promote searches in new regions. The Shake function is responsible for modifying each solution in the E set, according to the neighborhood k, to create a new set of points \( E '\).

figure g

4.2 Proposed Approaches

This section presents the main approaches proposed in this article. Section 4.2.1 presents a constructive heuristic to generate an initial solution. In 4.2.2, the specific neighborhood structures for the problem addressed are defined. In 4.2.3, the implemented instance generator is presented. Section 4.2.4 suggests a robust proposal for MO-NeighborhoodChange and MO-Improvement. Finally, in 4.2.5, the multicriteria methodology used in this work is discussed.

4.2.1 Initial Solution

The initialization of a feasible solution for the MORVNS and MOGVNS algorithms was obtained from a constructive heuristic. The construction of this initial solution follows the step by step in the next:

  1. a.

    A network is started concerning all pipes with the smallest diameter available, thus having the cheapest possible solution. The algorithm finishes only when a feasible solution is found or a stagnation criterion is reached.

  2. b.

    It is detected which nodes are violating the minimum pressure constraint and a percentage of 20% of them is selected.

  3. c.

    New solutions are generated by increasing in one unit the diameter of the pipes connected to the nodes selected in step (b). These perturbations aim to build solutions whose pressure violations in the nodes become smaller.

  4. d.

    The solution with the lowest objective function value \( F_2 \) is verified, and if it is feasible and better than the incumbent solution, the algorithm is closed; otherwise, the process returns to step (b) and the iterative construction continues.

The best solution obtained is returned at the end of the algorithm.

4.2.2 Neighborhood Structures

The neighborhoods proposed in this work are specific to the characteristics of the problem. Due to the sensitivity of the candidate solutions, small changes in their structure can alter their cost on a large scale. Therefore, neighborhood structures with small perturbations were chosen. The three proposed neighborhood structures are described in the following:

  1. 1.

    The first neighborhood structure decreases or increases the diameter of a random pipe to the next available value, considering the same probability of occurrence. If the selected pipe is associated with the largest possible diameter, it is decreased to the next available diameter. On the other hand, if the selected pipe is associated with the smallest possible diameter, it is increased to the next viable value.

  2. 2.

    The second structure changes the diameters of two neighboring random pipes, i.e., connected to the same network node.

  3. 3.

    The third structure changes the diameters of any two random pipes in the network.

4.2.3 Instance Generator

One of the contributions of this work is the proposal of an instance generator based on the TSPLIB data set (Reinelt, 1991). This instance generator is based on the following steps:

  1. 1.

    An instance is selected from the TSPLIB library. This instance is represented by a set of vectors on the Cartesian plane.

  2. 2.

    The minimum spanning tree of the selected instance is obtained using the Kruskal algorithm (Kruskal, 1956).

  3. 3.

    Leaf nodes, i.e., connected to a single pipe, are identified and then connected to two or three other arbitrary neighboring nodes, avoiding the intersection between pipes. This operation aims to define a meshed network.

  4. 4.

    Arbitrarily define between 2 and 5% of network nodes as source nodes.

  5. 5.

    The incidence matrix of the instance is elaborated.

  6. 6.

    The network parameters are randomly generated. The demand for each internal node is drawn between 10.000 and 15.000 \( (m^3/h) \). However, there is a 5% probability that each node is only of transshipment. The other network data, such as the minimum pressure required, the pressure at the source nodes, the diameters available, and the related costs, are the same as those adopted in El-Mahdy et al. (2010), Ottoni and Batista (2020).

4.2.4 MO-NeighborhoodChange\(^\star \) & MO-Improvement\(^\star \)

The MO-NeighborhoodChange (Algorithm 5) and MO-Improvement (Algorithm 6) methods, proposed in Duarte et al. (2015), assume that if a newly generated solution is not dominated by any of the solutions in the archive, then it can contribute to its quality. However, due to the deterioration problem frequently observed in processes of updating / truncating the set of non-dominated solutions (Deb, 2001), all non-dominated solutions will be accepted, even if they do not contribute to the improvement of the archive. This limitation can lead these methods to a loop, harming the convergence of the approach.

In this context, this work proposes a different strategy to verify the possibility of improving the archive. In general, the MO-Improvement algorithm will suggest the possibility of improving the current archive, only if a new solution generated meets two basic criteria: i) be non-dominated in relation to the other archive solutions and ii) contribute to the diversity of this archive. If the archive is already full (i.e., \(|E| \ge N\)), the verification will be carried out considering the inclusion of the new solution in the archive and removal of the vector of smaller crowding distance (Deb et al., 2002). This new method is presented in Algorithms 8 and 9.

figure h
figure i

In addition, the archive and the neighborhood structure will be updated in the MO-NeighborhoodChange method according to the hypervolume quality indicator (HV) (Zitzler and Thiele, 1999). Therefore, if a newly generated solution improves the archive HV value, then this updated archive is returned and the neighborhood structure \( k = 1 \) is maintained; otherwise, the original archive is kept and the next neighborhood structure is considered, \(k = k + 1\). The same also applies to the MO-ObjectiveChange method. The pseudo-code of this new strategy is presented in Algorithms 10 and 11.

figure j
figure k

These proposed approaches are named MO-NeighborhoodChange\(^\star \) and MO-Improvement\(^\star \), which are used in this work to replace the original heuristics.

4.2.5 Multicriteria Analysis

The multicriteria analysis was performed considering scenarios of uncertainty of the evolution of the load to assist in the final decision-making. The growth of demand at internal network nodes was modeled in order to determine a distribution network configuration that will respond to an increase in the nominal demand of each node over a given time horizon. Since the real growth is unknown, a Gaussian probability distribution function was used to model the behavior of the network.

The proposed procedure is described next:

  • For a specific period of time, a set of \( N_s \) demand scenarios was generated using a Gaussian distribution. For this work, \( N_s = 2000 \) scenarios were used and a time horizon of 10 years. In general, it was assumed a mean of 0.025 and sample standard deviation of 0.012 for the first year, which results in a mean of 0.28 and standard deviation of 0.127 in the tenth year;

  • All non-dominated solutions found (network configurations) were evaluated in all scenarios, considering four pre-established merit functions: minimum pressure; feasibility rate; average cost of failure; sensitivity;

  • A dominance analysis was carried out in order to extract a final subset of non-dominated solutions from the initial set of candidate solutions.

The pre-established merit functions are modeled as follows:

Installation Cost

This function deals with the costs that mainly involve civil works, pressure control equipment, flow meters, and tubes acquisition (Goldbarg et al., 2006). In this work, the installation cost is given by (1a):

$$\begin{aligned} F_1(x)= \sum _{k \in K} {\sum _{i \in I} {L_i C_k x_{ik}}} \end{aligned}$$
(5)

Minimum Pressure

The minimum pressure ensures that the demands of the network will be met, and it is defined as (1b):

$$\begin{aligned} F_2(x) = \min _{j \in \{1,...,n\}} P_j(x) \end{aligned}$$
(6)

Feasibility Rate

The feasibility rate is defined as the relationship between the number of scenarios in which the network configuration does not violate the minimum pressure limit (\( N_f \)) and the total number of scenarios (\( N_s \)):

$$\begin{aligned} F_3 (x) = \frac{N_f(x)}{N_s} \end{aligned}$$
(7)

Average Cost of Failure

The \( F_4 \) function represents the expected average cost of failure in the network, which is evaluated considering the scenarios in which the network is viable:

$$\begin{aligned} F_4 (x) = \frac{1}{N_f(x)} \sum _{j=1}^{N_f(x)} y_j(x) \end{aligned}$$
(8)

where \( y_j \) is the failure cost of scenario j:

$$\begin{aligned} y_j(x) = \mu \sum _{i=1}^{m} \lambda _i t_i q_{i,j}(x) \end{aligned}$$
(9)

where \(\mu \) is the gas flow rate \((\$. h/m^3)\); \(\lambda _i\) is the failure rate per unit length of the pipe i \(((m.h)^{-1})\); \(t_i\) is the average duration of the pipe failure i (h), and \(q_{i,j}(\cdot )\) is the gas flow in the pipeline i for the scenario j (\(m^3/h\)). The values adopted in this work are as follows:

$$\begin{aligned} \mu = 0.10 ~(\$. h/m^3) \\ \lambda _i = 10^{-4} ~((m.h)^{-1}), ~\forall i \in \{1,\ldots ,m\} \\ t_i = 1 ~(h), ~\forall i \in \{1,\ldots ,m\} \end{aligned}$$

Sensitivity

The sensitivity allows to evaluate the behavior of the network if any pipeline has a malfunction. In this work, the impact due to a malfunction of a pipeline was modeled as the reduction of its diameter to \(50\% \) of its nominal value. The sensitivity function modeled below considers the sum of the failure effects of each of the network pipes (simultaneous failures are not considered):

$$\begin{aligned} F_5(x) = \sum _{i=1}^{m} \Bigg ( \frac{1}{N_a(x)} \sum _{j=1}^{n} \Omega _j^i(x) \Bigg ) \end{aligned}$$
(10)

in which \( \Omega _j^i(x) = {\left\{ \begin{array}{ll} 0,&{} \text{ if } P_j^i(x) \ge P_\text {min} \\ (P_\text {min}-P_j^i(x))^2, &{} \text{ if } P_j^i(x) < P_\text {min} \\ \end{array}\right. }\) \( N_a (\cdot ) \) is the number of nodes in solution x that does not violate the minimum pressure restriction.

5 Design of Experiments

All the algorithms used in the experiment were developed in MATLAB R2014b and run on a computer Ubuntu 16.04 LTS 8GB of RAM. The statistical analysis of the results was carried out in R 3.5.3 (R Core Team, 2019).

The instances used were generated from the instance generator developed in this work. To perform the tests, 9 instances were developed for the problem: Berlin52a, Berlin52b, Eil51a, Eil51b, Eil76a, Eil76b, St70a, St70b, and Rd100. In all experiments, 5 executions of the algorithms were performed for each instance.

The multiobjective algorithms compared in this work are MORVNS, MOGVNS, NSGA-IIa, and NSGA-IIb. The NSGA-II employed in this work has many similarities with the one proposed in Deb et al. (2002), however with different variation operators, as defined in El-Mahdy et al. (2010). The algorithms NSGA-IIa and NSGA-IIb differ only in relation to the stopping criterion. The stop criterion of the NSGA-IIa is equal to the number of function evaluations (nfe) performed by MORVNS and the stop criterion of NSGA-IIb is equal to the nfe of MOGVNS. In the algorithms MORVNS and MOGVNS, the stop criterion based on runtime has been replaced by a maximum number of iterations. In this context, it was considered 30 iterations for the MORVNS and 3 for the MOGVNS.

Finally, the best performing multiobjective algorithm was applied to the case study presented in El-Mahdy et al. (2010). The results are compared against the literature.

6 Results

In this section, the results obtained by the proposed tool are presented. The experiments are divided into two stages: (i) comparison of the performance of multiobjective algorithms applied to the problem using benchmark instances and (ii) use of the best performing algorithm to optimize the case study proposed in El-Mahdy et al. (2010). Finally, a multiobjective analysis is carried out under conditions of uncertainty regarding the evolution of demand.

6.1 Results from the Benchmark Instances

The parameters used by NSGA-II a and b are compatible with those used by the mono-objective genetic algorithm proposed in El-Mahdy et al. (2010). The algorithm uses binary variables of 3 bits and the probabilities of crossover and mutation are 100% and 5%, respectively. The selection operator is based on a binary tournament.

To compare the algorithms MORVNS, MOGVNS, NSGA-IIa, and NSGA-IIb, 9 instances were generated from the instance generator. Table 2 shows the mean and standard deviation of the hypervolume (HV) and runtime (in hours) for each instance.

Table 2 Comparative results between the algorithms. The best hypervolume (HV) and runtime (in hours) values are in bold
Fig. 2
figure 2

Fronts obtained by the algorithms in the instances: (a) Berlin52a; (b) Eil76b; (c) Eil70b; and (d) Rd100

From Table 2 it is possible to observe that the MOGVNS algorithm presents, in most cases, the largest hypervolume. Figure 2 shows examples of the estimated Pareto-front for the instances Berlin52a, Eil76b, Eil70b, and Rd100. From these results, it is possible to see that the MOGVNS algorithm is capable of obtaining fronts with better coverage. MORVNS tends to prioritize results with lower cost, but with pressures close to the minimum allowed. NSGA-IIa, in contrast to MORVNS, tends to map solutions with a higher cost, but with a higher minimum pressure. NSGA-IIb obtained solutions with promising coverage, however irregularly distributed along the front, with several spaces between them.

For a more elaborate comparison of the algorithms, a statistical analysis was performed to check if there is a significant difference between them in terms of the hypervolume. The analysis of variance test (ANOVA) was used to check if there is a statistical difference between the performance of the algorithms (significance of 0.05). ANOVA is applied to test the null hypothesis (\(H_0\)) of no difference in the performance of the algorithms against the alternative hypothesis (\(H_1\)) that at least one algorithm is different from another algorithm (Montgomery and Runger, 2013):

$$\begin{aligned} {\left\{ \begin{array}{ll} H_0: &{} \alpha _i = 0 \quad \forall i\\ H_1: &{} \alpha _i \ne 0 \quad \text{ for } \text{ any }\, {\textit{i}} \end{array}\right. } \end{aligned}$$

The ANOVA results indicated that there are differences between the methods, i.e., with p value less than 0.05. The premises of normality and homoscedasticity were respected; the Kolmogorov-Smirnov test and the Bartlett test were used, respectively.

Tukey’s multiple comparison test (Tukey, 1953) was performed to identify the averages of the algorithms that are statistically different from each other (Campelo, 2018). This method can be performed with ANOVA.

The results obtained by the Tukey test showed that for the instances Berlin52a, Berlin52b, Eil51b, and St70a, the algorithms MOGVNS and NSGA-IIb present statistically different performances concerning the methods MORVNS and NSGA-IIa. For instances Eil51a, Eil76a, Eil76b, St70b, and Rd100a, statistically different performances are observed in all algorithms. MOGVNS obtained the highest average of the hypervolume in most instances, as shown in Table 2.

Regarding the computational time of the algorithms, MORVNS and NSGA-IIa algorithms require similar runtime. In the same way, it is possible to notice this similarity between the algorithms MOGVNS and NSGA-IIb. This approximation is due to the same stop criterion of these algorithms.

However, the runtime difference between these two groups is considerable. This difference can be justified by the increase in the number of evaluations of the objective function, which for the problem in question is computationally expensive.

Despite this fact, due to the quality of the solutions pointed out by the MOGVNS and NSGA-IIb algorithms, it is believed that the runtime in question is not as relevant as the quality of the solutions obtained, since a solution is not needed in a short time, but an efficient project. In general, it is extremely important that the solutions are good enough to save thousands of dollars and also to meet the demand of the network.

In view of the foregoing considerations, it was decided to apply the MOGVNS in the case study.

6.2 Case Study

The instance used in this case study was proposed in El-Mahdy et al. (2010). In this article, the authors compared the solutions obtained by a binary genetic algorithm with the solutions provided by three engineers with an average of 6 years of experience.

The network consists of \( m = 21 \) pipes, \(n = 12\) internal nodes, and \(r = 2\) source nodes, as can be seen in Figure 3.

Fig. 3
figure 3

Network employed in the case study. Adapted from El-Mahdy et al. (2010)

6.2.1 Results Obtained by the Proposed Approach

Five MOGVNS executions were carried out. The fronts obtained (Figure 4) show the ability of the method to reproduce similar results.

Fig. 4
figure 4

Fronts obtained by MOGVNS in the case study

The largest hypervolume front (from execution 5) was selected and the multicriteria analysis was performed, evaluating installation cost, minimum pressure, feasibility rate, average failure cost, and sensitivity. The results obtained are shown in Table 3. The solutions proposed in Ramos and Batista (2020), El-Mahdy et al. (2010) and by the three engineers were also evaluated on the five criteria, as shown in Table 4.

From the results obtained, the difficulty of obtaining low-cost and high-pressure solutions is notable due to the strong conflict between the two objectives. Besides, based on the methodology used, it is possible to observe that important criteria (such as those discussed) cannot be ignored in network planning, as they present interesting characteristics about the behavior of a network in conditions of uncertainty.

An example of the efficiency of the proposed approach is solutions 2, 6, 11, 19, and 20 (Table 3), which had the lowest installation cost, higher value of minimum pressure, greater feasibility, and less sensitivity compared to the solutions of the three engineers (Table 4).

In order to make a decision based on the criteria presented in Table 3, taking into account the cost of installation, minimum pressure, feasibility rate, average cost of failures, and sensitivity of the network, solution 19 was chosen.

The chosen solution contains a cost value smaller than those found by all the engineers (Table 4). The minimum pressure is high and greater than all the results presented in El-Mahdy et al. (2010) and Ramos and Batista (2020). It contains a high feasibility rate (77.65%) while the highest rate shown in Table 4 is 10.5%; its average cost of failure is the fifth-lowest in Table 3 and its sensitivity is lower than those presented by the papers used as references.

Table 3 Results of multicriteria analysis

6.3 Final Discussion

The results presented in this section illustrate how the proposed approach can provide promising solutions with different characteristics. Among the results obtained, presented in Table 3, some are cheaper - both concerning installation and average cost of failure - while others are more likely to supply gas properly across the network - greater viability and smaller sensitivity.

In addition, when comparing Table 3 with Table 4, it can be seen that, given a similar installation cost range, the solutions in Table 3 should surpass those in Table 4. This demonstrates how the search for solutions under different conditions of uncertainty and merit functions characterizes a better approach than just looking for cheaper solutions under the condition of nominal demand.

However, many simplifications and approximations have been adopted in this work; some for lack of specialization, some for leading to unnecessary complications, and others for being, by nature, specific to the network being optimized. Although it is a belief that none of these undermines the robustness of the methodology itself, they give a lot of room for improvements and adjustments to be made. Some ideas to consider are listed as follows:

  • Complexity adjustment: the algorithm is computationally expensive. Maintaining a set of candidate solutions requires MOGVNS to solve the problem of nonlinear flow over and over again.

  • Data collection: many values are approximate in this work. The growth in demand adopted seems acceptable, but it certainly varies from case to case.

7 Conclusion

This work aimed to develop a multiobjective optimization tool for the problem of natural gas network dimensioning. Minimizing the cost of installing the network and maximizing the minimum pressure on the network nodes were considered, aiming to meet the demand safely. To assist the decision-making, the results obtained were evaluated in five criteria: installation cost, minimum pressure, feasibility rate, average cost of failure, and sensitivity. The multicriteria analysis took into account scenarios of uncertainty in the evolution of demand over a specified time horizon (10 years).

Table 4 Solutions from Ramos and Batista (2020) and El-Mahdy et al. (2010) evaluated on the proposed criteria

In general, this work brings gains to the academic community involved with the optimization of natural gas distribution networks. As stated before, it is believed that the original contributions of this work are as follows: the multiobjective modeling proposed; both the dedicated multiobjective VNS and the intelligent constructive heuristic; a random instance generator; and a strategy to aid multicriteria decision-making.

The constructive heuristic proposed for the definition of an initial solution proved to be very efficient. Thus, the MORVNS and MOGVNS algorithms already started with good solutions. This fact tends to favor local search since the initial solution is possibly already close to promising local optima.

From a descriptive analysis of the hypervolume values obtained by the MORVNS, MOGVNS, NSGA-IIa, and NSGA-IIb algorithms (Table 2), it was observed that in most instances, the MOGVNS obtained the highest hypervolume value. The ANOVA test indicated that there are differences between the investigated algorithms. The Tukey test showed that for 5 of the instances used, the algorithms are statistically different from each other; moreover, it showed that in the other 4 instances, the MOGVNS and NSGA-IIb are statistically different from the MORVNS and NSGA-IIa algorithms. The analysis performed concerning the computational cost shows that due to the quality of the solutions obtained by the MOGVNS algorithm, it has the best performance among the investigated algorithms. Even though it is computationally expensive, the MOGVNS algorithm can obtain good solutions considering the computational cost and the minimum pressure. Thus, taking into account the investigation carried out, it was concluded that the best performing algorithm was MOGVNS.

From the results obtained in the case study, it is possible to observe that due to the conflict of objectives between the installation cost and the minimum pressure, it is important to evaluate the trade-off between the estimated solutions. In this context, the proposed multicriteria analysis proved to be very efficient to assist in the decision-making process, as it aids based on a perspective of demand growth over 10 years.

Analyzing the quality of the solutions using a time horizon is another premise adopted that is considered of paramount importance. Installing a natural gas network is a major undertaking, and it is expected to last for decades before needing to be resized. Therefore, this multicriteria analysis offers designers a good level of confidence that the chosen layout will remain operational for a considerable period of time.

There are many potential extensions to what has been done in this work. For continuity work, it is suggested: improve the efficiency of the multiobjective algorithms used, aiming to decrease the time spent on their execution; develop an investigation concerning larger instances, as it is believed that the performance difference between the multiobjective algorithms will be even more evident; explore additional criteria related to the problem of natural gas network dimensioning; unconsidered analyzes can be included (for example, maintenance costs); additional sources of uncertainty can be studied (e.g. pipe roughness, gas flow rate).

Finally, although this procedure is also, or even more, capable of finding cheap and robust solutions to the problem of sizing gas networks, it does not imply, at all, replacing human pipeline designers. On the contrary, its main objective is to provide the stakeholders a good and varied set of solutions, leaving the real choice for designers and engineers. It is also worth emphasizing that many parts of the algorithm are highly customizable and, therefore, as good as the people involved. Thus, both human judgment and the algorithmic part are inseparable from the quality of the procedure itself.