1 Introduction

Pareto-based multi- and many-objective evolutionary algorithms use non-dominated sorting as an intermediate step (Deb et al. 2002; Deb and Jain 2014). Non-dominated sorting is also used in other domains such as economics, databases, game theory and computational geometry. In non-dominated sorting, the solutions are sorted based on the dominance relationship between the solutions. Let \(\mathbb {P} = \left\{ \textit{sol}_1,\textit{sol}_2,\ldots ,\textit{sol}_N \right\} \) be a set of N solutions where each solution is associated with M objectives. The set of solutions is known as population. Actually, these solutions are in an M-dimensional objective space. A particular solution \(\textit{sol}_i(1 \le i \le N)\) in the population is represented as \(\textit{sol}_i = \left\{ f_1(\textit{sol}_i),f_2(\textit{sol}_i),\ldots ,f_M(\textit{sol}_i) \right\} \) where \(f_m(\textit{sol}_i), 1 \le m \le M\) is the value of \(\textit{sol}_i\) for the \(m^{th}\) objective. We consider here the optimization problems where the focus is to minimize all the objectives. For sorting the solutions, the dominance relation between them is required, which is defined as follows.

Definition 1

(Dominance) A solution \(\textit{sol}_i\) is said to dominate another solution \(\textit{sol}_j\) which is represented as \(\textit{sol}_i \prec \textit{sol}_j\) iff it satisfies the two following conditions:

  1. 1.

    \(f_m(\textit{sol}_i) \le f_m(\textit{sol}_j), \forall m \in \left\{ 1,2,\ldots ,M \right\} \)

  2. 2.

    \(f_m(\textit{sol}_i) < f_m(\textit{sol}_j), \exists m \in \left\{ 1,2,\ldots ,M \right\} \).

The notation \(\textit{sol}_i \nprec \textit{sol}_j\) is used to represent that solution \(\textit{sol}_i\) does not dominate solution \(\textit{sol}_j\). We call two solutions \(\textit{sol}_i\) and \(\textit{sol}_j\) as non-dominated when neither solution dominates another, i.e., neither \(\textit{sol}_i \nprec \textit{sol}_j\) nor \(\textit{sol}_j \nprec \textit{sol}_i\). In non-dominated sorting, the set of solutions are divided into non-dominated fronts as formally defined next.

Definition 2

(Non-dominated sorting) Given a set of N solutions \(\left\{ \textit{sol}_1, \right. \) \(\left. \textit{sol}_2,\ldots ,\textit{sol}_N \right\} \). In non-dominated sorting, these solutions are divided in \(K(1 \le K \le N)\) non-dominated fronts \(\left\{ F_1,F_2,\ldots ,F_K \right\} \) in decreasing order of their dominance such that

  1. 1.

    \(\cup _{i=1}^{K} F_i = \mathbb {P}\)

  2. 2.

    \(\forall \textit{sol}_i,\textit{sol}_j \in F_k\): \(\textit{sol}_i \nprec \textit{sol}_j\) and \(\textit{sol}_j \nprec \textit{sol}_i\) \((1 \le k \le K)\)

  3. 3.

    \(\forall \textit{sol} \in F_k\), \(\exists \textit{sol}\,' \in F_{k-1}\): \(\textit{sol}\,' \prec \textit{sol}\) \((2 \le k \le K)\)

In these sorted fronts, \(F_1\) has the highest dominance, \(F_2\) has the second highest dominance and so on. The last front \(F_K\) has the lowest dominance.

Parallel programming has attracted a lot of attention in recent years as a means to reduce the execution time of algorithms. Evolutionary algorithms are normally easy to parallelize due to their low data dependency, and this has motivated a considerable amount of research in this area (Luna and Alba 2015; Van Veldhuizen et al. 2003; Kim and Smith 2004; Maulik and Sarkar 2010; Shinde et al. 2011; Wong and Cui 2013). Consequently, the parallelization of non-dominated sorting algorithms has also attracted the interest of several researchers (see for example Smutnicki et al. 2014; Gupta and Tan 2015; Ortega et al. 2017; Moreno et al. 2018; Mishra and Coello 2018).

The main focus of the current research in this area has been on the parallelization of the naive approach for non-dominated sorting proposed by Srinivas and Deb (1994). However, there are several other non-dominated sorting approaches which have also the parallelism property such as the fast non-dominated sorting Deb et al. (2002), ENS Zhang et al. (2015), BOS Roy et al. (2016), DCNS Mishra et al. (2016), T-ENS Zhang et al. (2018) and ENS-NDT Gustavsson and Syberfeldt (2018), among others.

It is worth noticing that approaches such as ENS Zhang et al. (2015), BOS Roy et al. (2016), DCNS Mishra et al. (2016), ENS-NDT Gustavsson and Syberfeldt (2018) and T-ENS Zhang et al. (2018), require to sort 2N solutions (in case of NSGA-II Deb et al. (2002), NSGA-III Deb and Jain (2014), and others) unlike the naive approach (Srinivas and Deb 1994), fast non-dominated sort (Deb et al. 2002) and deductive sort (McClymont and Keedwell 2012) where the process of sorting can be stopped when we have enough fronts which contain N solutions. Let the \(N^{th}\) solution be inserted into the \(k^{th}\) front. As soon as any other solution is inserted into the \(k+1^{th}\) front, the process of non-dominated sorting stops in case of the naive approach (Srinivas and Deb 1994), fast non-dominated sort (Deb et al. 2002) and deductive sort (McClymont and Keedwell 2012). However, ENS Zhang et al. (2015), BOS Roy et al. (2016), DCNS Mishra et al. (2016), ENS-NDT Gustavsson and Syberfeldt (2018) and T-ENS Zhang et al. (2018) continue the process until all 2N solutions are sorted. So, we have focused on the scope of parallelism in the naive approach. The worst case time complexity of the naive approach is \(\mathcal {O}(MN^3)\), and the best case time complexity is \(\mathcal {O}(MN^2)\) (Srinivas and Deb 1994). The best case of the naive approach occurs when all the solutions are non-dominated with respect to each other. Generally, when all the solutions are in single front (non-dominated with respect to each other), several approaches have their worst case (e.g., deductive sort McClymont and Keedwell 2012, ENS Zhang et al. (2015), DCNS Mishra et al. (2016), T-ENS Zhang et al. (2018) and ENS-NDT Gustavsson and Syberfeldt (2018)), having a \(\mathcal {O}(MN^2)\) time complexity. The naive approach performs close to its best case when the number of fronts are less in number.

The organization of the rest of the paper is as follows. Some of the approaches for non-dominated sorting are described in Sect. 2. The naive approach is illustrated in Sect. 3 along with its time and space complexities. Parallelism in the naive approach is explored in Sect. 4, and three parallel versions are proposed. The time and space complexities of the parallel versions are mathematically derived. Finally, Sect. 5 concludes the paper with some possible paths for future research.

2 Previous related work

We discuss here some of the approaches for non-dominated sorting that have been proposed in the specialized literature. Deb et al. (2002) proposed fast non-dominated sort which requires \(\mathcal {O}(MN^2)\) time and has a space complexity of \(\mathcal {O}(N^2)\). Jensen (2003) proposed a recursive approach based on a divide-and-conquer strategy with time complexity \(\mathcal {O}(N\log ^{M-1}N)\) and space complexity \(\mathcal {O}(MN)\). When two solutions have the same value for a particular objective, then this approach is not able to correctly sort the solutions. For two objectives, the time complexity of Jensen’s approach is \(\mathcal {O}(N\log N)\). Fang et al. (2008) proposed another approach based on a divide-and-conquer strategy. The worst case time complexity of this approach is \(\mathcal {O}(MN^2)\). However, the best case time complexity is \(\mathcal {O}(MN\log N)\) and its space complexity is \(\mathcal {O}(MN)\). This approach has improved the best case time complexity. However, in case of duplicate solutions, this approach considers one solution as dominated by another. Tang et al. (2008) proposed an approach based on arena’s principle with a worst case time complexity \(\mathcal {O}(MN^2)\) and a best case time complexity \(\mathcal {O}(MN \sqrt{N})\).

McClymont and Keedwell (2012) proposed two approaches: Climbing sort and Deductive sort. The worst case time complexity of both the approaches is \(\mathcal {O}(MN^2)\). However, deductive sort performs half the comparisons of climbing sort in the worst case. The best case time complexity of deductive sort is \(\mathcal {O}(MN\sqrt{N})\). This approach reduces the number of comparisons by inferring the dominance relationship between the solutions. The limitation of Jensen’s approach is removed by adopting the proposal of Fortin et al. (2013). However, the limitation is removed at the cost of an increased worst case time complexity which is \(\mathcal {O}(MN^2)\). The average case time complexity of Fortin’s approach is same as Jensen’s approach which is \(\mathcal {O}(N\log ^{M-1}N)\).

An efficient approach for non-dominated sorting known as ENS was proposed by Zhang et al. (2015). ENS works in two phases. In the first phase, the solutions are sorted based on a particular objective (generally the first objective). The solutions are ranked in the second phase. There are two variants of ENS based on how a solution is added to the existing set of fronts. The first one is ENS-SS which is based on sequential search, and the second one is ENS-BS which is based on binary search. ENS-SS and ENS-BS both have worst case time complexity \(\mathcal {O}(MN^2)\). However, the best case time complexity of ENS-SS is \(\mathcal {O}(MN\sqrt{N})\) and the best case time complexity of ENS-BS is \(\mathcal {O}(MN\log N)\). Buzdalov and Shalyto (2014) proposed an approach with time complexity \(\mathcal {O}(N \log ^{M-1} N)\). A Hierarchical Non-dominated Sorting (HNDS) scheme was proposed by Bao et al. (2017). Like ENS, this is also a two-phased approach where the solutions are sorted based on a particular objective in the first phase and the solutions are ranked in the second phase. The worst case time complexity of HNDS is \(\mathcal {O}(MN^2)\), and the best case time complexity is \(\mathcal {O}(MN\sqrt{N})\). Corner sort was proposed by Wang and Yao (2014) with a worst case time complexity \(\mathcal {O}(MN^2)\).

There are several other recent proposals (Roy et al. 2016; Zhang et al. 2018; Gustavsson and Syberfeldt 2018; Mishra et al. 2018b, a; Roy et al. 2018) where, for a solution to be inserted into a front, there is no need to compare it with all the other solutions. Best Order Sort (BOS) (Roy et al. 2016) is based on this same concept. In BOS, the solutions are initially sorted based on each of the objectives, considered separately. Then, the solutions are assigned to their respective front. The worst case time complexity of BOS is \(\mathcal {O}(MN^2)\), whereas its best case time complexity is \(\mathcal {O}(MN\log N)\). BOS has two advantages: (i) the number of dominance comparisons is reduced to a great extent and (ii) all the objectives values of two solutions are not considered while obtaining the dominance relationship between them. The second advantage of BOS is because of the comparison set concept which contains all the objective values of the solution. In spite of these two advantages, BOS is not able to handle duplicate solutions properly. BOS has been recently updatedFootnote 1 to remove its limitation. However, BOS loses its second advantage in the process of removing its limitation. Mishra et al. (2018b) also worked on the same limitation of BOS and handled duplicate solutions efficiently without retaining the second advantage of BOS. Recently, the generalized version of BOS called “Generalized Best Order Sort” (GBOS) has been proposed which handles duplicate solutions efficiently and retains the comparison set concept of BOS (Mishra et al. 2018a). Thus, it makes BOS more effective in terms of the number of comparisons. Bounded Best Order Sort (BBOS) (Roy et al. 2018) is an improved version of BOS. BBOS works better for a large number of fronts. A heuristic is proposed to reduce the computational effort of solution comparisons. The worst case time complexity of BBOS is \(\mathcal {O}(MN^2)\), and the best case time complexity is \(\mathcal {O}(MN\log N)\). BBOS can achieve \(\mathcal {O}(MN\log N + N^2)\) time complexity in case of a single front. The same time complexity has also been achieved by Roy et al. (2016); Mishra et al. (2018a, 2018b).

A tree-based approach known as T-ENS was proposed by Zhang et al. (2018). In this approach, a non-dominated front is represented in the form of a tree to reduce the number of comparisons. The worst case time complexity of T-ENS is \(\mathcal {O}(MN^2)\), and the best case time complexity is \(\mathcal {O}(\nicefrac {MN\log N}{\log M})\). By extending ENS-BS (Zhang et al. 2015), a tree-based approach known as ENS-NDT (Efficient Non-dominated Sort based on Non-dominated Tree) was proposed by Gustavsson and Syberfeldt (2018). This approach is able to handle duplicate solutions efficiently. The worst case time complexity of ENS-NDT is \(\mathcal {O}(MN^2)\), and its best case time complexity is \(\mathcal {O}(MN\log N)\) when \(M > \log N\); otherwise, it is \(\mathcal {O}(N \log ^2 N)\).

There has been also some research on the parallelization of non-dominated sorting. A very fast non-dominated sort was proposed by Smutnicki et al. (2014). This approach focuses on exploring the parallelization of fast non-dominated sort (Deb et al. 2002). Parallelism is considered in two different manners. The time complexity of the first parallel version is \(\mathcal {O}(M+N \log N)\), and the second parallel version is \(\mathcal {O}(M+N)\). Gupta and Tan (2015) proposed a GPU-based parallel algorithm for non-dominated sorting which is also based on fast non-dominated sort (Deb et al. 2002). Ortega et al. (2017) also explores the parallelism in fast non-dominated sort (Deb et al. 2002). Three parallel versions were developed in this regard (i.e., first based on GPUs, a second one based on multicores and a third one based on both GPUs and multicores). Recently, the parallelism in BOS (Roy et al. 2016) has been explored by Moreno et al. (2018). They have proposed two different parallel versions based on multicore processors and the GPU. Recently, the parallel version of ENS Zhang et al. (2015) has been discussed in Mishra and Coello (2018).

There are also other approaches (Drozdik et al. 2015; Mishra et al. 2016; Li et al. 2017; Mishra et al. 2017; Yakupov and Buzdalov 2017) where in spite of performing the complete non-dominated sorting, an offspring solution is inserted into its proper place in the existing sorted set of fronts. This kind of scenario is generally used in steady-state multi-objective evolutionary algorithms (Mishra et al. 2017).

3 Naive approach: serial version

In this section, we discuss the naive approach (Srinivas and Deb 1994) in its serial version. In the naive approach, each solution is compared with all the other solutions. After comparing the solutions with each other, the solutions which are not dominated by any other solution are assigned to the first front. Now, the solutions of the first front are not considered. The rest of the solutions are compared with each other. Now, the solutions which are not dominated by any other solution are assigned to the second front. This process is repeated until all the solutions are assigned to their respective front. The naive approach is described in Algorithm 1.

figure a

The worst case of the naive approach occurs when all the solutions are in the different fronts. In this case, in each iteration of the algorithm, a single solution is ranked. Thus, the time complexity in the worst case is given by Eq. (1).

$$\begin{aligned} T_{1_\text {worst}}&= \sum \limits _{i=1}^{N} \left[ \left\{ \sum \limits _{j=1}^{N-i+1} M(N-i) \right\} + (N-i+1) \right] \nonumber \\&= \frac{1}{3}MN\left( N^2-1\right) + \frac{1}{2}N(N+1) = \mathcal {O}\left( MN^3\right) \end{aligned}$$
(1)

The best case of the naive approach occurs when all the solutions are in the same front. In this case, each solution is compared with other solutions only once and they are ranked. Thus, the best case time complexity is given by Eq. (2).

$$\begin{aligned} T_{1_\text {best}}&= \sum \limits _{i=1}^{1} \left[ \left\{ \sum \limits _{j=1}^{N-i+1} M(N-i) \right\} + (N-i+1) \right] \nonumber \\&= MN(N-1) + N = \mathcal {O}\left( MN^2\right) \end{aligned}$$
(2)

To sort the solutions into different fronts, an array ‘\(\text {isDominated}[\,\,\,]\)’ is needed to store whether a solution is dominated by any other solution in the population or not. The maximum size of the population is N. Thus, the space complexity of the naive approach is \(\mathcal {O}(N)\).

Fig. 1
figure 1

The process to obtain the dominance relationship between the two solutions \(\textit{sol}_i\) and \(\textit{sol}_j\) in a parallel manner

4 Scope of parallelism

In this section, we discuss the scope of parallelism in the naive approach. Parallelism is analyzed in the naive approach in three different manners. Before discussing the parallelism in detail, we first discuss the computing environment for which parallelism will be explored.

4.1 Computing environment

In our study, we are considering the PRAM CREW (Parallel random-access machine with Concurrent Read, Exclusive Write) model as considered in Smutnicki et al. (2014). The PRAM CREW model is earliest and best-known model of parallel computation (JáJá 1992; Kumar et al. 1994). In this model, simultaneous read at the same memory location is allowed. However, simultaneous write is not allowed. As simultaneous write operations are not allowed, so there will be no concurrent write operations in our parallel version. Analysis of parallel algorithms is usually carried out under the assumption that an unbounded number of processors is available (Niculescu 2007; Mikloško and Kotov 1984). So we have also considered this assumption. In our analysis, we have also obtained the maximum number of processors which can be required for the parallel version of the algorithm.

4.2 Parallelism in dominance comparisons

In general, the dominance relation between two solutions can be obtained in \(\mathcal {O}(M)\) time as M objectives need to be compared. The dominance relation between each pair of solutions in the population can be obtained simultaneously. The dominance relationship between each pair of solutions can be stored in a matrix of size \(N \times N\). We call this matrix dominance matrix. Thus, the dominance matrix can be obtained in \(\mathcal {O}(M)\) time if the dominance relation between each pair of solutions can be obtained simultaneously. Smutnicki et al. (2014) has obtained the dominance matrix in the same manner. The time complexity of obtaining the dominance matrix can be further improved if the time complexity of obtaining the dominance relation can be improved. Now, we discuss the improved way to compute the dominance relation between two solutions.

Let us have two solutions \(\textit{sol}_i\) and \(\textit{sol}_j\). For these two solutions, we create two Boolean arrays, each of size M. Let the first array be \(B_i\) and the second array be \(B_j\). \(B_i\) is used to store whether \(\textit{sol}_i\) is better than \(\textit{sol}_j\) for each of the M objectives. Similarly, \(B_j\) is used to store whether \(\textit{sol}_j\) is better than \(\textit{sol}_i\) for each of the M objectives. If the objective value of \(\textit{sol}_i\) is better than (less than as we focus to minimize all the objectives) the objective value of \(\textit{sol}_j\) for the same objective, then the corresponding cell of \(B_i\) is set to ‘True’; otherwise, it is set to ‘False’. Similarly, \(B_j\) is also filled.

These two arrays \(B_i\) and \(B_j\) are processed simultaneously. As the size of both the arrays is M, so these arrays are processed at \(\log M\) levels. At each level, ‘OR’ operations between two consecutive array cells are performed. The number of ‘OR’ operations at the \(l^{th}\) level is \(\nicefrac {M}{2^l}\). Thus, the number of ‘OR’ operations at the last level is one and we get either ‘True’ or ‘False’ after the ‘OR’ operation at the last level. Two values are obtained after processing both the arrays \(B_i\) and \(B_j\). Let the value obtained from \(B_i\) be \(V_i\) and the value obtained from \(B_j\) be \(V_j\). The dominance relationship between two solutions \(\textit{sol}_i\) and \(\textit{sol}_j\) can be obtained from \(V_i\) and \(V_j\) based on the following four conditions:

  1. 1.

    \(\varvec{V_i} = \varvec{V_j} = \) False: Solutions \(\textit{sol}_i\) and \(\textit{sol}_j\) are the same in terms of the objective values.

  2. 2.

    \(\varvec{V_i} = \varvec{V_j} = \) True: Solutions \(\textit{sol}_i\) and \(\textit{sol}_j\) are non-dominated.

  3. 3.

    \(\varvec{V_i} = {\textsc {True}}\,\mathbf{and }\, \varvec{V_j} = {\textsc {False:}}\) Solution \(\textit{sol}_i\) dominates \(\textit{sol}_j\).

  4. 4.

    \(\varvec{V_i} = \) False and \(\varvec{V_j} = \) True: Solution \(\textit{sol}_i\) is dominated by \(\textit{sol}_j\).

The arrays \(B_i\) and \(B_j\) can be filled in \(\mathcal {O}(1)\) time, in parallel. Different ‘OR’ operations at the same level can be performed simultaneously in both Boolean arrays, so the time complexity of processing these arrays in a parallel manner is \(\mathcal {O}(\log M)\). Thus, the dominance relationship between a pair of solutions can be obtained in \(\mathcal {O}(\log M)\) time. Hence, the dominance matrix can also be obtained in \(\mathcal {O}(\log M)\) time in parallel, as the dominance relation between each pair of solutions can be obtained simultaneously. Once the dominance matrix is obtained, the time complexity of obtaining the dominance relationship between two solutions is \(\mathcal {O}(1)\) as only a lookup in the dominance matrix is required.

The dominance relation between two solutions is obtained using two Boolean arrays of size M which require \(\mathcal {O}(M)\) space. There are a total of \(N^2\) pairs of solutions, so the overall space required to obtain the dominance relation between each pair of solutions is \(\mathcal {O}(MN^2)\). Also, the space required to store the dominance matrix is \(\mathcal {O}(N^2)\). Thus, the overall space complexity to obtain the dominance matrix in a parallel manner is \(\mathcal {O}(MN^2)\).

Example 1

Let us consider two solutions \(\textit{sol}_i {=} \left\{ 4,2,6,5 \right\} \) and \(\textit{sol}_j {=} \left\{ 4,2,6,1 \right\} \) which are in four-dimensional objective space. For obtaining the dominance relationship between these two solutions, we fill the Boolean arrays \(B_i\) and \(B_j\). After obtaining two Boolean arrays, these arrays are processed simultaneously using ‘OR’ operations. The final value obtained after processing \(B_i\) is ‘False’ and after processing \(B_j\) is ‘True’. So, \(\textit{sol}_i\) is dominated by \(\textit{sol}_j\). The complete process of filling both Boolean arrays and processing them is shown in Fig. 1.

Boolean array \(B_i\) can be filled using maximum of M processors in parallel as its size is M. In the same manner, array \(B_j\) can also be filled using maximum of M processors. Both these arrays can be filled simultaneously. Thus, the maximum number of processors required to fill both the Boolean arrays is 2M. To obtain the dominance relationship between solutions \(\textit{sol}_i\) and \(\textit{sol}_j\), Boolean arrays \(B_i\) and \(B_j\) are processed at \(\log M\) levels. The maximum number of processors required to process any of the Boolean arrays is \(\nicefrac {M}{2}\). Both the Boolean arrays can be processed simultaneously, thus, the maximum number of processors required to process both the Boolean arrays is \(\nicefrac {M}{2}+\nicefrac {M}{2}=M\). Hence, using maximum 2M processors, the dominance relationship between two solutions can be obtained. As the dominance relationship between each pair of the solutions can be obtained simultaneously and there are \(N^2\) pairs of solutions, so the maximum number of processors required to obtain the dominance matrix in a parallel manner is \(2MN^2\).

4.3 Parallel version-1

The parallel version-1 of the naive approach is described in Algorithm 2. Here, each solution \(\textit{sol}\) can be simultaneously compared with other solutions (the outer for loop in lines \(4-11\) can be implemented in a parallel manner). However, a particular solution \(\textit{sol}\) is compared to other solution \(\textit{sol}\,'\) in a serial manner (the inner for loop in lines \(5-10\) is implemented in a serial manner). This scenario is shown in Fig. 2a. In this figure, all the N solutions in the top array are simultaneously compared with other solutions in the bottom array. However, each of the solutions in the top array is compared with all the solutions in the bottom array sequentially.

figure b
Fig. 2
figure 2

Different types of parallelism in the naive approach. \(1,2,3,\ldots ,N\) in a denotes the sequential comparisons whereas \(1,1,1,\ldots ,1\) in b denotes the parallel comparisons

In this parallel version, after comparing each solution with all the other solutions, we check each solution sequentially to see whether it has been dominated by any other solution or not (lines \(12-15\)). The solution which is not dominated by any other solution is assigned a rank and removed from the population so that it do not take part is rank assignment process again. We repeat this process until all the solutions are ranked.

The time complexity in the worst case is given by Eq. (3).

$$\begin{aligned} T_{\infty _\text {worst}}&= \sum \limits _{i=1}^{N} \left[ M(N-i) + (N-i+1) \right] \nonumber \\&= \frac{1}{2}MN(N-1) + \frac{1}{2}N(N+1) = \mathcal {O}(MN^2) \end{aligned}$$
(3)

The time complexity in the best case is given by Eq. (4).

$$\begin{aligned} T_{\infty _\text {best}}&= \sum \limits _{i=1}^{1} \left[ M(N-i) + (N-i+1) \right] \nonumber \\&= M(N-1) + N = \mathcal {O}(MN) \end{aligned}$$
(4)

To sort the solutions into different fronts, an array ‘\(\text {isDominated}[\,\,\,]\)’ is needed to store whether a solution is dominated by any other solution in the population or not. The maximum size of the population is N. Thus, the space complexity of this parallel version is \(\mathcal {O}(N)\). So, this parallel version does not add any extra overhead in terms of the space complexity as compared to the serial version.

In this parallel version, each solution \(\textit{sol}\) can be simultaneously compared with other solutions. However, a particular solution \(\textit{sol}\) is compared with other solutions in a serial manner. There are maximum N solutions; thus, the maximum number of processors required by this approach is N.

In this parallel version, if the dominance relationship between different solutions can be obtained initially as described in Sect. 4.2 and stored in dominance matrix, then the time complexity in the worst case is given by Eq. (5) and the time complexity in the best case is given by Eq. (6).

$$\begin{aligned} T_{\infty \text {worst}}&= \log M + \sum \limits _{i=1}^{N} \left[ (N-i) + (N-i+1) \right] \nonumber \\&= \log M+ \frac{1}{2}N(N-1) + \frac{1}{2}N(N+1) = \mathcal {O}(\log M + N^2) \end{aligned}$$
(5)
$$\begin{aligned} T_{\infty \text {best}}&= \log M + \sum \limits _{i=1}^{1} \left[ (N-i) + (N-i+1) \right] \nonumber \\&= \log M + (N-1) + N = \mathcal {O}(\log M + N) \end{aligned}$$
(6)

If the dominance relationship between different solutions can be obtained initially and stored in a matrix as described in Sect. 4.2, then the space required for obtaining the dominance matrix is \(\mathcal {O}(MN^2)\). Thus, the overall space complexity of this parallel version, when the dominance relationship between different solutions can be obtained beforehand, is \(\mathcal {O}(MN^2)\).

The maximum number of processors required by this approach is N. When the dominance relationship between different solutions can be obtained initially and stored in a matrix as described in Sect. 4.2, then the maximum number of processors required by parallel version-1 is \(2MN^2\).

4.4 Parallel version-2

The parallel version-2 of the naive approach is described in Algorithm 3. In this parallel version, each solution \(\textit{sol}\) can be simultaneously compared with other solutions (The outer for loop in lines \(4-15\) is implemented in a parallel manner). Also, a particular solution \(\textit{sol}\) is compared with other solutions \(\textit{sol}\,'\) in a parallel manner (the inner for loop in lines \(6-10\) is also implemented in a parallel manner). This scenario is shown in Fig. 2b. In this figure, all the N solutions in the top array are simultaneously compared with other solutions in the bottom array. Also, each of the solutions in the top array is compared with all the solutions in the bottom array, simultaneously.

figure c

After comparing \(\textit{sol}\) with all the other solutions simultaneously, we check whether \(\textit{sol}\) is dominated by any of the solutions or not (line \(11-14\)). For this purpose, the dominance relation of \(\textit{sol}\) with respect to all the other solutions is stored in an array ‘\(\textit{isDom}[\,\,\,]\)’. Let the size of ‘\(\textit{isDom}[\,\,\,]\)’ be N which stores whether a particular solution \(\textit{sol}\) is dominated by other solutions \(\textit{sol}\,' \in \mathbb {P}\). A True value in this array indicates that \(\textit{sol}\) is dominated by \(\textit{sol}\,'\) and False indicates that it is not dominated.

Fig. 3
figure 3

Simultaneously check whether a solution \(\textit{sol}\) is dominated by at least one of the solutions or not in \(\mathcal {O}(\log N)\) time considering N solutions

To know whether \(\textit{sol}\) is dominated by any other solution or not, the array ‘\(\textit{isDom}[\,\,\,]\)’ is processed in a parallel manner at \(\log N\) levels. At each level, an ‘OR’ operation is performed between two consecutive array cells. At the \(l^{th}\) level, \(\nicefrac {N}{2^l}\)OR’ operations are performed. We are considering ‘OR’ operations because a solution cannot be ranked even if it is dominated by at least one of the solutions and ‘OR’ gives True if any of its inputs is True. The time complexity of processing the ‘\(\textit{isDom}[\,\,\,]\)’ array in a parallel manner is \(\mathcal {O}(\log N)\) as the ‘OR’ operation is performed at \(\log N\) levels and at each level all the ‘OR’ operations are performed simultaneously. At the last level, if True is obtained, it means that \(\textit{sol}\) is dominated by at least one of the solutions.

Now, we discuss the processing of the ‘\(\textit{isDom}[\,\,\,]\)’ array in a parallel manner using an example.

Example 2

Let \(\mathbb {P} = \left\{ \textit{sol}_1,\textit{sol}_2,\ldots ,\textit{sol}_8 \right\} \) be a population of eight solutions. So, the size of the ‘\(\text {isDom}[\,\,\,]\)’ array will also be eight. Let solution \(\textit{sol}_3\) be checked to see whether it is dominated by other solutions in \(\mathbb {P}\) or not. The array ‘\(\text {isDom}[\,\,\,]\)’ stores whether \(\textit{sol}_3\) is dominated by the solutions of \(\mathbb {P}\) or not. Figure 3 shows the ‘\(\text {isDom}[\,\,\,]\)’ array. As the size of this array is eight, it is processed at \(3(=\log 8)\) levels. At the last level, we are getting True, so \(\textit{sol}_3\) is dominated by at least one of the solutions of population \(\mathbb {P}\).

After comparing each solution with respect to all the others, we check for the solutions which are dominated by at least one of the solutions. This process is implemented in a parallel manner (lines \(11-14\)). The solutions which are not dominated by any other solution are assigned rank one. This complete process is repeated until all the solutions are ranked.

The time complexity in the worst case is given by Eq. (7).

$$\begin{aligned} T_{\infty _\text {worst}}&= \sum \nolimits _{i=1}^{N} \left[ M + \lceil {\log (N-i+1)}\rceil \right] + (N-i+1) \nonumber \\&= MN + N\log N - (N-1) + \frac{1}{2}N(N+1) = \mathcal {O}(MN+N^2) \end{aligned}$$
(7)

The time complexity in the best case is given by Eq. (8).

$$\begin{aligned} T_{\infty _\text {best}}&= \sum \nolimits _{i=1}^{1} \left[ M + \lceil {\log (N-i+1)} \rceil \right] + (N-i+1) \nonumber \\&= (M + \log N) + N = \mathcal {O}(M+N) \end{aligned}$$
(8)

In this parallel version, an array of population size ‘\(\text {isDominated}[\,\,\,]\)’ is needed to store which solution is dominated by any other solution in the population. Along with this, for each solution \(\textit{sol}\), an array ‘\(\text {isDom}[\,\,\,]\)’ of the size equal to the size of population is also created. The space required to store the ‘\(\text {isDominated}[\,\,\,]\)’ array is \(\mathcal {O}(N)\). The space required to store ‘\(\text {isDom}[\,\,\,]\)’ array is also \(\mathcal {O}(N)\), and this array ‘\(\text {isDom}[\,\,\,]\)’ is created for each of the solutions. Thus, the overall space required to store ‘\(\text {isDom}[\,\,\,]\)’ is \(\mathcal {O}(N^2)\). Thus, the space complexity of this parallel version is \(\mathcal {O}(N^2)\).

In this parallel version, each solution \(\textit{sol}\) is simultaneously compared with other solutions. Also, a particular solution \(\textit{sol}\) is compared with other solutions simultaneously. Once a solution \(\textit{sol}\) has been compared with other solutions simultaneously (line \(6-10\)), we check whether \(\textit{sol}\) is dominated by any other solutions to whom it has been simultaneously compared (line \(11-14\)). The number of processors require to compare a solution with all the solutions simultaneously is N. The maximum number of processors required to check whether a solution is dominated by any other solution or not in a parallel manner is \(\nicefrac {N}{2}\). Thus, the maximum number of processors required by this approach is \(N^2\).

Here, if the dominance relationship between different solutions can be obtained initially as described in Sect. 4.2, then the time complexity in the worst case is given by Eq. (9) and the time complexity in the best case is given by Eq. (10).

$$\begin{aligned} T_{\infty _\text {worst}}&= \log M + \sum \nolimits _{i=1}^{N} \left[ 1 + \lceil {\log (N-i+1)} \rceil \right] + (N-i+1) \nonumber \\&= \log M + N + N\log N - (N-1) + \frac{1}{2}N(N+1) \nonumber \\&= \mathcal {O}(\log M + N^2) \end{aligned}$$
(9)
$$\begin{aligned} T_{\infty _\text {best}}&= \log M + \sum \nolimits _{i=1}^{1} \left[ 1 + \lceil {\log (N-i+1)}\rceil \right] + (N-i+1) \nonumber \\&= \log M + (\log N + N) = \mathcal {O}(\log M + N) \end{aligned}$$
(10)

If the dominance relationship between different solutions can be obtained initially and stored in a matrix as described in Sect. 4.2, then the space required for obtaining the dominance matrix is \(\mathcal {O}(MN^2)\). Thus, the overall space complexity of this parallel version, when the dominance relationship between different solutions can be obtained beforehand, is \(\mathcal {O}(MN^2)\).

The maximum number of processors required by this approach is \(N^2\) without considering dominance matrix. The maximum number of processors required to obtain the dominance matrix in a parallel manner is \(2MN^2\). Thus, the maximum number of processors required by parallel version-2, when the dominance relation between the solution is obtained in constant time considering dominance matrix, is \(2MN^2\).

4.5 Parallel version-3

The parallel version-3 of the naive approach is described in Algorithm 4. In this case, each solution \(\textit{sol}\) can be simultaneously compared with other solutions (the outer for loop in lines \(5-16\) is implemented in a parallel manner). Also, a particular solution \(\textit{sol}\) is compared with other solutions \(\textit{sol}\,'\) in a parallel manner (the inner for loop in lines \(7-11\) is also implemented in a parallel manner). This scenario is shown in Fig. 2b.

figure d

In the previous versions of the naive approach, the solutions which are ranked are removed from the population. However, in this version, the ranked solutions are not removed from the population but instead, they are marked so that they are not ranked again. After comparing solutions with each other, we have to check whether a solution \(\textit{sol}\) is dominated by another solution \(\textit{sol}\,'\) or not. This can be done in a parallel manner by processing an array of size N in \(\mathcal {O}(\log N)\) time as discussed in Parallel Version-2.

Fig. 4
figure 4

Simultaneously check whether N solutions are ranked or not in \(\mathcal {O}(\log N)\) time considering N solutions

At last, we check the solutions which are not dominated by any other solutions and are also not ranked, in a parallel manner (lines \(17-20\)). After assigning rank to the solutions which are not dominated by any other non-ranked solutions, we check whether all the solutions have been ranked or not. This can also be checked in \(\mathcal {O}(\log N)\) time if performed in parallel. This whole process is repeated until all the solutions are ranked.

Now, we discuss the process of knowing whether all the solutions are ranked or not in a parallel manner using an example.

Example 3

Let \(\mathbb {P} = \left\{ \textit{sol}_1,\textit{sol}_2,\ldots ,\textit{sol}_8 \right\} \) be a population of eight solutions. Consider five solutions \(\left\{ \textit{sol}_1,\textit{sol}_4,\textit{sol}_5,\textit{sol}_6\right. \)\(\left. ,\textit{sol}_8 \right\} \) which are ranked. In version-1 and version-2, after obtaining the set of solutions belonging to a particular front, these solutions are removed from the population. However, in version-3 the solutions are not deleted and to know which solutions are ranked or not, an array of size equal to the population size is considered. The array corresponding to eight solutions is shown in Fig. 4.

When a solution is ranked, the corresponding cell in the array is marked as True which signifies that the solution is ranked. After obtaining the solutions belonging to a particular front, we have to check whether all the solutions have been ranked or not. For this purpose, the array is processed in a parallel manner. As the length of the array is N (equal to the population size), so the array is processed at \(\log N\) levels where at each level, an ‘AND’ operation is performed in consecutive array cells. At the last level, if True is obtained, then all the solutions are ranked; otherwise, all the solutions are not ranked. In Fig. 4, as there are eight solutions, so the parallel operation is performed at three different levels. At the last level, False is obtained after an ‘AND’ operation which means that all the solutions are not ranked.

The time complexity in the worst case is given by Eq. (11).

$$\begin{aligned} T_{\infty _\text {worst}}&= \sum \limits _{i=1}^{N} \left[ M + \lceil {\log N} + 1 \rceil \right] + \lceil {\log N} \rceil \nonumber \\&= MN + N + 2N\lceil {\log N}\rceil = \mathcal {O}(MN+N\log N) \end{aligned}$$
(11)

The time complexity in the best case is given by Eq. (12).

$$\begin{aligned} T_{\infty _\text {best}}&= \sum \nolimits _{i=1}^{1} \left[ M + \lceil {\log N} \rceil + 1 \right] + \lceil {\log N} \rceil \nonumber \\&= M + 1 + 2\lceil {\log N}\rceil = \mathcal {O}(M+\log N) \end{aligned}$$
(12)

In this parallel version, an array ‘\(\text {isRanked}[\,\,\,]\)’ of size N is created to store which solution has been ranked. The space required to store this array is \(\mathcal {O}(N)\). The analysis of the space complexity remains the same as the parallel version-2. Thus, the space complexity of this parallel version is \(\mathcal {O}(N^2)\).

In this parallel version, each solution \(\textit{sol}\) is simultaneously compared with other solutions. Also, a particular solution \(\textit{sol}\) is compared with other solutions simultaneously. Once a solution \(\textit{sol}\) has been compared with other solutions simultaneously (line \(7-11\)), we check whether \(\textit{sol}\) is dominated by any other solutions to whom it has been simultaneously compared (line \(12-15\)). The number of processors require to compare a solution with all the solutions simultaneously is N. The maximum number of processors required to check whether a solution is dominated by any other solution or not in a parallel manner is \(\nicefrac {N}{2}\). After this, for each solution \(\textit{sol}\) which is not dominated by any other solution and has been already ranked is assigned a rank. This operation can be carried out in parallel using maximum N processors. Thus, the maximum number of processors required by this approach is \(N^2\).

Here, if the dominance relationship between different solutions can be obtained initially as described in Sect. 4.2, then the time complexity in the worst case is given by Eq. (13) and the time complexity in the best case is given by Eq. (14).

$$\begin{aligned} T_{\infty _\text {worst}}&= \log M + \sum \nolimits _{i=1}^{N} \left[ 1 + \lceil {\log N}\rceil +1 \right] + \lceil {\log N}\rceil \nonumber \\&= \log M + 2N + 2N\lceil {\log N}\rceil = \mathcal {O}(\log M + N\log N) \end{aligned}$$
(13)
$$\begin{aligned} T_{\infty _\text {best}}&= \log M + \sum \nolimits _{i=1}^{1} \left[ 1 + \lceil {\log N}\rceil +1 \right] + \lceil {\log N}\rceil \nonumber \\&= \log M + 2 + 2\lceil {\log N}\rceil = \mathcal {O}(\log M + \log N) \end{aligned}$$
(14)

If the dominance relationship between different solutions can be obtained initially and stored in a matrix as described in Sect. 4.2, then the space required for obtaining the dominance matrix is \(\mathcal {O}(MN^2)\). Thus, the overall space complexity of this parallel version, when the dominance relationship between different solutions can be obtained beforehand, is \(\mathcal {O}(MN^2)\).

The maximum number of processors required by this approach is \(N^2\) without considering dominance matrix. The maximum number of processors required to obtain the dominance matrix in a parallel manner is \(2MN^2\). Thus, the maximum number of processors required by parallel version-3, when the dominance relation between the solution is obtained in constant time considering dominance matrix, is \(2MN^2\).

The worst case time complexity of the naive approach is \(\mathcal {O}(MN^3)\), and the best case time complexity is \(\mathcal {O}(MN^2)\). The time complexity of the parallel version of the non-dominated sorting was proved to be \(\mathcal {O}(M+N)\) by Smutnicki et al. (2014). The worst and the best case time complexities of the parallel version-1 and parallel version-2 of the naive approach are \(\mathcal {O}(\log M+N^2)\) and \(\mathcal {O}(\log M+N)\), respectively. The best case time complexity is better than the time complexity as reported in Smutnicki et al. (2014). However, the worst case time complexity is not. The worst and best case time complexities of the parallel version-3 of the naive approach are \(\mathcal {O}(\log M+N\log N)\) and \(\mathcal {O}(\log M+\log N)\), respectively. The best case time complexity of the parallel version-3 is \(\mathcal {O}(\log M+\log N)\) which is better than the time complexity as reported in Smutnicki et al. (2014). However, the worst case time complexity of the parallel version-3 is \(\mathcal {O}(\log M+N\log N)\) which is not good as compared to the time complexity reported in Smutnicki et al. (2014). However, as discussed in Sect. 1, as the number of fronts decreases, the naive approach performs near to its best case. So as the evolutionary algorithm proceeds, the parallel naive approach can be advantageous because the number of non-dominated fronts start reducing.

5 Conclusions & future work

In this paper, we have explored the scope of parallelism in the naive approach. We have identified parallelism in the naive approach in three different ways. The worst case time complexity of the parallel version is \(\mathcal {O}(\log M + N\log N)\), and the best case time complexity is \(\mathcal {O}(\log M + \log N)\). The best case occurs when all the solutions are in a single front. As the evolutionary algorithm proceeds, the number of fronts decreases and the approach performs either in the best case or near to its best case. As part of our future work, we would like to find the scope of parallelism in other approaches as well. It would also be interesting to see the actual speedup when different parallel methods of the naive approach are implemented.