1 Introduction

The Travelling Salesman Problem (TSP) stands out as one of the most widely studied combinatorial optimization problems. It is a challenging problem that belongs to the class of NP-complete problems [1,2,3]. The problem statement is simple and easy to understand, but the solution is difficult to obtain, since, in the context of optimization problems, an optimal solution is required. Solving small data instances does not need substantial time, but the difficulty arises when solving large data instances of the TSP. Accordingly, several heuristic and metaheuristic algorithms have been investigated to solve this problem and obtain approximate solutions instead of an optimal one. Recently, Pandiri and Singh [4] presented two swarm intelligence approaches for solving multi-depot salesmen problems with load balancing: the first is based on the artificial bee colony algorithm, and the second on an invasive weed optimization algorithm. Also, a metaheuristic method known as feasibility-assured TSP-likened scheduling has recently been presented, in which the scheduling problem is translated into a construction graph, similar to a TSP instance, so that it can benefit from the advantages of the ant colony optimization algorithm [5].

Heuristic algorithms can be categorized into constructive heuristics and improvement heuristics [6]. In the first category, a sub-optimal solution is built by starting with an empty solution and continuously expanding it until a complete solution is acquired. In the second category, local search algorithms start with a sub-optimal solution and then attempt to improve it further, deciding at each step which solution in the neighborhood of the current solution to move to. This process continues until no improvement can be made. Local search algorithms have been used to solve many optimization problems, such as the TSP and multiple sequence alignment tasks. For example, da Silva et al. [7] presented an algorithm called AlineaGA, a genetic algorithm (a metaheuristic approach) with local search optimization, for solving multiple sequence alignment tasks more accurately.

One local search algorithm with significant influence is the 2-opt algorithm [8]. This algorithm tries to improve a sub-optimal solution obtained from any constructive heuristic algorithm. The quality of the improved solutions generated by the 2-opt algorithm can be within 5% above the Held-Karp lower bound [9]. However, as the TSP data instance size grows, the time needed for comparing graph edges in the 2-opt algorithm grows remarkably. Therefore, researchers in this field have tried to accelerate the 2-opt local search algorithm by means of parallelism.

Several approaches have been proposed for parallelizing this algorithm. One of them is the geometric partitioning approach proposed by Karp [10], a recursive scheme that partitions the cities into rectangles based on their x and y coordinates and then assigns each rectangle to a processor, which runs the 2-opt algorithm on the set of cities within that rectangle.

Another approach is based on tour partitioning and was proposed by Allwright and Carpenter [11]. The authors first applied a heuristic algorithm to obtain an initial tour, which is divided into K segments of length \(M/P\), where \(M\) is the length of the initial tour and P is the number of processors. Subsequently, each segment is assigned to a processor, which creates a sub-tour from it by adding an edge between the endpoints of the segment. Afterward, the 2-opt algorithm is applied to improve the sub-tour, which is then combined with the sub-tours from the other processors to form the final improved tour. The drawback of this approach is that the algorithm may halt before reaching the sub-optimal solution. Therefore, Verhoeven et al. [12] modified the earlier approach to guarantee that the tour partitioning converges to a feasible solution. They tested their approach on several data instances from the TSP Library (TSPLIB) [13] using a linear topology network of 512 transputers; the average quality of the obtained solutions was within 8.3% of the optimal solution.

Other approaches parallelized local search algorithms using Graphics Processing Units (GPUs). Rocki and Suda [14] copied the coordinates of each city into shared memory and then assigned one 2-opt algorithm to each GPU thread; the average quality of the obtained solutions was within 8.9% of the optimal solution. O'Neil and Burtscher [15] fully exploited the parallelism of the GPU hardware to accelerate the 2-opt local search algorithm for solving the TSP; the average quality of their obtained solutions was within 8.24% of the optimal solution, as shown by Zhou et al. [16] in their comparative evaluation. Moreover, Qiao and Créput [17] presented a parallel scheme similar to the approach adopted in [12], but on a GPU. Their approach divided the initial tour into disconnected sub-tours and then applied the 2-opt algorithm on each sub-tour; the average quality of the obtained solutions was within 11.4% of the optimal solution.

Despite the acceleration of the 2-opt algorithm achieved by the previous approaches, a noticeable degradation in tour quality was recorded. Therefore, in this paper a repetitive approach is adopted instead of a partitioning approach, in order to gain high speedup in a much simpler way without paying a penalty in solution quality. Furthermore, this approach exploits the iterative structure offered by the Optical Transpose Interconnection System (OTIS), which is constructed from groups of processors connected using optical links, with the processors inside each group connected according to a factor network using electronic links [18]. The factor network that determines how the processors are connected inside each group can be either a basic network, such as the hypercube or mesh [19], or a hybrid network, such as the Hyper Hexa-Cell (HHC) [20] or Mesh of Trees (MOT) [21]. The emergence of these optoelectronic architectures motivates researchers in the parallel processing field to parallelize algorithms for various problems on such architectures.

Various problems have been solved on OTIS optoelectronic architectures. For example, in [22,23,24], sorting, prefix sum, routing, consecutive sum, data accumulation, and matrix multiplication algorithms were applied on OTIS-Mesh. Load balancing and wormhole routing were implemented on OTIS networks [25,26,27]. Routing, sorting, and communication algorithms were implemented on OTIS-HHC [28, 29]. Parallel prefix and parallel enumeration sorting were implemented on OTIS-MOT [30, 31]. A parallel nearest neighbor algorithm for solving the TSP was applied on OTIS-Hypercube and OTIS-Mesh [32]. However, to date, no work has applied a local search algorithm for solving the TSP on OTIS-HHC or OTIS-MOT. Thus, in this paper, the OTIS-HHC and OTIS-MOT optoelectronic architectures [20, 21] are selected to solve the TSP using the Parallel Repetitive 2-opt (PRTO) algorithm. This algorithm is evaluated analytically under the following performance metrics: parallel time complexity, number of communication steps, speedup, efficiency, cost, and communication cost. Its performance is also evaluated by several simulation experiments conducted on both optoelectronic architectures. Moreover, the PRTO algorithm is applied on the factor network of each optoelectronic architecture to show the advantage of each optoelectronic architecture over its factor network. Consequently, OTIS-HHC performance is compared with HHC and OTIS-MOT performance is compared with MOT. Also, a comparison between the PRTO algorithm and the Parallel Repetitive Nearest Neighbor (PRNN) algorithm for solving the TSP is carried out in terms of speedup, efficiency, and solution quality, which shows the superiority of the PRTO algorithm over the PRNN algorithm on the OTIS-HHC and OTIS-MOT optoelectronic architectures. Moreover, a realistic comparison is conducted in terms of speedup and efficiency between the PRTO algorithm on OTIS-HHC and the MAX-MIN Ant System (MMAS) on the Sunway Blue Light supercomputer for solving the TSP, which shows the superiority of the PRTO algorithm over the MMAS algorithm. To the best of our knowledge, this is the first work to parallelize the 2-opt algorithm on optoelectronic architectures, and PRNN is the only heuristic algorithm that has previously been used for solving the TSP on optoelectronic architectures.

The organization of this paper is as follows: Section 2 introduces background on the OTIS optoelectronic architectures, namely OTIS-HHC and OTIS-MOT. Section 3 illustrates the PRTO algorithm for solving the TSP over OTIS-HHC and OTIS-MOT. Section 4 presents an analytical evaluation of this algorithm on OTIS-HHC and OTIS-MOT. Section 5 presents the simulation setup and results. Finally, Section 6 concludes and summarizes the overall work.

2 OTIS optoelectronic architecture

In interconnection networks, the processors are connected in a way that enables them to communicate, and the way they are connected plays an important role in the overall performance of the parallel system. This encourages researchers in the parallel processing field to propose interconnection networks with attractive properties such as low diameter, high bisection width, and favorable minimum and maximum node degrees. In this respect, researchers have proposed hybrid interconnection networks that exploit the best features of two or more interconnection networks. For example, the HHC exploits the low diameter of a hypercube and the static node degree of a ring. Moreover, this hybridization also occurs at the link level: in OTIS optoelectronic architectures there are two types of links, the optical links that connect groups of processors and the electronic links that connect processors within each group. This can lower the design complexity of the interconnection network by applying an iterative structure among processors, such that a higher dimension of the factor network can be simulated using a lower dimension of the OTIS architecture of that factor network. Therefore, two OTIS optoelectronic architectures with hybrid factor networks were chosen to solve the TSP using the PRTO algorithm, namely OTIS-HHC and OTIS-MOT. The following subsections illustrate the structure of OTIS-HHC and OTIS-MOT.

2.1 OTIS-HHC

OTIS-HHC [20] combines the topological structures of OTIS, HHC, and the hypercube. The HHC is constructed by building a hexa-cell of 6 processors as the first dimension, with three more links added to this one-dimensional hexa-cell, as shown in Fig. 1a. The labels of the processors in a one-dimensional HHC are assigned such that each processor in the lower triangle has a label that differs by only one significant bit from its direct neighbor in the upper triangle; therefore, the label \(<011>\) is not used. A two-dimensional HHC is then constructed from two one-dimensional HHCs connected as a hypercube. Each processor in a d-dimensional HHC has a label composed of two parts: the first part is a sub-label that refers to the location of the one-dimensional HHC in its corresponding (d-1)-dimensional hypercube, while the second part is the label of the processor within that one-dimensional HHC. Figure 1b shows the 2D-HHC and its labeling. For example, the label \(<00,110>\) refers to processor six in one-dimensional HHC sub-group 0, and \(<01,110>\) refers to processor six in one-dimensional HHC sub-group 1, as shown in Fig. 1b. Each one-dimensional HHC can be connected to another one-dimensional HHC, where the first one is called sub-group 0 and the second one is called sub-group 1, as depicted in Fig. 1b.

Fig. 1 a One-dimensional HHC. b Two-dimensional HHC [20]

The architecture of OTIS-HHC is based on connecting groups of a d-dimensional HHC via optical links, where each group contains \(P_{G}\) processors. In particular, a d-dimensional OTIS-HHC consists of G groups that are connected via optical links, each composed of a d-dimensional HHC. OTIS-HHC has two architectural structures. In the first structure, the number of groups equals half the number of processors in each group; this case is referred to as \(G=P/2\), where G is the number of groups and P is the number of processors, as depicted in Fig. 2a. In this figure, the one-dimensional OTIS-HHC is composed of 3 groups with 6 processors per group. Note that each group in this one-dimensional OTIS-HHC has two optical links connecting it to the other groups. In the second structure, the number of groups G in OTIS-HHC equals the number of processors P located in each group; this case is referred to as \(G=P\). Figure 2b depicts the \(G=P\) case, where the one-dimensional OTIS-HHC is composed of 6 groups with 6 processors per group. Note that each group in this one-dimensional OTIS-HHC has 5 optical links connecting it to the other groups. The connection between groups is made by transposing the indices of the processor and the group. For example, node \(<1, 5>\) is connected with node \(<5, 1>\) by an optical link; that is, processor five in the first group is connected with processor one in the fifth group by an optical link.
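To make the transpose rule concrete, the following small sketch (an illustrative example of ours; the class and method names are not from the paper) swaps the group and processor indices to obtain the optical neighbor of a node.

```java
/** Illustrative sketch of the OTIS transpose rule described above: node <g, p>,
 *  i.e., processor p in group g, is connected by an optical link to node <p, g>. */
public final class OtisTranspose {

    /** Returns {group, processor} of the node reached through the optical link of <group, processor>. */
    static int[] opticalNeighbor(int group, int processor) {
        return new int[]{processor, group};   // swap the group and processor indices
    }

    public static void main(String[] args) {
        int[] peer = opticalNeighbor(1, 5);   // processor five in group one ...
        System.out.println("<" + peer[0] + ", " + peer[1] + ">");   // ... prints <5, 1>
    }
}
```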

Fig. 2 The architecture of a one-dimensional OTIS-HHC when a G = P/2 and b G = P [20]

2.2 OTIS-MOT

The structure of OTIS-MOT is based on combining the best characteristics of tree and mesh interconnection networks while avoiding their drawbacks; in addition, it inherits the features of optical communication links. OTIS-MOT combines the attractive features of the MOT, namely a small diameter and a large bisection width, with the attractive features of OTIS interconnection networks. An \(n \times n\) OTIS-MOT contains \(n^{2}\) groups, each consisting of \(n \times n\) (i.e., \(P_{G}\)) processors. For example, a \(3\times 3\) OTIS-MOT has 9 groups, each consisting of 9 processors [21]. The OTIS-MOT lattice is divided into rows and columns of groups of processors. Inside each group, the processors are also organized into rows and columns and connected via electronic links in two ways. The first is row-wise, where the processors in each row are connected to form a binary tree whose root is the first processor in the row, such that within group (i, j) the processor at row R and column C is directly connected to the processor at row R and column 2C, and to the processor at row R and column 2C+1, if they exist. The second is column-wise, where the processors in each column are connected to form a binary tree whose root is the first processor in the column, such that within group (i, j) the processor at row R and column C is directly connected to the processor at row 2R and column C, and to the processor at row 2R+1 and column C, if they exist, as depicted in Fig. 3 [21]. Different groups are connected via optical links based on the OTIS rule, which connects processor i in group j with processor j in group i.
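As a small illustration of the intra-group wiring just described (a sketch under the 1-based row and column indexing used above; the names are ours), the electronic children of a processor inside one \(n \times n\) MOT group can be computed as follows.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the electronic links inside one n x n MOT group, assuming 1-based
 *  row and column indices; the tree roots are at row 1 and column 1. */
public final class MotGroupLinks {

    /** Children of processor (r, c) in the row-wise binary tree of row r. */
    static List<int[]> rowTreeChildren(int r, int c, int n) {
        List<int[]> children = new ArrayList<int[]>();
        if (2 * c <= n)     children.add(new int[]{r, 2 * c});       // column 2C
        if (2 * c + 1 <= n) children.add(new int[]{r, 2 * c + 1});   // column 2C + 1
        return children;
    }

    /** Children of processor (r, c) in the column-wise binary tree of column c. */
    static List<int[]> columnTreeChildren(int r, int c, int n) {
        List<int[]> children = new ArrayList<int[]>();
        if (2 * r <= n)     children.add(new int[]{2 * r, c});       // row 2R
        if (2 * r + 1 <= n) children.add(new int[]{2 * r + 1, c});   // row 2R + 1
        return children;
    }
}
```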

Fig. 3 3×3 OTIS-MOT [21]

3 PRTO algorithm

Heuristic local search algorithms have achieved good results in solving the TSP. They start with a sub-optimal solution and then attempt to improve it until no improvement can be made. One of the well-known local search algorithms is the 2-opt algorithm, which tries to improve a sub-optimal solution obtained from any constructive heuristic algorithm. The improvement starts from a given tour and performs exchanges between cities in this tour in order to find a better solution. In other words, the 2-opt algorithm considers each pair of cities within the tour and exchanges them if, and only if, the new tour is shorter than the old one; this continues until no further improvement can be found, as illustrated in Fig. 4. When the generated tour is optimal in the context of the 2-opt algorithm, the process stops. The time complexity of this algorithm is \(O(N^{2})\), where N is the number of cities.
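The following is a minimal sketch of this sequential 2-opt pass (our own illustrative implementation, not the pseudocode of Fig. 4; a symmetric distance matrix is assumed): it reverses the tour segment between two positions whenever doing so shortens the tour and stops when no improving exchange remains.

```java
/** Minimal sketch of the sequential 2-opt improvement heuristic. */
public final class TwoOpt {

    /** Length of the (cyclic) tour under the distance matrix dist. */
    static double length(int[] tour, double[][] dist) {
        double len = 0.0;
        for (int i = 0; i < tour.length; i++) {
            len += dist[tour[i]][tour[(i + 1) % tour.length]];
        }
        return len;
    }

    /** Repeatedly applies improving 2-opt exchanges to the tour until none remains. */
    static void improve(int[] tour, double[][] dist) {
        int n = tour.length;
        boolean improved = true;
        while (improved) {
            improved = false;
            for (int i = 0; i < n - 1; i++) {
                for (int j = i + 1; j < n; j++) {
                    int a = tour[i], b = tour[i + 1];        // edge (a, b) to be removed
                    int c = tour[j], d = tour[(j + 1) % n];  // edge (c, d) to be removed
                    double delta = dist[a][c] + dist[b][d] - dist[a][b] - dist[c][d];
                    if (delta < -1e-9) {                     // the exchange shortens the tour
                        reverse(tour, i + 1, j);             // 2-opt move: reverse the segment
                        improved = true;
                    }
                }
            }
        }
    }

    /** Reverses tour[from..to] inclusive. */
    static void reverse(int[] tour, int from, int to) {
        while (from < to) {
            int tmp = tour[from]; tour[from] = tour[to]; tour[to] = tmp;
            from++; to--;
        }
    }
}
```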

Fig. 4 The sequential 2-opt algorithm

The quality of the improved solution depends directly on the initial route of a given graph, and even the best initial tour does not guarantee the best improved tour. Thus, to obtain the best improved tour of a given graph, a repetitive 2-opt algorithm must be applied to each tour that can be generated from the graph; the best route is then selected among all the improved routes, with a time complexity of \(O(N^{3})\). This approach guarantees obtaining the best-quality solution in the context of the 2-opt algorithm. However, it requires an iterative process that consumes the available resources and considerable CPU time, particularly for large instances of cities. Therefore, in this section, the repetitive 2-opt heuristic algorithm is selected to be parallelized over the OTIS-HHC and OTIS-MOT optoelectronic architectures. The PRTO algorithm for both optoelectronic architectures is illustrated first, and its analytical evaluation is presented in Section 4.
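Building on the sketch above, the repetitive variant can be outlined as follows (again illustrative only; we assume, as in the paper's pre-processing phase, that the initial tour for each starting city is built with the nearest neighbor heuristic): 2-opt is run once per starting city and the shortest improved tour is kept.

```java
/** Illustrative sketch of the repetitive 2-opt idea: improve one initial tour per
 *  starting city and keep the best result. Relies on the TwoOpt sketch above. */
public final class RepetitiveTwoOpt {

    /** Nearest neighbor tour that starts from the given city. */
    static int[] nearestNeighborTour(int start, double[][] dist) {
        int n = dist.length;
        boolean[] visited = new boolean[n];
        int[] tour = new int[n];
        tour[0] = start;
        visited[start] = true;
        for (int i = 1; i < n; i++) {
            int prev = tour[i - 1], next = -1;
            for (int c = 0; c < n; c++) {
                if (!visited[c] && (next == -1 || dist[prev][c] < dist[prev][next])) next = c;
            }
            tour[i] = next;
            visited[next] = true;
        }
        return tour;
    }

    /** Runs 2-opt once per starting city and returns the shortest improved tour found. */
    static int[] best(double[][] dist) {
        int[] bestTour = null;
        double bestLen = Double.POSITIVE_INFINITY;
        for (int start = 0; start < dist.length; start++) {
            int[] tour = nearestNeighborTour(start, dist);
            TwoOpt.improve(tour, dist);
            double len = TwoOpt.length(tour, dist);
            if (len < bestLen) { bestLen = len; bestTour = tour; }
        }
        return bestTour;
    }
}
```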

3.1 PRTO algorithm on OTIS-HHC

The PRTO algorithm in this paper is designed based on the interesting features of the selected OTIS optoelectronic architectures, such as the iterative structure among groups and the existence of optical links between these groups. The algorithm is composed of four phases, namely: the load balancing phase, the data distribution phase, the local repetitive 2-opt phase, and the data combining phase. The PRTO algorithm on OTIS-HHC, shown in Fig. 5, is illustrated in more detail as follows:

Fig. 5 PRTO algorithm on OTIS-HHC

Phase 1: Load Balancing Phase

Assume that processor \(<0, 0>\) in Group 0 (\(G_{0}\)) is the Main Coordinator (MC) processor, which handles the balancing of the cities among the processors. The MC processor in \(G_{0}\) partitions the N cities among the P processors, such that each processor takes N/P cities. It is important to mention that TSPLIB contains data instances of various sizes; these sizes are not powers of two, so \(N/P\) will generate fractions that can be considered as extra cities. The main goal of this phase is to ensure that the extra cities are distributed among the processors such that at most one extra city is allocated to each of the balanced processors. The cities allocated to each processor are stored in the Allocated Cities Array (ACA).
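As a small illustration of this balancing rule (a sketch under our reading of the phase: every processor receives \(\lfloor N/P\rfloor\) cities and the first N mod P processors receive one extra city; the names are ours), the per-processor allocation counts can be computed as follows.

```java
/** Sketch of the load balancing rule: floor(N/P) cities per processor, with the
 *  N mod P leftover cities spread one each over the first processors. */
public final class LoadBalancing {

    /** Returns how many cities each of the P processors receives. */
    static int[] allocationCounts(int numCities, int numProcessors) {
        int base = numCities / numProcessors;        // cities every processor gets
        int extra = numCities % numProcessors;       // leftover cities to spread out
        int[] counts = new int[numProcessors];
        for (int p = 0; p < numProcessors; p++) {
            counts[p] = base + (p < extra ? 1 : 0);  // at most one extra city each
        }
        return counts;
    }

    public static void main(String[] args) {
        // Class D example from Section 5: 6880 cities over 1152 processors
        int[] counts = allocationCounts(6880, 1152);
        System.out.println(counts[0] + " and " + counts[1151]);  // prints "6 and 5"
    }
}
```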

Phase 2: Data Distribution Phase

The Main Group (MG), which is \(G_{0}\) in OTIS-HHC, contains the MC processor that holds the Distance Matrix (DM). The DM, which represents the distances between the cities in the graph, must be distributed to all other processors in the optoelectronic architecture. Moreover, each group contains a Group Coordinator (GC) processor, which is the processor that connects group \(G_{i}\) with the MG via an optical link.

The data distribution phase is composed of the following steps:

I. Electronic Main Group Distribution (lines 2-6 in Fig. 5): In this step, the MC processor sends a copy of the DM of size \(N^{2}\), where N is the number of cities, to all other processors in the optoelectronic architecture. The whole matrix must be sent to all processors because each processor generates an initial tour using the nearest neighbor algorithm, which is then improved using the selected local search algorithm. In addition, the MC processor sends the ACA to all processors in OTIS-HHC. Consequently, in the intra-HHC distribution, the MC processor initiates a one-to-all broadcast in the one-dimensional HHC of sub-group 0 to send the DM through the electronic links by utilizing the one-bit difference between processor labels. This process is applied in three steps, as illustrated in Fig. 6a: processor 000 sends the DM and ACA to processor 001, then processor 001 sends the DM and ACA to processor 010, and finally the processors in the upper triangle send (in parallel) the DM and ACA to the processors in the lower triangle. Then, in the d-dimensional distribution, all processors in the one-dimensional HHC send (in parallel) both the DM and ACA to their connected processors in the other HHC sub-groups; for example, the processor with label \(<0, 010>\) sends the communicated message to the processor with label \(<1, 010>\), as depicted in Fig. 6b (a conceptual sketch of this one-bit broadcast pattern is given after this list).

II. Optical Distribution of Data (lines 7-8 in Fig. 5): All processors in the MG (in parallel) start sending the DM and ACA through the optical links to their transpose processors in the other groups. This requires one OTIS move. At the end of this step, all GC processors have a copy of the DM and ACA.

III. Inter-Group Distribution of Data (lines 9-10 in Fig. 5): Steps 2-5 in Fig. 5 are repeated by each GC processor in each group of OTIS-HHC. The intra-group distribution and the d-dimensional distribution are accomplished by each GC in parallel to resend the DM and ACA to all other processors in the optoelectronic architecture. Finally, at the end of this phase, P copies of the DM and ACA are distributed to the P processors in the optoelectronic architecture.
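As referenced in step I, the d-dimensional part of the distribution follows a one-bit (hypercube-style) broadcast pattern. The sketch below is a conceptual illustration of ours, not the paper's implementation; it lists, per step, which sub-group labels forward the message by flipping one bit of their label.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual sketch of a one-to-all broadcast over hypercube-style labels: in
 *  step k, every label that already holds the message forwards it to the label
 *  that differs from it in bit k (recursive doubling). */
public final class HypercubeBroadcast {

    /** Returns, for each step, the list of {source, destination} label pairs. */
    static List<List<int[]>> schedule(int dimensions) {
        List<List<int[]>> steps = new ArrayList<List<int[]>>();
        List<Integer> holders = new ArrayList<Integer>();
        holders.add(0);                              // label 0 holds the DM and ACA initially
        for (int bit = 0; bit < dimensions; bit++) {
            List<int[]> sends = new ArrayList<int[]>();
            List<Integer> receivers = new ArrayList<Integer>();
            for (int src : holders) {
                int dst = src ^ (1 << bit);          // flip one bit of the label
                sends.add(new int[]{src, dst});
                receivers.add(dst);
            }
            holders.addAll(receivers);               // senders and receivers now hold the data
            steps.add(sends);
        }
        return steps;                                // one communication step per dimension
    }
}
```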

Fig. 6 Distribution of the Distance Matrix (DM) on HHC

Phase 3: Local Repetitive 2-Opt Phase (lines 11-12 in Fig. 5):

All processors in OTIS-HHC, in parallel, apply the sequential 2-opt algorithm for each city belonging to the set of cities allocated to that processor in the load balancing phase. The algorithm is applied multiple times, each time considering a different city as the starting city, resulting in a different route for each starting city; these routes are stored in a Route Matrix (RM), such that each processor has its own RM.

Phase 4: Data Combining Phase

Since the structure of OTIS-HHC is constructed by combining the structural properties of the HHC and the hypercube, each HHC graph consists of HHC sub-groups that differ in one bit position of their labels. Therefore, the combining phase exploits these features of OTIS-HHC, where three combining levels are adopted based on three processors: the MC processor, the GC processor, and the Sub-Group Coordinator (SGC) processor, where the SGC is the processor with label \(<000>\) in each sub-group. Thus, there are multiple SGCs within the same group of OTIS-HHC. The following are the required steps of the data combining phase:

I. Inter-Group Data Combining (lines 13-18 in Fig. 5): This step combines all the RMs from all the processors in all groups at their associated GC processors via electronic links. In the intra-HHC combining, each processor in the one-dimensional HHC routes its RM to the SGC processor via a gather scheme; this is done by reversing the steps applied in Fig. 6a. Then, in the d-dimensional HHC combining, each SGC processor sends the gathered RMs to its directly connected SGC processor. At the end of this step, all GC processors in OTIS-HHC have the accumulated RMs as a Group Route Matrix (GRM), which will be sent through the optical links. Note that the size of the communicated message grows with the size of the combined matrices.

II. Optical Data Combining (lines 19-20 in Fig. 5): All GC processors in the whole OTIS-HHC, except the GC processor in \(G_{0}\), send their GRMs via optical links to their corresponding processors in the main group \(G_{0}\).

III. Main Group Data Combining (lines 21-22 in Fig. 5): This step repeats the steps in lines 13-18 of Fig. 5, but now within \(G_{0}\), where each processor combines the collected GRM with its own RM and sends it to its directly connected processor using a gather scheme; then, in the d-dimensional HHC combining, all SGC processors send the collected GRMs to their directly connected SGC processors. The process continues until the MC processor has collected all GRMs from all processors in \(G_{0}\). At the end of this communication, the MC processor can find the minimum route among all the collected routes.

3.2 PRTO algorithm on OTIS-MOT

The PRTO algorithm on OTIS-MOT consists of four phases: the load balancing phase, the data distribution phase, the local repetitive 2-opt phase, and the data combining phase. Only the distribution and combining phases represent the communication time in OTIS-HHC and OTIS-MOT, and these phases depend directly on the topological structure of the optoelectronic architectures. Consequently, only the distribution and combining phases are illustrated in more detail in this section.

Phase 2: Data Distribution Phase

The data distribution phase is composed of the following steps:

I. Electronic Main Group Distribution (lines 2-6 in Fig. 7): In the data distribution phase, the MC processor sends a copy of the DM and ACA to all other processors in OTIS-MOT. The distribution is carried out using a one-to-all broadcast in two phases. The first is the row-wise phase, where the MC processor broadcasts to all other processors in its row tree. The second is the column-wise phase, where a one-to-all broadcast is initiated by each processor that received the DM in the earlier phase; this broadcast sends the DM to all processors in the same column tree as the sender processor through electronic links. Each phase takes \(\log n\) steps; thus, the whole electronic main group distribution needs \(2\log n\) communication steps (a worked count of the overall distribution steps is given after this list). Now, all the nodes that received the DM and ACA in the MG start the optical distribution of data, as shown in the next step.

II. Optical Distribution of Data (lines 7-8 in Fig. 7): All nodes in \(G_{0}\) (in parallel) start sending the DM and ACA through the optical links to their transpose processors in the other groups. This requires one OTIS optical move.

III. Inter-Group Distribution of Data (lines 9-10 in Fig. 7): Steps 2-5 in Fig. 7 are repeated by each GC processor in each group of OTIS-MOT. The row-wise and column-wise distributions are accomplished by each GC in parallel to resend the DM and ACA to all other processors in the optoelectronic architecture.
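For reference (our own arithmetic, assuming each tree phase takes \(\lceil \log n\rceil\) steps when n is not a power of two), the three steps above give a total of

$$ Steps_{dist}= 2\lceil \log n\rceil + 1 + 2\lceil \log n\rceil = 4\lceil \log n\rceil + 1 $$

communication steps for the distribution phase, which evaluates to 5 steps for n = 2 and 9 steps for n = 4, in line with the class A and class C values reported later in Table 7.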

Phase 4: Data Combining Phase

The data combining phase is performed by reversing the order of the steps of the distribution phase, as follows:

I. Inter-Group Data Combining (lines 13-16 in Fig. 7): This step combines all the Route Matrices (RMs) from all the processors in all groups at their associated GC processors via electronic links. In the column-wise combining, each processor located in the last row of a group routes its RM toward the root of its column tree via a gather scheme. Then, the processors in the row tree of the GC processor perform a gather scheme to send the gathered RMs toward the GC. At the end of this step, all GC processors in OTIS-MOT have the accumulated RMs as a Group Route Matrix (GRM), which will be sent through the optical links.

II. Optical Data Combining (lines 17-18 in Fig. 7): All GC processors in the whole OTIS-MOT, except the GC processor in \(G_{0}\), send their GRMs via optical links to their corresponding processors in the main group \(G_{0}\).

III. Main Group Data Combining (lines 19-20 in Fig. 7): This step repeats the steps in lines 13-16, but now in group \(G_{0}\), where each processor combines the collected GRM with its own RM and sends it to its directly connected processor using a gather scheme. The process continues until the MC processor has collected all GRMs from all processors in \(G_{0}\).

Fig. 7 PRTO algorithm on OTIS-MOT

4 Analytical evaluation

This section provides the analytical evaluation of the PRTO algorithm on OTIS-HHC and OTIS-MOT optoelectronic architectures. The following performance metrics are used to evaluate the algorithm: parallel time complexity, speedup, efficiency, cost, and communication cost.

4.1 Analytical evaluation of PRTO algorithm on OTIS-HHC

In this section, the PRTO algorithm on OTIS-HHC is evaluated analytically in terms of parallel time complexity, speedup, efficiency, cost, and communication cost.

4.1.1 Parallel time complexity

The parallel run time consists of the computation time plus the communication time. The parallel time complexity of the PRTO algorithm is given in Theorem 1.

Theorem 1

The worst-case parallel time complexity of the PRTO algorithm on OTIS-HHC is shown in (1), where T is the parallel time complexity, N is the number of cities, P is the number of processors, and dh is the dimension of the HHC.

$$ T (N, P)= {\Theta} \left( P+ \frac{N^{3}}{P}+N^{2}\times dh\right) \quad \textup{when } G=P/2 \textup{ and } G=P $$
(1)

Proof

The PRTO algorithm on OTIS-HHC (Fig. 5) is evaluated in terms of parallel run time complexity for all phases as demonstrated in Table 1a–d.

Table 1 The parallel run time complexity for all phases of the PRTO algorithm on OTIS-HHC architecture

The overall parallel run time complexity of phases 1–4 is shown in (2) for \(G=P/2\) and in (3) for \(G=P\). Thus, (2) and (3) can be reduced to (4).

$$\begin{array}{@{}rcl@{}} T(N, P) &=& {\Theta} (P)+ {\Theta} (N^{2} \times dh)+ {\Theta} \left(\frac{N}{P}\times N^{2}\right)\\ &&+{\Theta} \left(\frac{N^{2}}{P}\times (P_{G}\times dh^{2}+P_{G} \times dh)\right) \end{array} $$
(2)
$$\begin{array}{@{}rcl@{}} T(N, P) &=& {\Theta} (P)+ {\Theta} (N^{2}\times dh)+ {\Theta}(\frac{N}{P}\times N^{2})\\ &&+{\Theta} (\frac{N^{2}}{P}\!\times\! (13\!\times\! P_{G} + 6\!\times\! 2^{dh-1}\!\times\! P_{G})) \end{array} $$
(3)
$$ T(N, P) \approx {\Theta}(P+ \frac{N^{3}}{P}+N^{2} \times dh) $$
(4)

4.1.2 Speedup

The speedup is the improvement in speed between the implementation of a given problem on a single processor and the implementation of the same problem on a parallel machine [33]. The speedup of the PRTO algorithm on OTIS-HHC is shown in (5).

$$ S= \frac{N^{3}\times P}{P^{2}+N^{3}+N^{2}\times dh \times P} $$
(5)
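As an intermediate step (added here for clarity), (5) follows from dividing the sequential run time of the repetitive 2-opt algorithm, \(\Theta(N^{3})\), by the parallel time in (4) and then multiplying the numerator and denominator by P:

$$ S=\frac{N^{3}}{P+\frac{N^{3}}{P}+N^{2}\times dh}=\frac{N^{3}\times P}{P^{2}+N^{3}+N^{2}\times dh\times P} $$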

4.1.3 Efficiency

Efficiency is represented as the ratio between the speedup and the number of processors in a parallel architecture [33]. Therefore, the efficiency of the PRTO algorithm on OTIS-HHC is shown in (6).

$$ E=\frac{N^{3}}{P^{2}+N^{3}+N^{2}\times dh\times P} $$
(6)

4.1.4 Cost

The cost is the total time needed by all the processors in a parallel architecture to solve a problem. It can be obtained by multiplying the total number of processors in the parallel architecture by the parallel time, i.e., \(Cost=P \times T_{P}\) [33]. \(T_{P}\) is the parallel run time of the PRTO algorithm, which is presented in (4), and P is the number of processors in the optoelectronic architecture. So, the cost of the PRTO algorithm on OTIS-HHC for \(G=P\) and \(G=P/2\) is shown in (7).

$$ Cost=P^{2}+N^{3}+N^{2}\times dh\times P $$
(7)

4.1.5 Communication cost

The communication cost represents the total number of links in a parallel architecture that are used in solving the task [33]. In our case, it is the total number of electronic links and optical links used to apply the PRTO algorithm on OTIS-HHC. Since the communication of the PRTO algorithm is represented by the distribution and combining phases, the communication cost of this algorithm depends directly on the communication pattern adopted in these two phases. As shown in Table 2, the electronic main group distribution in OTIS-HHC requires \(5 + 6\times (2^{dh-1}-1)\) electronic links. This is obtained by adding the number of links used in the intra-HHC distribution and the number of links used in the d-dimensional HHC distribution. In the former case, the number of utilized links equals 5, while in the latter case, the number of links is found by counting the links needed by the hypercube at each of the dh steps, when \(dh\ge 2\), which equals \(\sum_{i=0}^{dh-2} 6\times 2^{i}\).

Table 2 Communication cost of PRTO algorithm on OTIS-HHC and OTIS-MOT

The optical distribution of data requires \(\frac{6\times 2^{dh-1}}{2}-1\) optical links, which is equal to the number of groups inside OTIS-HHC minus one; the number of groups here corresponds to the case \(G=P/2\) [20]. For the case \(G=P\), the number of required optical links equals \(6\times 2^{dh-1}-1\), again the number of groups minus one [20]. Now, in the inter-group distribution of data, the number of utilized electronic links can be obtained by multiplying the number of groups by the number of electronic links utilized during the distribution inside one group, as depicted in Table 2. Note that the communication cost of the combining phase is the same as the communication cost of the distribution phase. So, the total communication cost of applying the PRTO algorithm on OTIS-HHC is given in (8) and (9).

$$ {Comm}_{Cost}= 2\times (18\times 2^{2dh-2}-1) \text{ for } G=P/2 $$
(8)
$$ {Comm}_{Cost}= 2\times (36\times 2^{dh-1}-1) \text{ for } G=P $$
(9)
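As a worked check of (8) (our own arithmetic), consider a three-dimensional OTIS-HHC with \(G=P/2\), which has 12 groups of 24 processors: the electronic distribution inside each group uses \(5+6\times(2^{2}-1)=23\) links, giving \(12\times 23=276\) electronic links in total, and the optical distribution uses \(12-1=11\) links, i.e., 287 links per phase. Doubling for the combining phase gives

$$ {Comm}_{Cost}= 2\times(276+11)= 2\times (18\times 2^{4}-1)=574, $$

in agreement with (8).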

4.2 Analytical evaluation of PRTO algorithm on OTIS-MOT

In this section, the PRTO algorithm on OTIS-MOT is evaluated analytically in terms of parallel time complexity, speedup, efficiency, cost, and communication cost.

4.2.1 Parallel time complexity

The parallel time complexity of the PRTO algorithm is captured in Theorem 2.

Theorem 2

The worst-case parallel time complexity of the PRTO algorithm on OTIS-MOT is shown in (10), where T is the parallel time complexity, N is the number of cities, P is the number of processors, and n is the number of processors in each row of the MOT within one group.

$$ T (N, P) = {\Theta} (P + \frac{N^{3}}{P}+N^{2}\times 2\log n) $$
(10)

Proof

The parallel run time complexity for all phases of PRTO algorithm on OTIS-MOT is evaluated analytically by tracing the algorithm in Fig. 7 line by line, as demonstrated in Table 3a–d.

Table 3 The parallel run time complexity for all phases of the PRTO algorithm on OTIS-MOT architecture

The overall parallel run time complexity of phases 1–4 is shown in (11).

$$\begin{array}{@{}rcl@{}} T\!(N\!,\! P) \!&=&\! {\Theta} (P)+ {\Theta} (N^{2}\times 2\log n)\\ &&+ {\Theta}(\frac{N}{P}\!\times\!\! N^{2}) + {\Theta} (\frac{n \!\times\!(\log n)^{2} \!\times\! N^{2} \times P_{G}}{P}) \end{array} $$
(11)

Equation (11) can be reduced to (12).

$$ T(N, P) \approx {\Theta} (P+ \frac{N^{3}}{P}+N^{2}\times 2\log n) $$
(12)

4.2.2 Speedup

The speedup of the PRTO algorithm on OTIS-MOT is shown in (13).

$$ S= \frac{N^{3}\times P}{P^{2}+N^{3}+N^{2}\times 2\log n \times P} $$
(13)

4.2.3 Efficiency

The efficiency of the PRTO algorithm on OTIS-MOT is shown in (14).

$$ E=\frac{N^{3}}{P^{2}+N^{3}+N^{2}\times 2\log n\times P} $$
(14)

4.2.4 Cost

The cost of PRTO algorithm on OTIS-MOT is shown in (15).

$$ Cost=P^{2}+N^{3}+N^{2}\times 2\log n \times P $$
(15)

4.2.5 Communication cost

As depicted in Table 2, the electronic main group distribution in OTIS-MOT requires \((\sqrt{P_{G}}-1)+(P_{G}-\sqrt{P_{G}})\) electronic links. In OTIS-MOT, the one-to-all broadcast is performed in two phases, the row-wise phase and the column-wise phase. For instance, consider an OTIS-MOT with 256 processors, where each group contains 16 processors. In the row-wise phase, \(\sqrt{P_{G}}-1\) electronic links are required to distribute the DM to all processors in the same row as the MC processor. In the column-wise phase, 12 electronic links are utilized to send the received DM to all processors in parallel; this is calculated by multiplying the number of processors in one row of a group by that number minus one. The optical distribution of data requires \(P_{G}-1\) optical links, which is equal to the number of groups inside OTIS-MOT minus one.

Now, in the inter-group distribution of data, the number of utilized electronic links equals \(P_{G}^{2}-2P_{G}+1\), obtained by multiplying the number of groups minus one by the number of electronic links utilized during the distribution inside one group. Therefore, the total communication cost of applying the PRTO algorithm on OTIS-MOT is given in (16).

$$ {Comm}_{Cost}= 2(P_{G}^{2}-1) $$
(16)
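Summing the three contributions for the example above with \(P_{G}=16\) (our own arithmetic): the electronic main group distribution uses \((\sqrt{P_{G}}-1)+(P_{G}-\sqrt{P_{G}})=15\) links, the optical distribution uses \(P_{G}-1=15\) links, and the inter-group distribution uses \(P_{G}^{2}-2P_{G}+1=225\) links, i.e., \(P_{G}^{2}-1=255\) links per phase. Doubling for the combining phase recovers (16):

$$ {Comm}_{Cost}= 2\times(15+15+225)= 2\times(P_{G}^{2}-1)= 510 $$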

The analytical evaluation of the PRTO algorithm on OTIS-HHC and OTIS-MOT is summarized in Table 4, where \(T_{D}\) denotes the time needed for the distribution phase, \(T_{C}\) denotes the time needed for the combining phase, \(T_{P}\) denotes the parallel run time, S denotes the speedup, E denotes the efficiency, C denotes the cost, and CC denotes the communication cost.

Table 4 Parallel run time complexity, speedup, efficiency, cost, and communication cost of PRTO algorithm on OTIS-HHC and OTIS-MOT

5 Simulation results

In this section, we present and discuss the simulation results acquired from applying the PRTO algorithm on the OTIS-HHC and OTIS-MOT optoelectronic architectures.

5.1 Simulation setup

To assess the performance of the PRTO algorithm on the OTIS-HHC and OTIS-MOT optoelectronic architectures, an object-oriented Java-based simulation was implemented using Java JDK 8 under the Eclipse environment. All simulation experiments were run on an Intel(R) Core(TM) i7 3.2 GHz processor with 16 GB RAM, a 4 MB cache memory, and Windows 8.1 64-bit as the operating system. To carry out the simulation, we use a startup time equal to 55 µs [34], an electronic link speed of 250 Mb/s, and an optical link speed of 2.5 Tb/s [35]. The average run time is calculated after repeating each run case ten times.
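To make the simulated timing parameters concrete, the following sketch computes the time of a single message transfer as the startup time plus the transmission time (message size divided by link bandwidth). The constants are the values quoted above; the per-message model itself and the 32-bit entry size are our illustrative assumptions, not the authors' simulator code.

```java
/** Illustrative per-message communication time model: time = startup + bits / bandwidth. */
public final class LinkTimeModel {

    static final double STARTUP_SECONDS       = 55e-6;    // startup time of 55 microseconds [34]
    static final double ELECTRONIC_BITS_PER_S = 250e6;    // electronic link speed of 250 Mb/s
    static final double OPTICAL_BITS_PER_S    = 2.5e12;   // optical link speed of 2.5 Tb/s [35]

    /** Time in seconds to push a message of the given size over one link. */
    static double transferTime(long messageBits, double bitsPerSecond) {
        return STARTUP_SECONDS + messageBits / bitsPerSecond;
    }

    public static void main(String[] args) {
        // e.g., a 6880 x 6880 distance matrix stored with 32 bits per entry (assumed size)
        long bits = 6880L * 6880L * 32L;
        System.out.printf("electronic: %.3f s, optical: %.6f s%n",
                transferTime(bits, ELECTRONIC_BITS_PER_S),
                transferTime(bits, OPTICAL_BITS_PER_S));
    }
}
```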

The following classes have been developed:

  • DM class for reading different data instances of the TSPLIB [13].

  • Repetitive 2-opt class for performing the sequential 2-opt algorithm.

  • Load balancing class for balancing N cities among P processors.

  • OTIS-HHC class for constructing the endorsed dimensions of OTIS-HHC.

  • OTIS-MOT class for constructing the endorsed dimensions of OTIS-MOT.

The size ranges for the OTIS-HHC, OTIS-MOT, MOT, and HHC architectures are shown in Table 5. These ranges include network sizes varying from 16 to 2304 processors, and each range is named with a class, as illustrated in the table. For instance, OTIS-HHC (G = P/2) of class B has 72 processors, while OTIS-MOT of the same class has 81 processors. These ranges were chosen based on the topological structure of each architecture, where each dimension has a number of processors obtained using the equations illustrated in Table 6, in which dh is the dimension of the HHC and n is the number of processors in each row of the MOT. To carry out the simulation runs, several data instances of the TSPLIB were chosen to test the algorithm performance. The selected instances have sizes varying from 29 cities to 6880 cities.

Table 5 Architectures size ranges
Table 6 Size equations of various architectures

5.2 Comparative evaluation

In this section, we compare and discuss in detail the simulation results of the PRTO algorithm for solving the TSP on the OTIS-HHC and OTIS-MOT architectures using various numbers of processors and various TSP data instances. Also, we compare the performance of the PRTO algorithm with the performance of the PRNN algorithm in terms of speedup and efficiency over the OTIS-HHC and OTIS-MOT architectures. Additionally, we present a realistic comparison, in terms of speedup and efficiency, between the PRTO algorithm on the OTIS-HHC (G = P/2) architecture and the MMAS algorithm on the Sunway Blue Light supercomputer.

The time needed by each processor to perform its task is the computation time; in our case, it is the time in seconds to find the routes using the 2-opt algorithm for each city belonging to the \(N/P\) cities assigned to that processor.

As shown in Fig. 8, as the number of processors increases along the adopted classes (i.e., the network size ranges), the computation time decreases. To illustrate, applying the PRTO algorithm on the OTIS-HHC (G = P/2) optoelectronic architecture in class A, which is the size range of 16-36 processors, recorded less computation time than applying it on the OTIS-MOT optoelectronic architecture; this is due to the 18 processors of OTIS-HHC (G = P/2) involved in the computation, compared to the 16 processors of OTIS-MOT. In class B, which is the size range of 72-144 processors, applying the PRTO algorithm on OTIS-MOT took less computation time than applying it on OTIS-HHC (G = P/2), due to the 81 processors of OTIS-MOT versus the 72 processors of OTIS-HHC (G = P/2); that is, OTIS-MOT has 9 more processors than OTIS-HHC (G = P/2). In class C, which is the size range of 256-576 processors, applying the PRTO algorithm on OTIS-HHC (G = P/2) recorded slightly less computation time than applying it on OTIS-MOT, due to the 288 processors of OTIS-HHC (G = P/2) versus the 256 processors of OTIS-MOT. Correspondingly, in class D, which is the size range of 1152-2304 processors, applying the PRTO algorithm on OTIS-MOT and on OTIS-HHC (G = P/2) recorded the same computation time; this relates to the load balancing phase, which allocates \(N/P\) cities to some processors and (N/P)+1 cities to the others. Consequently, dividing 6880 cities among 1152 processors in the case of OTIS-HHC (G = P/2) and among 1296 processors in the case of OTIS-MOT results in the allocation of 5 cities to some processors and 6 cities to others in both optoelectronic architectures.

Fig. 8 Computation time of applying the PRTO algorithm on OTIS-HHC and OTIS-MOT for the xsc6880 TSP data instance

Thus, we observed that in all classes the optoelectronic architecture with the higher number of processors achieved less computation time than the one with the lower number of processors. This is due to the fact that keeping the problem size constant while increasing the number of processors assigns a smaller portion of the data to each processor from one class to the next. Consequently, the task is applied to smaller data, resulting in less computation time. As a result, the number of processors is the dominant factor in the computation time.

As is evident from Fig. 8, OTIS-HHC in the case of \(G=P\) is superior in terms of computation time in all the adopted classes. This is due to the larger number of processors offered by this case of OTIS-HHC. For example, in class A, OTIS-HHC (G = P) provides 36 processors, while OTIS-HHC (G = P/2) and OTIS-MOT provide 18 and 16 processors, respectively.

Figure 9 demonstrates the communication time of applying the PRTO algorithm on the OTIS-HHC and OTIS-MOT optoelectronic architectures. In general, the communication time increases as the number of processors in each optoelectronic architecture increases. In classes A and C, applying the PRTO algorithm on OTIS-HHC (G = P/2) incurred higher communication time than applying it on OTIS-MOT, since in class A OTIS-MOT recorded 5 communication steps in the distribution phase while OTIS-HHC recorded 7, and in class C OTIS-MOT recorded 9 communication steps in the distribution phase while OTIS-HHC recorded 11, as illustrated in Table 7.

Fig. 9 Communication time of applying the PRTO algorithm on OTIS-HHC and OTIS-MOT for the xsc6880 TSP data instance

Table 7 Communication steps of the optoelectronic architectures

A careful examination of the behavior of the communication time between class B and class C for OTIS-MOT reveals that it decreases slightly, even though the number of communication steps is equal to 9 in both classes, as shown in Table 7. The reason is that, with the communication steps fixed, the increase in the number of processors between these two classes causes this decrease: as the number of processors increases, the communicated message size decreases, and thus the combining operation time decreases too. In classes B and D, OTIS-HHC (G = P/2) and OTIS-MOT recorded almost the same communication time, since they have comparable numbers of communication steps, equal to 9 in class B and 13 in class D, as shown in Table 7. Regarding OTIS-HHC in both cases, \(G=P/2\) and \(G=P\), as shown in Fig. 9, they recorded nearly the same communication time. This is because OTIS-HHC with \(G=P\) has a higher number of processors but the same diameter as \(G=P/2\), and hence the same number of communication steps, as shown in Table 7.

However, it was not sufficient to conclude which architecture offers better communication time based on only those four classes. Thus, we performed additional simulation runs on OTIS-MOT and OTIS-HHC (G = P/2) with a higher number of processors. The extra simulation runs used 4608 processors of OTIS-HHC and 4096 processors of OTIS-MOT with a data instance of 6880 cities. The results favored OTIS-HHC (G = P/2), which recorded 21.5 seconds of communication time while OTIS-MOT recorded 25 seconds. The extra simulation runs were repeated with a larger data instance of 19289 cities, where OTIS-HHC (G = P/2) recorded 170.7 seconds of communication time while OTIS-MOT recorded 199.2 seconds. A larger difference in communication time was noticed with the larger number of processors for these two optoelectronic architectures. Consequently, OTIS-HHC (G = P/2) earned this distinction since it recorded less communication time despite using 512 more processors than OTIS-MOT.

Broadly speaking, the diameter and the number of processors in each group of OTIS-HHC and OTIS-MOT play an important role in the number of communication steps required for the distribution and combining phases. This becomes particularly clear at larger dimensions of OTIS-HHC (G = P/2), which requires 17, 19, and 21 communication steps in its 6th, 7th, and 8th dimensions, respectively, whereas, for the same size ranges, OTIS-MOT requires 23, 35, and 45 communication steps. We can conclude that OTIS-HHC (G = P/2) provides less communication time than OTIS-MOT, since the topological structure of OTIS-HHC utilizes the hypercube in its factor network, while OTIS-MOT utilizes the mesh in its factor network. Thus, OTIS-HHC features a lower diameter than OTIS-MOT, resulting in less communication time.

The parallel time includes the computation time and the communication time, so it is the time of the whole parallel process. A careful look at the results shown in Fig. 10 reveals that in class A OTIS-HHC (G = P/2) provides less parallel time than OTIS-MOT. This is because OTIS-HHC (G = P/2) achieved less computation time, which compensates for the increase in communication time, so the parallel time was lower. In contrast, OTIS-MOT in class B obtained better parallel time than OTIS-HHC (G = P/2), while in classes C and D OTIS-HHC (G = P/2) recorded slightly better parallel time. A conclusion can be drawn regarding the parallel time of OTIS-HHC (G = P/2) based on the discussion of the communication time results: as previously illustrated, less communication time was recorded by OTIS-HHC (G = P/2) at higher dimensions, and the parallel time improves as the communication time decreases. Thus, applying the PRTO algorithm on OTIS-HHC (G = P/2) recorded, in general, a smaller parallel time than on OTIS-MOT.

Fig. 10 Parallel time of the PRTO algorithm on OTIS-HHC and OTIS-MOT for the xsc6880 TSP instance

As is evident from Fig. 10, OTIS-HHC (G = P) recorded the lowest parallel time and is prominent in all classes by a large margin. For example, in class C, OTIS-HHC (G = P) achieved 76 seconds of parallel time while OTIS-HHC (G = P/2) and OTIS-MOT achieved 131 and 132 seconds, respectively. Consequently, OTIS-HHC (G = P) offers less computation time without adding excessive communication. Therefore, this structure suits applications that require intensive computation, since it can provide a high number of processors with good communication time.

Speedup results are demonstrated in Fig. 11. In general, the speedup in this figure grows as the number of processors increases along the selected size ranges. Applying the PRTO algorithm on OTIS-HHC (G = P/2) achieved high speedups equal to 16.7, 66.2, and 231.5 in classes A, B, and C, respectively. Also, applying the PRTO algorithm on OTIS-MOT achieved high speedups equal to 15, 75.7, and 229.8 in classes A, B, and C, respectively.

Fig. 11 Speedup of the PRTO algorithm on OTIS-HHC and OTIS-MOT for the xsc6880 TSP instance

Before starting the discussion on each class separately, it is worthwhile to emphasize the OTIS-HHC (G = P) speedup results. As demonstrated in Fig. 11, OTIS-HHC (G = P) achieved the best speedup among all the selected architectures by a large margin. This is due to the aforementioned reasons, namely its superiority in providing a high number of processors with less parallel run time, so OTIS-HHC (G = P) deserves this excellence.

Now, let us focus the discussion on each class separately, according to Fig. 11. Beginning with class A, one can observe that the speedup of applying the PRTO algorithm on OTIS-HHC (G = P/2) is better, by a small margin, than on OTIS-MOT, since OTIS-HHC (G = P/2) in this class has 18 processors while OTIS-MOT has 16. Thus, partitioning the cities among 18 processors yields fewer cities per processor than partitioning them among 16 processors. This directly influences the computation time of each processor, which is lower in the case of OTIS-HHC (G = P/2) than OTIS-MOT in this class. So, the whole parallel time is reduced, since the decrease in computation time compensates for the increase in communication time. In class B, OTIS-MOT achieved better speedup, also by a small margin, since it has less parallel time than OTIS-HHC (G = P/2) in this class. It is obvious from the figure that OTIS-HHC (G = P/2) obtained slightly better speedup than OTIS-MOT in classes C and D. Again, it was not sufficient to conclude which optoelectronic architecture offers better speedup based on only these four classes. Thus, the speedup of the previous additional simulation runs of 6880 cities, on OTIS-MOT using 4096 processors and on OTIS-HHC (G = P/2) using 4608 processors, was calculated. The results show that OTIS-HHC (G = P/2) recorded a speedup of 1016, while OTIS-MOT recorded a speedup of 906. In light of this, and as concluded previously from the parallel time results, where OTIS-HHC (G = P/2) achieved better parallel time at higher dimensions, we can say that applying the PRTO algorithm on OTIS-HHC recorded, in general, better speedup than on OTIS-MOT.

The speedup for class D for the selected TSP data instances of sizes 1304, 1389, 1817, 2036, 2175, 3038, 3891, 4355, 5934, and 6880 cities is shown in Fig. 12. In general, when the number of processors is fixed, the speedup increases as the data size increases, as shown in Fig. 12. This relates to the fact that the communication time and the computation time increase as the data size increases, and hence so does the parallel time; on the other hand, the sequential time required to solve the problem increases even faster as the data size increases. Therefore, the speedup, which is the ratio between the sequential time and the parallel time, increases.

Fig. 12 Speedup of applying the PRTO algorithm on OTIS-HHC and OTIS-MOT for class D with various TSP data instances

Figure 13 illustrates the advantage of the selected optoelectronic architectures over their factor networks. The network sizes were considered based on the sizes offered by each factor network. As shown in Fig. 13, the number of processors in each architecture plays an important role in this result. For example, in HHC, dividing 6880 cities among 768 processors assigns 9 cities to the balanced processors, while dividing 6880 cities among the 1152 processors of OTIS-HHC with \(G=P/2\) assigns 6 cities to the balanced processors. This lowers the computation time, particularly with the 2-opt algorithm, where even one extra city can change the computation time considerably.

Fig. 13 Speedup of applying the PRTO algorithm on OTIS-HHC (G = P), OTIS-HHC (G = P/2), OTIS-MOT, HHC, and MOT architectures for the xsc6880 data instance

In addition to the computation time, the communication time also plays an important role in this result. For instance, OTIS-HHC with \(G=P\) recorded the highest speedup compared to OTIS-HHC with \(G=P/2\), HHC, MOT, and OTIS-MOT. This is because OTIS-HHC with \(G=P\) can offer a higher number of processors with the same diameter as \(G=P/2\), and hence the same number of communication steps. The same applies to OTIS-MOT, as illustrated previously in the discussion of the communication time results. Consequently, OTIS-HHC earned this distinction since it recorded less communication time with the higher number of processors.

In general, OTIS-MOT recorded better speedup than its factor network (MOT), and OTIS-HHC recorded better speedup than its factor network (HHC), in all classes of size ranges. This is due to the processor grouping feature of the OTIS optoelectronic architecture, with its two types of links: the optical links that connect groups of processors and the electronic links that connect processors within each group. As a result, this feature can lower the design complexity of the interconnection network by applying an iterative structure among processors, such that a higher dimension of the factor network can be simulated using a lower dimension of the OTIS architecture of that factor network.

The efficiency of applying the PRTO algorithm on OTIS-HHC and OTIS-MOT is depicted in Fig. 14. Generally, the efficiency in this figure decreases as the number of processors increases; this is due to the degradation of the ratio between the speedup and the number of processors as both increase. A careful examination of class A in this figure shows the achievement of excellent efficiency, which approaches 0.92 for the PRTO algorithm. In classes A and B, OTIS-HHC (G = P/2) and OTIS-MOT achieved comparable efficiency, while in class C, OTIS-MOT achieved better efficiency than OTIS-HHC (G = P/2). This follows from the speedup results, where both optoelectronic architectures recorded almost the same speedup in this class, as shown in Fig. 11, so the number of processors accounts for the difference in efficiency between the two architectures: in OTIS-HHC (G = P/2), dividing a speedup of 230 by 288 processors results in less efficiency than in OTIS-MOT with 256 processors. Regarding class D, OTIS-HHC (G = P/2) achieved better efficiency for the same reason, since OTIS-HHC (G = P/2) has 1152 processors and OTIS-MOT has 1296 processors. Therefore, the number of processors in the optoelectronic architecture plays an important role in the efficiency results. Regarding OTIS-HHC (G = P), it achieved the lowest efficiency in all the adopted classes. This is because OTIS-HHC (G = P) offers a high number of processors in each class; thus, dividing the obtained speedup by a high number of processors yields the lowest efficiency.

Fig. 14 Efficiency of the PRTO algorithm on OTIS-HHC and OTIS-MOT for the xsc6880 TSP instance

Figure 15 illustrates the cost of the PRTO algorithm on OTIS-HHC and OTIS-MOT, where the cost is the product of the number of processors in the optoelectronic architecture and the parallel time required to perform PRTO on that architecture. As shown in the figure, in class A, OTIS-HHC recorded a slightly higher cost than OTIS-MOT, even though OTIS-HHC has less parallel time in this class than OTIS-MOT. The reason is the higher number of processors of OTIS-HHC in class A, and similarly in class B; in class C, the cost of OTIS-HHC is much higher than the cost of OTIS-MOT, since the number of processors in OTIS-HHC is much higher than in OTIS-MOT. Regarding class D, OTIS-HHC \((G=P/2)\) recorded less cost than OTIS-MOT, since both optoelectronic architectures recorded almost the same parallel time, but OTIS-HHC has 1152 processors while OTIS-MOT has 1296 processors in this class. Therefore, the difference in the number of processors between these optoelectronic architectures contributes to this variation in the results. In general, if the number of processors in one optoelectronic architecture is higher than in another, then its cost will be higher too.

Fig. 15 Cost of the PRTO algorithm on OTIS-HHC and OTIS-MOT for the xsc6880 TSP instance

Table 8 illustrates the behavior of the communication cost as the number of processors increases. It is obvious from the table that as the network size increases, the communication cost increases too. The table clearly shows that OTIS-HHC \((G=P)\) utilized the largest number of optical and electronic links, since it has the largest number of processors compared to OTIS-HHC \((G=P/2)\) and OTIS-MOT. OTIS-MOT recorded less communication cost than OTIS-HHC \((G=P/2)\) in classes A and C, that is, in the network size ranges 16-36 and 256-576, respectively. This relates to the fact that OTIS-HHC requires fewer communication steps, and hence less communication time, utilizing a smaller number of electronic and optical links during the communication phases. Moreover, in classes A and C, the number of processors of OTIS-MOT is less than the number of processors of OTIS-HHC \((G=P/2)\); therefore, the number of electronic and optical links is smaller. In classes B and D, OTIS-HHC \((G=P/2)\) recorded less communication cost than OTIS-MOT, for the same reasons stated previously.

Table 8 Communication cost of PRTO algorithm on OTIS-HHC and OTIS-MOT architectures

The PRTO algorithm starts with initial tours in order to find a TSP solution. These initial tours are constructed using the nearest neighbor algorithm in a pre-processing phase, which requires a pre-processing computation time to complete. Thus, in order to show the effect of this pre-processing computation time on the parallel time, speedup, and efficiency, we ran the PRTO algorithm to solve the TSP on the OTIS-HHC \((G=P)\) architecture for various sizes, that is, for various numbers of processors, using the xsc6880 TSP data instance. Table 9 shows the pre-processing computation time, the PRTO computation time, the total computation time (the pre-processing computation time plus the PRTO computation time), the PRTO communication time, and the parallel time with and without the pre-processing computation time. In general, the parallel time is the computation time plus the communication time. The difference in the parallel time with and without the pre-processing computation time is small, between 2.26% and 3.51% for the various OTIS-HHC sizes. Additionally, Table 10 shows the effect of including or excluding the pre-processing computation time on speedup and efficiency. Including the pre-processing computation time in the parallel time slightly lowers the speedup and efficiency, due to the increase in the computation time.

Table 9 Parallel time with pre-processing time on OTIS-HHC \((G=P)\) architecture using xsc6880 TSP data instance
Table 10 Speedup and efficiency with pre-processing time on OTIS-HHC \((G=P)\) architecture using xsc6880 TSP data instance
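The following minimal sketch illustrates how the quantities in Tables 9 and 10 relate: the parallel time is the computation time plus the communication time, optionally including the pre-processing (nearest-neighbor) time, from which speedup and efficiency follow. All timing values below are hypothetical placeholders, not measured results.

```python
def parallel_time(t_comp: float, t_comm: float, t_pre: float = 0.0) -> float:
    """Parallel time = (optional pre-processing) + computation + communication."""
    return t_pre + t_comp + t_comm


def speedup(t_seq: float, t_par: float) -> float:
    return t_seq / t_par


def efficiency(t_seq: float, t_par: float, processors: int) -> float:
    return speedup(t_seq, t_par) / processors


# Hypothetical values (seconds): sequential time, NN pre-processing time,
# PRTO computation time, communication time, and the processor count.
t_seq, t_pre, t_comp, t_comm, p = 1000.0, 0.8, 25.0, 3.0, 36

t_without = parallel_time(t_comp, t_comm)        # excludes pre-processing
t_with = parallel_time(t_comp, t_comm, t_pre)    # includes pre-processing
print(speedup(t_seq, t_without), efficiency(t_seq, t_without, p))
print(speedup(t_seq, t_with), efficiency(t_seq, t_with, p))  # slightly lower
```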

In the context of optimization cost, the PRTO algorithm depends on the existence of initial solutions (initial tours) that can be improved to obtain better ones. Thus, each processor generates a number of initial solutions based on the cities allocated to it. Afterwards, the generated solutions are improved using the PRTO algorithm. Table 11 shows the time needed to generate the initial solutions, denoted \(Time_{gen}\), which is the pre-processing computation time mentioned earlier, and the time needed to improve the solutions using the PRTO algorithm, denoted \(Time_{imp}\), which is the parallel time of the PRTO algorithm excluding the pre-processing computation time. Table 11 also shows the optimization cost of the PRTO algorithm running on OTIS-HHC \((G=P/2)\) with 1152 processors (class D) using various data instances for solving the TSP. The optimization cost is the difference between the time it took to improve the solutions using the PRTO algorithm and the time it took to generate the initial solutions before any improvement, as shown in (17).

$$ Optimization_{Cost} = Time_{imp} - Time_{gen} $$
(17)

As the number of cities increases, the optimization cost increases as well, due to the increase in the time needed by the PRTO algorithm to improve the solutions, as shown in Table 11.

Table 11 Optimization cost results of PRTO algorithm
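A minimal sketch of the optimization-cost computation in (17), mirroring the structure of Table 11; the instance names appear in the text, but the timing values are hypothetical placeholders.

```python
def optimization_cost(time_imp: float, time_gen: float) -> float:
    """Optimization cost = improvement time (PRTO) - generation time (NN)."""
    return time_imp - time_gen


# Hypothetical (Time_gen, Time_imp) pairs in seconds for two TSPLIB instances.
runs = {"rl5934": (0.35, 6.40), "xsc6880": (0.45, 9.80)}
for instance, (t_gen, t_imp) in runs.items():
    print(instance, optimization_cost(t_imp, t_gen))
```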

The combining operation could be performed as a reduce operation rather than a gather operation, which would improve the obtained simulation results, particularly the speedup. The reason for adopting a gather operation in our data combining phase is to enable the MC processor to reuse the routes generated by the nearest neighbor algorithm as initial tours for any other heuristic or metaheuristic algorithm for solving the TSP. Another reason is that, if a reduce operation replaces the gather operation, the behavior of the combining phase becomes similar to that of the distribution phase, in which it is difficult to explain the various factors that affect the performance of the combining phase [32]. Nevertheless, an additional simulation run was performed in order to quantify the improvement of a reduce operation over a gather operation. The results show an enhancement in speedup of about 9.3% in the case of 2304 processors on OTIS-HHC \((G=P)\), when applying the PRTO algorithm to the xsc6880 TSP data instance.
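The following architecture-agnostic sketch contrasts the two combining strategies: a gather keeps every improved sub-tour at the MC processor (so the nearest-neighbor routes remain available for reuse), whereas a reduce keeps only the best tour at each step. The tour data are hypothetical placeholders, and the sketch abstracts away the actual OTIS communication steps.

```python
from functools import reduce


def tour_length(result):
    # Placeholder: in the real algorithm this is the sum of edge weights.
    return result["length"]


# Improved sub-tours held by the processors before the combining phase
# (hypothetical data).
local_results = [
    {"owner": 0, "length": 712.4, "route": [0, 3, 1, 2]},
    {"owner": 1, "length": 698.1, "route": [1, 0, 2, 3]},
    {"owner": 2, "length": 705.9, "route": [2, 1, 3, 0]},
]

# Gather-style combining: the MC processor keeps every route and can reuse
# the nearest-neighbor tours later as initial tours for other algorithms.
gathered = list(local_results)
best_after_gather = min(gathered, key=tour_length)

# Reduce-style combining: only the shorter tour survives each pairwise step,
# so intermediate routes are discarded along the way.
best_after_reduce = reduce(
    lambda a, b: a if tour_length(a) <= tour_length(b) else b, local_results
)

assert best_after_gather == best_after_reduce
```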

Solution quality can be measured using the percentage deviation [36] as illustrated in (18).

$$ \text{Solution Quality} = \frac{\text{Found Solution} - \text{Optimal Solution}}{\text{Optimal Solution}} \times 100\% $$
(18)
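A minimal sketch of the percentage deviation in (18) and of how the best, average, and worst deviations reported in Table 12 can be aggregated; the tour lengths are hypothetical.

```python
def percentage_deviation(found: float, optimal: float) -> float:
    """Percentage deviation of a found tour length from the optimal length."""
    return (found - optimal) / optimal * 100.0


optimal = 21657                          # hypothetical optimal tour length
found_lengths = [23100, 22480, 24950]    # hypothetical lengths from repeated runs

deviations = [percentage_deviation(f, optimal) for f in found_lengths]
best_dev = min(deviations)                     # B_S delta%
avg_dev = sum(deviations) / len(deviations)    # A_S delta%
worst_dev = max(deviations)                    # W_S delta%
print(best_dev, avg_dev, worst_dev)
```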

Applying the 2-opt algorithm repetitively, each time starting from a different city rather than from a single random city, while exploiting the parallelism provided by the selected optoelectronic architectures, yields solutions of better quality than the sequential version of the 2-opt algorithm. Table 12 shows the solution quality for the selected TSP data instances, where \(\mathrm {O}_{\mathrm {S}}\) denotes the optimal solution, \(\mathrm {B}_{\mathrm {S}}\) the best solution, \(\mathrm {A}_{\mathrm {S}}\) the average solution, \(\mathrm {W}_{\mathrm {S}}\) the worst solution, and \({\Delta }\) the percentage deviation from the optimal solution.

Table 12 Obtained quality solutions
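The following sketch illustrates the repetitive strategy described above: a nearest-neighbor tour is built from every starting city and improved with 2-opt, and the best result is kept. It follows the classical sequential 2-opt formulation on a small hypothetical instance, not the paper's parallel implementation.

```python
def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def nearest_neighbor(start, dist):
    """Constructive heuristic: always visit the closest unvisited city next."""
    n = len(dist)
    unvisited, tour = set(range(n)) - {start}, [start]
    while unvisited:
        nxt = min(unvisited, key=lambda c: dist[tour[-1]][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def two_opt(tour, dist):
    """Improvement heuristic: reverse segments while the tour keeps improving."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate, dist) < tour_length(tour, dist):
                    tour, improved = candidate, True
    return tour


# Hypothetical 5-city symmetric distance matrix.
dist = [[0, 2, 9, 10, 7], [2, 0, 6, 4, 3], [9, 6, 0, 8, 5],
        [10, 4, 8, 0, 6], [7, 3, 5, 6, 0]]

# Repetitive 2-opt: one restart per starting city (done in parallel by PRTO).
best = min((two_opt(nearest_neighbor(s, dist), dist) for s in range(len(dist))),
           key=lambda t: tour_length(t, dist))
print(best, tour_length(best, dist))
```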

Applying the 2-opt algorithm from a single random initial city gives an average solution quality within 16.5% of the optimal one, as illustrated by the \(\mathrm {W}_{\mathrm {S}}{\Delta }\%\) column in Table 12. On the other hand, applying PRTO gives an average solution quality within 7.3% of the optimal solution, as shown in the \(\mathrm {B}_{\mathrm {S}}{\Delta }\%\) column. More importantly, the PRTO algorithm achieved solutions within 4.7% of the optimal solution for small data instances, such as 29, 130, and 400 cities. This relates to the fact that the solution quality of heuristic algorithms tends to degrade as the data size increases [37]. It is important to mention that the solution quality was influenced neither by the type of optoelectronic architecture nor by the number of processors, but only by the data size; therefore, the obtained solutions were similar on OTIS-HHC (G = P and \(G=P/2\)) and OTIS-MOT for all network size ranges.

In order to show the superiority of the PRTO algorithm over other recent parallel algorithms for solving the TSP, we performed additional simulation runs of the PRNN algorithm, which was parallelized and applied on OTIS-Hypercube and OTIS-Mesh in [32]. To the best of our knowledge, the PRNN algorithm is, to date, the only parallel heuristic algorithm that has been used for solving the TSP on optoelectronic architectures. Thus, we applied the PRNN algorithm on OTIS-HHC (G = P and \(G=P/2\)) and OTIS-MOT, so that a fair comparison between the performance of PRNN and that of PRTO on these two optoelectronic architectures could be made. The comparative evaluation is based on the speedup and efficiency for the same size ranges and the xsc6880 data instance, as shown in Table 13.

Table 13 Speedup and efficiency results of PRNN and PRTO over OTIS-HHC and OTIS-MOT

Table 13 shows the speedup and efficiency results of applying the PRNN and PRTO algorithms to solve the TSP over the OTIS-HHC and OTIS-MOT optoelectronic architectures using the xsc6880 data instance. The table highlights the superiority of the PRTO algorithm in recording higher speedup and efficiency compared to the PRNN algorithm. For example, in the case of OTIS-HHC with \((G=P)\) for the size range between 1152 and 2304 processors, the speedup of PRTO is 928.9 and the speedup of PRNN is 27; that is, PRTO is faster than PRNN by approximately 34 times. For the same case, the efficiency of PRTO is 40%, whereas the efficiency of PRNN is 1%. This is an outcome of PRTO's intensive computation, which raises the ratio of computation time to communication time. In contrast, the PRNN algorithm gained lower speedup and efficiency, indicating that it spent more time in communication than in computation, particularly for large network sizes, where the number of cities allocated to each processor shrinks as the network size increases.

Moreover, the solution quality is compared for both the PRNN and PRTO algorithms on various data instances from TSPLIB, as shown in Table 14. As demonstrated in this table, the best, average, and worst percentage errors of the PRTO algorithm are lower than those of the PRNN algorithm. For instance, for the rl5934 data instance the error of the PRTO algorithm equals 7%, while the error of the PRNN algorithm is 18.2%; that is, the PRTO algorithm provides a solution quality 2.6 times better than the PRNN algorithm. Thus, Table 14 shows the superiority of the PRTO algorithm over the PRNN algorithm in achieving higher solution quality. As mentioned earlier, the solution quality is influenced neither by the type of optoelectronic architecture nor by the number of processors, but only by the data size; therefore, the obtained solutions were similar on OTIS-HHC (G = P and \(G=P/2\)) and OTIS-MOT for all network size ranges.

Table 14 PRNN and PRTO solution quality results over various data instances

An additional realistic comparison was performed in terms of speedup and efficiency, where the algorithm selected for comparison is presented in [38]. This algorithm solves the TSP using an ant colony heuristic approach; it is a parallel version of MMAS, implemented on the Sunway Blue Light supercomputer using the Message Passing Interface (MPI). In order to make the comparison fair, the OTIS-HHC (with \(G=P/2\)) optoelectronic architecture is selected, since its number of processors in the different size ranges is approximately equal to the number of cores used in [38].

Table 15 shows the speedup and efficiency results of both the PRTO and MMAS algorithms, where P represents the number of processors in OTIS-HHC (with \(G=P/2\)) and C represents the number of cores in the Sunway Blue Light supercomputer. The data instances eil51, pr1002, and fnl4461 from TSPLIB are selected to evaluate the performance of the PRTO and MMAS algorithms. As shown in Table 15, PRTO recorded a better speedup than MMAS for the large data instance. In particular, using the fnl4461 data instance, which contains 4461 cities, the speedup of the PRTO algorithm is 560.6 and that of the MMAS algorithm is 103; that is, the PRTO algorithm on OTIS-HHC achieved a 5.44 times better speedup than the MMAS algorithm on the Sunway Blue Light supercomputer. In contrast, using eil51, a small data instance containing only 51 cities, the MMAS algorithm achieved a speedup 1.64 times better than that of the PRTO algorithm, and using the pr1002 data instance, which contains 1002 cities, the speedup of the PRTO algorithm is slightly better than that of the MMAS algorithm. Thus, as the number of cities and the number of processors increase, the speedup of the PRTO algorithm increases as well.

Table 15 PRTO and MMAS speedup and efficiency results

The efficiency results depend on the number of processors or cores and on the achieved speedup. Therefore, the PRTO algorithm on the OTIS-HHC (with \(G=P/2\)) architecture achieved a better efficiency result than the MMAS algorithm on the Sunway Blue Light supercomputer using the fnl4461 data instance, while it achieved a lower efficiency using the eil51 data instance. However, using the pr1002 data instance, both PRTO and MMAS achieved almost the same efficiency, as shown in Table 15.

6 Conclusions

In summary, this paper presents a parallel version of the 2-opt algorithm for solving the TSP on two selected OTIS optoelectronic architectures, namely OTIS-HHC and OTIS-MOT. Four classes of network size ranges, between 16 and 2304 processors, are introduced to perform a comparative evaluation of these optoelectronic architectures. Moreover, four phases for applying the parallel algorithm on the optoelectronic architectures of interest are presented. Each phase is analytically evaluated and simulated on each optoelectronic architecture. The performance of the PRTO algorithm is analytically evaluated in terms of parallel time complexity, speedup, efficiency, cost, and communication cost. Additionally, the performance of the PRTO algorithm is experimentally evaluated by simulation in terms of computation time, communication steps, communication time, parallel time, speedup, efficiency, cost, communication cost, optimization cost, and solution quality. The simulation runs were carried out on TSPLIB data instances of various sizes, and the results achieved a high speedup approaching 32.96 on OTIS-HHC (with \(G=P\)) using 36 processors, with efficiency approaching 92%. Regarding the PRTO algorithm running on the OTIS-HHC architecture with 1152 processors, the optimization cost increases as the number of cities increases across data instances of various sizes, due to the increasing time required by the PRTO algorithm to improve the solutions.

A detailed discussion of the performance evaluation of OTIS-HHC and OTIS-MOT using the selected performance metrics is provided. The results showed that OTIS-HHC is better than OTIS-MOT in terms of parallel running time, communication time, communication cost, and speedup, particularly when the network size becomes larger with a higher number of processors. The differences in the number of processors between OTIS-HHC and OTIS-MOT contribute to the computation time, efficiency, and cost results. As demonstrated in the results discussion, OTIS-HHC \((G=P)\) achieved the best speedup among OTIS-HHC \((G=P/2)\) and OTIS-MOT by a large margin, due to its superiority in providing a high number of processors with a lower parallel run time.

A comparative study between the PRTO and PRNN algorithms was conducted, showing the superiority of PRTO over PRNN in terms of speedup, efficiency, and solution quality. For example, in the case of OTIS-HHC with \((G=P)\) for the size range between 1152 and 2304 processors, the speedup of PRTO is 928.9 and the speedup of PRNN is 27, showing that PRTO is faster than PRNN by approximately 34 times. For the same case, the efficiency of PRTO is 40%, whereas the efficiency of PRNN is 1%. In addition, as a solution quality metric, the resulting error of PRTO for the rl5934 data instance equals 7%, while it is 18.2% for PRNN.

An additional realistic comparison was performed in terms of speedup and efficiency between the PRTO algorithm on the OTIS-HHC (with \(G=P/2\)) optoelectronic architecture and the MMAS algorithm on the Sunway Blue Light supercomputer. The results showed the superiority of PRTO in terms of speedup and efficiency, particularly for large data instances.

In this paper, we exploited the interesting features of OTIS optoelectronic architectures in solving the TSP using a parallel local search algorithm, achieving high speedup and efficiency. We hope this encourages researchers to exploit these optoelectronic architectures in solving other problems.