1 Introduction

In recent years, saving energy and reducing emissions have become important issues in diverse fields including marine navigation. Extensive research on designing economic, efficient, and safe routes for ships is being conducted by scientists around the world. Economic routes refer to those routes that fully consider the impact of the navigation environment, save energy costs by consuming less energy, and reduce operating costs. Some studies have used isochronous algorithms and corresponding improved versions (Chen et al. 2002; Roh 2013), the Dijkstra algorithm (Yu and Ye 2007; Takashima et al. 2009), and dynamic planning algorithms (Wit 1990; Mou 2017) to plan ship routes. However, with global track planning changing from a single-objective to a multi-objective framework, intelligent optimization algorithms (Park and Kim 2014) are being widely used. Among them, the ant colony algorithm (ACA; Dorigo and Maria 1997) is popular in track planning owing to its ease of parallelization and robustness.

ACA is a heuristic search algorithm used to solve combinatorial optimization problems. However, although it is able to find optimal solutions within a specific time, ACA is only suitable for small-scale optimization problems; hence, the ant colony system was proposed (Gambardella and Dorigo 2000). The MAX–MIN ant system (Stutzle and Hoos 2002), in particular, increases the diversity of feasible solutions and leads to more accurate results by limiting the range of the underlying pheromones. Several studies have discovered that simple modifications to ACA can be used to solve complex problems, which has consequently triggered a wave of modifications in the field. For instance, a parameter optimization method (Han and Tian 2008) was proposed to improve the performance of ACA by controlling the initial values of the algorithm parameters. A novel pheromone updating scheme (Seçkiner et al. 2013) was also proposed for finding the global minimum, where the pheromones were updated are performed based on the percentage of ants searching for the optimal solution. Yuan et al. (2019) enhanced the ant pheromones by introducing certain parameters into the pheromone updating scheme to accelerate the convergence of the algorithm. The improved versions of ACA discussed in these studies achieved better performance.

In addition, some studies have combined ACA with other algorithms to achieve better results. For instance, a novel method (Wang et al. 2017) was proposed that combines artificial potential fields with ACA, which determines the transfer probability of an ant's next step based on the presence or absence of obstacles and reconstructs the corresponding heuristic information to improve the overall solution quality of the algorithm. The quantum ant colony algorithm (Xia et al. 2019) reflects the high efficiency of quantum parallel computing while retaining the better optimality of ACA; moreover, it can plan optimal tracks for unmanned surface boats. In the A* ant colony algorithm based on dynamic feedback (Huang et al. 2017), a closed-loop feedback was introduced to implement dynamic adjustment of parameters, which in turn increases the adaptability of the algorithm to complex environments. A multiple ant colony optimization (Sim and Sun 2002) algorithm was proposed to counteract the stagnation problem encountered by most ant colony algorithms. This algorithm uses multiple ant colonies to seek different paths, which increases its ability to explore better paths, improves its adaptability, and reduces the possibility of stagnation. To find the shortest path faster, the wolf pack allocation principle (Liu et al. 2011) was introduced in ACA to remove the pheromones released in the least optimal path.

Owing to the joint effort of many studies over the years, great improvements have been made to ACA in terms of both quality of optimal solutions and its speed of convergence. However, there are still several problems when applying ACA to optimizing energy consumption tracks for ships in a marine environment. The ship path-planning problem usually involves considerable data from various sources. Moreover, it is a complex oil–machine–environment system and thus the existing ACAs often have difficulty balancing accuracy and speed of convergence. The two most common problems that occur are as follows: (1) the algorithm takes a long time to find the optimal path and (2) the algorithm is trapped in a local optimum and is hence unable to find the optimal path.

Therefore, to counteract the aforementioned problems, we propose a double ant colony algorithm with dynamic feedback. Our proposed algorithm involves three main steps. First, to plan the optimal energy consumption route, the energy consumption value was introduced as a pheromone. Second, the ants in the colony were divided into two categories: exploratory and optimized ants. During the initial stage of the algorithm, exploratory ants were used to improve the quality of solutions obtained by the algorithm, whereas, at the later stage, optimized ants were used to improve the convergence of the algorithm. Finally, a dynamic parameter was introduced to adjust the number of exploratory and optimized ants and thus balance the convergence speed and accuracy of the algorithm.

The remainder of this paper is organized as follows: Sect. 2 analyzes the movement of ships in a marine environment and establishes the marine environment and energy consumption models; Sect. 3 proposes three improvement strategies for the ant colony algorithm; Sect. 4 presents the simulation experiments conducted in MATLAB to verify the proposed algorithm; Sect. 5 compares the routes planned by the proposed algorithm with existing routes in the database of a ship; and Sect. 6 concludes the paper.

2 Model establishment and analysis

2.1 Ship motion analysis

In general, there are three possible optimal routes depending on the optimization purpose in a dynamic marine environment, namely, the shortest route, the shortest voyage-time route, and the lowest energy consumption route. The relationship between the voyage time, energy consumption, and length of the route is shown in Fig. 1. Assuming that the ship sails under the condition of constant thrust and that its speed is continuously adjusted according to the navigation resistance (Xia et al. 2019), the ship motion model is set up as a function of the navigation features. Figure 1 presents a compact depiction of the ship navigation process in a dynamic marine environment.

Fig. 1
figure 1

Ship navigation process. The shaded regions represent the dynamic marine environment. The horizontal axis represents the length of the route, and the vertical axis represents both the average voyage time and the average energy consumption per unit distance

Figure 1 illustrates the route planning process from the starting point S to the ending point L of the route. Depending on the optimization objective, three optimal routes are planned, namely the shortest route \(S_{1}\), lowest energy consumption route \(S_{{2}}\), and shortest voyage-time route \(S_{{3}}\). Note that when environmental factors such as winds and waves are not considered, the shortest route \(S_{1}\) is the same as the lowest energy consumption routes \(S_{{2}}\) and shortest voyage time \(S_{{3}}\). As shown in Fig. 1, curves 1 and 4 represent the change of energy consumption and voyage time in this route, respectively.

Then, when factors such as winds and waves in the marine environment are considered, the three kinds of routes are no longer the same one. As shown in Fig. 1, curves 2 and 4 represent the change of energy consumption and voyage time on the lowest energy consumption route \(S_{{2}}\) versus route length, respectively, whereas curves 1 and 3 represent the change of energy consumption and voyage time on the shortest voyage-time route \(S_{{3}}\) influenced by winds and waves versus the length of route, respectively. Therefore, when the influence of winds and waves is added into consideration, the routes of shortest voyage time and lowest energy consumption are not same as before, while the route of shortest distance is unchanged.

Besides, at positions \(a\) and \(r\) in Fig. 1, the marine environment changes dynamically with time owing to winds and waves, which has a great impact on the ship's navigation. The routes \(S_{{1}}\) corresponding to curves 1 and 4 only consider safe navigation, resulting in a significant increase in either the energy consumption or voyage time compared to the optimal route. The lowest energy consumption route \(S_{{2}}\) corresponding to curve 2 and 4 considers both safety and energy consumption and ultimately achieves the optimization goal of saving energy. Therefore, when the ship is in a dynamic position (i.e., at positions \(a\) and \(r\)), it must dynamically plan an appropriate route in real time according to the characteristics of the energy consumption and voyage-time curves. As shown in Fig. 1, whether choosing the route \(S_{{1}}\), \(S_{{2}}\), \(S_{{3}}\) or the combination of three will be analyzed in Sect. 2.4.

2.2 Marine environment model

With the aim of acquiring the optimal route in a dynamic marine environment, we used the grid method to model the marine environment in this work. Depending on the actual marine environment, grids with different attributes were established. The actual environment was divided into a two-dimensional grid, and the environmental information was defined as the properties of the corresponding grid, as follows:

  1. 1.

    Free attribute grids: If there is an island at the center of a grid, then it is considered to be an obstacle grid, else it is a free attribute grid. Ships can only choose to detour when encountering obstacle grids; however, they can directly pass through free attribute grids.

  2. 2.

    Working attribute grids: To better reflect the influence of a real marine environment on ship motion, it is important to analyze the effect of factors such as winds and waves. Therefore, information such as the wave amplitude, current velocity, and wind speed is added as parameters to the free attribute grid. Free attribute grids under the influence of environmental factors are thus transformed to working attribute grids.

  3. 3.

    Dynamic attribute grids: In a real marine environment, certain attributes change dynamically, which refer to as dynamic obstacles. For example, winds and waves change periodically with time. Thus, the energy consumption of a ship passing through a dynamic attribute grid is related to the choice of the sailing mode and period of motion of dynamic attribute grids.

2.3 Dynamic energy consumption model

Based on a suitable energy consumption model, an economic route can be planned and energy consumption data can be predicted to provide supplementary guidance for the navigation of ships. Energy consumption models have been developed using empirical and theoretical formulations (Fagerholt et al. 2010; James et al. 2009). Several studies have also added dynamic factors (Meng and Yuan 2016; Nicolas and Dimitris 2016) involved in the navigation to further improve the accuracy of the energy consumption models.

In this study, the energy consumption model used is based on both an empirical and theoretical formulations; appropriate dynamic factors were added to improve its accuracy further. To obtain the characteristics of voyage time and energy consumption described in Sect. 2.1, we analyzed the navigation process of the ship in a real marine environment and established the corresponding kinetic model (Fossen 2011). Note that the present study mainly focuses on the impact of the marine environment on the energy consumption of the ships. Hence, other influencing factors are simplified. The kinetic model used to describe the motion of ships is given as follows:

$$ Mv + C(v)v + D(v)v = F_{{{\rm{environment}}}} + F_{{{\rm{thrust}}}} $$
(1)
$$ F_{{{\rm{environment}}}} = F_{{{\rm{hydrostatic}}}} + F_{{{\rm{wind}}}} + F_{{{\rm{wave}}}} + F_{{{\rm{current}}}} $$
(2)

where \(M\) is the inertia matrix, \(C(v)\) is the centripetal force matrix, \(D(v)\) is the damping matrix such that \(D(v) \in R^{2 \times 2}\), \(v\) is the velocity vector of the ship, \(F_{{{\rm{thrust}}}}\) is the thrust of the ship's propulsion system, and \(F_{{{\rm{environment}}}}\) is the total resistive force on the ship, which is composed of the hydrostatic resistance \(F_{{{\rm{hydrostatic}}}}\), wind resistance \(F_{{{\rm{wind}}}}\), wave resistance \(F_{{{\rm{wave}}}}\), and current resistance \(F_{{{\rm{current}}}}\). The different components of the total resistive force are as follows:

$$ F_{{{\rm{hydrostatic}}}} = \frac{1}{2}\rho_{w} S_{S} v_{s}^{2} C_{1} $$
(3)

where \(\rho_{w}\) is the sea water density, \(S_{S}\) is the ship's wetted surface area related to the payload, \(v_{s}\) is the relative water speed, and \(C_{1}\) is the coefficient of water resistance.

$$ F_{{\rm wind}} = \frac{1}{2}\rho_{w} A_{S} C_{a} (v_{a} \cos \theta )^{2} $$
(4)

where \(A_{S}\) is the orthographic projection area above the waterline of the ship, \(C_{a}\) is the coefficient of air resistance, \(\theta\) is the directions of the wind, and \(v_{a}\) is the wind speed.

The wave resistance can be calculated using Kreitner's formula (Fossen 2011), such that

$$ F_{{\rm wave}} = \frac{{1}}{{2}}\rho_{w} L_{b} \xi_{A}^{2} C_{b} cos\phi $$
(5)

where \(L_{b}\) is the length of the ship, \(\xi_{A}\) is the characteristic wave height, \(\phi\) is the directions of the waves, and \(C_{b}\) is wave drift coefficient.

$$ F_{{{\rm{current}}}} = \frac{1}{2}\rho_{w} A_{c} C_{c} (v_{c} \cos \varphi )^{2} $$
(6)

where \(A_{c}\) is the orthographic projection area of the hull below the water surface, \(C_{c}\) is the coefficient of currents, \(v_{c}\) is the speed of the current, and \(\varphi\) is the directions of the currents. \(\theta\), \(\phi\), \(\varphi\) are in the North East Earth coordinate system (the navigation coordinate system).

We assumed that the entire track is divided into n sections and the ship is sailing at a constant speed between the nodes i and i + 1; thus, the environmental resistance and thrust of the propulsion system are balanced. In the marine environment with a working attribute grid, the ship’s energy consumption during navigation is derived completely from the propulsion system (Xia et al. 2019). The energy consumption \(E_{i,i + 1}\) is equal to the work performed by the ship's propulsion system to overcome the resistance owing to the environmental interference on the ship. The energy consumption is given by Eq. (7), such that

$$ E_{i,i + 1} = F_{{{\rm{environment}}}} \cdot \left| {\overrightarrow {{v_{{{\rm{boat}}}} }} } \right| \cdot t $$
(7)
$$ t = L_{i,i + 1} /\left| v \right| $$
(8)

where \(\left| {\overrightarrow {{v_{{{\rm{boat}}}} }} } \right|\) is the speed of the ship under propulsion, \(L_{i,i + 1}\) is the length of the track between nodes i and i + 1, \(E_{i,i + 1}\) is the energy consumption between nodes i and i + 1, and \(\left| v \right|\) is the actual speed of the ship that can be obtained from Eq. (1).

By solving Eqs. (1), (2), (3), (4), and (5) simultaneously, \(\left| v \right|\) can be obtained. Using \(\left| v \right|\) and the time required for the ship to pass through the grid, the energy consumption of the ship can be calculated. As described in Sect. 2.1, the variation of voyage time and energy consumption can then be plotted as a function of the route length. This provides the basis for establishing a comprehensive cost function for the route planning problem.

2.4 Comprehensive cost function

We designed a comprehensive cost function of route planning to evaluate the advantages and disadvantages of a new route. From the perspective of operating costs, the factors that have the greatest influence on a ship's track are energy consumption, voyage time, and safe obstacle avoidance (Hansen and Freund 2010). Therefore, in this work, we considered these factors as part of our optimization objective with different weighting coefficients. The cost function consists of three parts as follows.

  • (1) Voyage time:

    $$ T = \sum\nolimits_{i = 1}^{n} {T_{i,i + 1} } $$
    (9)

where \(n\) is the number of route segments and \(T_{i,i + 1}\) is the voyage time between nodes i and i + 1.

  • (2) Energy consumption:

    $$ E = \sum\nolimits_{i = 1}^{n} {E_{i,i + 1} } $$
    (10)
  • (3) Safe obstacle avoidance:

The most important consideration for a ship's voyage is to achieve the safest route from the starting point to the ending point. The cost function \(P_{{{\rm{safe}}}}\) of safe obstacle avoidance is given by

$$ P_{{{\rm{safe}}}} = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {d > d_{{{\rm{safe}}}} } \hfill \\ {\infty ,} \hfill & {d < d_{{{\rm{safe}}}} } \hfill \\ \end{array} } \right. $$
(11)

where d is the straight-line distance between the ship and the center of the obstacle, and \(d_{{{\rm{safe}}}}\) is a safe distance between the ship and the obstacle. The comprehensive cost function \((J)\) of route is determined using the following equation:

$$ {\rm{min}}(J) = P_{{{\rm{safe}}}} \cdot (w_{T} \cdot T + w_{E} \cdot E) $$
(12)

with the constraints:

$$ \left\{ \begin{array}{ll} T \le T_{{\max }} \hfill \\ 0 < \left| {v_{{i,i + 1}} } \right| \le v_{{\max }} \hfill \\ \end{array} \right. $$
(13)

where E is the total energy consumption cost, \(T_{\max }\) is the maximum voyage time allowed for the ship, \(v_{i,i + 1}\) is the speed of the ship in the track from nodes i to i + 1,\(v_{\max }\) is the maximum sailing speed allowed for the ship, \(w_{T}\) and \(w_{E}\) are the weights for the corresponding cost functions, and \(w_{T} + w_{E} = 1\). By setting different weights, routes with different optimization goals can be planned. The planning process of the lowest energy consumption route, that is, \(w_{T} = 0\) and \(w_{E} = 1\), was used as an example to elucidate the applications of our proposed algorithm. The method can also be applied to other optimization route with different values of \(w_{T}\) and \(w_{E}\).

3 Improved ant colony algorithm

3.1 Fundamental principle behind algorithm design

As described in Introduction, when performing route planning in a complex dynamic environment, ACA experiences several problems such as slow convergence and getting trapped in a local optimum. Furthermore, due to energy consumption is affected by several complex factors, it is difficult for the traditional ant colony algorithm to achieve both accuracy and speed. ACA has achieved good results when combined with other algorithms to solve different combinatorial optimization problems. For example, a multiple ant colony algorithm (Yang et al. 2019) that classified ant colonies into different classes was proposed, which updated the local and global pheromones using methods superior to previously employed methods. This effectively improved the optimization ability of the algorithm in complex environments.

Therefore, utilizing the multiple ant colony algorithm and other successful modifications of ACA, we propose a novel double ant colony algorithm (NDACA), in which the ants are divided into two classes: exploratory ants and optimized ants. The optimized ants are responsible for searching for the lowest energy consumption route, whereas the exploratory ants are responsible for detecting the lowest energy consumption route. Initially, the exploratory ants are deployed to improve the quality of solutions returned by the algorithm. Later, the optimized ants are deployed to improve the convergence of the algorithm.

In NDACA, the entire ant colony is classified into two categories using the dynamic fitness classification operator defined as

$$ {\rm{fit}}_{i} = \frac{{E_{\min } }}{{E_{i} }} $$
(14)

where \({\rm{fit}}_{i}\) is the fitness value of each ant, \(E_{\min }\) is the lowest energy consumption value in the current iteration, and \(E_{i}\) is the energy consumption corresponding to each ant. If \({\rm{fit}}_{i} \in [\left. {0,M} \right]\), then the ants are exploratory ants, whereas if \({\rm{fit}}_{i} \in (M,\left. 1 \right]\), the ants are optimized ants, where M is the classification threshold of the ant state, and the size of M directly affects the overall precision of the solutions.

To avoid algorithm stagnation or falling into a local optimum, the state transfer strategy of the algorithm was improved using a combination of pseudo-random and roulette probabilities. This strategy not only makes the ants move toward grids having a high probability but also to those with a low probability. Moreover, the energy consumption factor is considered in the transition probability. The specific process is described as follows.

Let us consider that the k ant is transferred from node i to node j. When \(\;q \le q_{0}\), the k ant selects the next node to visit according to the pseudo-random probability formula given by

$$ s = \arg \mathop {\max \left\{ {\left[ {\tau_{ij}^{k} (t)} \right]^{\alpha } \left[ {\eta_{ij}^{k} (t)} \right]^{\beta } \left. {\left[ {\varepsilon_{ij} (t)} \right]^{\gamma } } \right\}\;\;q \le q_{0} } \right.}\limits_{s \in S} $$
(15)

where \(\tau_{ij} (t)\) is the pheromone between nodes i and j at the tth iteration; α is the pheromone inspired factor; β is the expectation heuristic factor; γ is an index related to the energy consumption information; \(\eta_{ij}\) represents the heuristic functions, given by \(\eta_{ij} { = 1/}d_{ij}\), and \(d_{ij}\) is the straight-line distance between nodes i and j; \(\varepsilon_{ij} (t)\) represents the heuristic information on the energy consumption on of the route from node i to j at the tth iteration, given by \(\varepsilon_{ij} (t) = 1/E_{ij}\); S is the set of all nodes that the k ant may reach from node i; s is the set of nodes that the k ant chooses to traverse; q is a randomly generated number uniformly distributed in the interval [0, 1]; and \(q_{0}\) is a user-defined parameter that specifies the distribution ratio between pseudo-random and roulette probabilities, in [0, 1].

When \(q > q_{0}\), the k ant performs path selection according to the roulette formula given by

$$ p_{ij}^{k} (t) = \frac{{\left[ {\tau_{ij}^{k} (t)} \right]^{\alpha } \left[ {\eta_{ij}^{k} (t)} \right]^{\beta } \left[ {\varepsilon_{ij} (t)} \right]^{\gamma } }}{{\sum\limits_{{s \in {\rm{allowed}}(i)}} {\left[ {\tau_{ij}^{k} (t)} \right]^{\alpha } \left[ {\eta_{ij}^{k} (t)} \right]^{\beta } \left[ {\varepsilon_{ij} (t)} \right]^{\gamma } } }} $$
(16)

where \({\rm{allowed}}(i)\) represent the feasible nodes that the k ant can choose at node i.

After several iterations (\(t \ge 2\)), two optimal routes were obtained corresponding to the exploratory and optimized ants. If the energy \(E_{{1}}\) of the optimal route planned by the exploratory ants is less than the energy \(E_{{2}}\) of the optimal route planned by the optimized ants, it indicates that some of the exploratory ants can plan the lowest energy consumption route better than some of the optimized ants. In such a scenario, the solutions obtained from the two classes of ants were exchanged along with their identities and functions. The original exploratory ants were converted into optimized ants and vice versa. After completing this exchange of solutions, the system starts the next iteration.

3.2 Adjusting the threshold parameter M

In NDACA, the threshold parameter M determines the balance of the algorithm by providing dynamical feedback to the operation of the algorithm. A dynamic feedback mechanism is adopted to adjust the parameter M, as shown in Fig. 2.

Fig. 2
figure 2

Closed-loop control chart for NDACA. The value of the classification threshold M is adjusted according to the value of the feedback parameter R

In Fig. 2, R is the feedback parameter. When R > 0, indicating that the quality of the obtained search route is poor, the search area needs to be expanded to speed up the exploratory process by reducing the value of M. When R < 0, indicating that a better solution has been found and that the search area contains an exploratory value, then the intensity of exploration needs to be increased by increasing the value of M. In particular, when R = 0, it is necessary to count the number of times the \(E_{{{\rm{best}}}}^{t} /E_{{{\rm{best}}}}^{t - 1}\) continuously attains the value 1 and then set a stagnation threshold accordingly. When the cumulative count exceeds this threshold, the algorithm is very likely to fall into a local minimum, in which case the value of \(M\) needs to be decreased appropriately to make the algorithm converge to globally optimal solution. The specific adjustment strategy for M can be summarized as follows:

$$ M^{t} = \left\{ \begin{array}{ll} M^{t - 1} (1 - \frac{{E_{{{\rm{best}}}}^{t} - E_{{{\rm{best}}}}^{t - 1} }}{{E_{{{\rm{best}}}}^{t - 1} }})\;(E_{{{\rm{best}}}}^{t} \ne E_{{{\rm{best}}}}^{t - 1} ) \hfill \\ M^{t - 1} \;\;\;\;\;\;(E_{{{\rm{best}}}}^{t} = E_{{{\rm{best}}}}^{t - 1} ,{\rm{num}}(E_{{{\rm{best}}}}^{t} = E_{{{\rm{best}}}}^{t - 1} ) < N_{\max } ) \hfill \\ \zeta M^{t - 1} \;\;\;\;(E_{{{\rm{best}}}}^{t} = E_{{{\rm{best}}}}^{t - 1} ,{\rm{num}}(E_{{{\rm{best}}}}^{t} = E_{{{\rm{best}}}}^{t - 1} ) \ge N_{\max } ) \hfill \\ \end{array} \right. $$
(17)

where \(N_{\max }\) is the stagnation threshold; num() represents the algebraic accumulation of continuous stagnation in the ant population; ζ is the adjustment factor such that \(0 < \zeta < 1\); t is the number of iterations; \(E_{{{\rm{best}}}}^{t}\) is the energy consumption of the optimal path at the tth iteration; and \(M^{t}\) is the value of M at the tth iteration.

Note that the value of \(M\) lies in the range [0, 1]. Owing to the closed-loop feedback adopted, the initial value of \(M\) has little effect on the accuracy of the solution. However, when \(M\) is very large, numerous unusable solutions are generated, which increases the convergence time of the algorithm. By means of extensive experiments, we verified that the algorithm achieves the best performance when the initial value of \(M\) is 0.8.

3.3 Pheromone updating strategy

The tasks undertaken by the exploratory and optimized ants are different, having different significances as well; thus, the corresponding pheromone updating strategy also needs to be different. To manage the contradiction between the diversity of solutions and the convergence speed in the two cases, the following pheromone updating strategy was adopted for our proposed algorithm.

To prevent the algorithm from falling into a local optimum, a local pheromone updating strategy was adopted. The comprehensive cost function was also considered in the local pheromone updating strategy. By introducing different weight coefficients, different types of ants can execute different pheromone updating strategies. The updating strategy is outlined as follows:

$$ \tau_{ij} (t + 1) = 1 - \rho \tau_{ij} (t) + \rho \sum\limits_{k = 1}^{m} {\tau_{ij}^{k} } $$
(18)
$$ \tau_{ij}^{k} = \frac{{\lambda_{1} \cdot {\rm{fit}}_{i} }}{{J_{k} }} $$
(19)
$$ \tau_{ij}^{k} = \frac{{\lambda_{2} \cdot {\rm{fit}}_{i} }}{{J_{k} }} $$
(20)

where ρ is the pheromone volatilization factor, \(\lambda_{1}\) and \(\lambda_{2}\) are the weight coefficients of the optimized and exploratory ants, respectively, and \(J_{k}\) is the comprehensive cost function of the optimal route for k ant at the tth iteration. To distinguish between the pheromone concentrations of the search routes corresponding to the two types of ant and highlight the better route, the respective weights are set to be \(\lambda_{1}\) = 4.2 and \(\lambda_{2}\) = 2.2. The optimized ants follow Eqs. (18) and (19), whereas the exploratory ants follow Eqs. (18) and (20).

After all the ants completed one iteration, the pheromones were updated globally. At this stage, the exploratory and optimized ants are no longer distinguishable; however, the ship energy consumption factor was taken into consideration when the pheromones of the optimal route were updated. The update was performed according to the following equations:

$$ \tau_{ij} (t) = (1 - \rho )\tau_{ij} (t) + \rho \Delta \tau_{{{\rm{best}}}} $$
(21)
$$ \Delta \tau_{{{\rm{best}}}} = \left\{ {\begin{array}{*{20}l} {Q/E_{{{\rm{best}}}}^{t} } \hfill & {{\rm{if}}\;\;(i,j) \in s^{ * } } \hfill \\ 0 \hfill & {{\rm{else}}} \hfill \\ \end{array} } \right.{\kern 1pt} $$
(22)

where Q is a constant; \(s^{ * }\) is the set of optimal routes; and \(\Delta \tau_{{{\rm{best}}}}\) is the increase in pheromones for the optimal path.

The flowchart of NDACA is shown in Fig. 3.

Fig. 3
figure 3

Flowchart of NDACA

4 Experimental results

4.1 Establishment of simulation environment

We constructed a simulation box consisting of 20 × 20 grids and randomly populated 13% of this box with obstacles, as shown in Fig. 4. The lowest energy consumption route was planned from the starting point (0.5, 19.5) to the end point (19.5, 0.5) of the grid. The length of a unit raster is 1 km.

Fig. 4
figure 4

Simulation grid. The black and white grids represent obstacles and free attribute grids, respectively

Moreover, based on simulation model, the working attribute grid was adapted to reflect the real information pertaining to the marine environment, and hence, environment models I, II, and III were constructed. Environment III is composed of obstacle grids and free attribute grids which does not include any environmental attributes. And then, the influence of winds and waves was added into consideration, and free attribute grids are thus transformed to working attribute grids in the model I and II. In the model I, the attributes are set to \(v_{a}\) = 3 m/s, \(\xi_{A}\) = 2 m, \(v_{s}\) = 1.5 m/s, \(\theta =\) 30°, \(\phi =\) 60°, and \(\varphi =\) 45°. In the model II, the attributes are set to \(v_{a}\) = 2 m/s, \(\xi_{A}\) = 1.5 m, \(v_{s}\) = 1 m/s, \(\theta =\) 150°, \(\phi =\) 110°, and \(\varphi =\) 135°.

4.2 Determination of parameter values

The parameter values used in NDACA have a direct effect on the algorithm’s performance; however, there is no specific formalism available to determine these values precisely (Ma et al. 2018). Therefore, we adopted a numerical experimental method to determine the appropriate values of the parameters. Using the MATLAB simulation software, different parameter combinations were explored within the respective parameter ranges and the optimal parameter combinations were selected based on the experimental results.

The environment model I was selected, and the initial values of the parameters were set to m = 100, α  = 1, β  = 5, γ = 1, Q = 1, and ρ = 0.3, where m is the total number of ants. Based on these initial values, different combinations of the parameters were explored in our simulations, as listed in Table 1. In our experiments, each parameter was assigned different values and the route planning problem was solved repeatedly using NDACA with the values of the parameters being changed one at a time (Cheng and Mao 2007). The relationship between parameter values and mean of energy consumption on the optimal route is shown in Table 1.

Table 1 Comparison of the energy consumption obtained using NDACA for different parameter values

From Table 1, based on mean of energy consumption on the optimal route, we observe that the optimal value of α is approximately 1, the optimal value of β lies in the range [7, 9], the optimal value of ρ is approximately 0.3, and that of Q is approximately 0.90. Therefore, the final parameter values were determined to be m = 100, α = 1.0, β =7, γ = 2, ρ = 0.3, and Q = 0.90 (note that Q is the pheromone intensity factor).

Note that the parameter \(q_{0}\) has a great influence on NDACA. From the transition strategy Eq. (15) and (16), we can see that when \(q_{0}\) is small, the ants choose the roulette formula for selecting the optimal path with a higher probability. This approach increases the diversity of the candidate solutions but decreases the operational efficiency of the algorithm. Conversely, when \(q_{0}\) is large, the ants choose the optimal path with a higher probability according to the pseudo-random probability formula. In this case, the convergence speed of the algorithm is improved to some extent; however, if the ants fail to find a better path in the early stages of the algorithm, the algorithm is likely to fall into a local optimum owing to the accumulation of pheromones. The value of \(q_{0}\) was set using the “single-factor method,” that is, the optimal \(q_{0}\) value was obtained by analyzing the influence of different \(q_{0}\) values on the optimal route in the environment models I, II and III, as shown in Fig. 5.

Fig. 5
figure 5

Relationship between the energy consumption E and q0 in the environment models I, II and III

In Fig. 5, when the value of \(q_{0}\) is in the range [0.7, 1], the algorithm can find the optimal route in the three environment models. In addition, we found that the algorithm can maintain a good performance in more complex environments when the value of \(q_{0}\) was larger. For example, to enable the algorithm to better plan the optimal route, the value of \(q_{0}\) was set to be ≥ 0.9 in environment model I, whereas in environment model III it was enough to set \(q_{0}\) ≥ 0.7. In this study, the research objects are either same as or more complex than environment model I, and hence, the initial value of \(q_{0}\) was set to 0.90.

4.3 Results in the working attribute environment

In this section, NDACA is compared with other algorithms to verify its effectiveness. We considered ACA and the double-layer ant colony optimization algorithm (DACO; Xv et al. 2019) to compare the performances of the algorithms in planning the optimal route for the simulation environment I.

Figure 6 shows a comparison between the optimal routes obtained using NDACA and ACA. Figure 7 presents a comparison between the optimal routes obtained using NDACA and DACO. Figure 8 shows the energy consumption as a function of the number of iterations for the three algorithms. The total number of iterations performed was N = 100, and the ant colony size was m = 100. Table 2 lists the simulation results obtained using the three algorithms.

Fig. 6
figure 6

Route planning results of NDACA and ACA in the working attribute environment

Fig. 7
figure 7

Route planning results of NDACA and DACO in the working attribute environment

Fig. 8
figure 8

Relationship between energy consumption and number of iterations for ACA, DACO, and NDACA

Table 2 Comparison of simulation results obtained using NDACA, DACO, and ACA

From Table 2, we observe that the number iterations required for convergence reduces by 48.2%, and the energy consumption decreases by 7.85% when compared to ACA. Conversely, when compared to DACO, the speed of convergence improves by 28.6%, and the energy consumption decreases by 5.5%. These results indicate that NDACA can quickly find the optimal solution; the results also demonstrate the proposed algorithm’s good environmental adaptability.

4.4 Results in the dynamic attribute environment

In this section, we study the effect of dynamic attribute grids added to the simulation environment I, where the winds and waves change periodically with time. The dynamic attribute grids are denoted in red in Fig. 9; they move uniformly along the y-axis direction at a speed of 2 grids/s and are reciprocated several times. In Fig. 9a, b, the wave speed in the dynamic attribute grids is set to 15 m/s and 10 m/s, respectively.

Fig. 9
figure 9

Route planning results of NDACA in the dynamic attribute environment. a Optimal route not influenced by dynamic attribute grids. b Optimal route influenced by dynamic attribute grids

Owing to the complexity of the dynamic environment, it becomes more difficult for the algorithm to find the optimal result; hence, the parameter \(q_{0}\) was increased to 0.95, while the other parameters remained unchanged. The optimal ship route in a dynamic attribute environment can be obtained using NDACA, as shown in Fig. 9.

Figure 9 shows that NDACA is adaptable to different types of dynamic attribute grids and successfully plan the optimal route. When NDACA is employed for route planning in a dynamic marine environment, it finds the optimal solution quickly and attains a stable convergence, indicating that the algorithm has good performance even in a complex environment, as demonstrated by Fig. 10.

Fig. 10
figure 10

Relationship between energy consumption and number of iterations. a, b correspond to the routes in Fig. 9a, b, respectively

4.5 Evolution of the different ant classes in NDACA

As stated earlier, the exploratory ants are responsible for improving the quality of the solutions, whereas the optimized ants are responsible for accelerating the speed of convergence. The proportion of exploratory and optimized ants in the total ant population determines the performance of NDACA. The variation in the proportion of each kind of ants also reflects the variation in the parameter M. In this section, we study how the proportion of the two kinds of ants evolves with the number of iterations in the working and dynamic attribute environments, as shown in Fig. 11.

Fig. 11
figure 11

Evolution of the number of exploratory and optimized ants with the number of iterations in different environments. a Working attribute environment. b Dynamic attribute environment

In Fig. 11a, b, the initial value of the parameter M is 0.8. During the initial iterations, the number of exploratory ants is more than the optimized ants to ensure good quality solutions. As the number of iterations increases, the value of parameter M decreases continuously. Consequently, the number of exploratory ants decreases, whereas the number of optimized ants increases until all the ants in the ant colony become optimized ants. This increase in the number of optimized ants during the later iterations promotes the speed of convergence of the algorithm.

Comparing Fig. 11a, b, we find that even in a complex dynamic environment, our proposed algorithm achieves a balance between the two ant colonies by adjusting the parameter M. Moreover, the parameter M guarantees the exactness of the solution and improves the speed of convergence. Thus, we note that NDACA can dynamically adapt to complex marine environments and plan the optimal route for ships.

5 Algorithm verification and applications

In this section, we analyze the feasibility of our proposed algorithm in terms of practical applications. We study the Danish passenger ship MS Smyril, sailing in the Faroe Islands, for the ship route planning problem in this work. The ship serves between the port of Tórshavn the capital of the Faroe Islands, and the island Suðuroy as shown in Fig. 12. The system in-built in the ship collects the route data from the ship; the data of MS Smyril have been published online, and we select the data for 246 trips from February to April, 2010 (DTU 2011).

Fig. 12
figure 12

Figure of ship and its routes. a shows the MS Smyril ship, b routes obtained from the navigation data in the database of the MS Smyril, in which ① represents the main routes of the ship and ② represents the alternative routes for bad weather

The image of the Faroe Islands was first rasterized into a simulation box of 50 × 50 grids with each grid having a length of 1.4 km. The free attribute, working attribute, and dynamic attribute grids were defined as explained in Sect. 2.2. Figure 13 shows the rasterized sea area map of the Faroe Islands. In Fig. 13, we denote the dynamic attribute grids in the Faroe Islands map in red based on the meteorological and hydrological data of the winds and waves. The red grids move continuously reciprocating in the y-axis direction at a speed of 4 grids/s. The specifications of the dynamic attribute grids are listed in Table 3. In addition, due to the ship is not significantly affected by the winds and waves as it enters and leaves the ports, the grids in these areas can be considered to be free attribute grids.

Fig. 13
figure 13

Rasterized grid map of the Faroe Islands. The red regions represent dynamic attribute grids, whereas the shaded regions represent obstacle grids or islands. Note that some of the free attribute grids were changed to working attribute grids owing to environmental factors

Table 3 Wind settings for the dynamic attribute grids

Considering the factors in a real marine environment, the parameters in NDACA were optimized to yield good performance; hence, we set α  = 0.12 and β  = 7. If the volatility factor ρ is very small, then a large amount of pheromone residues tends to be deposited on multiple routes, which causes non-optimal routes to be searched continually leading to a slow convergence rate. Hence, we set ρ  = 0.37. Note that smaller the value of parameter Q, slower is the accumulation of pheromones on the route, and hence, slower is the convergence rate of the algorithm; therefore, we chose a higher value of Q, namely, Q = 1.1. In addition, the energy-inspired information index γ was set to 2 and the parameter \({q}_{0} \)was set to 0.95.

Finally, based on the aforementioned parameter values, the shortest route, shortest voyage-time route, and lowest energy consumption route were planned by NDACA. The starting and ending coordinates of the ship were (25.5, 40.5) and (24.5, 0.5), respectively. The resulting routes are shown in Fig. 14.

Fig. 14
figure 14

Route planning results. Routes 1, 2, and 3 represent the shortest route, shortest voyage-time route, and lowest energy consumption route, respectively

We choose the lowest energy consumption route (route 3) in Fig. 14 to explain the principle behind the optimization process of our algorithm. When the ship travels around the dynamic attribute grids at (27.5, 28.5), the routes 1, 2, and 3 diverged. If the ship follows route 1 or 2 in this case, then the energy consumption will increase sharply, and hence, route 3 was selected. When the ship travels near the dynamic attribute grids at (25.5, 4.5), if the ship sails along route 1 that bypasses the dynamic grids, then the energy consumption will be higher than if the ship passes directly through the grids. Therefore, in this case, the ship followed route 3. Finally, the complete optimal energy consumption route was obtained, as shown in Table 4.

Table 4 Comparison between the planned and actual routes

Furthermore, to verify the practical significance of routes 1, 2, and 3, we compared them with the actual ship routes in the existing database of the MS Smyril. We calculated the energy consumption, voyage time, and length of every route of MS Smyril’s 246 voyages in Fig. 12, the shortest route, shortest voyage-time route, and lowest energy consumption route were selected to fill in Table 4.

From Table 4, we observe that the lowest energy consumption route (route3) planned by NDACA saves approximately 152.69 L of energy compared to the actual route, which is equivalent to 4.58% of the energy being saved. This indicates that the planned energy consumption route is of practical significance. Similarly, compared with the actual routes in the database, the corresponding shortest route and shortest voyage-time route planned by NDACA both yield better results.

6 Conclusions

To plan the optimal route for ships in a dynamic marine environment, a novel double ant colony algorithm (NDACA) with dynamic feedback is proposed in this work. Moreover, we verified the significance of NDACA by both simulation experiments and comparing its results with the real data of a ship. We draw the following conclusions based on our findings.

  1. 1.

    For a ship sailing in a dynamic marine environment, the shortest route, shortest voyage-time route and lowest energy consumption route are quite different from each other. Thus, to perform the real-time route planning in such a scenario, a comprehensive cost function is designed by analyzing the ship's motion. A grid model is used to establish the marine environment in this work. A combination of free attribute, working attribute, and dynamic attribute grids was used to capture the dynamic environmental factors properly.

  2. 2.

    Our proposed NDACA with dynamic feedback is able to balance accuracy and speed of convergence in complex environments. In addition, compared with ACA and DACO, it can save 7.85% and 5.5% of the energy consumed, respectively, when planning routes in a marine environment.

  3. 3.

    NDACA can also successfully plan the shortest route, lowest energy consumption route, and shortest voyage-time route for ships in a marine environment. For instance, compared with the actual route of MS Smyril, the planned route corresponding to the lowest energy consumption saves approximately 4.58% of energy. This has great practical significance for ship operations management.