1 Introduction

Natural computing (NC) [1] refers to the process inspired by structure, function, and behavior of biological as well as natural systems. There are many variants of NC, which include artificial neural networks (ANN) [2], swarm intelligence [3], evolutionary algorithms [4], quantum computing [5], and membrane computing [6]. Membrane computing (MC) is a branch of computer science that aims to find new computational and mathematical models from the structure and functioning of cellular membranes. The models considered under MC are distributed and parallel systems called P systems handling multisets of objects in the cells defined by placing membranes hierarchically or generally.

Many engineering design optimization problems are solved using MC in a reasonable time because of its nondeterminism and maximum parallelism features [7]. The arrangement of membranes categorizes the MC systems into cell-like, tissue-like, and neural-like systems. The latter class is the most recent branch of MC, which is incorporated by spiking neurons into the area of P systems (SN P systems). SN P systems fall under the third generation of artificial neural network models. These systems work by the spiking nature of neurons.

SN P systems can be represented by a directed graph with nodes referring to spiking neurons, and edges form synapses between neurons through which communication takes place. SN P systems with neuron division and dissolution [8], budding [9], and spikes and antispikes [10] are applied in numerous applications and proven to be efficient and reliable. Earlier SN P systems were treated as language-generative devices of 0’s and 1’s [11]. An optimization SN P system to find the solutions of an unconstrained single objective optimization problem has been proposed to solve knapsack problems [12]. Spike train and timing are the two key factors of SN P systems that are used for spiking and coding of information [13].

Because of the better theoretical foundations of MC, many intelligent systems employ new methods and paradigms [14]. It is used to identify the nuclear export signals of amino acid sequence, which is a challenge in computational biology [15]. Recently, SN P systems are employed in many intelligent and expert domains such as the semantics of deductive database systems [16] and parallel multiples [17]. So it is concluded that SN P systems are more powerful in intelligent and expert systems from both theoretical and practical sides [18].

An optimization problem [19] is a problem of finding the best solution from a set of feasible solutions that cannot be solved in polynomial time limit using any deterministic method. Numerous methods and schemes are designed to solve such kinds of problems [20,21,22]. Most of the existing schemes in the literature emphasize on arriving at the optimal solutions quickly rather than focusing on the time limit. One of the objectives of this paper focuses on the balance between arriving the best solution and reducing the time limit. Apart from that, a novel way of choosing, controlling, and activating spikes is employed.

Capacitated VRP with time windows (CVRPTW) [23, 24] is a variant of VRP, which has emerged due to the increasing demands of customers in the specific time span. It is considered as a nondeterministic polynomial hard (NP-hard) problem that finds an ideal arrangement of routes to be serviced by a fleet of vehicles of some fixed capacity to serve a given arrangement of customers inside the allotted time windows. The customer’s geographic locations and demands are unpredictable while designing routes for a particular vehicle. In this way, it is critical to make a decision on a choice on the allocation and scheduling of new demands by considering all the components that are significantly influenced.

The intention of this paper is to design a spiking neural framework to solve CVRPTW with dynamism (CDVRPTW) in a feasible time without sacrificing the global optimum solutions. The proposed solution scheme is designed by combining a number of SN P systems coupled with an improved firefly optimization algorithm (SFO). It is based on the flashing behavior of fireflies, which is a signal system to attract other fireflies. As noticed in the literature [25,26,27], the firefly algorithm (FA) has been applied to many engineering design optimization problems. Only limited articles exist which combine FA and VRP [28,29,30,31]. The strong adaptability and robustness, easy to set up, minimal manual adjustments, and minimal parameters made it suitable for VRP problems. But, the reason behind its minimal usage in VRP kind of problems is that it is prone to local optima. But, in the proposed method, the authors overcome it by carefully designing the initial assignments of fireflies and parameter optimization.

The main contributions of the proposed scheme can be stated as:

  1. 1.

    We propose a spiking neural-based firefly optimization scheme (SFO) to find solutions to CDVRPTW.

  2. 2.

    The firefly optimization is modified to explore the solution space faster, and thereby achieving a high convergence rate.

  3. 3.

    Parameters in the proposed firefly scheme are optimized by adjusting the rule probabilities in the SN P systems.

  4. 4.

    A notion of movement and attraction of fireflies is introduced by taking real-world circumstances like traffic data and dynamic requests.

  5. 5.

    A high degree of parallelism is incorporated in the cluster level by considering solutions together and assigning them to different spiking neurons for simultaneous computation. Here, cluster means a group of customers who are geographically mapped by distance.

  6. 6.

    The initial feasible positions for fireflies are carefully designed using Clarke and Wright [32] algorithm rather than the random assignment.

  7. 7.

    A mathematical formulation of the problem is made with real-world constraints.

The remaining sections are organized as follows: Sect. 2 carries out a detailed literature review on the application and solution approaches to the problem. Section 3 details the problem description and mathematical formulation of CDVRPTW. Section 4 outlines a basic SN P system. Section 5 explains the improved firefly optimization scheme using SN P systems. Section 6 describes the proposed system design and the various modules involved. Section 7 talks about the experimental results and discussions in detail. Concluding comments and future scope of the system are given in Sect. 8.

2 Literature review

The advancement of technology in various fields has promoted the evolution of intelligent transportation systems (ITS), which uses the history of geolocation techniques with geographic information systems [33]. Vehicle routing is aiming for an operational task that is bound to the competence of dispatches as well as the optimization costs which are directly reliant on the size of the fleet [34,35,36,37,38]. The applications of routing in various aspects of life have been found.

2.1 Transport of goods, services, and peoples

Because of the highly variable traveling time, one should take factors like traffic, competition, and cooperation between transport companies into account [39, 40]. All these applications come under city logistics. Planning real-time traffic and dynamic routing of a fleet of vehicles, which require additional modules and attributes, are actualized [41]. A decision support system (DSS) can be used for a better trade-off between dynamic travel and service times.

In courier transportation systems, one should take into account not only the request locations, time window, and capacity constraints but also traffic data and forbidden paths. Optimization-enabled automatic fleet management system (AFMS) has considered for solving such types of problems [42,43,44].

Customer satisfaction is an important factor when we define an FMS. Newspaper delivery is an example of such type of domains where the complaints will be filed in case of delay of delivery. In order to reduce costs and improving customer satisfaction, centralized applications were proposed that made use of cell phones to continuously communicate with drivers at the same time of performing routing [45]. Additionally, future requests are anticipated using historical data of customers [46].

In grocery delivery services, the merchant designs various clients that can be considered inside a fixed time window. At the same time, it is made inaccessible to the clients when the capacity constraints are violated. The time windows fixed for a customer are dynamically planned based on future request in-home delivery problems [47]. Greedy randomized adaptive search procedure (GRASP) and adaptive large neighborhood search (ALNS) were proposed which consider uncertainty by introducing scenarios for possible demand realizations [48]. Dynamic stacker crane problems [49, 50] which considered container carriers with loading and unloading ships are coming under operations research applications. In addition, factories and hospitals are other application domains where products or medical instruments, respectively, must be transferred [51].

Transport of peoples is characterized by additional constraints such as regulation on waiting, service, and travel times. Cab system is the most common online individual transportation system where the customer’s requests are composed of pick time and location coupled with the dropping location. Many variants are available like advance booking, sharing, etc., which leaves a limited chance for optimization. The multicab metropolitan transportation system is focused on where more than one customer’s request can be serviced [52]. Other application domains include transport of children to schools, disabled people to work locations, patients to medical centers, etc. [53,54,55]. Air taxis are other domains where the limitations of the traditional airline are reduced.

2.2 Solution approaches

Ranging from linear programming (LP) to meta-heuristics, there are many schemes, which solve DVRP and its variants.

2.2.1 Dynamic and deterministic synchronous optimization techniques

The timely change of critical information forces new hindrances in the absence of stochastic information. In this specific situation, exact methods are irrelevant since the optimal solutions are time-dependent. So most of the approaches on DVRP rely on heuristics that find the best solutions to the current instance. The first optimization focused on the dynamic arc routing problem (DARP), which reconfigures the route each time a new request has been made. The main drawback was the lack of dimensionality, which prevents its application to large instances. Decision epochs or time slices are later introduced in the literature. All these schemes relied on the algorithms of static routing, but the main drawback was the updating of routing, which increases delays at the customers’ ends [56, 57].

A real-time truckload pickup and delivery (PDP) was addressed [58], where a fleet of trucks has to serve requests arriving dynamically. It is proposed a rolling horizon approach based on LP. A dynamic column generator scheme for DVRPTW has been addressed [59]. The experimental results based on Solomon’s benchmark instances reveal that it yields better results in terms of fitness function, but time for performing computation was very high.

An ant colony system (ACS) [60] to solve DVRP was developed by using the time slice strategy. At each time slice, the requests are checked and are handled till the end of that time slice. They used pheromone trace to transfer good solutions to the next time window. Similar approaches were also addressed in [61, 62]. The main drawback of this scheme is that the intensity of pheromone concentration may vanish as the number of iterations increases. It may lead to having uncertain time to converge for larger instances.

2.2.2 Dynamic and deterministic continuous optimization techniques

These kinds of strategies perform optimization throughout and keep information about good solutions in memory [63]. When a dynamic request has been made, it uses the memory where the previous information was stored to plan and serve new requests. Therefore, the computational capacity is maximized. In addition, the vehicles are unaware of the next requests to be served until they complete the service of the previous request. The introduction of parallel tabu search (TS) was the first continuous optimization approach to courier services. A pool of good route and customer pairs is maintained in memory, and this information was referred for designing initial feasible solutions. At the point when a client makes a request, it is checked in the memory to choose whether it ought to be accepted or discarded. Different variants like DVRP and DARP have additionally been implemented using a similar scheme [64, 65]. The main drawback of the TS is that, even though it is semi-deterministic, it produces fewer quality solutions in the long run because of the hard constraint settings.

A generalization of TS using a multiple plan approach (MPA) was designed to populate and maintain routing plans to generate a distinguished solution. Pool updates were performed whenever a vehicle finishes the serving of a customer. Nevertheless, it was very crucial in high-dimensional problems. Genetic algorithms (GA) with a human dispatcher to study the D-PDP variant were proposed [66, 67]. The measure of dynamism was low for GA in spite of the fact that it keeps running all through the horizon and continually updating the changes.

2.2.3 Dynamic and stochastic schemes

A truckload PDP based on the Markov decision process (MDP) was formulated [68]. A notion of probability has then been added to solve VRP along with MDP [69]. However, it lacked dimensionality and was not suitable for real-world applications. The approximate dynamic program (ADP) was designed in order to reduce the limitations of traditional DP [70]. It was successfully applied to FMS [35, 37]. But, the main limitation of ADP was that a single observation may change the whole value function, thereby modifying the constraints which further distorts successive observations. The LP with dynamic and stochastic content has also been designed [58, 71]. Later, emergency vehicle dispatching and routing was developed and was used by Haghani and Yang [72]. The problems with these schemes were again the curse of dimensionality.

2.2.4 Sampling

It relies on the generation of scenarios. A scenario is an outline of known and future events at each time unit on the horizon. Multiple scenario approach (MSA) has been connected to Solomon’s benchmark instances [73] and performed substantially better than the state-of-the-art algorithms. The MSA with a consensus scheme has been used to tackle DVRP [74], and an event-driven framework with MSA [75] has been implemented later. Dynamic sample scenario hedge heuristics (DSHH) divides the horizon into intervals. Routing is done by assigning a subset of requests to the vehicles relying upon the frequencies of their assignments [76]. It later promoted the design of branch and regret heuristics. Though all of the above schemes performed well with dynamism, maintaining and modifying scenarios at each time units added complexity of the system.

2.2.5 Other strategies

Waiting strategy was introduced in [77] in order to serve dynamic requests in the neighborhood of a served request and plan accordingly [78,79,80,81,82]. Relocation strategy was also designed in case of emergency vehicle routing problems and is applied in D-VRP, DVRPTW, DTSPTW, etc. [83]. All these strategies were based on delaying the customer’s request assignments to vehicles in a priority manner. So some customers may enter a long waiting period which in turn affects the overall system performance. Also, an adaptive spiking neural P system [84] is designed to handle VRP problems, but it did not discuss any dynamic requests and assignments of customers.

Due to the high complexity of VRP and its variants, state-of-the-art algorithms are very far from the real-time requirements in terms of quality and efficiency. Therefore, there are many factors that need to be considered such as how to find feasible solutions, how to avoid local optimum, how to control the convergence rate, and how to optimize the problem within the acceptable range. The proposed scheme presents an improved firefly optimization scheme using SN P systems to find solutions for CDVRPTW. It incorporates optimization, parallelism, and determinism into the SN P system framework. To the best of the author’s knowledge, this is the first work of its kind that solves CDVRPTW by taking advantage of both FA and SN P systems with the intelligent exploration and exploitation of solution space. Analysis of the experimental outcomes justifies the novelty of the proposed scheme.

3 Problem description and mathematical model

In this section, a brief description of CDVRPTW will be explained together with the mathematical formulation of the problem. The measures for dynamism are also explained.

3.1 Capacitated dynamic vehicle routing problems with time windows (CDVRPTW)

The CDVRPTW is a variant of VRP that has emerged because of recent improvements in real-world communications. Larsen [85] introduced two aspects of DVRP: Not all information on the routing process is available when it begins; information may change even after the routes have been constructed.

To understand the problem, Fig. 1 illustrates the route planning of a CDVRPTW. Here, there are 3 vertices (A, B, and C) that correspond to 3 different customers with known requests at t = t0. The edges connecting the customers denote different parameters like traveling cost, traveling distance, etc., that are to be optimized. The initial route assignment to serve known requests (Depot, A, B, C, Depot) is planned before the vehicle leaves the system. While the vehicle follows the route AB, one dynamic request appeared at t = t1 (Fig. 2) in that cluster. So, the vehicle needs to reroute its path as (Depot, A, B, C, D, Depot) if the total capacity of the vehicle is greater than or equal to the sum of all capacities of each customer in the route.

Fig. 1
figure 1

Route planning in the static environment (at t = t0)

Fig. 2
figure 2

Route planning in the dynamic environment (at t = t1)

3.2 The degree of dynamism

The performance of a DVRP is estimated based on the number of dynamic demands and the time when these demands arise, though static VRP depends on the number of clients and their distribution. The measure of ‘dynamism’ would be significant when we investigate the performance of a particular algorithm under constraints. The role of the new request during the calculation phase of a routing framework can be used to analyze dynamism. The measure here is the number of dynamic demands with respect to the total demands. This proportion is known as the degree of dynamism and is expressed by [86]:

$$ {\text{dod}} = \frac{{{\text{Number}}\;{\text{of}}\;{\text{dynamic}}\;{\text{requests}}}}{{{\text{Total}}\;{\text{number}}\;{\text{of}}\;{\text{requests}}}} $$
(1)

This measure relies only on the number of requests but not on the time limits.

Dynamism within fixed time windows has several routing applications where the service must start and finish within a fixed known interval of time. In the case of time windows, dynamism is expressed by effective dod (edod). It is a measure of the average of how later the customer requests are accepted compared to the latest time the requests could be received. Let tk be the time at which kth request is received, let ek and lk, respectively, be the start and end of a particular time window, and the planning horizon is defined between 0 and T such that 0 ≤ tk \( \le T \). We define reaction time (\( r_{k} \)) as the distance between the time the request is made and the latest time at which the service should starts. The reaction time of kth request (Fig. 3) is defined by:

$$ r_{k} = l_{k} - t_{k} $$
(2)

In case of time windows \( (t_{\text{w}} ) \), the effective dod (\( e_{\text{dod}} \)) can be defined as:

$$ \begin{aligned} e_{\text{dod}} - t_{\text{w}} & = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} \left( {\frac{{T - \left( {l_{k} - t_{k} } \right)}}{T}} \right) \\ & = \frac{1}{n}\mathop \sum \limits_{k = 1}^{n} \left( {1 - \frac{{r_{k} }}{T}} \right) \\ \end{aligned} $$
(3)

where 0 \( \le e_{\text{dod}} - t_{\text{w}} \le 1 \) and \( l_{k} - t_{k} \le T,\quad k = 1,2, \ldots ,n \), where n is the sum of immediate requests and advanced requests during the horizon. In the proposed work, the DVRP model is treated as a set of static VRP by the event scheduler. The event scheduler is responsible for receiving customer requests and creating static problems. Then, it sends the problems to the optimization procedure (here SFO) and returns the optimal solutions.

Fig. 3
figure 3

The reaction time of two customers under time windows

3.3 Mathematical model of the problem

CDVRPTW can be considered as a complete graph-theoretical problem with G = (V, E), where V = {V0VB}, the vertex set and E = {(Vi, Vj): Vi, Vj \( \in \) V and i \( \ne j \) }, the edge set. The vertex V0 corresponds to the depot. Each vertex Vk \( \in \) V has parameters, namely a request Pk, arrival time \( T_{{a_{k} }} \), the waiting time \( T_{{{\text{w}}_{k} }} \), service time \( T_{{s_{k} }} \), and the time window [\( T_{{e_{k} }} \),\( T_{{l_{k} }} \)]. \( C_{kj}^{t} \) is the transportation cost from customer j to customer k in the time period t. Each vehicle k has a nonnegative capacity \( W_{k} \). The total number of time periods is denoted by T, and \( f_{c} \) and \( t_{c} \), respectively, are the fixed cost and traveling cost/unit time of the vehicle.

$$ x_{ki} = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}\;{\text{customer}}\;k\;{\text{is}}\;{\text{delivered}}\;{\text{by}}\;{\text{vehicle}}\;i} \hfill \\ {0,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. $$
$$ y_{kj}^{t} = \left\{ {\begin{array}{*{20}l} {1,} \hfill & {{\text{if}}\;{\text{vehicle}}\;{\text{serves}}\;{\text{the}}\;{\text{customer}}\;j\;{\text{from}}\;{\text{customer}}\;k\;{\text{in}}\;t} \hfill \\ {0,} \hfill & {\text{otherwise}} \hfill \\ \end{array} } \right. $$
(4)

The objective of the problem is to minimize the total cost, which is a summation of the fixed cost of vehicles and the routing cost:

$$ \begin{aligned} {\text{minimize }}\, F & = f_{c} \times \mathop \sum \limits_{k \in V} \mathop \sum \limits_{{j \in V - \left\{ {V_{0} } \right\}}} \mathop \sum \limits_{t = 1}^{T} y_{kj}^{t} + t_{c} \\ & \quad \times \mathop \sum \limits_{k \in V} \mathop \sum \limits_{{j \in V - \left\{ {V_{0} } \right\}}} \mathop \sum \limits_{t = 1}^{T} (C_{kj}^{t} \times y_{kj}^{t} ) \\ \end{aligned} $$
(5)

subject to the following constraints:

  1. (a)

    Allocation of customers to vehicles: Each customer has been allotted to only one vehicle at a given time:

    $$ \mathop \sum \limits_{k = 0,k \ne j}^{B} \mathop \sum \limits_{t = 1}^{T} y_{kj}^{t} = 1,\quad j = 1,2, \ldots ,B; $$
    $$ \mathop \sum \limits_{j = 1,j \ne k}^{B} \mathop \sum \limits_{t = 1}^{T} y_{kj}^{t} = 1, k = 1,2, \ldots ,B $$
    (6)
  2. (b)

    The capacity of the depot: The number of vehicles departed from the depot does not exceed the total number of vehicles (N) at the depot:

    $$ \mathop \sum \limits_{j = 1}^{B} \mathop \sum \limits_{t = 1}^{T} y_{0j}^{t} \le N $$
    (7)
  3. (c)

    The same vehicle services customers on the same cluster:

    $$ \begin{aligned} & \mathop \sum \limits_{i = 1}^{N} N\left( {x_{ki} - x_{ji} } \right) \ge M\left( {\mathop \sum \limits_{t = 1}^{T} y_{kj}^{t} - 1} \right),\quad \forall k,\,\,j,k \ne j \\ & \mathop \sum \limits_{i = 1}^{N} N(x_{ki} - x_{ji} ) \le M \left( {1 - \mathop \sum \limits_{t = 1}^{T} y_{kj}^{t} } \right),\quad \forall k,\,\,j,k \ne j \\ \end{aligned} $$
    (8)

    where M is a very large number which is defined as the maximum of the differences between the latest time of kth customer and earliest time of jth (consecutive) customer, \( \forall k,j,k \ne j \) in the route.

  4. (d)

    Every customer node must be serviced by a single vehicle:

    $$ \mathop \sum \limits_{i = 1}^{B} x_{jk} = 1,\quad \forall j = 1,2, \ldots ,B $$
    (9)
  5. (e)

    Vehicle capacity (\( W_{k} ) \): The load to deliver to customers by a vehicle should not exceed the vehicle capacity for a customer demand Pk.

    $$ \mathop \sum \limits_{k = 1}^{B} P_{k} x_{ki} \le W_{k} ,\quad \forall i = 1,2, \ldots ,N $$
    (10)
  6. (f)

    Eliminating alternate path: There should not be any circuit if the vehicle serves the customer j from customer k at t. Here |A| is the total number of customers in the path.

    $$ \begin{aligned} & \mathop \sum \limits_{k,j \in A \times A,k \ne j} \mathop \sum \limits_{t = 1}^{T} y_{kj}^{t} \le \left| A \right| - 1, A \\ & \quad \in \left\{ {1,2, \ldots ,B} \right\}\quad \forall k,\;j,k \ne j \\ \end{aligned} $$
    (11)
  7. (g)

    Time window constraints: The constraints on time to service a customer should not exceed the time window.

    $$ \begin{aligned} & T_{{a_{k} }} + T_{{w_{k} }} + T_{{s_{k} }} + C_{kj}^{t} - M\left( {1 - y_{kj}^{t} } \right) \le T_{aj} , \\ & T_{aj} \le T_{lj} , \\ & T_{{w_{k} }} = { \hbox{max} }\left\{ {T_{{e_{k} }} - T_{{a_{k} }} ,0} \right\} \\ \end{aligned} $$
    (12)
  8. (h)

    The values of \( x_{ki} \) and \( y_{kj}^{t} \) should be:

    $$ \begin{aligned} y_{kj}^{t} & = 0,1,\quad \forall k,j,i \\ x_{ki} & = 0,1,\quad \forall i,k \\ \end{aligned} $$
    (13)

4 Spiking neural P systems

An SN P system [87] of degree m ≥ 1 can be expressed by a tuple,

  • $$ \Pi = \left( {S,\sigma_{1} , \ldots ,\sigma_{m} ,S_{n,} O_{t} } \right) $$

    where

    1. (1)

      S = {a} is a singleton alphabet called spike.

    2. (2)

      σ1,…, σm are neurons of the form:

      σi= (Qj′, sj,0, wj,0, Rj), 1 ≤ j ≤ m, where

      1. (a)

        Qj is a finite set of states.

      2. (b)

        sj,0 ∈ Qj is the initial state.

      3. (c)

        wj,0 ∈ S* is the initial multiset.

      4. (d)

        Rj is a finite set of rules of the form:

        s w → s′ x ygo zout, where s, s0 ∈ Qj′, w, x ∈ S*, ygo ∈ (S × {go})*, and zout ∈ (S × {out})*, with the restriction that zout = λ for all j ∈ {1, 2,…, m} different from i0.

    3. (3)

      Sn \( \subseteq \) {1, 2,…,m} x {1,2,…, m} with (j, j) \( \notin \) Sn \( \forall \) j \( \in \left\{ {1,2, \ldots, m} \right\} \) are called synapses between neurons.

    4. (4)

      Ot indicates the set of output neurons σo, o \( \in \) {1,2,…, m}.

The rule set Ri defines the standard rules of the SN P system. The term ‘go’ means the spikes have to leave immediately the neuron to the neighboring neurons through synapses. Those objects that are marked with ‘out’ leave the system. Thus, when the configuration reaches a state where no rule can be applied, the computation halts.

5 An improved FA based on SN P systems

There are a few complications in using the classical FA to solve dynamic VRP problems: (1) The solution space of classical FA is continuous in the real domain, while the solution space of VRP is discrete integer domain. (2) The component \( \gamma \), the light absorption coefficient plays a key role in the light absorbance. In the event that the estimation of γ is low, at that point it makes FA falls into a local minimum, while a larger estimation of γ forces lower convergence rate. The domain of \( \gamma \) is [0.01, 100] in general.

The step length factor \( \alpha \) controls the population diversity, which improves the searching capability and avoids premature convergence. The domain of \( \alpha \) is [0, 1]. We adopted the discrete iterative position function to make the solution space suitable for VRP. In addition, the initial feasible positions of fireflies are designed by the Clarke and Wright algorithm which improves the solution space of the problem. Furthermore, the parameters \( \alpha \) and \( \beta_{0} \) are optimized using SN P systems. Also, the light absorption coefficient \( \gamma \) is reinitialized each time the scheme enters stagnation or when no improvements have made for a certain number of consecutive iterations.

The firefly movement can be expressed by:

$$ X_{i}^{\text{new}} = X_{i}^{\text{old}} + \beta \otimes \left( {X_{j} \circleddash X_{i} } \right) \oplus \alpha \left( {rand - 0.5} \right) $$
(14)

where \( X_{j}\, \circleddash \,X_{i} \) denotes the movement direction of the firefly. It is defined as follows:

$$ X_{j}\, \circleddash\, X_{i} = \left\{ {\begin{array}{*{20}l} {X_{j} - X_{i} \notin X_{i} } \hfill \\ {0,\quad {\text{otherwise}}} \hfill \\ \end{array} } \right. $$
(15)

The \( \alpha \left( {rand - 0.5} \right) \) and \( \beta \) are used to control the distance between fireflies j and i. Let \( D_{i} \) be the distance of firefly i moving to firefly j and can be defined as:

$$ D_{i} = \left\{ {\begin{array}{*{20}l} {X_{j} \circleddash X_{i} \oplus \alpha \left( {rand - 0.5} \right) < \beta } \hfill \\ {0,\quad otherwise} \hfill \\ \end{array} } \right. $$
(16)

Definition 1

The brightness of ith firefly can be represented as:

$$ I_{i} = f\left( i \right) $$
(17)

where \( f\left( i \right) \) denotes the fitness value of ith firefly which is evaluated by the objective function defined in Eq. 5.

Definition 2

The attraction factors between firefly i and firefly j are \( \beta_{\left( r \right)} = \beta_{0} e^{{ - \gamma r^{2} }} \), where \( \beta_{0} \) is the attraction degree at \( \gamma \) = 0.

Definition 3

The probability of firefly i to move toward firefly j at the kth dimension is:

$$ \begin{aligned} & {\text{Let}}\;P_{ij}^{k} \left( t \right) = \frac{{\beta_{{f_{k} }}^{\delta } \left( { {\raise0.7ex\hbox{$1$} \!\mathord{\left/ {\vphantom {1 {r_{ij} }}}\right.\kern-0pt} \!\lower0.7ex\hbox{${r_{ij} }$}}^{{\delta^{\prime}}} } \right) }}{{\mathop \sum \nolimits_{{j \notin {\text{allowed}}}} \beta_{{f_{k} }}^{\delta } \left( {{\raise0.7ex\hbox{$1$} \!\mathord{\left/ {\vphantom {1 {r_{ij} }}}\right.\kern-0pt} \!\lower0.7ex\hbox{${r_{ij} }$}}^{{\delta^{\prime}}} } \right)}}, \\ &{\text{where}}\;\beta_{{f_{k} }} = \beta_{0} e^{{ - \gamma r_{ij}^{2} }} \\ \end{aligned} $$
(18)

The parameters \( \delta\, {\text{ and }}\, \delta^{\prime} \) are used to weight the corresponding terms based on requirements.

If the system attains stagnation, the value of \( \gamma \) is reinitialized. So the value of \( \gamma \) is not fixed, and it varies after a certain number of iterations where there is no progress in the solution space. A firefly will move through the path with a low light absorption coefficient. The smaller the light absorption coefficient, the higher the brightness.

The FA can also be modified to direct fireflies not to blindly follow the distance metric, but also consider the obstacles, traffic data, and dynamic requests coming in between them.

The idea is as follows: Suppose firefly 1 is at position P1 and firefly 2 is at position P2. Let AB is any obstacle. In addition, let the brightness of firefly 1 at P1 is less compared to the brightness of firefly 2 at P2. Let X and Y are light sources with absorption coefficient γ1 and γ2, respectively, and γ1 < γ2. Firefly 1 will first choose the random path through Y, as the distance is less compared to the path through X. However, while moving to P2, the brightness of firefly 1 will be reduced due to the light source at Y and will lead to nonoptimal solutions in future. Even though the distance is high through X, the firefly 1 must select this path since the absorption coefficient is less and will eventually move to better solution space. If we map this scenario to the VRP domain, the light source can be treated as traffic data or dynamic requests that should be taken care of while deciding the route for the vehicles. Figure 4 shows this scenario.

Fig. 4
figure 4

The flashing process of fireflies under real-world scenarios

Determination of the parameters α and β0. The parameters α and \( \beta_{0} \) play important roles in FA. By playing out an attentive hunt of the parameter space, the preferred parameter settings those acquired through observational means could be anticipated. The domain of both the parameters is [0, 1]. In order to optimize the parameters using SN P systems, we are considering binary values. Here seven bits are taken to denote every parameter value. If the binary string is 1001111 for α or 79 in decimal representation, the actual value of α is \( \frac{79}{127} \) = 0.62 in two-point precision, i.e., if the decimal number estimation of the binary string is y, the actual estimation is \( \frac{y}{127} \). The same calculations can be applied for β0 also. So, 127 values are mapped to an interval of width 1. If the number of bits for representing value is high, exactness can be further improved, but the length of the spike train will be increased. The (α, β0) denotes parameter string, and 3 such strings are connected on each spike train.

figure a

The encodings are:

  • (α, β0) = {(87/127, 31/127), (127/127, 0/127), (65/127, 42/127)}

  • (α, β0) = {(0.68, 0.24), (1.00, 0.00), (0.51, 0.33)}

An SN P system is designed to produce a binary string, which is used to encode the values of the parameters. Then, it activates the evolution rules based on probability and the output from multiple neurons are collected. A number H of such SN P systems are connected through an adapter, which is responsible for dealing with population and probability adjustment.

The flow diagram of parameter optimization is given in Fig. 5. The family of SN P systems connected through the adapter module is shown in Fig. 6. A graphical representation of IFA using SN P system optimization is depicted in Fig. 7.

Fig. 5
figure 5

Flow diagram of SN P system parameter optimization

Fig. 6
figure 6

A family of SN P systems connected through an adapter for rule selection

Fig. 7
figure 7

Graphical representation of IFA using SN P system optimization

Definition 4

An optimization SN P system of degree mo ≥ 1 can be formally defined by [88]:

  • Π = (O, σ1\( \sigma_{{m_{o} }} \), σI1, σI2, S, Io)

    1. 1.

      O = {a} is the singleton alphabet.

    2. 2.

      σ1\( \sigma_{{m_{o} }} \) are neurons of the form σ1 = (1, Rk, Pk), and Rk = {r 1k , r 2k } is the rule set of the form, r 1k  = {a → a} r 2k  = {a → λ} and Pk = {P 1k ,P 2k } is a set of probabilities of rule set Rk, where P 1k  + P 2k  = 1, σI1 = σI2 = ({1, a → a}).

    3. 3.

      S = {(k, l) (1 ≤ k ≤ mo + 1 Λ l = mo + 2) \( \left. {{\bigvee }\left( {k = m_{o} + 2{\bigwedge }l = m_{o} + 1} \right)} \right\} \)

    4. 4.

      Io = {1, 2…\( m_{o} \)} is the set of output neurons.

Generating Initial Solutions First, we need to design initial feasible solutions considering customer allocation and path planning. A random assignment of visit day combination for each customer is planned using the Clarke and Wright savings algorithm. Here, the routes cannot be merged without violating the time duration or capacity constraints. Because of this, the number of vehicles for serving the routes may not be adequate. In such cases, route merging is carried out in which the route with the fewest customer is merged to other nearby routes minimizing the cost. However, this leads to solutions that violate time and capacity constraints. This procedure is redone until the number of routes and the number of vehicles are equal.

The Procedure The improved FA is applied to CDVRPTW at the cluster level. A cluster is a group of customers who are geographically mapped based on the distance. Each vehicle is treated as a firefly, and it moves toward a brighter customer from a less bright one. This brightness depends on the distance between the customers, traffic data, and dynamic requests. The route is created by visiting customers until all customers have been visited. As detailed in Sect. 3.3 the sum of fixed costs and transport costs of all routes of the solution has been considered as the objective function. Therefore, altogether CDVRPTW is a minimization problem in which the fireflies with a lower value of objective function are the most attractive ones.

The comparison of distances between fireflies is made cluster by cluster. For example, two fireflies of cluster i composed of 10 customers can be represented as:

  • X1(clusteri) = {0, 2, 3, 5, 7, 8, 1, 4, 6, 9}

  • X2(clusteri) = {2, 0, 5, 3, 7, 8, 1, 4, 6, 9}

The hamming distance between X1 and X2 for the ith cluster is 4. This analysis is made for every cluster, and the total distance between two fireflies is the sum of all the distances for every cluster.

In addition, instead of focusing only on the best optimal solution, the first 3 optimal solutions are selected and are assigned to 3 different spiking neurons. The reason for selecting the first 3 optimal solutions can be stated as a measure of the balance between exploitation and control. The number of solutions selecting for next iterations should be mature enough to find the global optimal solutions in a reasonable time using the minimum number of spiking neurons. Choosing a higher number of solutions may lead to an increase in the complexity of controlling and coordinating neuronal communications. The remaining solutions are collected for random movement to get diverse solution space. So, in the successive iterations, the chances of getting trapped in local minimum can be avoided and the optimal solutions can be generated more efficiently. It further improves the convergence rate due to the simultaneous exploration of solution space.

The initial value of parameter \( \gamma \) is set to 0.95. The value is chosen from several studies [89, 90] and will vary once the system enters into stagnation, and in that case, the value is reinitialized.

6 Spiking neural firefly optimization system (SFO) design

The proposed system is based on the flashing behavior of fireflies. A collection of different SN P systems is combined to produce the required spiking neural firefly optimization scheme (SFO). The SN P systems employed in the proposed scheme include arithmetic operator P systems, spike generator P systems, sorting P systems, random number generator P systems, objective function P systems, and improved firefly optimization P systems. The proposed SFO system is of the form:

$$ \Pi_{\text{SFO}} = \left( {\Pi_{\text{arith}} ,\Pi_{sg} , \Pi_{rg} , {{\Pi }}_{\text{sort}} ,\Pi_{\text{function}} ,\Pi_{\text{IFA}} } \right), $$

where \( \Pi_{\text{arith}} \) includes operations for addition, subtraction, multiplication, and division using SN P systems [91]. \( \Pi_{sg} \;{\text{and}}\;\Pi_{rg} \) are used for spike and random generators, respectively [92, 93]. \( \Pi_{\text{function}} \) indicates the SN P system for objective function evaluation based on the proposed scheme. \( \Pi_{\text{sort}} \) indicates the P system for sorting [94], and \( \Pi_{\text{IFA}} \) is used to denote the improved firefly algorithm using the SN P system for optimization as described in Sect. 5. Figure 8 shows the block diagram of the proposed SFO. Lines and arrows indicate the interactions between different modules in SFO. Lines are used for bidirectional communications, and arrows are used for unidirectional communications.

Fig. 8
figure 8

Block diagram of the proposed SFO system

6.1 Design of \( {{\Pi }}_{\text{IFA}} \)

Definition 5

We define a \( {{\Pi }}_{\text{IFA}} \) of degree mIFA ≥ 3 of the form,

$$ {{\Pi }}_{\text{IFA}} = \left( {O,\sigma ,syn,I_{in} , I_{o} } \right) $$
  1. 1.

    O is the singleton alphabet called a spike.

  2. 2.

    \( \sigma \) is the set of neurons in the IFA system.

    \( \sigma = \sigma_{s} \cup \sigma_{best} \cup \sigma_{\sec ond - best} \cup \sigma_{third - best} \cup \sigma_{remaining} \cup \sigma_{admin} \cup \sigma_{{I_{in} }} \cup \sigma_{{I_{o} }} \)where \( \sigma = \sigma_{1} \ldots \sigma_{{m_{\text{IFA}} }} . \) \( I_{\text{in}} \),\( I_{o} \) indicate input and output neurons, respectively. \( \sigma_{s} \) is the mediator neuron. { \( \sigma_{best} \cup \sigma_{second - best} \cup \sigma_{third - best} \cup \sigma_{remaining} \cup \sigma_{admin} \) } is referring to a firefly system where \( \sigma_{best} \cup \sigma_{second - best} \cup \sigma_{third - best} \) is the best, second-best, and third-best neurons of the solutions. The \( \sigma_{admin} \) is a central neuron that contains the optimal solutions after each iteration. The \( {{\Pi }}_{IFA} \) starts with the initial configuration \( Init_{c} \) and then enters into flashing configuration \( Flash_{c} \) and halting with \( Halt_{c} . \)

    Every neuron in \( \Pi_{IFA} \) is of the form,

    $$ \begin{aligned} & \sigma_{admin} = \left( {n_{l} ,R_{l} } \right), 1 \le l \le m_{\text{IFA}} , \\ & \sigma_{{best_{i} }} = \left( {n_{i} ,R_{i} } \right), 1 \le i \le m_{\text{IFA}} , \\ & \sigma_{{second - best _{j} }} = \left( {n_{j} ,R_{j} } \right), 1 \le j \le m_{\text{IFA}} , \\ & \sigma_{{third - best_{k} }} = \left( {n_{k} ,R_{k} } \right), 1 \le k \le m_{\text{IFA}} , \\ \end{aligned} $$

    \( \sigma_{{I_{\text{in}} }} = \left( {n_{\text{in}} ,R_{\text{in}} } \right) \) and \( \sigma_{{I_{o} }} = \left( {n_{o} ,R_{o} } \right) \)

    \( \sigma_{admin} \) is responsible for keeping the values of firefly positions and updating the best, second-best, and third-best solutions. The number of spikes is indicated by n with the index denoting the corresponding neuron. Every firefly has five neurons out of which four neurons are used to indicate its levels (best, second-best, third-best, and remaining) and one admin neuron to track the updated values.

  3. 3.

    Synapses:

    $$ \begin{aligned} syn & = \left\{ {(\sigma_{admin} , \sigma_{admin} ),} \right. \\ & \quad \left( {\sigma_{admin} ,\sigma_{best} } \right),\left( {\sigma_{admin} ,\sigma_{second - best} } \right), \\ & \quad (\sigma_{admin} ,\sigma_{third - best} ),\left( {\sigma_{admin} ,\sigma_{remaining } } \right), \\ & \quad \left( {\sigma_{admin} ,\sigma_{{I_{o} }} } \right), \left( {\sigma_{admin} ,\sigma_{{I_{in} }} } \right), \\ & \quad \left( {\sigma_{admin} ,\Pi_{sg} } \right),\left( {\sigma_{admin} , \Pi_{sort} } \right) \\ & \quad \left( {\sigma_{admin} , \Pi_{arith} } \right), \left( {\sigma_{admin} , \Pi_{rg} } \right), \\ & \quad \left( {\sigma_{admin} , \Pi_{function} } \right),\left( {\sigma_{admin} , \Pi_{IFA} } \right), \\ & \quad \left( {\sigma_{best} ,\Pi_{arith} } \right),\left( {\sigma_{second - best} ,\Pi_{arith} } \right), \\ & \quad \left. {\left( {\sigma_{third - best} ,\Pi_{arith} } \right),\left( {\sigma_{remaining} ,\Pi_{arith} } \right)} \right\} \\ \end{aligned} $$
  4. 4.

    The input neuron \( I_{\text{in}} \) receives parameters of the algorithm to initialize the system. The parameters include the initial position of fireflies and the maximum number of iterations.

  5. 5.

    The output neurons \( I_{o} \) act as monitors by checking the optimal solutions and maximum iterations of the algorithm. It is also responsible for sending the best, the second-best, and the third-best values and activates forgetting rules to re-instantiate or terminate the whole system.

Each firefly is denoted by a set of {\( \{ \sigma_{admin} ,\sigma_{best} , \sigma_{second - best} , \sigma_{third - best} {\text{and}} \sigma_{remaining} \} \) neurons. The mediator neuron \( \sigma_{s} \) distributes position for all admin neurons \( \sigma_{admin} \) and further broadcasts values of the best, the second-best, the third-best solutions, and iteration number. The power of the signal [95] is employed to give the needed power to reach to all admin neurons.

The Admin Neuron (\( \sigma_{admin} \)) The admin neurons are associated with a set of rules to perform the steps of the algorithm. Each admin neuron \( \sigma_{admin} \) has one initial spike and is loaded in the initial configuration \( (Init_{c} ). \)

The tasks of \( \sigma_{admin} \) are:

  1. 1.

    Accept algorithm variables and firefly positions to send to the best, second-best, third-best, and remaining neurons.

  2. 2.

    Calculate the values of the optimization equation and find feasible solutions.

  3. 3.

    Update the best optimal firefly in each stage.

All the rules are considered and controlled by time steps. This allows the incorporation of delay mechanisms in all rules, which is appropriate for various neuron stage activations. The admin neuron \( \sigma_{admin} \) starts receiving spikes from mediator neuron \( \sigma_{s} \), and system changes its transition from \( Init_{c} \) to \( Flash_{c} \) state. Then it sends the positional information of fireflies from mediator neurons to \( \sigma_{best} ,\sigma_{second - best} ,\sigma_{third - best} , {\text{ and }} \sigma_{remaining} \) neurons. At the same time \( {{\Pi }}_{function} \) is invoked by \( \sigma_{admin} \) to perform objective function evaluation. After this, it clears the neurons by activating the forgetting rules. After finishing the objective function evaluation, it invokes \( {{\Pi }}_{sort} \) in order to sort and rank the solutions, which are needed for the \( \sigma_{best} ,\sigma_{second - best} ,\sigma_{third - best} , {\text{and}} \sigma_{remaining} \) neurons. All the synapses, which are no longer needed, are deleted, and the \( \sigma_{best} ,\sigma_{second - best} ,\sigma_{third - best} , {\text{and}} \sigma_{remaining} \) neurons update the positions in the current iteration. After this, all admin neurons enter halt configuration \( Halt_{c} \).

$$ The\; \sigma_{best} ,\sigma_{\sec ond - best} ,\sigma_{third - best} ,\;and\;\sigma_{remaining} $$

Neurons These neurons are used for calculating the firefly positions in 4 levels. The number of spikes for instantiating these neurons is two. The new positions are calculated and updated by sending encoded information in spikes to the respective neurons at fixed time steps. After each iteration, the spikes are cleared by forgetting rules. The structure of the neurons representing the fireflies at runtime is shown in Fig. 9.

Fig. 9
figure 9

Structure of neurons in \( {{\Pi }}_{\text{IFA}} \) for 2 fireflies

The Input Neuron The input neuron is responsible for giving the algorithm parameters and constant values along with initial feasible solutions for the fireflies to start execution. After receiving the parameters, it changes the configuration from Initc to Flashc. The respective neurons are updated with the iteration number, initial feasible solutions, and initial firefly positions. It can also receive incoming spikes from \( \sigma_{{I_{o} }} \) containing 3 optimal solutions and the remaining solutions.

The Mediator Neuron \( (\sigma_{s} ) \) The functionalities of the mediator neurons are to act as mediators between the supporting SN P systems and \( \Pi_{IFA} \) and are provided with the power of signal strategy. The main function is to broadcast the values or objects to respective neurons to indicate the lifetime of spikes. The mediator neuron receives spikes from input neurons and distributes the parameters and positions to the admin neurons by controlling the synapses.

The Output Neuron \( (\sigma_{{I_{o} }} ) \) The responsibilities of output neurons include:

  1. 1.

    Increment the iteration number by 1 each time the algorithm proceeds.

  2. 2.

    Halt when the current iteration value and the maximum iteration value are the same.

  3. 3.

    Send spikes of position vectors to each of the \( \sigma_{best} ,\sigma_{second - best} ,\sigma_{third - best} , {\text{ and }} \sigma_{remaining} \) neurons.

  4. 4.

    Send the final output to the surroundings.

6.2 Supporting SN P systems

The supporting SN P systems include SN P systems for random number generations, objective function evaluation, arithmetic operations, sorting, and spike generators. An SN P system for objective function evaluation is designed to adapt to the fireflies’ flashing behavior based on the proposed scheme, while the remaining systems are found in the literature.

Definition 6

An SN P system for objective function evaluation ((\( \Pi_{function} \)) of degree mfunction ≥ 1 can be defined by the tuple:

\( {{\Pi }}_{function} = \left( {O,\sigma_{i} ,s_{n} ,I_{fin} ,I_{fo} } \right) \) where

  1. 1.

    O = {a} is the singleton alphabet called spike.

  2. 2.

    Set of neurons of the form,

    $$ \sigma_{i} = \left( {n_{i} ,R_{i} } \right), 1 \le i \le m_{function} , $$

    where \( n_{i} \) are the initial spikes in neuron \( \sigma_{i} \), which are employed according to the objective function. \( R_{i} \) is the rule set for calculating the new firefly positions and is encoded as \( \varphi \left( {a^{P} } \right) \) which will be broadcast to the admin neurons.

  3. 3.

    \( s_{n} = \{ (I_{\text{fin}} , {{\Pi }}_{\text{IFA}} ), \ldots ,(I_{\text{fo}} ,\Pi_{\text{IFA}} ) \)

  4. 4.

    The input neuron \( I_{\text{fin}} \) receives spikes which are encoded by the initial position information of fireflies from \( \Pi_{\text{IFA}} \)

  5. 5.

    The \( I_{\text{fo}} \) is the output neuron which sends the fitness value encoded by \( \varphi \left( {a^{{P.{\text{value}}}} } \right) \) to \( \Pi_{\text{IFA}} \)

The pseudocode of the proposed SFO system is given in algorithm 1. The algorithm starts by initializing the parameters according to the dataset. It uses the functionalities of the supporting SN P systems. The values are analyzed in each iteration to see whether the optimal solutions are reached or not. The encoded result \( \varphi \left( {X_{1} } \right) \) and \( \left( {X_{1} . {\text{value}}} \right) \), the best firefly is finally returned by the output neuron.

figure b

7 Experimental results and discussion

In this section, the performance of the proposed SFO system to solve CDVRPTW will be rigorously assessed. As far as the authors are concerned, there are no benchmark datasets on CDVRPTW till date. So the SFO is tested on DVRP instances and CVRPTW instances separately, and the results are analyzed. The various parameter settings are given in Table 1. All the experiments are simulated using MATLAB with Intel xenon 2.93 GHz processor, 12 GB RAM, and Windows 10 OS.

Table 1 Parameter settings

7.1 Experiment 1: DVRP datasets

SFO is tested by an average of 40 runs on the dataset, which are open and available at http://neo.lcc.uma.es/vrp/ (dataset 1). A comparison of solution quality in terms of best and average values is compared with ACO, k-ACO, E-ACO [96], VNS [97], and GA-DVRP [98], and the results are analyzed.

Also, the utilization rate of a vehicle is an important measure for verifying the performance of the proposed approach. It estimates whether each of the vehicles used fully or not. It is analyzed in Fig. 10. From the figure, it can be noted that the utilization rates of vehicles are high in SFO (13 out of 20 instances) scheme. In the other 7 instances, SFO is inferior to E-ACO but superior to GA, ACO, and VNS. In addition, the best and average values of the solution quality of six approaches can be analyzed from Table 2. The best results are in boldface. The proposed SFO achieves 17 out of 20 best values and 15 out of 20 average values compared with other best-known schemes. The K-ACO attains the worst performance compared to the other three algorithms. In order to clearly differentiate these schemes, K-ACO is chosen as a benchmark to plot how the improvement of the remaining algorithms is compared with K-ACO. A boxplot is given for analyzing the improvement percentage of various schemes. The minimum, the maximum, the sample median, and the first and third quartiles are shown in Fig. 11. From the analysis of boxplots, it is noted that the improvement percentage of SFO is superior to other schemes in terms of solution quality.

Fig. 10
figure 10

A comparison of the utilization rate of various schemes for dataset 1

Table 2 Comparison of solution quality in terms of best and average values of various schemes (dataset 1)
Fig. 11
figure 11

Comparison of solution improvement percentage of various algorithms for dataset 1 (K-ACO as the benchmark scheme)

From the analysis of all of the above results, we concluded that the proposed SFO system attains substantially better performance compared to other existing best-known schemes.

7.2 Experiment 2: CVRPTW datasets

The proposed SFO is applied over Solomon’s benchmark instances (dataset 2), which are partitioned into 6 categories: C1, C2, R1, R2, RC1, and RC2. The time window at depot is restricted for R1, C1, and RC1 so that only limited customers can be accommodated. But, for R2, C2, and RC2 the time window is broad with the goal that numerous clients can be overhauled in a similar route. The results of the proposed system are analyzed with the existing CVRPTW schemes, ant colony optimization (ACO) [99], ACO with tabu search [100], hybrid multiobjective evolutionary algorithm (HMOEA) [101], tissue P system with MOEA (PDVA) [102], and multiobjective goal programming (MOGP) [103]. A comparison is made on the vehicle utilization rate and solution quality and is given in Fig. 12 and Table 3, respectively. SFO achieved 5 out of 6 of the best solution values over 6 algorithms. From the analysis of results, it is noted that the vehicle utilization rate of SFO is superior to other best-known schemes. Also, the solution improvement percentage is calculated as ACO tabu as a benchmark scheme and the performance of various algorithms is plotted using boxplots in Fig. 13. From the boxplot, it is concluded that SFO is having the best solution improvement percentage and ACO is having the worst improvement over other schemes.

Fig. 12
figure 12

Comparison of vehicle utilization rates of various schemes for dataset 2

Table 3 Comparison of solution values for various schemes (dataset 2)
Fig. 13
figure 13

Comparison of solution improvement percentage of various schemes for dataset 2 (ACO tabu as the benchmark scheme)

Theorem 1

The time complexity of SFO is (r × T(IFA) +T (SS)+ T \( (I_{\text{in}} ) \) + T (\( I_{o} \))) × maxIter where r is a constant used to weight the term which can be either 1 (if T(IFA) alone is considered) or 2 (if \( \sigma_{admin} \) and T(IFA) are considered) and T(IFA) is the time units of firefly neurons. T (SS) denotes the total time needed for all the supporting SN P systems. \( T(I_{in} ) \) and T (\( I_{o} \)) refer to the time steps needed for the input and output neurons, respectively, and maxIter is the maximum number of iterations of the algorithm.

Proof

Let the SFO system have n fireflies, each of which involves 5n neurons in total. The solution neurons run in parallel and operate sequentially with their admin neurons. Therefore, the needed time is T(IFA). The n firefly operations depend on T (SS). The maximal parallelism in SN P systems allows polynomial time complexity in solving optimization problems according to the number of supporting systems. Therefore, it is concluded that the total time complexity of SFO is purely dependent on the maxIter and T(SS).□

The results from Theorem 1, Tables 2, and 3, Figs. 10, 11, 12, and 13 indicate the following conclusions:

  1. 1.

    The proposed SFO is superior to the best-known schemes in terms of solution quality and vehicle utilization rate.

  2. 2.

    With respect to the stopping conditions, the better balance between exploration and exploitation has stronger optimization capacity in polynomial time (Theorem 1).

  3. 3.

    The solution improvement percentage of SFO is higher compared to the best-known schemes.

7.3 Statistical analysis

Based on the experimental outcomes, a set of statistical analysis of algorithms over the instances has been made. We adopted a parametric test, pairwise t test to check the significance of SFO over other schemes and nonparametric test, Wilcoxon’s, and Friedman’s test to see whether the outcomes of the algorithms are significantly different or not regardless of the relationship between them. In addition, a Holm–Bonferroni [104] ranking scheme is adopted on the basis of the average solution quality obtained over 20 test instances for dataset 1 and 6 problem categories for dataset 2. All the experiments are conducted on IBM SPSS Statistics software with a significance level of 0.05.

7.3.1 Pairwise t test

The proposed SFO is compared with the state-of-the-art algorithms using the t test and is tabulated in Tables 4 and 5, respectively, for dataset 1 and dataset 2. The null hypothesis for the paired t test is assumed, as the true mean difference between paired samples is zero. On the other hand, the alternative hypothesis is that the true mean difference between paired samples is not zero. The significance level is set as 0.05. Since all the pairs have significance < 0.05, it leads to the rejection of the null hypothesis. So there is a pairwise difference of means among the samples.

Table 4 T test: paired samples statistics for dataset 1
Table 5 T test: paired samples statistics for dataset 2

7.3.2 Wilcoxon’s and Friedman’s test

These methods are used to analyze the pairwise differences among the algorithms considered for experimentation. For Wilcoxon’s test, the hypothesis is assumed as follows:

H0:

The medians of the pairs are identical

H1:

The medians for the pairs are not identical

The result of Wilcoxon’s test (WT) for dataset 1 is given in Table 6. For dataset 2, the test is not as significant as the number of samples is less than 10.

Table 6 WT test analysis for dataset 1

For the pairs ACO and SFO, the value of z = −2.9493. The corresponding p value is 0.00318. In addition, the value of w is 26. The critical value for w at n = 20 is 43. So the result is significant which leads to the rejection of the null hypothesis.

For the pairs K-ACO and SFO, the value of z = − 3.5839. The corresponding p value is 0.01209. In addition, the value of w is 9. So the result is significant which leads to the rejection of the null hypothesis.

For the pairs EACO and SFO, the value of z is − 3.1359. The p value is .00168. In addition, the value of w is 21. The critical value of w at N = 20 is 23, which leads to the rejection of H 0. The result is significant.

For the pairs VNS and SFO, the value of z is − 3.2106; the p value is .00132. The result is significant at p < 0.05. Also, the value of w is 19. The critical value at N = 20 is 43 which leads to the rejection of the null hypothesis.

For the pairs GADVRP and SFO, the value of z is − 3.2106. The p value is .00132. The result is significant at p < 0.05. In addition, the value of w is 19. The critical value at N = 20 is 43 which leads to the rejection of the null hypothesis.

The Friedman test (FT) is used to evaluate the differences between the number of samples. It relies on the rank ordering of data rather than means and variances. The value of Friedman’s test statistics for dataset 1 and dataset 2 is shown in Tables 7 and 8, respectively. In both cases, the p value is very small and is < 0.05. It indicates if the different treatments really are identical, what is the chance that random sampling would result in the sum of ranks as far apart as observed in the experiment. If p is small, reject the idea that all of the differences between columns are due to random sampling. Since the p value for both the datasets are small, the result is significant.

Table 7 FT summary for dataset 1
Table 8 FT summary for dataset 2

7.3.3 Holm–Bonferroni ranking

After analyzing the results so far, the algorithms are ranked on the basis of the solution quality for both datasets. Based on the ranks, SFO is taken as a reference scheme and the corresponding cumulative normal distribution values are calculated. It is then compared with the D/j values where j = 1,2,…5 (5 schemes except SFO) with D = 0.05. The results are given in Table 9 and 10. The significance is given as ‘Accepted’ if the null hypothesis (no difference in performance between the considered algorithms) is true otherwise ‘Rejected’ if the null hypothesis is false.

Table 9 Holm–Bonferroni ranking for dataset 1 (reference scheme-SFO (rank 5.15))
Table 10 Holm–Bonferroni ranking for dataset 2 (reference scheme-SFO (rank 5.5))

For dataset 2, it is noted that the performance of HMOEA and PDVA has the same performance as that of SFO. Therefore, it is concluded that the membrane-based algorithms and evolutionary schemes have similar performance to that of SFO compared to other schemes.

The experimental outcomes from Tables 4, 5, 6, 7, 8, 9, and 10 reveal the following conclusions:

  1. 1.

    The t test statistics show that SFO is really better than other schemes since there are significant differences between the solutions for both datasets.

  2. 2.

    From the calculated p values of WT and FT over the pairs, it is noted that the SFO is statistically superior to other schemes.

  3. 3.

    The Holm–Bonferroni ranking shows that the SFO is the highest in ranking and is able to outperform all nonmembrane-based schemes.

The results summarized above appear competitive with respect to state-of-the-art algorithms.

7.3.4 Post Hoc procedures

The main drawback of the FT is that it can only detect significant differences over multiple comparisons, being not able to define proper differences among the schemes. Therefore, a post hoc test [105] is applied to obtain a p value, which defines the degree of rejection of each hypothesis. We adopted Conover method with 2 types of p value adjustment, namely Holm FWER (forward error rate) and Benjamini–Hochberg FDR method [106]. These tests are used to discern which of the sample pair combinations are significantly different. Both FWER and FDR methods are used for reducing type 1 error inflation that leads to false-positive discovery rate. The FWER and FDR adjustment for dataset 1 is, respectively, indicated in Figs. 14 and 15.

Fig. 14
figure 14

Conover p values, further adjusted by the Holm FWER method (dataset 1)

Fig. 15
figure 15

Conover p values, further adjusted by the Benjamini–Hochberg FDR method (dataset 1)

Post hoc p values of all possible pairs (of samples/groups) can be represented as a lower triangular matrix. Each entry is the p value of row/column pair, i.e., the null hypothesis that the group represented by a particular column name is different from the group by a particular row name. The pair, which is significant, is shaded using green color. The nonsignificant pairs are colored using red.

From Figs. 14 and 15, it is indicated that the pair GA-DVRP and E-ACO pair is not significant. All the pairs involving SFO are significant. For dataset 2, there are 4 pairs that are significant in the FWER method and are indicated in Fig. 16. Among the 4 pairs 3 pairs involving SFO gives a significant result. In the FDR method, there are a total of 8 pairs which are significant. Among those, 4 pairs are having SFO which are significant. So finally, it can be concluded that the significance of SFO is higher compared to other schemes in both FWER and FWR methods of post hoc analysis (Fig. 17).

Fig. 16
figure 16

Conover p values, further adjusted by the Holm FWER method (dataset 2)

Fig. 17
figure 17

Conover p values, further adjusted by the Benjamini–Hochberg FDR method (dataset 2)

8 Conclusions and future scope

This article suggests an efficient SN P system combined with firefly optimization architecture for addressing the problems of vehicle routing, namely CDVRPTW. In this scheme, we suggested a feasible way of using the SN P system to develop an optimization method for having the approximate solutions of the problem instance. It uses Clarke and Wright algorithm to define initial assignments of fireflies to find solutions to CDVRPTW. A high degree of parallelism is employed by carefully designing the firefly–neuron combinations, and the parameters in the firefly scheme are well optimized by adjusting the rule probabilities of SN P systems. The authors presented the algorithms, complexity analysis, and experimental results with statistical validation to check the effectiveness of the algorithm. This study being the first attempt in this kind, the results appear to be positive and successful relative to ad hoc optimization algorithms. The experimental outcomes justify the novelty, effectiveness, and feasibility of the system.

The main strengths of the proposed system include:

  1. 1.

    The spiking neural P systems are operating in parallel on neuron level and sequential on system level. This point underlines the difference of earlier studies with the proposed scheme.

  2. 2.

    The optimization procedure is done by a spiking neural P system not by a human-designed algorithm.

  3. 3.

    The idea can be used for other applications such as fault diagnosis of power stations, path planning of robots, unmanned aerial vehicle routing, image segmentation.

The drawbacks of the proposed scheme are:

  1. 1.

    The time complexity depends on the iterations and supporting SN P systems. So for applications which need more number of iterations and supporting systems, the complexity will be high.

  2. 2.

    The power of signal strategy needs to be carefully designed for each neuronal communications.

For future works, the authors will try to overcome the afore-mentioned drawbacks.

Other research dimensions can be stated as follows:

  1. 1.

    Expand the method to handle more complex real-world dynamic instances.

  2. 2.

    Improve the convergence rate further by carefully designing the modules.

  3. 3.

    Consider timed SN P systems to control and coordinate the whole subsystems