1 Introduction

Today, it is well recognized that the processes of learning and the transfer of what has been learned are central to human problem-solving [1]. Learning is fundamental to how humans function and adapt in a fast-evolving society. Beyond learning from past successes and mistakes, and learning to avoid repeating those mistakes, the human ability to select, generalize, and draw upon what has been experienced and learned in one context, and to extend it to new problems, is deemed most remarkable [2, 3].

Within the context of computational intelligence, several core learning technologies in neural and cognitive systems, fuzzy systems, and probabilistic and possibilistic reasoning have been notable for their ability to emulate some of humans' cultural and generalization capabilities [4–7], with many now used to enhance our daily life. Recently, in contrast to traditional machine learning approaches, transfer learning, which uses data from a related source task to augment learning in a new or target task, has attracted extensive attention and demonstrated great success in a wide range of real-world applications, including computer vision, natural language processing, and speech recognition [8–13]. In spite of these accomplishments in computational intelligence, attempts to emulate the cultural intelligence of humans in search, and in evolutionary optimization in particular, have to date received far less attention. Existing evolutionary algorithms (EAs) have yet to fully exploit the useful traits that may exist in similar tasks or problems, and the study of optimization methodologies that evolve along with the problems solved remains under-explored in evolutionary computation. To date, most evolutionary computation approaches continue to start the search on a given new problem from a ground-zero state [14–23]. A major gap in existing evolutionary search methodologies is thus the lack of techniques that enhance evolutionary search on new problems by learning from related problems solved in the past.

In the literature, memetic computation has been defined as a paradigm that uses the notion of meme(s) as units of information, encoded in computational representations, for the purpose of problem-solving. Memes are captured from recurring information patterns or structures and can be evolved to form more complex higher-level structures. To date, however, a meme has usually been perceived as a form of individual learning procedure, adaptive improvement procedure, or local search operator for enhancing the capability of population-based search algorithms [24–26]. Over the last decades, this integration has been established as an extension of the canonical evolutionary algorithm, known in the literature as a hybrid, adaptive hybrid, or memetic algorithm (MA) [25, 27–30]. Falling back on the basic definition of a meme by Dawkins and Blackmore [31, 32] as the fundamental building block of cultural evolution, research on memetic computation could perhaps take a more meme-centric focus by treating memes as the building blocks of a given problem domain.

In this paper, we embark on a study towards a new memetic computation paradigm: evolutionary optimization \(+\) transfer learning for search, one that models how humans solve problems. We believe that by leveraging the common characteristics among problem instances that belong to the same problem domain, i.e., topological properties, data distributions or otherwise, effective assessment of future unseen related problem instances can be achieved more efficiently, without the need to perform an exhaustive search each time or to start the evolutionary search from a ground-zero knowledge state. Above and beyond the standard mechanisms of a conventional evolutionary search, for instance the genetic operators in the case of a genetic algorithm, our proposed approach has four additional culture-inspired operators, namely, Learning, Selection, Variation and Imitation. The role of the learning operator is to mine for knowledge memes from past problem-solving experiences, which shall then manifest as instructions to bias the search on future problems intelligently (i.e., narrowing down the search space). The selection operator, on the other hand, selects the high quality knowledge memes that shall then replicate and undergo new innovations via the variation operator, before being drawn upon to enhance future evolutionary search. Last but not least, the imitation operator defines the assimilation of knowledge memes in subsequent problem-solving. To summarize, the core contributions of the current work are multi-faceted, and are outlined as follows:

  1.

    To date, most search methods start the optimization process from scratch, with the assumption of zero usable information, i.e., ignoring how similar the current problem instance of interest is to those encountered in the past [14, 34–36]. The current work endeavors to fill this gap by embarking on a study of an evolutionary optimization methodology with transfer capabilities that evolves along with the problems solved.

  2.

    To the best of our knowledge, the present study serves as a first attempt to propose transfer learning as culture-inspired operators in the spirit of memetic computation (comprising the mechanisms of cultural learning, selection, variation and imitation [27, 30, 33, 36, 37]), as a form of ‘intelligent initialization’ that seeds high quality solutions into the starting population of a conventional evolutionary optimization so as to speed up future search on related problems.

  3.

    Beyond the formalism of simple and adaptive hybrids as memetic algorithms, this paper introduces and showcases a novel representation of the knowledge acquired from past optimization experiences in the form of memes. In contrast to the manifestation of memes as refinement procedures in hybrids, here memes manifest as natural building blocks of meaningful information, in the present context serving as the instructions for generating solutions that lead towards optimized solutions both efficiently and effectively.

  4.

    We derive the mathematical formulations of the proposed transfer learning culture-inspired operators for faster evolutionary optimization of related problems. In particular, we formulate the knowledge mining problem in the learning operator as modeling the mapping between past problem instances solved and their respective optimized solutions via maximization of their statistical dependence. The selection operator is formulated as a maximization of the similarity between problem instance distributions, while variation and imitation are derived as the generalization of knowledge learned from past problem instances solved.

  5.

    Comprehensive studies on two separate NP-hard problem domains, using benchmark sets of diverse properties and a real-world package collection/delivery problem, show that the proposed culture-inspired operators lead to significant speedups in search performance at no loss in solution quality, when incorporated into recently proposed evolutionary solvers. Notably, on several problem instances, improved search quality is observed over recently proposed state-of-the-art evolutionary solvers of the respective problem domains considered.

The rest of this paper is organized as follows: a brief discussion of related works and an introduction to memes are given in Sect. 2. Section 3 introduces the proposed new memetic computation paradigm for search, via learning from past problem-solving experiences, to speed up evolutionary searches on related problems. Section 4 presents a brief discussion of the routing problem domain, particularly the capacitated vehicle routing problem (CVRP) and the capacitated arc routing problem (CARP), and recently proposed evolutionary optimization methodologies for solving them. The proposed mathematical formulations and algorithms of the learning, selection, variation and imitation operators for fast evolutionary search on NP-hard routing problems are then described in Sect. 5. Section 6 presents and analyzes the detailed experimental results obtained on the CVRP and CARP benchmark sets and subsequently on a real-world application. Lastly, brief concluding remarks are drawn in Sect. 7.

2 Preliminary

In this section, we first provide a brief review of related works in the literature. Subsequently, we present an overview of the meme as a building block of problems for problem-solving, which serves as one of the inspirations of this paper.

2.1 Related works

In practice, problems seldom exist in isolation, and previously encountered related problem instances often yield useful information that, when properly harnessed, can lead to more efficient future evolutionary search. To date, some attempts have been made to reuse solutions from search experiences. Louis et al. [38], for instance, presented a study that acquires problem-specific knowledge and subsequently uses it to aid the genetic algorithm (GA) search via case-based reasoning. Rather than starting anew on each problem, appropriate intermediate solutions drawn from similar problems previously solved are periodically injected into the GA population. In a separate study, Cunningham and Smyth [39] also explored the reuse of established high quality schedules from past problems to bias the search on new traveling salesman problems (TSPs). Similar ideas on implicit and explicit memory schemes to store elite solutions have also been considered in dynamic optimization problems, where the objective function, design variables, and environmental conditions may change with time (for example, periodic changes) [40]. However, since [38] and [39], as well as the works on dynamic optimization problems [40], generally consider the exact storage of past solutions or partial solutions from previously solved problems, which are subsequently inserted directly into the solution population of a new evolutionary or dynamic optimization search, they do not apply well to unseen related problems that differ in structural properties, such as problem vertex size, topological structure, or representation. This means that what has been memorized from related problems solved cannot be directly injected into future search on unseen problems for successful reuse.

More recently, Pelikan et al. [41] proposed a framework to improve model-directed optimization techniques by combining a pre-defined problem-specific distance metric with information mined from previous optimization experience on similar problems. They empirically illustrated that the proposed approach can significantly speed up the original optimization technique. Further, Santana et al. [42] proposed to transfer structural information from subproblems to bias the construction of the aggregation matrix of an estimation of distribution algorithm (EDA) for solving the multi-marker tagging single-nucleotide polymorphism (SNP) selection problem. They also obtained significant improvements over EDAs that do not incorporate information from related problems. It is worth noting, however, that since these transfer approaches are designed for model-based evolutionary optimization methods (e.g., EDAs), they cannot be applied with model-free evolutionary algorithms such as the genetic algorithm. Last but not least, Santana et al. [43] introduced the use of network theory for mining structural information for evolutionary optimization, but how the mined information could be used across problems to enhance evolutionary search was not discussed.

In contrast to existing approaches, the present new memetic computation paradigm addresses the task of learning generic building blocks, or knowledge of useful traits, from past problem-solving experiences, and subsequently drawing upon them through the cultural evolutionary mechanisms of learning, selection, variation and imitation (as opposed to the simple direct copying of past solutions or the model-based approaches of previous works) to speed up the search on new problems of the same domain. In such a manner, the transfer and incorporation of knowledge memes as generic building blocks of useful traits is applicable with model-free evolutionary optimization, and leads to enhanced search on problems of differing vertex size, topological structure, representation, etc.

2.2 Meme as building block of problems

Like the gene in genetics, the meme is the basic unit of memetics: the building block of cultural know-how that is transmissible and replicable [30]. Over the last decades, the meme has inspired the new science of memetics, which today represents the mind-universe analog to genetics in cultural evolution, stretching across the fields of biology, cognition, psychology, and sociology [33].

Looking back on its history, the term meme can be traced back to Dawkins [31] in his book “The Selfish Gene”, where he defined it as “a unit of information residing in the brain and is the replicator in human cultural evolution”. Like genes that serve as “instructions for building proteins”, memes are then “instructions for carrying out behavior, stored in brains”. In her famous book “The Meme Machine”, Blackmore reaffirmed the meme as information copied from one person to another and discussed the theory of “memetic selection” as the survival of the fittest among competing ideas down through generations [32]. Other definitions of meme that took flight from there have since emerged to include “memory item, or portion of an organism’s neurally-stored information” [44], “unit of information in a mind whose existence influences events such that more copies of itself get created in other minds” [45], and “contagious information pattern that replicates by parasitically infecting human minds” [46].

In the literature, beyond the formalism of simple and adaptive hybrids in MAs, Situngkir [47] presented a structured analysis of culture by means of memetics, where the meme was regarded as the smallest unit of information. Heylighen and Chielens [48] discussed the replication, spread and reproduction operators of memes in cultural evolution. Nguyen et al. [49] studied the notion of “Universal Darwinism” and social memetics in search, and investigated the transmission of memetic material via non-genetic means, while Meuth et al. [50] proposed a new paradigm of meta-learning memetic computing for search. In their work, they demonstrated the concept of meta-learning with a memetic system, consisting of an optimizer, a memory, a selection mechanism, and a generalization mechanism, that conceptualizes memes not just within the scope of a problem instance, but over a more generic contextual scope. More recently, Feng et al. [51] presented a memetic search with inter-domain learning, wherein the meme is defined as a knowledge building block that can be reused across different problem domains for enhanced evolutionary search.

3 Proposed memetic computation paradigm

In this section, we present the proposed memetic computation paradigm: evolutionary optimization \(+\) transfer learning. In particular, we propose four culture-inspired operators that introduce high quality solutions into the initial population of the evolutionary search on related problems, thus leading to enhanced optimization performance. In our approach, the instructions for carrying out the behavior to act on a given problem are modeled as knowledge memes. The knowledge memes serve as the building blocks of past problem-solving experiences that may be efficiently passed on or replicated, by means of cultural evolution, to support the search on future unseen problems. This capacity to draw on knowledge from previous problem-solving sessions in the spirit of memetic computation [27, 30, 33] thus allows future search on related problems to be more efficient.

Fig. 1 Proposed memetic computation paradigm: evolutionary optimization (a) \(+\) transfer learning (b)

3.1 Transfer learning as culture-inspired operators

The proposed memetic computation paradigm, comprising evolutionary optimization (Fig. 1a) \(+\) transfer learning (Fig. 1b), is depicted in Fig. 1. In the figure, \(\mathbf {P_{old}}\) is the set of past problems solved, with \(\mathbf {S_{old}}\) denoting the respective optimized solutions of \(\mathbf {P_{old}}\). \(\mathbf {P_{new}}\) is the set of unseen problems of interest to be optimized, and \(\mathbf {S^j_{new}}\) is the initialized population of potential solutions for unseen problem \(\mathbf {p^j}\). \(\mathbf {M}\) denotes a knowledge meme. The proposed paradigm is composed of a conventional evolutionary algorithm, as depicted in Fig. 1a (it can be any state-of-the-art evolutionary algorithm in the domain of interest), and four culture-inspired operators proposed for facilitating faster evolutionary optimization of related problems, as depicted in Fig. 1b, namely Learning, Selection, Variation and Imitation, whose functions are described in what follows:

  • Learning operator: Let \(\mathbf {p}\) correspond to a problem instance and \(\mathbf {s^*}\) denote the optimized solution of \(\mathbf {p}\), as attained by an evolutionary solver (labeled here as \(\textit{ES}\)). The learning operator takes the role of modeling the mapping from \(\mathbf {p}\) to \(\mathbf {s^*}\) to derive the knowledge memes. The learning process thus evolves in an incremental manner, building up a wealth of ideas in the form of identified knowledge along with the number of problem instances solved. Note the contrast to a simple storage or exact memory of a specific problem instance \(\mathbf {p}\) with its associated solution \(\mathbf {s^*}\), as considered in previous studies based on case-based reasoning [38].

  • Selection operator: Different prior knowledge introduces unique forms of bias into the search. Hence, a certain bias will make the search more efficient on some classes of problem instances but not on others. Inappropriately harnessed knowledge, on the other hand, may impair the search. The selection operator thus serves to select, from the knowledge pool, the high quality knowledge that replicates successfully.

  • Variation operator: Variation forms the intrinsic innovation tendency of cultural evolution. Without variation, maladaptive forms of bias may be introduced into evolutionary searches on new problem instances. For instance, a piece of knowledge that has been established as beneficial based on its particular demonstration of success on a given problem instance would quickly spiral out of control via replication, suppressing the diversity and search of the evolutionary optimization across problems. Variation is therefore clearly essential for retaining diversity in the knowledge pool towards efficient and effective evolutionary search.

  • Imitation operator: As described in Dawkins’s book “The Selfish Gene” [31], ideas are copied from one person to another via imitation. In the present context, knowledge memes learned from past problem-solving experiences replicate by means of imitation and are used to enhance future evolutionary search on newly encountered problems.

3.2 Learning from past experiences

First, the schemata representation of the knowledge meme in computing, as the latent pattern, is identified. The problem-solving experiences on encountered problems are then captured via learning and crystallized as part of the knowledge pool, forming the memes or building blocks in the society of mind [52]. In this manner, whenever a new problem comes about, the selection operator first kicks in to identify the appropriate knowledge memes from the wealth of previously accumulated knowledge. These knowledge memes then undergo variation to effect the emergence of innovative knowledge. Enhanced problem-solving efficiency on the given new problems is then achieved by means of imitation.

Referring to Fig. 1, at time step \(i=1\), the evolutionary solver \(\textit{ES}\) is faced with the first problem instance \(\mathbf {p^1}\) to search on. Since \(\mathbf {p^1}\) denotes the first problem of its kind to be optimized, no prior knowledge is available for enhancing the search of the evolutionary solver \(\textit{ES}\). This is equivalent to the case where a child encounters a new problem of its kind to work on, in the absence of a priori knowledge that he/she could leverage upon. This condition is considered as “no relevant knowledge available” and the search by solver \(\textit{ES}\) shall proceed normally, i.e., the selection operator remains dormant. If \(\mathbf {s^{1*}}\) corresponds to the optimized solution attained by solver \(\textit{ES}\) on problem instance \(\mathbf {p^1}\) and \(\mathbf {M}\) denotes the knowledge meme or building block, then \(\mathbf {M_1}\) is the knowledge learned from \(\mathbf {p^1}\) and \(\mathbf {s^{1*}}\) via the learning operator. Since the learning process is conducted offline with respect to the optimization of future related problems, no additional computational burden is placed on the existing evolutionary solver \(\textit{ES}\). On subsequent unseen problem instances \(j=2,\ldots ,\infty \), selection kicks in to identify the appropriate knowledge memes \(\mathbf {M}\)s from the knowledge pool, denoted here as \(\mathbf {SoM}\). The activated knowledge memes \(\mathbf {M}\)s then undergo the variation operator to arrive at innovated knowledge \(\mathbf {M_t}\) that can be imitated to bias subsequent evolutionary optimizations by the \(\textit{ES}\). In this manner, useful experiences attained from previously solved problem instances are captured incrementally and archived in the knowledge pool \(\mathbf {SoM}\), forming the society of mind, and are appropriately activated to enhance future search performance.

Like the knowledge housed in the human mind for coping with everyday life and problem-solving, the knowledge memes residing in the artificial mind of the evolutionary solver play the role of positively biasing the search on newly encountered problems. In this manner, the intellectual capacity of the evolutionary solver evolves along with the number of problems solved, with transferrable knowledge memes accumulating over time. When a new problem is encountered, suitable learned knowledge memes are activated and varied to guide the solver in the search process. The knowledge pool thus forms an evolving body of problem domain knowledge that may be activated to make future evolutionary searches efficient.

4 Case studies on routing problems

In this section, we present the two widely studied, challenging NP-hard domains considered in the present study: capacitated vehicle routing and capacitated arc routing.

4.1 Capacitated vehicle routing problem

The capacitated vehicle routing problem (CVRP), introduced by Dantzig and Ramser [53], is the problem of designing a set of vehicle routes in which a fixed fleet of delivery vehicles of uniform capacity must service known customer demands for a single commodity from a common depot at minimum cost. The CVRP can be formally defined as follows. Given a connected undirected graph \(G = (V, E)\), with vertex set \(V = \{v_i\}, i=1 \ldots n\), where n is the number of vertices, and edge set \(E = \{e_{ij}\}, i, j=1 \ldots n\), where \(e_{ij}\) denotes the edge between vertices \(v_i\) and \(v_j\). Vertex \(v_d\) corresponds to the depot at which k homogeneous vehicles are based, and the remaining vertices denote the customers. Each edge \(e_{ij}\) is associated with a non-negative weight \(c_{ij}\), which represents the travel distance from \(v_i\) to \(v_j\). Given a demand set \(D = \{d(v_i) | v_i \in V\}\), where \(d(v_i) > 0\) implies that customer \(v_i\) requires servicing (i.e., is a task), the CVRP consists of designing a set of least-cost vehicle routes \(\mathcal {R} = \{\mathcal {C}_i\},~i=1 \ldots k\) such that

  1.

    Each route \(\mathcal {C}_i,~i\in [1,k]\) must start and end at the depot node \(v_d \in V\).

  2.

    The total load of each route must be no more than the capacity W of each vehicle, \(\sum _{\forall v_i\, \in \,\mathcal {C}_i} d(v_i) \le W\).

  3.

    \(\forall v_i \in V~and~d(v_i) > 0\), there exists one and only one route \(\mathcal {C}_i \,\in \,\mathcal {R}\) such that \(v_i\, \in \, \mathcal {C}_i\).

The objective of the CVRP is to minimize the overall distance \(cost(\mathcal {R})\) traveled by all k vehicles, defined as:

$$\begin{aligned} cost(\mathcal {R}) = \sum _{i=1}^{k} c(\mathcal {C}_i) \end{aligned}$$
(1)

where \(c(\mathcal {C}_i)\) is the sum of the travel distances \(c_{ij}\) of the edges contained in route \(\mathcal {C}_i\).
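The CVRP objective and constraints above can be sketched in code. The following is a minimal illustration, not the paper's implementation; it assumes Euclidean travel distances computed from vertex coordinates and a hypothetical encoding of a solution as a list of routes, each a list of customer indices (the depot is implicit).

```python
import math

def cvrp_cost(routes, coords, depot=0):
    """Total travel distance per Eq. (1): each route is traversed
    from the depot, through its customers in order, back to the depot."""
    def dist(i, j):
        (x1, y1), (x2, y2) = coords[i], coords[j]
        return math.hypot(x1 - x2, y1 - y2)

    total = 0.0
    for route in routes:
        tour = [depot] + list(route) + [depot]
        total += sum(dist(a, b) for a, b in zip(tour, tour[1:]))
    return total

def cvrp_feasible(routes, demand, capacity):
    """Capacity constraint (2): the load of every route must not exceed W."""
    return all(sum(demand[v] for v in route) <= capacity for route in routes)
```

For instance, with a depot at the origin and customers at (3, 0) and (3, 4), a single route visiting both incurs cost 3 + 4 + 5 = 12.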

4.2 Capacitated arc routing problem

The capacitated arc routing problem (CARP) was first proposed by Golden and Wong [54] in 1981. Instead of serving a set of customers (i.e., nodes or vertices) as in the CVRP, the CARP serves a set of streets or street segments. It can be formally stated as follows: Given a connected undirected graph \(G = (V, E)\), with vertex set \(V = \{v_i\}, i=1 \ldots n\), where n is the number of vertices, and edge set \(E = \{e_{i}\}, i=1 \ldots m\), with m denoting the number of edges. Consider a demand set \(D = \{d(e_i) | e_i \in E\}\), where \(d(e_i) > 0\) implies that edge \(e_i\) requires servicing (i.e., is a task), a travel cost vector \(C_t = \{c_t(e_i) | e_i \in E\}\), with \(c_t(e_i)\) representing the cost of traveling on edge \(e_i\), and a service cost vector \(C_s = \{c_s(e_i) | e_i \in E\}\), with \(c_s(e_i)\) representing the cost of servicing edge \(e_i\). A solution of the CARP can be represented as a set of travel circuits \(\mathcal {R} = \{\mathcal {C}_i\},~i=1 \ldots k\) which satisfies the following constraints:

  1.

    Each travel circuit \(\mathcal {C}_i,~i\in [1,k]\) must start and end at the depot node \(v_d \in V\).

  2.

    The total load of each travel circuit must be no more than the capacity W of each vehicle, \(\sum _{\forall e_i\, \in \, \mathcal {C}_i} d(e_i) \le W\).

  3.

    \(\forall e_i \in E~and~d(e_i) > 0\), there exists one and only one circuit \(\mathcal {C}_i \,\in \,\, \mathcal {R}\) such that \(e_i\, \in \,\, \mathcal {C}_i\).

Fig. 2 An example of a CARP or CVRP instance

The cost of a travel circuit is defined as the total service cost of all edges that require service, together with the total travel cost of the remaining edges that form the circuit:

$$\begin{aligned} cost(\mathcal {C}) = \sum _{e_i\, \in \, \mathcal {C}_s} c_{s}(e_i) + \sum _{e_i\, \in \,\mathcal {C}_t} c_{t}(e_i) \end{aligned}$$
(2)

where \(\mathcal {C}_s\) and \(\mathcal {C}_t\) are the sets of edges in the circuit that require servicing and those that do not, respectively. The objective of the CARP is then to find a valid solution \(\mathcal {R}\) that minimizes the total cost:

$$\begin{aligned} C_\mathcal {R} = \sum _{\forall \mathcal {C}_i \,\in \, \mathcal {R}} cost(\mathcal {C}_i) \end{aligned}$$
(3)
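Equations (2) and (3) translate directly into code. The sketch below is illustrative only and assumes a hypothetical encoding of a circuit as a list of (edge_id, serviced) pairs, where the flag marks whether the vehicle services the edge or merely deadheads over it.

```python
def circuit_cost(circuit, travel_cost, service_cost):
    """Cost of one travel circuit per Eq. (2): serviced edges incur
    their service cost c_s, traversed-only edges their travel cost c_t."""
    return sum(service_cost[e] if serviced else travel_cost[e]
               for e, serviced in circuit)

def carp_cost(solution, travel_cost, service_cost):
    """Total cost C_R per Eq. (3): sum of the costs of all circuits."""
    return sum(circuit_cost(c, travel_cost, service_cost)
               for c in solution)
```

For example, a circuit that services edges a and c while only traveling over edge b costs \(c_s(a) + c_t(b) + c_s(c)\).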

4.3 Solving of CVRP and CARP domains

Here we generalize the solving of routing problems as searching for suitable task assignments (i.e., vertices or arcs that require servicing) for each vehicle, and then finding the optimal service order of each vehicle for its assigned tasks. In the evolutionary search literature, the task assignment stage has been realized by means ranging from simple task randomization [29] to more advanced strategies including heuristic search, clustering [55], etc., while the optimal service order of each vehicle is attained via the mechanisms of evolutionary search operators. An example of an optimized routing solution is illustrated in Fig. 2, where four vehicle routes can be observed, namely, \(R_1 = \{0, \mathbf {v}_1, \mathbf {v}_2, \mathbf {v}_3, 0\}\), \(R_2 = \{0, \mathbf {v}_6, \mathbf {v}_5, \mathbf {v}_4, 0\}\), \(R_3 = \{0, \mathbf {v}_{10}, \mathbf {v}_9, \mathbf {v}_8, \mathbf {v}_7, 0\}\) and \(R_4 = \{0, \mathbf {v}_{14}, \mathbf {v}_{13}, \mathbf {v}_{12}, \mathbf {v}_{11}, 0\}\). A ‘0’ index value is assigned at the beginning and end of each route to denote that the route starts and ends at the depot.
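The ‘0’-delimited encoding described above can be decoded mechanically. The helper below is a minimal sketch of this decoding step (the function name is ours, not the paper's): it splits a flat, depot-delimited task sequence into the per-vehicle routes of Fig. 2.

```python
def split_routes(chromosome, depot=0):
    """Split a flat, depot-delimited task sequence into vehicle routes.

    A depot marker (index 0) opens and closes each route, so consecutive
    non-depot indices between two markers form one vehicle's service order.
    """
    routes, current = [], []
    for v in chromosome:
        if v == depot:
            if current:          # close the route in progress, if any
                routes.append(current)
                current = []
        else:
            current.append(v)    # task serviced by the current vehicle
    if current:                  # tolerate a missing trailing depot marker
        routes.append(current)
    return routes
```

Applied to the first two routes of Fig. 2, the sequence [0, 1, 2, 3, 0, 6, 5, 4, 0] decodes into the routes [1, 2, 3] and [6, 5, 4].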

Theoretically, routing problems have been proven to be NP-hard, with only explicit enumeration approaches known to solve them optimally. Large scale problems are, however, generally computationally intractable due to the poor scalability of most enumeration methods. From a survey of the literature, metaheuristics, heuristics and evolutionary computation have played important roles in algorithms capable of providing good solutions within tractable computational time. For the CVRP, Cordeau et al. [56] considered a unified tabu search algorithm (UTSA) for solving the VRP. Prins [57] presented an effective evolutionary algorithm with local search for the CVRP, while Reimann et al. [58] proposed a D-ants algorithm for the CVRP, which equipped an ant colony algorithm with an individual learning procedure. Recently, Lin et al. [59] took advantage of both simulated annealing and tabu search, and proposed a hybrid meta-heuristic algorithm for solving the CVRP. Further, Chen et al. [55] proposed a domain-specific cooperative memetic algorithm for solving the CVRP and achieved results better than or competitive with a number of state-of-the-art memetic algorithms and meta-heuristics to date.

For the CARP, on the other hand, Lacomme et al. [60] presented the basic components that have been embedded into memetic algorithms (MAs) for solving the extended version of the CARP (ECARP). Lacomme’s MA (LMA) was demonstrated to outperform all known heuristics on three sets of benchmarks. Recently, Mei et al. [61] extended Lacomme’s work by introducing two new local search methods, which successfully improved the solution quality of LMA. In a separate study, a memetic algorithm with extended neighborhood search was also proposed for the CARP in [29]. Further, Liang et al. proposed a formal probabilistic memetic algorithm for solving the CARP, which found new best-known solutions on 9 of the benchmark problems [62].

In what follows, we present the formulations of the proposed four culture-inspired operators in the context of routing problems, for learning and transferring useful traits from past problem-solving experiences as knowledge memes that can be used to enhance future routing search processes.

5 Proposed formulations and algorithms: CVRPs & CARPs

In this section, we present the proposed formulations and algorithmic implementations of the transfer learning culture-inspired operators, namely learning, selection, variation and imitation, for faster evolutionary optimization of related problems in the two domains described in Sect. 4, namely CVRPs and CARPs.

Algorithm 1

The pseudo-code and detailed workflow for the realization of the proposed ‘fast evolutionary optimization by transfer learning from past experiences’ on the CVRP and CARP problem domains are outlined in Algorithm 1. For a given new routing problem instance \(\mathbf {p}^j_{new}\) (with data representation \(\mathbf {X}^j_{new}\)) posed to evolutionary solver \(\textit{ES}\), the mechanisms of the selection operator kick in to select the high quality knowledge memes \(\mathbf {M}\)s to activate, if the knowledge pool \(\mathbf {SoM}\) is not empty. Variation, which takes inspiration from the human ability to generalize from knowledge learned in previous problem-solving experiences, then operates on the activated knowledge memes \(\mathbf {M}\)s to arrive at the generalized knowledge \(\mathbf {M_t}\). Subsequently, imitation proceeds to positively bias the search of the evolutionary optimization solver \(\textit{ES}\) on \(\mathbf {X}^j_{new}\) using the generalized knowledge \(\mathbf {M_t}\), followed by clustering and pairwise distance sorting (\(\textit{PDS}\)) to generate the biased task assignment and service order solutions that enhance the search performance on \(\mathbf {p}^j_{new}\). When the search on \(\mathbf {p}^{j}_{new}\) completes, the problem instance \(\mathbf {p}^{j}_{new}\) together with the attained optimized solution \(\mathbf {s}^{j*}_{new}\) of \(\textit{ES}\), i.e., \(\mathbf {X}^j_{new}\) and \(\mathbf {Y}^j_{new}\), which denote the matrix representations of \(\mathbf {p}^{j}_{new}\) and \(\mathbf {s}^{j*}_{new}\) (see Fig. 4), respectively, then undergo the learning operation so as to update the knowledge pool \(\mathbf {SoM}\).
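The overall control flow of this workflow can be sketched at a high level as follows. This is a structural outline only, not the paper's Algorithm 1: all operator implementations (`es_solve`, `learn`, `select`, `variation`, `imitate`) are placeholders to be supplied per the formulations of Sect. 5.

```python
def solve_with_transfer(problems, es_solve, learn, select, variation, imitate):
    """High-level sketch of the transfer-learning search loop.

    For each incoming problem: if the knowledge pool SoM is non-empty,
    select memes, generalize them via variation, and imitate them to
    bias the initial population; then run the evolutionary solver and
    learn a new meme from the attained optimized solution.
    """
    som = []                            # knowledge pool SoM (society of mind)
    solutions = []
    for p in problems:
        if som:                         # selection: pick high-quality memes
            memes = select(som, p)
            m_t = variation(memes)      # variation: generalized knowledge M_t
            init_pop = imitate(m_t, p)  # imitation: biased initial population
        else:
            init_pop = None             # first problem: search from scratch
        s_star = es_solve(p, init_pop)  # conventional evolutionary search
        som.append(learn(p, s_star))    # learning: update SoM offline
        solutions.append(s_star)
    return solutions
```

Note that learning happens after each search completes, so the pool grows incrementally with the number of problems solved, matching the offline learning described above.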

5.1 Learning operator

This subsection describes the learning of knowledge memes, as building blocks of useful traits, from given routing problem instances \(\mathbf {p}\) and the corresponding optimized solutions \(\mathbf {s^*}\) (i.e., Line 25 in Algorithm 1). To begin, we refer to Fig. 2, which shall serve as the example routing problem instance used in our illustrations. Figure 3, on the other hand, illustrates the learning of knowledge \(\mathbf {M}\) from an optimized routing problem instance and the subsequent use of this knowledge to bias the task assignment and ordering of a routing problem. Specifically, Fig. 3a depicts the distribution of the tasks in the example routing problem of Fig. 2 that need to be serviced. Figure 3b then denotes the optimized routing solution of the \(\textit{ES}\) evolutionary solver on the problem of Fig. 2 or 3a. The dashed circles in Fig. 3b denote the task assignments of the individual vehicles and the arrows indicate the task service orders, as optimized by \(\textit{ES}\).

Fig. 3

Learning of knowledge \(\mathbf {M}\) which shall serve as the instruction for biasing the tasks assignment and ordering of a routing problem

Here a knowledge meme \(\mathbf {M}\) is defined in the form of a distance matrix that maximally aligns the given original distribution and service orders of tasks to the optimized routing solution \(\mathbf {s}^*\) attained by solver \(\textit{ES}\). Using the example routing problem instance in Fig. 2, we formulate the knowledge meme as a matrix \(\mathbf {M}\) that transforms or maps the task distribution depicted in Fig. 3a to the desired task distribution of \(\mathbf {s}^*\) while preserving the corresponding task service orders, as depicted in Fig. 3b. In this manner, whenever a new routing problem instance is encountered, suitable knowledge memes learned from previously optimized problem instances are deployed to realign the task distribution and service orders constructively. For instance, Fig. 3c showcases the desirable scaled or transformed task distribution of Fig. 3a when the appropriate knowledge meme \(\mathbf {M}\) is put to work. In particular, it can be observed in Fig. 3c that we seek knowledge memes that relocate tasks serviced by a common vehicle closer to one another (as desired by the optimized solution \(\mathbf {s}^*\) shown in Fig. 3b), while mapping tasks serviced by different vehicles further apart. Further, to match the service orders of each vehicle to those of the optimized solution \(\mathbf {s}^*\), the task distribution is adapted according to the pairwise distances sorted in ascending order (e.g., the distance between \(v_1\) and \(v_3\) is the largest among \(v_1, v_2\) and \(v_3\), while the distance between \(v_{10}\) and \(v_9\) is smaller than that between \(v_{10}\) and \(v_8\)).

In what follows, the proposed mathematical definition of a knowledge meme \(\mathbf {M}\) for the transformation of task distributions is detailed. In particular, let \(\mathbf {V} = \{\mathbf {v}_i\ | i=1, \ldots , n\}\) denote the tasks of a problem instance to be assigned, where n is the number of tasks. The distance between any two tasks \(\mathbf {v}_i = (v_{i1},\ldots , v_{ip})^T\) and \(\mathbf {v}_j = (v_{j1},\ldots , v_{jp})^T\) in the p-dimensional space \(\mathbb {R}^p\) is then given by:

$$\begin{aligned} d_M(\mathbf {v}_{i}, \mathbf {v}_{j}) = ||\mathbf {v}_{i} - \mathbf {v}_{j}||_M = \sqrt{(\mathbf {v}_{i} - \mathbf {v}_{j})^T\mathbf {M}(\mathbf {v}_{i} - \mathbf {v}_{j})} \end{aligned}$$

where T denotes the transpose of a matrix or vector. Since \(\mathbf {M}\) is positive semidefinite, it can be decomposed as \(\mathbf {M} = \mathbf {L}\mathbf {L}^T\) by means of singular value decomposition (SVD). Substituting this decomposition into \(d_M(\mathbf {v}_{i}, \mathbf {v}_{j})\), we arrive at:

$$\begin{aligned} d_M(\mathbf {v}_{i}, \mathbf {v}_{j}) = \sqrt{(\mathbf {L}^T\mathbf {v}_{i} - \mathbf {L}^T\mathbf {v}_{j})^T(\mathbf {L}^T\mathbf {v}_{i} - \mathbf {L}^T\mathbf {v}_{j})} \end{aligned}$$
(4)

From Eq. 4, it is worth noting that the distances among the tasks are scaled by the meme \(\mathbf {M}\). We thus seek a knowledge meme \(\mathbf {M}\) that realigns the task distribution and service orders of a given new problem instance to ones that bear greater similarity to the optimized solution \(\mathbf {s^*}\).
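As a minimal numerical illustration of Eq. 4 (with assumed toy values for \(\mathbf {M}\) and the tasks), the scaled distance can be computed either directly from \(\mathbf {M}\) or, equivalently, as the Euclidean distance between the transformed points \(\mathbf {L}^T\mathbf {v}\):

```python
import numpy as np

# Scaled distance d_M(v_i, v_j) = sqrt((v_i - v_j)^T M (v_i - v_j)) for a
# positive semidefinite meme M; the values below are illustrative only.

def meme_distance(vi, vj, M):
    d = vi - vj
    return float(np.sqrt(d @ M @ d))

M = np.array([[2.0, 0.0], [0.0, 0.5]])   # example PSD meme
vi = np.array([1.0, 0.0])
vj = np.array([0.0, 0.0])

# With M = L L^T (here via eigendecomposition, valid since M is PSD), the
# same distance is the Euclidean distance between the transformed points.
w, U = np.linalg.eigh(M)
L = U * np.sqrt(w)
d_direct = meme_distance(vi, vj, M)
d_transformed = float(np.linalg.norm(L.T @ vi - L.T @ vj))
```

The two computations agree, which is precisely the equivalence between Eq. 4 and the preceding definition of \(d_M\).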

Next, the proposed mathematical formulation for the learning of a knowledge meme \(\mathbf {M}\) is given. The schemata representations of a problem instance (\(\mathbf {p}\)), optimized solution (\(\mathbf {s^*}\)) and distance constraint set \(\mathcal {N}\) are first defined. In particular, the data representation of the example problem instance of Fig. 2 is depicted in Fig. 4, where \(v_{11}\), \(v_{12}\), etc., denote the feature representations of each task, and D\((\cdot )\) indicates the Euclidean distance metric. Further, if task \(\mathbf {v}_i\) and task \(\mathbf {v}_j\) are served by the same vehicle, \(\mathbf {Y}(i,j) = 1\); otherwise, \(\mathbf {Y}(i,j) = -1\). The distance constraint set \(\mathcal {N}\) contains the service order information of the tasks and is derived from the optimized solution \(\mathbf {s^*}\). With respect to the example in Fig. 2, since task \(\mathbf {v}_3\) is served after \(\mathbf {v}_2\) from \(\mathbf {v}_1\), the constraint takes the form of D\((\mathbf {v}_1, \mathbf {v}_3) >\) D\((\mathbf {v}_1, \mathbf {v}_2)\), as depicted in Fig. 4.

Fig. 4

Data representations of a problem instance \(\mathbf {p=X}\), the corresponding optimized solution \(\mathbf {s^*=Y}\) and distance constraints set \(\mathcal {N}\)

To derive the knowledge meme \(\mathbf {M}\) of a given CVRP or CARP problem instance, denoted by (\(\mathbf {p}\), \(\mathbf {s^*}\)), we formulate the learning task as a maximization of the statistical dependencyFootnote 5 between \(\mathbf {X}\) and \(\mathbf {Y}\) with distance constraints as follows:

$$\begin{aligned}&\max _{\mathbf {K}} tr( \mathbf {H} \mathbf {K} \mathbf {H} \mathbf {Y})\nonumber \\&\quad \text {s.t.} \mathbf {K} = \mathbf {X}^T*\mathbf {M}*\mathbf {X}\\&\qquad \quad D_{ij}>D_{iq}, \forall (i,j,q)\in \mathcal {N},\quad \mathbf {K} \succeq 0 \nonumber \end{aligned}$$
(5)

where \(tr(\cdot )\) denotes the trace operation of a matrix. \(\mathbf {X}\), \(\mathbf {Y}\) are the matrix representations of a CARP or CVRP instance \(\mathbf {p}\) and the corresponding problem solution \(\mathbf {s^*}\), respectively.

Further, \(\mathbf {H} = \mathbf {I} - \frac{1}{n}\mathbf {1}\mathbf {1}'\) centers the data and the labels in the feature space, \(\mathbf {I}\) denotes the identity matrix, and n equals the number of tasks. \(D_{ij}>D_{iq}\) is then the constraint imposing that a vehicle serves task q before task j, upon serving task i.

Let \(\mathbf {T}_{ij}\) denote an \(n\times n\) matrix whose only non-zero entries are \(T_{ii} = T_{jj} = 1\) and \(T_{ij} = T_{ji} = -1\). The distance constraints \(D_{ij}>D_{iq}\) in Eq. 5 are then reformulated as \(tr(\mathbf {KT}_{ij})>tr(\mathbf {KT}_{iq})\). Further, slack variables \(\xi _{ijq}\) are introduced to measure the violations of the distance constraints and penalize the corresponding square loss. Consequently, by substituting the constraints into Eq. 5, we arrive at:

$$\begin{aligned} \min _{\mathbf {M}, \xi }&-tr( \mathbf {X} \mathbf {H} \mathbf {Y}\mathbf {H}\mathbf {X}^T \mathbf {M} ) + \frac{C}{2}\sum \xi ^2_{ijq} \nonumber \\ \text {s.t.}&\mathbf {M} \succeq 0 \nonumber \\&tr(\mathbf {X}^T\mathbf {MXT}_{ij})>tr(\mathbf {X}^T\mathbf {MXT}_{iq}) - \xi _{ijq}, \nonumber \\&\forall (i,j,q)\in \mathcal {N} \end{aligned}$$
(6)

where C balances between the two parts of the criterion. The first constraint enforces the learnt knowledge denoted by matrix \(\mathbf {M}\) to be positive semi-definite, while the second constraint imposes the scaled distances among the tasks to align well with the desired service orders of the optimized solution \(\mathbf {s^*}\) (i.e., \(\mathbf {Y}\)).
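The reformulation of the distance constraints via the indicator matrices \(\mathbf {T}_{ij}\) rests on the identity \(tr(\mathbf {K}\mathbf {T}_{ij}) = K_{ii} + K_{jj} - 2K_{ij}\), i.e., the squared scaled distance between tasks i and j. This can be checked numerically; the sketch below uses assumed random data and 0-based indices.

```python
import numpy as np

# Numerical check of tr(K T_ij) = K_ii + K_jj - 2 K_ij = d_M(v_i, v_j)^2,
# where K = X^T M X. Toy data; indices are 0-based here.

rng = np.random.default_rng(0)
n, p = 5, 2
X = rng.standard_normal((p, n))          # tasks as columns, as in Fig. 4
A = rng.standard_normal((p, p))
M = A @ A.T                              # random PSD meme

K = X.T @ M @ X

def T(i, j, n):
    # indicator matrix: T_ii = T_jj = 1, T_ij = T_ji = -1, zeros elsewhere
    Tm = np.zeros((n, n))
    Tm[i, i] = Tm[j, j] = 1.0
    Tm[i, j] = Tm[j, i] = -1.0
    return Tm

i, j = 1, 3
lhs = float(np.trace(K @ T(i, j, n)))    # tr(K T_ij)
d = X[:, i] - X[:, j]
rhs = float(d @ M @ d)                   # squared scaled distance d_M^2
```

Since each constraint becomes linear in \(\mathbf {K}\) (and hence in \(\mathbf {M}\)), the reformulated problem of Eq. 6 remains tractable.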

To solve the learning problem in Eq. 6, we first derive the minimax optimization problem by introducing dual variables \(\alpha \) for the inequality constraints based on Lagrangian theory.

$$\begin{aligned} Lr= & {} tr(-\mathbf {H}\mathbf {X}^T\mathbf {MXHY}) + \frac{C}{2}\sum \xi ^2_{ijq} \nonumber \\&- \sum \alpha _{ijq}(tr(\mathbf {X}^T\mathbf {MXT}_{ij}) - tr(\mathbf {X}^T\mathbf {MXT}_{iq})+ \,\xi _{ijq})\nonumber \\ \end{aligned}$$
(7)

Setting \(\frac{\partial Lr}{\partial \xi _{ijq}} = 0\), we have:

$$\begin{aligned} C\xi _{ijq} - \alpha _{ijq} = 0 \Longrightarrow \xi _{ijq} = \frac{1}{C}\alpha _{ijq} \end{aligned}$$
(8)

By substituting Eq. 8 into Eq. 7, we reformulate the learning problem in Eq. 6 as a minimax optimization problem, which is given by:

$$\begin{aligned} \max _{\alpha }\min _\mathbf {M}&tr\left[ \left( -\mathbf {XHYHX}^T - \sum \alpha _{ijq}\mathbf {XT}_{ij}\mathbf {X}^T\right. \right. \nonumber \\&\left. \left. + \sum \alpha _{ijq}\mathbf {XT}_{iq}\mathbf {X}^T \right) \mathbf {M}\right] - \frac{1}{2C}\sum \alpha ^2_{ijq} \nonumber \\ \text {s.t.}&\mathbf {M} \succeq 0 \end{aligned}$$
(9)

By setting

$$\begin{aligned} \mathbf {A} = \mathbf {XHYHX}^T + \sum \alpha _{ijq}\mathbf {XT}_{ij}\mathbf {X}^T - \sum \alpha _{ijq}\mathbf {XT}_{iq}\mathbf {X}^T\nonumber \\ \end{aligned}$$
(10)

and

$$\begin{aligned} \varDelta J^t_{ijq} = tr[(\mathbf {XT}_{iq}\mathbf {X}^T-\mathbf {XT}_{ij}\mathbf {X}^T)\mathbf {M}] - \frac{1}{C}\alpha _{ijq} \end{aligned}$$
(11)

Upon solving the above formulations and derivations, we arrive at Eqs. 10 and 11, which give the quantities used in the updates of \(\mathbf {M}\) and \(\alpha \), respectively. As is common practice in machine learning, the parameter C of Eq. 11 is configured by means of cross-validation, and the learning problem of Eq. 9 is solved using readily available methods [64]. As the update of \(\mathbf {M}\) is an SVD of \(\mathbf {A}\), its complexity is \(O(p^3)\), where p is the dimension. The complexity of calculating \(\mathbf {A}\) is \(O(n^2p+np^2+mp^2)\), where n is the number of vertices and m is the number of constraints in \(\mathcal {N}\). Further, the complexity of updating each \(\alpha _{ijq}\) is \(O(p^2r)\), where r is the rank of \(\mathbf {M}\); thus the complexity of updating all \(\alpha \) is \(O(mp^2r)\).
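The \(O(p^3)\) update of \(\mathbf {M}\) mentioned above amounts, under the common interpretation of such minimax solvers, to projecting \(\mathbf {A}\) onto the positive semidefinite cone via an eigendecomposition (this reading is an assumption; the full solver of Eq. 9 from [64] is not reproduced here). A minimal sketch of just this projection step:

```python
import numpy as np

# PSD projection: symmetrize, eigendecompose, and drop negative
# eigenvalues. This is only the projection step used when updating M,
# not the complete minimax solver of Eq. 9.

def psd_project(A):
    """Project a symmetric matrix onto the positive semidefinite cone."""
    A = (A + A.T) / 2.0                 # symmetrize for numerical safety
    w, U = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)           # zero out negative eigenvalues
    return (U * w) @ U.T

A = np.array([[1.0, 2.0], [2.0, -3.0]])  # illustrative indefinite matrix
M = psd_project(A)
```

The eigendecomposition dominates the cost at \(O(p^3)\), consistent with the complexity stated above.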

5.2 Selection operator

Since different knowledge memes introduce unique biases into the evolutionary search, inappropriately chosen knowledge, and hence biases, can negatively impair the evolutionary search. To facilitate a positive transfer of knowledge that leads to enhanced evolutionary search, the selection operator (i.e., Line 5 in Algorithm 1) is designed to select and replicate high quality knowledge memes that share common characteristics with the given new problem instance of interest. In particular, for a given set of z unique \(\mathbf {M}\)s in \(\mathbf {SoM}\), i.e., \(\mathbf {SoM} = \{\mathbf {M}_1, \mathbf {M}_2, \ldots , \mathbf {M}_z\}\) that form the knowledge pool, the selection operator is designed to assign higher weights \(\mu _i\) to knowledge memes that would induce positive biases. Further, as we consider the learning and selection of knowledge memes from problems of a common domain, positive correlation among the problems is a plausible assumption.Footnote 6 If the unseen problem instance is very different from the past problems solved (e.g., the customer distribution of the new unseen problem differs greatly from previously solved problems in the CVRP and CARP), the selection operator shall not deploy any inappropriate knowledge memes, since no positive knowledge exists in the past experiences. In this case, the original state-of-the-art optimization solver shall operate as per normal.

Here, the knowledge meme coefficient vector \(\varvec{\mu }\) is derived based on the maximum mean discrepancy criterion.Footnote 7

$$\begin{aligned}&\max _{\mu ,\mathbf {Y}} tr( \mathbf {H} \mathbf {X}^T\mathbf {M}_t\mathbf {X} \mathbf {H} \mathbf {Y}) + \sum _{i=1}^z (\mu _i)^2 Sim_i\nonumber \\&\quad \text {s.t.} \mathbf {M}_t = \sum _{i=1}^z \mu _i \mathbf {M}_i,\quad \mathbf {M}_i \succeq 0\\&\qquad \qquad \mu _i \ge 0,\quad \sum _{i=1}^z \mu _i = 1\nonumber \end{aligned}$$
(12)

In Eq. 12, the first term serves to maximize the statistical dependence between the input matrix \(\mathbf {X}\) and the output label \(\mathbf {Y}\) of the clusters of tasks. The second term measures the similarity between previously solved problem instances and the given new problem of interest, with \(Sim_i\) defining the similarity measure between two given problem instances. In vehicle routing, task distribution and vehicle capacity are two key features that define a problem. Hence the similarity measure is formulated here as \(Sim_i = -(\beta *MMD_i + (1-\beta )* DVC_i)\), where \(MMD(D_s, D_t) = ||\frac{1}{n_s}\sum _{i=1}^{n_s}\phi (x_i^s) - \frac{1}{n_t}\sum _{i=1}^{n_t}\phi (x_i^t)||\), with \(\phi (\mathbf {x}) = \mathbf {x}\), denotes the maximum mean discrepancy between the distributions of two given instances, measured as the distance between their corresponding means. \(D_s\) and \(D_t\) are the two given routing problem instances, with \(x_i^s\) and \(x_i^t\) denoting the location information of a customer in \(D_s\) and \(D_t\), respectively, and \(n_s\) and \(n_t\) the respective numbers of customers. \(DVC_i\) denotes the discrepancy in vehicle capacity between any two problem instances; the vehicle capacity is available as part of the problem definition. From domain knowledge, the task distribution (locations of the nodes to be serviced) carries a higher weightage than the vehicle capacity information, which implies \(\beta > 0.5\). In this work, \(\beta \) is configured empirically as 0.8 to favour task distribution information over vehicle capacity information.
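The similarity measure can be sketched as follows. The customer coordinates and capacity values are illustrative assumptions; with \(\phi(\mathbf{x}) = \mathbf{x}\), the MMD term reduces to the distance between the mean customer locations of the two instances.

```python
import numpy as np

# Sim_i = -(beta*MMD_i + (1-beta)*DVC_i), with the linear feature map
# phi(x) = x, so MMD is the distance between mean customer locations.
# beta = 0.8, as configured in the text; all data below are assumed.

def mmd_linear(Xs, Xt):
    # ||mean(Xs) - mean(Xt)|| with identity feature map phi(x) = x
    return float(np.linalg.norm(Xs.mean(axis=0) - Xt.mean(axis=0)))

def similarity(Xs, Xt, cap_s, cap_t, beta=0.8):
    dvc = abs(cap_s - cap_t)            # discrepancy in vehicle capacity
    return -(beta * mmd_linear(Xs, Xt) + (1 - beta) * dvc)

Ds = np.array([[0.0, 0.0], [2.0, 0.0]])   # customer coordinates, instance s
Dt = np.array([[1.0, 1.0], [1.0, -1.0]])  # customer coordinates, instance t
sim = similarity(Ds, Dt, cap_s=100, cap_t=100)
```

Here the two instances share the same mean location and capacity, so the similarity attains its maximum of zero; any discrepancy in either feature drives it negative.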

In Eq. 12, two unknown variables exist (i.e., \(\mu \) and \(\mathbf {Y}\)). \(\mathbf {Y}\) is obtained from the results of task assignment (i.e., if task \(\mathbf {v}_i\) and task \(\mathbf {v}_j\) are served by the same vehicle, \(\mathbf {Y}(i,j) = 1\); otherwise, \(\mathbf {Y}(i,j) = -1\); the task assignment itself is obtained by clustering on the \(\mathbf {M}\)-transformed tasks \(\mathbf {X}\)). With \(\mathbf {Y}\) fixed, Eq. 12 becomes a quadratic programming problem in \(\mu \). To solve the optimization problem of Eq. 12, we first perform clustering (e.g., K-Means [65]) on the input \(\mathbf {X}\) directly to obtain the label matrix \(\mathbf {Y}\). Keeping \(\mathbf {Y}\) fixed, we obtain \(\mu \) by maximizing Eq. 12 via a quadratic programming solver. Next, keeping the chosen \(\mathbf {M}\) fixed, clustering is performed on the new \(\mathbf {X}'\) (i.e., transformed by the selected \(\mathbf {M}\): \(\mathbf {X}' = \mathbf {L}^T\mathbf {X}\), where \(\mathbf {L}\) is obtained by SVD on \(\mathbf {M}\)) to obtain the label matrix \(\mathbf {Y}\).

5.3 Variation operator

Further, to introduce innovations into the selected knowledge memes during subsequent reuse, the variation operator (i.e., Line 6 in Algorithm 1) kicks in. In the present context, we take inspiration from the human ability to generalize from past problem solving experiences; hence variation is realized here in the form of generalization. However, it is worth noting that other alternative forms of probabilistic schemes of variation may also be considered, since uncertainty can generate growth and variation of the knowledge that we have of the world [66], hence leading to higher adaptive capability for solving complex and non-trivial problems.

Here, the variation is derived as a generalization of the selected knowledge memes:

$$\begin{aligned} \mathbf {M}_t = \sum _{i=1}^z \mu _i \mathbf {M}_i, \left( \sum _{i=1}^z \mu _i = 1, \mu _i \in [0, 1]\right) \end{aligned}$$
(13)

where \(\mathbf {M}_i\) denotes the meme knowledge stored in the meme pool, z is the total number of memes stored, and \(\mu _i\) is obtained by the selection operator discussed in Sect. 5.2.
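Equation 13 can be sketched directly; the memes and weights below are assumed toy values. Since \(\mu \) lies on the simplex, \(\mathbf {M}_t\) is a convex combination of PSD matrices and therefore remains PSD, i.e., a valid meme.

```python
import numpy as np

# Variation (Eq. 13): M_t = sum_i mu_i * M_i with mu on the simplex.
# M1, M2 and the weights are illustrative toy values.

def generalize(memes, mu):
    mu = np.asarray(mu, dtype=float)
    # weights must be non-negative and sum to one (simplex constraint)
    assert np.all(mu >= 0) and abs(mu.sum() - 1.0) < 1e-9
    return sum(m * M for m, M in zip(mu, memes))

M1 = np.array([[2.0, 0.0], [0.0, 1.0]])
M2 = np.array([[1.0, 0.5], [0.5, 1.0]])
Mt = generalize([M1, M2], [0.75, 0.25])
```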

Fig. 5

An illustration of knowledge imitation in generating positively biased CVRP or CARP solutions

5.4 Imitation operator

In CVRP and CARP, the search for the optimal solution is typically conducted in two separate phases. The first phase involves the assignment of the tasks that require service to the appropriate vehicles. The second phase then serves to find the optimal service order, for each vehicle, of the tasks assigned in phase 1.

In what follows, the imitation of learned knowledge memes to bias the initial population of solutions in subsequent \(\textit{ES}\) searches is described. For each solution \(\mathbf {s}_g\) in the EA population, the knowledge \(\mathbf {M}_t\) generalized from past experiences is imitated for the purpose of generating positively biased solutions (see Line 10 in Algorithm 1) in the evolutionary search, by transforming or remapping the original task distribution (i.e., both task assignments and task service orders), as denoted by \(\mathbf {X}^{j}_{new}\), into the new task distribution \(\mathbf {X}^{j'}_{new}\) given by:

$$\begin{aligned} \mathbf {X}^{j'}_{new} = \mathbf {L}^T\mathbf {X}^{j}_{new} \end{aligned}$$
(14)

where \(\mathbf {L}\) is derived by singular value decomposition of \(\mathbf {M}_t\). An illustrative example is depicted in Fig. 5, where Fig. 5a denotes the original task distribution \(\mathbf {X}^{j}_{new}\), while Fig. 5b is the resultant knowledge-biased or transformed task distribution \(\mathbf {X}^{j'}_{new}\) obtained using \(\mathbf {M}_t\).

In phase 1, K-Means clustering with random initializations is conducted on the knowledge biased tasks distribution \(\mathbf {X}^{j'}_{new}\) to derive the tasks assignments of the vehicles as depicted in Fig. 5c, where the dashed circles denote the task assignments of the individual vehicles, i.e., denoting the tasks that shall be serviced by a common vehicle.

In phase 2, the service orders of each vehicle are subsequently derived by sorting the pairwise distances among the tasks in ascending order. The two tasks with the largest distance denote the first and last tasks to be serviced. Taking the first task as reference, the service order of the remaining tasks is defined according to the sorted order. Referring to Fig. 5d as an example, where the arrows indicate the service orders of the tasks, the distance between \(\mathbf {v}_{10}\) and \(\mathbf {v}_7\) is the largest among \(\mathbf {v}_{10}\), \(\mathbf {v}_9\), \(\mathbf {v}_8\) and \(\mathbf {v}_7\). Assigning \(\mathbf {v}_{10}\) as the reference task to be served, \(\mathbf {v}_9\) is then the next task to be serviced, since the distance between \(\mathbf {v}_{10}\) and \(\mathbf {v}_9\) is smaller than that between \(\mathbf {v}_{10}\) and \(\mathbf {v}_8\) or \(\mathbf {v}_7\).
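The phase-2 ordering rule can be sketched as follows, for the tasks of a single vehicle. This is a simplified, assumed reading of the pairwise-distance sorting described above; the four collinear points stand in for tasks such as \(\mathbf {v}_7, \ldots, \mathbf {v}_{10}\).

```python
import numpy as np

# Sketch of the phase-2 service-order rule: the two farthest tasks are
# served first and last; remaining tasks are ordered by increasing
# distance from the first (reference) task. Assumed toy coordinates.

def service_order(tasks):
    """Return indices of `tasks` in service order."""
    n = len(tasks)
    d = np.linalg.norm(tasks[:, None, :] - tasks[None, :, :], axis=-1)
    first, last = np.unravel_index(np.argmax(d), d.shape)  # farthest pair
    rest = [k for k in range(n) if k not in (first, last)]
    rest.sort(key=lambda k: d[first, k])   # ascending distance from ref
    return [first] + rest + [last]

# four tasks on a line: the farthest pair is (index 0, index 3)
tasks = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
order = service_order(tasks)
```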

Table 1 Criteria for measuring performance
Table 2 Properties of the “Augerat” CVRP benchmark set

6 Experimental study

To evaluate the performance of the proposed transfer learning culture-inspired operators for fast evolutionary optimization of related problems, comprehensive empirical studies with regard to search speed and solution quality are conducted in this section on two challenging NP-hard routing problems and a real-world application. In particular, we first present studies on the capacitated vehicle routing problem (CVRP) domain and then on the capacitated arc routing problem (CARP) domain. These consist of problem instances with diverse properties in terms of vertex size, topological structure (task distribution) and vehicle capacity, which cannot be readily handled using existing case based reasoning approaches [38, 39], as previously discussed in Sect. 2.1. In the present study, two recently proposed evolutionary algorithms for solving CVRPs and CARPs, labeled in their respective published works as CAMA [67] and ILMA [61], are considered as the baseline conventional evolutionary solvers for the respective domains. Further, several criteria defined to measure the search performance are listed in Table 1. Among these criteria, Number of Fitness Evaluation is used to measure the efficiency of the algorithms, while Ave.Cost and B.Cost serve as the criteria for measuring the solution qualities of the algorithms.

Table 3 Properties of the “CE” and “Christofides” CVRP benchmark sets

6.1 Capacitated vehicle routing problem domain

6.1.1 Experimental configuration

All three commonly used CVRP benchmark sets with diverse properties (e.g., number of vertices, number of vehicles, etc.) are investigated in the present empirical study, namely “AUGERAT”, “CE” and “CHRISTOFIDES”. The detailed properties (e.g., number of vertices, lower bound, etc.) of the CVRP instances considered are summarized in Tables 2 and 3, where V denotes the number of vertices that need to be serviced, \(C_v\) gives the capacity of the vehicles in each instance, and LB describes the lower bound of each problem instance. In CVRP, each task or vertex has corresponding coordinates (i.e., in 2-D space) and a demand. Using the coordinates of the vertices, the task assignments of the vehicles are generated based on K-Means clustering. A CVRP instance is thus represented by the input matrix \(\mathbf {X}\) (see Fig. 4), which is composed of the coordinate features of all tasks in the problem instance. The desired vehicle assignment of each task is then given by the \(\textit{ES}\) optimized solution, \(\mathbf {Y}\), of the respective CVRP instance.

Besides the proposed knowledge meme biased approach, two other commonly used initialization procedures for generating the population of solution individuals in the state-of-the-art baseline CAMA are investigated here to verify the efficiency and effectiveness of the proposed evolutionary search across problems. The first is the simple random approach for generating the initial population, labeled here as CAMA-R. The second is the informed heuristic population initialization procedure proposed in the state-of-the-art baseline CAMA [67] method, in which the initial population is a fusion of solutions generated by the Backward Sweep, Saving and Forward Sweep heuristics and random initialization.

The CAMA that employs our proposed transfer learning culture-inspired operators is then denoted as CAMA-M, where the initial population of individuals in CAMA is now generated based on the high quality knowledge memes that have been accumulated from past CVRP solving experiences via the cultural evolutionary mechanisms of learning, selection, variation and imitation. Note that if no prior knowledge has been learned so far, CAMA-M shall behave exactly like the baseline CAMA.

Last but not least, the operator and parameter settings of CAMA-R, CAMA and CAMA-M are kept the same as those of [67] for the purpose of fair comparison. For CAMA-M, the MMD of Eq. 12 is augmented with the demand of each task as one of the problem features.

6.1.2 Results and discussions

To gain a better understanding of the performance of the proposed new memetic computation paradigm, we present, analyze and discuss the results obtained, comparing them against recently proposed methods based on the criteria of search efficiency and solution quality.

Search efficiency: convergence trends and speedup To assess the efficiency of the proposed approach, representative search convergence traces of CAMA, CAMA-R and CAMA-M on the 3 different CVRP benchmark sets are presented in Fig. 6.Footnote 8 The Y-axis of the figures denotes the actual travel cost obtained, while the X-axis gives the respective computational effort incurred in terms of the number of fitness evaluation calls made so far. From these figures, it can be observed that CAMA-M converges rapidly to near the lower bound solution at a very early stage of the search as compared to both CAMA-R and CAMA across all the CVRP instances. This is because of the high quality solutions introduced into the initial population of CAMA-M by the positive memes learned from past search experiences. For example, on instance “B-n41-k6” (Fig. 6b), CAMA-M takes only approximately 250 fitness evaluations to converge to near the lower bound solution, while CAMA-R and CAMA incur more than 1000 fitness evaluations to do so. On the larger problem instances, such as “c199” (Fig. 6f), the savings in fitness evaluations are more significant, where CAMA-M is observed to bring about at least 1000 fitness evaluations of cost savings to arrive at the solution qualities attained by CAMA and CAMA-R.

Fig. 6

Averaged search convergence traces (across 30 independent runs) of CAMA, CAMA-R, and CAMA-M on representative CVRP “AUGERAT”, “CE”, and “CHRISTOFIDES” benchmark sets. Y-axis: Travel cost, X-axis: Number of Fitness Evaluation. Note that CAMA-M is observed to search significantly faster in converging to near the lower bound solution on the respective CVRPs than the other counterparts

To provide more in-depth insights into the enhanced efficiency of CAMA-M, in Table 4 we also tabulate the amount of speedup by CAMA-M over the baseline CAMA in arriving at different stages of the search, defined by fitness levels, for the representative problem instances.Footnote 9 Here speedup is defined by \(\frac{CAMA^{i}_{Fitness~Evaluation Calls}}{CAMA-M^{i}_{Fitness~Evaluation Calls}}\), where \(i = 1,\ldots ,N\), with N denoting the number of fitness bins considered and \(A^{i}_{Fitness~Evaluation Calls}\) denoting the number of fitness evaluations used by algorithm A to arrive at the fitness attained in bin i. In the results reported in Table 4, \(N = 8\) equal-width fitness bins are used. For example, bins 1 and 8 of problem instance c100b in Table 4 (i.e., see the last row) show that the speedups of CAMA-M over CAMA at the start of the search and upon search convergence are 2924.24 and 9.44 times, respectively. Note that speedups are observed throughout the entire search on the representative CVRP instances, as observed in the table.

Table 4 Speedup by CAMA-M over CAMA in the different stages of the search on representative CVRP instances has been observed
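The per-bin speedup measure can be sketched as follows, using assumed toy convergence traces rather than actual CAMA data; `evals_to_reach` returns the number of fitness evaluations an algorithm needed to first reach a given cost level.

```python
# Speedup per fitness bin, as defined in the text: for each of N
# equal-width fitness levels, speedup_i = evals_base(i) / evals_fast(i),
# where evals_A(i) is the number of evaluations algorithm A needed to
# first reach the fitness of bin i. Toy (assumed) convergence traces.

def evals_to_reach(trace, target):
    """Evaluations until the (minimizing) trace first reaches target."""
    for evals, fitness in trace:
        if fitness <= target:
            return evals
    return float("inf")

def speedups(trace_base, trace_fast, n_bins=8):
    fits = [f for _, f in trace_base] + [f for _, f in trace_fast]
    hi, lo = max(fits), min(fits)
    targets = [hi - (hi - lo) * (i + 1) / n_bins for i in range(n_bins)]
    return [evals_to_reach(trace_base, t) / evals_to_reach(trace_fast, t)
            for t in targets]

# (evaluations, best cost so far); purely illustrative numbers
base = [(100, 1000.0), (500, 800.0), (2000, 600.0), (8000, 500.0)]
fast = [(10, 900.0), (50, 700.0), (200, 550.0), (800, 500.0)]
s = speedups(base, fast, n_bins=2)
```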

For the purpose of conciseness, we further summarize in Fig. 7 the \(log_{10}(Speedup)\) of CAMA-M over CAMA at the start of the search on the CVRP instances, in the order that they are solved. It is worth noting that Fig. 7 resembles a learning curve, where the increasing knowledge learned corresponds to an increasing \(log_{10}(Speedup)\) as more problems are solved in each benchmark set. For example, on the “AUGERAT” benchmark set, the \(log_{10}(Speedup)\) is observed to increase from under 2.6 to over 3.4 as instances “A2”, “A3” and “A4” are solved. Overall, a \(log_{10}(Speedup)\) of at least 2.5 has been attained by CAMA-M on all the CVRP instances considered.

Fig. 7

Speedup of CAMA-M over CAMA across all the problem instances of the three CVRP benchmark sets. Y-axis: Speedup, X-axis: Problem instance index of each CVRP in the benchmark set. A learning curve with increasing knowledge learned is observed in each benchmark set, as seen from the increasing \(log_{10}(Speedup)\) as more problems are solved

Solution Quality: To evaluate the solution quality of the proposed approach, Table 5 tabulates the results obtained by the respective algorithms over 30 independent runs. The values in “B.Cost” and “Ave.Cost” denoting superior performance are highlighted in bold font. Further, in order to obtain a statistical comparison, the Wilcoxon rank sum test with 95 % confidence level has been conducted on the experimental results. As discussed in Sect. 3, the knowledge pool of CAMA-M is empty when the first CVRP instance is encountered (e.g., “A-n32-k5” of the “AUGERAT” benchmark set), and thus CAMA-M behaves like the baseline CAMA. As more CVRP instances are encountered, the learning, selection, variation and imitation mechanisms kick in to learn and generalize knowledge that induces positive biases into the evolutionary search on new CVRP instances. It can be observed from Table 5 that CAMA-M converges to solution qualities competitive with those attained by both CAMA-R and CAMA on the first problem instance of each CVRP benchmark (since no knowledge meme has been learned yet), while exhibiting superior performance over CAMA-R and CAMA on subsequent CVRP instances. Thus, beyond showing speedups in search performance at no loss in solution quality, CAMA-M is observed to attain improved solution quality on the CVRPs. In particular, on the “AUGERAT” and “CE” benchmark sets, CAMA-M exhibits superior performance in terms of Ave.Cost on 13 out of 19 CVRP instances. In addition, on the “CHRISTOFIDES” benchmark set, CAMA-M also attains improved solution quality in terms of Ave.Cost on 4 out of 7 CVRP instances.
Since CAMA, CAMA-R and CAMA-M share a common baseline evolutionary solver, i.e., CAMA, and differ only in the population initialization phase, the superior performance of CAMA-M can clearly be attributed to the effectiveness of the proposed transfer learning culture-inspired operators, where the imitation of knowledge memes learned from past problem solving experiences generates biased solutions that lead to enhanced future evolutionary searches.

Table 5 Solution quality of CAMA, CAMA-R, and CAMA-M on “AUGERAT”, “CE” and “CHRISTOFIDES” CVRP benchmark sets

Insights on a knowledge biased CVRP solution In this subsection, to provide deeper insight into the mechanisms by which the proposed approach attains the observed performance efficacy, we analyze samples of the solutions obtained by CAMA, CAMA-R and CAMA-M, as well as the converged optimized solution of CAMA, on problem instance “B-n41-k6”. In Fig. 8, each node denotes a customer that needs to be serviced, and nodes with the same color and shape shall be serviced by a common route or vehicle. Figure 8a, b denote the solutions in the initial populations of the baseline CAMA and CAMA-R, respectively. Figure 8c is the solution in CAMA-M, which has been positively biased using the imitated knowledge learned from past experiences of solving CVRP instances “A-n32-k5”, “A-n54-k7”, “A-n60-k9” and “A-n69-k9”. Further, Fig. 8d gives the converged optimized solution achieved by CAMA. As observed, the task distribution of the solution in the CAMA-M search bears the greatest similarity to that of the converged optimized solution of CAMA, as compared to those of CAMA and CAMA-R. Besides task distributions, portions of the figures are magnified in Fig. 8c, d for the purpose of illustrating the service orders of a solution obtained by CAMA-M relative to the converged optimized solution of the baseline CAMA. The magnified subfigures illustrate high similarities between their respective service orders.

Fig. 8

An illustration of CVRP solutions in the respective EA populations for solving “B-n41-k6” CVRP instance. Each point plotted in the sub-figures denotes a CVRP customer node that needs service. Points or nodes with same symbol are serviced by a common vehicle

This suggests that the service order information of the converged optimized solutions for instances “A-n32-k5”, “A-n54-k7”, “A-n60-k9” and “A-n69-k9” has been successfully learned and preserved by the learning operator of CAMA-M; subsequently, through the cultural evolutionary mechanisms of selection, variation and imitation, the learned knowledge meme is imitated to generate positively biased solutions that are close to the optimal solution, thus bringing about significant speedups in the search on related problem instances.

6.2 Capacitated arc routing problem

To assess the generality of the proposed transfer learning culture-inspired operators for faster evolutionary optimization of problems by learning from past experiences, further experimental study on the domain of capacitated arc routing problems is conducted in what follows.

Table 6 Properties of the egl “E” Series CARP benchmarks
Table 7 Properties of the egl “S” Series CARP benchmarks

The well-established egl benchmark set is used in the present experimental study on CARP. It comprises two series of CARP instances, namely the “E” and “S” series, with a total of 24 instances. The detailed properties of each egl instance are presented in Tables 6 and 7, where “|V|”, “\(|E_R|\)”, “E” and “\(\textit{LB}\)” denote the number of vertices, the number of tasks, the total number of edges and the lower bound of each problem instance, respectively.

In CARP, each task (arc) is represented by a head vertex, a tail vertex, a travel cost and a demand (service cost); note the contrast to CVRP, where a task is represented by a single vertex. The arc information is therefore pre-processed to obtain positional vertices. The shortest-distance matrix of the vertices is first derived by means of Dijkstra’s algorithm [68], using the distances available between the vertices of a CARP. The coordinate features (i.e., locations) of each task are then approximated by means of multidimensional scaling [69]. In this manner, each task is represented as a node with coordinates, whose dimensionality is governed by the multidimensional scaling. A CARP instance in the current setting is then represented by an input vector \(\mathbf {X}\) composed of the coordinate features of all tasks in the problem. With such a representation, a standard clustering approach such as the K-Means algorithm can be applied to the CARP for task assignment, followed by a pairwise-distance sorting of tasks to preserve the service orders. The label information \(\mathbf {Y}\) of each task belonging to the CARP instance is then defined by the optimized solution of the CARP.
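The pre-processing pipeline above can be sketched in a few dozen lines of Python. This is a minimal illustration, not the authors' implementation: it uses classical MDS (one particular multidimensional scaling variant) and plain Lloyd's k-means with a deterministic initialization, and it embeds the graph vertices directly; in practice, the coordinates of an arc task would be derived from the embeddings of its head and tail vertices.

```python
import heapq
import numpy as np

def dijkstra(adj, src):
    """Shortest distances from src; adj maps vertex -> [(neighbour, weight), ...]."""
    dist = {v: float("inf") for v in adj}
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def classical_mds(D, dim=2):
    """Coordinates whose Euclidean distances approximate the distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred squared distances
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dim]       # keep the largest eigenvalues
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

def kmeans(X, k, iters=50):
    """Plain Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = X[:k].copy()
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels
```

Chaining the three steps (all-pairs Dijkstra, MDS embedding, k-means assignment) yields the task-to-vehicle grouping; a subsequent pairwise-distance sort within each cluster would recover the service orders.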

Table 8 Solution quality of ILMA, ILMA-R, and ILMA-M on egl “E” and “S” Series CARP instances

In the empirical study, two commonly used population initialization procedures for ILMA are considered for comparison. The first is a simple random approach, labeled here as ILMA-R. The second is the informed, heuristic-based population generation procedure used in the baseline ILMA [61] for CARP, in which the initial population is formed by fusing chromosomes generated by Augment_Merge, Path_Scanning, Ulusoy’s Heuristic and the simple random initialization procedure.
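The fusion idea can be sketched generically as follows. This is only an illustrative stand-in: the real Augment_Merge, Path_Scanning and Ulusoy's Heuristic constructors are CARP-specific, so a simple greedy nearest-neighbour tour plays the role of the "informed heuristic" here, and the remainder of the population is filled with random permutations.

```python
import random

def nearest_neighbour_tour(dist, start=0):
    """Greedy constructive stand-in for a heuristic such as Path_Scanning."""
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda c: dist[last][c])  # closest unvisited task
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def fused_initial_population(dist, pop_size, rng):
    """Initial population: a few heuristic tours plus random permutations."""
    n = len(dist)
    pop = [nearest_neighbour_tour(dist, start=s) for s in range(min(3, n))]
    while len(pop) < pop_size:
        perm = list(range(n))
        rng.shuffle(perm)
        pop.append(perm)
    return pop[:pop_size]
```

The heuristic seeds give the search a head start while the random members maintain diversity, which is the rationale behind the fused initialization in the baseline ILMA.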

Our proposed transfer learning culture-inspired operators, which leverage past problem-solving experiences to bias the initial population of EA solutions in the conventional ILMA, are labeled here as ILMA-M. To facilitate a fair comparison and verify the benefits of the proposed transfer learning culture-inspired approach, the configurations of the evolutionary operators in the baseline ILMA and its variants are kept consistent with those reported in [61].

6.2.1 Results and discussions

On both the “E” and “S” series egl instances, ILMA-M has consistently converged more efficiently than ILMA and ILMA-R on the CARP instances considered. Similar trends of search speedup as observed on CVRP are also observed on CARP. Overall, ILMA-M brings up to 70 % savings in the number of fitness function evaluations needed to arrive at the solutions attained by both ILMA and ILMA-R on most of the CARP instances. For instance, it is worth noting that on instance “S1-B” (Fig. 9b), ILMA-M incurred a total of \(1.5\times 10^6\) fitness function evaluations to converge to the same solution found by ILMA and ILMA-R, which otherwise expended approximately \(6\times 10^6\) evaluations. This equates to a significant fourfold reduction in fitness evaluation cost by ILMA-M over ILMA and ILMA-R.

Fig. 9

Averaged search convergence traces (across 30 independent runs) of ILMA, ILMA-R, and ILMA-M on representative CARP “E” and “S” series instances. Y-axis: travel cost; X-axis: number of fitness evaluations. Note that ILMA-M is observed to converge significantly faster to the near-optimal solution on the respective CARPs than its counterparts

Further, Table 8 summarizes the solution quality attained by the respective algorithms on the “E”-series and “S”-series egl instances, across 30 independent runs. To assess statistical significance, the Wilcoxon rank-sum test at the 5 % significance level has been conducted on the experimental results. As observed, ILMA-M performed competitively with ILMA on instances “E1-A” and “S1-A”. This is expected, since these are the first problem instances encountered by ILMA-M on the “E” and “S” egl CARP series, respectively, where no useful prior knowledge is yet available. As more problem instances are solved, ILMA-M is observed to search much more efficiently with no loss in solution quality. In fact, it arrives at superior solution quality over ILMA in terms of Ave.Cost on 18 out of the 24 problem instances in the egl benchmark set. In terms of B.Cost, ILMA-M also achieved improved solution quality over ILMA and ILMA-R on 2 of the “E” series and 8 of the “S” series egl instances, respectively.
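For readers who wish to reproduce the significance testing, the rank-sum test can be computed with a few lines of Python. The sketch below is a minimal normal-approximation version (average ranks for ties, no tie correction); in practice one would typically call a library routine such as SciPy's `ranksums` instead.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.
    Returns (z statistic, p value). Ties get average ranks; the tie
    correction to the variance is omitted for brevity."""
    combined = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1                     # extend over the tied block
        avg = (i + j) / 2 + 1          # average rank of the tied block
        for k in range(i, j + 1):
            ranks[combined[k][1]] = avg
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(ranks[:n1])                # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean) / math.sqrt(var)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))  # two-sided
    return z, p
```

Applied to two sets of 30 final costs, a p value below 0.05 would indicate a statistically significant difference at the 5 % level, matching the protocol used in Table 8.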

6.3 Real world application: the package collection/delivery problem

In the courier business, package collection/delivery is among the most challenging tasks that courier companies confront every day. The package collection/delivery problem (PCP) can be defined as the task of servicing a set of customers with a fleet of capacity-constrained vehicles located at a single depot or multiple depots. In particular, the PCP can be seen as a variant of the vehicle routing problem with stochastic demand (VRPSD), in which the true demand, i.e., the actual weight and dimensions of each customer’s package, remains uncertain until the customer is serviced on site. In such situations, the vehicle’s remaining capacity may prove insufficient to accommodate the true demand upon arriving at the customer location, necessitating a return trip to the central depot for capacity replenishment before the affected customer can be serviced. This not only leads to a substantial increase in operational costs but, more importantly, risks customer dissatisfaction due to failures in meeting the advertised delivery time guarantees.

In the present context, the past optimized PCP routes and the new unseen PCP requiring route design are provided by the courier company. In the PCP of interest here, the archive of past experiences includes ten previously solved PCPs with diverse customer distributions and sizes (including uniformly distributed, well-clustered, etc., customers, ranging from about 100 to over 500). For the purpose of illustration, the customer distributions of two solved PCPs are depicted in Fig. 10a, b, and that of the new unseen PCP to be solved is depicted in Fig. 10c. On this problem, the recently reported state-of-the-art evolutionary search with Monte Carlo simulation for designing reliable VRPSD solutions [70], labeled here as CES, is considered as the baseline evolutionary solver. To speed up the search for reliable PCP routes, the proposed transfer learning culture-inspired operators are incorporated into CES, which is denoted here as MCES. Thus MCES differs from CES in that the former is equipped with an initial population of biased routing solutions generated using knowledge memes learned from the search experiences on the ten previously solved PCPs. To ensure a fair comparison, the parameter and operator settings of CES and MCES are kept consistent with those of [70].
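To make the stochastic-demand setting concrete, the Monte Carlo evaluation of a candidate route can be sketched as below. This is a hedged illustration of the general VRPSD recourse model (return to the depot on capacity failure, then resume), not the specific simulation procedure of [70]; the `demand_sampler` callback and the simple single-route setting are assumptions made for brevity.

```python
import random

def expected_route_cost(route, depot, demand_sampler, capacity, dist,
                        n_sims=2000, rng=None):
    """Monte Carlo estimate of a VRPSD route's expected travel cost.
    route: customer visit order; demand_sampler(c, rng) draws customer c's
    realized demand. On a capacity failure the vehicle returns to the depot,
    replenishes, and travels back to the affected customer."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_sims):
        load, cost, prev = capacity, 0.0, depot
        for c in route:
            cost += dist[prev][c]             # travel to the customer
            d = demand_sampler(c, rng)        # demand revealed on arrival
            if d > load:                      # failure: replenish at depot
                cost += dist[c][depot] + dist[depot][c]
                load = capacity
            load -= d
            prev = c
        cost += dist[prev][depot]             # final return to depot
        total += cost
    return total / n_sims
```

Averaging this estimate over many demand scenarios is what allows an evolutionary solver in the spirit of CES/MCES to rank candidate routes by their expected (rather than merely nominal) cost.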

Fig. 10

Faster evolutionary optimization of a real-world PCP by transfer learning from past solving experiences. The head portraits in the subfigures represent the customers to be serviced

The optimized routes obtained by MCES on the new unseen PCP are illustrated in Fig. 10d. With the availability of the knowledge memes mined from past PCP solving experiences, MCES attained a population of higher-quality solutions than CES right at the start of the search (i.e., at initialization). In addition, MCES searched more efficiently than CES throughout the entire run. Upon convergence, MCES showcased computational cost savings of more than 58 % in arriving at the same best solution found by the counterpart CES. It is worth highlighting that, since MCES and CES share a common evolutionary solver, the significant efficiency improvements of the former can be clearly attributed to the proposed transfer learning culture-inspired operators, which generate high-quality solutions in the initial population, thereby speeding up the evolutionary search through transfer learning from past experiences.

7 Conclusion

In this paper, we have proposed a new memetic computation paradigm, evolutionary optimization \(+\) transfer learning for search, which models how humans solve problems. We presented a novel study towards intelligent evolutionary optimization through the transfer of structured knowledge, in the form of memes learned as building blocks from previous problem-solving experiences, to enhance future evolutionary searches. In particular, four culture-inspired operators, namely Learning, Selection, Variation and Imitation, have been proposed and presented. The mathematical formulations of the cultural evolutionary operators for solving well-established NP-hard routing problems have been derived, where learning is realized by maximizing the statistical dependence between past problem instances solved and their respective optimized solutions. In contrast to earlier works, the proposed approach facilitates a novel representation, learning and imitation of generalized knowledge that provides greater scope for faster evolutionary search on unseen related problems. Further, comprehensive studies on the widely studied NP-hard CVRP and CARP domains have been conducted to demonstrate and validate the benefits of the proposed faster evolutionary search approach. Subsequent studies on a real-world package collection/delivery problem further verified the effectiveness of the proposed fast evolutionary approach.

Future work will explore the generality of the proposed paradigm in the context of dynamic optimization problems, where the problems at two subsequent time instances may share high similarity. Last but not least, although this paper focuses on evolutionary optimization approaches, the proposed culture-inspired transfer learning operators can also be applied to other population-based methods, such as particle swarm optimization and differential evolution.