1 Introduction

“It should be one’s sole endeavour to see everything afresh and create it anew.” - Gustav Mahler

Nobel Prize winner in Physics, Richard Feynman, was considered one of the quintessential problem solvers of the twentieth century. In a Chicago Tribune interviewFootnote 1 his colleague at Caltech, Murray Gell-Mann, another Nobel Prize awardee, who also was a world class problem solver in Physics and a champion for the study of complexity in science (Gell-Mann 1995), half-jokingly specified ‘Feynman’s Algorithm’ as a simple three-step procedure:

  1. Write down the problem.

  2. Think really hard.

  3. Write down the solution.

Puckish as this may seem, this “algorithm” rightfully models, at a very high level, much of the fundamental practice of Algorithmics and the Theory of Computer Science. When faced with a computational problem, algorithm designers and complexity theorists typically approach a whiteboard (or a blank page), think really hard for a while, sketch particular “complex instances of small size” that come to mind, think really hard (again) to try to identify the worst-case scenario, and finally write down the solution in the form of an algorithm (or a complexity proof that shows membership or some other type of tight result).

While this activity is accompanied by a toolkit of techniques and experience accumulated over the past decades, it is most certainly ad hoc. There is no consensus on what “writing down the problem” means. Ideally, computational benchmarking on all instances of small size can help to uncover the reasons why a heuristic may fail to obtain the optimal solution. Systematic benchmarking on small instances is rarely seen in practice, with most benchmarking being done a posteriori, on larger instances. Hooker already pointed at these issues a quarter of a century ago, both in Hooker (1995) and when he appealed for the establishment of a new “empirical science of algorithms” (Hooker 1994).

Undoubtedly, the algorithmics approach of using human intuition alone to generate worst-case instances has had its successes. For certain problems this standard approach is sufficient and relying on human intuition is all that is needed; a near-trivial example is the sorting of a list of numbers, where most of the “bad” algorithms are more complex than the “good” ones (cf. Bubblesort and Mergesort). However, the use of evolutionary algorithms to augment human intuition via “evolutionary attacks” has already been proposed for those algorithms (Cotta and Moscato 2003).

In spite of the success of using human intuition alone, the core difficulties with the approach run deep and propagate into the future, for instance, through the way we train our young peers. Indeed, we routinely train the next generation of computer scientists and applied discrete mathematicians by handpicking “the best examples” of this paradigm based on the successes of this approach, and we praise the beauty and simplicity of some proofs. When considering harder problems, however, this leads to the development of algorithms with obscure weaknesses and limitations and/or a lack of insight into their empirical success or applicable range. For many problems and many researchers, “algorithmic success” then becomes a one-off hit during their careers. We can read in a famous textbook, “The Algorithm Design Manual” by S. Skiena: “Algorithms are the ideas behind computer programs” (Skiena 2008), a partial definition that algorithms share with heuristics. Very little is said about where to find intuition for these ideas, or how we can systematize a process for algorithm design. The early take-home message for students of the book is “Searching for counterexamples is the best way to disprove the correctness of a heuristic” ... but Skiena also recognizes that this search “relies more on inspiration than exhaustion” (Skiena 2008, Sec. 1.3.3).

What happens when we run out of inspiration? Would it then be possible that we could globally and collectively be ‘running out of ideas’ for a given problem? Could we augment our intuition by employing exhaustive testing on instances of smaller sizes? Can we build a methodological bridge between algorithm and heuristic design by bringing benchmarking a priori instead of a posteriori? We aim at providing a case study on one of combinatorial optimization’s most studied problems as an illustrative example.

1.1 The vertex cover problem: much more than a case in point

The history of the well-known combinatorial problem k -Vertex Cover demonstrates the problem in an incisive way. Given an undirected graph G(V, E), a vertex cover \(V'\) is a subset of V such that for every edge in E at least one of its endpoints is in \(V'\). The problem of deciding whether a graph G has a vertex cover with at most \(k>0\) vertices is one of the most well-recognized “classical” examples of an NP-complete problem. In the optimization version (Min Vertex Cover), we aim at finding a cover of minimum cardinality.

The proof of the existence of a “straightforward algorithm” (quoting Garey and Johnson 1979) with a guaranteed performance ratio of 2 is both simple and elegant and, consequently, it has become standard practice to teach this proof in many undergraduate courses around the world. However, the weaknesses and limitations of this paradigm for algorithm design, which is based on a performance ratio and simplicity of proofs, and which certainly involves “think[ing] really hard”, are neither discussed nor questioned. In fact, it comes as a surprise to our students that very little other progress in the development of approximation algorithms for Min Vertex Cover has occurred since Gavril proved this result (the existence of a 2-approximation) in 1974 [known as a private communication to D.S. Johnson according to a well-known textbook (Garey and Johnson 1979)]. The best known approximation ratio for this problem is \(2 - \Theta (\frac{1}{\sqrt{\log |V|}})\), a result obtained approximately 30 years after Gavril’s proof (Karakostas 2005). There has also been progress on demonstrating the hardness of approximation: only three years earlier, Dinur and Safra (2002) proved that it is NP-hard to approximate the cardinality of the minimum vertex cover within any factor smaller than 1.36.

1.2 Fixed-parameter tractability: a new hope and a new stagnation

This problem also received a breath of fresh air: a string of exact algorithms for Min Vertex Cover has appeared over the past 30 years, each improving on the previous results. These are called fixed-parameter tractable algorithms. The problem quickly became the “superstar” of this new wave of algorithmic design hope. Today it is perhaps the most well-known problem of the computational complexity class known as FPT (i.e., the class of problems for which a fixed-parameter tractable algorithm exists).

This type of approach brought an interesting design strategy; it contained systematic theoretical analyses of instances of small size for the purpose of kernelization. These analyses bring, in some sense, an “implicit benchmarking” process via mathematically proven results on safe reduction rules. These ideas are explained in detail in Sect. 2. We give an example of a safe reduction rule in Sect. 2.1 and all the reduction rules used in this manuscript are given in Appendix A. They have proven sufficient to solve to optimality any instance of vertex cover with up to 7 vertices. However, after a very successful 15 years, here too we observe a stagnation: no significant further progress in the development of fixed-parameter tractable algorithms has occurred since 2005, i.e. fifteen years ago.

1.3 Is lack of progress “natural”?

How can we track progress? Unfortunately, there is no generic repository containing all theoretical results in computer science as a semantic database, but there is a very useful online compendium on approximation algorithms maintained by Viggo Kann since 1992.Footnote 2 From it we can see that many well-known problems in that compendium have also had very sporadic successes in terms of improving their approximability status over decades. It is also remarkable that vertex cover, the problem that in some sense initiated and helped champion the theory of fixed-parameter tractability, shows the same stagnation for graphs with 8 or more vertices, a situation that has changed little after nearly 15 years of research.

Should we calmly accept that “this is natural”? Would it be possible that some combinatorial problems are “just like that”? Or, we dare to ask, is it the case that the progress in theoretical computer science, and in this case graph theory, is indeed limited by the human cognitive capacity for identifying structure in complex objects?

We note that Thorup (1998) showed that structured programs (i.e. goto-free programs), “including, for example, short circuit evaluations and multiple exits from loops, have tree width at most 6”, so perhaps that is linked to the human cognitive capacity for the control flow of algorithms. However, today the usual route to justify the lack of progress over decades is to blame the “inherent hardness” of these problems, their “inapproximability” in some cases, or, in the case of fixed-parameter algorithms, the increasingly difficult mathematical proofs required to bring new advances. Of course, we also have another tempting escape route: we can blame the problem. Usually an open conjecture is invoked; in this case it was that there does not exist an algorithm with a fixed approximation ratio better than 2 (Feige 2003; Hochbaum 1983).

Both routes are directions for intellectual escapism and neither addresses the basic difficulty. Few researchers dare to challenge the common core of these highly entrenched methodological practices in mathematics and computer science, in particular, their lack of comprehensive experimentation with instances of small size. After all, nothing seems wrong with “Feynman’s Algorithm”. How can we blame ourselves for “Thinking really hard”? Well, in fact, perhaps we can blame the part that says that we should start with a blank page or a perfectly clean whiteboard. While we acknowledge that we do not yet have a final solution, we offer an alternative for consideration, and this manuscript is, humbly speaking, perhaps the first step in that new direction.

1.4 Structure of this paper

This paper is structured as follows: Section 2 discusses kernelization, an important concept for understanding the paper (while Appendix A presents the set of reduction rules that have been implemented and tested in this study). In Appendix B we point at some key surveys on current methods and practices of automated heuristic design. In Sect. 2.2 we present the results of these reduction rules on large-scale vertex cover benchmarking datasets. Section 3 lays out an augmented intelligence methodology for moving beyond the traditional approach to developing reduction rules and provides some partial conclusions. Section 4 details the design of three new heuristic algorithms built on the insights obtained with the approach of Sect. 3. Section 5 presents the results of these heuristics on the benchmarking datasets, together with statistical performance tests. In addition, we present the results of other metaheuristics used for the BHOSLIB and DIMACS datasets in Appendices C.1 and C.2, and a comparison of our heuristic with the Isolation Algorithm (IA) (Ugurlu 2012) in Appendix C.4. Section 6 discusses possible new reduction rules for the problem; Sects. 7 and 8 discuss the results and limitations of the study, and present the conclusions and future directions.

2 Fundamentals of kernelization

We will center the discussion on the k -Vertex Cover decision problem: “Given a simple undirected graph G(V, E), is there a \(V'\subseteq V\) such that \(|V'| \le k\)?” Then, from the point of view of parameterized complexity, the instance is the graph together with a parameter (i.e., (G, k)).

2.1 Preprocessing and reduction rules

In the area of Operations Research, some polynomial-time techniques have long been used in practice to deal with large instances of problems that have some “real-world origin”. They are called “safe data reductions”. Based on them, some pre-processing techniques often make seemingly intractable, large instances, such as those arising in machine learning or bioinformatics, small and tractable. A great example of this is Karsten Weihe’s “covering trains by stations” story (see Moscato (2019) for a detailed account of it). An example of a reduction rule for k -Vertex Cover is the following:

Lemma 1

Shared neighborhoods of two adjacent vertices - Let (G, k) be an instance of k -Vertex Cover with \(u,v \in V(G)\) where \(N(u) \subseteq N(v)-u\) and \(uv \in E(G)\). Let \(G'(V',E')\) be the graph obtained by deleting vertex v from G. Then (G, k) is a Yes instance of k -Vertex Cover iff \((G-v,k-1)\) is also a Yes instance.

Proof

 

\((\Rightarrow )\):

Let \(V'\) be a vertex cover of size k for G. We have two cases: \(v \in V'\) and \(v \notin V'\). If \(v \in V'\), then \(V' - v\) is a vertex cover for \(G-v\) of size \(k-1\). If \(v \notin V'\), we must have \(u \in V'\) to cover the edge uv. However as \(v \notin V'\), the edges wv for any \(w \in N(v)- u\) must be covered by w. In particular this implies that \(N(u) \subset V'\). Therefore we can take the alternate vertex cover \(V'' = (V'-u) \cup \{v\}\) and we again have the first case.

\((\Leftarrow )\):

Let \(V'\) be a vertex cover of size \(k-1\) for \(G'\). \(V'\) covers all edges of G except those edges wv for \(w \in N(v)\). Thus \(V' \cup \{v\}\) forms a vertex cover of size k for G.

\(\square \)

We note that this reduction rule does not say that vertex v has to be in a vertex cover of minimum cardinality (only that it is in one of size k). If we keep reducing the graph until a vertex cover is found, we can then revisit this vertex: if all vertices in N(v) are already in the vertex cover, then v does not need to be. The reduction rule is still valid as a mechanism to answer the decision version for the value of k, and if v is actually not needed, then a vertex cover of cardinality \(k-1\) may actually exist.
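To make the mechanics of such a rule concrete, the following Python sketch (using networkx) applies it once, reading the condition \(N(u) \subseteq N(v)-u\) as “every neighbour of u other than v is also a neighbour of v”; the function name is illustrative and this is not the implementation used in our experiments.

    import networkx as nx

    def shared_neighbourhood_rule(G, k):
        # One application of the rule of Lemma 1, under the reading above:
        # u and v are adjacent and every neighbour of u other than v is also
        # a neighbour of v. In that case v is placed in the cover: delete v
        # and decrease the parameter by one.
        for u, v in G.edges():
            for a, b in ((u, v), (v, u)):
                if set(G[a]) - {b} <= set(G[b]):
                    H = G.copy()
                    H.remove_node(b)          # b goes into the cover
                    return H, k - 1, b
        return G, k, None                     # rule not applicable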

2.1.1 Kernelization and fixed-parameter tractability

Formally proved rules like the one above were previously known as “preprocessing” in the area of Operations Research. It was noted nearly three decades ago that they can become powerful tools to reduce the dimensionality of an instance. The design of “fixed-parameter algorithms” depends on identifying these rules and proving their correctness. In general, in parameterized computational complexity the input is a pair (x, k) where x corresponds to an instance of a decision problem and \(k>0\) is a parameter which represents some characteristic of the instance. It is also assumed that k is independent of the size of the instance (i.e., it is assumed to be a fixed value or “fixed parameter”). The overall aim of the application of reduction rules is then generally to transform, in polynomial time, an instance (x, k) of a decision problem P into an equivalent instance \((x',k')\) such that \(|x'| \le |x|\), \(k' \le k\) and \(|x'|+k' \le f(k)\) for some fixed computable function f(k).

These transformations are obtained via the application of these ‘reduction rules’ and they provide an effective tool for pre-processing instances of NP-hard problems.

This type of technique is known as kernelization. For some problems it is possible to find a function f(k), depending only on k, bounding the size of a maximal reduced instance \((x',k')\) (Abu-Khzam et al. 2004).

If such a function exists, then the problem is said to be ‘kernelizable’, and this proves that it is in the computational complexity class called ‘FPT’ (the class of fixed-parameter tractable problems).

2.1.2 Strict kernelization

When a parameterized problem is in class FPT, it is often said that the combinatorial running-time “explosion” (currently conjectured to be something inherently unavoidable to most exact algorithms for NP-hard problems) has now been “confined to the parameters”.

More recently, the concepts of “diminishable problems” and “strict polynomial kernelization” have been introduced by Fernau et al. (2018). In this case it is required that \(|x'| \le f(k)\) for some function f(k) in \(k^{O(1)}\), and \(k' \le k+c\) for some constant c. This basically means that a “strict kernelization is a kernelization that does not increase the output parameter \(k'\) by more than an additive constant” (Fernau et al. 2018).

2.2 Are reduction rules identified for kernelization purposes useful in practice?

This is a fundamental question for practitioners who develop heuristics. Following Hooker’s classic “Testing Heuristics: We Have It All Wrong” (Hooker 1995), we stop to analyze the results of a computational experiment. We aim at getting some insight into whether the best set of techniques from fixed-parameter algorithmics fails to produce feasible solutions and to find the optimum for a large number of instances used for benchmarking.

Are these safe rules coming from fixed-parameter tractability useful to reduce the size of the instances? To address the question, we decided to benchmark the algorithm that is recognized as the best of its class. Since it is based on 10 reduction rules we call it ‘The 10-R approach’ and we provide an explicit description in Appendix A.

To test the effectiveness of The 10-R, we employed the widely known benchmarking instances from the Second DIMACS Implementation Challenge. The DIMACS datasets consist of collections of graphs initially proposed to challenge algorithm designers in the areas of maximum clique and graph coloring problems. For each type of problem, we have two sets of graphs: benchmark and volume. We applied our implemented reduction rules to reduce the number of vertices of these graphs. We used the following sets of instances:

  • DIMACS Benchmark Clique Instances

  • DIMACS Benchmark Graph Coloring Instances

  • DIMACS Volume Graph Coloring Instances

  • DIMACS Volume Clique Instances.

These instances range from approximately 200 to 4000 nodes and have roughly between 4000 and 4 million edges (e.g., C4000.5).

Table 1 Number of successful applications of each reduction rule on the DIMACS graph coloring benchmark and volume instances, totalled over all instances

To understand whether some of the rules in The 10-R have more success than others, we counted the number of times each of the reduction rules was able to safely reduce the problem. We present the results in Table 1. We found that the Struction rule is the dominant one (the one that most frequently reduces the size of the graph). The next most successful reduction rule is Complete neighborhood. The Degree Zero, Degree k and Crown rules were never successfully applied to reduce any graph used in this experiment, and the Linear Programming rule succeeded only five times. None of the reduction rules was able to reduce the size of any instance of the Clique type. Overall, in the light of the amount of theoretical interest and time dedicated to them over several decades, this is a rather disappointing, although not entirely unexpected, result. Current benchmarking challenge instances are still very difficult in practice for established kernelization approaches. This motivates the next step of our investigation.

3 Augmented intelligence discovery of new reduction rules

We now consider the following hypothesis: there exists another “yet to be discovered” reduction rule that could be better suited than the ones already known. We consider that it would be very unproductive to search for such rules by analyzing benchmarking data. The instances that we used in the previous experiment come from a variety of sources, present different (yet unknown) structural properties, and we would not get inspiration and/or insight from them due to their large size and density. An alternative approach is subsequently offered.

3.1 First step: test the current tools

We implemented in a single piece of software all the reduction rules in Abu-Khzam et al. (2004) and Chen et al. (2010) (i.e., The 10-R algorithm) and we extended it to be an exact Min Vertex Cover solver. We did this by applying, in tandem, The 10-R reduction rules followed by a reduction to an Integer Programming model for Min Vertex Cover which uses CPLEXFootnote 3 as a subroutine.
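The Integer Programming step is not spelled out above; the model such a solver receives is the standard textbook formulation of Min Vertex Cover, with a binary variable \(x_v\) indicating whether vertex v enters the cover:

$$\begin{aligned} \min \sum _{v \in V} x_v \quad \text {subject to} \quad x_u + x_v \ge 1 \;\; \forall \, uv \in E, \qquad x_v \in \{0,1\} \;\; \forall \, v \in V. \end{aligned}$$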

We then used this solver on the completely enumerated sets of non-isomorphic graphs with 2 to 7 vertices. The union of the reduction rules of the two papers cited above (Abu-Khzam et al. 2004; Chen et al. 2010) was able to solve all instances with up to 7 vertices, as expected. This confirms that our code reduces all of them in agreement with the existing theory.

3.2 Second step: build intuition by working at the interface

The next step is to increase the size of the instance slightly, to appreciate what may be escaping our intuition while still providing a relatively small instance of the basic problem. Things became really interesting this way. After application of The 10-R algorithm, only 5 irreducible graphs of size 8 (i.e., irreducible by The 10-R rules) remain from the grand total of all non-isomorphic graphs of size 8 (which is 12,346). This means that only five needed to be solved by the Integer Programming model using CPLEX that follows. All of these 5 irreducible graphs of size 8 have a minimum vertex cover of size five. They are shown in Fig. 1.
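As an illustration of this step, the enumeration and filtering can be scripted with nauty’s geng generator and networkx; apply_10R below is a placeholder for an implementation of the ten rules of Appendix A (our actual code is not reproduced here).

    import subprocess
    import networkx as nx

    # Generate all 12,346 non-isomorphic graphs on 8 vertices in graph6
    # format with nauty's geng (assumed to be installed) and keep those that
    # the reduction rules cannot fully reduce. apply_10R is a placeholder
    # for an implementation of the ten rules of Appendix A.
    out = subprocess.run(["geng", "-q", "8"], capture_output=True, text=True)
    irreducible = []
    for line in out.stdout.splitlines():
        G = nx.from_graph6_bytes(line.strip().encode())
        reduced = apply_10R(G)                # placeholder helper
        if reduced.number_of_edges() > 0:     # the rules did not finish
            irreducible.append(G)
    print(len(irreducible))                   # five graphs, as in Fig. 1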

Fig. 1

Irreducible instances of size 8 graphs (\(n=8\) and \(k=5\)) after application of all the reduction rules of references Abu-Khzam et al. (2004) and Chen et al. (2010). The vertices in red indicate a minimum cardinality vertex cover. Several vertex covers of size five exist for each of these instances (Color figure online)

This has been an inspiring and informative experiment. We know that The 10-R rules are able to reduce any instance of size 7, and “almost” do the same job for all instances of size 8. The existence of a relatively small subset of irreducible instances inspires the quest of identifying one or two additional rules that may reduce these instances as well, something which at present is an open problem. This provides the first outcome for further theoretical development of new fixed-parameter algorithms.

3.3 Third step: look ahead

It is then perhaps logical to look one step ahead to see what we are facing. If we do not find a reduction (or reductions) for size 8, these graphs will remain as subgraphs of non-reducible graphs of size 9. How many of those are non-reducible by The 10-R? We did this analysis as well. Of the 274,668 non-isomorphic graphs of size 9, the reduction rules failed to solve only 118 instances, showing that the number of irreducible instances has increased only by a factor of roughly 23.

Fig. 2

A visual illustration of the outcome of the Network Alignment of two graphs. Here, a the graph G8-02 is aligned with b G9-003. c The matching graph of NetworkAlignment(G8-02, G9-003) is shown. In the matching graph, the vertex labels are formed with the labels from the bigger graph to the smaller graph, hence \(V(G9)\rightarrow V(G8)\). A vertex without any match only has the vertex label from the bigger graph, G9. Common or matched edges are shown in bold

We think that this is also important for theoretical development. Many inductive proofs require specifying that a graph is free of some property, which may be shown to be the case after a reduction rule has been executed. We are thus interested in collecting a database of graphs that are non-reducible by the existing reduction rules and that can then be compared with others of smaller size.

3.4 Fourth step: identify common structure

We then investigated whether there was some sort of common structure (i.e., a subgraph present in some pairs drawn from these sets). We employed a highly effective memetic algorithm for the network alignment problem (Mathieson et al. 2019). This software has been very useful since it is a fast heuristic, allows exploratory insights and scales well for the tasks at hand.

Also, in this case, the results were very interesting and they promise to be informative for the theoretical development of fixed-parameter tractable algorithms. For instance, the 5 non-reduced instances of size 8 do not seem to have a significant common isomorphic subgraph between them when taken in pairs, indicating that perhaps several new reduction rules would be needed to fully reduce these instances (Fig. 2).

The memetic algorithm for network alignment tries to maximize one objective function, the so-called Edge Correctness (EC) score. Given an alignment of two networks, the EC score is the ratio between the number of edges of the smaller network that are mapped to edges of the larger network and the total number of edges in the smaller network. To understand whether there is some common structure between irreducible instances of size 8 (G8) and those of size 9 (G9), we computed the matrix of the EC scores obtained by our memetic algorithm for all 5 (G8) \(\times \) 118 (G9) pairs of graphs. Figure 2 shows an example of the network alignment between a pair of G8 and G9 graphs.
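Computing the EC score itself is simple once an alignment is available; a minimal sketch follows, where mapping is an injective dictionary from the vertices of the smaller graph to those of the larger graph, as produced by the memetic algorithm of Mathieson et al. (2019).

    import networkx as nx

    def edge_correctness(G_small, G_large, mapping):
        # Edge Correctness: the fraction of the smaller graph's edges whose
        # images under the alignment `mapping` (a dict from V(G_small) to
        # V(G_large)) are also edges of the larger graph.
        mapped = sum(1 for u, v in G_small.edges()
                     if G_large.has_edge(mapping[u], mapping[v]))
        return mapped / G_small.number_of_edges()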

We executed the Network Alignment of G8 with G9 for 20 independent runs of the memetic algorithm. We used the best alignment result (i.e., the one that maximizes the Edge Correctness score in these 20 runs). To make sense of the results we used the algorithm presented in Moscato et al. (2007) to produce a permutation of rows and columns that maximizes the correlations observed. See Fig. 3 for the result.

Fig. 3

Heatmap of the Edge Correctness scores obtained thanks to the best alignments of G8 with G9. The heatmap has been chopped into two pieces to fit into the page width. The left part is placed at the bottom, while the right part is on the top. The scores are sorted using the memetic algorithm of reference Moscato et al. (2007) using the Euclidean distance as a measure of similarity between rows and columns (Color figure online)

3.5 Results of the structural analysis

In the heat map (shown in Fig. 3) the red color indicates that the Edge Correctness between a pair of graphs is 1.0 (i.e., the graph of eight vertices is a subgraph of the graph of nine). The minimum observed Edge Correctness is 0.75 and it is represented in black. The average Edge Correctness is 0.96 and it is shown in green.

3.5.1 Several non-reducible graphs of size nine have all the non-reducible graphs of size eight as subgraphs

After executing the Network Alignment program and grouping the results by their type of vertex-cover match, we found a relatively large number (25) of size-nine non-reducible graphs G9 that have all the non-reducible graphs of size eight as subgraphs. An example of them is G9-003, shown in Fig. 2; its network alignment output shows a perfect alignment with all of the G8 graphs of Fig. 1.

3.5.2 One graph in G8 is a subgraph of 114 out of 118 graphs of G9

Since G8-01 is a graph that aligned perfectly (EC = 1.0) with most of the non-reducible graphs of size 9 (114 out of 118), its study may be prioritized among the five non-reducible graphs of size 8. A reduction rule that safely reduces G8-01 could potentially lead to reducing almost all those of G9.

If this proves successful, we would still have to find potential reduction rules for the remaining four G9 graphs (those with which G8-01 does not align perfectly). In the search for common structure, we searched for the best pairwise alignments of these four graphs. The resulting six network alignments are shown in Fig. 4. We used the yED software editorFootnote 4 to visualize the graphs, with the ‘Hierarchical Layout’ and the ‘BFS Layering’ (seeFootnote 5 for details of the method). It is clear that all these graphs have an independent set of size three and a vertex cover that includes all the nodes not in the independent set.

3.5.3 Other alignments

Among 118 G9 graphs, G8-02 aligned perfectly with 28 of them, G8-03 with 64, G8-04 with 51 and G8-05 with 89 of the G9 graphs.

3.5.4 Other characteristics of the graphs of size 9

There are other characteristics of the graphs introduced in Sect. 3.5.2. From Fig. 4 we can observe that G9-015 has its best alignments with G9-090 and G9-092, each leaving only two unmapped edges. On the other hand, G9-015 has its worst alignment with G9-110, which contains four unmatched edges. G9-090 achieved its best alignments, with only two unmapped edges, with each of G9-092 and G9-110. Finally, G9-092 has its worst alignment with G9-110, with seven unmatched edges.

Fig. 4

Network Alignment outcome of four G9 graphs. Vertex labels in the aligned graphs are formed as “vertex label of first graph \(\rightarrow \) vertex label of second graph”. Matched edges are shown in bold

3.6 Partial conclusions

At this stage, it is perhaps relevant to highlight how the computationally-supported process introduced in this section could augment our intuition in searching for structures that can be exploited to develop reduction rules and fixed-parameter tractable algorithms. We have previously shown that computational results on a large dataset of well-known benchmark instances illustrate that the set of 10 reduction rules that builds a kernel for the current best fixed-parameter algorithm (in the sense that it has the best theoretical bounds) is actually not very useful in practice. We also asked which other useful reduction rules have yet to be added to the existing repertoire. Utilising a computational method to augment our intuition, we have identified, using exhaustive enumeration of graphs of order up to nine and powerful methods for network alignment, several small instances which contain structures that the employed reduction rules cannot deal with. Conserved structures can give us some clues about which types of reduction rules are still needed to solve all instances up to size 9.

It is now time to turn our attention to what these structures can also give us in terms of the design of heuristics for this problem. In the next section we will look at how other properties of these graphs can give us the necessary intuition to produce heuristics for the Min Vertex Cover problem that find the optimal solutions of all these instances on graphs up to size 9. Later we will show how these new heuristics perform on the same dataset of benchmarking instances used before.

4 Three new heuristics for finding minimum vertex covers based on k-class edge scores

In this section we present three new heuristics based on the intuition gained from the computational experiments conducted so far. We start by introducing some basic concepts.

4.1 “Hierarchical optimal” layering and triangles

Another alternative offered by the yED software to lay out graphs is called “Hierarchical Optimal Layering”. An illustrative example of it, for the same four graphs aligned in Fig. 4, can be found in Fig. 5. Without having an explicit algorithmic definition of what the software is doing, we have noticed that, in general, the method allocates a “layer” to each node of the graph. Nodes in the same layer seem to constitute an independent set.

Fig. 5

Four The 10-R-irreducible G9 graphs displayed using the ‘Hierarchical Optimal Layering’ option of the yED software, revealing their edge density and an interesting structure of layers of independent sets and a relatively high density of triangles

If such a layered arrangement can be found in polynomial time by a well-specified algorithm, it may be possible to come up with another polynomial-time algorithm that exploits this structure as a precondition and safely reduces the instance size.

We can also observe in Fig. 5 that the graphs have connections between layers and that they have a relatively large number of triangles, i.e., cliques of size three in the graph. These extra insights have motivated new heuristic approaches. Before we discuss them we need to introduce some definitions for clarity.

4.2 The truss decomposition of a graph and other necessary definitions

With the new intuition gained from this analysis, we are now proposing several heuristic approaches. We will also use the information of a triangle-based decomposition of the graph via the k-truss (Wang and Cheng 2012).

Definition 1

(k-truss of a graph) The k-truss of a graph \(G=(V,E)\) is the largest subgraph of G in which every edge is contained in at least \((k-2)\) triangles within the subgraph.

Consequently, the k-truss decomposition of a graph G is the set of all non-empty k-trusses of G for all \(k\ge 2\).

Definition 2

(Truss number): The truss number of an edge \(e \in E\) of a graph G(V, E) is the number of triangles in G that contain e.

We can then define the k-class score of an edge as a function of the truss numbers as:

Definition 3

(k -class edge score): The k-class score of an edge \(e\in E\), denoted as kcEdgeScore(e), is the maximum truss number (k) of the edge obtained from the k-truss decomposition of the graph.

We also need to define another score that will be useful later on when proposing some heuristics in Sect. 4.6.

Definition 4

(k-class score sum of a vertex): The k-class score sum of a vertex in a k-truss decomposition of a graph G is defined as

$$\begin{aligned} kcVertexScore(u) = \sum _{v \in N(u)}{kcEdgeScore(uv)} \end{aligned}$$
(1)

i.e., the sum of the k-class edge scores over all edges incident to node u.
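These scores can be computed with a standard peeling procedure for the truss decomposition; the sketch below (an illustration, not an excerpt from our solver) returns the k-class edge score of Definition 3 for every edge and the vertex score of Eq. (1).

    import networkx as nx

    def k_class_edge_scores(G):
        # k-class edge score (Definition 3): the largest k such that the
        # edge survives in the k-truss. Simple peeling, not optimized.
        H = G.copy()
        scores = {}
        k = 2
        while H.number_of_edges() > 0:
            k += 1
            removed = True
            while removed:
                removed = False
                for u, v in list(H.edges()):
                    # support: number of triangles of H containing (u, v)
                    if len(set(H[u]) & set(H[v])) < k - 2:
                        scores[frozenset((u, v))] = k - 1
                        H.remove_edge(u, v)
                        removed = True
        return scores

    def kc_vertex_score(G, scores, u):
        # k-class score sum of a vertex (Definition 4, Eq. 1)
        return sum(scores[frozenset((u, v))] for v in G[u])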

We show the k-truss decomposition of the G8 graphs in Fig. 6. Here, the graph G8-01 does not have any triangles, so no k-truss decomposition is available for it. An edge with a k-class score of 4 belongs to a subgraph in which it is contained in at least \((4-2)=2\) triangles. We now describe the algorithms in detail.

Fig. 6

The G8 graphs and their k-class edges (\(3 \le k \le 4\)). The dashed edges in the G8 graphs correspond to the non k-class edges (\(k \le 2\)). In the G8 graphs we have two values of k-classes (3 and 4); 3-class edges are coloured green and 4-class edges red (Color figure online)

4.3 The Extended Berra “Lemma”

Although we are not aware of any contributions to mathematics by the professional baseball player and coach Mr. Lawrence Peter “Yogi” Berra, he is well known for his many quotes, many of them hilarious, which are nowadays famous in popular culture. One in particular motivates the name of our branching strategy:

“When you come to a fork in the road, take it.”

so in the context of celebrating this quote we can write:

Lemma 2

(Berra “Lemma”) Let (G, k) be a reduced instance of k -Vertex Cover and \(uv \in E(G)\). If \((G-v-u,k-2)\) is completely reducible, take it.

In fact, Gavril’s 2-approximation algorithm can be seen as a randomized heuristic that finds “a fork in the road” (an edge) at random and includes the two nodes it connects in the vertex cover. However, it does so without any guarantee that the remaining graph is reducible. There is also no guarantee that both endpoints belong to an optimal vertex cover.

An extension into a three-way branching seems natural:

Lemma 3

(Extended Berra “Lemma”) Let (G, k) be a reduced instance of k -Vertex Cover and \(uv \in E(G)\). If \((G-u,k-1)\), \((G-v,k-1)\) or \((G-v-u,k-2)\) is completely reducible, take it.

meaning that, when considering an edge (u, v), we will explore in parallel the three parameterized alternatives (i.e., u alone, v alone, or both u and v are part of the vertex cover).

4.4 Final check for redundancies: the necessary final step

Take any one of the non-reducible graphs of size 8 (G8) and apply the Extended Berra “Lemma” to an arbitrarily chosen edge. We are in one of the three-way splits; assume that u is assigned to be in the vertex cover. The resulting graph is reducible by The 10-R algorithm because it is of size 7, and we know that any graph of that size is reducible. Still, we need to check whether the vertex cover found for this graph of size 7 contains the whole neighborhood of u in the original graph of size 8; if it does, it is not necessary to include u. This means that the decisions taken after the application of the Berra “Lemma” need to be revisited. Our heuristic therefore uses a final pass to check that no vertex in the cover is “redundant”, i.e., a vertex all of whose neighbors are already in the cover.
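A sketch of this final pass, written here for concreteness (the helper name is illustrative):

    def remove_redundant_vertices(G, cover):
        # Drop any vertex whose whole neighbourhood is already in the cover;
        # the check is made against the updated cover, so no edge is ever
        # left uncovered.
        cover = set(cover)
        changed = True
        while changed:
            changed = False
            for v in list(cover):
                if all(w in cover for w in G[v]):
                    cover.remove(v)
                    changed = True
        return cover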

4.5 Greediness guided by a function of the k-class edge scores

Note also that an arbitrary edge of a graph in G8 would lead to a reducible graph, so the Berra “Lemma” only needs to be applied once. However, for a larger graph, we may need to have a guiding function to decide on which edge to “fork”. Several heuristics for Min Vertex Cover have used a number of strategies, e.g. selecting a node of maximum degree and, when there are ties, the node u that minimizes the number of edges among N(u). While other more complicated branching rules exist (see for instance Akiba and Iwata 2016), we present one here based on triangles and the k-class edge scores which is novel in the area. More importantly, it is directly motivated by our computational experiment, which has clearly augmented our intuition about the problem, and this is a main research theme of our study.

4.6 Three proposed heuristics

To illustrate the results of the experimental process, we present three heuristics motivated specifically by empirical observation of the structure of these instances. Later, in Sect. 5 we demonstrate their effectiveness in comparison to the prior results of the exact reduction algorithms on the benchmarking datasets.

4.6.1 Heuristic 1

The first heuristic employs the k-class score sum of a vertex (defined in Sect. 4.2) to prioritise vertices for inclusion in the vertex cover. Given a graph \(G=(V,E)\) as input, the heuristic first sorts the vertices into a list L by k-class score sum (highest first), then by highest degree among vertices with the same k-class score sum, with any remaining ties broken uniformly at random.

The heuristic then proceeds to build a vertex cover by successively selecting the next available vertex v from the list and examining its neighborhood. If v has the highest k-class score sum among its neighbors, it is added to the cover and removed from the graph (along with its incident edges, of course). If there is at least one other vertex with the same k-class score sum in its neighborhood, all such neighbors are collected into a set \(N^{k}_{v}\) and the heuristic branches on either adding v to the cover and removing it and all its incident edges, or adding a randomly selected \(w \in N^{k}_{v}\) to the cover and removing it and all its incident edges. In all cases, the removed vertex is also deleted from L, the degrees and k-class score sum of the remaining vertices are updated and L is resorted.

A branch of the heuristic terminates successfully when a cover is found in some sub-branch (i.e., when all edges in G have been removed), or unsuccessfully when the size of the cover exceeds the parameter k in all sub-branches. A final check then removes any vertex all of whose neighbors are already in the cover, and the smallest cover C found is returned.

The pseudocode is given in Heuristic 1.

figure a
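The ordering on which the pseudocode relies can be summarised by the following sketch (the priority list only, not the branching); kc_score stands for the k-class score sum of Definition 4.

    import random

    def h1_priority_list(G, kc_score):
        # Vertices sorted by k-class score sum (highest first), then by
        # degree (highest first); remaining ties are broken at random.
        return sorted(G.nodes(),
                      key=lambda v: (-kc_score[v], -G.degree(v),
                                     random.random()))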

4.6.2 Heuristic 2

The second heuristic for Min Vertex Cover is based on the k-class edge score (Definition 3), the maximum degree of the vertices and the number of edges among the neighbors of a vertex. Heuristic 2 gives the pseudocode.

figure b

The heuristic receives the graph \(G=(V,E)\) as input. It iteratively adds a vertex v to the cover C and removes that vertex and all its incident edges from G. This process repeats until the graph is completely disconnected. To choose a vertex to add to the cover C, the heuristic considers several scores. First, it selects the set of edges \(E'\) with the maximum k-class edge score and chooses the endpoint with the maximum degree among those edges. If there are multiple vertices with the maximum degree, it selects the vertex with the minimum number of edges among its neighbors. Any further tie is broken by choosing a vertex uniformly at random. The chosen vertex v is added to the cover C. Then we remove v and its incident edges from the graph, and update the associated scores (k-class edge score, degree, etc.).

After the main loop is complete, C is checked for vertices whose entire neighborhood is also in C; such redundant vertices are removed. The cover C is then returned.
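The sketch below restates this greedy loop in Python, following the prose description above rather than the exact pseudocode of the figure; it reuses the illustrative helpers k_class_edge_scores and remove_redundant_vertices from the earlier sketches.

    import random

    def heuristic2(G):
        # Greedy cover guided by the k-class edge scores, the maximum degree
        # and, as a tie-breaker, the number of edges among a vertex's
        # neighbours.
        H = G.copy()
        cover = set()
        while H.number_of_edges() > 0:
            scores = k_class_edge_scores(H)          # recomputed each step
            best = max(scores.values())
            cand = {u for e, s in scores.items() if s == best for u in e}
            max_deg = max(H.degree(u) for u in cand)
            cand = [u for u in cand if H.degree(u) == max_deg]
            if len(cand) > 1:
                # fewest edges among the neighbours of the candidate vertex
                among = {u: H.subgraph(H[u]).number_of_edges() for u in cand}
                m = min(among.values())
                cand = [u for u in cand if among[u] == m]
            v = random.choice(cand)
            cover.add(v)
            H.remove_node(v)
        return remove_redundant_vertices(G, cover)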

4.6.3 Heuristic 3

The third heuristic considers the degree of the vertices, the degree imbalance of the endpoints of edges, and the k-class edge score. The whole process is illustrated by the pseudocode of Heuristic 3.

figure c

This heuristic receives the graph \(G=(V,E)\) as input. It iterates by adding one vertex v at a time to the cover C and removing that vertex from G. This process repeats until all edges have been removed. To choose a vertex to add to the cover C, the heuristic considers several scores. Initially it finds the set of vertices \(V_m\) with minimum degree. From \(V_m\), we compute the set of edges \(E' = \{(u,v) \in E\}\) where at least one endpoint is in \(V_m\), and we compute the absolute degree imbalance (\(|d(u)-d(v)|\)) between the two endpoints of the edges in \(E'\). If there is a unique edge \(e \in E'\) with the maximum degree imbalance, the endpoint of e with the highest degree is chosen as the vertex to be added to the cover. If there is more than one such edge, we consider the k-class edge score: from \(E'\) we select the set of edges \(E''\) with the maximum k-class edge score and we add the endpoint with the maximum degree to the cover C. Any remaining tie is broken uniformly at random. The chosen vertex v is then added to the cover C. We then remove v from G, and update the associated scores (k-class edge score, degree imbalance, etc.). We continue this process until the graph is completely disconnected. Finally, if there exists any vertex \(v \in C\) with \(N(v) \subset C\), such that its deletion from the cover does not leave an edge uncovered, then we remove v from C. We then return the cover C.
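For concreteness, one selection step of this heuristic can be sketched as follows (a simplified restatement of the description above, not the exact pseudocode of the figure); scores maps each edge, stored as a frozenset, to its k-class edge score.

    import random

    def h3_pick_vertex(H, scores):
        degs = dict(H.degree())
        # vertices of minimum (non-zero) degree
        min_deg = min(d for d in degs.values() if d > 0)
        V_m = {v for v, d in degs.items() if d == min_deg}
        # edges with at least one endpoint of minimum degree
        E1 = [(u, v) for u, v in H.edges() if u in V_m or v in V_m]
        # keep only the edges of maximum degree imbalance |d(u) - d(v)|
        best_imb = max(abs(degs[u] - degs[v]) for u, v in E1)
        E1 = [(u, v) for u, v in E1 if abs(degs[u] - degs[v]) == best_imb]
        if len(E1) > 1:
            # several candidate edges: use the k-class edge score
            best_kc = max(scores[frozenset(e)] for e in E1)
            E1 = [e for e in E1 if scores[frozenset(e)] == best_kc]
        # endpoint of maximum degree; any remaining tie broken at random
        cand = {max(e, key=lambda x: degs[x]) for e in E1}
        return random.choice(list(cand))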

5 Results of the proposed heuristics on benchmark datasets

We applied the three proposed heuristics for the Min Vertex Cover problem to the complement graphs of the Clique instances from the DIMACS challenge datasets. We executed the algorithms on the Research Compute Grid (RCG),Footnote 6 allocating 2 CPU cores, 8GB of memory and 100 hours of CPU wall time per instance. We report the best performance of the proposed heuristics against the branch-and-reduce based exact algorithm (FPT) implemented in Akiba and Iwata (2016). For some instances, the FPT-based exact solution method or the heuristics exhausted the maximum allocated CPU time before reaching any solution; in that case, we put a dash ‘–’ in the corresponding cell of the reported table.

5.1 Reduction outcome by the proposed heuristics on BHOSLIB Dataset

Ke Xu from Beihang University, Beijing, China maintains a benchmark dataset for graph problems, named BHOSLIB,Footnote 7 which contains 40 instances for the Min Vertex Cover problem. We used those instances to compare the performance of our proposed heuristics in Table 2. Comparing the best results obtained by the three proposed heuristics, Heuristic 1 (h1) achieved the best performance on 8 instances, Heuristic 2 (h2) on 21 and Heuristic 3 (h3) on 19 instances from the BHOSLIB dataset. The runtime requirements for \(t_{h3}\) are significantly lower than those of \(t_{h1}\) and \(t_{h2}\).

Table 2 Execution outcome of proposed heuristics for Min Vertex Cover problem on the BHOSLIB instances

5.2 Reduction outcome by the proposed heuristics on the DIMACS graphs for clique (complement instances) problem

We applied the heuristics to the complement graphs of the Maximum Clique instances of the DIMACS Challenge. The results are shown in Table 3. Comparing the best results obtained by the three proposed heuristics, Heuristic 1 (h1) achieved the best performance on 27 instances, Heuristic 2 (h2) on 26 and Heuristic 3 (h3) on 59 instances from the DIMACS dataset. h1 and h2 were unable to complete the execution for 10 and 1 instances, respectively, in the allocated time of 360,000 s. In contrast, h3 was able to produce its results within a fraction of that time.

Table 3 Execution outcome of proposed heuristics for Min Vertex Cover problem on the complement graphs of Clique instances from DIMACS challenge dataset

5.3 Statistical test of the performances of proposed three heuristics

We used the vertex cover sizes obtained by each of the heuristics (h1, h2, h3) presented in Tables 2 and 3. For the statistical test we consolidated only the results of those instances where all three heuristics were able to complete the execution within the allocated CPU time (since it is already clear that h3 was the most successful in delivering a cover for all instances). We then want to determine whether there are significant differences in the vertex cover sizes found by the heuristics on the consolidated 108 instances from the BHOSLIB and DIMACS datasets. A Friedman rank-sum test on those data returned a Friedman chi-squared statistic of 28.155556 with a p-value of \(7.693054 \times 10^{-7}\). This p-value rejects the omnibus null hypothesis; hence, significant differences exist in the ranking of the heuristics.
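The omnibus test can be reproduced with scipy; the lists below are placeholders for the 108 consolidated cover sizes of Tables 2 and 3 (the Nemenyi post hoc comparison of Fig. 7a can be obtained, for example, with the posthoc_nemenyi_friedman function of the scikit-posthocs package).

    from scipy.stats import friedmanchisquare

    # Placeholder cover sizes for h1, h2 and h3 on the same instances;
    # substitute the 108 consolidated rows of Tables 2 and 3.
    h1 = [431, 560, 845, 1150, 2083]
    h2 = [430, 558, 842, 1149, 2080]
    h3 = [429, 557, 840, 1146, 2079]

    stat, p = friedmanchisquare(h1, h2, h3)
    print(f"Friedman chi-squared = {stat:.4f}, p-value = {p:.3g}")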

The omnibus p-value is well below the conventional critical threshold of 0.05, so we conducted a post-hoc test to find which pairs have significant differences in their performances. The p-values obtained for the pairwise comparisons using the Nemenyi post hoc test are presented in the heatmap shown in Fig. 7a. We also computed a critical difference plot [as proposed in Demsar (2006)] to show the significant differences in the rankings of the heuristics in Fig. 7b.

Fig. 7

Visual illustration of the statistical significance tests for the rankings of the proposed heuristics (h1, h2, h3) on the BHOSLIB (Table 2) and DIMACS (Table 3) instances. a Heatmap showing the p-values obtained for the Nemenyi post hoc test. b Critical Difference (CD) plot showing the heuristics ordered on a line by their median ranking; heuristics that are not significantly different are connected by a horizontal line (Critical difference = 0.279282)

Both the heatmap and the critical difference plot reveal that there is no significant difference between h1 and h2. However, h3 shows a significant difference in performance from both h1 and h2. The critical difference plot shows that the median rankings of both h1 and h2 lie between 2 and 3, while the median ranking of h3 lies between 1 and 2. From these statistical tests it is evident that both h1 and h2 are outperformed by h3. We will therefore use the proposed heuristic h3 in the remaining sections of the paper.

5.4 Reduction outcome by the proposed h3 heuristic on a small subset of DIMACS Datasets used by Asgeirsson and Stein (2005)

A small subset of the DIMACS instances and the Clique complements is used in Asgeirsson and Stein (2005). We compare the outcome of our h3 with their results in Table 4. We refer to Asgeirsson and Stein’s method as hp. The names of the clique, coloring and clique-complement instances end with .clq, .col and .\(\hbox {clq}^\prime \), respectively.

Table 4 Execution outcome of proposed heuristic 3 for Min Vertex Cover problem on a small subset of DIMACS datasets used by Asgeirsson and Stein (2005)

5.4.1 Statistical test of results obtained for datasets used by Asgeirsson and Stein (2005)

The Wilcoxon signed-ranks test (Wilcoxon 1992) is a non-parametric statistical hypothesis test that has been suggested as a useful way to compare two algorithms over multiple datasets (Demsar 2006). We used the Wilcoxon signed-ranks testFootnote 8 to determine whether any difference exists between hp and h3 for the results presented in Table 4.

We calculated both the W-value and the z-value and report them here. However, since we have fewer than 20 pairs of samples to compare in Table 4, we use the W-value to evaluate the hypothesis. We found three ties, which yields a number of samples \(N=12\). For those samples, with a confidence level \(\alpha =0.05\), the two-sided Wilcoxon signed-rank test gave a W-value of 9.5. According to the table of exact critical values for Wilcoxon’s test, the critical value for W at \(N=12\) (for \(p < 0.05\)) is 13, and the result is significant when W is smaller than or equal to this value. Hence, the result is significant at \(p < 0.05\) and “the difference between the rankings of the methods (hp and h3) is significant”.

We then used a form of the binomial test, known as the sign test (Salzberg 1997; Sheskin 2000), to find which method is significantly better. To do so, we counted the number of wins, losses and ties of h3 over hp (win \(=10\), loss \(=2\), tie \(=2\)). The 10 wins of h3 over hp (out of the \(N=12\) datasets, not counting ties) support, with 95% confidence, the claim that “h3 is significantly better than hp”.
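Both tests are available in scipy (version 1.7 or later for binomtest); the paired lists below are placeholders for the cover sizes of Table 4.

    from scipy.stats import wilcoxon, binomtest

    # Placeholder paired cover sizes for hp and h3 (substitute Table 4).
    hp = [20, 24, 33, 59, 122, 245, 310, 412, 515, 620, 731, 842]
    h3 = [19, 24, 32, 58, 120, 244, 308, 410, 515, 618, 729, 840]

    # Two-sided Wilcoxon signed-rank test; zero differences (ties) are
    # discarded, as in the calculation reported above.
    w, p = wilcoxon(hp, h3, zero_method="wilcox", alternative="two-sided")

    # Sign test: a binomial test on the number of wins of h3 over hp
    # (a smaller cover counts as a win for h3).
    wins = sum(a > b for a, b in zip(hp, h3))
    losses = sum(a < b for a, b in zip(hp, h3))
    sign_p = binomtest(wins, wins + losses, 0.5, alternative="greater").pvalue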

6 New reduction rules for k -Vertex Cover?

We now return briefly to the augmented intuition approach for the development of reduction rules for vertex cover proposed in Sect. 3. The exploration of the graphs of size 9 remaining after reduction, and their analysis using network alignment techniques to help identify common substructure, now suggests the following two rules:

Lemma 4

Let (G, k) be an instance of k -Vertex Cover with \(\alpha (G) = 2\). Then (G, k) is a Yes instance iff \(k \ge |V(G)| - 2\).

Proof

This result is a simple corollary of the classical k -Vertex Cover/Independent Set NP-completeness reduction. \(\square \)

Lemma 5

Let (G, k) be a reducedFootnote 9 instance of k -Vertex Cover with a vertex v not in any independent set of size at least 3. Then (G, k) is a Yes instance iff \((G-v,k-1)\) is a Yes instance.

Proof

As \(\alpha (G) \ge 3\), v cannot be in the complement of the minimum vertex cover, i.e., it must be in the minimum vertex cover. Therefore it is safe to include this vertex in any vertex cover. \(\square \)
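The condition of Lemma 5 can be checked directly in polynomial time; a sketch follows (the helper names are illustrative).

    from itertools import combinations

    def in_independent_triple(G, v):
        # Is v contained in some independent set of size 3, i.e. do two
        # non-adjacent vertices exist that are both non-neighbours of v?
        non_nbrs = [u for u in G if u != v and not G.has_edge(u, v)]
        return any(not G.has_edge(x, y)
                   for x, y in combinations(non_nbrs, 2))

    def lemma5_rule(G, k):
        # One application of Lemma 5 (assuming the instance is reduced and
        # alpha(G) >= 3): a vertex in no independent triple must be in the
        # cover, so delete it and decrease k.
        for v in list(G.nodes()):
            if not in_independent_triple(G, v):
                H = G.copy()
                H.remove_node(v)
                return H, k - 1, v
        return G, k, None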

While these rules are only a small step for the field of fixed-parameter tractability, they offer a proof-of-concept for the augmented intuition approach for algorithm design in this context. What is particularly interesting is to consider the more advanced version whereby we employ computational approaches to not only hint at common substructure, but to propose reduction rules directly, and perhaps even prove them. While there are reduction rules that would evade this approach (certainly there are reduction rules which require at least first order logic to be expressed), many rules can be expressed in logics that are amenable to computer aided proof techniques.

7 Discussion and limitations of the study

The first limitation we would like to discuss is an interesting one, since it has been self-imposed. We gained an intuition of what may be needed to design a heuristic that reduces to optimality every single one of The 10-R-non-reducible graphs of sizes 8 and 9. This means that we could have empowered our three heuristics in the following way: whenever the iterative application of the heuristic disconnects the graph by creating a subgraph of size 7, we could have resorted to The 10-R set to obtain the exact solution for this component (since we know that any graph of size 7 is reducible by these rules). In terms of our Berra “Lemma”, we could have redesigned the three heuristics so that they try to find a connected component of size 7 and safely reduce it via The 10-R. The results of such a combined approach could therefore have been better than those presented, but we abstained from including this mechanism because we wanted to evaluate the power of the new heuristics without resorting to the existing previous practice (The 10-R set). Isolating our heuristics gave us some confidence in their merits by experimentally evaluating them in the absence of the more sophisticated methods involved in The 10-R.

Another reason for presenting the results of the new heuristics by running them iteratively, without the wide set of reduction rules, is the possible gains that can be obtained by an adequate permutation of the order in which the rules are applied. This situation is well illustrated in the study of Asgeirsson and Stein (2005), which is also based on a triangle approach (although a highly different one from ours). Their claim is that: “If vertices v, u and w form a triangle, then we can include all three vertices in a vertex cover for a 3/2-approximation”. One entire section of their paper is dedicated to discussing the relationship between this procedure to eliminate triangles and the order in which it is executed, interspersed with other reductions. They say: “After much experimentation, we settled on using the following order of reductions to automate the approximation process.” (Sec. 3 of Asgeirsson and Stein (2005), p 552), thus recognizing that their results may have been the product of manual tuning with either a smaller set of instances (or with the same ones they report results on). Without knowing the details of this experimentation, and because the use of the other reduction rules may be a confounding factor, we opted not to follow that approach. In addition, for some instances, and for some of the reduction rules they used, the choice of reductions may lead to a situation in which the algorithm gets “stuck” and cannot finish (see Table 2 of Asgeirsson and Stein 2005). We decided to avoid this experimental confounder and simply present the results of our heuristics in isolation from safe reduction rules. It remains, however, an open area of research how to best intersperse the heuristics presented in this paper with other reduction rules existing in the literature [such as those in Akiba and Iwata (2016), Asgeirsson and Stein (2005), Stege (2000) and/or the extended network flow approach of Akiba and Iwata (2016), Iwata et al. (2014)].

We also note that our approach does not undermine any of the current practices, but instead calls for a confluence of techniques towards a common purpose. Benchmarking instances, for example, have been able to reveal the practical limitations of The 10-R set (Table 1) in spite of its theoretical interest as a worst-case analysis. However, the relatively large size of these instances does not allow us to benefit from them. We argue that perhaps evolutionary algorithms can be used to find small yet hard-to-solve instances for particular algorithms, as in Cotta and Moscato (2003), Ahammed and Moscato (2011), or by construction using specific graph grammars (as done for the TSP in Mariano et al. 1995; Moscato and Norman 1998). Automation may also play a role here by generating instances via folding and mirroring, an approach that goes all the way back to pioneering work by David Hilbert (Norman and Moscato 1995). The automated co-evolution of worst-case small instances for the existing safe reduction rules (and approximation algorithms) seems to warrant future research due to its core role in augmenting our intuition. Again, we envision a man-machine approach, as “Thinking really hard” will always have a role in generating difficult instances for benchmarking (Hougardy and Zhong 2020).

8 Conclusions

It is obvious that the quest of presenting a compelling case for an augmented-intuition, computationally-based approach to algorithm and heuristic design is far from complete. However, we have some interesting conclusions and also some lessons from this study that are worth sharing. We summarize here some of the most relevant ones at this stage of our research endeavour.

First, as shown in Sect. 3.2, we tested the reduction rules from one of the best fixed-parameter algorithms for k -Vertex Cover as of 2006 [Abu-Khzam et al. (2004) and Chen et al. (2010)], with some of the tightest worst-case analyses. We thus scientifically tested, in the spirit of Hooker (1995), both our implementations and the theory behind what we called ‘The 10-R’, the ten safe reduction rules in Abu-Khzam et al. (2004) and Chen et al. (2010) (we refer to Sect. 3.1). Through a comprehensive approach that tested all non-isomorphic graphs of size 8, we created an intuition of what may be the core obstacle to developing further fixed-parameter algorithms based on kernelization. We observed that only five graphs of medium density cannot be reduced by The 10-R. Without automation, we conjecture that the task of identifying such a small subset of non-reducible instances among the large number of non-isomorphic graphs of size 8 (a ratio of \(5/12346 \approx 0.000405\)) could not have happened by human intuition alone or, alternatively, it would possibly have taken several decades until all five were finally identified. Computer-based scientific enquiry clearly plays a big role here. In addition, “looking ahead”, also supported by automation (Sect. 3.3), shows that the ratio of graphs non-reducible by The 10-R set is just slightly higher (\(118/274668 \approx 0.00043\)). Here, then, is our first claim for an augmented intelligence approach to algorithm design.

Second, the identification of new properties of graphs may require, in turn, the identification of common structures in existing graphs conjectured to be of the same type. Again, this seems to be another call for innovative automation if we wish to augment the mathematical intuition of the human designer. In that sense, our analysis in Sect. 3.4 is based on the evidence obtained by the inductive step of Sect. 3.3 (increasing the size of the graphs by just one node). The use of subgraph isomorphism and network alignment algorithms seems to be required; we were well positioned to do this by having a highly effective memetic algorithm for this NP-complete and W[1]-complete problem (Mathieson et al. 2019). On the positive side for theory formation, only one of the five graphs of size eight (G8-01) is a subgraph of most of The 10-R-non-reducible graphs of size 9 (114 out of 118), so there are only four types of new “structures” of interest among The 10-R-non-reducible graphs of size 9. This may indicate that only one or two new reduction rules may be required to reduce all these 114 graphs.
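The containment claim above can be verified mechanically; a minimal sketch using the VF2 matcher shipped with NetworkX follows (our actual analysis relied on the memetic network alignment algorithm of Mathieson et al. 2019, and whether induced or non-induced containment is required should match the notion used in Sect. 3.4; the graphs in the usage line are toy placeholders).

```python
import networkx as nx
from networkx.algorithms import isomorphism

def contains_as_subgraph(host, pattern):
    """True iff `pattern` occurs as a node-induced subgraph of `host`
    (VF2 algorithm as implemented in NetworkX)."""
    return isomorphism.GraphMatcher(host, pattern).subgraph_is_isomorphic()

# Toy usage with placeholder graphs (K9 for a size-9 host, C3 for a pattern):
print(contains_as_subgraph(nx.complete_graph(9), nx.cycle_graph(3)))  # True
```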

Third, we obviously are well aware that the approach of systematically generating non-isomorphic graphs in the quest of obtaining non-reducible graphs can continue “ad infinitum” (or, we should say, up to an arbitrarily large order). It also poses a known consideration for kernelization algorithms; a similar situation will likely occur in all FPT-algorithm design tasks for problems in class FPT. This is hardly ever discussed, but “hitting the wall” at some instance size is likely to be the most common issue with the kernelization paradigm. We propose that an alternative scientific approach needs to be created. On the positive side, we showed that our study was feasible with our available university computing systems, and we have obtained some new leads for further investigation. Notably, it was not only our own work that took us over the finish line; we also leveraged the expertise and work of others, for instance Brendan McKay from ANU and his collection of non-isomorphic graphs available online. Furthermore, our collection of non-reducible graphs gave us the intuition that a truss decomposition could guide new types of heuristics, and we presented our tests in this manuscript. Following Hooker’s guidelines (Hooker 1995), we designed three heuristics that solved to optimality all The 10-R-non-reducible graphs of size 8 and 9; to give context to these heuristics we presented the results of tests on challenging instances used in the literature. Once again, the process we proposed has been backed by the use of automation to gain new intuition, and the results look very promising.
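To illustrate how a truss decomposition can guide a vertex cover heuristic, here is a minimal greedy sketch in that spirit; it is not one of the three heuristics evaluated in this paper, only an indication of how the densest truss of the current graph can bias the choice of the next vertex.

```python
import networkx as nx

def truss_guided_cover(G):
    """Greedy vertex cover heuristic guided by a truss decomposition:
    while edges remain, prefer a maximum-degree vertex inside the
    densest (largest-k) non-empty k-truss of the current graph."""
    G = G.copy()
    cover = set()
    while G.number_of_edges() > 0:
        densest = G  # fall-back when no 3-truss (i.e. no triangles) exists
        k = 2
        while True:
            candidate = nx.k_truss(G, k + 1)
            if candidate.number_of_edges() == 0:
                break
            densest, k = candidate, k + 1
        v = max(densest.nodes(), key=G.degree)  # degree taken in the current graph
        cover.add(v)
        G.remove_node(v)
    return cover

# Example: the heuristic returns a valid (not necessarily minimum) cover
print(truss_guided_cover(nx.petersen_graph()))
```

The returned set is always a valid vertex cover, since edges only disappear when one of their endpoints has been placed in the cover.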

There is another intuitive insight we have obtained, which we would like to share and which we leave for further exploration. With the exception of graph G8-01, the other four The 10-R-non-reducible graphs of size 8 have a connected vertex cover (Escoffier et al. 2010), so the “price of connectivity” (Camby et al. 2014) is 1 for these graphs. This leads us to conjecture that some kind of extremal graph theory argument can be applied to graphs having some property that these graphs share (aside from being The 10-R-non-reducible), perhaps leading to a new heuristic via the computation of a spanning tree that maximizes the number of leaves while minimizing the maximum degree that a leaf has in the original graph [e.g. a parameterized generalization of the problem studied in Binkele-Raible and Fernau (2014)]. If this new problem is also in class FPT, perhaps it can provide interesting new reduction rules applicable to k-Vertex Cover (if we can characterize those graphs for which the price of connectivity is equal to 1). Both from the exact and the heuristic perspective, this is an interesting area that warrants further research, and we have obtained a new intuition thanks to the proposed method. We thus claim that the approach could also trigger new theory formation.
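For graphs of the orders considered here, the “price of connectivity equal to 1” observation can be verified by brute force; the sketch below is exponential in the number of vertices and assumes a connected input graph with at least one edge, so it is only meant as a check for small instances.

```python
from itertools import combinations
import networkx as nx

def min_cover_size(G, connected=False):
    """Smallest vertex cover of G by exhaustive search; if `connected`
    is set, the cover must also induce a connected subgraph of G.
    Assumes G is connected and has at least one edge."""
    nodes = list(G.nodes())
    for size in range(1, len(nodes) + 1):
        for S in combinations(nodes, size):
            S = set(S)
            if all(u in S or v in S for u, v in G.edges()):
                if not connected or nx.is_connected(G.subgraph(S)):
                    return size
    return len(nodes)

def price_of_connectivity(G):
    return min_cover_size(G, connected=True) / min_cover_size(G)

# Example: for the 4-cycle the price of connectivity is 3/2, not 1
print(price_of_connectivity(nx.cycle_graph(4)))  # 1.5
```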

Before we conclude our observations, we may also say that the use of graph layout algorithms (Sect. 4.1) provides another type of insight. It may be the case that one of the results of applying The 10-R set of reduction rules is the creation of graphs which can be arranged in “layers” of nodes such that each layer contains an independent set, and nodes in one layer are connected only to nodes “a few layers apart” (perhaps an emergent new parameter for parameterized complexity analysis). This seems to be an indirect result of the relatively large number of triangles present in these graphs, but also of their global structure, which in turn motivated the introduction of our heuristics. The outputs of the yEd software shown in Figs. 5 and 6 display this characteristic. In addition, from Fig. 1 the reader can observe that, for all these graphs, the vertices in the minimum vertex cover can be located in a single layer that is connected to another layer composed entirely of vertices of an independent set. From this figure, as well as from Fig. 4, we got the intuition that perhaps another reduction rule could be devised, similar in spirit to the crown reduction (see Appendix A), which is part of The 10-R set but which may not be “covered” by the network flow-based procedure of Iwata et al. (2014). This is another area that would benefit from future theoretical research, particularly given the recent results on “layered graphs” (Chitturi 2017; Chitturi et al. 2018). If this layered-structure precondition can be tested in polynomial time (or if an FPT-algorithm can be developed to test it), perhaps a dynamic programming approach, such as the one proposed in Chitturi et al. (2018), can be used in tandem, thus leading to a combined set of reduction rules of great power. A tight complexity result would then be required to prove that every graph that is non-reducible by some set of reduction rules (such as The 10-R) is guaranteed to satisfy the required precondition. Once again, this builds a bridge to theory formation by collecting evidence of required structure in small, yet hard to identify, instances of the problem.
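The “layers of independent sets” observation can at least be checked automatically for a given layering; the sketch below uses BFS distances from an arbitrary root as the layer assignment, which is only one of many possible layerings, so a negative answer does not rule the structure out (and it does not check the “a few layers apart” condition).

```python
import networkx as nx

def bfs_layers_are_independent(G, root):
    """Group the vertices of G by BFS distance from `root` and report
    whether every such layer is an independent set of G."""
    dist = nx.single_source_shortest_path_length(G, root)
    layers = {}
    for v, d in dist.items():
        layers.setdefault(d, set()).add(v)
    return all(G.subgraph(layer).number_of_edges() == 0
               for layer in layers.values())

# Example: the BFS layers of an even cycle are independent sets
print(bfs_layers_are_independent(nx.cycle_graph(6), 0))  # True
```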

Finally, to develop a research enterprise such as that requested almost a quarter of a century ago by Hooker (1995), and in “Needed: An Empirical Science of Algorithms” (Hooker 1994), we are convinced that extramural and international cooperation is going to be necessary. While we think there is still a role for benchmarking, particularly as a component of the cooperative design of challenges to the field, this “empirical science” would need to collect evidence by generating large amounts of well-annotated datasets. It is “as if” algorithm design may soon become a field like high-energy particle physics, with international collaboration in generating data, interpretations and new theory. We base our belief on the relatively complex set of tools we have used and developed for this research project, which involved using integer programming solvers, developing state-of-the-art network alignment algorithms, using libraries of annotated non-isomorphic graphs, employing visualization and layout algorithms, and coding efficient implementations of kernelization algorithms, among other tasks. An international collaborative effort in developing the set of tools needed for automating parts of algorithm design would be essential for such a global virtual laboratory. We thus hope that our work will stir the interest of both theoreticians and practitioners, and perhaps be catalytic for the confluence of a new international research alliance and a novel scientific paradigm for algorithm and heuristic design based on augmented intuition and international collaboration.