
1 Introduction

In optimization, we set objectives when looking for the best possible configuration of a set of variables. These problems can be handled by discretizing the variables or by working with their real values. Thus, each optimization problem is specified by defining the set of possible solutions (or states) and a general objective. The classical optimization paradigm relies on identifying the solution by enumeration or differential calculus, on the existence of an assumed (unique) solution, or on the convergence of classical optimization methods applied to the corresponding first-order conditions.

Probabilistic sampling methods such as Montecarlo offer flexible forms of approximation, with some advantages regarding computational cost. In this sense, metaheuristics are high-level algorithms capable of finding a sufficiently satisfactory (near-optimal) solution to an approximate optimization problem, while other approaches rely on similarity.

We propose a heuristic algorithm that works in conjunction with clustering techniques, based on [1]. The Montecarlo K-Means (MCKM) method was improved and evaluated on the optimization benchmark functions of [4], which are high-dimensional problems in a large search space. In all the benchmark cases analyzed, the improved heuristic, Montecarlo Density K-Means (MCDKM), obtained promising results comparable with those in the literature.

In this paper, our proposal goes a step further by addressing different types of combinatorial optimization problems that involve search processes. We have selected the Knapsack Problem (KP), which seeks to maximize the profit provided by a selection of elements placed into a knapsack; the objective function is defined as the total profit of the selected objects. KP is a classic combinatorial optimization problem with applications in different industries, such as resource allocation, task and resource management/scheduling, and energy allocation management. KP is also part of the historical list of NP-Complete problems compiled by Richard Karp [5].

Martello [8], surveying the background of this classic NP-Hard problem, offers an extensive review of the many effective heuristics for solving the KP, for example, a population-based approach such as the incremental learning algorithm based on a greedy strategy in [7]. In addition, other heuristic techniques, such as [3, 6, 9], have been proposed to provide acceptable solutions to this problem.

Regarding the contribution of our heuristic method, MCDKM, we have validated that it can find feasible solutions. The methodology section shows how partitioning the search space into n dimensions may reduce the sorting times of the axes. The results section compares the quality of the solutions found when varying the dimensional arrays that organize the data, ranging from a bi-dimensional arrangement to one with more than 50 dimensions.

The paper is organized as follows. Section 2 describes the MCDKM heuristic method. Section 3 brings the Knapsack Problem into the MCDKM method, Sect. 4 details the implementation and the results, and finally, Sect. 5 presents the conclusions and future work.

2 Description of the MCDKM Heuristic Method

The MCKM heuristic method [1] was initially developed to solve a particular combinatorial optimization problem. Our proposal goes a step further by adding a preliminary step: we prepare the problem by parameterizing it and organizing the search in an n-dimensional space. Furthermore, we also perform a density analysis of the generated map. Bringing the problem into our parameters contributes to the efficiency of the solution since, first, the search space can be significantly reduced and, second, the sorting times of the axes are improved, even when sorting them in a larger number of dimensions. All of this translates into a general improvement that increases the quality of the solution and yields a good ratio between the size of the problem and the analyzed sample.

The procedure is shown in detail in Fig. 1. First, with Montecarlo, we generate a uniform sample that forms a map distributed over the whole search space. Then, the sample is reduced through a density analysis (for which we use the DBSCAN [2] algorithm), sweeping the data to discard the samples that are not worth analyzing while keeping the remaining areas for another clustering process. That clustering is performed with k-means, which is now able to correctly partition the sample and group the feasible solutions into clusters for selection.
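As a rough illustration of this pipeline, the sketch below applies the same three stages (uniform sampling, density-based reduction, k-means grouping) to a toy one-dimensional objective. All function names are illustrative, the density step is a simplified stand-in for DBSCAN, and none of this is the paper's actual code.

```python
import random

def f(x):            # toy objective to maximize (peak at x = 3)
    return -(x - 3.0) ** 2

def sample_map(n, lo, hi, seed=0):
    # stage 1: uniform Montecarlo sample over the search space
    rng = random.Random(seed)
    return [rng.uniform(lo, hi) for _ in range(n)]

def density_filter(points, keep_frac=0.2):
    # stage 2 (stand-in for the DBSCAN sweep): keep only the
    # best-scoring fraction of the map, discarding the rest
    pts = sorted(points, key=f, reverse=True)
    return pts[: max(1, int(len(pts) * keep_frac))]

def kmeans_1d(points, k=2, iters=20):
    # stage 3: partition the remaining sample into k clusters
    centers = points[:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: abs(p - centers[j]))].append(p)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

sample = sample_map(500, -10.0, 10.0)
kept = density_filter(sample)          # reduced search space
centers = kmeans_1d(kept)              # candidate regions
best = max(centers, key=f)             # pick the most promising cluster
```

The cluster center closest to the optimum is taken as the candidate solution; on this toy objective it lands near x = 3.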

Fig. 1.

Flowchart of the redesigned MCKM heuristic algorithm, highlighting its stages and using the benchmark functions as a solution example.

The now more robust MCDKM heuristic method was able to find best-solution values very similar and comparable to the optimum of the benchmark functions, with a success rate between 97% and 99%, as seen in the results of [4].

3 The Knapsack Problem into the MCDKM Method

We bring the 0/1 Knapsack Problem (KP01) into our heuristic method to find a comparably efficient solution. This combinatorial optimization problem is sometimes exemplified as follows: a person must choose the elements that maximize the benefit without exceeding the allowed capacity of a knapsack. The problem can be stated in terms of vectors \(X = (x_{1}, x_{2}, x_{3}, \ldots , x_{n})\) whose components are zeros and ones, subject to the capacity restriction in Eq. 1, with the objective function Z(x) given in Eq. 2. Mathematically, we have the following:

$$\begin{aligned} W\left( x\right) =\ \sum _{i=1}^{n}x_{i}w_{i}\le \ W \end{aligned}$$
(1)
$$\begin{aligned} Z\ \left( x\right) =\ \sum _{i=1}^{n}x_{i}p_{i} \end{aligned}$$
(2)

where W denotes the maximum capacity of the knapsack and the elements x are indexed from 1 to n. The terms \({w_{i}}\) and \({p_{i}}\) represent the weight and value of element i; the sum of the weights of the selected elements must not exceed the knapsack capacity W. Z(x) is the objective function (to maximize or minimize). A vector x is feasible if it satisfies the constraint in Eq. 1, and optimal if, in addition, it maximizes Z(x) in Eq. 2.
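Eqs. 1 and 2 translate directly into code. The sketch below evaluates the weight constraint (Eq. 1) and the profit objective (Eq. 2) for a 0/1 selection vector; the weights, profits, and capacity are illustrative, not the EX01 data.

```python
def total_weight(x, w):
    # Eq. 1 left-hand side: sum of x_i * w_i
    return sum(xi * wi for xi, wi in zip(x, w))

def profit(x, p):
    # Eq. 2: objective Z(x) = sum of x_i * p_i
    return sum(xi * pi for xi, pi in zip(x, p))

def is_feasible(x, w, W):
    # Eq. 1: the packed weight must not exceed the capacity W
    return total_weight(x, w) <= W

w = [12, 7, 11, 8, 9]     # weights w_i (illustrative)
p = [24, 13, 23, 15, 16]  # profits p_i (illustrative)
W = 26                    # capacity

x = [0, 1, 1, 1, 0]       # pack elements 2, 3 and 4
print(is_feasible(x, w, W), profit(x, p))  # True 51
```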

Table 1. Dataset EX01: knapsack capacity W = 6,404,180, optimal profit P = 13,549,094, number of elements n = 24.

Analyzing these statements is one of the essential steps of the methodology for adapting the parameters. First, we find the relationship between the variables in order to sort the data into a Montecarlo map.

Since this KP modality is a decision problem, our strategy treats the elements as belonging to different dimensions. That is, we divide the search space into multiple dimensions, distribute the elements among them, and subsequently order all the possible combinations within each dimension, identifying each combination with an ID.
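A minimal sketch of this decomposition, assuming an even split of items across dimensions (helper names are illustrative; the paper does not publish its code):

```python
from itertools import product

def split_items(n, d):
    # distribute n item indices evenly across d dimensions
    items = list(range(n))
    size = n // d
    return [items[i * size:(i + 1) * size] for i in range(d)]

def combinations_of(dim_items):
    # enumerate every subset of a dimension's items; the position
    # in this list serves as the combination's integer ID, and the
    # generating bits say which items are packed
    combos = []
    for bits in product([0, 1], repeat=len(dim_items)):
        combos.append([it for it, b in zip(dim_items, bits) if b])
    return combos

dims = split_items(8, 2)           # 8 items, 2 dimensions of 4 items each
combos = combinations_of(dims[0])  # 2^4 = 16 combinations in the 1st dimension
```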

We selected three open-source datasets: EX01, from Kreher and Stinson, which was checked with a branch and bound (B&B) algorithm; EX02, a dataset available in the OR-Library; and EX03, a dataset based on EX01.

Table 1 shows a dimensional array we designed for EX01. As seen, for a 2-D search space, 12 elements are placed in each dimension, and the ordering is made out of all the possible combinations of those 12 elements. In the case of a 24-D search space, each element belongs to its own dimension. Ordering the elements then becomes more efficient, which can be taken as an advantage in terms of resource use.

Table 2. Detail of the possible combinations (C) in the 1st dimension of an 8-dimensional array.
Fig. 2.

Establishing a Montecarlo index for each combination in every dimension; every sample fulfills the restriction in an 8-dimensional search space.

Table 3. Solution found by the heuristic method for an 8-dimensional search space.

The Montecarlo map is generated by crossing data between the combinations of the elements; any combination that exceeds the restriction (Eq. 1) is rejected. The uniform map is thus formed with the samples that meet the weight constraint. A binary code serves as the coordinate representation of each combination, since there are only two possible states: each digit position indicates whether an element is packed or not, so we know which elements lie within each coordinate. Afterward, each axis is sorted in ascending order of profit. Table 2 shows how the ordering of a dimension takes place.
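The ordering of a single axis can be sketched as follows: every combination is a binary code, overweight codes are rejected per Eq. 1, and the survivors are sorted by profit. The weights, profits, and capacity are illustrative, not the EX01 data.

```python
w = [5, 4, 6, 3]      # weights of the items assigned to this dimension
p = [10, 40, 30, 50]  # profits of the same items
W = 10                # capacity budget for the example

axis = []
for code in range(2 ** len(w)):       # the code's bits = packed items
    picked = [i for i in range(len(w)) if code >> i & 1]
    weight = sum(w[i] for i in picked)
    if weight <= W:                   # Eq. 1: reject overweight combinations
        axis.append((code, sum(p[i] for i in picked)))

axis.sort(key=lambda cp: cp[1])       # ascending order of profit, as in Table 2
```

The last entry of the sorted axis is the dimension's most profitable feasible combination; here, code 10 (binary 1010, items 1 and 3) with profit 90.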

In Fig. 2, we illustrate how we define the map by establishing a Montecarlo index that refers to the position of each combination of elements in each dimension, so that generating a random sample returns an integer.
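Under this indexing, a random sample is simply a tuple of integer positions, one per dimension, each pointing into that axis's ordered combination list. A minimal sketch (axis lengths and names are illustrative):

```python
import random

def sample_index(axis_lengths, rng):
    # one integer Montecarlo index per dimension, each within its axis
    return tuple(rng.randrange(n) for n in axis_lengths)

rng = random.Random(42)
axis_lengths = [16, 16, 16]          # three dimensions of 16 combinations each
sample = sample_index(axis_lengths, rng)
```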

After creating the Montecarlo map, we seek to reduce the search space. The density clustering algorithm could be applied directly to this map, but it has to be adjusted to work efficiently.

We make this adjustment by cutting the map along the objective-function axis. Cutting the map improves the clustering and lets us avoid analyzing the complete map. The data are swept, and the stopping criterion is activated when more than one cluster is found. We aim for the largest number of clusters, since each cluster is interpreted as a feasible area. The rest of the sample can then be discarded, narrowing the search space.
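The cut-and-sweep loop can be sketched as follows: slide a cut down the objective axis, density-cluster the points above it (a minimal one-dimensional stand-in for DBSCAN), and stop once more than one cluster appears. The data, step size, and eps are illustrative.

```python
def clusters_1d(values, eps):
    # group sorted values whose consecutive gaps are <= eps
    # (a chain of density-reachable points, as in DBSCAN's core idea)
    vals = sorted(values)
    groups, current = [], [vals[0]]
    for v in vals[1:]:
        if v - current[-1] <= eps:
            current.append(v)
        else:
            groups.append(current)
            current = [v]
    groups.append(current)
    return groups

profits = [5, 6, 7, 30, 31, 60, 61, 62, 63]   # objective values of the map
cut = max(profits)
while True:
    kept = [v for v in profits if v >= cut]    # points above the cut
    if len(clusters_1d(kept, eps=3)) > 1:      # stopping criterion: >1 cluster
        break
    cut -= 5                                   # lower the cut, sweeping the data
```

Everything below the final cut is discarded; only the clustered feasible areas go on to the k-means stage.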

The results are detailed in Table 3, where the list of combinations found by the heuristic yields a maximum profit P = 13,405,311 while fulfilling the restriction, since the sum of the weights is W = 6,383,556.

Table 4. Results for EX01, EX02 and EX03.

4 Implementation and Empirical Results

The headers of Table 4 detail, for each dataset of the Knapsack Problem, the capacity (W), the number of elements (n), the fitness used, and the optimal profit (P) obtained with a branch and bound (B&B) algorithm. The rest of the table reports the best results obtained by the MCDKM heuristic out of 30 executions for each space partition.

As expected, as the number of dimensions increased, the axis sorting time decreased because fewer elements were required per axis, thus decreasing the total time needed to find a solution. This can be considered an advantage over other search algorithms capable of finding the optimum but entailing a high computational cost (like B&B). Also, as seen in Table 4, we emphasize the ratio between the number of samples and the problem size: the total number of samples needed to find an efficient solution was less than 1% of the search space in all cases.

Regarding EX03, we increased the complexity in order to compare our method with other algorithms, in this case, branch and bound. We verified that, as the complexity of the problem increases, the execution time may also increase, influencing the prediction quality. Nevertheless, the solution maintained a prediction quality above 95% in less than 1 s (when partitioned into 80 dimensions). In contrast, the branch and bound algorithm obtained the optimal solution but with a considerably longer execution time.

Therefore, heuristic search methods, especially those that, like ours, perform stochastic optimization, provide a good way to solve large combinatorial problems and offer high-quality solutions.

5 Conclusions and Future Work

The Knapsack Problem was fully adapted and brought into our parameters. As a result, we found an efficient solution and achieved better execution times as we increased the number of partitions of the search space. Furthermore, in all cases we significantly reduced the number of samples needed to reach the best solutions compared to the full sample (the problem size). Ordering and sorting the space is an essential step in the methodology.

As we present in this paper, the MCDKM heuristic search algorithm has proved some of its advantages, such as its applicability to combinatorial optimization problems. The MCDKM method has the potential to solve any KP01 using an n-dimensional search-space approach, obtaining an efficient, high-quality solution with an average success rate above 90%. Solving the KP01 grants MCDKM the capability to deal with a range of other combinatorial optimization problems, such as resource allocation, production scheduling, and distribution processes.

Our heuristic model is constantly improving and considering more factors, for example, the relationship among multiple items. In addition, we expect our model to solve multi-objective and multi-dimensional optimization problems. We believe that parallelizing the method would provide fast convergence together with improved execution times.