Abstract
Algorithm selection is a prominent approach to improve a system’s performance by selecting a well-performing algorithm from a portfolio for an instance at hand. One extension of the traditional algorithm selection problem is to not only select one single algorithm but a schedule of algorithms to increase robustness. Some approaches exist for solving this problem of selecting schedules on a per-instance basis (e.g., the Sunny and 3S systems), but to date, a fair and thorough comparison of these is missing. In this work, we implement Sunny’s approach and dynamic schedules inspired by 3S in the flexible algorithm selection framework flexfolio to use the same code base for a fair comparison. Based on the algorithm selection library (ASlib), we perform the first thorough empirical study on the strengths and weaknesses of per-instance algorithm schedules. We observe that on some domains it is crucial to use a training phase to limit the maximal size of schedules and to select the optimal neighborhood size of k-nearest-neighbor. By modifying our implemented variants of the Sunny and 3S approaches in this way, we achieve strong performance on many ASlib benchmarks and establish new state-of-the-art performance on 3 scenarios.
1 Introduction
A common observation in many areas of AI (e.g., SAT or CSP solving) and machine learning is that no single algorithm dominates the performance of all others. To exploit this complementarity of algorithms, algorithm selection systems [6, 8, 11] are used to select a well-performing algorithm for a new given instance. Algorithm selectors, such as SATzilla [12] and 3S [7], demonstrated in several SAT competitions that they can outperform pure SAT solvers by a large margin (see, e.g., the results of the SAT Challenge 2012 (see Footnote 1)).
An open problem in algorithm selection is that the machine learning model sometimes fails to select a well-performing algorithm, e.g., because of uninformative instance features. An extension of algorithm selection that addresses this is to select a schedule of multiple algorithms, in the hope that at least one of them performs well.
To date, a fair comparison of such algorithm schedule selectors is missing, since every publication used a different benchmark set and some implementations (e.g., 3S) are not publicly available (for license reasons). To study the strengths and weaknesses of such schedulers in a fair manner, we implemented well-known algorithm scheduling approaches (i.e., Sunny [1] and dynamic schedules inspired by 3S [7]) in the flexible framework of flexfolio (the successor of claspfolio 2 [5]) and studied them on the algorithm selection library (ASlib [3]).
2 Per-instance Algorithm Scheduling
Similar to the per-instance algorithm selection problem [11], the per-instance algorithm scheduling problem is defined as follows:
Definition 1
(Per-instance Algorithm Scheduling Problem). Given a set of algorithms \(\mathcal {P}\), a set of instances \(\mathcal {I}\), a runtime cutoff \(\kappa \), and a performance metric \(m: \varSigma \times \mathcal {I}\rightarrow \mathbb {R}\), the per-instance algorithm scheduling problem is to find a mapping \(s: \mathcal {I}\rightarrow \varSigma \) from an instance \(\pi \in \mathcal {I}\) to a (potentially unordered) algorithm schedule \(\sigma _\pi \in \varSigma \), where each algorithm \(\mathcal {A}\in \mathcal {P}\) gets a runtime budget \(\sigma _\pi {}(\mathcal {A})\) between 0 and \(\kappa \), such that \(\sum _{\mathcal {A}\in \mathcal {P}} \sigma _\pi {}(\mathcal {A}) \le \kappa \) and \(\sum _{ \pi \in \mathcal {I}} m(s(\pi ),\pi )\) is minimized.
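To make the objective concrete, the following is a minimal sketch (all algorithm names, budgets, and runtimes are illustrative) of how the cost of one ordered schedule on one instance could be evaluated under the PAR10 metric used later in the paper:

```python
KAPPA = 100.0  # runtime cutoff (illustrative value)

def schedule_cost(schedule, true_runtimes, kappa=KAPPA):
    """PAR10 cost of one ordered schedule (dict: algorithm -> budget,
    iterated in insertion order) on one instance with known runtimes."""
    assert sum(schedule.values()) <= kappa + 1e-9  # schedule must fit into kappa
    elapsed = 0.0
    for algo, budget in schedule.items():
        runtime = true_runtimes.get(algo, float("inf"))
        if runtime <= budget:        # solved within this slot
            return elapsed + runtime
        elapsed += budget            # slot exhausted, try the next algorithm
    return 10.0 * kappa              # nothing solved the instance: PAR10 penalty

# "B" solves the instance in 20s within its 30s slot, after 10s wasted on "A".
cost = schedule_cost({"A": 10.0, "B": 30.0}, {"A": 50.0, "B": 20.0})  # -> 30.0
```

Note how the order of the slots matters for the cost even though the set of solved instances is order-independent, which is why the systems below align their schedules after optimizing them.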
The algorithm scheduler aspeed [4] addresses this problem by using a static algorithm schedule; i.e., aspeed applies the same schedule to all instances. The schedule is optimized with an answer set programming [2] solver to obtain a timeout-minimal schedule on the training instances. The scheduler aspeed either uses a second optimization step to determine a well-performing ordering of the algorithms or sorts the algorithms by their assigned times, in ascending order (such that a wrongly selected solver does not waste too much time).
Systems such as 3S [7], SATzilla [12] and claspfolio 2 [5] combine static algorithm schedules (also called pre-solving schedules) and classical algorithm selection. All these systems run the schedule for a small fraction of the runtime budget \(\kappa \) (e.g., 3S uses \(10\%\) of \(\kappa \)), and if this pre-solving schedule fails to solve the given instance, they apply per-instance algorithm selection to run an algorithm predicted to perform well. 3S and claspfolio 2 use mixed integer programming and answer set programming solvers, respectively, to obtain a timeout-minimal pre-solving schedule. SATzilla uses a grid search to obtain a pre-solving schedule that optimizes the performance of the entire system.
The algorithm scheduler Sunny [1] determines the schedule for a new instance \(\pi \) by first determining the set of k training instances \(\mathcal {I}_{k}\) closest to \(\pi \) in instance feature space, and then assigning each algorithm a runtime proportional to the number of instances in \(\mathcal {I}_{k}\) it solved. The algorithms are sorted by their average PAR10 scores on \(\mathcal {I}_{k}\), in ascending order (which corresponds to running the algorithm with the best expected performance first).
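The proportional allocation and PAR10 ordering described above can be sketched as follows. This is a simplified illustration of the idea, not the original SUNNY implementation; in particular, it omits SUNNY's backup-solver slots and assumes at least one neighbor instance is solved:

```python
def sunny_schedule(runtimes, kappa):
    """runtimes: dict algorithm -> list of runtimes on the k neighbor
    instances (float('inf') for unsolved). Returns ordered (algo, budget)
    pairs; a hedged sketch of SUNNY-style proportional scheduling."""
    # Count, per algorithm, how many neighbor instances it solves within kappa.
    solved = {a: sum(r <= kappa for r in rs) for a, rs in runtimes.items()}
    total = sum(solved.values())  # assumes total > 0
    # Budget proportional to the number of solved neighbors; algorithms
    # that solve nothing get no slot at all.
    budgets = {a: kappa * n / total for a, n in solved.items() if n > 0}
    # Order by average PAR10 on the neighborhood, most promising first.
    def par10(rs):
        return sum(r if r <= kappa else 10 * kappa for r in rs) / len(rs)
    return sorted(budgets.items(), key=lambda item: par10(runtimes[item[0]]))
```

For example, with two algorithms where "B" solves two of three neighbors and "A" only one, "B" receives twice the budget and runs first because of its lower average PAR10.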
3 Instance-Specific Aspeed (ISA)
Kadioglu et al. [7] proposed a variant of 3S that uses per-instance algorithm schedules instead of a fixed split between static pre-solving schedule and algorithm selection. In order to evaluate the potential of per-instance timeout-optimized scheduling, we developed the scheduler ISA, short for instance-specific aspeed. Inspired by Kadioglu et al. [7], our implementation uses k-nearest neighbor (k-NN) to identify the set \(\mathcal {I}_{k}\) of training instances closest to a given instance \(\pi \) and then applies aspeed to obtain a timeout-minimal schedule for them.
During offline training, we have to determine a promising value for the neighborhood size k. In our experiments, we evaluated different k values between 1 and 40 by running cross-validation on the training data and stored the best-performing value to use online. We chose this small upper bound for k to ensure a feasible runtime of the scheduler (see Footnote 2); in our experiments, this was less than one second. Furthermore, to optimize the runtime of the scheduler, we reduced the set of training instances, omitting all instances that were either solved by every algorithm or solved by none within the cutoff time.
For each new instance, ISA first computes the k nearest neighbor instances from the reduced training set. This instance set is passed to aspeed [4], which returns a timeout-minimal unordered schedule for the neighbor set. The schedule is finally aligned by sorting the time slots in ascending order.
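ISA's first online step, retrieving the nearest training instances, amounts to a plain k-NN lookup in feature space. A sketch using Euclidean distance (instance ids and feature vectors are illustrative; the actual distance metric and feature preprocessing are configurable in flexfolio):

```python
def nearest_neighbors(features, train_features, k):
    """Return the ids of the k training instances closest to `features`
    (Euclidean distance); train_features maps instance id -> feature vector."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    ranked = sorted(train_features, key=lambda pid: dist(features, train_features[pid]))
    return ranked[:k]
```

The resulting neighbor set is what aspeed receives as input when computing the timeout-minimal schedule.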
4 Trained Sunny (TSunny)
To offer a form of scheduling with less overhead in the online stage than ISA, we implemented a modified version of Sunny [1] by adding a training phase. For a new problem instance, Sunny first selects a subset of k training instances \(\mathcal {I}_{k}\) using k-NN. Then time slots are assigned to each candidate algorithm: each solver gets one slot for each instance of \(\mathcal {I}_{k}\) it can solve within the given time. Additionally, a designated backup solver gets one slot for each instance of \(\mathcal {I}_{k}\) that cannot be solved by any of the algorithms. Given this slot assignment, the size of a single time slot is computed by dividing the available time by the total number of slots. Finally, the schedule is aligned by sorting the algorithms by their average PAR10 score on \(\mathcal {I}_{k}\), thereby running the most promising solver first.
Preliminary experiments for our implementation of this algorithm produced relatively poor results. Examining the schedules, we found that Sunny tends to employ many algorithms per schedule, which we suspected to be a weakness. Thus, we enhanced the algorithm by limiting the number of algorithms used in a single schedule to a specified number \(\lambda \).
Originally, Sunny is defined as lazy, i.e., it does not apply any training procedure after the benchmark data has been gathered. However, to obtain good values for our new parameter \(\lambda \), and also to improve the choice of the neighborhood size k, we implemented a training process for Sunny. Similar to ISA, different configurations of \(\lambda \) (ranging from 1 to the total number of solvers) and k (ranging from 1 to 100) are evaluated by cross-validation on the training data. To distinguish this enhanced algorithm from the original Sunny, we dub this trained version TSunny.
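The training process described above amounts to an exhaustive grid search over k and \(\lambda \), scored by cross-validation. A hedged sketch, where `evaluate` is a hypothetical callback that runs the cross-validation for one configuration and returns its mean PAR10 score (lower is better):

```python
def tune(evaluate, k_range, lambda_range):
    """Grid search for the (k, lambda) pair with the lowest
    cross-validated score; a sketch, not flexfolio's actual tuner."""
    best, best_score = None, float("inf")
    for k in k_range:
        for lam in lambda_range:
            score = evaluate(k, lam)  # e.g., mean PAR10 over CV folds
            if score < best_score:
                best, best_score = (k, lam), score
    return best
```

Since each evaluation is a full cross-validation, the cost of this search is paid entirely offline; the online stage only uses the stored best pair.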
5 Empirical Study
To compare the different algorithm scheduling approaches of ISA and Sunny, we implemented them in the flexible algorithm selection framework flexfolio (see Footnote 3) and compared them to various other systems: the static algorithm scheduling system aspeed [4], the default configuration of flexfolio (which is similar to SATzilla [12] and claspfolio 2 [5] and includes a static pre-solving schedule), as well as the per-instance algorithm selector AutoFolio [9] (an automatically-configured version of flexfolio without consideration of per-instance algorithm schedules). If not mentioned otherwise, we used the default parameter values of flexfolio. The comparison is based on the algorithm selection library (ASlib [3]), which is specifically designed to fairly measure the performance of algorithm selection systems. Version 1.0 of ASlib consists of 13 scenarios from a wide range of different domains (SAT, MAXSAT, CSP, QBF, ASP and operations research).
Table 1 shows the performance of the systems as the fraction of the gap closed between the static single best algorithm and the oracle (i.e., the performance of an optimal algorithm selector), using the performance metric PAR10 (see Footnote 4). As expected, the per-instance schedules (i.e., Sunny and ISA) performed better on average than aspeed’s static schedules. However, aspeed still establishes the best performance on SAT11-HAND. By comparing Sunny and TSunny, we see that parameter tuning substantially improved performance. Comparing TSunny and ISA, we note that their overall performance is similar but that either has advantages on different scenarios; thus, there is still room for improvement by selecting the better of the two on a per-scenario basis. Surprisingly, the per-instance schedules had a similar performance (ISA with 0.71) to the state-of-the-art procedure AutoFolio (0.70); however, AutoFolio performed slightly more robustly, being amongst the best systems on 10/13 scenarios. Nevertheless, ISA establishes new state-of-the-art performance on PREMAR-2013 (short for PREMARSHALLING-ASTAR-2013) and TSunny on PROTEUS-2014 and QBF-2011, according to the on-going evaluation on ASlib (see Footnote 5).
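The gap-closed values in Table 1 follow the standard normalization of raw PAR10 scores, where 0 corresponds to the single best algorithm and 1 to the oracle. A sketch of this computation (the example scores are illustrative, not taken from the paper):

```python
def gap_closed(system, single_best, oracle):
    """Fraction of the PAR10 gap between the single best algorithm and
    the oracle that a system closes (lower PAR10 is better)."""
    return (single_best - system) / (single_best - oracle)

# A system scoring PAR10 = 40 closes 75% of the gap between a single
# best algorithm at 100 and an oracle at 20.
gap_closed(40.0, 100.0, 20.0)  # -> 0.75
```

Values above 1 or below 0 are possible in principle but do not occur here, since no system beats the oracle or falls behind the single best on average.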
Table 2 gives more insights into our systems’ behavior. It also includes our implemented version of Sunny without training, dubbed Sunny’. Sunny (and also Sunny’) sets the neighborhood size k to the square root of the number of instances, whereas TSunny optimizes k on the training instances. The reason for TSunny’s better performance in comparison to Sunny is probably its much smaller values for k on all scenarios except SAT12-RAND. TSunny’s average schedule size was also smaller on nearly all scenarios (except CSP-2010).
Comparing the static aspeed and the instance-specific aspeed (ISA), the average schedule size of aspeed is rather large, since aspeed has to compute a single static schedule that is robust across all training instances and not only on a small subset. Surprisingly, the values of k for ISA and TSunny differ substantially, indicating that the best value of k depends on the scheduling strategy.
6 Conclusion and Discussion
We showed that per-instance algorithm scheduling systems can perform as well as algorithm selectors and even establish new state-of-the-art performance on 3 scenarios of the algorithm selection library [3]. Additionally, we found that the performance of the algorithm schedules strongly depends on the adjustment of their parameters for each scenario, namely the neighborhood size k of the k-nearest-neighbor approach and the maximal size of the schedules.
In our experiments we did not tune all possible parameters of Sunny and ISA in the flexible flexfolio framework; e.g., we fixed the pre-processing strategy of the instance features. Therefore, a future extension of this line of work would be to extend the search space of the automatically-configured algorithm selector AutoFolio [9] to also cover per-instance algorithm schedules. Another extension could be to allow communication between the algorithms in the schedule [10].
Notes
- 1.
- 2. Optimizing a schedule is NP-hard; thus, the size of the input set, defined by k, must be kept small to make the process applicable at runtime.
- 3. The source code and all benchmark data are available at http://www.ml4aad.org/algorithm-selection/flexfolio/.
- 4. PAR10 is the penalized average running time, where timeouts are counted as 10 times the running time cutoff.
- 5.
References
Amadini, R., Gabbrielli, M., Mauro, J.: SUNNY: a lazy portfolio approach for constraint solving. TPLP 14(4–5), 509–524 (2014)
Baral, C.: Knowledge Representation, Reasoning and Declarative Problem Solving. Cambridge University Press, Cambridge (2003)
Bischl, B., Kerschke, P., Kotthoff, L., Lindauer, M., Malitsky, Y., Fréchette, A., Hoos, H., Hutter, F., Leyton-Brown, K., Tierney, K., Vanschoren, J.: ASlib: a benchmark library for algorithm selection. AIJ 237, 41–58 (2016)
Hoos, H., Kaminski, R., Lindauer, M., Schaub, T.: aspeed: Solver scheduling via answer set programming. TPLP 15, 117–142 (2015)
Hoos, H., Lindauer, M., Schaub, T.: claspfolio 2: Advances in algorithm selection for answer set programming. TPLP 14, 569–585 (2014)
Huberman, B., Lukose, R., Hogg, T.: An economic approach to hard computational problems. Science 275, 51–54 (1997)
Kadioglu, S., Malitsky, Y., Sabharwal, A., Samulowitz, H., Sellmann, M.: Algorithm selection and scheduling. In: Lee, J. (ed.) CP 2011. LNCS, vol. 6876, pp. 454–469. Springer, Heidelberg (2011). doi:10.1007/978-3-642-23786-7_35
Kotthoff, L.: Algorithm selection for combinatorial search problems: a survey. AI Mag. 35, 48–60 (2014)
Lindauer, M., Hoos, H., Hutter, F., Schaub, T.: Autofolio: an automatically configured algorithm selector. JAIR 53, 745–778 (2015)
Malitsky, Y., Sabharwal, A., Samulowitz, H., Sellmann, M.: Boosting sequential solver portfolios: knowledge sharing and accuracy prediction. In: Nicosia, G., Pardalos, P. (eds.) LION 2013. LNCS, vol. 7997, pp. 153–167. Springer, Heidelberg (2013). doi:10.1007/978-3-642-44973-4_17
Rice, J.: The algorithm selection problem. Adv. Comput. 15, 65–118 (1976)
Xu, L., Hutter, F., Hoos, H., Leyton-Brown, K.: SATzilla: portfolio-based algorithm selection for SAT. JAIR 32, 565–606 (2008)
© 2016 Springer International Publishing AG
Lindauer, M., Bergdoll, R.-D., Hutter, F. (2016). An Empirical Study of Per-instance Algorithm Scheduling. In: Festa, P., Sellmann, M., Vanschoren, J. (eds.) Learning and Intelligent Optimization. LION 2016. Lecture Notes in Computer Science, vol. 10079. Springer, Cham. https://doi.org/10.1007/978-3-319-50349-3_20
Print ISBN: 978-3-319-50348-6
Online ISBN: 978-3-319-50349-3