Abstract
Search decisions are often made using heuristic methods because real-world applications can rarely be tackled without any heuristics. In many cases, multiple heuristics can potentially be chosen, and it is not clear a priori which would perform best. In this article, we propose a procedure that learns, during the search process, how to select promising heuristics. The learning is based on weight adaptation and can even switch between different heuristics during search. Different variants of the approach are evaluated within a constraint-programming environment.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
Bibliography
J. A. Boyan and A. W. Moore. Using prediction to improve combinatorial optimization search. In Proceedings of the Sixth International Workshop on Artificial Intelligence and Statistics (AISTATS-97), 1997.
P. Cowling, G. Kendall, and E. Soubeiga. A parameter-free hyperheuristic for scheduling a sales summit. In Proceedings of the Fourth Metaheuristics International Conference (MIC’2001), pages 127–131, 2001.
M. Dorigo, G. Di Caro, and L. M. Gambardella. Ant algorithms for discrete optimization. Artificial Life, 5 (3): 137–172, 1999.
J. Frank. Learning short-term weights for GSAT. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI-97) pages 384391,1997.
H. H. Hoos and T. Stützle. Evaluating las vegas algorithms — pitfalls and remedies. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence (UAI-98), pages 238–245, 1998.
D. E. Joslin and D. P. Clements. Squeaky wheel optimization. Journal of Artificial Intelligence Research, 10: 353–373, 1999.
S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220 (4598): 671–680, 1983.
M. L. Littman and D. H. Ackley. Adaptation in constant utility non-stationary environments. In Proceedings of the Fourth International Conference on Genetic Algorithms, pages 136–142, 1991.
F. Michaud and M. J. Matark. Learning from history for behavior-based mobile robots in non-stationary environments. Machine Learning (Joint Special Issue on Learning in Autonomous Robots), 31: 141–167, 1998.
A. Nareyek. Constraint-Based Agents–An Architecture for Constraint-Based Modeling and Local-Search-Based Reasoning for Planning and Scheduling in Open and Dynamic Worlds, volume 2062 of Lecture Notes in Artificial Intelligence. Springer, 2001a.
A. Nareyek. Using global constraints for local search. In E. C. Freuder and R. J. Wallace, editors, Constraint Programming and Large Scale Discrete Optimization, volume 57 of DIMACS Series on Discrete Mathematics and Theoretical Computer Science, pages 9–28. American Mathematical Society Publications, 2001b.
G. Rabideau, R. Knight, S. Chien, A. Fukunaga, and A. Govindjee. Iterative repair planning for spacecraft operations in the aspen system. In Proceedings of the International Symposium on Artificial Intelligence Robotics and Automation in Space (iSAIRAS 99 ), 1999.
W. Ruml. Incomplete tree search using adaptive probing. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), pages 235–241, 2001.
J. Schmidhuber. Making the world differentiable: On using self-supervised fully recurrent neural networks for dynamic reinforcement learning and planning in non-stationary environments. Technical Report TR FKI-12690, Department of Computer Science, Technical University of Munich, Germany, 1990.
D. Schuurmans and F. Southey. Local search characteristics of incomplete SAT procedures. In Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI-2000), pages 297–302, 2000.
S. F. Smith. OPIS: A methodology and architecture for reactive scheduling. In M. Zweben and M. S. Fox, editors, Intelligent Scheduling, pages 29–66. Morgan Kaufmann, 1994.
R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 1998.
C. Voudouris and E. Tsang. Guided local search. Technical Report CSM-247, University of Essex, Department of Computer Science, Colchester, United Kingdom, 1995.
M. Zweben, B. Daun, E. Davis, and M. Deale. Scheduling and rescheduling with iterative repair. In M. Zweben and M. S. Fox, editors, Intelligent Scheduling, pages 241–255. Morgan Kaufmann, 1994.
Author information
Authors and Affiliations
Rights and permissions
Copyright information
© 2003 Springer Science+Business Media New York
About this chapter
Cite this chapter
Nareyek, A. (2003). Choosing Search Heuristics by Non-Stationary Reinforcement Learning. In: Metaheuristics: Computer Decision-Making. Applied Optimization, vol 86. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-4137-7_25
Download citation
DOI: https://doi.org/10.1007/978-1-4757-4137-7_25
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5403-9
Online ISBN: 978-1-4757-4137-7
eBook Packages: Springer Book Archive