Abstract
The multiple traveling salesman problem (mTSP) is a well-known NP-hard problem with numerous real-world applications. In particular, this work addresses the MinMax mTSP, where the objective is to minimize the maximum tour length among all agents. Many robotic deployments require frequently recomputing potentially large mTSP instances, making the natural trade-off between computing time and solution quality of great importance. However, exact and heuristic algorithms become inefficient as the number of cities increases, due to their computational complexity. Encouraged by recent developments in deep reinforcement learning (dRL), this work approaches the mTSP as a cooperative task and introduces DAN, a decentralized attention-based neural method that tackles this key trade-off. In DAN, agents learn fully decentralized policies to collaboratively construct a tour by predicting each other's future decisions. Our model relies on attention mechanisms and is trained using multi-agent RL with parameter sharing, providing natural scalability in the number of agents and cities. Our experimental results on small- to large-scale mTSP instances (50 to 1000 cities and 5 to 20 agents) show that DAN is able to match or outperform state-of-the-art solvers while keeping planning times low. In particular, given the same computation time budget, DAN outperforms all conventional and dRL-based baselines on larger-scale instances (more than 100 cities, more than 5 agents), and exhibits enhanced agent collaboration. A video explaining our approach and presenting our results is available at https://youtu.be/xi3cLsDsLvs.
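To make the MinMax objective concrete, the following is a minimal illustrative sketch (not from the paper): given a shared depot, city coordinates, and one tour per agent, the cost to be minimized is the length of the longest single tour, rather than the sum of all tour lengths.

```python
import math

def tour_length(depot, cities, tour):
    """Length of one agent's closed tour: depot -> assigned cities in order -> depot."""
    points = [depot] + [cities[i] for i in tour] + [depot]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def minmax_cost(depot, cities, tours):
    """MinMax mTSP objective: the maximum tour length among all agents."""
    return max(tour_length(depot, cities, t) for t in tours)

# Toy instance: two agents, two cities at unit distance from the depot.
depot = (0.0, 0.0)
cities = [(1.0, 0.0), (0.0, 1.0)]
# Splitting the cities between agents gives a MinMax cost of 2.0 per tour,
# whereas one agent visiting both would incur a longer single tour.
print(minmax_cost(depot, cities, [[0], [1]]))   # -> 2.0
print(minmax_cost(depot, cities, [[0, 1], []])) # -> ~3.414
```

This toy example shows why MinMax encourages balanced workloads across agents: offloading a city to an idle agent strictly reduces the longest tour, which is the quantity a MinSum objective would be indifferent to.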
Acknowledgements
This work was supported by Temasek Laboratories (TL@NUS) under grants TL/SRP/20/03 and TL/SRP/21/19. We thank colleagues at TL@NUS and DSO for useful discussions, and Mehul Damani for his help with the initial manuscript. Detailed comments from anonymous referees contributed to the quality of this paper.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Cao, Y., Sun, Z., Sartoretti, G. (2024). DAN: Decentralized Attention-Based Neural Network for the MinMax Multiple Traveling Salesman Problem. In: Bourgeois, J., et al. Distributed Autonomous Robotic Systems. DARS 2022. Springer Proceedings in Advanced Robotics, vol 28. Springer, Cham. https://doi.org/10.1007/978-3-031-51497-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51496-8
Online ISBN: 978-3-031-51497-5
eBook Packages: Intelligent Technologies and Robotics (R0)