Abstract
In this paper, we present a simulation-based dynamic programming method that learns the ‘cost-to-go’ function iteratively. The method addresses two important drawbacks of the conventional model predictive control (MPC) formulation: the potentially exorbitant online computational requirement, and the inability to account for the future interplay between uncertainty and estimation in the optimal control calculation. We use a nonlinear Van de Vusse reactor to investigate the efficacy of the proposed approach and to identify further research issues.
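The idea the abstract describes, learning a cost-to-go function offline so that online control reduces to a cheap one-step minimization instead of a long-horizon MPC optimization, can be illustrated with a minimal sketch. The sketch below is not the paper's algorithm or reactor model: it uses a hypothetical one-dimensional nonlinear system as a stand-in for the Van de Vusse reactor, and plain value iteration on a state grid as a stand-in for the simulation-based learning scheme.

```python
import numpy as np

# Hypothetical 1-D nonlinear system standing in for the reactor dynamics
def step(x, u):
    return 0.9 * x - 0.1 * x**3 + u

# Quadratic stage cost (illustrative choice, not from the paper)
def stage_cost(x, u):
    return x**2 + 0.1 * u**2

xs = np.linspace(-2.0, 2.0, 81)   # state grid
us = np.linspace(-1.0, 1.0, 21)   # candidate inputs
J = np.zeros_like(xs)             # cost-to-go estimate, improved iteratively

# Offline: iterate one-step Bellman backups until the cost-to-go converges
for _ in range(200):
    J_new = np.empty_like(J)
    for i, x in enumerate(xs):
        q = [stage_cost(x, u) + np.interp(step(x, u), xs, J) for u in us]
        J_new[i] = min(q)
    if np.max(np.abs(J_new - J)) < 1e-6:
        J = J_new
        break
    J = J_new

# Online: control is a cheap one-step minimization over the learned cost-to-go,
# in place of solving a full receding-horizon optimization at every sample time
def policy(x):
    return min(us, key=lambda u: stage_cost(x, u) + np.interp(step(x, u), xs, J))
```

Once `J` has converged, the online workload per time step is a single sweep over the input candidates, which is the computational advantage over conventional MPC that the abstract points to.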
This paper is dedicated to Professor Hyun-Ku Rhee on the occasion of his retirement from Seoul National University.
Lee, J.M., Lee, J.H. Simulation-based learning of cost-to-go for control of nonlinear processes. Korean J. Chem. Eng. 21, 338–344 (2004). https://doi.org/10.1007/BF02705417