Combining Fuzzy Principles and Reinforcement Learning in the Control of Dynamic Systems

Article type: Research paper

Authors

Ferdowsi University of Mashhad

Abstract

Reinforcement learning is a method in which one or more agents learn to take an optimal action on the basis of positive or negative rewards. It is most effective when a model of the system is not readily available or is laborious to obtain, in which case it can be a suitable alternative to other control schemes. One of its main drawbacks is that it works with discrete actions, whereas many dynamic systems do not perform optimally under such an approach. To compensate for this shortcoming, various approaches have emerged, including value approximation. In this paper, fuzzy logic is used to make the optimal actions continuous. The reinforcement learning system tunes the rules of the fuzzy controller toward the best action and can thereby generate continuous actions. To this end, a model of an inverted pendulum built in SimMechanics is considered and controlled by the designed controller, and its motion is examined in two cases: control of the pendulum angle only, and full control of both the pendulum and the cart. The results show that the applied artificial intelligence, used in place of selecting from existing rules, can achieve higher performance in controlling dynamic systems.

Keywords


Article Title [English]

Combining the principles of fuzzy logic and reinforcement learning for control of dynamic systems

Authors [English]

  • Masoud Goharimanesh
  • Ali Akbar Akbari
  • Mohammad-Bagher Naghibi
Ferdowsi University of Mashhad
Abstract [English]

Reinforcement learning is a method in which one or more agents receive positive or negative rewards and use them to choose efficient actions. It is well suited to systems whose differential equations are inherently difficult to derive, and in such cases it can be a good alternative to other control approaches. One of its main disadvantages is that it works with discrete actions, and many dynamical systems cannot be controlled optimally that way. To remedy this deficiency, different approaches have emerged, including approximation methods. In this paper, fuzzy logic is used to make the optimal actions continuous. The reinforcement learning method tunes the rules of the fuzzy controller toward the optimal action. Two cases of the cart-pole problem are considered: stabilizing the pendulum alone, and stabilizing both the pendulum and the cart. The results show that the applied artificial intelligence is a suitable way to obtain the control policy.

Keywords [English]

  • Reinforcement learning
  • Fuzzy logic
  • Optimal control
  • Inverted pendulum
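
The abstract describes reinforcement learning tuning the rules of a fuzzy controller so that the output becomes a continuous action. The abstract does not give the algorithm itself, so the following is only a minimal sketch of one common scheme of this kind (often called fuzzy Q-learning), not the authors' implementation. The membership-function centers, candidate forces, learning rate, and discount factor are all illustrative assumptions, as are the function names.

```python
import random

# Hypothetical setup (not from the paper): three triangular fuzzy sets over
# the pendulum angle in radians, and three candidate forces in newtons that
# every rule may choose from.
CENTERS = [-0.2, 0.0, 0.2]
ACTIONS = [-10.0, 0.0, 10.0]


def firing_strengths(angle):
    """Normalized triangular memberships of `angle` in each fuzzy set."""
    width = CENTERS[1] - CENTERS[0]
    clipped = min(max(angle, CENTERS[0]), CENTERS[-1])  # saturate at the edges
    mu = [max(0.0, 1.0 - abs(clipped - c) / width) for c in CENTERS]
    total = sum(mu)
    return [m / total for m in mu]


def select_actions(q, epsilon, rng):
    """Pick one candidate-action index per rule, epsilon-greedy on q[rule]."""
    chosen = []
    for row in q:
        if rng.random() < epsilon:
            chosen.append(rng.randrange(len(row)))
        else:
            chosen.append(max(range(len(row)), key=row.__getitem__))
    return chosen


def continuous_action(phi, chosen):
    """Blend the per-rule discrete actions into one continuous force."""
    return sum(p * ACTIONS[a] for p, a in zip(phi, chosen))


def update(q, phi, chosen, reward, next_phi, alpha=0.1, gamma=0.95):
    """One learning step: share the TD error among rules by firing strength."""
    current = sum(p * q[i][a] for i, (p, a) in enumerate(zip(phi, chosen)))
    greedy_next = sum(p * max(row) for p, row in zip(next_phi, q))
    td_error = reward + gamma * greedy_next - current
    for i, (p, a) in enumerate(zip(phi, chosen)):
        q[i][a] += alpha * td_error * p
    return td_error


# Usage: one decision for a pendulum tilted by 0.1 rad.
q_table = [[0.0] * len(ACTIONS) for _ in CENTERS]
rng = random.Random(0)
phi_now = firing_strengths(0.1)
picked = select_actions(q_table, epsilon=0.1, rng=rng)
force = continuous_action(phi_now, picked)
```

Each rule still chooses among a few discrete forces, but because the rules fire to different degrees, the blended output varies smoothly with the angle. This is the sense in which fuzzy inference turns a discrete-action learner into a continuous-action controller.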