Application of Reinforcement Learning for Navigation of a Snake-like Robot in Serpentine Locomotion

Article Type: Research Article

Authors

Ferdowsi University of Mashhad

Abstract

The aim of this paper is the guidance and control of a snake-like robot using reinforcement learning (RL). The paper models and simulates the snake robot and then implements the results on a real robot. First, the dynamic equations of an n-link snake robot in serpentine locomotion are derived in a simple, comprehensive, and efficient manner using the Gibbs-Appell method. The approach presented in this study reduces the computational cost of the snake-robot dynamics considerably compared with previous work. A physical model of the robot is then built in SimMechanics and used to verify the dynamic equations. Q-learning is employed to train the snake robot and steer its heading. The effect of the serpenoid-curve and body-curve parameters on the learning speed is also investigated. The results show that physical parameters that do not change the shape of the robot have no noticeable effect on its learning. Finally, the Webots simulation software and the FUM-Snake II snake robot are used to validate the learning results. Measurements from the laboratory robot show that its path after Q-learning agrees well with both the path predicted by the proposed dynamic solution and the Webots simulation results.
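For context, the two technical ingredients named above have compact textbook forms; the expressions below are standard statements, not equations quoted from this paper, and the symbols are chosen here for illustration. In the Gibbs-Appell formulation, one builds the Gibbs function (the "energy of acceleration") and differentiates it with respect to the generalized accelerations,

\[ S = \sum_{i=1}^{n} \tfrac{1}{2}\, m_i\, \ddot{\mathbf r}_i \cdot \ddot{\mathbf r}_i \; (+\ \text{analogous rotational terms}), \qquad \frac{\partial S}{\partial \ddot q_k} = Q_k , \]

a formulation often chosen because it scales well with the number of links. The serpenoid gait, in turn, is commonly commanded in joint space as

\[ \phi_i(t) = \alpha \sin\!\big(\omega t + (i-1)\beta\big) + \gamma, \qquad i = 1,\dots,n-1, \]

where \(\alpha\) is the winding amplitude, \(\omega\) the temporal frequency, \(\beta\) the phase lag between adjacent joints, and \(\gamma\) a constant joint offset; a nonzero \(\gamma\) makes the robot turn, which is the kind of steering parameter a navigation policy can select.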

Title [English]

Application of Reinforcement Learning for Navigation of a Planar Snake Robot in Serpentine Locomotion

Authors [English]

  • Hadi Kalani
  • Alireza Akbarzadeh
Ferdowsi University of Mashhad
Abstract [English]

This article presents an implementation of a reinforcement learning (RL) method for navigation of a snake-like robot. The paper starts by developing the kinematic and dynamic models of a snake robot in serpentine locomotion, proceeds to simulation, and finishes with actual experimentation. First, the Gibbs-Appell method is used to obtain the robot dynamics. The robot is also modeled in the SimMechanics toolbox of MATLAB, which is then used to verify the derived dynamic equations. In this study, for the first time, Q-learning is employed to obtain the optimal states and actions. The effects of the serpenoid-curve and body-curve parameters on the snake robot's learning ability are also investigated. Results indicate that parameters which do not affect the body shape of the snake robot also do not affect its learning ability. Finally, the experimental FUM-Snake II robot as well as the Webots software are employed to validate the theoretical results. The results show that Q-learning is an effective method for navigation of a snake-like robot.
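For readers who want to see the learning loop the abstract refers to in concrete form, the following is a minimal sketch of tabular Q-learning applied to heading control. It uses a toy heading model rather than the Gibbs-Appell dynamics derived in the paper, and the state/action discretization, reward, and learning constants are all illustrative assumptions, not the authors' implementation.

import numpy as np

# Minimal tabular Q-learning for heading control of a snake-like robot.
# The "plant" is a toy heading model, NOT the paper's Gibbs-Appell dynamics;
# every constant below is an illustrative assumption.

rng = np.random.default_rng(0)

N_STATES = 9                        # discretized heading-error bins over [-pi, pi]
ACTIONS = [-0.2, 0.0, 0.2]          # candidate joint offsets: left / straight / right
LR, DISCOUNT, EPS = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate

def discretize(err):
    """Map a heading error in [-pi, pi] to a state index 0..N_STATES-1."""
    edges = np.linspace(-np.pi, np.pi, N_STATES + 1)
    return int(np.clip(np.digitize(err, edges) - 1, 0, N_STATES - 1))

def step(err, offset):
    """Toy model: a joint offset turns the body, changing the heading error."""
    new_err = err - 0.5 * offset + rng.normal(0.0, 0.02)   # turn plus small noise
    new_err = (new_err + np.pi) % (2 * np.pi) - np.pi      # wrap to [-pi, pi)
    return new_err, -abs(new_err)                          # reward: face the goal

Q = np.zeros((N_STATES, len(ACTIONS)))

for episode in range(500):
    err = rng.uniform(-np.pi, np.pi)          # random initial heading error
    for t in range(100):
        s = discretize(err)
        # epsilon-greedy action selection
        if rng.random() < EPS:
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax(Q[s]))
        err, r = step(err, ACTIONS[a])
        s_next = discretize(err)
        # one-step Q-learning update (Watkins & Dayan)
        Q[s, a] += LR * (r + DISCOUNT * np.max(Q[s_next]) - Q[s, a])

# Greedy policy per state: expect "turn left" for negative errors, "turn right"
# for positive ones, and "straight" near zero.
print([ACTIONS[a] for a in np.argmax(Q, axis=1)])

With this reward (negative absolute heading error), the greedy policy stored in the table turns the robot toward the goal direction from either side; in the paper's setting the same update rule would act on states and actions defined by the snake robot's configuration and gait parameters.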

Keywords [English]

  • Snake robot
  • Gibbs-Appell method
  • Reinforcement learning
  • Q-learning