Title:
基于深度强化学习的机械臂动态避障算法设计与实验验证. (Chinese)
Alternate Title:
Design and experimental verification of a dynamic obstacle avoidance algorithm for robot manipulators based on deep reinforcement learning. (English)
Source:
Experimental Technology & Management; Apr 2025, Vol. 42, Issue 4, p78-85, 8p
Database:
Complementary Index


[Objective] The study addresses the challenge of dynamic obstacle avoidance for robot manipulators operating in unstructured environments. Traditional motion planning algorithms often struggle with real-time adaptability and responsiveness to dynamic changes, especially in scenarios involving non-static obstacles and targets, where the ability to adapt quickly and accurately is crucial for safe and efficient operation. This research therefore aims to develop an algorithm based on deep reinforcement learning (DRL) that effectively balances dynamic obstacle avoidance and target tracking, ensuring the safe and efficient operation of robot manipulators in complex, unpredictable scenarios.

[Methods] To achieve this goal, a DRL framework built on the soft actor-critic (SAC) algorithm was designed. SAC is well suited to continuous control tasks: it uses neural networks to handle high-dimensional problems without requiring precise environment modeling, and the robot manipulator learns optimal control strategies through trial-and-error interaction with the environment. The proposed method incorporates a comprehensive reward function that balances critical factors, including end-effector and body obstacle avoidance, self-collision prevention, precise target reaching, and motion smoothness. This reward function guides the learning process by providing clear feedback signals that encourage the agent to develop efficient and safe behaviors. The state space provides a complete representation of the environment, covering the robot manipulator, obstacles, and target: joint angles, joint velocities, end-effector positions and orientations, and key points on the manipulator's body, giving the agent all the information it needs to make accurate and efficient decisions. The action space is defined by joint accelerations, which are transformed into planned joint velocities and communicated to the manipulator for control. This control strategy effectively eliminates motion singularities, enabling smooth and continuous operation.

[Results] The algorithm is trained in a simulation environment that leverages Python and the PyBullet simulator, providing a realistic and efficient platform for agent training. The environment is encapsulated using the Gym framework and integrated with the Stable-Baselines3 library to facilitate smooth agent–environment interaction. Extensive simulations demonstrate the algorithm's ability to learn effective dynamic obstacle-avoidance strategies, with average reward and success-rate curves showing noticeable improvement and eventual stabilization. These results indicate that the model reaches a relatively stable state, capable of navigating complex and dynamic environments. The trained model is subsequently deployed on a real robot manipulator equipped with a visual servoing system: an Intel RealSense D435 camera and an OnRobot gripper attached to a UR5 manipulator. The visual servoing system employs ArUco markers to detect obstacles and targets, while OpenCV handles image processing and pose estimation, enabling real-time environmental perception and precise manipulator control. Experimental results validate the algorithm's practical effectiveness: the robot successfully avoids dynamic obstacles and reliably reaches target positions regardless of the direction of obstacle motion. Quantitative analysis reveals that the end-effector's position error with respect to the target converges to zero and that joint velocities remain smooth throughout the operation, confirming the algorithm's precision and reliability. Two illustrative sketches of this setup follow.
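To make the described setup concrete, the following minimal sketch (not the authors' code) shows how a Gym-style manipulator environment with the state, action, and reward structure outlined above can be wired to the SAC implementation in Stable-Baselines3. The kinematics, dimensions, reward weights, and thresholds are placeholder assumptions; the real environment would query PyBullet for joint states and collision distances, and recent Stable-Baselines3 releases expect the Gymnasium fork of the Gym API.

```python
# Illustrative sketch only: a reduced Gym-style environment whose observation
# packs joint angles, joint velocities, end-effector and obstacle/target
# positions, whose action is joint accelerations integrated into velocity
# commands, and whose reward combines target reaching, obstacle avoidance,
# and motion smoothness. All constants below are assumed placeholders.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC

N_JOINTS = 6  # a UR5-class arm has six revolute joints


class ArmAvoidanceEnv(gym.Env):
    """Toy stand-in for the paper's PyBullet manipulator environment."""

    def __init__(self):
        # Observation: q (6), dq (6), end-effector position (3),
        # target position (3), obstacle position (3).
        obs_dim = 2 * N_JOINTS + 9
        self.observation_space = spaces.Box(-np.inf, np.inf, (obs_dim,), np.float32)
        # Action: joint accelerations, normalized to [-1, 1].
        self.action_space = spaces.Box(-1.0, 1.0, (N_JOINTS,), np.float32)
        self.dt = 0.05  # control period (assumed)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.q = np.zeros(N_JOINTS, np.float32)
        self.dq = np.zeros(N_JOINTS, np.float32)
        self.target = self.np_random.uniform(-0.5, 0.5, 3).astype(np.float32)
        self.obstacle = self.np_random.uniform(-0.5, 0.5, 3).astype(np.float32)
        return self._obs(), {}

    def _ee_pos(self):
        # Placeholder forward kinematics; the real env asks PyBullet instead.
        return np.array([np.cos(self.q[0]), np.sin(self.q[0]), self.q[1]], np.float32)

    def _obs(self):
        return np.concatenate([self.q, self.dq, self._ee_pos(),
                               self.target, self.obstacle])

    def step(self, action):
        # Integrate accelerations into planned joint velocities, then angles,
        # mirroring the acceleration-based action space described above.
        self.dq = np.clip(self.dq + self.dt * action, -1.0, 1.0)
        self.q = self.q + self.dt * self.dq
        ee = self._ee_pos()
        dist_target = float(np.linalg.norm(ee - self.target))
        dist_obstacle = float(np.linalg.norm(ee - self.obstacle))
        # Reward: approach the target, penalize jerky motion, and apply an
        # end-effector obstacle-avoidance penalty inside a safety radius.
        reward = -dist_target - 0.1 * float(np.square(action).sum())
        if dist_obstacle < 0.1:
            reward -= 5.0
        terminated = dist_target < 0.02  # reached the target
        return self._obs(), reward, terminated, False, {}


env = ArmAvoidanceEnv()
model = SAC("MlpPolicy", env, verbose=1)  # SAC suits continuous actions
model.learn(total_timesteps=10_000)
```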
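On the perception side, here is a similarly hedged sketch of ArUco-based detection and pose estimation with OpenCV, the task the visual servoing system performs to locate obstacles and targets. The camera intrinsics and marker size are assumed placeholders (calibrated values would come from the RealSense D435), the dictionary choice is arbitrary, and the ArucoDetector class requires OpenCV 4.7 or newer.

```python
# Illustrative sketch only: detect ArUco markers in a color frame and recover
# each marker's pose in the camera frame with OpenCV.
import cv2
import numpy as np

# Placeholder intrinsic matrix and zero distortion; replace with the values
# reported by the RealSense calibration.
K = np.array([[615.0,   0.0, 320.0],
              [  0.0, 615.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist = np.zeros(5)
MARKER_SIZE = 0.05  # marker edge length in metres (assumed)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())


def marker_poses(frame):
    """Return {marker_id: (rvec, tvec)} for every marker found in the frame."""
    corners, ids, _rejected = detector.detectMarkers(frame)
    poses = {}
    if ids is None:
        return poses
    # 3D corners of a square marker centred at its own origin, in the order
    # ArUco reports them: top-left, top-right, bottom-right, bottom-left.
    h = MARKER_SIZE / 2.0
    obj = np.array([[-h,  h, 0], [h,  h, 0], [h, -h, 0], [-h, -h, 0]],
                   dtype=np.float32)
    for marker_id, c in zip(ids.flatten(), corners):
        ok, rvec, tvec = cv2.solvePnP(obj, c.reshape(4, 2).astype(np.float32),
                                      K, dist, flags=cv2.SOLVEPNP_IPPE_SQUARE)
        if ok:
            poses[int(marker_id)] = (rvec, tvec)  # pose in the camera frame
    return poses
```

cv2.solvePnP with the IPPE_SQUARE flag is used here because it is stable across OpenCV versions; older releases exposed cv2.aruco.estimatePoseSingleMarkers for the same job, but it has since been deprecated. Either way, the resulting rvec/tvec pairs are the kind of real-time obstacle and target poses that would be fed to the trained policy.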
[Conclusions] This study develops and validates a DRL-based dynamic obstacle-avoidance algorithm for robot manipulators. By combining the soft actor-critic algorithm with a well-structured reward function, the proposed method demonstrates strong performance in navigating complex, dynamic environments. Deployment of the trained model on a real robot manipulator, integrated with a visual servoing system, further validates the algorithm's practical applicability. These results highlight the potential of DRL to enhance the autonomy and adaptability of robot manipulators, paving the way for future research on intelligent robotic systems. [ABSTRACT FROM AUTHOR]

For the dynamic obstacle-avoidance task of robot manipulators in unstructured environments, an algorithm design and experimental verification workflow based on deep reinforcement learning (DRL) is proposed. A comprehensive reward function is designed to balance dynamic obstacle avoidance and target tracking, covering end-effector obstacle avoidance, body obstacle avoidance, self-collision avoidance, precise target reaching, and motion smoothness. A simulation platform built in the Python programming environment is used to train the agent, achieving efficient state recognition and action execution. The trained model is then applied to a real robot manipulator and combined with a visual servoing system to complete real-time environmental perception and precise obstacle-avoidance tests. The experimental results verify the performance of the DRL algorithm, providing technical support for making laboratory research more intelligent and autonomous, while also helping students develop the ability to link theory with practice. [ABSTRACT FROM AUTHOR]

Copyright of Experimental Technology & Management is the property of Experimental Technology & Management Editorial Office and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)