
Title:
结合先验知识与深度强化学习的机械臂抓取研究. (Chinese)
Alternate Title:
Robotic arm grasping study combining prior knowledge and deep reinforcement learning. (English)
Source:
Journal of Xi'an Polytechnic University; 2023, Vol. 37 Issue 4, p92-101, 10p
Database:
Complementary Index


In applying deep reinforcement learning (DRL) to realize autonomous behavioral decision-making for robotic arms, the high-dimensional continuous state-action space tends to cause low data-sampling efficiency and low-quality experience samples, which ultimately leads to slow convergence of the reward function and long learning times. To address this problem, a DRL model that introduces prior knowledge was proposed. The model is combined with the inverse kinematics of the robotic arm, and the prior knowledge guides the agent during the sampling phase of DRL, mitigating the low data-sampling efficiency and poor quality of experience samples during learning. Furthermore, the strong generalization capability of the prior-knowledge DRL model on new tasks was verified through network parameter transfer. Finally, joint simulation experiments were conducted with Python and the CoppeliaSim platform. The results show that, compared with the original model, the DRL model with prior knowledge improves learning efficiency by 13.89% and 12.82% and raises the task-completion success rate by 16.92% and 13.25%; on the new task, the learning rate improves by 23.08% and 23.33%, and the success rate improves by 10.7% and 11.57%. [ABSTRACT FROM AUTHOR]
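
The approach summarized above, prior-knowledge-guided sampling during the DRL data-collection phase, can be illustrated with a small Python sketch. This is not the authors' implementation: the toy reaching environment, the ik_prior_action and policy_action helpers, the mixing probability, and all other names and parameters are illustrative assumptions standing in for the paper's inverse-kinematics guidance and CoppeliaSim scene.

import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done)."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)


class PointReachEnv:
    """Toy stand-in for the CoppeliaSim robot scene: the state is the
    end-effector position, the action is a Cartesian displacement, and the
    reward is the negative distance to the grasp target."""
    def __init__(self, target):
        self.target = np.asarray(target, dtype=float)
        self.pos = np.zeros(3)

    def reset(self):
        self.pos = np.random.uniform(-1.0, 1.0, 3)
        return self.pos.copy()

    def step(self, action):
        self.pos += np.clip(action, -0.2, 0.2)
        dist = float(np.linalg.norm(self.target - self.pos))
        return self.pos.copy(), -dist, dist < 0.05


def ik_prior_action(state, target, gain=0.5):
    """Prior-knowledge action: a simple IK-style step that moves the
    end-effector toward the target (placeholder for the paper's
    inverse-kinematics guidance)."""
    return gain * (target - state)


def policy_action(state, action_dim=3, noise=0.1):
    """Stand-in for the DRL policy's exploratory action."""
    return np.random.uniform(-1.0, 1.0, action_dim) * noise


def collect_episode(env, target, buffer, prior_prob=0.3, max_steps=200):
    """Sampling phase: with probability prior_prob take the prior-guided
    action, otherwise the policy's action, and store every transition."""
    state = env.reset()
    for _ in range(max_steps):
        if np.random.rand() < prior_prob:
            action = ik_prior_action(state, target)   # guided by prior knowledge
        else:
            action = policy_action(state)             # ordinary exploration
        next_state, reward, done = env.step(action)
        buffer.push((state, action, reward, next_state, done))
        state = next_state
        if done:
            break


if __name__ == "__main__":
    target = np.array([0.5, 0.3, 0.4])
    env, buffer = PointReachEnv(target), ReplayBuffer()
    for _ in range(20):
        collect_episode(env, target, buffer)
    print(f"collected {len(buffer.buffer)} transitions")

Raising prior_prob fills the buffer with more goal-directed transitions early in training, which is the intuition behind the reported gains in sampling efficiency; in a full DRL loop the stored batches would then be sampled to update the policy, and the trained network's parameters could be copied to a new task's agent to test transfer.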

Copyright of Journal of Xi'an Polytechnic University is the property of Editorial Department of Journal of Xi'an Polytechnic University and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)