Title:
Simultaneous AUV navigation and tracking with satellite ASVs
Contributors:
Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial; Cruz, Nuno Alexandre; Costa Castelló, Ramon
Publisher Information:
Universitat Politècnica de Catalunya
Publication Year:
2022
Collection:
Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Document Type:
Master's thesis
File Description:
application/pdf
Language:
English
Accession Number:
edsbas.97B4186F
Database:
BASE

Further Information

The navigation of Autonomous Underwater Vehicles (AUVs) presents a series of challenges, one of them being localization, since GPS signals are unavailable underwater. A solution is to use several surface vehicles equipped with hydrophones to locate and track an AUV that emits an acoustic signal. This Master's Thesis addresses this solution with the objective of optimizing the energy consumption of two Autonomous Surface Vehicles (ASVs) while maintaining a preferred formation geometry that reduces the uncertainty of the AUV position estimate. To control these surface vehicles, two Reinforcement Learning (RL) approaches have been implemented: Deep Deterministic Policy Gradient (DDPG) and Multi-Agent Deep Deterministic Policy Gradient (MADDPG). The aim of using these two algorithms is to test whether the multi-agent element is necessary in this case. The RL algorithms control the linear and angular velocities of each robot. The environment used to train the robots was created specifically for this project in Python. The reward function is designed as a weighted sum of Gaussian functions, which contains all the terms related to the optimization of energy consumption and to the formation around the AUV. To analyse several aspects of the final model of each robot, a total of four different tests were carried out, all as simulations. These tests focus on the distribution of the weights in the reward function and on the ability to adapt to difficult scenarios. The results show that the ASVs trained with these RL implementations modify their behaviour depending on the weight configuration and can adapt to more difficult scenarios depending on aspects of the training, such as the noise applied to the actions. Moreover, the performance of DDPG and MADDPG is compared, and the final results are discussed against similar works that have treated this problem.
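
The abstract states that the reward is a weighted sum of Gaussian functions combining energy and formation terms, but does not reproduce the formulation itself. The following Python sketch illustrates what such a reward could look like; the function names, input quantities, target values (mu, sigma) and weights are illustrative assumptions, not values taken from the thesis.

    import numpy as np

    def gaussian(x, mu, sigma):
        # Unnormalised Gaussian kernel: equals 1 at x == mu and decays smoothly away from it.
        return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

    def reward(range_to_auv, baseline_angle, speed, weights=(0.5, 0.3, 0.2)):
        # range_to_auv   : distance from the ASV to the estimated AUV position [m] (assumed input)
        # baseline_angle : angle subtended at the AUV by the two ASVs [rad] (assumed input)
        # speed          : commanded linear velocity of the ASV [m/s] (assumed input)
        # weights        : trade-off between formation keeping and energy use (assumed values)
        r_range = gaussian(range_to_auv, mu=50.0, sigma=10.0)          # stay near a preferred standoff distance
        r_angle = gaussian(baseline_angle, mu=np.pi / 2.0, sigma=0.3)  # a wide baseline reduces estimate uncertainty
        r_energy = gaussian(speed, mu=0.0, sigma=0.5)                  # low commanded speed saves energy
        return weights[0] * r_range + weights[1] * r_angle + weights[2] * r_energy

Gaussian terms keep the reward smooth and bounded, peaking at the desired value of each quantity, so changing the weights directly shifts the trade-off between formation quality and energy consumption, which is what the weight-distribution tests examine.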
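
The "multi-agent element" under test refers to the structural difference between the two algorithms: in the standard formulation of MADDPG, each agent is trained with a centralised critic that observes all agents' observations and actions, while the actors remain decentralised; in DDPG each critic sees only its own agent. Below is a minimal sketch of that difference, assuming a PyTorch implementation (the abstract does not specify the framework); layer sizes are arbitrary.

    import torch
    import torch.nn as nn

    class Critic(nn.Module):
        # With centralised=False this is a per-agent DDPG critic (own observation and
        # action only); with centralised=True it is a MADDPG-style critic that
        # conditions on every agent's observation and action during training.
        def __init__(self, obs_dim, act_dim, n_agents, centralised):
            super().__init__()
            in_dim = (obs_dim + act_dim) * (n_agents if centralised else 1)
            self.net = nn.Sequential(
                nn.Linear(in_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 1),  # scalar Q-value
            )

        def forward(self, obs, act):
            # obs and act are already concatenated across agents when centralised=True.
            return self.net(torch.cat([obs, act], dim=-1))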