Abstract:
Objectives To improve ship path planning and obstacle avoidance in complex marine environments, and thereby the efficiency and safety of ship navigation, this study proposes a method based on an improved deep deterministic policy gradient (DDPG) algorithm.
Methods A prioritized experience replay mechanism, guided by a path importance score, is introduced so that important experiences are used more efficiently during learning. A self-attention mechanism is integrated into the actor-critic network to strengthen its ability to capture environmental features. In addition, the network architecture is optimized by adopting the structure of the dueling deep Q-network (dueling DQN) to improve the accuracy of value function estimation.
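The replay mechanism described above can be illustrated with a minimal sketch. The abstract does not define how the path importance score is computed, so the sketch treats it as a scalar supplied by the caller; the class name `ScoredReplayBuffer`, the exponent `alpha`, and the offset `eps` are hypothetical choices for illustration and are not taken from the study. Transitions are sampled with probability proportional to their priority, so high-score experiences are replayed more often.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ScoredReplayBuffer:
    """Replay buffer that samples transitions in proportion to a priority.

    The priority is derived from a caller-supplied importance score
    (an assumption: the abstract does not say how the score is computed).
    """
    capacity: int
    alpha: float = 0.6   # hypothetical exponent; 0 would give uniform sampling
    eps: float = 1e-3    # keeps zero-score transitions sampleable
    transitions: list = field(default_factory=list)
    priorities: list = field(default_factory=list)

    def add(self, transition, importance_score):
        # Evict the oldest transition once the buffer is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append((abs(importance_score) + self.eps) ** self.alpha)

    def sample(self, batch_size, rng=random):
        # Normalize priorities into sampling probabilities.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idx = rng.choices(range(len(self.transitions)), weights=probs, k=batch_size)
        # Return the sampled transitions together with their probabilities.
        return [self.transitions[i] for i in idx], [probs[i] for i in idx]
```

A full prioritized replay scheme would typically also apply importance-sampling weights to correct the bias introduced by non-uniform sampling; that step is omitted here for brevity.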
Results Simulations in the East China Sea and the Indian Ocean show that, compared with the DDPG and A* algorithms, the improved algorithm achieves significant improvements in path length, number of inflection points, and collision avoidance. For example, in the East China Sea, the improved algorithm reduces path length by 0.75%, inflection points by 26.92%, and collisions by 15.80% compared with the DDPG algorithm, and reduces path length by 4.59% and inflection points by 42.42% compared with the A* algorithm.
Conclusions The improved algorithm outperforms the DDPG and traditional A* algorithms in marine environments of varying complexity, demonstrating significant advantages and strong generalizability. It provides a reference for intelligent decision-making in ship navigation.