Abstract:
Objective Maritime rescue missions require efficient and reliable path planning for unmanned surface vehicles (USVs). These missions are challenging because USVs have limited sensing range and operate in vast, uncertain environments with randomly distributed obstacles. This study addresses the low path planning efficiency and poor robustness that result from this restricted perception range by proposing a novel local observation-based path planning approach for USVs in maritime rescue missions.
Method The proposed approach integrates three key methodological innovations. First, the soft actor-critic (SAC) algorithm is employed with a reward function tailored to local observation, which rewards efficient goal-reaching and penalizes obstacle collisions. This design helps balance exploration and exploitation in uncertain environments. Second, a feature-enhanced soft actor-critic (FESAC) algorithm is introduced to improve training efficiency and model robustness. It extracts key environmental features and employs a randomized training environment with strategically placed obstacles to enhance sampling efficiency. During training, obstacle positions, USV starting points, and goals are randomly reset across episodes, encouraging the model to learn generalizable navigation strategies instead of memorizing specific scenarios. Third, an adaptive waypoint planning algorithm is developed based on local perception domains to effectively coordinate local obstacle avoidance with global goal-reaching behavior. Waypoints are dynamically selected within the USV's perception radius using a weighted objective function that balances proximity to the goal and distance from obstacles. This decomposes the complex global path planning task into a series of manageable local planning problems.
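The waypoint selection step described above can be illustrated with a minimal sketch. The abstract does not give the exact objective function, so everything here is an assumption for illustration: the function name `select_waypoint`, the sampling of candidates on the perception boundary, the clearance cap, and the score w_1 × (negative distance to goal) + w_2 × (clearance from observed obstacles). Only the ideas of a perception radius and of weights w_1 and w_2 come from the text.

```python
import math


def select_waypoint(position, goal, obstacles, radius,
                    w1=1.0, w2=1.0, n_candidates=36):
    """Pick the next local waypoint (hypothetical scoring, not the paper's)."""
    px, py = position
    # If the goal already lies inside the perception domain, go straight to it.
    if math.dist(position, goal) <= radius:
        return goal
    # Only obstacles within the perception radius are observable.
    visible = [o for o in obstacles if math.dist(position, o) <= radius]
    best, best_score = None, -math.inf
    for k in range(n_candidates):
        # Assumed candidate set: evenly spaced points on the perception boundary.
        angle = 2.0 * math.pi * k / n_candidates
        cand = (px + radius * math.cos(angle), py + radius * math.sin(angle))
        # Goal term: closer to the goal is better, so use negative distance.
        goal_term = -math.dist(cand, goal)
        # Obstacle term: distance to the nearest visible obstacle, capped at
        # the radius so far-away clearance does not dominate the score.
        clearance = min((math.dist(cand, o) for o in visible), default=radius)
        obstacle_term = min(clearance, radius)
        score = w1 * goal_term + w2 * obstacle_term
        if score > best_score:
            best, best_score = cand, score
    return best


if __name__ == "__main__":
    # One obstacle sits directly between the USV and the goal.
    print(select_waypoint((0.0, 0.0), (100.0, 0.0), [(10.0, 0.0)], radius=10.0))
```

In this sketch, raising `w2` relative to `w1` pushes the chosen waypoint farther from observed obstacles, while raising `w1` pulls it toward the straight line to the goal. This is the same qualitative trade-off between safety and path length that the weights are reported to control.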
Results Comprehensive simulation experiments validate the effectiveness of the proposed approach. In feature-rich environments with randomly distributed obstacles, the method achieves a success rate exceeding 98%, significantly outperforming traditional methods. In simulated maritime rescue missions over 1,000 m × 1,000 m areas with 20 to 50 randomly placed obstacles, the method maintains a task completion rate exceeding 93% under appropriate parameter configurations. The simulation results also reveal a notable trade-off between path safety and efficiency: increasing the obstacle avoidance weight w_2 yields safer but longer paths, whereas increasing the goal-reaching weight w_1 yields shorter paths at the cost of higher collision risk. The two weights can therefore be tuned to suit the requirements of a given task. Comparative analysis shows that the FESAC algorithm converges significantly faster than standard SAC in complex environments, demonstrating enhanced learning efficiency.
Conclusion The proposed local observation-based path planning method effectively addresses the challenges posed by limited perception in maritime rescue scenarios, exhibiting strong robustness and adaptability to uncertain environments. By decomposing complex global planning tasks into manageable local subtasks and enhancing feature extraction capabilities, the method provides a practical solution for real-world USV operations where complete environmental information is unavailable. This work offers technical insights for applying reinforcement learning algorithms in real engineering scenarios.