Abstract:
Objectives Unmanned surface vehicle (USV) swarms have become essential in critical maritime applications, including search and rescue, environmental monitoring, and military operations, owing to their superior robustness and operational efficiency compared with single-USV systems. However, traditional path planning methods for USV swarms exhibit significant limitations in complex and dynamic marine environments. Conventional algorithms, such as A*, RRT*, APF, and DWA, operate passively, relying on static or quasi-static environmental parameters (e.g., obstacle positions and velocities) to construct models. These algorithms lack predictive capabilities for dynamic targets and fail to support proactive decision-making, resulting in limited adaptability and insufficient robustness. Although deep reinforcement learning (DRL) approaches enable end-to-end policy learning, they suffer from high sample complexity, high training costs, limited generalization, and difficulty in integrating high-level task constraints (e.g., task priorities and safety thresholds). To address these challenges, this study proposes a novel adaptive path planning framework that leverages the advanced reasoning and decision-making capabilities of large language models (LLMs) to improve the performance of USV swarms in complex scenarios.
Methods The proposed method, termed adaptive path planning with tool-function chains (APPT), adopted a multi-component architecture to enable intelligent and adaptive path planning. First, a dedicated planning encoder was developed to process environmental data and extract critical features, including obstacle density, dynamic obstacle motion patterns, and task constraints (e.g., path-length limits and safety-distance requirements). The encoder transformed unstructured environmental and task information into structured prompt vectors, which were subsequently fed into the LLM. Second, through prompt engineering, an LLM-driven USV swarm path planning agent was constructed. This agent integrated a library of classical path planning algorithms (A*, RRT*, APF, and DWA) as plug-and-play "tool functions". The LLM dynamically assembled optimal tool-function chains by computing the cosine similarity between prompt vectors (representing environmental and task requirements) and the capability feature vectors of the tool functions. Third, an adaptive iterative optimization mechanism guided by user input was introduced. Based on three key evaluation metrics—planning time (the total duration from prompt input to plan generation), total path length (the sum of individual USV path lengths in the swarm), and safety (distance between USVs and obstacles relative to obstacle expansion radii)—the LLM iteratively adjusted tool-function parameters (e.g., heuristic weights in A*, attraction/repulsion gains in APF). This iterative adjustment was driven by structured prompt templates that incorporated scenario details, current performance metrics, and optimization goals, ensuring that the framework could flexibly adapt to evolving task requirements.
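The cosine-similarity matching step described above can be illustrated with a minimal sketch. The capability axes, feature values, and function names below are hypothetical placeholders chosen for illustration; the paper does not specify the dimensionality or contents of the prompt and capability vectors.

```python
import math

# Illustrative capability axes (not from the paper):
# [global optimality, dynamic-obstacle handling, path smoothness, planning speed]
TOOL_CAPABILITIES = {
    "A*":   [0.9, 0.2, 0.4, 0.6],
    "RRT*": [0.7, 0.3, 0.5, 0.4],
    "APF":  [0.3, 0.7, 0.8, 0.9],
    "DWA":  [0.4, 0.9, 0.7, 0.8],
}

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_tool(prompt_vector):
    """Pick the tool function whose capability vector best matches the prompt."""
    scores = {name: cosine_similarity(prompt_vector, cap)
              for name, cap in TOOL_CAPABILITIES.items()}
    return max(scores, key=scores.get)

# A prompt vector emphasizing dynamic obstacles and fast replanning
# selects the reactive local planner under these illustrative values.
print(select_tool([0.2, 0.9, 0.6, 0.8]))  # → DWA
```

In the full framework this selection would be repeated to assemble a chain of tool functions rather than a single planner, with the LLM supplying the prompt vector from the encoded scenario.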
Results Extensive experiments were conducted in dynamically generated obstacle environments (100 m × 100 m maps) containing 3 fixed obstacles (radii: 8 m, 10 m, 15 m) and 3 dynamic obstacles (radius: 5 m, with positions updated at discrete time steps). The results demonstrated the effectiveness of the APPT method across multiple dimensions. In tool selection accuracy, the APPT method achieved an average accuracy of 89.7% across various scenarios, with performance adapting to environmental and task conditions. For example, when safety was prioritized (weight: 0.6−0.8), the accuracy of safety-oriented tool selection reached 95.6%, whereas when planning time was emphasized (weight: 0.6−0.8), time-related selection accuracy peaked at 96.2%. Regarding path optimization, the APPT method reduced total path length by 14.55% after iterative parameter adjustment (e.g., optimizing APF parameters to mitigate local-minimum oscillations). Compared with conventional parameter optimization algorithms, the APPT method maintained comparable path quality (with only a 0.7% increase in path length) while reducing optimization time by 61% (7.52 s vs. 19.48 s), demonstrating superior efficiency.
Conclusions The APPT method represents a paradigm shift in USV swarm path planning by fully leveraging the reasoning and analytical capabilities of LLMs. By integrating prompt engineering and dynamic tool-chain composition, the proposed method overcomes the adaptability and robustness limitations of traditional approaches, as well as the scalability challenges of DRL methods. The APPT method achieves high tool selection accuracy (89.7% on average) in complex environments and supports efficient, user-guided iterative optimization, leading to notable improvements in both path quality and planning efficiency. In practice, the APPT framework offers a flexible solution for a wide range of maritime tasks—from civilian environmental monitoring to military operations—by adjusting evaluation metrics and tool parameters. Theoretically, it bridges the gap between LLMs and engineering applications in maritime robotics, laying a foundation for future research on intelligent and adaptive multi-agent systems in complex and dynamic environments.