A prompt-engineering-based large language model decision architecture for unmanned surface vessel clusters

  • Abstract: [Objectives] In unmanned surface vessel (USV) cluster missions, target recognition, situational understanding, and autonomous decision-making and control tend to generalize poorly and respond slowly in small-sample, low-resource, complex dynamic scenarios. This paper aims to build an end-to-end multi-agent architecture that combines the reasoning strengths of large language models (LLMs) with real-time control, in order to improve the collaborative combat and autonomous control capabilities of USV clusters in complex environments. [Methods] A three-tier collaborative "perception-understanding-decision" architecture, the Maritime Commander Agent, is proposed. It brings the Qwen2.5-72B LLM into the entire decision-making process of the USV cluster and combines prompt engineering with PID control to integrate high-level strategic planning with low-level precise control. The system comprises three agents: a target perception agent, which uses YOLOv8l for high-precision detection and localization of multiple classes of maritime targets; a situational understanding agent, which uses structured prompt templates to convert perception results into high-level natural-language situation descriptions; and a decision-making agent, which combines LLM inference with external computational tools to generate control commands and uses PID regulation to improve response speed. The architecture requires no additional model fine-tuning and offers low latency and good adaptability. [Results] Experiments show high target detection accuracy on the public ABOships dataset and in degraded scenarios (heavy fog, heavy rain), and the semantic conversion accuracy of situational understanding reaches 93.5%. In simulation, the success rate of the 4v1 encirclement task rises from 20% with the traditional rule-based method to 80%, and that of the 10v10 adversarial task rises from 25% to 75%, which verifies the system's robustness and cross-domain generalization in complex maritime environments. [Conclusions] The proposed Maritime Commander Agent architecture retains the high-level cognitive reasoning capability of LLMs while using PID control to strengthen real-time response and execution accuracy, significantly improving the collaborative decision-making of USV clusters in dynamic tasks. The study provides a feasible technical path and an engineering implementation reference for intelligent maritime cluster systems.

    This paper focuses on building an end-to-end multi-agent system based on LLMs to address target identification, situational understanding, and autonomous decision-making and control in USV cluster tasks. Existing USV decision-making systems rely mainly on rule-based methods, which often prove inadequate in complex scenarios with limited data and resources. LLMs have strong reasoning and generalization capabilities, but they struggle to meet the demands of high-frequency real-time control. The Maritime Commander Agent addresses this gap by using the LLM for intelligent decision-making and PID control to improve system response speed. Target perception agents handle multi-target detection and spatial localization, the situational understanding agent converts perception results into high-level natural-language situation descriptions, and the decision-making agent generates control commands in real time, so that high-level planning and low-level execution are tightly integrated. Simulation tasks show that the method outperforms traditional rule-based decision-making on key metrics such as task completion rate and adversarial success rate, indicating its potential as a new paradigm for intelligent maritime swarm collaboration systems.
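As an illustration of how the two ends of such a pipeline might fit together, the following minimal Python sketch fills a structured prompt template from perception results and tracks a commanded heading with a textbook PID loop. The abstract does not give the actual template, the PID gains, or the vessel model, so everything here is an assumption made for illustration: the `Detection` schema, the template wording, the gains, the 20 deg/s turn-rate limit, and the first-order heading model are all hypothetical. The LLM call itself is omitted.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One perceived target (hypothetical schema, not taken from the paper)."""
    label: str          # illustrative class name, e.g. "motorboat"
    bearing_deg: float  # bearing relative to own heading, in degrees
    range_m: float      # estimated distance, in metres


def build_situation_prompt(detections: List[Detection]) -> str:
    """Fill an assumed structured template with perception results."""
    lines = [
        f"- {d.label}: bearing {d.bearing_deg:.0f} deg, range {d.range_m:.0f} m"
        for d in sorted(detections, key=lambda d: d.range_m)
    ]
    return (
        "You are the situational understanding agent of a USV cluster.\n"
        "Detected targets (nearest first):\n"
        + "\n".join(lines)
        + "\nDescribe the tactical situation in two sentences."
    )


class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def wrap_deg(angle: float) -> float:
    """Wrap an angle to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def track_heading(target_deg: float, heading_deg: float = 0.0,
                  steps: int = 200, dt: float = 0.1) -> float:
    """Steer a simplified heading model toward target_deg and return the result.

    The target heading stands in for a high-level command from the decision
    agent; gains and the turn-rate limit are assumed values.
    """
    pid = PID(kp=1.2, ki=0.0, kd=0.1)
    for _ in range(steps):
        error = wrap_deg(target_deg - heading_deg)
        turn_rate = pid.step(error, dt)                # commanded turn rate, deg/s
        turn_rate = max(-20.0, min(20.0, turn_rate))   # assumed actuator limit
        heading_deg = wrap_deg(heading_deg + turn_rate * dt)
    return heading_deg
```

In this sketch the slow, infrequent step (building the prompt and querying the LLM) only sets the target heading, while the PID loop runs at a fixed short time step. This is one plausible reading of how high-level planning and low-level control could be decoupled, not a description of the authors' implementation.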

     
