Visual attribute prompt learning method for fine-grained ship recognition

张骁扬; 杨予光; 张必浪; 张娟; 张宝昌; 罗晓燕; 樊洁茹

doi:10.19693/j.issn.1673-3185.04407

Abstract:
Objective: Ship image recognition suffers from strong interference, scarce data, and inadequate semantic feature modeling in deep learning methods. To address these problems, this study proposes a prompt learning mechanism based on visual attributes, Visual Attribute Prompt Tuning (VAPT).
Method: A large-scale pre-trained visual attribute codebook is constructed, and a Multi-head Cross-Attention (MCA) mechanism performs attribute matching and selection. The selected attributes are aligned with a deep visual model to improve its recognition of key ship features.
Results: On a self-built ship image dataset that was manually annotated and cleaned, the proposed method improves Top-1 accuracy by 3.79% over the baseline Vision Transformer (ViT). On the public FGSCR-42 dataset, it improves Top-1 accuracy by 2.20% and 0.10% over the state-of-the-art B-CNN and RA-CNN methods, respectively.
Conclusion: The results provide a new technical route for feature decoupling and knowledge transfer in target recognition under complex marine environments, and are significant for intelligent maritime monitoring systems.
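The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea described under Method: image tokens act as queries in multi-head cross-attention over an attribute codebook, and the most attended entries are selected as prompts. All names (`cross_attend`, `select_attributes`), the dimensions, the random weights, and the top-k selection rule are assumptions made for illustration, not the authors' VAPT design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(tokens, codebook, w_q, w_k, w_v, num_heads):
    """Image tokens (queries) attend over codebook entries (keys/values).

    tokens: (n, d), codebook: (m, d), each weight matrix: (d, d).
    Returns the attended features (n, d) and attention maps (heads, n, m).
    """
    n, d = tokens.shape
    m = codebook.shape[0]
    dh = d // num_heads
    # Project, then split the feature dimension into heads: (heads, len, dh).
    q = (tokens @ w_q).reshape(n, num_heads, dh).transpose(1, 0, 2)
    k = (codebook @ w_k).reshape(m, num_heads, dh).transpose(1, 0, 2)
    v = (codebook @ w_v).reshape(m, num_heads, dh).transpose(1, 0, 2)
    # Scaled dot-product attention per head.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh), axis=-1)
    # Merge heads back into a single feature dimension.
    out = (attn @ v).transpose(1, 0, 2).reshape(n, d)
    return out, attn

def select_attributes(attn, top_k):
    # Hypothetical selection rule: average attention over heads and tokens,
    # then keep the top_k codebook entries by score.
    scores = attn.mean(axis=(0, 1))
    return np.argsort(scores)[::-1][:top_k]

# Toy data standing in for ViT patch tokens and a learned attribute codebook.
rng = np.random.default_rng(0)
d, n, m, heads = 16, 5, 8, 4
tokens = rng.normal(size=(n, d))
codebook = rng.normal(size=(m, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

out, attn = cross_attend(tokens, codebook, w_q, w_k, w_v, heads)
selected = select_attributes(attn, top_k=3)
prompts = codebook[selected]  # selected attribute vectors, usable as prompts
```

In a prompt-tuning setup, vectors such as `prompts` would typically be concatenated with the patch tokens before the transformer layers. Whether VAPT does exactly this is not stated in the abstract.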
