Multimodal fusion pedestrian detection algorithm based on improved YOLOv3

Authors: DENG Jiatong, CHENG Zhijiang, YE Haojie

Affiliation: School of Electrical Engineering, Xinjiang University, Urumqi 830049, China

Keywords: pedestrian detection; multimodal fusion; involution operator; attention mechanism

Abstract:

Single-modality visible-light pedestrian detection performs poorly at night under insufficient light and in scenes with dense targets, multi-scale targets, or partially occluded targets. To address this, a multimodal fusion pedestrian detection algorithm based on an improved YOLOv3, named YOLOv3-Invo, is proposed. The algorithm uses an improved Darknet-VI as the multimodal feature extraction module and splices the two different feature maps together through a cascade (concatenation) operation. A spatial pyramid pooling module is introduced into the neck detection-layer branch and combined with an efficient involution operator network to reduce the number of model parameters. In the stacked deep convolution module of the detection network layer, a newly designed ResFuse module replaces the first convolution and is combined with the CBAM attention mechanism to strengthen feature extraction from the fused feature maps. Comparative experiments show that, on the KAIST dataset, the algorithm improves pedestrian detection precision and recall by 8.24% and 2.82%, respectively, which verifies the effectiveness of the algorithm.
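
The abstract states that the involution operator is introduced to reduce the number of model parameters. As a rough illustration of why that can happen, the sketch below compares the weight count of a standard K x K convolution with that of an involution layer as formulated by Li et al. (CVPR 2021), where the kernel is generated by two 1x1 projections with reduction ratio r and shared across G channel groups. The channel width, kernel sizes, group count, and reduction ratio are illustrative assumptions, not values taken from this paper, and biases and normalization layers are ignored.

```python
def conv_params(channels: int, kernel_size: int) -> int:
    """Weights of a standard K x K convolution with C input and C output channels."""
    return kernel_size * kernel_size * channels * channels


def involution_params(channels: int, kernel_size: int, groups: int, reduction: int) -> int:
    """Weights of the involution kernel-generation branch.

    Two 1x1 projections are counted: C -> C/r, then C/r -> K*K*G.
    """
    hidden = channels // reduction
    reduce_weights = channels * hidden
    span_weights = hidden * kernel_size * kernel_size * groups
    return reduce_weights + span_weights


# Hypothetical layer sizes, chosen only for illustration (not from the paper).
C = 256
conv = conv_params(C, kernel_size=3)
invo = involution_params(C, kernel_size=7, groups=16, reduction=4)
print(f"3x3 convolution weights: {conv}")
print(f"7x7 involution weights:  {invo}")
print(f"ratio: {conv / invo:.2f}")
```

Under these assumed sizes, the involution layer uses a larger 7x7 spatial window yet needs fewer weights than the 3x3 convolution, because its weight count grows with C^2/r rather than with K^2 * C^2.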
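2022, 48(5): 108-115. Received: 2021-11-25; revised manuscript received: 2022-02-11
Funding: Natural Science Foundation of the Autonomous Region (202102401); Open Project of the Autonomous Region Key Laboratory (2021D04011)
About the author: DENG Jiatong (b. 1996), female, from Nanchong, Sichuan; master's student; research interests: pattern recognition and artificial intelligence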
