An object detection algorithm based on feature enhancement and adaptive threshold non-maximum suppression

MENG Weijun; AN Wen; MA Sugang; YANG Xiaobao

doi:10.13700/j.bh.1001-5965.2023.0534

Volume 51 Issue 7

Jul. 2025

Turn off MathJax

Article Contents

Journal of Beijing University of Aeronautics and Astronautics > 2025 > 51(7): 2349-2359.

MENG W J，AN W，MA S G，et al. An object detection algorithm based on feature enhancement and adaptive threshold non-maximum suppression[J]. Journal of Beijing University of Aeronautics and Astronautics，2025，51（7）：2349-2359 （in Chinese） doi: 10.13700/j.bh.1001-5965.2023.0534

Citation:

PDF( 2590 KB)

An object detection algorithm based on feature enhancement and adaptive threshold non-maximum suppression

doi: 10.13700/j.bh.1001-5965.2023.0534

MENG Weijun^{1
,
,},
AN Wen¹,
MA Sugang^{1, 2},
YANG Xiaobao^{1, 3}

1.
School of Computer Science and Technology，Xi’an University of Posts and Telecommunications，Xi’an 710121，China
2.
Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing，Xi’an 710121，China
3.
Xi’an Key Laboratory of Big Data and Intelligent Computing，Xi’an 710121，China

Funds:

National Natural Science Foundation of China (62072370)；Natural Science Foundation of Shaanxi Province (2023-JC-YB-598)

More Information

Corresponding author: E-mail：mengweijun@xupt.edu.cn
Received Date: 24 Aug 2023
Accepted Date: 13 Oct 2023

Available Online: 17 Nov 2023

Publish Date: 14 Nov 2023

Abstract

Abstract

To further solve the problems of object omission and repeated detection and improve the accuracy of object detection, this paper proposes an object detection algorithm based on feature enhancement and adaptive threshold non-maximum suppression(NMS). The attention-guided multi-scale context module(AMCM) is applied to the neck of the detector. Based on improving the semantic information of features by dilated convolution, the cross-channel location information is captured by the attention mechanism, so as to enhance the feature expression ability of the network. The dynamic suppression threshold is adaptively applied to instances of the scenes through the adaptive density threshold of NMS（ADT-NMS）, which lowers the false detection rate for objects. In comparison to the baseline algorithm YOLOv4, the suggested approach’s false detection rate on the PASCAL VOC dataset is 13.7%, a 1% decrease. The recall rate and detection accuracy increase by 0.9% and 1.7%, respectively, to 96.6% and 83.7%. The false detection rate of the proposed algorithm on the KITTI dataset achieves 22.1%, reduced by 1.3%. The detection accuracy and recall rate achieved 83.6% and 91.8%, improved by 1.8% and 2.3%, respectively. The experimental results show that the algorithm can better solve the problems of object omission and repeated detection.
- adaptive threshold,
- non-maximum suppression,
- object detection,
- dilated convolution,
- attention mechanism

FullText(HTML)

References(34)

References

[1]	GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2014: 580-587.
[2]	REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 779-788.
[3]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[EB/OL]. (2017-06-12)[2023-02-01]. http://arxiv.org/abs/1706.03762.
[4]	ZHU X Z, SU W J, LU L W, et al. Deformable DETR: deformable transformers for end-to-end object detection[EB/OL]. (2021-03-18)[2023-02-01]. http://arxiv.org/abs/2010.04159.
[5]	NEUBECK A, VAN GOOL L. Efficient non-maximum suppression[C]//Proceedings of the 18th International Conference on Pattern Recognition. Piscataway: IEEE Press, 2006: 850-855.
[6]	GONG Y Q, YU X H, DING Y, et al. Effective fusion factor in FPN for tiny object detection[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2021: 1159-1167.
[7]	LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8759-8768.
[8]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 936-944.
[9]	SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 2818-2826.
[10]	许腾, 唐贵进, 刘清萍, 等. 基于空洞卷积和Focal Loss的改进YOLOv3算法[J]. 南京邮电大学学报(自然科学版), 2020, 40(6): 100-108. XU T, TANG G J, LIU Q P, et al. Improved YOLOv3 based on dilated convolution and focal loss[J]. Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition), 2020, 40(6): 100-108(in Chinese).
[11]	王囡, 侯志强, 蒲磊, 等. 空洞可分离卷积和注意力机制的实时语义分割[J]. 中国图象图形学报, 2022, 27(4): 1216-1225. doi: 10.11834/jig.200729 WANG N, HOU Z Q, PU L, et al. Real-time semantic segmentation analysis based on cavity separable convolution and attention mechanism[J]. Journal of Image and Graphics, 2022, 27(4): 1216-1225(in Chinese). doi: 10.11834/jig.200729
[12]	肖进胜, 张舒豪, 陈云华, 等. 双向特征融合与特征选择的遥感影像目标检测[J]. 电子学报, 2022, 50(2): 267-272. doi: 10.12263/DZXB.20210354 XIAO J S, ZHANG S H, CHEN Y H, et al. Remote sensing image object detection based on bidirectional feature fusion and feature selection[J]. Acta Electronica Sinica, 2022, 50(2): 267-272(in Chinese). doi: 10.12263/DZXB.20210354
[13]	谢学立, 李传祥, 杨小冈, 等. 基于动态感受野的航拍图像目标检测算法[J]. 光学学报, 2020, 40(4): 0415001. doi: 10.3788/AOS202040.0415001 XIE X L, LI C X, YANG X G, et al. Dynamic receptive field-based object detection in aerial imaging[J]. Acta Optica Sinica, 2020, 40(4): 0415001(in Chinese). doi: 10.3788/AOS202040.0415001
[14]	BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS: improving object detection with one line of code[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 5562-5570.
[15]	侯志强, 刘晓义, 余旺盛, 等. 基于双阈值-非极大值抑制的Faster R-CNN改进算法[J]. 光电工程, 2019, 46(12): 190159. HOU Z Q, LIU X Y, YU W S, et al. Improved algorithm of Faster R-CNN based on double threshold-non-maximum suppression[J]. Opto-Electronic Engineering, 2019, 46(12): 190159(in Chinese).
[16]	JIANG B R, LUO R X, MAO J Y, et al. Acquisition of localization confidence for accurate object detection[M]// Computer Vision——ECCV 2018. Berlin: Springer, 2018: 816-832.
[17]	HENDERSON P, FERRARI V. End-to-end training of object class detectors for mean average precision[M]//Computer Vision——ACCV 2016. Berlin: Springer, 2017: 198-213.
[18]	BOCHKOVSKIY A, WANG C Y, LIAO H Y M. Yolov4: optimal speed and accuracy of object detection[EB/OL]. (2020-04-23)[ 2023-02-01]. http://arxiv.org/abs/2004.10934.
[19]	HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[20]	HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1904-1916. doi: 10.1109/TPAMI.2015.2389824
[21]	ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020, 34(7): 12993-13000.
[22]	LIU W, LIAO S C, REN W Q, et al. High-level semantic feature detection: a new perspective for pedestrian detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5182-5191.
[23]	CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848. doi: 10.1109/TPAMI.2017.2699184
[24]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
[25]	REN S Q, HE K M, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
[26]	DAI J F, LI Y, HE K M, et al. R-FCN: object detection via region-based fully convolutional networks[EB/OL]. (2016-05-20)[2023-02-01]. http://arxiv.org/abs/1605.06409.
[27]	ZHU Y S, ZHAO C Y, WANG J Q, et al. CoupleNet: coupling global structure with local parts for object detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 4146-4154.
[28]	LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[M]//Computer Vision——ECCV 2016. Berlin: Springer, 2016: 21-37.
[29]	REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2023-02-01]. http://arxiv.org/abs/1804.02767.
[30]	ZHANG S F, WEN L Y, BIAN X, et al. Single-shot refinement neural network for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 4203-4212.
[31]	TIAN Z, SHEN C H, CHEN H, et al. FCOS: fully convolutional one-stage object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9626-9635.
[32]	DUAN K W, BAI S, XIE L X, et al. CenterNet: keypoint triplets for object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6568-6577.
[33]	ZHOU X Y, WANG D Q, KRÄHENBÜHL P. Objects as points[EB/OL]. (2019-04-25)[2023-02-01]. http://arxiv.org/abs/1904.07850.
[34]	高扬, 安雯. 基于可变空间感知的目标检测算法[J]. 现代电子技术, 2023, 46(12): 91-95. GAO Y, AN W. Object detection algorithm based on variable spatial perception[J]. Modern Electronics Technique, 2023, 46(12): 91-95(in Chinese).

Relative Articles

Supplements(0)

Cited By

Proportional views

Proportional views

通讯作者: 陈斌, bchen63@163.com

1.
沈阳化工大学材料科学与工程学院沈阳 110142

Figures(8) / Tables(7)

Get Citation

PDF

XML

Article Metrics

Article views(678) PDF downloads(10)

An object detection algorithm based on feature enhancement and adaptive threshold non-maximum suppression

doi: 10.13700/j.bh.1001-5965.2023.0534

Abstract

References

Proportional views

Catalog

通讯作者: 陈斌, bchen63@163.com

Article Metrics

Proportional views

Related

An object detection algorithm based on feature enhancement and adaptive threshold non-maximum suppression

doi: 10.13700/j.bh.1001-5965.2023.0534

Abstract

References

Proportional views

Catalog

通讯作者: 陈斌, bchen63@163.com

Article Metrics

Proportional views

Related

Export File

Citation

Format

Content