Pedestrian detection algorithm based on improved YOLOv3

Authors

  • Meiqing Wang, Graduate School of Advanced Technology and Science, Tokushima University, Japan.
  • Stephen Karungaru, Graduate School of Advanced Technology and Science, Tokushima University, Japan.
  • Terada Kenji, Graduate School of Advanced Technology and Science, Tokushima University, Japan.

Keywords:

Pedestrian detection, YOLOv3, Deep learning, Attention mechanisms, Target detection

Abstract

Accurate, real-time detection of pedestrians within a region of interest is crucial for autonomous driving. This paper proposes an improved YOLOv3 model for pedestrian detection. A lightweight module that combines a residual network structure with the CBAM (Convolutional Block Attention Module) attention mechanism is added to the network to enhance its feature representation capability. Experimental results show that the improved YOLOv3 model raises detection accuracy by 4% compared with the original algorithm and also substantially improves precision, which supports the feasibility and effectiveness of the improved model for pedestrian detection.
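
To illustrate how a residual connection and CBAM-style attention can be combined, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say where the module sits in YOLOv3 or which hyperparameters are used, so the function names, the reduction ratio, the 3x3 spatial kernel, and the random weights are all assumptions chosen for illustration. The convolution layers that would normally precede the attention step inside a residual block are omitted.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x, w1, w2):
    """Return one weight in (0, 1) per channel of x, shaped [C][H][W]."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]

    def mlp(v):
        # Shared two-layer MLP: C -> C/r (ReLU) -> C
        hidden = [max(0.0, sum(w * a for w, a in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(x, kernel):
    """Return an [H][W] map in (0, 1); kernel is [2][k][k], zero-padded."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    k = len(kernel[0])
    pad = k // 2
    # Pool across channels, then convolve the stacked avg and max maps.
    avg = [[sum(x[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    mx = [[max(x[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    maps = [avg, mx]
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < H and 0 <= jj < W:
                            s += kernel[m][di][dj] * maps[m][ii][jj]
            out[i][j] = sigmoid(s)
    return out


def cbam_residual(x, w1, w2, kernel):
    """Channel attention, then spatial attention, added to an identity shortcut."""
    ca = channel_attention(x, w1, w2)
    refined = [[[v * ca[c] for v in row] for row in ch] for c, ch in enumerate(x)]
    sa = spatial_attention(refined, kernel)
    return [
        [[x[c][i][j] + refined[c][i][j] * sa[i][j] for j in range(len(sa[0]))]
         for i in range(len(sa))]
        for c in range(len(x))
    ]


def random_weights(channels, reduction=2, k=3, seed=0):
    """Hypothetical helper: untrained weights, for demonstration only."""
    rng = random.Random(seed)
    hidden = max(1, channels // reduction)
    w1 = [[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(channels)]
    kernel = [[[rng.uniform(-1, 1) for _ in range(k)] for _ in range(k)]
              for _ in range(2)]
    return w1, w2, kernel
```

Calling cbam_residual on a feature map of shape [C][H][W] returns a map of the same shape. Because both attention maps lie strictly between 0 and 1, each non-negative input value is passed through by the shortcut and increased by at most its own magnitude, so attention reweights features without discarding the original signal.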

Published

2024-05-30

How to Cite

Meiqing Wang, Stephen Karungaru, & Terada Kenji. (2024). Pedestrian detection algorithm based on improved YOLOv3. JOURNAL OF ADVANCED APPLIED SCIENTIFIC RESEARCH, 6(3). Retrieved from http://mail.joaasr.com/index.php/joaasr/article/view/946