Research Article | Peer-Reviewed

Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm

Received: 13 November 2023    Accepted: 1 December 2023    Published: 28 December 2023
Abstract

Convolutional Neural Networks (CNNs) are widely used for object detection in computer vision because of their simplicity and efficiency. The effectiveness of CNN-based object detection depends significantly on the choice of loss function, and localization precision is a critical determinant. To improve localization accuracy, we modified the CIoU loss function, resulting in a new loss function called Area-CIoU (ACIoU). The new loss accounts for the alignment between predicted and ground-truth bounding boxes by combining the relationship between the aspect ratio and the area of both boxes. When both bounding boxes have the same aspect ratio, it also takes into account how the prediction box may affect localization accuracy. As a result, the penalty function is strengthened, which improves the network model's localization precision. Experimental results on a custom dataset covering the classes car, person, motorcycle, truck and bus confirm that ACIoU improves the localization accuracy of network models, as demonstrated with the one-stage object detector YOLOv4. Experiments also show that the network's accuracy improved while its frame rate (FPS) dropped because of the new penalty term in the loss function. We achieved an AP of 88.48% and an average recall of 86.37% at 41 frames per second.
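The abstract does not give the ACIoU formula, so the following is only a minimal sketch of the standard CIoU loss that the work modifies, not the authors' method. It illustrates the case the abstract singles out: when the predicted and ground-truth boxes share an aspect ratio, CIoU's aspect-ratio term is zero regardless of how much their areas differ. The function name `ciou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
import math

def ciou_loss(pred, gt):
    """Standard CIoU loss for two axis-aligned boxes (x1, y1, x2, y2).

    Illustrative sketch; assumes both boxes have positive width and height.
    Returns (loss, iou, v) so the individual terms can be inspected.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    iou = inter / (pw * ph + gw * gh - inter)

    # Squared distance between box centres
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v, iou, v

if __name__ == "__main__":
    # Same centre and 2:1 aspect ratio, but a quarter of the area
    print(ciou_loss((1.0, 0.5, 3.0, 1.5), (0.0, 0.0, 4.0, 2.0)))
```

In this example the aspect-ratio term `v` is 0 and the centre-distance term is 0, so the loss reduces to 1 − IoU = 0.75: the area mismatch is penalized only through IoU. That gap is what an area-aware penalty such as the one described above would target.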

Published in Mathematics and Computer Science (Volume 8, Issue 5)
DOI 10.11648/j.mcs.20230805.11
Page(s) 104-111
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Object Detection, Loss Function, Real-Time, YOLOv4

Cite This Article
  • APA Style

    Saleem, M., Sheikh, N., Rehman, A., Rafiq, M., Jahan, S. (2023). Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm. Mathematics and Computer Science, 8(5), 104-111. https://doi.org/10.11648/j.mcs.20230805.11


    ACS Style

    Saleem, M.; Sheikh, N.; Rehman, A.; Rafiq, M.; Jahan, S. Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm. Math. Comput. Sci. 2023, 8(5), 104-111. doi: 10.11648/j.mcs.20230805.11


    AMA Style

    Saleem M, Sheikh N, Rehman A, Rafiq M, Jahan S. Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm. Math Comput Sci. 2023;8(5):104-111. doi: 10.11648/j.mcs.20230805.11


  • @article{10.11648/j.mcs.20230805.11,
      author = {Muhammad Saleem and Naveed Sheikh and Abdul Rehman and Muhammad Rafiq and Shah Jahan},
      title = {Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm},
      journal = {Mathematics and Computer Science},
      volume = {8},
      number = {5},
      pages = {104-111},
      doi = {10.11648/j.mcs.20230805.11},
      url = {https://doi.org/10.11648/j.mcs.20230805.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.mcs.20230805.11},
     year = {2023}
    }
    


  • TY  - JOUR
    T1  - Real-Time Object Identification Through Convolution Neural Network Based on YOLO Algorithm
    AU  - Muhammad Saleem
    AU  - Naveed Sheikh
    AU  - Abdul Rehman
    AU  - Muhammad Rafiq
    AU  - Shah Jahan
    Y1  - 2023/12/28
    PY  - 2023
    N1  - https://doi.org/10.11648/j.mcs.20230805.11
    DO  - 10.11648/j.mcs.20230805.11
    T2  - Mathematics and Computer Science
    JF  - Mathematics and Computer Science
    JO  - Mathematics and Computer Science
    SP  - 104
    EP  - 111
    PB  - Science Publishing Group
    SN  - 2575-6028
    UR  - https://doi.org/10.11648/j.mcs.20230805.11
    VL  - 8
    IS  - 5
    ER  - 


Author Information
  • Muhammad Saleem, Naveed Sheikh, Abdul Rehman, Muhammad Rafiq, Shah Jahan: Department of Mathematics, University of Balochistan, Quetta, Pakistan
