Research on improved image classification algorithm based on Darknet53 model. (English) Zbl 07929842

Liang, Qilian (ed.) et al., Communications, signal processing, and systems. Proceedings of the 11th international conference, virtual, July 23–24, 2022. Vol. 3. Cham: Springer. Lect. Notes Electr. Eng. 874, 73-80 (2023).
Summary: This paper proposes an improved image classification algorithm based on the Darknet53 model. It addresses the problem that, during backpropagation, a large amount of identical gradient information is reused to update the weights of different dense layers in the Darknet53 network. First, the residual block in the original network is replaced by the CSP module, following the idea in CSPNet of truncating the gradient flow to avoid excessive duplicated gradient information. Second, the Leaky ReLU activation function is replaced by the Mish activation function, which propagates information more smoothly and achieves better accuracy and generalization. Finally, an SPP module is added to the end of the original network structure to address the multi-scale problem of the main object in image classification. The experimental results show that the improved Darknet performs better: its accuracy is 1.3% higher than that of the original network, and it also classifies images better than ResNet50 and ResNet101.
For the entire collection see [Zbl 1537.94008].
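
The activation swap described in the summary can be illustrated with a minimal sketch. It uses the standard definition of Mish, x * tanh(softplus(x)), and compares it with Leaky ReLU. The function names and the negative slope of 0.1 are assumptions made for illustration; the record does not state the paper's actual settings or implementation.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable ln(1 + e^x): avoids overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Standard Mish definition: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def leaky_relu(x: float, negative_slope: float = 0.1) -> float:
    # negative_slope=0.1 is an assumed value, not taken from the paper.
    return x if x >= 0 else negative_slope * x


if __name__ == "__main__":
    # Print both activations side by side for a few sample inputs.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x={x:+.1f}  leaky_relu={leaky_relu(x):+.4f}  mish={mish(x):+.4f}")
```

Unlike Leaky ReLU, which has a kink at zero, Mish is smooth everywhere and non-monotonic for negative inputs: it dips below zero and then returns toward zero as the input becomes more negative.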

MSC:

94A08 Image processing (compression, reconstruction, etc.) in information and communication theory
68U10 Computing methodologies for image processing
Full Text: DOI

References:

[1] Ouyang, W.; Zeng, X.; Wang, X., DeepID-Net: object detection with deformable part based convolutional neural networks, IEEE Trans. Pattern Anal. Mach. Intell., 39, 7, 1320-1334 (2017) · doi:10.1109/TPAMI.2016.2587642
[2] Hu, G., Yang, Y.X., Yi, D., et al.: When face recognition meets with deep learning: an evaluation of convolutional neural networks for face recognition. In: International Conference on Computer Vision, Santiago, Chile, 11-18 December 2015, pp. 142-150. IEEE Press, Piscataway (2015)
[3] Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: HigherHRNet: scale-aware representation learning for bottom-up human pose estimation. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5385-5394 (2020). doi:10.1109/CVPR42600.2020.00543
[4] Lecun, Y.; Bottou, L.; Bengio, Y., Gradient-based learning applied to document recognition, Proc. IEEE, 86, 11, 2278-2324 (1998) · doi:10.1109/5.726791
[5] Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems, Lake Tahoe, Nevada, 3-6 December 2012, pp. 1097-1105. Curran Associates Inc., Red Hook (2012)
[6] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations, San Diego, CA, 7-9 May 2015 (2015). arXiv:1409.1556v6 [cs.CV]
[7] He, K.M., Zhang, X.Y., Ren, S.Q., et al.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, Nevada, 27-30 June 2016. IEEE Computer Society, Los Alamitos (2016)
[8] Szegedy, C., Liu, W., Jia, Y.Q., et al.: Going deeper with convolutions. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7-12 July 2015, pp. 1-9. IEEE Press, Piscataway (2015)
[9] Huang, G., Liu, Z., Maaten, L.V.D., et al.: Densely connected convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, 21-26 July 2017, pp. 2261-2269. IEEE Press, Piscataway (2017)
[10] Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
[11] Misra, D.: Mish: a self regularized non-monotonic neural activation function (2019)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.