Result: A lightweight algorithm for ceramic tile surface detection based on improved YOLO11.
[Background and purposes] As the fourth industrial revolution progresses, China's manufacturing industry will gradually realize intelligent production, intelligent detection and intelligent logistics. China is one of the world's largest producers, consumers and exporters of ceramic tiles. However, ceramic tile surface defect detection still relies on manual visual inspection, which results in low detection efficiency and high costs and makes intelligent inspection difficult to achieve. With the development of deep learning, object detection technology based on deep learning is expected to enable intelligent tile inspection and further improve tile production efficiency. In this context, this paper explores an improved YOLO11 algorithm for tile surface defect detection that makes the model lighter while improving detection performance. [Methods] The original dataset used in this study was collected from the internet and consists of 3,593 images with a resolution of 8192×6000 pixels and 1,975 images with a resolution of 4096×3500 pixels. Because the original images have high resolution and small defect targets make up a large proportion of the dataset, a sliding window slicing approach was adopted: the original images were cropped into smaller images of 640×640 pixels with a 25% overlap rate, which effectively alleviated the training difficulties caused by high-resolution data. After preprocessing, a ceramic tile surface defect dataset containing a total of 20,736 images was constructed and split into training and validation sets at a ratio of 9:1. Four improvement strategies were adopted, based on the YOLO11n model. 
(1) Considering the high proportion of small defects in the dataset and the importance of low-level detail information for feature extraction of small objects, a Fusion module was proposed and integrated into the backbone network. By fusing low-level feature maps during high-level feature extraction, it enhances the interaction between abstract high-level semantic features and fine-grained low-level detail features, thus improving the feature extraction ability of the model. (2) In the downsampling process of the network, the MDown module partially replaced standard 3×3 convolutions. The feature map was split into two pathways, processed by convolution and max pooling respectively, thereby achieving a lightweight downsampling architecture. (3) To address the redundancy of similar features in feature maps generated by standard convolutions, the Ghost module was introduced to optimize the network architecture: the intrinsic features were first generated via standard convolution, and more features were then produced through simple linear operations. (4) To address the challenges posed by complex texture patterns on tile surfaces, the Efficient Multi-scale Attention (EMA) mechanism, a hybrid attention integrating both channel and spatial dimensions, was introduced to enhance the model's adaptability to complex environments, thereby improving its generalization capability. [Results] The experiments were conducted in a virtual environment deployed on the Hengyuan Cloud Platform, utilizing computational resources including a 16-core AMD EPYC 7J13 CPU and an NVIDIA RTX 4090D GPU. The system ran Ubuntu 22.04 LTS, with Python 3.11 as the programming language, PyTorch 2.4.0 as the deep learning framework, and the CUDA 12.1.1 toolkit for GPU acceleration. The training configuration was set with the following parameters: 500 epochs, batch size of 64, initial learning rate 0.01, final learning rate 0.01, SGD optimizer, and momentum 0.937. 
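As a rough illustration of why strategy (3) reduces model size, the parameter count of a Ghost-style layer can be compared with that of a standard convolution. This is a minimal sketch, not the authors' configuration: the function names, the ratio of 2, the 3×3 depthwise "cheap" operation, and the channel sizes are all assumptions chosen for illustration, and biases and normalization layers are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int,
                 ratio: int = 2, cheap_k: int = 3) -> int:
    """Weights of a Ghost-style layer (hypothetical settings).

    A standard convolution produces c_out / ratio intrinsic feature maps;
    a depthwise cheap_k x cheap_k operation then derives (ratio - 1)
    additional "ghost" maps from each intrinsic map.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


# Illustrative channel sizes only; they are not taken from the paper.
standard = conv_params(128, 256, 3)
ghost = ghost_params(128, 256, 3)
print(standard, ghost, round(standard / ghost, 2))
```

With these assumed sizes the Ghost-style layer needs roughly half the weights of the standard convolution, because only the intrinsic maps are produced by a full convolution.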
The optimized model improved significantly on the baseline YOLO11n. Specifically, it reduced the parameter count by 31% and FLOPs by 26%, while improving precision by 5.2%, recall by 5.7%, mAP@0.5 by 3.9%, and mAP@0.5:0.95 by 5.6%. Furthermore, the improved model achieved 82.8% mAP@0.5, outperforming YOLO11s by 0.5%. [Conclusions] The improved YOLO11 model improved object detection performance while maintaining low computational complexity, thus providing a lightweight detection algorithm for reference in tile surface defect detection tasks. [ABSTRACT FROM AUTHOR]
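The sliding window slicing described in the Methods (640×640 crops with a 25% overlap rate) can be sketched as follows. The abstract does not say how image borders were handled or which crops were kept, so the border rule here (shifting a final window back so that it ends at the image edge) and the function names are assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def window_starts(length: int, tile: int, overlap: float) -> List[int]:
    """Start offsets of windows along one axis, covering the full length."""
    if length <= tile:
        return [0]
    stride = int(tile * (1.0 - overlap))  # 640 px at 25% overlap -> 480 px
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        # Assumption: add one last window flush with the border so that
        # no pixels at the right or bottom edge are left uncovered.
        starts.append(length - tile)
    return starts


def slice_boxes(width: int, height: int, tile: int = 640,
                overlap: float = 0.25) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (left, top, right, bottom) crop boxes in row-major order."""
    for top in window_starts(height, tile, overlap):
        for left in window_starts(width, tile, overlap):
            yield (left, top, left + tile, top + tile)


boxes = list(slice_boxes(8192, 6000))
print(len(boxes), boxes[0], boxes[-1])
```

Each box can be passed to an image library's crop routine, and defect annotations would need to be shifted by the box's `left` and `top` offsets. Under this border rule the sketch enumerates every candidate window; any filtering of windows before training is not described in the abstract and is not modeled here.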
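Ceramic tile surface defect detection relies on manual inspection and suffers from low efficiency, highly variable results, and high cost. To address these problems, this study proposes a lightweight algorithm for tile surface detection based on an improved YOLO11. First, because small-target defects make up a large share of the data, a multi-feature fusion backbone network is proposed: a Fusion module fuses low-level feature maps during feature extraction at B3, B4, and B5, enabling interaction between semantic and detail information, which strengthens the network's feature extraction capability and also alleviates the vanishing-gradient problem to some extent. Next, the lightweight MDown module partially replaces standard convolutions in the network's downsampling process, and the Ghost module is introduced to make the whole network lighter still. Finally, the EMA attention mechanism is introduced at the junction of the backbone and neck networks to fuse spatial and channel features into a multi-scale feature representation, enhancing the model's adaptability to complex scenes. In experiments on the sliced tile surface defect dataset, the improved model raises mAP@0.5 and mAP@0.5:0.95 by 3.9% and 5.6%, respectively, over YOLO11n, while reducing the parameter count and computational cost by about 31% and 26%; compared with YOLO11s, the improved model based on YOLO11n raises mAP@0.5 by 0.5%. [ABSTRACT FROM AUTHOR]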
Copyright of Journal of Ceramics / Taoci Xuebao is the property of Journal of Ceramics Editorial Office and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)