Result: 基于改进 YOLOv12 的遮挡环境下肉牛目标检测方法.

Title:
基于改进 YOLOv12 的遮挡环境下肉牛目标检测方法. (Chinese)
Alternate Title:
Beef Cattle Object Detection Method Under Occlusion Environment Based on Improved YOLOv12. (English)
Authors:
Source:
Smart Agriculture; Sep2025, Vol. 7 Issue 5, p182-192, 11p
Database:
Complementary Index

Further Information

[Objective] With the rapid development of intelligent agriculture, computer vision-based livestock detection has become increasingly important in modern farm management. Among livestock species, beef cattle play a crucial role in the animal husbandry industry worldwide. Accurate detection and counting of beef cattle are essential for improving breeding efficiency, monitoring animal health, and supporting the distribution of government subsidies. However, in real-world farming environments, cattle often gather and move closely together, leading to frequent occlusions. These occlusions significantly degrade the performance of traditional object detection algorithms, resulting in missed detections, false positives, and poor robustness. Manual counting is labor-intensive, error-prone, and inefficient, while existing deep learning-based detection models still struggle with occlusion because of limited feature extraction capabilities and insufficient use of global contextual information. To address these challenges, an improved object detection algorithm named YOLOv12s-ASR, based on the YOLOv12s framework, was proposed in this research. The goal was to enhance detection accuracy and real-time performance under complex occlusion conditions, providing a reliable technical solution for intelligent beef cattle monitoring. [Methods] The proposed YOLOv12s-ASR algorithm introduced three key improvements to the baseline YOLOv12s model. First, some of the standard convolution layers were replaced with a modifiable kernel convolution module (AKConv). Unlike traditional convolutions with fixed kernel shapes, AKConv could dynamically adjust the shape and size of the convolution kernel according to the input image content. This flexibility allowed the model to better capture local features of occluded cattle, especially when only part of the body was visible. Second, a self-ensembling attention mechanism (SEAM) was integrated into the Neck structure. 
SEAM combined spatial and channel attention through depthwise separable convolutions and consistency regularization, enabling the model to learn more robust and discriminative features. It enhanced the model's ability to perceive global contextual information, which was crucial for inferring the presence and location of occluded targets. Third, a repulsion loss function was introduced to supplement the original loss. This loss function included two components: RepGT, which pushed each predicted box away from nearby ground-truth boxes other than its own target, and RepBox, which encouraged separation between predicted boxes belonging to different targets. By reducing the overlap between adjacent predictions, the repulsion loss helped mitigate the negative effects of non-maximum suppression (NMS) in crowded scenes, thereby improving localization accuracy and reducing missed detections. The overall architecture maintained the lightweight design of YOLOv12s, ensuring that the model remained suitable for deployment on edge devices with limited computational resources. Extensive experiments were conducted on a self-constructed beef cattle dataset containing 2,458 images collected from 13 individual farms in Ningxia, China. The images were captured using surveillance cameras during daytime hours and included various occlusion scenarios. The dataset was divided into training, validation, and test sets in a 7:2:1 ratio, with annotations carefully reviewed by multiple experts to ensure accuracy. [Results and Discussions] The proposed YOLOv12s-ASR algorithm achieved a mean average precision (mAP) of 89.3% on the test set, outperforming the baseline YOLOv12s by 1.3 percentage points. The model size was only 8.5 MB, and the detection speed reached 136.7 frames per second, demonstrating a good balance between accuracy and efficiency. Ablation studies confirmed the effectiveness of each component: added individually, AKConv improved mAP by 0.6 percentage points, SEAM by 1.0 percentage points, and repulsion loss by 0.6 percentage points. 
When all three modules were combined, the mAP increased by 1.3 percentage points, validating their complementary roles. Furthermore, the algorithm was evaluated under three occlusion levels: slight, moderate, and severe. Compared with YOLOv12s, YOLOv12s-ASR improved mAP by 4.4, 2.9, and 4.4 percentage points, respectively, showing strong robustness across varying occlusion conditions. Comparative experiments with nine mainstream detection algorithms, including Faster R-CNN, SSD, Mask R-CNN, and various YOLO versions, further demonstrated the superiority of YOLOv12s-ASR. It achieved the highest mAP while maintaining a compact model size and fast inference speed, making it particularly suitable for real-time applications in resource-constrained environments. Visualization results also showed that YOLOv12s-ASR could detect and localize cattle more accurately in crowded and occluded scenes, with fewer false positives and missed detections. [Conclusions] Experimental results show that YOLOv12s-ASR achieves state-of-the-art performance on the self-built beef cattle dataset, with high detection accuracy, fast processing speed, and a lightweight model size. These advantages make it well suited for practical applications such as automated cattle counting, behavior monitoring, and intelligent farm management. Future work will focus on further enhancing the model's generalization ability in more complex environments and extending its application to multi-object tracking and behavior analysis tasks. [ABSTRACT FROM AUTHOR]
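
The two repulsion terms named in the Methods section can be illustrated with a minimal sketch. It follows the commonly published formulation of repulsion loss (an intersection-over-ground-truth penalty for RepGT, an IoU penalty for RepBox, both passed through a smoothed logarithm) and is not taken from the authors' implementation. The function names, the `(x1, y1, x2, y2)` box layout, the use of the predicted box to select the repelling ground truth, and the default `sigma=0.5` are all assumptions made for illustration.

```python
import math

# Boxes are (x1, y1, x2, y2) tuples; this layout is an assumption for the sketch.

def area(b):
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def intersection(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def iou(a, b):
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def iog(pred, gt):
    # Intersection over ground-truth area: cannot be reduced by enlarging pred.
    g = area(gt)
    return intersection(pred, gt) / g if g > 0 else 0.0

def smooth_ln(x, sigma=0.5):
    # -ln(1 - x) for small overlaps, linear beyond sigma to limit outliers.
    x = min(x, 1.0 - 1e-6)  # guard against log(0) for full overlap
    if x <= sigma:
        return -math.log(1.0 - x)
    return (x - sigma) / (1.0 - sigma) - math.log(1.0 - sigma)

def rep_gt(preds, gts, assigned, sigma=0.5):
    # For each prediction, penalize overlap with the most-overlapping
    # ground-truth box that is NOT its assigned target.
    total = 0.0
    for p, k in zip(preds, assigned):
        others = [g for j, g in enumerate(gts) if j != k]
        if not others:
            continue
        repelling = max(others, key=lambda g: iou(p, g))
        total += smooth_ln(iog(p, repelling), sigma)
    return total / max(len(preds), 1)

def rep_box(preds, assigned, sigma=0.5):
    # Penalize overlap between predictions assigned to different targets,
    # averaged over the pairs that actually overlap.
    total, overlapping = 0.0, 0
    for i in range(len(preds)):
        for j in range(i + 1, len(preds)):
            if assigned[i] == assigned[j]:
                continue
            v = iou(preds[i], preds[j])
            if v > 0:
                total += smooth_ln(v, sigma)
                overlapping += 1
    return total / max(overlapping, 1)

if __name__ == "__main__":
    # Two adjacent (hypothetical) cattle boxes and one prediction for each.
    gts = [(0, 0, 10, 10), (8, 0, 18, 10)]
    preds = [(1, 0, 11, 10), (7, 0, 17, 10)]
    assigned = [0, 1]
    print("RepGT :", rep_gt(preds, gts, assigned))
    print("RepBox:", rep_box(preds, assigned))
```

In a full detector these terms would be weighted and added to the existing box regression loss. Both terms grow as a prediction drifts toward a neighbouring animal, which is the behaviour the abstract credits with making the detections less sensitive to NMS in crowded scenes.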

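[Objective/Significance] Mutual occlusion among beef cattle makes it difficult to obtain key feature information effectively, which limits detection accuracy. To address this problem, a beef cattle object detection algorithm, YOLOv12s-ASR (YOLOv12s-AKConv SEAM Repulsion), is proposed. [Methods] First, modifiable kernel convolution replaces some of the standard convolutions to fully capture the local features of occluded parts. Then, a self-ensembling attention mechanism is integrated, combining spatial attention with a feature enhancement mechanism to fully capture global contextual information. Finally, a repulsion loss function is introduced to supplement the original loss function, reducing the missed or false detections caused by an improperly chosen non-maximum suppression threshold and improving the detection accuracy of the model. [Results and Discussions] On the self-built beef cattle dataset, the YOLOv12s-ASR algorithm achieves a mean average precision of 89.3%, which is 1.3 percentage points higher than the YOLOv12s algorithm and better than other mainstream object detection methods. The model parameter size is only 8.5 MB, and the detection speed reaches 136.7 FPS. [Conclusions] The improved algorithm YOLOv12s-ASR proposed in this study can detect beef cattle targets accurately and in real time. [ABSTRACT FROM AUTHOR]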

Copyright of Smart Agriculture is the property of Smart Agriculture Editorial Office and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)