Result: Multi-scale pedestrian detection fusing multi-layer features (融合多层特征的多尺度行人检测).
Further information
Objective: Humans can fully understand a picture: they often classify different images and grasp all the information in each one, including the location and concept of each object. This task is called object detection and is one of the basic research areas in computer vision. Object detection consists of different subtasks, such as pedestrian detection and skeleton detection. Pedestrian detection is a key part of object detection and one of its more difficult tasks. This study mainly investigates pedestrian detection in traffic scenes, which is one of the most valuable topics in the field of pedestrian detection. Pedestrian detection in traffic scenes has long been a key technology for intelligent video surveillance, unmanned driving, intelligent transportation, and related applications. In recent years, this topic has been a research focus in academic and industrial circles. With the upsurge in artificial intelligence development, a large number of computer vision technologies have come into wide use. Multi-scale pedestrian detection has great research value because real applications of pedestrian detection involve complex scenes and different pedestrian scales. Pedestrian detection is widely used in autonomous driving and video surveillance, but current pedestrian detection algorithms based on deep learning suffer from false detections and missed detections when the resolution is low and the pedestrian scale is small. A multi-scale pedestrian detection algorithm based on multi-layer features is proposed. The proposed convolutional neural network improves the accuracy of pedestrian detection, and considerable progress has been made in practical applications. The academic enthusiasm brought about by deep learning has enabled scholars to make great progress and breakthroughs in pedestrian detection in complex scenes. Deep learning will be a major boost for pedestrian detection in the future.
Method: The deep residual network is mainly used in the multi-object classification field.
After analyzing the network, only the feature maps of three stages are used, and the residual unit and the fully connected layer of the last stage are deleted. The deep residual network thus serves mainly to extract the feature maps of the three stages. The feature map extracted by the last layer is enlarged by a factor of two with nearest-neighbor upsampling and then combined by element-wise addition. In this way, features rich in high-level semantic information are combined with features rich in low-level detail information to improve the detection effect. The merged three-layer features are fed into the region proposal network, and the proposal boxes containing pedestrians are obtained through softmax classification for pedestrian detection. In this work, four experiments are designed, three of which are used to verify the validity of the proposed method. Results are compared with those of mainstream algorithms. Comparative experiments indicate that simply detecting on separate layers does not improve the effect and that fusing multiple layers gives unsatisfactory results. Therefore, the method of adjacent-layer fusion is selected, and its multi-scale pedestrian detection result is compared directly with that of the deepest layer of the network. The effect of adjacent-layer fusion is better than that result. When all experimental results are compared, the fusion of adjacent layers gives the best results, and its miss rate is lower than that of the mainstream algorithms. The network is fully convolutional and is trained end to end through stochastic gradient descent and backpropagation. Each image contains a number of candidate boxes for positive and negative samples. However, directly optimizing over all samples easily biases the loss toward the negative samples because the number of negative samples is larger than that of positive samples. This study therefore selects 256 anchors from each image and calculates the loss on them, with a ratio of positive to negative samples of 1:1.
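The adjacent-layer fusion step described above (nearest-neighbor upsampling by a factor of two, followed by element-wise addition) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `upsample_nearest_2x` and `fuse_adjacent` are hypothetical, the feature maps are single-channel 2D lists, and the shallow map is assumed to be exactly twice the height and width of the deep map, with channel counts already matching.

```python
def upsample_nearest_2x(fmap):
    """Double the height and width of a 2D map by repeating each value in a 2x2 block."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the widened row vertically
    return out


def fuse_adjacent(shallow, deep):
    """Upsample the deeper (coarser) map 2x and add it element-wise to the shallower map.

    Assumption for this sketch: `shallow` is exactly twice the size of `deep`.
    """
    up = upsample_nearest_2x(deep)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shallow map must be exactly twice the size of the deep map")
    return [[s + d for s, d in zip(s_row, d_row)]
            for s_row, d_row in zip(shallow, up)]


# Toy example: a 2x2 "deep" map fused into a 4x4 "shallow" map.
deep = [[1.0, 2.0],
        [3.0, 4.0]]
shallow = [[0.5] * 4 for _ in range(4)]
fused = fuse_adjacent(shallow, deep)
for row in fused:
    print(row)
```

Each value of the coarse map is spread over a 2x2 block before the addition, so every position in the fused map combines the local detail of the shallow layer with the semantic response of the deeper layer that covers it.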
All new layers in the network are randomly initialized with weights drawn from a zero-mean Gaussian distribution with a standard deviation of 0.01. The other layers are initialized from a model pre-trained for classification, and the entire training process runs for two epochs.
Result: On the Caltech pedestrian detection dataset, under the condition that the false positives per image (FPPI) rate is 10%, the miss rate of the proposed algorithm is only 57.88%, which is 3.07 percentage points lower than the 60.95% of MS-CNN (multi-scale convolutional neural network), one of the best models. This work also conducts comparative experiments. The overall miss rate of Ped-RPN is 64.55%, which is worse than that of the proposed algorithm. The miss rate of the method that detects on separate layers (Ped-muti-RPN) is 77.15%, which is worse than that of the Ped-RPN method. Ped-fused-RPN is a detection algorithm that combines multiple layers. Its miss rate is 61.32%, which is better than that of Ped-RPN but worse than that of the proposed algorithm.
Conclusion: Small-scale pedestrians appear blurred in images, which makes their detection extremely poor and degrades overall multi-scale detection. To address the sharp decline in small-scale pedestrian detection, this paper proposes a method of integrating deep semantic information and shallow detail features so that the features of all scales have rich semantic information. The deep features carry high-level semantic information and have a large receptive field. The shallow features carry positional information and have a small receptive field. Fusing the two enhances the deep features, giving them richer target position information. The merged feature map has different levels of detail and semantic information and is effective for detecting pedestrians of different scales. [ABSTRACT FROM AUTHOR]
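The training details given in the Method section (256 anchors per image at a 1:1 positive-to-negative ratio, and zero-mean Gaussian initialization with a standard deviation of 0.01) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the names `sample_anchors` and `init_new_layer` are hypothetical, anchors are represented only by integer labels, the selection is assumed to be random, and filling the batch with extra negatives when fewer than 128 positives exist is an assumption that the abstract does not state.

```python
import random


def sample_anchors(labels, batch_size=256, seed=0):
    """Pick up to batch_size anchors with a target 1:1 positive/negative split.

    labels: list where 1 marks a positive anchor and 0 a negative anchor.
    Assumption: when positives are scarce, the remainder is filled with negatives.
    """
    rng = random.Random(seed)
    pos = [i for i, lab in enumerate(labels) if lab == 1]
    neg = [i for i, lab in enumerate(labels) if lab == 0]
    n_pos = min(len(pos), batch_size // 2)
    n_neg = min(len(neg), batch_size - n_pos)
    return rng.sample(pos, n_pos), rng.sample(neg, n_neg)


def init_new_layer(n_weights, std=0.01, seed=0):
    """Draw weights from a zero-mean Gaussian with the given standard deviation."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, std) for _ in range(n_weights)]


# Toy example: negatives greatly outnumber positives, as in the text.
labels = [1] * 300 + [0] * 5000
pos_idx, neg_idx = sample_anchors(labels)
weights = init_new_layer(1000)
print(len(pos_idx), len(neg_idx))
```

Balancing the sample keeps the far more numerous negative anchors from dominating the loss, which is the bias the text warns about.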
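Objective: Pedestrian detection is widely applied in autonomous driving and video surveillance and is an active research topic. To address the false detections and missed detections that current deep-learning-based pedestrian detection algorithms produce when the resolution is low and the pedestrian scale is small, a multi-scale pedestrian detection algorithm that fuses multi-layer features is proposed. Method: First, for the pedestrian detection task, part of the deep residual network is removed and only three stages of the network are used to extract feature maps. Then, the feature map extracted by the last layer is enlarged by a factor of two with nearest-neighbor upsampling and combined by addition, fusing features rich in high-level semantic information with features rich in low-level detail information. Finally, the three fused feature layers are each fed into the region proposal network, and softmax classification yields the candidate boxes containing pedestrians, thereby achieving pedestrian detection. Result: Experiments show that on the Caltech pedestrian detection dataset, at a false positives per image (FPPI) rate of 10%, the miss rate of the proposed algorithm is only 57.88%, which is 3.07 percentage points lower than the 60.95% of the multi-scale convolutional neural network model MS-CNN, one of the best models. Conclusion: Deep features carry high-level semantic information and have a large receptive field, whereas shallow features carry positional information and have a small receptive field. Fusing the two enhances the deep features, giving them richer target position information. The fused multi-layer feature maps contain different degrees of detail and semantic information and are effective for detecting pedestrians of different scales. Using the fused features for pedestrian detection therefore improves detection performance. [ABSTRACT FROM AUTHOR]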
Copyright of Journal of Image & Graphics is the property of Editorial Office of Journal of Image & Graphics and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)