Result: Reducing the Feature Distance between Real and Synthetic Data for Training Safeguards-Relevant Computer Vision Models.

Title:
Reducing the Feature Distance between Real and Synthetic Data for Training Safeguards-Relevant Computer Vision Models.
Source:
Journal of the Institute of Nuclear Materials Management. 2025 3rd Quarter, Vol. 52 Issue 3, p30-42. 13p.
Database:
Supplemental Index

Further information

Computer vision models hold significant promise as tools for supporting international nuclear safeguards verification activities. These models can assist safeguards inspectors and analysts in quickly interpreting visual scenes, identifying objects of interest, and detecting anomalies. However, example images of relevant safeguards objects are relatively scarce compared to more common items (e.g., cars, dogs, people), due to their commercial proprietary nature, proliferation concerns, and lack of prominence in the world. For example, there are only three nuclear fuel reprocessing facilities for light water reactor fuel in the world. These facilities represent technology that is highly safeguards relevant, but inherently more scarce than more common classes of images, thus making training computer vision models on these classes difficult. While synthetic images offer a potential solution to this issue, previous research has indicated that models trained exclusively on synthetic images and tested on real images often exhibit poor performance. In this paper, we discuss novel methods to identify and minimize the feature differences between real and synthetic images to enhance model performance. We extract activations from the second-to-last fully connected layer of a pre-trained VGG-16 model classifying both real and synthetic images and use a clustering algorithm to identify similarities in the feature space between the image sets. By using synthetic images that cluster with real images as training data to fine-tune a VGG-16 model and introducing sensor noise to the synthetic images, we improve the model's ability to identify a 48-type uranium hexafluoride container in real images when exclusively training the model on synthetic data and testing on real data. Additionally, filtering out synthetic images that do not occupy a minimum area of an image from the training set further improves performance. 
Finally, we employ intra-class variation (ICV) and inter-domain dissimilarity (IDD) as quantitative metrics for enhanced data augmentation, which flattens the loss curve for the real data, indicating improved alignment between the synthetic training data distribution and that of the real data. [ABSTRACT FROM AUTHOR]
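
The selection steps described in the abstract (clustering real and synthetic features together, keeping the synthetic images that share a cluster with real ones, and dropping images in which the object is too small) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the clustering algorithm or the area threshold, so k-means with farthest-point initialization and a 5% minimum area are assumptions made here, and all function names are hypothetical. Feature vectors are assumed to have been extracted already (in the paper, from the second-to-last fully connected layer of VGG-16); short 2-D lists stand in for them.

```python
import math


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start at points[0], then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, max_iters=100):
    """Plain k-means on lists of floats; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def select_synthetic_near_real(real_feats, synth_feats, k=2):
    """Cluster real and synthetic features jointly and return the indices of
    synthetic samples that land in a cluster containing at least one real sample."""
    labels = kmeans(real_feats + synth_feats, k)
    n_real = len(real_feats)
    real_clusters = set(labels[:n_real])
    return [i for i, lab in enumerate(labels[n_real:]) if lab in real_clusters]


def object_area_fraction(bbox, image_size):
    """Fraction of the image covered by the object's bounding box.
    bbox = (x_min, y_min, x_max, y_max); image_size = (width, height)."""
    x_min, y_min, x_max, y_max = bbox
    width, height = image_size
    return (x_max - x_min) * (y_max - y_min) / (width * height)


def occupies_min_area(bbox, image_size, min_fraction=0.05):
    """True if the object covers at least min_fraction of the image.
    The 0.05 default is a placeholder, not a value from the paper."""
    return object_area_fraction(bbox, image_size) >= min_fraction


# Toy stand-ins for extracted feature vectors.
real = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
synth = [[0.1, 0.1], [9.8, 10.0], [0.0, 0.3], [10.1, 9.9]]
keep = select_synthetic_near_real(real, synth, k=2)
```

In this toy example, `keep` holds the indices of the two synthetic vectors that lie near the real ones; the two distant vectors fall into a cluster with no real samples and are excluded. With real 4096-dimensional activations, an array library and a tuned number of clusters would be the more practical choice.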