Result: Data Quality, Semantics, and Classification Features: Assessment and Optimization of Supervised ML-AI Classification Approaches for Historical Heritage.

Title:
Data Quality, Semantics, and Classification Features: Assessment and Optimization of Supervised ML-AI Classification Approaches for Historical Heritage.
Authors:
Cera, Valeria [1] (AUTHOR), valeria.cera@unina.it; Antuono, Giuseppe [2] (AUTHOR); Campi, Massimiliano [1] (AUTHOR); D'Agostino, Pierpaolo [2] (AUTHOR)
Source:
Heritage (2571-9408). Jul 2025, Vol. 8, Issue 7, p. 265. 33 pp.
Database:
Academic Search Index

Further Information

In recent years, automatic segmentation and classification of data from digital surveys have taken on a central role in built heritage studies. However, applying Machine Learning and Deep Learning (ML and DL) techniques to the semantic segmentation of point clouds is complex in the context of historic architecture, which is characterized by high geometric and semantic variability. Data quality, subjectivity in manual labeling, and the difficulty of defining consistent categories may compromise the effectiveness and reproducibility of the results. This study analyzes the influence of three key factors (annotator specialization, point cloud density, and sensor type) on the supervised classification of architectural elements by applying the Random Forest (RF) algorithm to datasets related to the architectural typology of the Franciscan cloister. The main innovation of the study lies in the development of an advanced feature selection technique, based on multi-radius statistical analysis and evaluation of the p-value of each feature with respect to the target classes. The procedure makes it possible to identify the optimal radius for each feature, maximizing separability between classes and reducing semantic ambiguities. The approach, implemented entirely in Python, automates the process of feature extraction, selection, and application, improving semantic consistency and classification accuracy. [ABSTRACT FROM AUTHOR]
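
The radius selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the statistical test, so this sketch assumes a permutation test on the one-way ANOVA F statistic as a stand-in. The feature name (planarity), the class labels (column, vault), the radii, and the synthetic values are all hypothetical. For each feature, the radius with the smallest p-value is kept, with ties broken by the larger F statistic.

```python
import random
from statistics import fmean


def f_statistic(values, labels):
    """One-way ANOVA F statistic of `values` grouped by class `labels`."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = fmean(values)
    between = 0.0
    within = 0.0
    for group in groups.values():
        mean = fmean(group)
        between += len(group) * (mean - grand_mean) ** 2
        within += sum((value - mean) ** 2 for value in group)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    if within == 0.0:
        return float("inf")
    return (between / df_between) / (within / df_within)


def permutation_test(values, labels, n_perm=200, seed=0):
    """Return (F, p): observed F and its permutation p-value.

    Assumption: a permutation test stands in for whatever test the
    original study uses, which the abstract does not specify.
    """
    rng = random.Random(seed)
    observed = f_statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(values, shuffled) >= observed:
            hits += 1
    # The +1 terms keep the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


def best_radius_per_feature(features, labels, n_perm=200, seed=0):
    """Pick, per feature, the radius with the lowest p-value.

    `features` maps feature name -> {radius: values, one per point}.
    Ties in p-value are broken in favour of the larger F statistic.
    Returns {feature name: (radius, p_value)}.
    """
    best = {}
    for name, by_radius in features.items():
        scored = {}
        for radius, values in by_radius.items():
            f_obs, p_value = permutation_test(values, labels, n_perm, seed)
            scored[radius] = (p_value, -f_obs)
        radius = min(scored, key=scored.get)
        best[name] = (radius, scored[radius][0])
    return best


# Hypothetical synthetic data: only radius 0.10 separates the two classes.
_rng = random.Random(42)
labels = ["column"] * 30 + ["vault"] * 30


def _synthetic(mean_column, mean_vault, sd=0.1):
    return ([_rng.gauss(mean_column, sd) for _ in range(30)]
            + [_rng.gauss(mean_vault, sd) for _ in range(30)])


features = {
    "planarity": {
        0.05: _synthetic(0.5, 0.5),
        0.10: _synthetic(0.2, 0.8),
        0.20: _synthetic(0.5, 0.5),
    }
}

best = best_radius_per_feature(features, labels)
print(best)
```

In a fuller pipeline, the feature values at the selected radii would then be passed to a Random Forest classifier. A parametric F-test from a statistics library could replace the permutation test where its assumptions hold.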