Title:
THE IMPACT OF FEATURE SELECTION AND DATA PRE-PROCESSING ON ML MODELS.
Source:
Romanian Journal of Petroleum & Gas Technology; 2025, Vol. 6(77) Issue 1, p163-178, 16p
Database:
Complementary Index

Further Information

This research aims to identify dataset issues that affect the training of machine learning models. The paper demonstrates the impact of feature processing on model evaluation metrics. Initially, the paper evaluates unprocessed categorical features based on customized rules; the seven evaluated algorithms yield very low metric values. The demonstration is conducted on a dataset of 7,022 records with 16 features for predicting depression among students. The issue of depression was chosen because it involves many features, which facilitates the analysis of the correlations between them. Moreover, the large number of features and records allowed the generalization capacity of all seven algorithms to be analyzed by training on different dataset scenarios. The demonstration shows that data pre-processing generates better results when the inputs are exclusively numerical. Subsequently, the research demonstrates the importance of individually analyzing the contribution of each parameter within the model. The study encompasses three categories of tests, which are implemented using the ML.NET framework in the C# programming language. [ABSTRACT FROM AUTHOR]

Copyright of Romanian Journal of Petroleum & Gas Technology is the property of Petroleum - Gas University of Ploiesti and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)