Result: Machine learning and deep learning models applied to identification and classification of mango

Title:
Machine learning and deep learning models applied to identification and classification of mango
Source:
Heavy metals and arsenic concentrations in water, agricultural soil, and rice in Ngan Son district, Bac Kan province, Vietnam. 7:429-437
Publisher Information:
Vietnamese Journal of Food Control, National Institute for Food Control, 2024.
Publication Year:
2024
Document Type:
Journal Article
ISSN:
2615-9252
DOI:
10.47866/2615-9252/vjfc.4370
Accession Number:
edsair.doi...........2eed24a6dbedda5e7b4995d9ab27782c
Database:
OpenAIRE

Further Information

This study uses the dataset published at https://data.mendeley.com/datasets/46htwnp833/2, which contains visible-near-infrared (Vis-NIR) spectra at wavelengths from 309 nm to 1149 nm for 11691 mangoes in Australia, collected from 10 mango varieties across 2 growing regions. Machine learning models were developed in the open-source programming language Python: principal component analysis (PCA) combined with support vector machines (SVM), decision trees (DT), random forests (RF), and artificial neural networks (ANN); partial least squares combined with discriminant analysis (PLS-DA); and a deep learning model, a 1-dimensional convolutional neural network (1D-CNN). Preprocessing was carried out on the full spectral data and comprised a second derivative, smoothing with the Savitzky-Golay algorithm, and data balancing via the Synthetic Minority Oversampling Technique (SMOTE). The results showed that applying SMOTE before running the machine learning models significantly enhanced classification accuracy. Furthermore, a 1D-CNN model with a complex structure provided higher classification efficiency than the conventional machine learning models. The accuracy of the 1D-CNN model in classifying mango ripeness, mango variety, and growing location was 99.40%, 94.35%, and 96.92%, respectively. The 1D-CNN deep learning model is well suited to sample classification based on spectral data when the dataset is large, containing tens of thousands of samples.
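The preprocessing and PCA-SVM steps named in the abstract can be sketched roughly as follows. This is a minimal illustration on synthetic spectra, not the authors' code: the window length, polynomial order, number of neighbours, and number of principal components are assumed values that the abstract does not report, the helper names are hypothetical, and the oversampler is a simplified SMOTE-style interpolation written in NumPy rather than a reference implementation. Oversampling only the training split is also an assumption made here to keep synthetic samples out of the test set.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def second_derivative(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing plus second derivative along the wavelength axis.
    window_length and polyorder are assumed values."""
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, deriv=2, axis=1)


def smote_like_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE-style balancing (hypothetical helper): each synthetic
    sample is interpolated between a minority sample and one of its k nearest
    same-class neighbours, until every class matches the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, count in zip(classes, counts):
        n_new = target - count
        if n_new == 0 or count < 2:
            continue
        Xc = X[y == cls]
        # Pairwise Euclidean distances within the class; exclude self-matches.
        dist = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2)
        np.fill_diagonal(dist, np.inf)
        kk = min(k, count - 1)
        neighbours = np.argsort(dist, axis=1)[:, :kk]
        base = rng.integers(0, count, size=n_new)
        picked = neighbours[base, rng.integers(0, kk, size=n_new)]
        gap = rng.random((n_new, 1))
        X_parts.append(Xc[base] + gap * (Xc[picked] - Xc[base]))
        y_parts.append(np.full(n_new, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


def make_synthetic_spectra(n_major=120, n_minor=30, n_wavelengths=200, seed=0):
    """Toy two-class spectra (a noisy Gaussian peak per class); stands in for
    the real Vis-NIR data, which is not loaded here."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_wavelengths)

    def batch(n, centre):
        peak = np.exp(-((grid - centre) ** 2) / 0.005)
        return peak + 0.02 * rng.standard_normal((n, n_wavelengths))

    X = np.vstack([batch(n_major, 0.4), batch(n_minor, 0.6)])
    y = np.array([0] * n_major + [1] * n_minor)
    return X, y


X, y = make_synthetic_spectra()
X_d2 = second_derivative(X)

# Split first, then balance only the training portion (an assumption here).
X_train, X_test, y_train, y_test = train_test_split(
    X_d2, y, test_size=0.25, stratify=y, random_state=0)
X_bal, y_bal = smote_like_oversample(X_train, y_train)

# PCA followed by an SVM classifier; 10 components is an assumed value.
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
model.fit(X_bal, y_bal)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy on synthetic data: {accuracy:.3f}")
```

With real data, the synthetic generator would be replaced by the spectra and labels from the Mendeley dataset, and a maintained SMOTE implementation such as the one in the imbalanced-learn package would normally be preferable to the hand-written interpolation above.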