Record: Smartic: A smart tool for Big Data analytics and IoT.

Title:
Smartic: A smart tool for Big Data analytics and IoT.
Authors:
Sayeed S; Faculty of Information Science and Technology, Multimedia University, Melaka, Melaka, 75450, Malaysia.
Ahmad AF; Faculty of Information Science and Technology, Multimedia University, Melaka, Melaka, 75450, Malaysia.
Peng TC; Faculty of Information Science and Technology, Multimedia University, Melaka, Melaka, 75450, Malaysia.
Source:
F1000Research [F1000Res] 2024 Feb 06; Vol. 11, pp. 17. Date of Electronic Publication: 2024 Feb 06 (Print Publication: 2022).
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: F1000 Research Ltd; Country of Publication: England; NLM ID: 101594320; Publication Model: eCollection; Cited Medium: Internet; ISSN: 2046-1402 (Electronic); Linking ISSN: 20461402; NLM ISO Abbreviation: F1000Res; Subsets: MEDLINE
Imprint Name(s):
Original Publication: London : F1000 Research Ltd
Contributed Indexing:
Keywords: Big Data Analytics; Data Cleaning; Data Imputation; Feature Engineering; IoT
Entry Date(s):
Date Created: 20240125 Date Completed: 20250714 Latest Revision: 20250714
Update Code:
20250715
PubMed Central ID:
PMC10806368
DOI:
10.12688/f1000research.73613.2
PMID:
38269303
Database:
MEDLINE

Further information

The Internet of Things (IoT) is driving the convergence of the physical and digital worlds of technology. Real-time connections at massive scale produce large volumes of varied data, which is where Big Data comes into the picture. Big Data refers to large, diverse sets of information whose dimensions exceed what widely used database management systems or standard data processing software tools can manage within a given limit. Almost every big dataset is dirty and may contain missing data, typing errors, inaccuracies, and many other issues that degrade the performance of Big Data analytics. One of the biggest challenges in Big Data analytics is to discover and repair dirty data; failure to do so can lead to inaccurate analytics results and unpredictable conclusions. Various missing value imputation techniques were tested, and the performances of machine learning (ML) models were compared. A hybrid model that integrates ML and sample-based statistical techniques for missing value imputation is proposed. The dataset with the best missing value imputation, chosen on the basis of ML model performance, was then used for feature engineering and hyperparameter tuning. K-means clustering and principal component analysis were applied in our study. Model accuracy improved dramatically, with the XGBoost model achieving a root mean squared logarithmic error (RMSLE) of around 0.125. To guard against overfitting, K-fold cross-validation was implemented.
(Copyright: © 2024 Sayeed S et al.)
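
The abstract reports its result as an RMSLE obtained under K-fold cross-validation. As a reading aid, the following is a minimal sketch of how such a score could be computed, using only the Python standard library. It is not the authors' pipeline: the function names (`rmsle`, `k_fold_indices`, `cross_validated_rmsle`) are hypothetical, the folds are contiguous and unshuffled, and a placeholder model that predicts the training-set mean stands in for XGBoost.

```python
import math
from typing import List, Sequence, Tuple


def rmsle(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared logarithmic error: sqrt(mean((log(1+p) - log(1+t))^2))."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        (math.log1p(p) - math.log1p(t)) ** 2 for t, p in zip(y_true, y_pred)
    )
    return math.sqrt(total / len(y_true))


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds and return (train, validation) pairs."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)  # the first `extra` folds get one additional sample
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, validation))
        start += size
    return folds


def cross_validated_rmsle(y: Sequence[float], k: int) -> float:
    """Mean RMSLE over k folds for a placeholder model (assumption: predicts the
    training-set mean; a real study would fit a regressor on each training fold)."""
    scores = []
    for train, validation in k_fold_indices(len(y), k):
        mean_prediction = sum(y[i] for i in train) / len(train)
        actual = [y[i] for i in validation]
        scores.append(rmsle(actual, [mean_prediction] * len(validation)))
    return sum(scores) / len(scores)
```

Because RMSLE compares logarithms of the values, it penalizes relative rather than absolute errors. Averaging the score over held-out folds, rather than scoring on the training data, is what makes it usable as a check against overfitting.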

No competing interests were disclosed.