Adaptive Dragonfly Optimization (ADO) Feature Selection Model and Distributed Bayesian Matrix Decomposition for Big Data Analytics
Matrix decompositions are fundamental methods for extracting knowledge from the large data sets produced by contemporary applications. Processing extremely large amounts of data on a single machine is still inefficient or impractical. Distributed matrix decompositions are therefore necessary and practical tools for big data analytics, where the high dimensionality and complexity of large datasets hinder data mining. Current approaches have long execution times, which makes it imperative to reduce the number of features processed. This work presents a novel wrapper feature selection method utilising the Adaptive Dragonfly Optimisation (ADO) algorithm to make the search space better suited to feature selection. ADO transforms the continuous vector search space into a binary representation. A Distributed Bayesian Matrix Decomposition (DBMD) model is presented for clustering and mining voluminous data. To model distributed computing, this work specifically uses 1) accelerated gradient descent, 2) the alternating direction method of multipliers (ADMM), and 3) statistical inference. The theoretical convergence behaviours of these algorithms are examined, and tests reveal that the suggested algorithms perform better than or on par with two common distributed approaches. The methods also scale up effectively to large data sets. Clustering performance is assessed using precision, recall, F-measure, and Rand Index (RI), which are better suited for imbalanced classes.
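As an illustration of the evaluation metrics named above, the following minimal sketch computes precision, recall, F-measure, and the Rand Index for a clustering by counting pairs of points. The abstract does not say how precision and recall are defined for clusters, so the pair-counting definition used here is an assumption, and the function name `pair_counting_scores` is hypothetical rather than taken from the paper.

```python
from itertools import combinations


def pair_counting_scores(true_labels, pred_labels):
    """Pair-counting clustering scores (assumed definition, not the paper's code).

    A pair of points is a true positive if both labelings put the two points
    together, and a true negative if both labelings separate them.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1  # clustered together, but different true classes
        elif same_true:
            fn += 1  # same true class, but split across clusters
        else:
            tn += 1

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Rand Index: fraction of pairs on which the two labelings agree.
    rand_index = (tp + tn) / total if total else 1.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "rand_index": rand_index,
    }


if __name__ == "__main__":
    # Toy example: the second true class is split into two predicted clusters.
    print(pair_counting_scores([0, 0, 1, 1], [0, 0, 1, 2]))
```

The loop visits every pair, so the cost grows quadratically with the number of points. For data at the scale the abstract describes, the same counts would more plausibly be derived from a contingency table of class and cluster sizes.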