Evaluation of Hyperparameter Optimization Techniques in Deep Learning Considering Accuracy, Runtime, and Computational Efficiency Metrics
Hyperparameter optimization is one of the most crucial steps in training deep learning models, as performance metrics such as accuracy, generalizability, and computational efficiency are closely tied to it. This work explores five hyperparameter optimization techniques: Grid Search, Random Search, Genetic Algorithm, Particle Swarm Optimization, and Simulated Annealing, applied to a feedforward neural network (FFNN) trained on the MNIST dataset. The study considers two training configurations, 20 epochs and 50 epochs, and focuses on three key metrics: accuracy, runtime, and computational efficiency. Results show that approximation algorithms such as Genetic Algorithm and Simulated Annealing achieve a remarkable trade-off between accuracy and runtime, making them significantly more computationally practical than classical methods like Grid Search and Random Search. As a concrete example, the Genetic Algorithm reached the highest accuracy of 98.60% within 50 epochs, while Simulated Annealing was the fastest, with its quickest run completing in 357.52 seconds. These results demonstrate the flexibility and efficiency of approximation algorithms when searching high-dimensional hyperparameter spaces under scarce resources. This work also presents a trade-off analysis between exhaustive classical techniques and adaptive approximation techniques. The Python implementation, which has a modular architecture, provides a basic structure that can be extended to accommodate more complex datasets and architectures. By bridging computational efficiency with practical efficacy, this work offers actionable guidance to practitioners and researchers in deep learning on selecting hyperparameter optimization methodologies best suited to their specific constraints and objectives. [ABSTRACT FROM AUTHOR]
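To make the kind of approximation-based search the abstract describes concrete, here is a minimal, self-contained sketch of a Simulated Annealing-style hyperparameter search. This is not the paper's implementation: it substitutes scikit-learn's bundled 8x8 digits set for MNIST to stay lightweight and offline, and the search space, initial temperature, cooling rate, and trial budget are all illustrative assumptions.

```python
# Minimal Simulated Annealing sketch for hyperparameter search.
# NOT the paper's code: digits stands in for MNIST, and all search
# settings (space, temperature, cooling, budget) are assumptions.
import math
import random

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)

HIDDEN = [32, 64, 128, 256]            # assumed hidden-width choices
LR = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]    # assumed learning-rate choices

def evaluate(cfg):
    """Train one FFNN configuration and return held-out accuracy."""
    clf = MLPClassifier(hidden_layer_sizes=(HIDDEN[cfg[0]],),
                        learning_rate_init=LR[cfg[1]],
                        max_iter=50, random_state=0)
    clf.fit(Xtr, ytr)
    return clf.score(Xte, yte)

def neighbor(cfg):
    """Propose a neighboring configuration: shift one index by +/-1."""
    i = random.randrange(2)
    limit = len(HIDDEN) if i == 0 else len(LR)
    new = list(cfg)
    new[i] = min(limit - 1, max(0, new[i] + random.choice([-1, 1])))
    return tuple(new)

cfg = (random.randrange(len(HIDDEN)), random.randrange(len(LR)))
acc = evaluate(cfg)
best_cfg, best_acc = cfg, acc
T = 0.05                               # initial temperature (assumption)
for step in range(20):                 # trial budget (assumption)
    cand = neighbor(cfg)
    cand_acc = evaluate(cand)
    # Always accept improvements; accept worse moves with
    # Boltzmann probability exp(delta / T) so early exploration
    # can escape local optima while late steps converge.
    if cand_acc > acc or random.random() < math.exp((cand_acc - acc) / T):
        cfg, acc = cand, cand_acc
    if acc > best_acc:
        best_cfg, best_acc = cfg, acc
    T *= 0.9                           # geometric cooling (assumption)

print(f"best accuracy={best_acc:.4f}, "
      f"hidden={HIDDEN[best_cfg[0]]}, lr={LR[best_cfg[1]]}")
```

Replacing the single-candidate `neighbor` proposal with population-level selection, crossover, and mutation would turn the same evaluate-and-accept scaffold into a Genetic Algorithm; a comparable modularity is presumably what the abstract means by an extensible Python structure.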