Deep neural networks (DNNs) have been applied successfully across data-intensive applications ranging from computer vision and language modelling to bioinformatics and search engines. Regardless of the application, the performance of a DNN typically depends heavily on a situationally good choice of hyperparameters, defined as parameters not learned during model training, which makes the design phase of constructing a DNN model critical. By framing the selection and tuning of hyperparameters as an expensive black-box optimization problem, the obstacles encountered in manual tuning can be overcome with an automated, algorithmic approach.
In this work, we compare black-box optimization algorithms with reported state-of-the-art performance across two specialized DNN models, focusing on aspects of practical relevance.
The examined algorithms include the Nelder-Mead method, Particle Swarm Optimization, Bayesian Optimization with Gaussian Processes, and the Tree-structured Parzen Estimator. Our results indicate that the Tree-structured Parzen Estimator achieves the best performance with respect to solution quality, convergence speed, variability, and generalizability across the selected problem instances.
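The black-box framing above can be made concrete with a minimal sketch. The objective, search space, and random-search baseline below are illustrative assumptions, not the models or algorithms compared in this work; in practice each evaluation would train and validate a full DNN, and the sampler would be one of the algorithms named above rather than plain random search.

```python
import random

def validation_loss(lr: float, batch_size: int) -> float:
    """Hypothetical black-box objective: hyperparameters -> validation loss.

    A cheap toy surrogate stands in for the expensive train-and-evaluate
    cycle of a real DNN.
    """
    return (lr - 0.01) ** 2 + 0.001 * abs(batch_size - 64)

def random_search(n_trials: int, seed: int = 0):
    """Simplest black-box baseline: sample settings, keep the best one."""
    rng = random.Random(seed)
    best_loss, best_params = float("inf"), None
    for _ in range(n_trials):
        lr = 10 ** rng.uniform(-5, -1)                 # log-uniform learning rate
        batch_size = rng.choice([16, 32, 64, 128, 256])
        loss = validation_loss(lr, batch_size)         # expensive call in reality
        if loss < best_loss:
            best_loss = loss
            best_params = {"lr": lr, "batch_size": batch_size}
    return best_loss, best_params

best_loss, best_params = random_search(n_trials=100)
```

The algorithms compared in this work differ from this baseline only in how the next trial point is proposed: Nelder-Mead and Particle Swarm Optimization move a set of candidate points, while Bayesian Optimization and the Tree-structured Parzen Estimator fit a probabilistic model to past evaluations.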