A Comparative Study of Black-box Optimization Algorithms for Tuning of Hyper-Parameters in Deep Neural Networks

O. Skogby Steinholtz and J. Söderström. Master's thesis, Chalmers University of Technology, May 2018.

Abstract

Deep neural networks (DNNs) have been successfully applied across data-intensive applications ranging from computer vision and language modelling to bioinformatics and search engines. Regardless of the application, the performance of a DNN is typically highly reliant on a good choice of hyperparameters, defined as parameters not learned during model training, which makes the design phase of constructing a DNN model critical. By framing the selection and tuning of hyperparameters as an expensive black-box optimization problem, the obstacles encountered in manual tuning can be overcome by taking an automated, algorithmic approach instead.

In this work, we compare black-box optimization algorithms with reported state-of-the-art performance on two specialized DNN models, focusing on identifying aspects of practical relevance.

The examined algorithms include the Nelder-Mead method, Particle Swarm Optimization, Bayesian Optimization with Gaussian Processes, and the Tree-structured Parzen Estimator. Our results indicate that the Tree-structured Parzen Estimator achieves the highest performance with respect to solution quality, convergence speed, variability, and generalizability across the selected problem instances.
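As a minimal illustration of the black-box framing described above (not taken from the thesis itself), the Python sketch below tunes two hypothetical hyperparameters, a learning rate and a dropout rate, with the Tree-structured Parzen Estimator as implemented in the hyperopt library. The objective function, search space, and evaluation budget are all illustrative assumptions; in practice the objective would train a DNN and return its validation loss.

    # Sketch: hyperparameter tuning as black-box optimization with TPE
    # via the hyperopt library. Objective, search space, and budget are
    # illustrative assumptions, not the experimental setup of the thesis.
    from hyperopt import fmin, tpe, hp, Trials

    def objective(params):
        # Stand-in for an expensive black-box evaluation: train a DNN
        # with the given hyperparameters, return the validation loss.
        lr, dropout = params["lr"], params["dropout"]
        # Toy surrogate loss so the sketch runs without a real model.
        return (lr - 1e-3) ** 2 + (dropout - 0.5) ** 2

    space = {
        "lr": hp.loguniform("lr", -9, -1),        # learning rate, log scale
        "dropout": hp.uniform("dropout", 0.0, 0.9),
    }

    trials = Trials()
    best = fmin(objective, space, algo=tpe.suggest,
                max_evals=50, trials=trials)
    print(best)  # best hyperparameters found within the 50-evaluation budget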

 



