TY - GEN
T1 - S*ReLU: Learning Piecewise Linear Activation Functions via Particle Swarm Optimization
AU - Basirat, Mina
AU - Roth, Peter M.
PY - 2021
Y1 - 2021
N2 - Recently, it was shown that using a properly parametrized Leaky ReLU (LReLU) as the activation function yields significantly better results for a variety of image classification tasks. However, such methods are not feasible in practice: either the only parameter (i.e., the slope of the negative part) needs to be set manually (L*ReLU), or the approach is vulnerable due to its gradient-based optimization and thus highly dependent on a proper initialization (PReLU). In this paper, we exploit the benefits of piecewise linear functions while avoiding these problems. To this end, we propose a fully automatic approach to estimate the slope parameter for LReLU from the data. We realize this via stochastic optimization, namely Particle Swarm Optimization (PSO), yielding S*ReLU. In this way, we show that, compared to widely used activation functions (including PReLU), we obtain better results on seven different benchmark datasets while also drastically reducing the computational effort.
KW - Activation function
KW - Deep Learning
KW - Visual categorization
KW - Activation functions
KW - Particle swarm optimization
UR - http://www.scopus.com/inward/record.url?scp=85102969187&partnerID=8YFLogxK
M3 - Conference paper
T3 - VISIGRAPP 2021 - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
SP - 645
EP - 652
BT - VISAPP
A2 - Farinella, Giovanni Maria
A2 - Radeva, Petia
A2 - Braz, Jose
A2 - Bouatouch, Kadi
T2 - 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Y2 - 8 February 2021 through 10 February 2021
ER -