Module Kaun.Activation
Activation functions for neural networks.
All functions are differentiable through Rune's autodiff. The standard activations relu, sigmoid, and tanh are re-exported from Nx for convenience.
Standard activations
leaky_relu x is max(x, negative_slope * x).
negative_slope defaults to 0.01.
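The formula above can be sketched as a scalar reference in Python (illustrative only, not the Kaun tensor API):

```python
def leaky_relu(x, negative_slope=0.01):
    # max(x, negative_slope * x): identity for x >= 0,
    # a small linear leak for x < 0 (assuming negative_slope < 1)
    return max(x, negative_slope * x)
```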
hard_sigmoid x is min(1, max(0, alpha * x + beta)).
alpha defaults to 1/6. beta defaults to 0.5.
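With these defaults the function is a piecewise-linear approximation of sigmoid; a scalar sketch of the formula (not the library API):

```python
def hard_sigmoid(x, alpha=1.0 / 6.0, beta=0.5):
    # clamp the affine map alpha * x + beta into [0, 1]
    return min(1.0, max(0.0, alpha * x + beta))
```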
prelu ~alpha x is max(0, x) + alpha * min(0, x).
alpha is a learnable tensor, broadcast against x.
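A scalar sketch of the prelu formula, with alpha passed as a plain float standing in for the learnable tensor:

```python
def prelu(x, alpha):
    # identity for x >= 0, alpha-scaled identity for x < 0
    return max(0.0, x) + alpha * min(0.0, x)
```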
Exponential family
elu x is x when x >= 0 and alpha * (exp(x) - 1) otherwise.
alpha defaults to 1.0.
selu x is lambda * elu(x, alpha) with the fixed self-normalizing constants alpha ≈ 1.6733 and lambda ≈ 1.0507 from the SELU paper.
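A scalar sketch using the standard SELU constants (the exact values below come from Klambauer et al.; the sketch assumes Kaun uses the same):

```python
import math

SELU_ALPHA = 1.6732632423543772
SELU_LAMBDA = 1.0507009873554805

def selu(x):
    # lambda * elu(x, alpha) with the self-normalizing constants
    y = x if x >= 0.0 else SELU_ALPHA * (math.exp(x) - 1.0)
    return SELU_LAMBDA * y
```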
celu x is max(0, x) + min(0, alpha * (exp(x/alpha) - 1)).
alpha defaults to 1.0.
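Unlike elu, celu is continuously differentiable for any alpha; a scalar sketch of the formula:

```python
import math

def celu(x, alpha=1.0):
    # max(0, x) keeps the positive part; the min term contributes
    # only for x < 0, where it saturates toward -alpha
    return max(0.0, x) + min(0.0, alpha * (math.exp(x / alpha) - 1.0))
```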
Smooth activations
gelu_approx x is the tanh-based approximation of gelu.
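The widely used tanh approximation (Hendrycks & Gimpel) is 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))); a scalar sketch, assuming this standard form:

```python
import math

def gelu_approx(x):
    # tanh-based approximation of x * Phi(x)
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))
```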
squareplus x is 0.5 * (x + sqrt(x^2 + b)).
b defaults to 4.0.
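Squareplus is a smooth, sqrt-only relative of softplus (no exp or log); a scalar sketch of the formula:

```python
import math

def squareplus(x, b=4.0):
    # approaches relu(x) for large |x|; at x = 0 it equals sqrt(b) / 2
    return 0.5 * (x + math.sqrt(x * x + b))
```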
log_sigmoid x is log(sigmoid(x)), computed in a numerically stable way by branching on the sign of x.
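The stability trick is to avoid exp of a large positive argument on either branch; a scalar sketch of one standard formulation (the concrete branching inside Kaun may differ):

```python
import math

def log_sigmoid(x):
    # x >= 0: log(sigmoid(x)) = -log(1 + exp(-x)), exp argument <= 0
    # x <  0: log(sigmoid(x)) = x - log(1 + exp(x)), exp argument < 0
    if x >= 0.0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))
```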
Gating
glu x splits x in half along axis and returns left * sigmoid(right).
axis defaults to -1.
Raises Invalid_argument if the size of x along axis is odd, so the split cannot produce two equal halves.
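A 1-D Python sketch of the gating (lists stand in for tensors, a ValueError for Invalid_argument; the real function splits along an arbitrary axis):

```python
import math

def glu(x):
    # split in half, gate the left half with sigmoid of the right half
    n = len(x)
    if n % 2 != 0:
        raise ValueError("glu: size along axis must be even")
    half = n // 2
    left, right = x[:half], x[half:]
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    return [l * sigmoid(r) for l, r in zip(left, right)]
```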
Sparse activations
sparse_plus x is x when x >= 1, 0 when x <= -1, and 0.25 * (x + 1)^2 otherwise.
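The quadratic middle piece joins the two linear pieces smoothly at -1 and 1; a scalar sketch of the formula:

```python
def sparse_plus(x):
    # exactly 0 below -1, exactly linear above 1,
    # a quadratic blend on [-1, 1]
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return x
    return 0.25 * (x + 1.0) ** 2
```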