Softplus
Formula
\[
\mathrm{Softplus}(x)=\log(1+e^x)
\]
Plot
Softplus(x) = log(1 + exp(x)), plotted for x ∈ [−6, 6].
Parameters
- \(x\): scalar input (applied elementwise)
What it means
Softplus is a smooth approximation to ReLU: it behaves like \(x\) for large positive inputs and approaches zero smoothly for large negative inputs, without ReLU's kink at the origin.
What it's used for
- When a smooth nonlinearity is preferred over ReLU.
- Enforcing positive outputs (e.g., scale/rate parameters).
Key properties
- Smooth and strictly positive.
- Derivative is sigmoid: \(\frac{d}{dx}\mathrm{Softplus}(x)=\sigma(x)\).
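The derivative identity \(\frac{d}{dx}\mathrm{Softplus}(x)=\sigma(x)\) can be checked numerically with a finite difference. A minimal sketch; the function names `softplus` and `sigmoid` are my own, not from any particular library:

```python
import math

def softplus(x):
    # log(1 + e^x), computed directly (fine for moderate x)
    return math.log(1.0 + math.exp(x))

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^{-x})
    return 1.0 / (1.0 + math.exp(-x))

# Central finite difference approximates d/dx Softplus(x);
# it should agree with sigmoid(x) at each test point.
h = 1e-6
for x in (-2.0, 0.0, 3.0):
    numeric = (softplus(x + h) - softplus(x - h)) / (2.0 * h)
    assert abs(numeric - sigmoid(x)) < 1e-5
```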
Common gotchas
- More expensive to compute than ReLU, since it requires \(\exp\) and \(\log\).
- Naive evaluation of \(\log(1+e^x)\) overflows in floating point for large \(x\); the equivalent form \(\max(x,0)+\log(1+e^{-|x|})\) is numerically stable.
- For very negative inputs, outputs are near zero but never exactly zero, so activations are never truly sparse.
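One practical point worth illustrating: computing \(\log(1+e^x)\) naively overflows in double precision once \(e^x\) exceeds the floating-point range (around \(x \approx 710\)). The identity \(\mathrm{Softplus}(x)=\max(x,0)+\log(1+e^{-|x|})\) avoids this, since \(\exp\) is only ever applied to a non-positive argument. A minimal sketch; the function name `softplus_stable` is my own:

```python
import math

def softplus_stable(x):
    # Softplus(x) = max(x, 0) + log(1 + exp(-|x|)).
    # exp is only ever called on a non-positive argument, so it
    # cannot overflow; log1p keeps precision when exp(-|x|) is tiny.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

For large positive inputs the stable form returns \(x\) to machine precision, matching the fact that Softplus approaches the identity there; for large negative inputs it underflows gracefully to zero.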
Example
\(\mathrm{Softplus}(0)=\log 2 \approx 0.693\).
How to Compute (Pseudocode)
Input: tensor/vector x
Output: y = Softplus(x) applied elementwise
for each element x_i in x:
    y_i <- log(1 + exp(x_i))
return y
Complexity
- Time: \(O(m)\) elementwise operations for \(m\) inputs
- Space: \(O(m)\) for the output tensor/vector (or \(O(1)\) extra if done in place)
- Assumptions: elementwise application over \(m\) scalars; the constant factor is dominated by the cost of evaluating \(\exp\) and \(\log\)