Relationship between sigmoid and tanh activation function
Understanding commonly used activation functions in ML
Sigmoid and tanh
The sigmoid function and the hyperbolic tangent (tanh) function are two of the most commonly used activation functions in neural networks, and they are closely related to each other.
The sigmoid function takes any real-valued input and maps it to the range 0 to 1, outputting a probability-like value. It is defined as:
sigmoid(x) = 1 / (1 + e^-x)
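As a quick illustration, here is a minimal NumPy sketch of the sigmoid (the function name and test values are my own):

```python
import numpy as np

def sigmoid(x):
    """Map any real-valued input to the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Probability-like outputs: 0.5 at x = 0, approaching 0 and 1 in the tails.
print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # [0.00669285 0.5 0.99330715]
```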
The tanh function, on the other hand, maps real-valued input to the range -1 to 1. Its standard definition is tanh(x) = (e^x - e^-x) / (e^x + e^-x), but it can be rewritten directly in terms of the sigmoid:
tanh(x) = 2 * sigmoid(2x) - 1
In other words, tanh is a rescaled and shifted copy of the sigmoid, centered at the origin.
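The identity is easy to verify numerically; a minimal sketch reusing the sigmoid above (the grid of test values is my own choice):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh_via_sigmoid(x):
    """tanh expressed as a rescaled and shifted sigmoid."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

x = np.linspace(-5.0, 5.0, 101)
# Agrees with NumPy's built-in tanh to floating-point precision.
assert np.allclose(tanh_via_sigmoid(x), np.tanh(x))
```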
Similarities
Both the sigmoid and tanh functions are widely used in neural networks, and they have some similar properties:
- Both functions are non-linear, which is what allows a network to capture complex patterns in data; without a non-linearity, a stack of layers would collapse into a single linear transformation.
- Both functions are smooth, with a well-defined derivative at every point, unlike the ReLU function, which is not differentiable at zero.
- Both functions are monotonically increasing: a larger input always produces a larger output (both properties are checked numerically in the sketch after this list).
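Both claims follow from the closed-form derivatives, sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)) and tanh'(x) = 1 - tanh(x)^2, which exist everywhere and are strictly positive. A minimal NumPy sketch (function names are my own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))

def d_tanh(x):
    return 1.0 - np.tanh(x) ** 2  # tanh'(x) = 1 - tanh(x)^2

x = np.linspace(-10.0, 10.0, 1001)
# Both derivatives are strictly positive over the whole range, which is
# exactly what monotonically increasing means; both are also defined at
# every point, unlike ReLU's derivative at zero.
assert np.all(d_sigmoid(x) > 0) and np.all(d_tanh(x) > 0)
```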
Differences
However, there are also some differences between the sigmoid and tanh functions:
- The range of the sigmoid function is fixed at 0 to 1, whereas the range of the tanh function is fixed at -1 to 1, so tanh outputs are centered around zero while sigmoid outputs are not.
- The sigmoid function has a slower rate of change near zero than the tanh function: its derivative peaks at 0.25 (at x = 0), while the derivative of tanh peaks at 1. These smaller gradients mean that a network using sigmoid activations will often converge more slowly than one using tanh (see the sketch after this list).
- The sigmoid function is often used in the output layer of a binary classification network, where its output can be read as a probability, while the tanh function is often used in the hidden layers of a network.
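The gradient gap described above is easy to verify at x = 0, where both derivatives reach their maximum; a minimal sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Peak derivatives at x = 0: the sigmoid tops out at 0.25, tanh at 1.0,
# so tanh passes through gradients up to four times larger near zero.
s0 = sigmoid(0.0)
print(s0 * (1.0 - s0))          # sigmoid'(0) = 0.25
print(1.0 - np.tanh(0.0) ** 2)  # tanh'(0)    = 1.0
```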
Thanks for reading!