## Learning Rate

The learning rate is a hyperparameter that controls how much the coefficients are adjusted at each step, i.e., how far they are shifted to the left or to the right. Since it is a hyperparameter, we are free to tune its value.

m[i] = m[i-1] - α * slope_m

where

- **m** = coefficient value
- **α** = learning rate
- **slope_m** = slope (gradient) of the loss with respect to m
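Below is a minimal sketch of this update rule in Python. The quadratic loss L(m) = (m - 3)², its slope 2(m - 3), the starting point, and the learning rate are all illustrative assumptions made for this demonstration, not part of any particular model.

```python
# Gradient descent on an assumed quadratic loss L(m) = (m - 3)**2,
# whose slope with respect to m is 2 * (m - 3).

def slope(m):
    """Slope of the illustrative loss L(m) = (m - 3)**2."""
    return 2 * (m - 3)

alpha = 0.1   # learning rate (the hyperparameter we are free to tune)
m = 0.0       # initial coefficient value (arbitrary starting point)

for step in range(25):
    m = m - alpha * slope(m)  # m[i] = m[i-1] - alpha * slope_m

print(round(m, 4))  # approaches the minimum at m = 3
```

With α = 0.1 each step moves m a fraction of the way toward the minimum, so the coefficient settles close to 3 after a few dozen iterations.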

### Case 1: If Learning Rate is very High

If the learning rate is set very high, each weight update is large: the weights overshoot the minimum and explode from one value to another, a behavior related to the **Exploding Gradient** problem, and gradient descent may never reach the global minimum. The sketch below illustrates this.
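Reusing the same illustrative quadratic loss, a learning rate of α = 1.5 (an assumed value chosen to be too large for this loss) makes every step land twice as far from the minimum as the last:

```python
# Divergence with a learning rate that is too high, on the assumed
# loss L(m) = (m - 3)**2 from the sketch above.

def slope(m):
    return 2 * (m - 3)

alpha = 1.5  # too high for this loss
m = 0.0

for step in range(5):
    m = m - alpha * slope(m)
    print(step, m)
# m explodes away from the minimum at 3:
# 9.0, -9.0, 27.0, -45.0, 99.0
```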

### Case 2: If Learning Rate is too Low

Keeping the learning rate very low means the weights are updated by only a tiny amount at each step. Training then crawls, a behavior related to the **Vanishing Gradient** problem: the coefficient takes a huge number of steps to reach the global minimum, as shown below.
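On the same illustrative loss, an assumed learning rate of α = 0.0001 leaves the coefficient far from the minimum even after a thousand steps:

```python
# Slow convergence with a learning rate that is too low, on the
# assumed loss L(m) = (m - 3)**2.

def slope(m):
    return 2 * (m - 3)

alpha = 0.0001  # too low
m = 0.0

for step in range(1000):
    m = m - alpha * slope(m)

print(round(m, 4))  # still only ~0.54, nowhere near the minimum at 3
```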

#### How to Find an Optimal Value for the Learning Rate?

Start with a relatively high learning rate. If you find that training is overshooting (the loss grows instead of shrinking), decrease the learning rate and try again, repeating until the loss decreases steadily toward the minimum. The sketch below automates this trial-and-error search.
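Here is a hedged sketch of that strategy on the illustrative quadratic loss: begin with a deliberately high learning rate and halve it whenever a step would increase the loss. The starting value and the halving factor are assumed choices for demonstration, not a standard recipe.

```python
# Trial-and-error learning-rate search on the assumed loss
# L(m) = (m - 3)**2: halve alpha whenever a step overshoots.

def loss(m):
    return (m - 3) ** 2

def slope(m):
    return 2 * (m - 3)

alpha = 1.9  # deliberately high starting point
m = 0.0
prev_loss = loss(m)

for step in range(100):
    candidate = m - alpha * slope(m)
    if loss(candidate) > prev_loss:   # overshot: the step made things worse
        alpha /= 2                    # decrease the learning rate and retry
        continue
    m = candidate
    prev_loss = loss(m)

print(round(alpha, 4), round(m, 4))  # ends near alpha = 0.95, m close to 3
```

The first step overshoots (α = 1.9 is too large for this loss), the rate is halved once, and from then on the loss shrinks steadily until the coefficient settles near the minimum.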