Frequently Asked Questions

What will Happen if the Learning Rate is Set Too Low or Too High?

Learning Rate

It is a hyperparameter that can control the amount by which coefficients are adjusted i.e. shifted to either direction left or right. Since it is a hyperparameter, we can change its value.

m[i]= m[i-1] - α * (slope_m)

where

  • m= coefficient value
  • α= learning rate

Case 1: If Learning Rate is very High

If the value of the learning rate is kept very high, there will be large updates in the weight i.e the weights will explode (Exploding Gradient) from one value to another and may never reach global minima.

Case 2: If Learning Rate is too Low

Keeping a low value for the learning rate makes the weights be updated by a very small amount. It can cause the Vanishing Gradient problem i.e. the value will take a huge amount of time to reach global minima.

How to Find Optimal Value for Learning Rate?

Initially keep a high value for the learning rate and if you find it is overshooting, decrease its value until you reach an optimal value.

Other Popular Questions