Ridge Regression

A Simple Explanation - By Varsha Saini

If your machine learning model learns the patterns of the training data too well, it can lead to the problem of Overfitting. Overfitting is a situation in which a model performs very well on the training data but poorly on the testing data. To resolve this issue, a penalty term can be added to the model's cost function; this process is called Regularization.
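
The snippet below is a minimal sketch of overfitting using scikit-learn; the sine-shaped data, the degree-15 polynomial, and all parameter values are illustrative choices, not from the article.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=40)  # noisy target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A degree-15 polynomial has far more flexibility than the data warrants.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train R^2:", model.score(X_train, y_train))  # close to 1.0
print("test  R^2:", model.score(X_test, y_test))    # typically much lower
```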

There are three types of regularized regression models: Lasso Regression (L1 regularization), Ridge Regression (L2 regularization), and Elastic Net Regression (a combination of both).
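
In scikit-learn these three models are available directly; `alpha` plays the role of the penalty term λ used below.

```python
from sklearn.linear_model import Lasso, Ridge, ElasticNet

l1 = Lasso(alpha=1.0)                        # L1 regularization
l2 = Ridge(alpha=1.0)                        # L2 regularization
l1_l2 = ElasticNet(alpha=1.0, l1_ratio=0.5)  # mix of L1 and L2
```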

Ridge Regression

  • Ridge Regression is a regularization method in which a small bias is added to the linear equation so that the overall complexity of the model is reduced.
  • It is also called L2 Regularization since the penalty added is the sum of the squared (power 2) coefficients.
  • The amount of bias added to the model is called the Ridge Regression penalty.
  • It can handle data that suffers from the multicollinearity problem (highly correlated independent features), as sketched below.
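
To make the multicollinearity point concrete, here is a minimal sketch on synthetic data (the features, noise levels, and `alpha` value are assumptions for illustration): when two features are nearly identical, ordinary least squares produces unstable coefficients, while Ridge keeps them small and stable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.001, size=200)  # x2 is almost a copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=200)

# OLS coefficients are unstable: they can come out large and opposite-signed.
print(LinearRegression().fit(X, y).coef_)
# Ridge splits the weight evenly and stays near [1.5, 1.5].
print(Ridge(alpha=1.0).fit(X, y).coef_)
```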

Cost Function of Linear Regression

Cost = $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

where $y_i$ is the actual value and $\hat{y}_i$ is the predicted value for the $i$-th training example.
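
As a quick numeric check, the cost is just the sum of squared residuals; the arrays below are illustrative values.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])  # actual values
y_pred = np.array([2.5, 5.5, 6.0])  # predicted values

cost = np.sum((y_true - y_pred) ** 2)  # 0.25 + 0.25 + 1.0
print(cost)  # 1.5
```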

Cost Function of Ridge Regression

The cost function of Linear Regression is modified by adding a Regularization term to it.

Cost = $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} m_j^2$

where

  • $m_j$ are the slopes, i.e. the coefficients (weights) of the independent features.
  • $\lambda$ (lambda) is the penalty term.
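
The same numeric check with the penalty term added; the weights and the λ value are illustrative.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.5, 6.0])
weights = np.array([2.0, -1.0])  # coefficients m_j
lam = 0.1                        # penalty term λ

cost = np.sum((y_true - y_pred) ** 2) + lam * np.sum(weights ** 2)
print(cost)  # 1.5 + 0.1 * 5.0 = 2.0
```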

Idea behind Ridge Regression

  • A regularization term is added to the cost function so that when an optimization algorithm like gradient descent minimizes it, the values of the weights are reduced.
  • Reducing the weights decreases the complexity of the model and hence resolves the Overfitting problem, as the sketch after this list shows.
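
Here is a minimal gradient-descent sketch of that idea; the data, learning rate, and λ values are illustrative assumptions. The extra `2 * lam * w` term in the gradient comes from differentiating $\lambda \sum m_j^2$ and is what pulls the weights toward zero.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([4.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

def fit_ridge(lam, lr=0.001, steps=2000):
    w = np.zeros(3)
    for _ in range(steps):
        # Gradient of sum((y - Xw)^2) + lam * sum(w^2)
        grad = -2 * X.T @ (y - X @ w) + 2 * lam * w
        w -= lr * grad
    return w

print(fit_ridge(lam=0.0))    # close to the true weights [4, -2, 0.5]
print(fit_ridge(lam=100.0))  # the same weights, shrunk roughly in half
```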

Controlling the Regularization Parameter

  • Lambda (λ), the penalty term in the equation, controls the strength of regularization.
  • A high value of lambda increases the regularization effect and shrinks the coefficients (weights) strongly. It can lead to the problem of Underfitting.
  • A low value of lambda weakens the regularization effect and barely changes the coefficients. Hence the problem of Overfitting may not get resolved.
  • Finding an optimal value of λ, typically by cross-validation as sketched below, is a must for a well-tuned Ridge Regression model.
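
A common way to pick λ is cross-validation; scikit-learn's RidgeCV does this over a grid of candidate values (the data and the grid below are illustrative).

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=200)

# Try 13 values of λ (called alpha here) from 0.001 to 1000 with 5-fold CV.
model = RidgeCV(alphas=np.logspace(-3, 3, 13), cv=5).fit(X, y)
print("best alpha:", model.alpha_)
```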

Advantages and Disadvantages of Ridge Regression

  • An advantage of Ridge Regression is that it prevents the model from overfitting by reducing the model's complexity.
  • A disadvantage is that it is not capable of performing feature selection: the L2 penalty shrinks coefficients toward zero but never makes them exactly zero, unlike Lasso. The sketch below illustrates the difference.
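
On the same synthetic data (an illustrative setup), Lasso drives the coefficients of irrelevant features exactly to zero, while Ridge merely shrinks them:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
# Only features 0 and 3 actually matter.
y = X @ np.array([3.0, 0.0, 0.0, 1.5, 0.0]) + rng.normal(scale=0.5, size=200)

print(Ridge(alpha=1.0).fit(X, y).coef_)  # all five coefficients nonzero
print(Lasso(alpha=0.5).fit(X, y).coef_)  # irrelevant ones exactly 0.0
```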