Logistic Regression is another popular machine learning algorithm used for classification problems. It involves several interesting, and sometimes tricky, concepts.
In this article, we will learn how logistic regression works, its hypothesis function, cost function, why linear regression cannot be used for classification problems, and other important concepts in detail. We have already covered Linear Regression Algorithm.
Machine Learning
It is an application of artificial intelligence that gives machines the ability to learn and improve from experience without being explicitly programmed.
Types of Machine Learning Algorithms
- Supervised Machine Learning Algorithms
Supervised ML algorithms are those that have a target variable.
- Unsupervised Machine Learning Algorithms
Unsupervised ML algorithms are those that do not have a target variable.
Supervised Machine Learning Algorithms are of two types:
- Regression Models
- Classification Models
a. Regression Models
It is a type of supervised machine learning algorithm in which target variables are continuous values.
A few Regression Algorithms are Linear Regression, Decision Tree Regressor, etc.
Example: house price prediction, where the target variable is the house price, a continuous value.
b. Classification Models
It is a type of supervised machine learning algorithm in which target variables are discrete values called classes.
A few classification algorithms are Logistic Regression, SVM, Decision Tree, Naive Bayes, and K-NN.
Classification can be further divided into two parts:
- Binary Classification: Classification algorithms that have two classes.
Example: predicting whether an email is spam or not spam; there are two classes.
- Multiclass Classification: Classification algorithms that have more than two classes.
Example: In Iris Dataset, predicting if the flower category is Virginica, Setosa, or Versicolor. It has three classes.
Logistic Regression
- Although Logistic Regression sounds like a regression technique, it is a supervised machine learning algorithm used for classification.
- It is a probabilistic model; its output value always lies between 0 and 1.
- It is a statistical analysis method used for binary classification.
- A few examples where Logistic Regression can be applied are:
- Predicting whether an email is spam or not spam.
- Predicting whether the person in a picture is male or female.
Why is Logistic Regression called Regression?
- Logistic: This term is taken from the Logit Function that is used in this method of classification.
- Regression: The underlying calculation in Logistic Regression is quite similar to Linear Regression.
If Logistic Regression is so similar to Linear Regression, can we use Linear Regression for classification?
Linear Regression for Classification
Linear Regression is used for predicting continuous values. What if we use a threshold (t): whenever the output >= t, we consider the example to belong to class 1, and whenever the output < t, we consider it to belong to class 0?
Can we use Linear Regression for Classification?
No, Linear Regression can't be used for classification, for the reasons below:
- Linear Regression returns continuous output that can be greater than 1 or less than 0, whereas we require a binary output.
- The best fit line in regression can change when more data is added; it is highly affected by outliers.
The blue line in the above image represents the best fit line for the given data. Let's add more data and check whether the best fit line shifts.
We can see that the best fit line (green line) shifts from its previous position after more data is added, and the predicted class (class 1 or class 0) for some points changes.
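To see this effect concretely, here is a minimal sketch (assuming a toy 1-D dataset and NumPy's least-squares line fit) that fits a straight line, thresholds its output at 0.5, and then shows how adding a single extreme point can change the predicted classes:

```python
import numpy as np

# Toy 1-D binary classification data: feature x, labels y (0 or 1).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)

def fit_and_classify(x, y, threshold=0.5):
    # Fit a straight line y = m*x + c with least squares.
    m, c = np.polyfit(x, y, deg=1)
    # Threshold the continuous output to get class labels.
    return (m * x + c >= threshold).astype(int)

print("Before outlier:", fit_and_classify(x, y))

# Add one extreme positive example far to the right.
x_new = np.append(x, 100.0)
y_new = np.append(y, 1.0)

# With the outlier included, the fitted line flattens, so some points that
# were previously labelled class 1 can drop below the threshold.
print("After outlier: ", fit_and_classify(x_new, y_new)[:len(x)])
```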
Hypothesis Function for Logistic Regression
We need a function that maps the output of the regression equation to a value between 0 and 1. The sigmoid function, also called the logistic function, can be used for this task.
Sigmoid Function: σ(z) = 1 / (1 + e^(-z)), where z = mx + c
z = mx + c is the equation of a line, which is also the hypothesis function in Linear Regression. The sigmoid function, because of its exponential term, squashes the output of z into a value between 0 and 1.
y_predicted = σ(z) = 1 / (1 + e^(-(mx + c)))
- 0.5 is considered to be the threshold value.
- If σ(z) >= 0.5, then y = 1.
- If σ(z) < 0.5, then y = 0.
- If z = 0, then σ(z) = 0.5 and hence y = 1.
- If z > 0, then σ(z) > 0.5 and hence y = 1.
- If z < 0, then σ(z) < 0.5 and hence y = 0 (illustrated in the sketch below).
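A minimal sketch of this hypothesis, assuming NumPy and illustrative values for the coefficients m and c:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, m, c, threshold=0.5):
    # Hypothesis: y_predicted = sigmoid(m*x + c), then apply the 0.5 threshold.
    probability = sigmoid(m * x + c)
    return (probability >= threshold).astype(int), probability

# z = 0 gives exactly 0.5; positive z pushes towards 1, negative z towards 0.
print(sigmoid(0.0))    # 0.5
print(sigmoid(4.0))    # close to 1
print(sigmoid(-4.0))   # close to 0

# m and c here are made-up values, only to show the thresholding step.
labels, probs = predict(np.array([-2.0, 0.0, 3.0]), m=1.0, c=0.0)
print(labels, probs)
```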
Interpretation of Hypothesis Output
The hypothesis function returns the probability value. Let’s consider P(0) as the probability that data belongs to class 0 and P(1) as the probability that data belongs to class 1.
P(0) + P(1) = 1, so P(0) = 1 − P(1)
The probability that the data belongs to class 0 can be found using the above formula if the probability for class 1 is already known. For example, if P(1) = 0.8, then P(0) = 1 − 0.8 = 0.2.
Cost Function of Logistic Regression
The cost function used in Linear Regression is Mean Squared Error (MSE). Can we use the same function for Logistic Regression? Before answering this question, let's understand the purpose of a cost function.
- The cost function is used to evaluate the performance of the model by comparing the predicted value to the actual value.
- A good model has predicted values close to the actual values, and hence we need to minimize the cost function.
- To minimize the cost function, we need to find the coefficient values for which the cost is lowest, using an optimization algorithm like Gradient Descent.
Can MSE be used as a Cost Function for Logistic Regression?
- No, we cannot use the MSE cost function for Logistic Regression.
- When the sigmoid hypothesis is plugged into MSE, the resulting cost function is non-convex in the model coefficients.
- On a non-convex function, gradient descent can get stuck in a local minimum and treat those coefficient values as the ones with the least cost.
- So, if a binary classification model is trained with the MSE cost function, it is not guaranteed to find the global minimum of the cost (see the sketch below).
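As a rough illustration (not a proof), the sketch below evaluates both costs over a grid of values for a single weight on a made-up dataset and checks the discrete second difference. On toy data like this, the MSE-with-sigmoid curve typically shows regions of negative curvature, while the log-loss curve does not:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up 1-D data with a single weight w (no intercept, for simplicity).
x = np.array([-2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 0.0])   # one "noisy" label to make the curve interesting

weights = np.linspace(-6, 6, 401)
mse_cost, log_cost = [], []
for w in weights:
    p = sigmoid(w * x)
    mse_cost.append(np.mean((y - p) ** 2))
    p = np.clip(p, 1e-12, 1 - 1e-12)      # avoid log(0)
    log_cost.append(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

# A convex curve never has a (meaningfully) negative second difference.
print("MSE curve has negative curvature somewhere:",
      bool((np.diff(mse_cost, 2) < -1e-9).any()))
print("Log-loss curve has negative curvature somewhere:",
      bool((np.diff(log_cost, 2) < -1e-9).any()))
```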
Cost Function for Logistic Regression
We know Logistic Regression is a Binary Classification Algorithm. Let’s try to find an optimal cost function for it. Below are a few cases and their required error.
Look at the table below; it tells you what the error should be in different cases.
| Actual Class | Predicted Class | Error |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | inf |
| 1 | 1 | 0 |
| 1 | 0 | inf |
When the actual class and the predicted class are the same, the error is 0, and when they are not, the error is infinite. We need a function inside the cost function that can return both 0 and infinity. Hence, the log cost function is used in Logistic Regression.
Cost Function Formula
If y = 1, cost = −log(y_predicted); if y = 0, cost = −log(1 − y_predicted)
- if y = 1 and y_predicted = 1, cost = −log(1) = 0
- if y = 1 and y_predicted = 0, cost = −log(0) = inf
- if y = 0 and y_predicted = 0, cost = −log(1 − 0) = −log(1) = 0
- if y = 0 and y_predicted = 1, cost = −log(1 − 1) = −log(0) = inf
The above two cases can be compressed into a single cost function: cost = −[y · log(y_predicted) + (1 − y) · log(1 − y_predicted)].
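A small sketch of this combined cost, together with a plain gradient descent loop that minimizes it on a made-up 1-D dataset (the learning rate, iteration count, and data here are illustrative choices, not fixed parts of the algorithm):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, y_pred):
    # Combined cost: -[y*log(y_pred) + (1-y)*log(1-y_pred)], averaged over samples.
    y_pred = np.clip(y_pred, 1e-12, 1 - 1e-12)   # avoid log(0)
    return np.mean(-(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred)))

# Made-up 1-D training data.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Gradient descent on m and c for the hypothesis sigmoid(m*x + c).
m, c, lr = 0.0, 0.0, 0.1
for step in range(2000):
    y_pred = sigmoid(m * x + c)
    error = y_pred - y                  # gradient of the log loss w.r.t. z = m*x + c
    m -= lr * np.mean(error * x)
    c -= lr * np.mean(error)

print("cost:", log_loss(y, sigmoid(m * x + c)))
print("predictions:", (sigmoid(m * x + c) >= 0.5).astype(int))
```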
Logistic Regression in Multiclass Classification
We have learned that Logistic Regression is a Binary Classification Algorithm. Does that mean it can’t be used for Multiclass Classification?
By itself, Logistic Regression cannot handle multiclass classification. However, a few strategies let us use it for multiclass problems:
- Multiclass Classification problems can be divided into multiple binary classification problems.
- The logistic regression model can be applied to each subproblem.
- This approach is called one-vs-rest.
- Hence, Logistic Regression can be applied to multiclass classification using the one-vs-rest approach (see the sketch after this list).
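A minimal sketch of the one-vs-rest approach, assuming scikit-learn is available and using its built-in Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)   # 3 classes: Setosa, Versicolor, Virginica
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Wrap binary logistic regression in a one-vs-rest scheme:
# one "class k vs. everything else" model is trained per class.
model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

print("Number of underlying binary models:", len(model.estimators_))
print("Test accuracy:", model.score(X_test, y_test))
```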
Performance Metrics
We need evaluation metrics to check how well the trained model performs on unseen data; a small example of computing them follows the list below.
Performance Metrics for Classification Models
- Accuracy
- Precision
- Recall
- F1 Score
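A small sketch of computing these metrics, assuming scikit-learn and made-up binary labels and predictions:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical true labels and model predictions for a binary problem.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 Score :", f1_score(y_true, y_pred))
```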
End Notes
I hope this article helped you understand logistic regression better. You can learn about other machine learning algorithms from here.
Feel free to ask any query or give your feedback in the comment box below.
Happy Learning!