Significance Level

A Simple Explanation - By Varsha Saini

The significance level is a common term in statistics, especially in hypothesis testing. It is the probability of rejecting the null hypothesis when it is actually true. Rejecting a correct null hypothesis is a mistake, so the significance level can be regarded as the probability of an error, specifically a Type 1 (False Positive) error.

The Null Hypothesis (H0) is the hypothesis to be tested.

The Alternative Hypothesis (H1) is the statement that contradicts the Null Hypothesis.
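As a concrete example, suppose you want to test whether a coin is fair: H0 says the probability of heads is 0.5, and H1 says it is not. A minimal sketch of such a test, assuming Python with SciPy and hypothetical flip counts:

```python
# Minimal sketch of stating and testing H0 vs H1 (hypothetical data).
# H0: the coin is fair (p = 0.5); H1: the coin is not fair (p != 0.5).
from scipy.stats import binomtest

heads = 62   # hypothetical: 62 heads observed
flips = 100  # in 100 flips
result = binomtest(heads, flips, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.4f}")
# A small p-value is evidence against H0 (the coin being fair).
```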

How to select the level of significance?

The level of significance, or significance level, is denoted by α. The most common values of α are 0.01, 0.05, and 0.10, which correspond to 99%, 95%, and 90% confidence in a test. The value of α is selected based on how much certainty you need: α = 0.01 demands more certainty than 0.05, which demands more than 0.10. If you want to be 95% confident, then there is a 5% (100 - 95) risk of rejecting a null hypothesis that was true. The same applies to 90% confidence, where the significance level is 10%, and to 99% confidence, where the significance level is 1%.
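To illustrate the trade-off, the sketch below (assuming Python with NumPy and SciPy, and made-up sample data) runs a one-sample t-test and compares the same p-value against the three common values of α:

```python
# Minimal sketch: the same p-value can lead to different decisions
# depending on the significance level you chose beforehand.
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(0)
sample = rng.normal(loc=50.8, scale=2.0, size=30)  # hypothetical measurements
t_stat, p_value = ttest_1samp(sample, popmean=50)  # H0: true mean is 50

for alpha in (0.01, 0.05, 0.10):
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f} ({(1 - alpha):.0%} confidence): {decision}")
```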

The value of the significance level depends on the problem to which hypothesis testing is applied. A low significance level is used in cases where high certainty is required, and a higher one where more uncertainty is acceptable.

For example, if you want to check whether a machine is working properly, you may go with a 0.01 significance level, since you expect little or no room for mistakes.
For cases like analyzing human behavior, you may go with a 0.10 significance level, since human behavior can be very uncertain.

False Positive Errors in Hypothesis Testing

Consider a binary classification problem where the target classes are Positive and Negative. If the actual class is Negative but the model predicts Positive, that prediction is a False Positive, or a Type 1 Error.

You can look at the confusion matrix below to understand the error better.

                    Predicted Positive               Predicted Negative
Actual Positive     True Positive                    False Negative (Type 2 Error)
Actual Negative     False Positive (Type 1 Error)    True Negative
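As a rough illustration, assuming Python with scikit-learn and made-up labels, you can compute the matrix and read off the false positives directly:

```python
# Minimal sketch: build a confusion matrix from hypothetical labels
# and count the False Positives (Type 1 errors).
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 1 = Positive, 0 = Negative (hypothetical)
y_pred = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0]  # model predictions (hypothetical)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"False Positives (Type 1 errors): {fp}")
print(f"False Negatives (Type 2 errors): {fn}")
```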

The probability of making a False Positive Error is α. Since the value of the significance level (α) is selected by you, the responsibility for accepting this risk of error rests with you.
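To see why the false positive rate equals α, a quick simulation helps: if H0 is really true and we run many tests, roughly a fraction α of them will reject it anyway. A minimal sketch, assuming Python with NumPy and SciPy and simulated data:

```python
# Minimal sketch: simulate many tests where H0 is actually true;
# the fraction of (wrong) rejections lands close to alpha.
import numpy as np
from scipy.stats import ttest_1samp

rng = np.random.default_rng(1)
alpha = 0.05
trials = 10_000

rejections = 0
for _ in range(trials):
    sample = rng.normal(loc=50, scale=2.0, size=30)  # H0 is true: mean really is 50
    _, p_value = ttest_1samp(sample, popmean=50)
    if p_value < alpha:
        rejections += 1

print(f"Observed false positive rate: {rejections / trials:.3f} (alpha = {alpha})")
```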