To understand how data is distributed, it helps to measure its variability, or spread. Two statistical measures capture this: variance and standard deviation.
What is Standard Deviation?
Standard Deviation (σ) is a measure of the dispersion, or scatter, of data around its mean. Dispersion is the extent to which the values in a distribution differ from the mean of that distribution.
- Low standard deviation denotes that the values are clustered around the mean.
- High standard deviation denotes that the data is more spread out, with values lying far from the mean.
- Zero standard deviation denotes that all the values lie at the mean.
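The three cases above can be checked with a small sketch. The datasets are made-up values chosen so that each has a mean of 50, and the standard deviation is computed with `statistics.pstdev` from the Python standard library, treating each list as a whole population.

```python
import statistics

# Three hypothetical datasets, each with a mean of 50
clustered = [49, 50, 50, 51]   # values close to the mean
spread = [10, 30, 70, 90]      # values far from the mean
constant = [50, 50, 50, 50]    # every value equals the mean

for name, data in [("clustered", clustered), ("spread", spread), ("constant", constant)]:
    # pstdev treats the list as a complete population
    print(name, statistics.mean(data), statistics.pstdev(data))
```

The clustered list gives the smallest non-zero standard deviation, the spread list gives a much larger one, and the constant list gives exactly zero.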
How is Standard Deviation Calculated?
Standard Deviation is calculated as the square root of variance.
Variance
Variance is a measure of the variability in data, i.e. how far the data is spread from the mean. The higher the variance, the greater the spread in the data.
Variance = $\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2$
where
$x_i$ = value of the i-th data point
$n$ = number of data points
$i$ = index that runs from 1 to n
$\mu$ = mean of the data
In the above formula, each deviation from the mean is squared so that positive and negative deviations do not cancel each other out. Every term then contributes a non-negative amount to the sum.
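As a rough illustration of why the deviations are squared, the sketch below computes the variance step by step. The function name `population_variance` and the sample values are illustrative choices, and the sum is divided by n, the number of data points.

```python
def population_variance(values):
    """Average squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values; the mean is 5

mean = sum(data) / len(data)
# Without squaring, deviations above and below the mean cancel out
raw_deviation_sum = sum(x - mean for x in data)
variance = population_variance(data)

print(raw_deviation_sum)  # 0.0
print(variance)           # 4.0
```

The unsquared deviations sum to zero for any dataset, which is why they cannot measure spread on their own.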
Standard Deviation Formula
Standard Deviation is calculated by taking the square root of Variance.
Standard Deviation = $\sqrt{\text{Variance}}$
1. Population Standard Deviation
Population standard deviation is calculated using every value in the population, so it is a fixed value for that population.
$\sigma = \sqrt{\dfrac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$
where:
- Σ: A symbol that means “sum”
- xi: The ith value in a dataset
- μ: The population mean
- N: The population size
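The population formula can be sketched in Python as follows. The helper `population_std` and the `ages` list are hypothetical, and the result is compared with `statistics.pstdev` from the standard library, which computes the same quantity.

```python
import math
import statistics

def population_std(values):
    """sigma = sqrt(sum((x_i - mu)^2) / N), using every value in the population."""
    N = len(values)
    mu = sum(values) / N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / N)

# Hypothetical population: the ages of all five members of a small team
ages = [23, 25, 28, 31, 33]

sigma = population_std(ages)
print(sigma)
print(statistics.pstdev(ages))  # standard-library equivalent
```

Both calls should print the same value up to floating-point rounding.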
2. Sample Standard Deviation
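Sample standard deviation is calculated from a sample drawn from the population, so it is not a fixed value but a statistic that varies from sample to sample. The sum is divided by n – 1 rather than n to compensate for using the sample mean in place of the unknown population mean.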
$s = \sqrt{\dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$
where:
- Σ: A symbol that means “sum”
- xi: The ith value in a dataset
- x̄: The sample mean
- n: The sample size
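A minimal sketch of the sample formula is shown below. The helper `sample_std` and the `sample` values are illustrative, and `statistics.stdev` is the standard-library function that also divides by n – 1.

```python
import math
import statistics

def sample_std(values):
    """s = sqrt(sum((x_i - x_bar)^2) / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))

# Hypothetical sample of five values drawn from a larger population
sample = [23, 25, 28, 31, 33]

s = sample_std(sample)
print(s)
print(statistics.stdev(sample))   # divides by n - 1, like sample_std
print(statistics.pstdev(sample))  # divides by n, so it is smaller
```

For the same numbers, the sample standard deviation is always slightly larger than the population standard deviation, because the divisor n – 1 is smaller than n.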
Difference between Standard Deviation and Variance
| Standard Deviation | Variance |
| --- | --- |
| The square root of the variance. | The average squared difference from the mean. |
| Expressed in the same units as the data. | Expressed in the squared units of the data. |
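One way to see the difference in units is to convert the same measurements from metres to centimetres. In the sketch below (the heights are made-up values), the standard deviation scales by 100, like the data itself, while the variance scales by 100 squared.

```python
import statistics

# Hypothetical heights, first in metres and then in centimetres
heights_m = [1.5, 1.6, 1.7, 1.8]
heights_cm = [h * 100 for h in heights_m]

std_ratio = statistics.pstdev(heights_cm) / statistics.pstdev(heights_m)
var_ratio = statistics.pvariance(heights_cm) / statistics.pvariance(heights_m)

print(std_ratio)  # approximately 100: same units as the data
print(var_ratio)  # approximately 10000: squared units
```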