A Probability Distribution is a statistical function that gives the probability of each possible value, or range of values, that a random variable can take. In other words, it describes the relative likelihood of all possible outcomes of an experiment.
Consider the random event of throwing a die: it can return 6 possible values (1, 2, 3, 4, 5, 6). For a fair die, the probability that 1 will be returned is 1/6, or about 17%.
Probability Density Function (PDF)
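The Probability Density Function describes how densely probability is spread over the values of a continuous random variable. The probability that the variable lies within a specific range of values is the area under the PDF over that range.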
Cumulative Distribution Function (CDF)
The Cumulative Distribution Function of a random variable R, evaluated at a point x, gives the probability that R will take a value less than or equal to x.
Probability Mass Function (PMF)
The Probability Mass Function gives the probability that a discrete random variable takes each of its possible values.
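To make the PMF and CDF definitions concrete, here is a minimal sketch for the fair die from the introduction. The names `pmf` and `cdf` are chosen for illustration only, and the die is assumed to be fair.

```python
from fractions import Fraction

# PMF of a fair six-sided die: every face has probability 1/6 (assumption: fair die).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """P(R <= x): add up the PMF over every face that does not exceed x."""
    return sum(p for face, p in pmf.items() if face <= x)

print(float(pmf[1]))  # probability of rolling a 1, about 0.17
print(cdf(3))         # probability of rolling 3 or less
print(cdf(6))         # all faces together, so the total probability
```

The CDF is just a running total of the PMF, which is why it reaches 1 at the largest possible value.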
Types of Probability Distributions
- Discrete Distribution
- Continuous Distribution
Discrete variables represent data that is countable (e.g. number of apples). Continuous variables represent data that can take any value within a range (e.g. temperature, age).
Discrete Distribution
Discrete Distribution represents the probability distribution of discrete data. A few discrete distributions are:
- Binomial Distribution
- Bernoulli Distribution
- Uniform Distribution
- Poisson Distribution
1. Binomial Distribution
The Binomial Distribution is used when each of n trials has only two possible outcomes, success (1) or failure (0). It gives the probability of a given number of successes across the n trials. The probability of success stays the same in every trial.
Properties
- Each trial is independent.
- There are only two possible outcomes 1 and 0.
- The probability of success (p) and the probability of failure (q = 1 - p) do not change from trial to trial.
Probability Mass Function
$$P(x) = \binom{n}{x} p^{x} q^{n-x}$$
where
- P(x) = Binomial Probability
- n = number of trials
- p = probability of success
- q = probability of failure (q = 1 - p)
- x = number of successes in n trials.
Example
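As an illustrative sketch (not the article's original example), the snippet below evaluates the standard binomial PMF, P(x) = C(n, x) p^x q^(n-x), for a hypothetical experiment: tossing a fair coin 10 times and asking for exactly 3 heads. The function name `binomial_pmf` and the numbers are assumptions made for the example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical experiment: 10 tosses of a fair coin, exactly 3 heads.
prob = binomial_pmf(3, 10, 0.5)
print(round(prob, 4))

# The probabilities over all possible x (0..n) add up to 1.
print(sum(binomial_pmf(x, 10, 0.5) for x in range(11)))
```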
2. Bernoulli Distribution
Similar to the Binomial Distribution, Bernoulli Distribution has only two possible outcomes, success (1) or failure (0) but only one trial.
Properties
- The number of trials is 1.
- There are only two possible outcomes 1 and 0.
- The probability of success (p) and the probability of failure (q = 1 - p) need not be equal.
Bernoulli Distribution is similar to the Binomial Distribution. The only difference is that in Bernoulli, n=1 always, and x will take a value of 0 or 1.
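Probability Mass Function
$$P(x) = p^{x} q^{1-x}$$
where x ∈ {0, 1}, p is the probability of success, and q = 1 - p is the probability of failure.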
You can derive the probability mass function of the Bernoulli distribution from the PMF of the Binomial distribution by setting n = 1, so that x can only be 0 or 1.
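Written out, the substitution looks like this, assuming the standard binomial PMF with q = 1 - p. Because the binomial coefficient equals 1 for both x = 0 and x = 1, it drops out:

$$P(x) = \binom{1}{x} p^{x} q^{1-x} = p^{x} q^{1-x}, \qquad x \in \{0, 1\}$$

$$P(1) = p, \qquad P(0) = q = 1 - p$$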
3. Uniform Distribution / Rectangular Distribution
When you roll a fair die, the probability of getting each value (from 1 to 6) is equal (1/6). This is a classic example of a discrete Uniform Distribution. The continuous version spreads probability evenly over an interval from a to b.
Properties
- Uniform distribution is a probability distribution where all outcomes are equally likely.
- Its probability (or, in the continuous case, its density) is constant over the whole range of outcomes.
Density Function
$$f(x) = \frac{1}{b-a}$$
for -∞ < a ≤ x ≤ b < ∞
- Area of Rectangle = length * width.
- Area of Rectangle = (b-a)*(1/(b-a)) =1.
- The area under the curve is therefore always 1, as it must be for any probability density.
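The rectangle argument above can be sketched in a few lines. This assumes the continuous uniform density with height 1/(b-a) on the interval from a to b; the function names and the interval (2, 10) are hypothetical.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi): width of the slice inside [a, b] times the height 1/(b-a)."""
    lo, hi = max(lo, a), min(hi, b)  # clip the slice to the support
    return max(hi - lo, 0) / (b - a)

a, b = 2, 10  # hypothetical interval
print(uniform_pdf(5, a, b))      # height of the rectangle
print(uniform_prob(a, b, a, b))  # whole rectangle: total probability
print(uniform_prob(4, 6, a, b))  # a slice of width 2
```

Any probability for a uniform variable is just the width of the requested slice divided by the full width b - a.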
4. Poisson Distribution
Poisson Distribution can be used to find the probability of a given number of events occurring in a fixed time period. For example, imagine you run a clinic and want to know how many patients are likely to visit in a day. It can be any number. If you know the average number of patients per day, the Poisson Distribution gives the probability of seeing any particular number of patients in a day.
Properties
- All events occur independently.
- An event can occur any number of times.
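Probability Mass Function
$$P(x) = \frac{e^{-\mu} \mu^{x}}{x!}$$
where μ is the mean number of events in the time period and x is the number of events (x = 0, 1, 2, ...)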
Example
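As an illustrative sketch (not the article's original example), the snippet below applies the standard Poisson PMF, P(x) = e^(-μ) μ^x / x!, to the clinic scenario. The average of 20 patients per day and the function name `poisson_pmf` are assumptions made for the example.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean number of events is mu."""
    return exp(-mu) * mu**x / factorial(x)

mu = 20  # hypothetical average number of patients per day

# Probability that exactly 15 patients visit in a day.
print(poisson_pmf(15, mu))

# Probability that at most 15 patients visit: add up the PMF for x = 0..15.
print(sum(poisson_pmf(x, mu) for x in range(16)))
```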
Continuous Distribution
Continuous Distribution represents the probability distribution of continuous data. A few continuous distributions are:
- Normal Distribution
- Standard Normal Distribution
- Student’s T Distribution
- Chi-Squared Distribution
1. Normal Distribution / Gaussian Distribution
A Random Variable (X) having mean (μ) and standard deviation (σ) is said to be Normally Distributed if it has the below properties:
Properties
- Mean = Median = Mode
- No Skewness
- Its density follows a bell-shaped curve
- It is symmetrical on both sides of the mean
Empirical Rule
- About 68% of the data lies within one standard deviation of the mean (one to the left and one to the right).
- About 95% of the data lies within two standard deviations of the mean.
- About 99.7% of the data lies within three standard deviations of the mean.
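The three percentages can be checked numerically. A minimal sketch, relying on the fact that for any normal distribution the probability of falling within k standard deviations of the mean is erf(k / √2); the function name `within_k_sigma` is chosen for illustration.

```python
from math import erf, sqrt

def within_k_sigma(k):
    """P(mean - k*sd <= X <= mean + k*sd) for a normally distributed X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    # Print k and the covered share of the data as a percentage.
    print(k, round(within_k_sigma(k) * 100, 1))
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal distribution.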
Probability Density Function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$
2. Standard Normal Distribution
Standard Normal Distribution is a special type of Normal Distribution where the mean is 0 and the standard deviation is 1.
How to convert Normal Distribution into Standard Normal Distribution?
Normal Distribution can be converted to Standard Normal Distribution by using a z-score. This process is also called Standardization.
$$z = \frac{x - \mu}{\sigma}$$
where
- μ = mean
- σ = standard deviation
- x = current value
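Standardization can be sketched with Python's standard library. This assumes the z-score is computed as (value - mean) / standard deviation; the data values are hypothetical, and the population standard deviation (`pstdev`) is used.

```python
from statistics import mean, pstdev

data = [52, 60, 65, 70, 78]  # hypothetical measurements

mu = mean(data)       # mean of the data
sigma = pstdev(data)  # population standard deviation

# Subtract the mean and divide by the standard deviation for every value.
z_scores = [(x - mu) / sigma for x in data]
print(z_scores)

# After standardization the mean is 0 and the standard deviation is 1
# (up to floating-point rounding).
print(mean(z_scores), pstdev(z_scores))
```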
3. Student’s T Distribution
When the sample size is small and the population standard deviation is unknown, the standardized sample mean (for normally distributed data) does not follow the Normal Distribution. It follows Student’s T Distribution, which has heavier tails.
Student’s T-Distribution is therefore used when the sample size is low and the population standard deviation has to be estimated from the sample.
Let n be the sample size, then the Degrees of Freedom is n-1. As the degrees of freedom increase, the Student’s T Distribution approaches the Standard Normal Distribution.
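The convergence towards the standard normal curve can be observed numerically. The sketch below assumes the standard density formulas for both distributions; the function names and the evaluation point 2.0 are chosen for illustration. Log-gamma is used so that large degrees of freedom do not overflow.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + t * t / df))

def std_normal_pdf(z):
    """Density of the standard normal distribution (mean 0, standard deviation 1)."""
    return exp(-z * z / 2) / sqrt(2 * pi)

# Compare the two densities in the tail as the degrees of freedom grow.
for df in (1, 5, 30, 1000):
    print(df, t_pdf(2.0, df), std_normal_pdf(2.0))
```

For small degrees of freedom the t density in the tail is noticeably larger than the normal density, and the gap shrinks as the degrees of freedom grow.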
4. Chi-Squared Distribution
The Chi-Squared distribution with k degrees of freedom describes the distribution of the sum of the squares of k independent standard normal random variables.
The mean of the chi-square distribution is equal to the degrees of freedom (k) while the variance is twice the degrees of freedom (2k).
The shape of the chi-square distribution depends on the number of degrees of freedom k. When k is small, the shape of the curve tends to be skewed to the right, and as the k gets larger, the shape becomes more symmetrical and approaches a normal distribution.
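The statements about the mean and variance can be checked with a small simulation. This sketch assumes the usual definition of a chi-squared variable as the sum of k squared independent standard normal draws; k = 4, the sample count, and the seed are arbitrary choices.

```python
import random
from statistics import mean, pvariance

def chi_squared_sample(k, rng):
    """One chi-squared draw: the sum of k squared standard normal values."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(k))

rng = random.Random(0)  # fixed seed so the run is repeatable
k = 4                   # degrees of freedom (arbitrary choice)
samples = [chi_squared_sample(k, rng) for _ in range(20000)]

# The sample mean should be close to k and the sample variance close to 2k.
print(mean(samples), pvariance(samples))
```

Every sample is non-negative because it is a sum of squares, which matches the range of 0 to ∞.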
Properties
- It is used in the chi-squared goodness-of-fit test, which checks how well observed data match an expected distribution.
- The value of the chi-squared distribution ranges from 0 to ∞.
End Notes
Thank you for reading this article. You should now be familiar with the different probability distributions that are frequently used in Statistics.