Autocorrelation

A Simple Explanation - By Varsha Saini

Autocorrelation is the correlation between a time series and a past (lagged) version of itself. An autocorrelation of +1 between the two versions indicates a perfect positive correlation, and -1 indicates a perfect negative correlation.

Autocorrelation can be used to measure the influence of past values on current values. For example, in stock prices, autocorrelation can be used to measure how strongly yesterday’s price is related to today’s price.
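To make the idea concrete, here is a minimal sketch in plain Python that correlates a series with a copy of itself shifted by a given lag. The function name lag_correlation and the two toy series are made up for illustration; they are not part of any library.

```python
from math import sqrt


def lag_correlation(series, lag):
    """Pearson correlation between a series and itself shifted by `lag` steps."""
    if lag < 1 or lag > len(series) - 2:
        raise ValueError("lag must leave at least two overlapping points")
    current = series[lag:]    # values at time t
    lagged = series[:-lag]    # values at time t - lag
    n = len(current)
    mean_c = sum(current) / n
    mean_l = sum(lagged) / n
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged))
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_l = sum((l - mean_l) ** 2 for l in lagged)
    if var_c == 0 or var_l == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_c * var_l)


# Hypothetical toy data: a steady trend and a series that flips sign every step.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(lag_correlation(trend, 1))        # close to +1
print(lag_correlation(alternating, 1))  # close to -1
```

The trend moves in step with its own past, so its lag-1 autocorrelation is close to +1. The alternating series always moves against its previous value, so the result is close to -1.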

ACF and PACF are two functions used to evaluate autocorrelation, i.e. the correlation between a value and a lagged version of itself. In an ACF or PACF plot, every black vertical line represents the autocorrelation at one lag. If the blue dot at the tip of a line falls outside the shaded blue region around zero, the autocorrelation at that lag is statistically significant.
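The check against the shaded region can be sketched with a common rule of thumb: for a series with no autocorrelation, roughly 95% of sample autocorrelations fall within ±1.96 / sqrt(N), where N is the number of observations. This is an assumption made for illustration. The region drawn by statsmodels may be computed differently, and the autocorrelation values below are made up.

```python
from math import sqrt


def white_noise_bound(n_obs, z=1.96):
    """Approximate 95% bound for the autocorrelation of an uncorrelated series."""
    return z / sqrt(n_obs)


def significant_lags(autocorrs, n_obs):
    """Return the lags (starting at 1) whose autocorrelation lies outside the bound."""
    bound = white_noise_bound(n_obs)
    return [lag for lag, r in enumerate(autocorrs, start=1) if abs(r) > bound]


# Made-up autocorrelations for lags 1 to 4 of a series with 100 observations.
example_acf = [0.62, 0.35, 0.12, -0.05]
print(significant_lags(example_acf, n_obs=100))  # [1, 2]
```

With 100 observations the bound is 0.196, so only lags 1 and 2 count as significant in this example.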

Autocorrelation Function (ACF)

The Autocorrelation Function (ACF) shows both the direct and the indirect relationship between a value and a lagged version of itself.


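import matplotlib.pyplot as plt
import statsmodels.graphics.tsaplots as sgt

# df is assumed to be a pandas DataFrame with a numeric market_value column.
# zero=False hides lag 0, whose autocorrelation is always 1.
sgt.plot_acf(df.market_value, zero=False, lags=40)
plt.title("ACF for Prices", size=20)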

plt.show()

In the resulting plot, all 40 lags are significantly correlated with the current value, although the correlation weakens as the lag increases.
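The decay described above can be reproduced on a synthetic series. The sketch below is only an illustration: it simulates a series where each value is 0.8 times the previous value plus random noise, then computes the usual sample autocorrelation for the first few lags. The helper sample_acf, the coefficient 0.8 and the seed are all assumptions, not part of the original data.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 1..max_lag, using the overall mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares, used to normalise
    return [
        sum(dev[t] * dev[t - lag] for t in range(lag, n)) / c0
        for lag in range(1, max_lag + 1)
    ]


# Simulated data (assumption): x[t] = 0.8 * x[t-1] + noise.
rng = random.Random(0)
phi = 0.8
series = [0.0]
for _ in range(4999):
    series.append(phi * series[-1] + rng.gauss(0.0, 1.0))

acf_values = sample_acf(series, max_lag=5)
for lag, r in enumerate(acf_values, start=1):
    print(f"lag {lag}: {r:.2f}")
```

Every lag shows a positive autocorrelation, but the values shrink as the lag grows, which is the same shape as the ACF plot for prices.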

Partial Autocorrelation Function (PACF)

The Partial Autocorrelation Function (PACF) shows only the direct relationship between a value and a lagged version of itself.


# method="ols" estimates each partial autocorrelation by regressing the series on its lags.
sgt.plot_pacf(df.market_value, lags=40, alpha=0.05, zero=False, method="ols")

plt.title("PACF for Prices", size = 20)

plt.show()

The PACF plot looks different from the ACF plot for the same series. The ACF plot shows many lags correlated with the current value, because a lagged value can also be related to the current value indirectly. The PACF plot shows only a few significant lags, because the direct correlation weakens quickly as we move further into the past.
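The contrast between the two plots can be sketched numerically. One standard way to turn autocorrelations into partial autocorrelations is the Durbin-Levinson recursion. This is not necessarily what statsmodels does with method="ols"; it is used here only because it fits in a few lines of plain Python. The input is the theoretical ACF of a series where each value depends only on the previous one with coefficient 0.8, which is an assumption for illustration.

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion.

    rho[0] is the autocorrelation at lag 1, rho[1] at lag 2, and so on.
    Returns the partial autocorrelations for the same lags.
    """
    pacf = []
    prev = []  # coefficients of the previous (order k-1) autoregressive fit
    for k in range(1, len(rho) + 1):
        if k == 1:
            phi_kk = rho[0]
        else:
            num = rho[k - 1] - sum(prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF when each value is 0.8 times the previous value plus noise.
theoretical_acf = [0.8 ** k for k in range(1, 6)]
pacf_values = pacf_from_acf(theoretical_acf)

for lag, (r, p) in enumerate(zip(theoretical_acf, pacf_values), start=1):
    print(f"lag {lag}: ACF {r:.3f}  PACF {p:.3f}")
```

The ACF stays clearly above zero at every lag, while the PACF is 0.8 at lag 1 and essentially zero afterwards, because only the previous value affects the current one directly.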

Indirect Relation

By indirect relation, we mean all the indirect paths through which past data can affect the current value. For example, the value three days ago (t-3) affects the values two days ago and one day ago, which in turn affect the current value.

Direct Relation

Direct relation refers to the direct relationship between a past value and the current value. For example, it describes how the value three days ago affects the current value once all intermediate paths are removed.
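The two relations can be separated with a little arithmetic in the simplest case, lag 2, where there is only one day in between. The sketch below uses the standard partial correlation formula and assumes a stationary series, so that the correlation between neighbouring days is the same everywhere. The function name split_lag2 and the input values are made up for illustration.

```python
def split_lag2(rho1, rho2):
    """Split the lag-2 autocorrelation into an indirect and a direct part.

    rho1: autocorrelation at lag 1, rho2: autocorrelation at lag 2.
    indirect: correlation expected from the chain t-2 -> t-1 -> t alone.
    direct:   partial autocorrelation at lag 2 (the day in between held fixed).
    """
    indirect = rho1 * rho1
    direct = (rho2 - indirect) / (1.0 - rho1 * rho1)
    return indirect, direct


# Case 1: the lag-2 correlation is fully explained by the day in between.
print(split_lag2(0.8, 0.64))

# Case 2: some correlation remains after removing the path through t-1.
print(split_lag2(0.5, 0.4))
```

In the first case the direct part is essentially zero, so the value two days ago matters only through yesterday's value. In the second case the direct part is about 0.2, so the value two days ago still has an effect of its own.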