Moving Average

A Simple Explanation - By Varsha Saini

A moving average takes the average of a range of values, and the range is updated continuously. It is called Moving because the range of values shifts continuously, and Average because it takes the average (mean) of that range.

The core assumption of the moving average is that the data is stationary and has a slowly varying mean. If the data is not stationary, we first need to transform it into a stationary series before applying the moving average.
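One common way to remove a trend before averaging is first-order differencing. The sketch below is only an illustration on a made-up series; it is an assumption that differencing suits your data, and other transformations may fit better.

```python
import pandas as pd

# Made-up series with a steady upward trend, so its mean is not constant.
trending = pd.Series([1.0, 3.0, 5.0, 7.0, 9.0])

# First-order differencing: replace each value by its change from the previous one.
differenced = trending.diff().dropna()

print(differenced.tolist())  # [2.0, 2.0, 2.0, 2.0]
```

The differenced series has a constant level, which is the kind of input the moving average assumes.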

Moving Average Smoothing

Smoothing in time series is a technique for reducing the short-term variation between time steps. The moving average is one of the most commonly used smoothing processes in time series. It creates a new series from the raw series in which every new value is the average of the last n values. The value n is called the window size or window width.

Types of Moving Average

There are various types of moving averages:

1. Simple Moving Average

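It is simply the average of the most recent n values, recomputed each time the window moves forward by one time step.

$$\mathrm{SMA}_t = \frac{x_t + x_{t-1} + \dots + x_{t-n+1}}{n}$$

where x_t = value at current time t and n = number of data points (window size)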
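A small worked example may make the window idea concrete. It uses a made-up series and a window size of 3; the numbers are purely illustrative.

```python
import pandas as pd

# Toy series, made up purely for illustration.
values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Window size n = 3: each output is the mean of the current value
# and the two values before it.
sma = values.rolling(window=3).mean()

print(sma.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```

The first two entries are NaN because a full window of 3 values is not yet available there.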

2. Weighted Moving Average

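It is the weighted average of the last n values, in which more recent values are given more weight.

$$\mathrm{WMA}_t = \frac{w_1 x_t + w_2 x_{t-1} + \dots + w_n x_{t-n+1}}{w_1 + w_2 + \dots + w_n}$$

where w_1, ..., w_n = weights, with the largest weight on the most recent value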
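To see how the weights change the result, here is a minimal sketch for a single window. The helper name `weighted_average` and the numbers are hypothetical; the window is stored oldest first, so the last weight applies to the most recent value.

```python
import numpy as np

def weighted_average(window, weights):
    """Weighted mean of `window`; weights[-1] applies to the most recent value."""
    window = np.asarray(window, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(weights, window) / weights.sum())

# Three most recent observations, oldest first (made-up numbers).
window = [10.0, 20.0, 30.0]
weights = [0.5, 1.0, 1.5]

wma_value = weighted_average(window, weights)  # (5 + 20 + 45) / 3
sma_value = float(np.mean(window))             # plain average of the same window

print(wma_value, sma_value)
```

Because the series is rising and the newest value carries the largest weight, the weighted average lands above the simple average.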

3. Exponential Moving Average

It is another type of moving average in which recent values are given more weight, with the weights decaying exponentially for older values.

  • There is no need to decide on weights manually.
  • It adapts more quickly to changes in the data.
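The exponential moving average is often written as a recursion: each new value is a blend of the current observation and the previous average. The sketch below assumes that form, with a hypothetical helper `ema` and made-up data, and checks it against pandas `ewm(..., adjust=False)`, which is documented to use the same recursion.

```python
import numpy as np
import pandas as pd

def ema(values, alpha):
    """Recursive EMA: out[0] = x[0]; out[t] = alpha*x[t] + (1-alpha)*out[t-1]."""
    out = [float(values[0])]
    for x in values[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return np.array(out)

x = [10.0, 12.0, 11.0, 15.0, 14.0]  # made-up observations
alpha = 0.5                         # illustrative choice, not a recommendation

manual = ema(x, alpha)
with_pandas = pd.Series(x).ewm(alpha=alpha, adjust=False).mean().to_numpy()

print(np.allclose(manual, with_pandas))
```

Only one number (here `alpha`) has to be chosen; the individual weights follow from it.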

4. Exponential Smoothing

This type has an explicit smoothing factor α, which controls the weights of the values.

  • A value of α near 1 gives more weight to the most recent data points.
  • A value of α near 0 spreads the weight over older data points, so the result is smoother.

$$S_t = \alpha x_t + (1 - \alpha) S_{t-1}$$

where α = smoothing parameter, x_t = value at current time t and S_t = smoothed value at time t
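The effect of α is easier to see from the weights it implies. Assuming the usual recursive form of exponential smoothing, the observation k steps in the past receives weight α(1 - α)^k. The helper name `smoothing_weights` is hypothetical.

```python
def smoothing_weights(alpha, n_lags):
    """Weight on the observation k steps back, for k = 0 .. n_lags - 1."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]

high = smoothing_weights(0.9, 3)
low = smoothing_weights(0.1, 3)

print([round(w, 3) for w in high])  # [0.9, 0.09, 0.009]
print([round(w, 3) for w in low])   # [0.1, 0.09, 0.081]
```

With α = 0.9 almost all the weight sits on the latest value, while with α = 0.1 the weights fall off slowly and older values still matter.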

Applications of Moving Average

  • Moving Average can be used to create a smoothed version of the original dataset.
  • It is used across various analytics domains.
  • In streaming analytics, many window functions are based on moving averages.

Python Code

Let’s code and see how these moving averages are applied to time series data. We will also compare the results.

1. Load Dataset

The data is the Electric_Production dataset from Kaggle.

import pandas as pd

data=pd.read_csv("Electric_Production.csv")
data.head()

 

2. Make Date as Index

When working with time series data, the first step is to set the date column as the index.

data["DATE"]=pd.to_datetime(data["DATE"])
data.set_index('DATE',inplace=True)
data.head()

3. Simple Moving Average


data["ma_rolling_3"]=data["IPG2211A2N"].rolling(window=3).mean().shift(1)

data.head()

The moving average is calculated using the past 3 values. Therefore, the first 3 values are NaN, since they do not have 3 past values.
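The role of `.shift(1)` can be checked on a toy series. This sketch uses made-up values and compares the rolling mean with and without the shift.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # made-up values

unshifted = s.rolling(window=3).mean()  # window includes the current value
shifted = unshifted.shift(1)            # window holds only the 3 previous values

print(pd.DataFrame({"value": s, "unshifted": unshifted, "shifted": shifted}))
```

Without the shift only 2 values are NaN; with it there are 3, and the entry at position 3 is the mean of the values at positions 0 to 2.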

data[data.index.year>2007].plot()

The original values and the moving average values are not that close, but the moving average still captures the fluctuations, with a time lag.

4. Weighted Moving Average

Let us calculate the weighted moving average to see how well it captures the fluctuations of the original series. As with the simple moving average, we will use the past 3 values to calculate the weighted moving average.


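import numpy as np

def wma(weights):
    # Build a function that returns the weighted average of one window.
    def calc(x):
        # x holds the window oldest first, so the last weight hits the newest value.
        return (weights*x).sum()/weights.sum()
    return calc

# The weights 0.5, 1 and 1.5 put the most weight on the most recent value.
data["wma_rolling_3"]=data['IPG2211A2N'].rolling(window=3).apply(wma(np.array([0.5,1,1.5]))).shift(1)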

data[data.index.year>2010].plot()

For better clarity, we have plotted the data after 2010. The blue line represents the original data, the orange line the moving average, and the green line the weighted moving average.

The graph suggests that the weighted moving average picks up trends sooner than the simple moving average. The drawback is that it is more complex, since the weights have to be chosen.

5. Exponential Moving Average


data["ema_rolling_3"]=data['IPG2211A2N'].ewm(span=3,adjust=False,min_periods=0).mean().shift(1)

data.head(10)

data[data.index.year>2011].plot()

As with the weighted moving average, the data is plotted after 2011 to better show the fluctuations. The graph shows that the exponential moving average sometimes tracks the original series better than the weighted moving average, and sometimes the reverse is true.

6. Exponential Smoothing Average

data["esa_rolling_3"]=data['IPG2211A2N'].ewm(alpha=0.7,adjust=False,min_periods=0).mean().shift(1)

data[data.index.year>2015].plot()
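Sections 5 and 6 call the same pandas method with different parameters. pandas converts `span` to a smoothing factor as α = 2 / (span + 1), so `span=3` corresponds to α = 0.5, while section 6 uses α = 0.7. A quick check on made-up values:

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, 12.0, 11.0, 15.0, 14.0])  # made-up values

by_span = s.ewm(span=3, adjust=False).mean()
by_alpha = s.ewm(alpha=0.5, adjust=False).mean()         # same as span=3
heavier_recent = s.ewm(alpha=0.7, adjust=False).mean()   # more weight on new data

print(np.allclose(by_span, by_alpha))
print(by_alpha.iloc[1], heavier_recent.iloc[1])
```

The larger α moves further toward each new observation, which is why the two columns in the article differ.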

7. Root Mean Squared Error

We have compared the different moving averages using graphs. Now, we will use the Root Mean Squared Error (RMSE) to compare them numerically.

((data['IPG2211A2N']-data['ma_rolling_3'])**2).mean()**0.5

10.777516363403585

((data['IPG2211A2N']-data['wma_rolling_3'])**2).mean()**0.5

9.723924117312949

((data['IPG2211A2N']-data['ema_rolling_3'])**2).mean()**0.5

8.754703078794426

((data['IPG2211A2N']-data['esa_rolling_3'])**2).mean()**0.5

8.528422169053954

From the above calculation, we can see that the error is lowest for the exponential smoothing average and highest for the simple moving average. This matches what we saw in the graphs.
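The four columns have different numbers of leading NaN values, so each RMSE above is computed over a slightly different set of rows. One possible refinement, sketched here with a hypothetical helper `rmse_table` and a made-up frame, is to compare all columns on the rows where every value is present.

```python
import numpy as np
import pandas as pd

def rmse_table(df, actual_col, forecast_cols):
    """RMSE of each forecast column, using only rows where every column is present."""
    complete = df[[actual_col, *forecast_cols]].dropna()
    errors = complete[forecast_cols].sub(complete[actual_col], axis=0)
    return np.sqrt((errors ** 2).mean())

# Tiny made-up frame standing in for `data`.
demo = pd.DataFrame({
    "actual": [1.0, 2.0, 3.0, 4.0],
    "f1": [np.nan, 1.0, 2.0, 3.0],
    "f2": [np.nan, np.nan, 3.0, 5.0],
})

result = rmse_table(demo, "actual", ["f1", "f2"])
print(result)
```

On the article's data the call would look like `rmse_table(data, "IPG2211A2N", ["ma_rolling_3", "wma_rolling_3", "ema_rolling_3", "esa_rolling_3"])`; the resulting numbers may differ slightly from those above because fewer rows are used.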