Below are a few methods that can be used to treat missing values in time-series data.
1. Last Observation Carried Forward (LOCF)
In this method, the missing value is filled by the previous value in the sequence.
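A minimal sketch of LOCF in plain Python, assuming the series is a list in time order and missing entries are marked with `None`. The function name `locf` is chosen here for illustration only. Note that a gap at the very start has no previous value and therefore stays missing.

```python
from typing import List, Optional


def locf(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing entry (None) with the most recent observed value."""
    filled = []
    last_seen = None  # stays None until the first observation appears
    for v in values:
        if v is not None:
            last_seen = v
        filled.append(last_seen)
    return filled


series = [10.0, None, None, 13.0, None]
print(locf(series))  # [10.0, 10.0, 10.0, 13.0, 13.0]
```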
2. Next Observation Carried Backward (NOCB)
In this method, the missing value is filled by the next value in the sequence.
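NOCB can be sketched the same way by walking the series from the end, again assuming a time-ordered list with `None` for missing entries. The name `nocb` is illustrative. A gap at the very end has no later value and stays missing.

```python
from typing import List, Optional


def nocb(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing entry (None) with the next observed value."""
    filled: List[Optional[float]] = [None] * len(values)
    next_seen = None  # stays None until an observation is met from the right
    for i in range(len(values) - 1, -1, -1):
        if values[i] is not None:
            next_seen = values[i]
        filled[i] = next_seen
    return filled


series = [None, 11.0, None, None, 14.0, None]
print(nocb(series))  # [11.0, 11.0, 14.0, 14.0, 14.0, None]
```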
3. Rolling Statistics
In this method, the missing value is imputed by aggregating the previous non-missing values.
- Simple Moving Average
In this method, the missing values are imputed by taking the average of the last n values. All values are given equal weight.
- Weighted Moving Average
In this method, the missing values are imputed by taking a weighted average of the last n values. The more recent values are given more weight.
- Exponential Weighted Moving Average
Similar to the weighted moving average, it gives more weight to the recent values, with the weights decaying exponentially as the values get older.
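The three rolling statistics can be compared with a small sketch. Several choices here are assumptions made for illustration, not part of the methods themselves: the helper names, linear weights 1..n for the weighted average, a smoothing factor `alpha` of 0.5 for the exponential average, and using only previously observed (not previously imputed) values in the window.

```python
from typing import Callable, List, Optional, Sequence


def simple_average(window: Sequence[float]) -> float:
    """Equal weight for every value in the window."""
    return sum(window) / len(window)


def weighted_average(window: Sequence[float]) -> float:
    """Linear weights 1, 2, ..., n: the most recent value weighs the most."""
    weights = range(1, len(window) + 1)
    return sum(w * x for w, x in zip(weights, window)) / sum(weights)


def exponential_average(window: Sequence[float], alpha: float = 0.5) -> float:
    """Recursive smoothing s = alpha * x + (1 - alpha) * s, oldest to newest."""
    smoothed = window[0]
    for x in window[1:]:
        smoothed = alpha * x + (1 - alpha) * smoothed
    return smoothed


def impute_rolling(
    values: List[Optional[float]],
    n: int,
    aggregate: Callable[[Sequence[float]], float],
) -> List[Optional[float]]:
    """Replace each None with an aggregate of the last n observed values."""
    observed: List[float] = []
    filled: List[Optional[float]] = []
    for v in values:
        if v is None:
            # No earlier observation means nothing to aggregate: leave the gap.
            filled.append(aggregate(observed[-n:]) if observed else None)
        else:
            observed.append(v)
            filled.append(v)
    return filled


series = [1.0, 2.0, 3.0, None]
print(impute_rolling(series, 3, simple_average))       # last value: 2.0
print(impute_rolling(series, 3, weighted_average))     # last value: 14/6
print(impute_rolling(series, 3, exponential_average))  # last value: 2.25
```

With the same window, the weighted and exponential variants land closer to the most recent observation (3.0) than the simple average does.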
4. Interpolation
In this method, missing values are estimated by assuming a relationship within a range of data points. These data points can be past or future known values.
The relationship between data points can be linear. Alternatively, values can be estimated by giving nearby points more weight than far-away points.
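The linear case can be sketched as follows, assuming equally spaced observations in a time-ordered list with `None` for missing entries. The function name is illustrative. Gaps before the first or after the last known value have only one neighbour and are left untouched.

```python
from typing import List, Optional


def linear_interpolate(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps on a straight line between the surrounding known values."""
    filled = list(values)
    prev_idx = None  # index of the most recent known value
    for i, v in enumerate(values):
        if v is None:
            continue
        if prev_idx is not None and i - prev_idx > 1:
            start = values[prev_idx]
            step = (v - start) / (i - prev_idx)  # change per time step
            for j in range(prev_idx + 1, i):
                filled[j] = start + step * (j - prev_idx)
        prev_idx = i
    return filled


series = [1.0, None, None, 4.0, None]
print(linear_interpolate(series))  # [1.0, 2.0, 3.0, 4.0, None]
```

Unlike LOCF and NOCB, this uses both the past and the future neighbour of each gap, so the filled values follow a local trend instead of staying flat.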
Points to Note
- In time series data, the current value typically depends on the previous values. Therefore, regular imputation methods that ignore the order of the observations are usually unsuitable for time series data.
- There may be a seasonality or trend component present in the data. If we impute the data with the global mean or median, we may lose the correct patterns in the data.
- In time series data, the order of the events matters, so we cannot simply drop the records with missing values.