Let the Machines Learn

Understanding Time Series Modelling and Forecasting, Part 2

As promised, this is the second post in my two-part blog series on time series modelling and forecasting. In my first blog post I discussed the basics of time series analysis and gave a theoretical overview. In case you missed it, you can find it here – Understanding Time Series Modelling and Forecasting, Part 1

 

Table of Contents

  1. Identifying a Possible Model
  2. Diagnosing a Selected Model
  3. Forecasting With ARIMA Models
  4. Seasonal ARIMA Models

Identifying a Possible Model

I talked about this in detail in my previous blog post. Let's go over it briefly once again. There are three things that need to be considered to make a first guess.

I will not repeat the full discussion here, but I would like to show again a very important table that I often refer to.

|      | AR(p)                | MA(q)                | ARMA(p, q) |
|------|----------------------|----------------------|------------|
| ACF  | Tails off            | Cuts off after lag q | Tails off  |
| PACF | Cuts off after lag p | Tails off            | Tails off  |
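The patterns in the table can be checked on simulated data. The following is a minimal sketch in plain Python with no external libraries. The helper names `sample_acf` and `sample_pacf` are illustrative and not part of any package, and the PACF is computed with the Durbin-Levinson recursion, which is one common choice. For a simulated AR(1) series, the ACF should tail off while the PACF should be close to zero after lag 1.

```python
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations at lags 1..max_lag (Durbin-Levinson)."""
    r = sample_acf(x, max_lag)
    pacf = []
    prev = []  # coefficients of the fitted AR(k-1) model
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Simulate an AR(1) series x_t = 0.7 * x_{t-1} + w_t (coefficient chosen for illustration).
random.seed(0)
x = [0.0]
for _ in range(1999):
    x.append(0.7 * x[-1] + random.gauss(0, 1))

acf = sample_acf(x, 5)    # acf[0] is lag 0
pacf = sample_pacf(x, 5)  # pacf[0] is lag 1

print("ACF  (lags 1-5):", [round(v, 2) for v in acf[1:]])
print("PACF (lags 1-5):", [round(v, 2) for v in pacf])
```

In practice you would read these patterns off the ACF and PACF plots that your statistical software produces; the sketch only shows what those plots are computing.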

Diagnosing a Selected Model

Now that we have decided on a particular model, we need to estimate its coefficients. You usually do not have to worry about this, as R or any other statistical software will do it for you. Most software packages use maximum likelihood estimation to compute the estimates.
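To get a feel for what estimation involves, here is a rough sketch for an AR(2) model. Note that it does not use maximum likelihood: it uses the simpler Yule-Walker (method of moments) equations, which express the two coefficients in terms of the lag-1 and lag-2 sample autocorrelations. The function names and the simulated coefficients (0.5 and 0.3) are assumptions made for illustration.

```python
import random


def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return ck / c0


def yule_walker_ar2(x):
    """Yule-Walker estimates (phi1, phi2) for a zero-mean AR(2) model."""
    r1, r2 = autocorr(x, 1), autocorr(x, 2)
    phi1 = r1 * (1 - r2) / (1 - r1 ** 2)
    phi2 = (r2 - r1 ** 2) / (1 - r1 ** 2)
    return phi1, phi2


# Simulate x_t = 0.5 * x_{t-1} + 0.3 * x_{t-2} + w_t, then try to recover the coefficients.
random.seed(1)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(0.5 * x[-1] + 0.3 * x[-2] + random.gauss(0, 1))

phi1_hat, phi2_hat = yule_walker_ar2(x)
print("estimated phi1, phi2:", round(phi1_hat, 3), round(phi2_hat, 3))
```

The estimates should land near the values used in the simulation. Maximum likelihood estimates from a statistical package will generally differ slightly from these.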

Once you get the coefficients you need to consider a few things –

If any of the checks bothers you, revise your guess of the model selected. You might have to change the order of the ARIMA model.

What if more than one model looks okay?

This happens very often. After performing the above steps, you may find that more than one model seems to work. Here are a few steps you can take –

Forecasting With ARIMA Models

When we forecast a value past the end of the series, the right side of the equation might need values of the series that have not yet been observed. Again, statistical software like R will do this for you, but let us discuss the basic steps involved.

Let us consider the AR(2) model,

$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + w_t$

Suppose that we have observed n data values and wish to forecast $x_{n+1}$ and $x_{n+2}$. Applying the above equation gives

$x_{n+1} = \phi_1 x_n + \phi_2 x_{n-1} + w_{n+1}$

$x_{n+2} = \phi_1 x_{n+1} + \phi_2 x_n + w_{n+2}$

We replace $w_{n+1}$ and $w_{n+2}$ by their expected value of 0 (the assumed mean of the errors). Since $x_{n+1}$ has not been observed, we use its forecasted value to get the forecast of $x_{n+2}$.
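The recursion above can be sketched in a few lines of Python. This is a minimal illustration that assumes a zero-mean AR(2) model whose coefficients are already known. The function name `forecast_ar2` and the numbers in the example are made up for illustration.

```python
def forecast_ar2(series, phi1, phi2, steps):
    """Forecast `steps` values past the end of `series` for a zero-mean AR(2) model.

    Future errors are replaced by their expected value (0), and earlier
    forecasts stand in for values that have not been observed yet.
    """
    if len(series) < 2:
        raise ValueError("an AR(2) forecast needs at least two observations")
    history = list(series[-2:])  # [x_{n-1}, x_n]
    forecasts = []
    for _ in range(steps):
        next_value = phi1 * history[-1] + phi2 * history[-2]
        forecasts.append(next_value)
        history.append(next_value)  # the forecast is reused at the next step
    return forecasts


observed = [0.4, 1.0, 2.0]  # illustrative data; the last two values are x_{n-1} and x_n
print(forecast_ar2(observed, phi1=0.5, phi2=0.3, steps=2))
```

With these numbers the first forecast is 0.5 × 2.0 + 0.3 × 1.0 = 1.3, and the second reuses it: 0.5 × 1.3 + 0.3 × 2.0 = 1.25 (up to floating-point rounding).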

In general, the forecasting procedure is as follows –

Seasonal ARIMA Models

Seasonality in a time series is a regular pattern of changes that repeats over a fixed time period. Let's denote this fixed time period by S. For example, if the sales of a particular product increase every July, then S = 12 (months per year).

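In a seasonal ARIMA model, seasonal AR and MA terms predict $x_t$ using data values and errors at times with lags that are multiples of S (the span of the seasonality). For example, a seasonal AR(2) model with S = 7 is

$x_t = \phi_1 x_{t-7} + \phi_2 x_{t-14} + w_t$

and a seasonal MA(1) model with S = 12 is

$x_t = \theta_{12} w_{t-12} + w_t$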

Seasonal Differencing

Seasonality usually causes the series to be non-stationary because the average values at some particular times within the seasonal span (months, for example) may be different from the average values at other times. For instance, sales of blankets tend to be higher in the winter months.

Seasonal differencing is defined as the difference between a value and the value at a lag that is a multiple of S. With S = 24, a seasonal difference is $x_t - x_{t-24}$.

Seasonal differencing can also be combined with non-seasonal differencing.
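Here is a small sketch of both kinds of differencing on toy data. The series, the period S = 4 and the helper name `difference` are made up for illustration. A purely seasonal series becomes all zeros after one seasonal difference, while a series with a linear trend on top of the seasonal pattern needs a first difference as well.

```python
def difference(x, lag=1):
    """Return x_t - x_{t-lag} for every t where both values exist."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]


S = 4                      # illustrative span of the seasonality
pattern = [10, 20, 15, 5]  # one full season

# Seasonality only: the same pattern repeated three times.
seasonal_only = [pattern[t % S] for t in range(12)]
seasonal_diff = difference(seasonal_only, lag=S)

# Linear trend (slope 2) plus the same seasonal pattern.
trend_and_season = [2 * t + pattern[t % S] for t in range(12)]
after_seasonal = difference(trend_and_season, lag=S)  # constant: slope * S
after_both = difference(after_seasonal, lag=1)        # constant removed

print(seasonal_diff)
print(after_seasonal)
print(after_both)
```

Each difference shortens the series by its lag, so a seasonal difference with a large S costs that many observations at the start of the series.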

The above equations assumed that the non-seasonal orders are zero. A model with non-seasonal as well as seasonal orders is represented as –

$\text{ARIMA}(p, d, q) \times (P, D, Q)_S$

where,

p is the non-seasonal AR order, d is the non-seasonal differencing order, q is the non-seasonal MA order, P is the seasonal AR order, D is the seasonal differencing order, Q is the seasonal MA order and S is the span of the seasonality.

Therefore, an $\text{ARIMA}(1, 0, 1) \times (1, 0, 2)_{12}$ model would have the following equation –

$x_t = \phi_1 x_{t-1} + \phi_{12} x_{t-12} + \theta_1 w_{t-1} + \theta_{12} w_{t-12} + \theta_{24} w_{t-24} + w_t$
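The equation above lists only the main terms. Many textbooks and software packages define the seasonal model multiplicatively, in which case a few cross terms appear as well. As a sketch, write B for the backshift operator ($B x_t = x_{t-1}$) and use capital $\Phi_1$, $\Theta_1$, $\Theta_2$ for the seasonal coefficients. These symbols are my notation here and correspond to $\phi_{12}$, $\theta_{12}$ and $\theta_{24}$ in the equation above.

```latex
\begin{align*}
% Multiplicative form of ARIMA(1, 0, 1) x (1, 0, 2)_{12}
(1 - \phi_1 B)(1 - \Phi_1 B^{12})\, x_t
  &= (1 + \theta_1 B)(1 + \Theta_1 B^{12} + \Theta_2 B^{24})\, w_t \\
% Expanding both products and solving for x_t
x_t &= \phi_1 x_{t-1} + \Phi_1 x_{t-12} - \phi_1 \Phi_1 x_{t-13} \\
    &\quad + w_t + \theta_1 w_{t-1} + \Theta_1 w_{t-12} + \theta_1 \Theta_1 w_{t-13}
       + \Theta_2 w_{t-24} + \theta_1 \Theta_2 w_{t-25}
\end{align*}
```

The extra terms at lags 13 and 25 are products of the non-seasonal and seasonal coefficients, so they introduce no new parameters to estimate.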

Identifying a seasonal model

  1. Examine the time series plot of the data for trend and seasonality. We usually know beforehand whether we have gathered seasonal data (months, weeks, years etc.) or not.
  2. We need to do any necessary differencing –
    • If there is seasonality and no trend, then take a seasonal difference (a difference at lag S). In the ACF, seasonality appears as a slowly tapering pattern at multiples of S.
    • If there is linear trend but no seasonality then apply a first difference. If there is quadratic trend then apply a second order difference.
    • If there is both trend and seasonality, first apply a seasonal difference. If the trend remains, then apply the requisite non-seasonal difference (first order, second order, etc.).
    • If there is no trend and no seasonality then no differencing is needed.
  3. Examine the ACF and PACF plots of the differenced data (if differencing is necessary).
    • Non-seasonal terms – Examine the early lags to guess the non-seasonal terms. Spikes in the ACF (at low lags) indicate possible non-seasonal MA terms. Spikes in the PACF (at low lags) indicate possible non-seasonal AR terms.
    • Seasonal terms – Examine the patterns across lags that are multiples of S. For example, for daily data with a weekly pattern (S = 7), look at lags 7, 14, 21 and so on. The seasonal lags are judged in the same way as we judge the earlier lags.
  4. Use statistical software such as R to estimate the coefficients of the chosen model.
  5. Examine the coefficients following the same diagnosis steps that we use for non-seasonal models. If the diagnosis results are not good, go back to step 3 and revise the model.

Thank You. Hope you found this useful. 🙂