Reading: Ch 3 - Shumway and Stoffer
ARIMA models
Let’s end our discussion by finally putting everything together and adding the “I” term: ARIMA stands for Autoregressive Integrated Moving Average.
When do you use ARIMA models? Up until now, we have discussed fitting models only to stationary data, along with various ways of checking whether our data meet that assumption. However, many real datasets are nonstationary, so we need to transform them before our models will work.
In earlier lectures where we discussed the random walk, $x_t = x_{t-1} + w_t$, we also saw that we can difference the signal, $\nabla x_t = x_t - x_{t-1} = w_t$, and that this is stationary. We could also have a process consisting of a trend (nonstationary) and a zero-mean stationary component, for example:

$$x_t = \mu_t + y_t,$$

where $\mu_t = \beta_0 + \beta_1 t$ and $y_t$ is stationary. If we difference this process, we get

$$\nabla x_t = x_t - x_{t-1} = \beta_1 + y_t - y_{t-1} = \beta_1 + \nabla y_t,$$

which is indeed stationary.
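To make this concrete, here is a minimal sketch in Python (NumPy only; the coefficients, AR(1) component, and seed are made up for illustration, not taken from the lecture) that simulates a linear trend plus a stationary component and checks that the differenced series no longer drifts:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
t = np.arange(n)

# Stationary component y_t: AR(1) with phi = 0.5 (illustrative choice)
w = rng.normal(size=n)
y = np.zeros(n)
for i in range(1, n):
    y[i] = 0.5 * y[i - 1] + w[i]

x = 2.0 + 0.1 * t + y          # trend + stationary component (nonstationary)
dx = np.diff(x)                # nabla x_t = beta_1 + nabla y_t (stationary)

print(x[:100].mean(), x[-100:].mean())    # mean drifts upward with t
print(dx[:100].mean(), dx[-100:].mean())  # mean stays near beta_1 = 0.1
```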
We can also apply this differencing more than once! If we have instead a $k$th-order polynomial trend, $\mu_t = \sum_{j=0}^{k} \beta_j t^j$, then the $k$-times differenced series $\nabla^k x_t$ is stationary. For example, if we have

$$x_t = \mu_t + y_t, \qquad \mu_t = \mu_{t-1} + v_t, \qquad v_t = v_{t-1} + e_t,$$

where $y_t$ is stationary and $e_t$ is white noise. $\mu_t$ is a random walk, but so is $v_t$! Differencing once,

$$\nabla x_t = (\mu_t - \mu_{t-1}) + (y_t - y_{t-1}) = v_t + \nabla y_t,$$

since $\mu_t - \mu_{t-1} = v_t$ from the first recursion.

Because $\nabla y_t$ is stationary, the (non)stationarity of $\nabla x_t$ is governed entirely by $v_t$. Unrolling the recursion $v_t = v_{t-1} + e_t$ from an initial value $v_0$,

$$v_t = v_0 + \sum_{j=1}^{t} e_j.$$

This still depends on $t$, so it is nonstationary (it’s still a random walk). So we can difference twice:

$$\nabla^2 x_t = \nabla v_t + \nabla^2 y_t = e_t + \nabla^2 y_t,$$

and both terms are stationary, so $\nabla^2 x_t$ is stationary. This reflects the fact that $\mu_t$ here is effectively doubly integrated noise ($\nabla^2 \mu_t = e_t$): a stochastic trend built from two nested random walks requires two differences to reduce to stationarity.
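Below is a small simulation sketch of this nested random-walk trend (the noise, seed, and choice of white noise for $y_t$ are my own illustrative assumptions). One difference still leaves a series whose variance grows over time, while two differences stabilize it:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
e = rng.normal(size=n)
v = np.cumsum(e)               # v_t: random walk
mu = np.cumsum(v)              # mu_t: random walk driven by v_t
y = rng.normal(size=n)         # stationary component (white noise here)
x = mu + y

d1 = np.diff(x)                # still nonstationary: contains v_t
d2 = np.diff(x, n=2)           # stationary: e_t + nabla^2 y_t

# Compare variance over the first and second halves:
# it grows for d1 but stays stable for d2
half = n // 2
print(d1[:half].var(), d1[half:].var())
print(d2[:half].var(), d2[half:].var())
```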
Definition: A process $x_t$ is said to be ARIMA($p, d, q$) if

$$\nabla^d x_t = (1 - B)^d x_t$$

is ARMA($p, q$). In general we can also write the model as:

$$\phi(B)(1 - B)^d x_t = \theta(B) w_t.$$

If $E(\nabla^d x_t) = \mu$, we can write the model as

$$\phi(B)(1 - B)^d x_t = \delta + \theta(B) w_t,$$

where $\delta = \mu(1 - \phi_1 - \cdots - \phi_p)$.
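As an illustration, here is one way to fit such a model in Python, assuming the statsmodels package is available; the simulated series and the chosen order (1, 1, 1) are just for demonstration, not part of the lecture:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
n = 500
w = rng.normal(size=n)

# Simulate an ARIMA(1,1,1): an ARMA(1,1) series that is then cumulatively summed
arma = np.zeros(n)
for i in range(1, n):
    arma[i] = 0.6 * arma[i - 1] + w[i] + 0.3 * w[i - 1]
x = np.cumsum(arma)            # the "integrated" series

# The model applies (1 - B)^d internally before fitting the ARMA(p, q) part
fit = ARIMA(x, order=(1, 1, 1)).fit()
print(fit.summary())
```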
So how much differencing should we do? We should be careful! It’s very rare that a differencing order $d > 2$ is necessary. Over-differencing can also add correlation where there was none. For example, if $x_t$ is a random walk, then $\nabla x_t = w_t$ is white noise, but differencing twice leads to $\nabla^2 x_t = w_t - w_{t-1}$, a non-invertible moving average.
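A quick sketch of this over-differencing effect (a hypothetical simulation, assuming statsmodels for the sample ACF): the once-differenced random walk has essentially no autocorrelation, while the twice-differenced series picks up a lag-1 autocorrelation near $-0.5$, the signature of the artificial MA(1).

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(3)
x = np.cumsum(rng.normal(size=5000))   # random walk

acf_d1 = acf(np.diff(x), nlags=2)
acf_d2 = acf(np.diff(x, n=2), nlags=2)
print(acf_d1[1])   # near 0: one difference gives white noise
print(acf_d2[1])   # near -0.5: correlation introduced by over-differencing
```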
Remember also that when we use differencing before forecasting, we are forecasting the differenced series, not our original series, so we will have to integrate the differenced forecasts to get back to the original scale. We actually did this last time ourselves, but here we’ll show a few more examples.
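Here is a rough sketch of that workflow (the series, drift, and model order are made up for illustration): forecast the differenced series, then cumulatively sum the forecasts back onto the last observed value.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
x = np.cumsum(rng.normal(loc=0.2, size=300))   # random walk with drift

dx = np.diff(x)                                # work with the differenced series
fit = ARIMA(dx, order=(0, 0, 0), trend="c").fit()
fc_diff = fit.forecast(steps=10)               # forecasts of the differences

# Integrate: cumulatively sum the forecasted differences onto the last observation
fc_levels = x[-1] + np.cumsum(fc_diff)
print(fc_levels)
```

Alternatively, fitting the model to x directly with a nonzero d (e.g. order=(0, 1, 0)) lets statsmodels handle the differencing and re-integration internally, so the forecasts come back on the original scale.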