
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
import pandas as pd
from statsmodels.tsa.ar_model import AutoReg

ACF and PACF of ARMA processes

Today we shall first look at the theoretical ACF and PACF functions of ARMA processes. The key observation is this: (a) For MA(q), the ACF becomes zero for h > q. (b) For AR(p), the PACF becomes zero for h > p. (c) For ARMA(p, q) with both p, q >= 1, neither the ACF nor the PACF becomes zero after a finite lag. In this case, determining p and q by looking at the ACF and PACF alone is difficult.
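Observation (a) can be checked in closed form. For MA(1), the only nonzero autocorrelation beyond lag 0 is rho(1) = theta / (1 + theta^2), and rho(h) = 0 for all h > 1. A quick numerical sketch using the same arma_acf function as in the code below:

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_acf

# MA(1) with theta = 0.8: rho(1) = theta / (1 + theta^2), rho(h) = 0 for h > 1
theta = 0.8
rho = arma_acf(ar=[1], ma=[1, theta], lags=5)
print(rho[1])                   # matches theta / (1 + theta**2) ~ 0.4878
print(np.allclose(rho[2:], 0))  # True: the ACF cuts off after lag q = 1
```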

In the code below, we use the statsmodels functions arma_acf and arma_pacf to compute the theoretical ACF and PACF. Different ARMA processes are obtained by changing ma_coeffs and ar_coeffs in the code.

from statsmodels.tsa.arima_process import ArmaProcess, arma_acf, arma_pacf
#MA(1)
th = 0.8
#ma_coeffs = [1]
#ma_coeffs = [1, 0, 0, 0, th]
ma_coeffs = [1, th]
#ma_coeffs = [1, 0.8, 0.6]
ph = 0.8
#ph = -0.8
ar_coeffs = [1]
ar_coeffs = [1, -ph] #these are the coefficients of the AR polynomial (note the sign changes)
#ar_coeffs = [1, -0.5, 0.25]
#ar_coeffs = [1, 0, 0, 0, -ph]
#th2 = 0.8
#ma_coeffs = [1, th, th2]
L = 20
corrs = arma_acf(ar = ar_coeffs, ma = ma_coeffs, lags = L)
par_corrs = arma_pacf(ar = ar_coeffs, ma = ma_coeffs, lags = L)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize = (12, 5))
ax1.stem(range(L), corrs)
ax1.set(title = 'Theoretical ACF', xlabel = 'Lag h', ylabel = 'Autocorrelation')
ax1.axhline(0, lw = 0.5)
ax1.set_xticks(range(L))

ax2.stem(range(L), par_corrs)
ax2.set(title = 'Theoretical PACF', xlabel = 'Lag h', ylabel = 'Partial Autocorrelation')
ax2.axhline(0, lw = 0.5)
ax2.set_xticks(range(L))
plt.tight_layout()
plt.show()
<Figure size 1200x500 with 2 Axes>

ARMA(p, q) models for the GNP dataset

Let us consider the GNP dataset from FRED.

gnp = pd.read_csv("GNP_17Nov2025.csv")
#this is quarterly data
print(gnp.head())

y = gnp['GNP'].to_numpy()
plt.plot(y, color = 'black')
plt.xlabel('Time (in quarters)')
plt.ylabel('GNP in Billions of Dollars')
plt.show()
  observation_date      GNP
0       1947-01-01  244.142
1       1947-04-01  247.063
2       1947-07-01  250.716
3       1947-10-01  260.981
4       1948-01-01  267.133
<Figure size 640x480 with 1 Axes>

Instead of working with raw data, we work with logarithms.

ylog = np.log(y)
plt.plot(ylog, color = 'black')
plt.title('Log of GNP over time')
plt.xlabel('Time (in quarters)')
plt.ylabel('Log of GNP')
plt.show()
<Figure size 640x480 with 1 Axes>

It does not make sense to fit stationary ARMA(p, q) models directly to the above dataset. So we take differences.

#We difference the log-data:
ylogdiff = np.diff(ylog)
plt.plot(ylogdiff, color = 'black')
plt.title('Percent change in GNP over time')
plt.xlabel('Time (in quarters)')
plt.ylabel('Percent change in GNP')
plt.show()
<Figure size 640x480 with 1 Axes>

ARMA(p, q) model fitting, AIC and BIC

We now fit an ARMA(p, q) model to this differenced dataset. Let us see how to fit an ARMA(1, 1) model.

arma11 = ARIMA(ylogdiff, order = (1, 0, 1)).fit()
print(arma11.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  313
Model:                 ARIMA(1, 0, 1)   Log Likelihood                 932.499
Date:                Tue, 18 Nov 2025   AIC                          -1856.999
Time:                        16:22:26   BIC                          -1842.014
Sample:                             0   HQIC                         -1851.010
                                - 313                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0154      0.001     10.429      0.000       0.013       0.018
ar.L1          0.6911      0.139      4.955      0.000       0.418       0.964
ma.L1         -0.4642      0.152     -3.054      0.002      -0.762      -0.166
sigma2         0.0002   3.97e-06     38.092      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   0.27   Jarque-Bera (JB):              7596.39
Prob(Q):                              0.60   Prob(JB):                         0.00
Heteroskedasticity (H):               1.55   Skew:                            -0.01
Prob(H) (two-sided):                  0.03   Kurtosis:                        27.13
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

The summary above gives the maximized log-likelihood (called Log Likelihood) and also the AIC and BIC. The maximized log-likelihood is the value of the log-likelihood at the MLEs. The AIC and BIC are given by:

AIC for a model = (-2) × (maximized log-likelihood) + 2 × (number of parameters)

BIC for a model = (-2) × (maximized log-likelihood) + (log n) × (number of parameters), where n denotes the sample size.

print(arma11.llf) #maximized log-likelihood
print(arma11.aic, (-2)*arma11.llf + 2*4) #AIC (note the number of parameters here equals 4)
print(arma11.bic, (-2)*arma11.llf + np.log(len(ylogdiff))*4) #BIC
932.4992964756076
-1856.9985929512152 -1856.9985929512152
-1842.0137801890546 -1842.0137801890546

Selecting the best ARMA(p, q) model by automatic model selection

We will go over all ARMA(p, q) models with p <= 5 and q <= 5 and find the best fitting ARMA(p, q) model using AIC and BIC. In the code below, if parameter estimation fails for some p and q, the value NaN is kept for the AIC and BIC of that model.

import warnings
warnings.filterwarnings("ignore") #this suppresses convergence warnings. 

dt = ylogdiff
pmax = 5
qmax = 5

aicmat = np.full((pmax + 1, qmax + 1), np.nan)
bicmat = np.full((pmax + 1, qmax + 1), np.nan)

for i in range(pmax + 1):
    for j in range(qmax + 1):
        try:
            model = ARIMA(dt, order=(i, 0, j)).fit()
            aicmat[i, j] = model.aic
            bicmat[i, j] = model.bic
        except Exception as e:
            # Some models may not converge; skip them
            print(f"ARIMA({i},0,{j}) failed: {e}")
            continue

aic_df = pd.DataFrame(aicmat, index=[f'AR({i})' for i in range(pmax+1)],
                               columns=[f'MA({j})' for j in range(qmax+1)])
bic_df = pd.DataFrame(bicmat, index=[f'AR({i})' for i in range(pmax+1)],
                               columns=[f'MA({j})' for j in range(qmax+1)])

# Best AR model (MA = 0)
best_ar_aic = np.nanargmin(aicmat[:, 0])
best_ar_bic = np.nanargmin(bicmat[:, 0])

# Best MA model (AR = 0)
best_ma_aic = np.nanargmin(aicmat[0, :])
best_ma_bic = np.nanargmin(bicmat[0, :])

# Best ARMA model overall
best_arma_aic = np.unravel_index(np.nanargmin(aicmat), aicmat.shape)
best_arma_bic = np.unravel_index(np.nanargmin(bicmat), bicmat.shape)

# Print results
print(f"Best AR model by AIC: AR({best_ar_aic})")
print(f"Best AR model by BIC: AR({best_ar_bic})")

print(f"Best MA model by AIC: MA({best_ma_aic})")
print(f"Best MA model by BIC: MA({best_ma_bic})")

print(f"Best ARMA model by AIC: ARMA{best_arma_aic}")
print(f"Best ARMA model by BIC: ARMA{best_arma_bic}")
Best AR model by AIC: AR(2)
Best AR model by BIC: AR(2)
Best MA model by AIC: MA(3)
Best MA model by BIC: MA(2)
Best ARMA model by AIC: ARMA(2, 0)
Best ARMA model by BIC: ARMA(2, 0)

It seems that no errors were thrown for any of the ARMA models. There are, however, a lot of convergence warnings; to see them, go back and remove the warnings-suppression line.
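A scoped alternative to the global warnings.filterwarnings("ignore") used above is the warnings.catch_warnings context manager, which suppresses warnings only inside a with-block and leaves later cells unaffected. A minimal sketch (fit_with_warning is a hypothetical stand-in for an ARIMA fit that emits a convergence warning):

```python
import warnings

def fit_with_warning():
    # stand-in for a model fit that emits a convergence warning
    warnings.warn("Maximum Likelihood optimization failed to converge", UserWarning)

# Suppressed only inside this block:
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    fit_with_warning()  # silent

# Still visible (here: recorded) outside the suppressed block:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fit_with_warning()
print(len(caught))  # 1
```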

The best overall model here is AR(2).

h_max = 50
fig, axes = plt.subplots(nrows = 2, ncols = 1, figsize = (12, 6))
plot_acf(ylogdiff, lags = h_max, ax = axes[0])
axes[0].set_title("Sample ACF")
plot_pacf(ylogdiff, lags = h_max, ax = axes[1])
axes[1].set_title("Sample PACF")
plt.tight_layout()
plt.show()
<Figure size 1200x600 with 2 Axes>

AR(2) seems like a good model overall.

ar2 = ARIMA(ylogdiff, order = (2, 0, 0)).fit()
print(ar2.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  313
Model:                 ARIMA(2, 0, 0)   Log Likelihood                 935.005
Date:                Tue, 18 Nov 2025   AIC                          -1862.010
Time:                        16:27:21   BIC                          -1847.025
Sample:                             0   HQIC                         -1856.022
                                - 313                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0154      0.001     10.725      0.000       0.013       0.018
ar.L1          0.1971      0.025      7.856      0.000       0.148       0.246
ar.L2          0.2077      0.066      3.140      0.002       0.078       0.337
sigma2         0.0001   3.86e-06     38.565      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):              7833.57
Prob(Q):                              0.97   Prob(JB):                         0.00
Heteroskedasticity (H):               1.61   Skew:                            -0.09
Prob(H) (two-sided):                  0.02   Kurtosis:                        27.51
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

Below we obtain predictions for this AR(2) model. These are predictions for log(y_t) - log(y_{t-1}). We can convert them to predictions for log(y_t) as in the code below.

n = len(y)
k = 100
fcast_diff_1 = ar2.get_prediction(start = n-1, end = n+k-2).predicted_mean #these are the forecasts for the differenced data
last_observed_ylog = ylog[-1]
fcast_1 = np.zeros(k)
fcast_1[0] = last_observed_ylog + fcast_diff_1[0]
for i in range(1, k):
    fcast_1[i] = fcast_1[i-1] + fcast_diff_1[i]
tme = range(1, n+1)
tme_future = range(n+1, n+k+1)
plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_1, label = 'Forecast (AR(2) on ylogdiff)', color = 'red')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>
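The forecast loop above undoes a first difference: if d_t = x_t - x_{t-1}, then x_{n+i} = x_n + (d_{n+1} + ... + d_{n+i}), which is a cumulative sum. A sketch on toy numbers (an illustrative last log-value and three made-up forecasted differences) showing the loop and np.cumsum give the same answer:

```python
import numpy as np

last_observed = 5.0                        # illustrative last log-value
fcast_diff = np.array([0.1, 0.2, -0.05])   # illustrative forecasted differences

# loop version, as in the code above
fcast_loop = np.zeros(len(fcast_diff))
fcast_loop[0] = last_observed + fcast_diff[0]
for i in range(1, len(fcast_diff)):
    fcast_loop[i] = fcast_loop[i - 1] + fcast_diff[i]

# one-line equivalent: cumulative sum of the differences
fcast_cumsum = last_observed + np.cumsum(fcast_diff)
print(np.allclose(fcast_loop, fcast_cumsum))  # True
```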

ARIMA Models

Instead of fitting AR(2) to the differences of the logs, we can fit ARIMA with p = 2, d = 1, q = 0 on the logarithms.

ar2_nodiff = ARIMA(ylog, order = (2, 1, 0)).fit()
print(ar2_nodiff.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  314
Model:                 ARIMA(2, 1, 0)   Log Likelihood                 909.630
Date:                Tue, 18 Nov 2025   AIC                          -1813.260
Time:                        16:38:29   BIC                          -1802.022
Sample:                             0   HQIC                         -1808.769
                                - 314                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          0.4048      0.024     16.847      0.000       0.358       0.452
ar.L2          0.4130      0.045      9.193      0.000       0.325       0.501
sigma2         0.0002   5.51e-06     31.677      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   3.52   Jarque-Bera (JB):             13524.06
Prob(Q):                              0.06   Prob(JB):                         0.00
Heteroskedasticity (H):               1.71   Skew:                             1.58
Prob(H) (two-sided):                  0.01   Kurtosis:                        35.05
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
fcast_2 = ar2_nodiff.get_prediction(start = n, end = n+k-1).predicted_mean

plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_1, label = 'Forecast (AR(2) on ylogdiff)', color = 'red')
plt.plot(tme_future, fcast_2, label = 'Forecast (ARIMA(2, 1, 0) on ylog)', color = 'darkgreen')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

The reason the predictions are different is that ARIMA does not fit a constant term by default when d >= 1. To get the constant term with d = 1, one needs to use the trend = 't' option.

ar2_nodiff_withintercept = ARIMA(ylog, order = (2, 1, 0), trend = 't').fit() #note the trend = 't'
print(ar2_nodiff_withintercept.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  314
Model:                 ARIMA(2, 1, 0)   Log Likelihood                 935.005
Date:                Tue, 18 Nov 2025   AIC                          -1862.010
Time:                        16:42:17   BIC                          -1847.025
Sample:                             0   HQIC                         -1856.022
                                - 314                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0154      0.001     10.725      0.000       0.013       0.018
ar.L1          0.1971      0.025      7.856      0.000       0.148       0.246
ar.L2          0.2077      0.066      3.140      0.002       0.078       0.337
sigma2         0.0001   3.86e-06     38.565      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):              7833.57
Prob(Q):                              0.97   Prob(JB):                         0.00
Heteroskedasticity (H):               1.61   Skew:                            -0.09
Prob(H) (two-sided):                  0.02   Kurtosis:                        27.51
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
print(ar2.summary())
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  313
Model:                 ARIMA(2, 0, 0)   Log Likelihood                 935.005
Date:                Tue, 18 Nov 2025   AIC                          -1862.010
Time:                        16:41:07   BIC                          -1847.025
Sample:                             0   HQIC                         -1856.022
                                - 313                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0154      0.001     10.725      0.000       0.013       0.018
ar.L1          0.1971      0.025      7.856      0.000       0.148       0.246
ar.L2          0.2077      0.066      3.140      0.002       0.078       0.337
sigma2         0.0001   3.86e-06     38.565      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):              7833.57
Prob(Q):                              0.97   Prob(JB):                         0.00
Heteroskedasticity (H):               1.61   Skew:                            -0.09
Prob(H) (two-sided):                  0.02   Kurtosis:                        27.51
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).

Note that the summaries for AR(2) fitted to ylogdiff and ARIMA(2, 1, 0) with trend = 't' fitted to ylog are almost fully identical (one difference is the number of observations, which is one smaller for AR(2)).

fcast_3 = ar2_nodiff_withintercept.get_prediction(start = n, end = n+k-1).predicted_mean

plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_1, label = 'Forecast (AR(2) on ylogdiff)', color = 'red')
plt.plot(tme_future, fcast_2, label = 'Forecast (ARIMA(2, 1, 0) on ylog)', color = 'darkgreen')
plt.plot(tme_future, fcast_3, label = 'Forecast (ARIMA(2, 1, 0) with trend = t on ylog)', color = 'black')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

The predictions of ARIMA(2, 1, 0) with trend = 't' coincide with the predictions for log y_t obtained via AR(2) on log(y_t / y_{t-1}).

Selecting the best ARIMA(p, d, q) model by AIC or BIC

Below we go over all possible values of p <= 5, d <= 2 and q <= 5 and calculate the AIC and BIC of each ARIMA model. We use the default ARIMA (no modification to include intercepts when d >= 1).

import warnings
warnings.filterwarnings("ignore") #this suppresses convergence warnings. 

dt = ylog
pmax, dmax, qmax = 5, 2, 5

results = []

for p in range(pmax + 1):
    for d in range(dmax + 1):
        for q in range(qmax + 1):

            # ARIMA's default trend is 'c' when d = 0 and 'n' when d >= 1;
            # record it so the results table shows which trend was used
            trend = 'n' if d >= 1 else 'c'

            try:
                model = ARIMA(dt, order=(p, d, q)).fit()

                results.append({
                    'p': p, 'd': d, 'q': q,
                    'trend': trend,
                    'AIC': model.aic,
                    'BIC': model.bic
                })
            except Exception as e:
                print(f"ARIMA({p},{d},{q}, trend={trend}) failed: {e}")
                continue

# Convert to DataFrame
results_df = pd.DataFrame(results)

# Find best models
best_aic_model = results_df.loc[results_df['AIC'].idxmin()]
best_bic_model = results_df.loc[results_df['BIC'].idxmin()]

print("Best model by AIC:")
print(best_aic_model)

print("\nBest model by BIC:")
print(best_bic_model)
Best model by AIC:
p                  1
d                  1
q                  3
trend              n
AIC     -1860.201397
BIC     -1841.470381
Name: 27, dtype: object

Best model by BIC:
p                  0
d                  2
q                  3
trend              n
AIC     -1856.667535
BIC     -1841.695523
Name: 15, dtype: object

The best ARIMA(p, d, q) model (searched over p <= 5, d <= 2, q <= 5) is ARIMA(1, 1, 3) (by AIC) and ARIMA(0, 2, 3) (by BIC). Note that we used the default ARIMA with no modification for including intercepts in the models with d >= 1.

arima113 = ARIMA(ylog, order = (1, 1, 3)).fit()
fcast_4 = arima113.get_prediction(start = n, end = n+k-1).predicted_mean

arima023 = ARIMA(ylog, order = (0, 2, 3)).fit()
fcast_5 = arima023.get_prediction(start = n, end = n+k-1).predicted_mean

plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_2, label = 'Forecast (ARIMA(2, 1, 0) on ylog)', color = 'darkgreen')
plt.plot(tme_future, fcast_3, label = 'Forecast (ARIMA(2, 1, 0) with trend = t on ylog)', color = 'black')
plt.plot(tme_future, fcast_4, label = 'Forecast (ARIMA(1, 1, 3) on ylog)', color = 'red')
plt.plot(tme_future, fcast_5, label = 'Forecast (ARIMA(0, 2, 3) on ylog)', color = 'darkblue')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

Here are the predictions on the original data.

plt.plot(tme, y, label = 'Data')
plt.plot(tme_future, np.exp(fcast_3), label = 'Forecast (ARIMA(2, 1, 0) with trend = t on ylog)', color = 'black')
plt.plot(tme_future, np.exp(fcast_4), label = 'Forecast (ARIMA(1, 1, 3) on ylog)', color = 'red')
plt.plot(tme_future, np.exp(fcast_5), label = 'Forecast (ARIMA(0, 2, 3) on ylog)', color = 'darkblue')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

We can compare the AIC and BIC of the best models above (ARIMA(1, 1, 3) and ARIMA(0, 2, 3)) with those of ARIMA(2, 1, 0) with trend = 't'.

print(ar2_nodiff_withintercept.aic, ar2_nodiff_withintercept.bic)
print(arima113.aic, arima113.bic)
print(arima023.aic, arima023.bic)
-1862.0098207563674 -1847.0250079942068
-1860.2013971103497 -1841.470381157649
-1856.6675352600519 -1841.6955225088138

This suggests that ARIMA(2, 1, 0) with trend = 't' (equivalently, AR(2) with intercept on the differences) is better than the ARIMA(1, 1, 3) and ARIMA(0, 2, 3) models without intercepts.

Note that the predictions look different, but this is because we are looking at long-range forecasts (k = 100 corresponds to 25 future years!). If we look at short-range forecasts (such as k = 12), the predictions will be quite close to each other.
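One way to see why short horizons agree: for a stationary AR(1) fitted to the growth rates, the h-step forecast reverts geometrically to the model's mean, xhat(h) = mu + phi^h (x_n - mu). Early forecasts are pinned to the last observations, which all the models share; long-range level forecasts instead compound each model's long-run behavior, so they drift apart. A sketch with illustrative numbers (phi, mu, x_n below are made up, not estimates from the data):

```python
import numpy as np

# h-step forecast of a stationary AR(1): reverts geometrically to the mean mu
phi, mu, x_n = 0.7, 0.015, 0.03   # illustrative values only
h = np.arange(1, 6)
fcast = mu + phi**h * (x_n - mu)
print(fcast)  # decays from 0.0255 toward mu = 0.015
```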

plt.plot(tme, y, label = 'Data')
#below we only plot the first 12 predictions
k_short = 12
plt.plot(tme_future[0:k_short], np.exp(fcast_3)[0:k_short], label = 'Forecast (ARIMA(2, 1, 0) with trend = t on ylog)', color = 'black')
plt.plot(tme_future[0:k_short], np.exp(fcast_4)[0:k_short], label = 'Forecast (ARIMA(1, 1, 3) on ylog)', color = 'red')
plt.plot(tme_future[0:k_short], np.exp(fcast_5)[0:k_short], label = 'Forecast (ARIMA(0, 2, 3) on ylog)', color = 'darkblue')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

TTLCONS Dataset

ttlcons = pd.read_csv('TTLCONS_13Nov2025.csv')
print(ttlcons.tail())
y = ttlcons['TTLCONS'].to_numpy()
print(len(y))
plt.plot(y)
plt.title('Total Construction Spending')
plt.xlabel('Time (months)')
plt.show()
    observation_date  TTLCONS
386       2025-03-01  2150847
387       2025-04-01  2153440
388       2025-05-01  2149124
389       2025-06-01  2140546
390       2025-07-01  2139110
391
<Figure size 640x480 with 1 Axes>

We fit models to the logarithm of the data.

ylog = np.log(y)
plt.plot(ylog, color = 'black')
plt.title('Log of Total Construction Spending')
plt.xlabel('Time (months)')
plt.show()  
<Figure size 640x480 with 1 Axes>

We can obtain candidate models by differencing the data and looking at the ACF and PACF.

ylogdiff = np.diff(ylog)
plt.plot(ylogdiff, color = 'black')
plt.title('Percent change in Total Construction Spending')
plt.xlabel('Time (months)')
plt.show()
<Figure size 640x480 with 1 Axes>
h_max = 50
fig, axes = plt.subplots(nrows = 2, ncols = 1, figsize = (12, 6))
plot_acf(ylogdiff, lags = h_max, ax = axes[0])
axes[0].set_title("Sample ACF")
plot_pacf(ylogdiff, lags = h_max, ax = axes[1])
axes[1].set_title("Sample PACF")
plt.tight_layout()
plt.show()
<Figure size 1200x600 with 2 Axes>

From the above plots, the sample PACF appears to cut off after lag 3, so probably the best model for ylogdiff is AR(3).

Let us go over all possible p and q and find the best model for ylogdiff.

import warnings
warnings.filterwarnings("ignore") #this suppresses convergence warnings. 

dt = ylogdiff #arma fitting can be unstable. Change dt = ylogdiff to dt = 100*ylogdiff and check if results change
pmax = 5
qmax = 5

aicmat = np.full((pmax + 1, qmax + 1), np.nan)
bicmat = np.full((pmax + 1, qmax + 1), np.nan)

for i in range(pmax + 1):
    for j in range(qmax + 1):
        try:
            model = ARIMA(dt, order=(i, 0, j)).fit()
            aicmat[i, j] = model.aic
            bicmat[i, j] = model.bic
        except Exception as e:
            # Some models may not converge; skip them
            print(f"ARIMA({i},0,{j}) failed: {e}")
            continue

aic_df = pd.DataFrame(aicmat, index=[f'AR({i})' for i in range(pmax+1)],
                               columns=[f'MA({j})' for j in range(qmax+1)])
bic_df = pd.DataFrame(bicmat, index=[f'AR({i})' for i in range(pmax+1)],
                               columns=[f'MA({j})' for j in range(qmax+1)])

# Best AR model (MA = 0)
best_ar_aic = np.nanargmin(aicmat[:, 0])
best_ar_bic = np.nanargmin(bicmat[:, 0])

# Best MA model (AR = 0)
best_ma_aic = np.nanargmin(aicmat[0, :])
best_ma_bic = np.nanargmin(bicmat[0, :])

# Best ARMA model overall
best_arma_aic = np.unravel_index(np.nanargmin(aicmat), aicmat.shape)
best_arma_bic = np.unravel_index(np.nanargmin(bicmat), bicmat.shape)

# Print results
print(f"Best AR model by AIC: AR({best_ar_aic})")
print(f"Best AR model by BIC: AR({best_ar_bic})")

print(f"Best MA model by AIC: MA({best_ma_aic})")
print(f"Best MA model by BIC: MA({best_ma_bic})")

print(f"Best ARMA model by AIC: ARMA{best_arma_aic}")
print(f"Best ARMA model by BIC: ARMA{best_arma_bic}")
Best AR model by AIC: AR(5)
Best AR model by BIC: AR(3)
Best MA model by AIC: MA(5)
Best MA model by BIC: MA(3)
Best ARMA model by AIC: ARMA(1, 1)
Best ARMA model by BIC: ARMA(1, 1)

Best AR model: AR(3) (using BIC) and AR(5) (using AIC)

Best MA model: MA(3) (using BIC) and MA(5) (using AIC)

Best overall ARMA model: ARMA(1, 1)

Instead of fitting the best model to ylogdiff, we can find best models for ylog (in this case, we shall use ARIMA(p, d, q) and vary p, d, q).

import warnings
warnings.filterwarnings("ignore") #this suppresses convergence warnings. 


dt = ylog
pmax, dmax, qmax = 5, 2, 5

results = []

for p in range(pmax + 1):
    for d in range(dmax + 1):
        for q in range(qmax + 1):

            # Use linear trend only when d = 1
            #trend = 't' if d == 1 else None
            #trend = None

            try:
                model = ARIMA(dt, order=(p, d, q)).fit()

                results.append({
                    'p': p, 'd': d, 'q': q,
                    'AIC': model.aic,
                    'BIC': model.bic
                })
            except Exception as e:
                print(f"ARIMA({p},{d},{q}) failed: {e}")
                continue

# Convert to DataFrame
results_df = pd.DataFrame(results)

# Find best models
best_aic_model = results_df.loc[results_df['AIC'].idxmin()]
best_bic_model = results_df.loc[results_df['BIC'].idxmin()]

print("Best model by AIC:")
print(best_aic_model)

print("\nBest model by BIC:")
print(best_bic_model)
Best model by AIC:
p         1.000000
d         1.000000
q         1.000000
AIC   -2405.521236
BIC   -2393.622796
Name: 25, dtype: float64

Best model by BIC:
p         1.000000
d         1.000000
q         1.000000
AIC   -2405.521236
BIC   -2393.622796
Name: 25, dtype: float64

Best model is ARIMA(1, 1, 1). Let us use it for prediction.

n = len(ylog)
k = 100
tme = range(1, n+1)
tme_future = range(n+1, n+k+1)
#ARIMA(1, 1, 1) model for ylog:
arima111 = ARIMA(ylog, order = (1, 1, 1)).fit()
print(arima111.summary())
fcast_arima111 = arima111.get_prediction(start = n, end = n+k-1).predicted_mean
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  391
Model:                 ARIMA(1, 1, 1)   Log Likelihood                1205.761
Date:                Tue, 18 Nov 2025   AIC                          -2405.521
Time:                        20:58:32   BIC                          -2393.623
Sample:                             0   HQIC                         -2400.805
                                - 391                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ar.L1          0.9682      0.018     54.374      0.000       0.933       1.003
ma.L1         -0.8394      0.040    -21.130      0.000      -0.917      -0.762
sigma2         0.0001   6.69e-06     17.962      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   0.92   Jarque-Bera (JB):                49.59
Prob(Q):                              0.34   Prob(JB):                         0.00
Heteroskedasticity (H):               0.54   Skew:                            -0.21
Prob(H) (two-sided):                  0.00   Kurtosis:                         4.69
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_arima111, label = 'Forecast (ARIMA(1, 1, 1) on ylog)', color = 'darkgreen')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

If we include the intercept term in ARIMA(1, 1, 1), then the predictions will be different.

arima111_t = ARIMA(ylog, order = (1, 1, 1), trend = 't').fit() #trend = 't' includes the intercept term
fcast_arima111_t = arima111_t.get_prediction(start = n, end = n+k-1).predicted_mean
plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_arima111, label = 'Forecast (ARIMA(1, 1, 1) on ylog)', color = 'darkgreen')
plt.plot(tme_future, fcast_arima111_t, label = 'Forecast (ARIMA(1, 1, 1) on ylog with intercept)', color = 'red')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
<Figure size 640x480 with 1 Axes>

Which of these predictions would we prefer? We can compare the AIC and BIC of the model with and without the intercept.

print(arima111.aic, arima111.bic)
print(arima111_t.aic, arima111_t.bic)
-2405.524371156208 -2393.625930938837
-2406.8821088764494 -2391.0175219199546

It is interesting that in terms of the AIC, the model with intercept is better, while in terms of BIC, the model without intercept is better. The predictions for the model with no intercept look more natural.
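Such disagreements come down to the per-parameter penalty: AIC charges 2 per parameter while BIC charges log(n). With n = 391 observations here, log(391) is roughly 5.97, so BIC penalizes the extra intercept about three times as heavily as AIC, which is how the two criteria can land on opposite sides of a borderline term. A one-line check:

```python
import numpy as np

# per-parameter penalties: 2 for AIC, log(n) for BIC, with n = 391 here
n = 391
print(2, np.log(n))  # BIC's penalty ~ 5.97 is about 3x AIC's
```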

Other Models for TTLCONS

Below we consider the double differenced dataset.

ylogdiff2 = np.diff(np.diff(ylog))
plt.plot(ylogdiff2)
plt.title("Double Differenced Data")
plt.show()

Let us attempt to find the best ARMA(p, q) model for this data.

dt = 100*ylogdiff2 #notice the multiplication by 100 (without it, the very small variance of ylogdiff2 makes the optimizer unreliable)
pmax = 5
qmax = 5

aicmat = np.full((pmax + 1, qmax + 1), np.nan)
bicmat = np.full((pmax + 1, qmax + 1), np.nan)

for i in range(pmax + 1):
    for j in range(qmax + 1):
        try:
            model = ARIMA(dt, order=(i, 0, j)).fit()
            aicmat[i, j] = model.aic
            bicmat[i, j] = model.bic
        except Exception as e:
            # Some models may not converge; skip them
            print(f"ARIMA({i},0,{j}) failed: {e}")
            continue

aic_df = pd.DataFrame(aicmat, index=[f'AR({i})' for i in range(pmax+1)],
                               columns=[f'MA({j})' for j in range(qmax+1)])
bic_df = pd.DataFrame(bicmat, index=[f'AR({i})' for i in range(pmax+1)],
                               columns=[f'MA({j})' for j in range(qmax+1)])

# Best AR model (MA = 0)
best_ar_aic = np.nanargmin(aicmat[:, 0])
best_ar_bic = np.nanargmin(bicmat[:, 0])

# Best MA model (AR = 0)
best_ma_aic = np.nanargmin(aicmat[0, :])
best_ma_bic = np.nanargmin(bicmat[0, :])

# Best ARMA model overall
best_arma_aic = np.unravel_index(np.nanargmin(aicmat), aicmat.shape)
best_arma_bic = np.unravel_index(np.nanargmin(bicmat), bicmat.shape)

# Print results
print(f"Best AR model by AIC: AR({best_ar_aic})")
print(f"Best AR model by BIC: AR({best_ar_bic})")

print(f"Best MA model by AIC: MA({best_ma_aic})")
print(f"Best MA model by BIC: MA({best_ma_bic})")

print(f"Best ARMA model by AIC: ARMA{best_arma_aic}")
print(f"Best ARMA model by BIC: ARMA{best_arma_bic}")
Best AR model by AIC: AR(5)
Best AR model by BIC: AR(5)
Best MA model by AIC: MA(4)
Best MA model by BIC: MA(1)
Best ARMA model by AIC: ARMA(2, 4)
Best ARMA model by BIC: ARMA(0, 1)

According to BIC, the best model is MA(1). This is also the model suggested by the sample ACF and PACF plotted below.

h_max = 50
fig, axes = plt.subplots(nrows = 2, ncols = 1)
plot_acf(ylogdiff2, lags = h_max, ax = axes[0])
axes[0].set_title("Sample ACF")
plot_pacf(ylogdiff2, lags = h_max, ax = axes[1])
axes[1].set_title("Sample PACF")
plt.tight_layout()
plt.show()

The sample ACF and PACF suggest an MA(1) model for the double differenced data, which corresponds to the ARIMA(0, 2, 1) model for ylog.

arima021 = ARIMA(ylog, order = (0, 2, 1)).fit()
print(arima021.summary())
fcast_arima021 = arima021.get_prediction(start = n, end = n+k-1).predicted_mean
                               SARIMAX Results                                
==============================================================================
Dep. Variable:                      y   No. Observations:                  391
Model:                 ARIMA(0, 2, 1)   Log Likelihood                1199.732
Date:                Tue, 18 Nov 2025   AIC                          -2395.464
Time:                        21:03:58   BIC                          -2387.537
Sample:                             0   HQIC                         -2392.321
                                - 391                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ma.L1         -0.8732      0.026    -33.741      0.000      -0.924      -0.823
sigma2         0.0001   6.74e-06     18.122      0.000       0.000       0.000
===================================================================================
Ljung-Box (L1) (Q):                   1.69   Jarque-Bera (JB):                47.79
Prob(Q):                              0.19   Prob(JB):                         0.00
Heteroskedasticity (H):               0.55   Skew:                            -0.20
Prob(H) (two-sided):                  0.00   Kurtosis:                         4.67
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_arima111, label = 'Forecast (ARIMA(1, 1, 1) on ylog)', color = 'darkgreen')
plt.plot(tme_future, fcast_arima111_t, label = 'Forecast (ARIMA(1, 1, 1) on ylog with intercept)', color = 'red')
plt.plot(tme_future, fcast_arima021, label = 'Forecast (ARIMA(0, 2, 1) on ylog)', color = 'black')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()

The model ARIMA(0, 2, 1) does not include any intercept term. We can also fit MA(1) to ylogdiff2, and then convert the predictions for the double differenced data to ylog.

#predictions for ylogdiff2: 
ma1_diff2 = ARIMA(ylogdiff2, order = (0, 0, 1)).fit()
fcast_diff2 = ma1_diff2.get_prediction(start = n-2, end = n+k-3).predicted_mean #predictions for the double differenced data (no rescaling is needed since we fit ylogdiff2 itself, not dt)
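The start and end indices deserve a sanity check: ylog has n observations, so ylogdiff2 has n - 2, occupying in-sample indices 0 through n - 3; the k out-of-sample steps therefore sit at indices n - 2 through n + k - 3. In miniature, with a toy series standing in for ylog:

```python
import numpy as np

y = np.arange(10.0)            # toy series standing in for ylog, so n = 10
n, k = len(y), 4
d2 = np.diff(np.diff(y))       # double difference has length n - 2

print(len(d2))                              # in-sample indices run 0 .. n-3
print(list(range(n - 2, (n + k - 3) + 1)))  # the k out-of-sample indices
```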

The code below converts predictions for diff(diff(ylog)) into predictions for ylog.

tmp1 = ylog[-1] - ylog[-2] #last observed value of diff(ylog)
fcast_d1 = np.zeros(k) #predictions for diff(ylog), obtained by undoing one difference
fcast_d1[0] = tmp1 + fcast_diff2[0]
for i in range(1, k):
    fcast_d1[i] = fcast_d1[i-1] + fcast_diff2[i]
tmp2 = ylog[-1] #last observed value of ylog
fcast_ma1_diff2 = np.zeros(k) #predictions for ylog, obtained by undoing the second difference
fcast_ma1_diff2[0] = tmp2 + fcast_d1[0]
for i in range(1, k):
    fcast_ma1_diff2[i] = fcast_ma1_diff2[i-1] + fcast_d1[i]
plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future, fcast_arima111, label = 'Forecast (ARIMA(1, 1, 1) on ylog)', color = 'darkgreen')
plt.plot(tme_future, fcast_arima111_t, label = 'Forecast (ARIMA(1, 1, 1) on ylog with intercept)', color = 'red')
plt.plot(tme_future, fcast_arima021, label = 'Forecast (ARIMA(0, 2, 1) on ylog)', color = 'black')
plt.plot(tme_future, fcast_ma1_diff2, label = 'Forecast (ARIMA(0, 2, 1) on ylog with intercept)', color = 'darkblue')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()
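The two loops above are just running cumulative sums, so the whole undifferencing step can also be written with np.cumsum. A check that the vectorized version matches the loop version on arbitrary numbers (stand-ins for ylog and fcast_diff2):

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.cumsum(np.cumsum(rng.normal(size=30)))  # stand-in for ylog
f2 = rng.normal(scale=0.1, size=5)             # stand-in for fcast_diff2
k = len(f2)

# loop version, exactly as in the notes
d1 = np.zeros(k)
d1[0] = (y[-1] - y[-2]) + f2[0]
for i in range(1, k):
    d1[i] = d1[i - 1] + f2[i]
lvl = np.zeros(k)
lvl[0] = y[-1] + d1[0]
for i in range(1, k):
    lvl[i] = lvl[i - 1] + d1[i]

# vectorized version: undo each difference with one cumulative sum
lvl_vec = y[-1] + np.cumsum((y[-1] - y[-2]) + np.cumsum(f2))

print(np.allclose(lvl, lvl_vec))
```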

These forecasts (except for the red ones) appear to diverge from one another because the forecast horizon is very long. Over a short horizon, they are much closer.

k_short = 20
plt.plot(tme, ylog, label = 'Data')
plt.plot(tme_future[0:k_short], fcast_arima111[0:k_short], label = 'Forecast (ARIMA(1, 1, 1) on ylog)', color = 'darkgreen')
plt.plot(tme_future[0:k_short], fcast_arima111_t[0:k_short], label = 'Forecast (ARIMA(1, 1, 1) on ylog with intercept)', color = 'red')
plt.plot(tme_future[0:k_short], fcast_arima021[0:k_short], label = 'Forecast (ARIMA(0, 2, 1) on ylog)', color = 'black')
plt.plot(tme_future[0:k_short], fcast_ma1_diff2[0:k_short], label = 'Forecast (ARIMA(0, 2, 1) on ylog with intercept)', color = 'darkblue')
plt.axvline(x=n, color='gray', linestyle='--')
plt.legend()
plt.show()