Volatility Forecasting using GARCH

The ARCH model (Autoregressive Conditional Heteroskedasticity) was created by Robert Engle (1982).

A time series is a sequence of data points ordered in time, whose structure is shaped by the order in which the observations occur.

Starting from a real-valued time series \( \{ y_t \}_{t=1}^{T} \), each \( y_t \) is a random variable. The discrete index \( t = 1, 2, \dots, T \) labels the time points. A random variable is a mathematical function that assigns a number to each possible outcome of a random experiment. For example, a random variable \( X \in \{0, 1, 2\} \) could represent the number of heads in two coin tosses.
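As a quick illustration (a sketch, not from the text above), the coin-toss random variable can be simulated:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# X = number of heads in two fair coin tosses; outcomes lie in {0, 1, 2}
tosses = rng.integers(0, 2, size=(10_000, 2))  # 0 = tails, 1 = heads
X = tosses.sum(axis=1)

# Empirical distribution should approach P(X=0)=0.25, P(X=1)=0.5, P(X=2)=0.25
probs = np.bincount(X, minlength=3) / len(X)
print(probs)
```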

In time series analysis, the conditional mean is considered:

\begin{equation} E[y_t \mid \mathcal{F}_{t-1}] = m_t \end{equation}

Here the focus is not on the exact value of \( y_t \) itself, but on its expected value. Moreover, instead of the unconditional expectation, the conditional expectation is considered, conditioned on the sigma-algebra \( \mathcal{F}_{t-1} \), which encodes all information available up to time \( t-1 \). The model thus describes how an expectation is formed depending on what has previously occurred.
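As a toy sketch of such a conditional mean (the AR(1) form \( m_t = \phi\, y_{t-1} \) and the coefficient value are illustrative assumptions, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(seed=1)
y = rng.normal(size=100)  # placeholder series

phi = 0.5  # illustrative AR(1) coefficient
# Conditional mean m_t = phi * y_{t-1}: the forecast uses only information up to t-1
m = np.empty_like(y)
m[0] = 0.0           # no history available at t = 0
m[1:] = phi * y[:-1]
```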

We introduce the concept of residuals \(\varepsilon_t \), which describe the deviation of the observed time series value \( y_t \) from the estimated conditional mean \( m_t \).

In ARCH modeling, the interest lies less in the accuracy of the mean value and more in the structure of these deviations \( \varepsilon_t \).

\begin{equation} \varepsilon_t = y_t - m_t \end{equation}

This key equation provides the rationale and motivation for ARCH models. In previous models, the variance, i.e., the dispersion of the residuals, was assumed to be constant. However, in many financial market analyses and corresponding time series studies, it has been observed that this assumption does not hold. The constant variance of the residuals is referred to as homoskedasticity.

\begin{equation} \text{Var}(\varepsilon_t \mid \mathcal{F}_{t-1}) = \sigma^2 \text{ (constant)} \end{equation}

In real markets, volatility clustering is observed: calm periods of low variance and turbulent periods of high variance tend to follow one another. Homoskedastic models typically fail in such settings. As mentioned earlier, the presence of such real-world scenarios motivates the consideration of a temporal dependency in the residuals.
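This failure can be checked empirically. The sketch below simulates a series with volatility clustering and applies Engle's ARCH-LM test (`het_arch` from statsmodels); the simulation parameters and lag choice are illustrative assumptions:

```python
import numpy as np
from statsmodels.stats.diagnostic import het_arch

rng = np.random.default_rng(seed=42)

# Simulate an ARCH(1) process: sigma_t^2 = 0.2 + 0.6 * eps_{t-1}^2
T = 2000
eps = np.zeros(T)
sigma2 = np.zeros(T)
sigma2[0] = 0.2 / (1 - 0.6)  # unconditional variance as starting value
eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, T):
    sigma2[t] = 0.2 + 0.6 * eps[t - 1] ** 2
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# Engle's LM test: null hypothesis = no ARCH effects (homoskedasticity)
lm_stat, lm_pvalue, f_stat, f_pvalue = het_arch(eps, nlags=5)
print(f"LM statistic: {lm_stat:.2f}, p-value: {lm_pvalue:.4f}")
```

A small p-value rejects homoskedasticity, indicating that an ARCH-type model is appropriate.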

\begin{equation} \mathrm{Var}(\varepsilon_t \mid \mathcal{F}_{t-1}) = \sigma_t^2 \end{equation}

The ARCH model describes this mathematical relationship by incorporating the temporal dependency of residual variance through conditional variance. This is referred to as conditional heteroskedasticity. The error term is then formulated as:

\begin{equation} \varepsilon_t = \sigma_t z_t \end{equation}

The time-dependent error \( \varepsilon_t \) is defined as the product of the time-dependent standard deviation \( \sigma_t \) and white noise \( z_t \). The white noise is assumed to be independent of previous values and normally distributed with a mean of zero and a variance of one. The core idea is that unpredictable events, which occur randomly and without patterns, can be modeled as random shocks.

The ARCH equation is given by:

\begin{equation} \sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \dots + \alpha_p \varepsilon_{t-p}^2 \end{equation}

The conditional variance at time \( t \) is a constant \( \alpha_0 \) plus a weighted sum of the squared errors from the previous time points \( t-i \).

The coefficients must satisfy \( \alpha_0 > 0 \) and \( \alpha_1, \dots, \alpha_p \geq 0 \), since the variance can never be negative. These coefficients are typically estimated using the maximum likelihood method.
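Given a set of coefficients, the conditional variance series follows directly from the formula above. A minimal sketch with illustrative (not estimated) ARCH(2) coefficient values:

```python
import numpy as np

def arch_variance(eps, alpha0, alphas):
    """sigma_t^2 = alpha0 + sum_i alphas[i] * eps_{t-1-i}^2 for an ARCH(p) model."""
    p = len(alphas)
    T = len(eps)
    # Initialize the first p values at the unconditional variance
    sigma2 = np.full(T, alpha0 / (1 - sum(alphas)))
    for t in range(p, T):
        sigma2[t] = alpha0 + sum(a * eps[t - 1 - i] ** 2 for i, a in enumerate(alphas))
    return sigma2

# Illustrative coefficients (assumed for demonstration, not fitted)
rng = np.random.default_rng(seed=7)
eps = rng.standard_normal(500)
sigma2 = arch_variance(eps, alpha0=0.1, alphas=[0.3, 0.2])
```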

To determine these coefficients, we reconsider the error function:

\begin{equation} \varepsilon_t = y_t - m_t \end{equation}

Since \( z_t \) is standard normally distributed and \( \sigma_t \) is determined by past errors, \( \varepsilon_t \) is conditionally normally distributed, with a variance \( \sigma_t^2 \) that varies over time.

Thus, the probability density function for \( \varepsilon_t \) can be written as follows. The ratio \( \frac{\varepsilon_t^2}{\sigma_t^2} \) measures how strongly the error deviates from its expected variance, while \( \frac{1}{\sqrt{2\pi\sigma_t^2}} \) normalizes the density so that the total probability integrates to 1.

\begin{equation} f(\varepsilon_t \mid \sigma_t^2) = \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp \left( -\frac{\varepsilon_t^2}{2\sigma_t^2} \right) \end{equation}

Assuming the standardized errors are independent, the likelihood of all \( T \) observations, viewed as a function of the parameters, is the product of the conditional densities:

\begin{equation} L(\alpha_0, \alpha_1, \dots, \alpha_p) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma_t^2}} \exp \left( -\frac{\varepsilon_t^2}{2\sigma_t^2} \right) \end{equation}

To simplify calculations, the logarithm is taken, turning the product into a sum:

\begin{equation} \log L = -\frac{1}{2} \sum_{t=1}^{T} \left( \log (2\pi\sigma_t^2) + \frac{\varepsilon_t^2}{\sigma_t^2} \right) \end{equation}
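For ARCH(1), this log-likelihood can be maximized numerically. A minimal sketch using `scipy.optimize.minimize` (the true parameters, starting values, and optimizer settings are illustrative assumptions):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(seed=3)

# Simulate ARCH(1) data with true parameters alpha0 = 0.2, alpha1 = 0.5
T = 3000
eps = np.zeros(T)
eps[0] = rng.standard_normal()
for t in range(1, T):
    sigma2_t = 0.2 + 0.5 * eps[t - 1] ** 2
    eps[t] = np.sqrt(sigma2_t) * rng.standard_normal()

def neg_log_likelihood(params, eps):
    alpha0, alpha1 = params
    sigma2 = alpha0 + alpha1 * eps[:-1] ** 2   # sigma_t^2 for t = 1..T-1
    e2 = eps[1:] ** 2
    # Negative of the log-likelihood from the equation above
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + e2 / sigma2)

res = minimize(
    neg_log_likelihood,
    x0=[0.1, 0.1],
    args=(eps,),
    bounds=[(1e-6, None), (1e-6, 1 - 1e-6)],  # enforce positivity
    method="L-BFGS-B",
)
alpha0_hat, alpha1_hat = res.x
print(f"alpha0 = {alpha0_hat:.3f}, alpha1 = {alpha1_hat:.3f}")
```

With a few thousand observations, the estimates should land close to the true values used in the simulation.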

Extension of the ARCH Model to GARCH

Since the ARCH model considers only past squared errors \( \varepsilon_{t-i}^2 \) and not past variances, an extension of the model was developed. ARCH models often require a large number of lagged terms \( p \), leading to high-dimensional parameter estimation and potential overfitting. The GARCH model addresses this issue by incorporating past variances \( \sigma_{t-j}^2 \) in addition to past squared errors.

The GARCH model (Generalized Autoregressive Conditional Heteroskedasticity) extends the formula by including the following term:

\begin{equation} \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2 \end{equation}

The estimation of the parameters \( \beta_j \) also follows maximum likelihood estimation (MLE). For the model to be stationary, the parameters must satisfy \( \omega > 0 \), \( \alpha_i, \beta_j \geq 0 \), and \( \sum_i \alpha_i + \sum_j \beta_j < 1 \); if this condition is violated, the variance grows without bound, rendering the model unusable.

\begin{equation} \sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2 \end{equation}
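A minimal simulation of the GARCH(1,1) recursion (the parameter values are illustrative and satisfy the stationarity condition \( \alpha_1 + \beta_1 < 1 \)):

```python
import numpy as np

rng = np.random.default_rng(seed=5)

omega, alpha1, beta1 = 0.1, 0.1, 0.8   # alpha1 + beta1 = 0.9 < 1
T = 20_000
eps = np.zeros(T)
sigma2 = np.zeros(T)
sigma2[0] = omega / (1 - alpha1 - beta1)  # unconditional variance = 1.0
eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()

for t in range(1, T):
    # sigma_t^2 = omega + alpha1 * eps_{t-1}^2 + beta1 * sigma_{t-1}^2
    sigma2[t] = omega + alpha1 * eps[t - 1] ** 2 + beta1 * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# The sample average of sigma_t^2 should be close to omega / (1 - alpha1 - beta1)
print(sigma2.mean())
```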

When to Use Which Model?

The choice of the appropriate model depends on diagnostics of the time series: visual inspection of volatility patterns, tests for ARCH effects (such as Engle's LM test), and comparison of fitted models via information criteria such as AIC and BIC.

Implementation of Various ARCH Concepts in Python

Imports

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
from arch import arch_model
from statsmodels.stats.diagnostic import het_arch

Load Financial Data via Yahoo Finance

ticker = "^GSPC"
df = yf.download(ticker, start="2010-01-01", end="2024-01-01")
df["log_return"] = np.log(df["Close"] / df["Close"].shift(1))
returns = df["log_return"].dropna() * 100

Fit different models

models = {
    "ARCH(1)": arch_model(returns, vol="ARCH", p=1),
    "GARCH(1,1)": arch_model(returns, vol="GARCH", p=1, q=1),
    "EGARCH(1,1)": arch_model(returns, vol="EGARCH", p=1, q=1),
}

Train models and store results

results = {}
for name, model in models.items():
    results[name] = model.fit(disp="off")

Compare models via AIC/BIC

model_comparison = pd.DataFrame({
    "Model": results.keys(),
    "AIC": [res.aic for res in results.values()],
    "BIC": [res.bic for res in results.values()]
}).sort_values(by="AIC")

Plot Volatility Estimates and Save the Figure

plt.figure(figsize=(12, 6))
for name, res in results.items():
    plt.plot(res.conditional_volatility, label=name)

plt.title("Volatility Estimates from Different GARCH Models")
plt.xlabel("Time")
plt.ylabel("Conditional Volatility")
plt.legend()
plt.savefig("volatility_plot.png", dpi=300, bbox_inches="tight")

Forecast Variance over the Next Days

forecast_days = 10
forecast_results = {
    name: res.forecast(horizon=forecast_days).variance.iloc[-1].values
    for name, res in results.items()
}