Add voldep slides

2025-05-18 20:30:26 +02:00
parent 6fed9201f5
commit 00ce3d9dd9
11 changed files with 791 additions and 0 deletions

3
.vscode/settings.json vendored Normal file

@@ -0,0 +1,3 @@
{
"files.autoSave": "off"
}


@@ -53,3 +53,38 @@ cols %>%
scale_x_discrete(expand = c(0, 0)) +
scale_y_discrete(expand = c(0, 0)) +
theme_minimal() -> plot_cols
col_gas <- "blue"
col_eua <- "green"
col_oil <- "amber"
col_coal <- "brown"
col_scale2 <- function(x, rng_t) {
ret <- x
for (i in seq_along(x)) {
if (x[i] < rng_t[1]) {
ret[i] <- col_scale(rng_t[1])
} else if (x[i] > rng_t[2]) {
ret[i] <- col_scale(rng_t[2])
} else {
ret[i] <- col_scale(x[i])
}
}
return(ret)
}
rng_t <- c(-5, 5)
h_sub <- c(1, 5, 30)
col_scale <- scales::gradient_n_pal(
c(
cols[5, "green"],
cols[5, "light-green"],
cols[5, "yellow"],
# cols[5, "amber"],
cols[5, "orange"],
# cols[5, "deep-orange"],
cols[5, "red"]
),
values = seq(rng_t[1], rng_t[2], length.out = 5)
)


@@ -8,6 +8,15 @@
year = {2014},
publisher = {Taylor \& Francis}
}
@article{berrisch2023modeling,
title = {Modeling volatility and dependence of European carbon and energy prices},
author = {Berrisch, Jonathan and Pappert, Sven and Ziel, Florian and Arsova, Antonia},
journal = {Finance Research Letters},
volume = {52},
pages = {103503},
year = {2023},
publisher = {Elsevier}
}
@incollection{aastveit2019evolution,
title = {The Evolution of Forecast Density Combinations in Economics},
author = {Aastveit, Knut Are and Mitchell, James and Ravazzolo, Francesco and van Dijk, Herman K},

Binary file not shown.


@@ -2094,6 +2094,750 @@ Get these slides:
::::
# Modeling Volatility and Dependence of European Carbon and Energy Prices
---
## Motivation
:::: {.columns}
::: {.column width="48%"}
- Understanding European Allowances (EUA) dynamics is important
for several fields:
- - Portfolio & Risk Management,
- - Sustainability Planning,
- - Political decisions
- - ...
EUA prices are obviously connected to the energy market
How can the dynamics be characterized?
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
</br>
- Several Questions arise:
- - Data (Pre)processing
- - Modeling Approach
- - Evaluation
:::
::::
## Data
- EUA, natural gas, Brent crude oil, coal
- March 15, 2010, until October 14, 2022
- Data was normalized w.r.t. $\text{CO}_2$ emissions
- Emission-adjusted prices reflect one tonne of $\text{CO}_2$
- We adjusted for inflation by Eurostat's HICP, excluding energy
- Log transformation of the data to stabilize the variance
- ADF Test: All series are stationary in first differences
- Johansen's likelihood ratio trace test suggests two cointegrating relationships (levels)
- Johansen's likelihood ratio trace test suggests no cointegrating relationships (logs)
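
The inflation adjustment, log transformation, and differencing listed above can be sketched in a few lines of base R. The numbers are made up for illustration, and the column names (`price`, `hicp`) as well as the rebasing of the index to its first value are assumptions, not the pipeline used for the slides.

```r
# Toy data: nominal prices and a price index (hypothetical values)
dat <- data.frame(
  price = c(20, 22, 21, 25, 30),
  hicp  = c(100, 101, 102, 104, 105)
)

# Deflate: express prices at the price level of the first period (assumed base)
dat$real <- dat$price / (dat$hicp / dat$hicp[1])

# Log transformation to stabilize the variance
dat$log_real <- log(dat$real)

# First differences of the log series; a unit-root test would be run on these
d_log <- diff(dat$log_real)
```

A unit-root test such as the ADF test would then be applied to `dat$log_real` and `d_log` separately to check that only the differenced series is stationary.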
## Data
```{r, echo=FALSE, fig.width = 12, fig.height = 6, fig.align="center"}
readr::read_csv("assets/voldep/2022_10_14_eur_ref_co2_adj_hvpi_ex_nrg.csv") %>%
select(-EUR_USD, -hvpi_x_nrg) %>%
pivot_longer(-Date) %>%
mutate(name = factor(name,
levels = c("EUA", "Oil", "NGas", "Coal")
)) %>%
ggplot(aes(x = Date, y = value, color = name)) +
geom_line(
linewidth = linesize
) + # The ggplot2 call got a little simpler now :)
geom_vline(
# Beginning of the war
xintercept = as.Date("2022-02-24"),
col = as.character(cols[8, "grey"]),
linetype = "dashed"
) +
ylab("Prices EUR") +
xlab("Date") +
scale_color_manual(
values = as.character(c(
cols[8, col_eua],
cols[8, col_oil],
cols[8, col_gas],
cols[8, col_coal]
)),
labels = c(
"EUA",
"Oil",
"NGas",
"Coal"
)
) +
guides(color = guide_legend(override.aes = list(size = 2))) +
theme_minimal() +
theme(
legend.position = "bottom",
# legend.key.size = unit(1.5, "cm"),
legend.title = element_blank(),
text = element_text(
size = text_size
)
) +
scale_x_date(
breaks = as.Date(
paste0(
as.character(seq(2010, 2022, 2)),
"-01-01"
)
),
date_labels = "%Y"
) +
xlab(NULL) +
scale_y_continuous(trans = "log2")
```
## Modeling Approach: Overview
</br>
### VECM: Vector Error Correction Model
- Models the expectation
- Captures the long-run cointegrating relationship
- Different cointegrating ranks, including rank zero (no cointegration)
### GARCH: Generalized Autoregressive Conditional Heteroscedasticity
- Captures the variance dynamics
### Copula: Captures the dependence structure
- Captures the conditional cross-sectional dependence structure
- Dependence allowed to vary over time
## Modeling Approach: Overview
:::: {.columns}
::: {.column width="48%"}
- Let $\boldsymbol{X}_t$ be a $K$-dimensional vector at time $t$
- The forecasting target:
- Conditional joint distribution
- $F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}$
- $\mathcal{F}_{t}$ is the sigma field generated by all information available up to and including time $t$
Sklar's theorem: decompose the target into
- marginal distributions: $F_{X_{k,t}|\mathcal{F}_{t-1}}$ for $k=1,\ldots, K$, and
- copula function: $C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}$
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
Let $\boldsymbol{x}_t= (x_{1,t},\ldots, x_{K,t})^\intercal$ be the realized values
It holds that:
\begin{align}
F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}(\boldsymbol{x}_t) = C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}(\boldsymbol{u}_t) \nonumber
\end{align}
with: $\boldsymbol{u}_t =(u_{1,t},\ldots, u_{K,t})^\intercal$, $u_{k,t} = F_{X_{k,t}|\mathcal{F}_{t-1}}(x_{k,t})$
For brevity we drop the conditioning on $\mathcal{F}_{t-1}$.
The model can be specified as follows
\begin{align}
F(\boldsymbol{x}_t) = C \left[\mathbf{F}(\boldsymbol{x}_t; \boldsymbol{\mu}_t, \boldsymbol{ \sigma }_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda}); \Xi_t, \Theta\right] \nonumber
\end{align}
$\Xi_{t}$ denotes time-varying dependence parameters
$\Theta$ denotes time-invariant dependence parameters
We take $C$ as the $t$-copula
:::
::::
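
To make the decomposition concrete, here is a minimal base-R sketch that draws from a bivariate $t$-copula and attaches arbitrary marginals through their quantile functions. The correlation of 0.6, the 5 degrees of freedom, and the two marginals are illustrative assumptions and not the fitted model.

```r
set.seed(1)
n   <- 5000
nu  <- 5                            # copula degrees of freedom (assumed)
rho <- 0.6                          # copula correlation (assumed)
R   <- matrix(c(1, rho, rho, 1), 2, 2)

# Multivariate t draws: correlated normals divided by a shared chi-square scale
z <- matrix(rnorm(n * 2), n, 2) %*% chol(R)
w <- sqrt(rchisq(n, df = nu) / nu)
t_draws <- z / w                    # row i is divided by w[i]

# Probability integral transform: u_k = F_k(x_k) is uniform on (0, 1)
u <- pt(t_draws, df = nu)

# Attach any marginals via their quantile functions (arbitrary choices here)
x <- cbind(qnorm(u[, 1], mean = 0, sd = 2), qexp(u[, 2], rate = 1))
```

The columns of `u` carry only the dependence structure, while the columns of `x` combine that dependence with the chosen marginals, which is the separation Sklar's theorem provides.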
## Modeling Approach: Mean and Variance
:::: {.columns}
::: {.column width="48%"}
#### Individual marginal distributions:
$$\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}$$
#### Generalized non-central t-distributions
- To account for heavy tails
- Time varying
- expectation: $\boldsymbol{\mu}_t = (\mu_{1,t}, \ldots, \mu_{K,t})^{\intercal}$
- variance: $\boldsymbol{\sigma}_{t}^2 = (\sigma_{1,t}^2, \ldots, \sigma_{K,t}^2)^{\intercal}$
- Time invariant
- degrees of freedom: $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_K)^{\intercal}$
- noncentrality: $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)^{\intercal}$
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
#### VECM Model
\begin{align}
\Delta \boldsymbol{\mu}_t = \Pi \boldsymbol{x}_{t-1} + \Gamma \Delta \boldsymbol{x}_{t-1} \nonumber
\end{align}
where $\Pi = \alpha \beta^{\intercal}$ is the cointegrating matrix of rank $r$, $0 \leq r\leq K$.
#### GARCH model
\begin{align}
\sigma_{i,t}^2 = & \omega_i + \alpha^+_{i} (\epsilon_{i,t-1}^+)^2 + \alpha^-_{i} (\epsilon_{i,t-1}^-)^2 + \beta_i \sigma_{i,t-1}^2 \nonumber
\end{align}
where $\epsilon_{i,t-1}^+ = \max\{\epsilon_{i,t-1}, 0\}$ ...
Separate coefficients for positive and negative innovations to capture leverage effects.
:::
::::
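
The variance recursion can be written as a short loop. This is a sketch for a single series; the function name `garch_asym`, the starting value `sigma2_init`, and the definition $\epsilon^- = \min\{\epsilon, 0\}$ for the negative part are assumptions made for illustration.

```r
# Asymmetric GARCH(1,1) variance recursion for one series.
# eps: vector of innovations; returns the conditional variances sigma2[t].
garch_asym <- function(eps, omega, alpha_pos, alpha_neg, beta, sigma2_init) {
  n <- length(eps)
  sigma2 <- numeric(n)
  sigma2[1] <- sigma2_init
  for (t in seq_len(n)[-1]) {
    e_pos <- max(eps[t - 1], 0)     # positive part of the last innovation
    e_neg <- min(eps[t - 1], 0)     # negative part (assumed definition)
    sigma2[t] <- omega +
      alpha_pos * e_pos^2 +
      alpha_neg * e_neg^2 +
      beta * sigma2[t - 1]
  }
  sigma2
}

# A negative shock raises the variance more than a positive one of equal size
s2 <- garch_asym(
  eps = c(1, -1, 0), omega = 0.1,
  alpha_pos = 0.05, alpha_neg = 0.15, beta = 0.8, sigma2_init = 1
)
```

With `alpha_neg > alpha_pos`, negative innovations increase next-period variance more strongly, which is the leverage effect mentioned above.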
## Modeling Approach: Dependence
:::: {.columns}
::: {.column width="48%"}
#### Time-varying dependence parameters
\begin{align*}
\Xi_{t} = & \Lambda\left(\boldsymbol{\xi}_{t}\right)
\\
\xi_{ij,t} = & \eta_{0,ij} + \eta_{1,ij} \xi_{ij,t-1} + \eta_{2,ij} z_{i,t-1} z_{j,t-1},
\end{align*}
$\xi_{ij,t}$ is a latent process
$z_{i,t}$ denotes the standardized residual of time series $i$ at time point $t$
$\Lambda(\cdot)$ is a link function
- ensures that $\Xi_{t}$ is a valid variance covariance matrix
- ensures that $\Xi_{t}$ does not exceed its support space and remains positive semi-definite
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
#### Maximum Likelihood Estimation
All parameters can be estimated jointly. Using conditional independence:
\begin{align*}
L = f_{X_1} \prod_{i=2}^T f_{X_i|\mathcal{F}_{i-1}},
\end{align*}
with multivariate conditional density:
\begin{align*}
f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu},
\boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda})
\end{align*}
The copula density $c$ can be derived analytically.
:::
::::
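
The latent recursion for a single pair $(i, j)$ can be sketched as follows. The link function is not specified on the slide, so `tanh` is used here purely as an assumed example that maps the latent value into $(-1, 1)$; the function names and the starting value `xi_init` are hypothetical as well.

```r
# Latent dependence process for one pair (i, j)
xi_path <- function(z_i, z_j, eta0, eta1, eta2, xi_init = 0) {
  n <- length(z_i)
  xi <- numeric(n)
  xi[1] <- xi_init
  for (t in seq_len(n)[-1]) {
    xi[t] <- eta0 + eta1 * xi[t - 1] + eta2 * z_i[t - 1] * z_j[t - 1]
  }
  xi
}

# Assumed link: tanh keeps a single correlation inside (-1, 1)
link_tanh <- function(xi) tanh(xi)

xi  <- xi_path(
  z_i = c(1, 2, 0), z_j = c(1, -1, 0),
  eta0 = 0, eta1 = 0.9, eta2 = 0.1
)
rho <- link_tanh(xi)
```

For more than two series, an elementwise link like this does not by itself guarantee a positive semi-definite matrix, so a matrix-valued link would be needed there.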
## Study Design and Evaluation
:::: {.columns}
::: {.column width="48%"}
#### Rolling-window forecasting study
- 3257 observations total
- Window size: 1000 days (~ four years)
- Forecasting 30-steps-ahead
=> 2227 potential starting points
We sample 250 to reduce computational cost
We draw $2^{11}= 2048$ trajectories from the joint predictive distribution
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
#### Evaluation
Forecasts are evaluated by the energy score (ES)
\begin{align*}
\text{ES}_t(F, \mathbf{x}_t) = \mathbb{E}_{F} \left(||\tilde{\mathbf{X}}_t - \mathbf{x}_t||_2\right) - \frac{1}{2} \mathbb{E}_F \left(||\tilde{\mathbf{X}}_t - \tilde{\mathbf{X}}_t'||_2 \right)
\end{align*}
where $\mathbf{x}_t$ is the observed $K$-dimensional realization, and $\tilde{\mathbf{X}}_t$ and $\tilde{\mathbf{X}}_t'$ are independent random vectors distributed according to $F$
For univariate cases the Energy Score becomes the Continuous Ranked Probability Score (CRPS)
:::
::::
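
When $F$ is only available through simulated trajectories, both expectations in the energy score can be replaced by sample means. The sketch below is one common plug-in estimator written in base R; the function name `energy_score` and the choice to average over all $M^2$ pairs (including identical pairs) are assumptions, not necessarily the estimator used in the study.

```r
# Sample-based energy score.
# draws: M x K matrix of simulated vectors; obs: observed vector of length K.
energy_score <- function(draws, obs) {
  draws <- as.matrix(draws)
  # E||X - x||: mean Euclidean distance between each draw and the observation
  term1 <- mean(sqrt(rowSums(sweep(draws, 2, obs)^2)))
  # E||X - X'||: mean pairwise distance over all M^2 pairs (diagonal is zero)
  term2 <- mean(as.matrix(dist(draws)))
  term1 - 0.5 * term2
}

# Univariate toy case: the ensemble {0, 2} evaluated at the observation 1
es_uni <- energy_score(matrix(c(0, 2), ncol = 1), 1)

# A forecast that puts all mass on the observation scores zero
es_deg <- energy_score(matrix(1, nrow = 3, ncol = 2), c(1, 1))
```

Lower values are better. With $K = 1$ the same function returns a sample-based CRPS.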
## Energy Scores
:::: {.columns}
::: {.column width="48%"}
Relative improvement in ES compared to $\text{RW}^{\sigma, \rho}$
Cell color: based on the test statistic of the Diebold-Mariano test (testing whether the model outperforms the benchmark, greener = better).
```{r, echo=FALSE, results='asis'}
load("assets/voldep/energy_df.Rdata")
table_energy <- energy %>%
kbl(
digits = 2,
col.names = c(
"Model",
"\\(\\text{ES}^{\\text{All}}_{1-30}\\)",
"\\(\\text{ES}^{\\text{EUA}}_{1-30}\\)",
"\\(\\text{ES}^{\\text{Oil}}_{1-30}\\)",
"\\(\\text{ES}^{\\text{NGas}}_{1-30}\\)",
"\\(\\text{ES}^{\\text{Coal}}_{1-30}\\)",
"\\(\\text{ES}^{\\text{All}}_{1}\\)",
"\\(\\text{ES}^{\\text{All}}_{5}\\)",
"\\(\\text{ES}^{\\text{All}}_{30}\\)"
),
# linesep = "",
bootstrap_options = "condensed",
# Dont replace any string, dataframe has to be valid latex code ...
escape = TRUE,
format = "html",
align = c("l", rep("r", ncol(energy) - 1))
) %>%
kable_paper(full_width = FALSE)
for (i in 2:ncol(energy)) {
bold_cells <- rep(FALSE, times = nrow(energy))
if (all(energy[-1, i, drop = TRUE] < 0)) {
bold_cells[1] <- TRUE
} else {
bold_cells[which.max(energy[-1, i, drop = TRUE]) + 1] <- TRUE
}
table_energy <- table_energy %>%
column_spec(i,
background = c(
cols[5, "grey"],
col_scale2(
energy_dm[, i, drop = TRUE][-1],
rng_t
)
),
bold = bold_cells
)
}
table_energy %>%
kable_styling(
bootstrap_options = c("condensed"),
full_width = FALSE,
font_size = 16
)
```
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
- Benchmarks:
- $\text{RW}^{\sigma, \rho}$: Random walk with constant volatility and correlation
- Univariate $\text{ETS}^{\sigma}$ with constant volatility
- Vector ETS $\text{VES}^{\sigma}$ with constant volatility
- Heteroscedasticity is a main driver of ES
- The VECM model without cointegration (essentially a VAR) is the best performing model in terms of ES overall
- For EUA, the ETS Benchmark is the best performing model in terms of ES
:::
::::
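
The cell colors rest on a Diebold-Mariano statistic computed from the loss differences between a model and the benchmark. A minimal sketch, assuming the simplest version without any correction for serial correlation in the loss differences (the function name `dm_stat` and the toy losses are hypothetical):

```r
# Simple Diebold-Mariano statistic: studentized mean loss difference.
# Negative values indicate that the model has lower loss than the benchmark.
dm_stat <- function(loss_model, loss_bench) {
  d <- loss_model - loss_bench
  mean(d) / sqrt(var(d) / length(d))
}

# Toy losses (made up): the model is better in three of four periods
stat <- dm_stat(
  loss_model = c(1, 2, 3, 4),
  loss_bench = c(2, 2, 5, 5)
)
```

Multi-step forecasts typically produce autocorrelated loss differences, so in practice the plain variance would usually be replaced by a long-run variance estimate.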
## CRPS Scores
:::: {.columns}
::: {.column width="28%"}
- CRPS solely evaluates the marginal distributions
- The cross-sectional dependence is ignored
- VES models deliver poor performance in short horizons
- For oil prices, the RW benchmark can't be outperformed 30 steps ahead
- Both VECM models generally deliver good performance
:::
::: {.column width="2%"}
:::
::: {.column width="68%"}
Improvement in CRPS of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$ (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
```{r, echo=FALSE, results = 'asis'}
load("assets/voldep/crps_df.Rdata")
table_crps <- crps %>%
kbl(
col.names = c("Model", rep(paste0("H", h_sub), 4)),
digits = 1,
linesep = "",
# Dont replace any string, dataframe has to be valid latex code ...
escape = FALSE,
format = "html",
booktabs = TRUE,
align = c("l", rep("r", ncol(crps) - 1))
) %>%
kable_paper(full_width = FALSE)
for (i in 2:ncol(crps)) {
bold_cells <- rep(FALSE, times = nrow(crps))
if (all(crps[-1, i, drop = TRUE] < 0)) {
bold_cells[1] <- TRUE
} else {
bold_cells[which.max(crps[-1, i, drop = TRUE]) + 1] <- TRUE
}
table_crps <- table_crps %>%
column_spec(i,
background = c(cols[5, "grey"], col_scale2(
crps_dm[, i, drop = TRUE][-1], rng_t
)),
bold = bold_cells
)
}
table_crps <- table_crps %>%
add_header_above(c(" ", "EUA" = 3, "Oil" = 3, "NGas" = 3, "Coal" = 3))
table_crps %>%
kable_styling(
bootstrap_options = c("condensed"),
full_width = FALSE,
font_size = 16
)
```
:::
::::
## RMSE
:::: {.columns}
::: {.column width="28%"}
RMSE measures the performance of the forecasts at their mean
</br>
- Some models beat the benchmarks at short horizons
</br>
Conclusion: the improvements seen before must be attributed to other parts of the multivariate probabilistic predictive distribution
:::
::: {.column width="2%"}
:::
::: {.column width="68%"}
Improvement in RMSE score of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$ (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
```{r, echo=FALSE, results = 'asis'}
load("assets/voldep/rmsq_df.Rdata")
table_rmsq <- rmsq %>%
kbl(
col.names = c("Model", rep(paste0("H", h_sub), 4)),
digits = 1,
# Dont replace any string, dataframe has to be valid latex code ...
escape = FALSE,
linesep = "",
format = "html",
align = c("l", rep("r", ncol(rmsq) - 1))
) %>%
kable_paper(full_width = FALSE)
for (i in 2:ncol(rmsq)) {
bold_cells <- rep(FALSE, times = nrow(rmsq))
if (all(rmsq[-1, i, drop = TRUE] < 0)) {
bold_cells[1] <- TRUE
} else {
bold_cells[which.max(rmsq[-1, i, drop = TRUE]) + 1] <- TRUE
}
table_rmsq <- table_rmsq %>%
column_spec(i,
background = c(cols[5, "grey"], col_scale2(
rmsq_dm[, i, drop = TRUE][-1], rng_t
)),
bold = bold_cells
)
}
table_rmsq <- table_rmsq %>%
add_header_above(c(" ", "EUA" = 3, "Oil" = 3, "NGas" = 3, "Coal" = 3))
table_rmsq %>%
kable_styling(
bootstrap_options = c("condensed"),
full_width = FALSE,
font_size = 14
)
```
:::
::::
## Evolution of Linear Dependence $\Xi$
```{r, echo=FALSE, fig.width = 12, fig.height = 6, fig.align="center"}
load("assets/voldep/plot_rho_df.Rdata")
ggplot() +
geom_line(
linewidth = linesize,
data = plot_data[plot_data$name == "5Oil-Coal", ],
aes(x = idx, y = value, col = name)
) +
geom_line(
linewidth = linesize,
data = plot_data[plot_data$name == "2EUA-NGas", ],
aes(x = idx, y = value, col = name)
) +
geom_line(
linewidth = linesize,
data = plot_data[plot_data$name == "1EUA-Oil", ],
aes(x = idx, y = value, col = name)
) +
geom_line(
linewidth = linesize,
data = plot_data[plot_data$name == "3EUA-Coal", ],
aes(x = idx, y = value, col = name)
) +
geom_line(
linewidth = linesize,
data = plot_data[plot_data$name == "6NGas-Coal", ],
aes(x = idx, y = value, col = name)
) +
geom_line(
linewidth = linesize,
data = plot_data[plot_data$name == "4Oil-NGas", ],
aes(x = idx, y = value, col = name)
) +
geom_vline(
xintercept = as.Date("2022-02-24"), # Beginning of the war
col = as.character(cols[8, "grey"]),
linetype = "dashed"
) +
geom_vline(
xintercept = as.Date("2022-02-28"), # Central Bank of Russia was blocked from accessing its foreign reserves
col = as.character(cols[8, "grey"]),
linetype = "dotted"
) +
scale_colour_manual(
values = as.character(
cols[
6,
c("green", "purple", "blue", "red", "orange", "brown")
]
),
labels = c(
paste0(varnames[comb[, 1] + 1], "-", varnames[comb[, 2] + 1])
)
) +
theme_minimal() +
theme(
zoom.x = element_rect(fill = cols[4, "grey"], colour = NA),
legend.position = "bottom",
text = element_text(size = text_size)
) +
ylab("Correlation") +
scale_x_date(
breaks = date_breaks_fct,
labels = date_labels_fct
) +
xlab(NULL) +
guides(col = guide_legend(
title = NULL,
nrow = 1,
byrow = TRUE,
override.aes = list(size = 2)
)) +
# scale_y_continuous(breaks = seq(-1.5, 1.0, 0.5)) +
ggforce::facet_zoom(
xlim = c(
as.Date("2022-02-02"),
as.Date("2022-04-30")
),
zoom.size = 1.5,
show.area = TRUE
)
```
## Predictive Quantiles (Russian Invasion)
```{r, echo=FALSE, fig.width = 12, fig.height = 6, fig.align="center"}
load("assets/voldep/plot_quant_df.Rdata")
plot_quant_data %>% ggplot(aes(x = date, y = value)) +
geom_line(linewidth = 1, aes(col = quant)) +
geom_vline(
xintercept = as.Date("2022-02-24"), # Beginning of the war
col = as.character(cols[8, "grey"]),
linetype = "dashed"
) +
geom_vline(
xintercept = as.Date("2022-02-28"), # Central Bank of Russia was blocked from accessing its foreign reserves
col = as.character(cols[8, "grey"]),
linetype = "dotted"
) +
scale_colour_manual(
values = as.character(
c(
cols[
5,
c(1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15)
],
cols[9, 18]
)
)
) +
facet_grid(var ~ ., scales = "free_y") +
theme_minimal() +
theme(
legend.position = "bottom",
text = element_text(size = text_size),
strip.background = element_rect(
fill = cols[1, 18],
colour = cols[1, 18]
)
) +
ylab("EUR") +
scale_x_date(
breaks = (seq(
as.Date("2022-02-15"),
as.Date("2022-03-31"),
by = "4 day"
) + 1)[-12],
date_labels = "%b %d",
limits = c(as.Date("2022-02-15"), as.Date("2022-03-31"))
) +
xlab(NULL) +
guides(col = guide_legend(
title = "Quantiles [\\%]",
nrow = 2,
byrow = TRUE,
override.aes = list(size = 2)
)) +
scale_y_continuous(
trans = "log2",
breaks = y_breaks
)
```
## Conclusion
:::: {.columns}
::: {.column width="48%"}
Accounting for heteroscedasticity or stabilizing the variance via log transformation is crucial for good performance in terms of ES
- Price dynamics emerged well before the Russian invasion of Ukraine
- Linear dependence between the series reacted only right after the invasion
- Improvements in forecasting performance are mainly attributed to:
- the tails of the multivariate probabilistic predictive distribution
- the dependence structure between the marginals
:::
::: {.column width="2%"}
:::
::: {.column width="48%"}
</br>
<center>
<img src="assets/voldep/frame.png">
</center>
`r fontawesome::fa("newspaper")` `r Citet(my_bib, "berrisch2023modeling")`
:::
::::
## Columns Template