3672 lines
100 KiB
Plaintext
3672 lines
100 KiB
Plaintext
---
|
||
title: "Data Science Methods for Forecasting in Energy and Economics"
|
||
date: 30 June 2025
|
||
author:
|
||
- name: Jonathan Berrisch
|
||
affiliations:
|
||
- ref: hemf
|
||
affiliations:
|
||
- id: hemf
|
||
name: University of Duisburg-Essen, House of Energy, Climate and Finance
|
||
format:
|
||
revealjs:
|
||
embed-resources: false
|
||
self-contained-math: true
|
||
footer: ""
|
||
width: 1280
|
||
height: 720
|
||
logo: assets/logos_combined.png
|
||
theme: [default, sydney.scss, custom.scss]
|
||
smaller: true
|
||
fig-format: svg
|
||
slide-number: true
|
||
crossrefs-hover: true
|
||
pagetitle: "De-Fence"
|
||
html-math-method: mathjax
|
||
pointer:
|
||
color: "#1B5E20"
|
||
include-in-header:
|
||
- file: custom.html
|
||
execute:
|
||
daemon: false
|
||
highlight-style: github
|
||
bibliography: assets/library.bib
|
||
csl: apa-old-doi-prefix.csl
|
||
revealjs-plugins:
|
||
- pointer
|
||
---
|
||
|
||
## Outline {.center}
|
||
|
||
<!--
|
||
Render with: quarto preview /home/jonathan/git/PHD-Presentation/25_07_phd_defense/index.qmd --no-browser --port 6074
|
||
-->
|
||
|
||
::: {.hidden}
|
||
$$
|
||
\newcommand{\A}{{\mathbb A}}
|
||
$$
|
||
:::
|
||
|
||
:::: {style="font-size: 150%;"}
|
||
|
||
<i class="fa fa-fw fa-rocket" style="color:var(--col_grey_9);"></i>   [Research Motivation](#motivation)
|
||
|
||
<i class="fa fa-fw fa-book" style="color:var(--col_grey_9);"></i>   [Overview of the Thesis](#sec-overview)
|
||
|
||
<i class="fa fa-fw fa-layer-group" style="color:var(--col_grey_9);"></i>   [Online Aggregation](#sec-crps-learning)
|
||
|
||
<i class="fa fa-fw fa-chart-line" style="color:var(--col_grey_9);"></i>   [Probabilistic Forecasting of European Carbon and Energy Prices](#sec-voldep)
|
||
|
||
<i class="fa fa-fw fa-newspaper" style="color:var(--col_grey_9);"></i>   [Contributions & Outlook](#sec-conclusion)
|
||
|
||
:::
|
||
|
||
```{r, setup, include=FALSE}
|
||
# Compile with: rmarkdown::render("crps_learning.Rmd")
|
||
library(latex2exp)
|
||
library(ggplot2)
|
||
library(dplyr)
|
||
library(tidyr)
|
||
library(purrr)
|
||
library(kableExtra)
|
||
library(RefManageR)
|
||
knitr::opts_chunk$set(
|
||
dev = "svglite" # Use svg figures
|
||
)
|
||
BibOptions(
|
||
check.entries = TRUE,
|
||
bib.style = "authoryear",
|
||
cite.style = "authoryear",
|
||
style = "html",
|
||
hyperlink = TRUE,
|
||
dashed = FALSE
|
||
)
|
||
source("assets/01_common.R")
|
||
col_lightgray <- "#e7e7e7"
|
||
col_blue <- "#000088"
|
||
col_smooth_expost <- "#a7008b"
|
||
col_constant <- "#dd9002"
|
||
col_optimum <- "#666666"
|
||
col_smooth <- "#187a00"
|
||
col_pointwise <- "#008790"
|
||
col_green <- "#61B94C"
|
||
col_orange <- "#ffa600"
|
||
col_yellow <- "#FCE135"
|
||
```
|
||
|
||
## The beginning: June 2020
|
||
|
||
](assets/allisonhorst/the_beginning_cropped.png)
|
||
|
||
## Motivation
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="53%"}
|
||
|
||
<table style="width: 100%; border-collapse: separate; border-spacing: 0 0.8em; border: none;">
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; width: 1.5em; border: none;">
|
||
<i class="fa fa-fw fa-network-wired" style="color:var(--col_grey_9);"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Energy market liberalization created complex, interconnected trading systems
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-wind" style="color:var(--col_grey_9);"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Renewable energy transition introduces uncertainty and volatility from weather-dependent generation
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-chart-line" style="color:var(--col_grey_9);"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Traditional point forecasts are inadequate for modern energy markets with increasing uncertainty
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-percent" style="color:var(--col_grey_9);"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Risk inherently *is* a probabilistic concept
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-dice" style="color:var(--col_grey_9);"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
**Probabilistic forecasting** essential for risk management, planning and decision making in volatile energy environments
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-sync-alt" style="color:var(--col_grey_9);"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
**Online learning** methods needed for fast-updating models with streaming energy data
|
||
</td>
|
||
</tr>
|
||
</table>
|
||
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="43%"}
|
||
|
||

|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Overview of the Thesis {#sec-overview}
|
||
|
||
::: {.r-stack}
|
||
|
||
::: {.fragment .fade-in-then-out}
|
||
|
||
<table style="width: 100%; border-collapse: separate; border-spacing: 0 1em; border: none;">
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; width: 2em; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. CRPS learning. <em>Journal of Econometrics</em>, 237(2), 105221.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH20241568]. Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. <em>International Journal of Forecasting</em>, 40(4), 1568–1586.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Hirsch, S., Berrisch, J., & Ziel, F. [-@hirsch2024online]. Online Distributional Regression. <em>arXiv preprint</em> arXiv:2407.08750.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@berrisch2022distributional]. Distributional modeling and forecasting of natural gas prices. <em>Journal of Forecasting</em>, 41(6), 1065–1086.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. [-@berrisch2023modeling]. Modeling volatility and dependence of European carbon and energy prices. <em>Finance Research Letters</em>, 52, 103503.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., Narajewski, M., & Ziel, F. [-@BERRISCH2023100236]. High-resolution peak demand estimation using generalized additive models and deep neural networks. <em>Energy and AI</em>, 13, 100236.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J. [-@berrisch2025rcpptimer]. rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support. <em>arXiv preprint</em> arXiv:2501.15856.
|
||
</td>
|
||
</tr>
|
||
</table>
|
||
|
||
:::
|
||
|
||
::: {.fragment .fade-in-then-out}
|
||
|
||
<table style="width: 100%; border-collapse: separate; border-spacing: 0 1em; border: none;">
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; width: 2em; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. CRPS learning. <em>Journal of Econometrics</em>, 237(2), 105221.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH20241568]. Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. <em>International Journal of Forecasting</em>, 40(4), 1568–1586.
|
||
</td>
|
||
</tr>
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Hirsch, S., Berrisch, J., & Ziel, F. [-@hirsch2024online]. Online Distributional Regression. <em>arXiv preprint</em> arXiv:2407.08750.
|
||
</td>
|
||
</tr>
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@berrisch2022distributional]. Distributional modeling and forecasting of natural gas prices. <em>Journal of Forecasting</em>, 41(6), 1065–1086.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. [-@berrisch2023modeling]. Modeling volatility and dependence of European carbon and energy prices. <em>Finance Research Letters</em>, 52, 103503.
|
||
</td>
|
||
</tr>
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., Narajewski, M., & Ziel, F. [-@BERRISCH2023100236]. High-resolution peak demand estimation using generalized additive models and deep neural networks. <em>Energy and AI</em>, 13, 100236.
|
||
</td>
|
||
</tr>
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J. [-@berrisch2025rcpptimer]. rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support. <em>arXiv preprint</em> arXiv:2501.15856.
|
||
</td>
|
||
</tr>
|
||
</table>
|
||
|
||
:::
|
||
|
||
::: {.fragment .fade-in-then-out}
|
||
|
||
<table style="width: 100%; border-collapse: separate; border-spacing: 0 1em; border: none;">
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; width: 2em; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. CRPS learning. <em>Journal of Econometrics</em>, 237(2), 105221.
|
||
</td>
|
||
</tr>
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH20241568]. Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. <em>International Journal of Forecasting</em>, 40(4), 1568–1586.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Hirsch, S., Berrisch, J., & Ziel, F. [-@hirsch2024online]. Online Distributional Regression. <em>arXiv preprint</em> arXiv:2407.08750.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., & Ziel, F. [-@berrisch2022distributional]. Distributional modeling and forecasting of natural gas prices. <em>Journal of Forecasting</em>, 41(6), 1065–1086.
|
||
</td>
|
||
</tr>
|
||
<tr class = "greyed-out" style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. [-@berrisch2023modeling]. Modeling volatility and dependence of European carbon and energy prices. <em>Finance Research Letters</em>, 52, 103503.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J., Narajewski, M., & Ziel, F. [-@BERRISCH2023100236]. High-resolution peak demand estimation using generalized additive models and deep neural networks. <em>Energy and AI</em>, 13, 100236.
|
||
</td>
|
||
</tr>
|
||
<tr style="border: none;">
|
||
<td style="vertical-align: top; border: none;">
|
||
<i class="fa fa-fw fa-newspaper"></i>
|
||
</td>
|
||
<td style="border: none;">
|
||
Berrisch, J. [-@berrisch2025rcpptimer]. rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support. <em>arXiv preprint</em> arXiv:2501.15856.
|
||
</td>
|
||
</tr>
|
||
</table>
|
||
|
||
:::
|
||
|
||
:::
|
||
|
||
## Overview
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
#### Online Distributional Regression
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
#### Distributional Modeling and Forecasting of Natural Gas Prices
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
<center>
|
||
<img src="assets/ondir/ondir_flow.svg">
|
||
</center>
|
||
|
||
Reduces estimation time by 2-3 orders of magnitude
|
||
|
||
Maintainins competitive forecasting accuracy
|
||
|
||
Real-World Validation in Energy Markets
|
||
|
||
<i class="fa-brands fa-fw fa-python" style="color: #FFD43B;"></i> Python Package *ondil* on <i class="fa-brands fa-fw fa-github" style="color:var(--col_grey_10);"></i> [Github](https://github.com/simon-hirsch/ondil) and [PyPi](https://pypi.org/project/ondil/)
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Forecasting of Day-Ahead and Month-Ahead prices
|
||
|
||
To capture the full distribution of future prices
|
||
|
||
Extensive data analysis from 2011-2020
|
||
|
||
State-Space models, skewed Student's *t* distribution
|
||
|
||
<i class="fa-brands fa-fw fa-python" style="color: #306998;"></i> Python Package *sstudentt* on <i class="fa-brands fa-fw fa-github" style="color:var(--col_grey_10);"></i> [Github](https://github.com/BerriJ/sstudentt) and [PyPi](https://pypi.org/project/sstudentt/)
|
||
|
||
```{r, echo=FALSE, fig.width = 12, fig.height = 6, fig.align="left"}
|
||
load("assets/ngas/residuals.RData")
|
||
|
||
clr_day_ahead <- cols[5, "green"]
|
||
clr_month_ahead <- cols[5, "blue"]
|
||
|
||
line_da <- da_data %>%
|
||
ggplot(aes(y = sresids, x = obs)) +
|
||
geom_point(col = clr_day_ahead) +
|
||
geom_line(col = clr_day_ahead) +
|
||
theme_minimal() +
|
||
xlab(NULL) +
|
||
ylab(expression(z[t])) +
|
||
theme(text = element_text(size = text_size))
|
||
|
||
acf_da <- forecast::ggAcf(da_data$sresids, size = 1, col = clr_day_ahead) +
|
||
theme_minimal() +
|
||
ggtitle("") +
|
||
theme(text = element_text(size = text_size))
|
||
|
||
# Density Plot DA
|
||
den_da <- da_data %>%
|
||
ggplot(aes(x = sresids)) +
|
||
geom_histogram(aes(y = ..density..), fill = clr_day_ahead) +
|
||
geom_density(aes(y = ..density..), size = 1) +
|
||
theme_minimal() +
|
||
ylab("Density") +
|
||
xlab(NULL) +
|
||
theme(text = element_text(size = text_size))
|
||
|
||
line_ma <- ma_data %>%
|
||
ggplot(aes(y = sresids, x = obs)) +
|
||
geom_point(col = clr_month_ahead) +
|
||
geom_line(col = clr_month_ahead) +
|
||
theme_minimal() +
|
||
xlab(NULL) +
|
||
ylab(expression(z[t])) +
|
||
theme(text = element_text(size = text_size))
|
||
|
||
acf_ma <- forecast::ggAcf(ma_data$sresids, size = 1, col = clr_month_ahead) +
|
||
theme_minimal() +
|
||
ggtitle("") +
|
||
theme(text = element_text(size = text_size))
|
||
|
||
den_ma <- ma_data %>%
|
||
ggplot(aes(x = sresids)) +
|
||
geom_histogram(aes(y = after_stat(density)), fill = clr_month_ahead) +
|
||
geom_density(aes(y = after_stat(density)), size = 1) +
|
||
theme_minimal() +
|
||
ylab("Density") +
|
||
xlab(NULL) +
|
||
theme(text = element_text(size = text_size))
|
||
|
||
plots_da <- cowplot::align_plots(line_da, acf_da, line_ma, acf_ma, align = "v", axis = "l")
|
||
|
||
bottom_row_da <- cowplot::plot_grid(plots_da[[2]], den_da, align = "hv")
|
||
|
||
plots_ma <- cowplot::align_plots(line_ma, acf_ma, line_ma, acf_ma, align = "v", axis = "l")
|
||
|
||
bottom_row_ma <- cowplot::plot_grid(plots_ma[[2]], den_ma, align = "hv")
|
||
|
||
cols_ngas <- c(
|
||
clr_day_ahead, clr_day_ahead, clr_day_ahead, clr_day_ahead,
|
||
clr_month_ahead, clr_month_ahead, clr_month_ahead, clr_month_ahead
|
||
)
|
||
|
||
|
||
pacfs <- map2(pacfs, cols_ngas, .f = ~ .x %>%
|
||
mutate(Col = .y)) %>%
|
||
purrr::reduce(.f = rbind)
|
||
|
||
var_labs <- c(
|
||
"Residuals", "Absolute", "Positive", "Negative",
|
||
"Residuals", "Absolute", "Positive", "Negative"
|
||
)
|
||
|
||
lvls <- unique(pacfs$Var)
|
||
|
||
names(var_labs) <- lvls
|
||
|
||
pacfs <- transform(pacfs,
|
||
Var = factor(Var,
|
||
levels = lvls,
|
||
labels = c(
|
||
"paste(Residuals, ':',~ z[t])",
|
||
"paste(Absolute, ':',~ '|', z[t] ,'|')",
|
||
"paste(Positive, ':',~ '(', z[t],')'^'+')",
|
||
"paste(Negative, ':',~ '(', z[t],')'^'-')"
|
||
)
|
||
)
|
||
)
|
||
|
||
acfs <- pacfs %>%
|
||
dplyr::filter(Lag != 0) %>%
|
||
ggplot(aes(x = Lag)) +
|
||
geom_linerange(aes(ymin = ymin, ymax = ymax, color = Col), size = 1) +
|
||
geom_line(aes(y = upper), linetype = "longdash", alpha = 0.5) +
|
||
geom_line(aes(y = lower), linetype = "longdash", alpha = 0.5) +
|
||
scale_color_identity() +
|
||
ylab("ACF") +
|
||
theme_minimal() +
|
||
facet_grid(Product ~ Var, labeller = label_parsed) +
|
||
theme(
|
||
plot.margin = unit(c(0, 0, 0, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
strip.background = element_rect(fill = "grey95", colour = "grey95")
|
||
)
|
||
|
||
acfs_da <- pacfs %>%
|
||
dplyr::filter(Lag != 0 & Product == "Day-Ahead") %>%
|
||
ggplot(aes(x = Lag)) +
|
||
geom_linerange(aes(ymin = ymin, ymax = ymax, color = Col), size = 1) +
|
||
geom_line(aes(y = upper), linetype = "longdash", alpha = 0.5) +
|
||
geom_line(aes(y = lower), linetype = "longdash", alpha = 0.5) +
|
||
scale_color_identity() +
|
||
ylab("PACF") +
|
||
theme_minimal() +
|
||
facet_grid(. ~ Var, labeller = label_parsed) +
|
||
theme(
|
||
plot.margin = unit(c(0, 0, 0, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
strip.background = element_rect(
|
||
fill = "grey95", colour = "grey95"
|
||
),
|
||
axis.title.x = element_blank(),
|
||
axis.text.x = element_blank(),
|
||
axis.ticks.x = element_blank()
|
||
)
|
||
|
||
acfs_ma <- pacfs %>%
|
||
dplyr::filter(Lag != 0 & Product == "Month-Ahead") %>%
|
||
ggplot(aes(x = Lag)) +
|
||
geom_linerange(aes(ymin = ymin, ymax = ymax, color = Col), size = 1) +
|
||
geom_line(aes(y = upper), linetype = "longdash", alpha = 0.5) +
|
||
geom_line(aes(y = lower), linetype = "longdash", alpha = 0.5) +
|
||
scale_color_identity() +
|
||
ylab("PACF") +
|
||
theme_minimal() +
|
||
scale_x_continuous(breaks = c(1, 10, 20, 30, 40)) +
|
||
facet_grid(. ~ Var, labeller = label_parsed) +
|
||
theme(
|
||
plot.margin = unit(c(0, 0, 0, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
strip.background = element_blank(), strip.text = element_blank()
|
||
)
|
||
|
||
plots_da <- cowplot::align_plots(acfs_da, line_da, line_ma, acf_ma, align = "hv", axis = "l")
|
||
|
||
bottom_row_da <- cowplot::plot_grid(
|
||
plots_da[[2]], den_da,
|
||
align = "hv",
|
||
rel_widths = c(0.66, 0.33)
|
||
)
|
||
|
||
da_plots <- cowplot::plot_grid(plots_da[[1]], bottom_row_da, nrow = 2)
|
||
|
||
plots_ma <- cowplot::align_plots(acfs_ma, line_ma, line_ma, acf_ma, align = "hv", axis = "l")
|
||
|
||
bottom_row_ma <- cowplot::plot_grid(
|
||
plots_ma[[2]],
|
||
den_ma,
|
||
align = "hv",
|
||
rel_widths = c(0.66, 0.33)
|
||
)
|
||
|
||
cowplot::plot_grid(
|
||
bottom_row_da,
|
||
bottom_row_ma,
|
||
acfs_da,
|
||
acfs_ma,
|
||
ncol = 1,
|
||
rel_heights = c(0.2, 0.2, 0.3, 0.3)
|
||
)
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Overview
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
#### High-Resolution Peak Demand Estimation Using Generalized Additive Models and Deep Neural Networks
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
#### rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Predict high-resolution electricity peaks using only low-resolution data
|
||
|
||
Combine GAMs and DNN's for superior accuracy
|
||
|
||
<i class="fa fa-fw fa-award" style="color:var(--col_red_9);"></i> Won Western Power Distribution Competition
|
||
|
||
<i class="fa fa-fw fa-award" style="color:var(--col_amber_9);"></i> Won Best-Student-Presentation Award
|
||
|
||
```{r, echo=FALSE, fig.width = 12, fig.height = 6, fig.align="center"}
|
||
load("assets/minmaxload/plot_overview.rds")
|
||
# linesize <- 1.2
|
||
ggplot() +
|
||
geom_line(
|
||
data = plot_df[plot_df$var == "1high_res_load", ],
|
||
aes(
|
||
x = time,
|
||
y = value,
|
||
colour = var
|
||
),
|
||
linewidth = linesize
|
||
) +
|
||
geom_line(
|
||
data = plot_df[plot_df$var == "2low_res", ],
|
||
aes(
|
||
x = time,
|
||
y = value,
|
||
colour = var
|
||
),
|
||
linewidth = linesize
|
||
) +
|
||
geom_line(
|
||
data = plot_df[plot_df$var == "4max", ],
|
||
aes(
|
||
x = time,
|
||
y = value,
|
||
colour = var
|
||
),
|
||
linewidth = linesize
|
||
) +
|
||
geom_line(
|
||
data = plot_df[plot_df$var == "3min", ],
|
||
aes(
|
||
x = time,
|
||
y = value,
|
||
colour = var
|
||
),
|
||
linewidth = linesize
|
||
) +
|
||
scale_color_manual(
|
||
values = as.character(c(
|
||
cols[4, col_load],
|
||
cols[9, col_load],
|
||
cols[8, col_min],
|
||
cols[8, col_max]
|
||
)),
|
||
labels = c(
|
||
"High-Resolution Load",
|
||
"Low-Resolution Load",
|
||
"Minimum",
|
||
"Maximum"
|
||
)
|
||
) +
|
||
guides(color = guide_legend(
|
||
override.aes = list(size = 2)
|
||
)) +
|
||
theme_minimal() +
|
||
ylab("Load [MW]") +
|
||
xlab("Time") +
|
||
theme(
|
||
zoom.x = element_rect(fill = cols[4, "grey"], colour = NA),
|
||
legend.position = "bottom",
|
||
legend.title = element_blank(),
|
||
text = element_text(
|
||
size = text_size + 4,
|
||
),
|
||
validate = FALSE
|
||
) +
|
||
scale_y_continuous(breaks = seq(-1.5, 1.0, 0.5)) +
|
||
ggforce::facet_zoom(
|
||
xlim = c(
|
||
as.POSIXct("2021-07-24 12:00:00", tz = "UTC"),
|
||
as.POSIXct("2021-07-24 19:00:00", tz = "UTC")
|
||
),
|
||
zoom.size = 2,
|
||
ylim = c(-.9, 1),
|
||
split = FALSE,
|
||
horizontal = FALSE,
|
||
show.area = TRUE
|
||
)
|
||
```
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
<i class="fa-brands fa-fw fa-r-project" style="color: #276DC3;"></i> R Package *rcpptimer* on <i class="fa-brands fa-fw fa-github" style="color:var(--col_grey_10);"></i> [Github](https://github.com/BerriJ/rcpptimer) and [CRAN](https://cran.r-project.org/web/packages/rcpptimer/)
|
||
|
||
Provides Rcpp bindings for <i class="fa-brands fa-fw fa-github" style="color:var(--col_grey_10);"></i> [cpptimer](https://github.com/BerriJ/rcpptimer)
|
||
|
||
tic-toc class for timing C++ code
|
||
|
||
Supports nested, overlapping and scoped timers
|
||
|
||
Supports OpenMP parallelism
|
||
|
||
```c++
|
||
//[[Rcpp::depends(rcpptimer)]]
|
||
#include <rcpptimer.h>
|
||
|
||
void main(){
|
||
Rcpp::Timer timer;
|
||
Rcpp::Timer::ScopedTimer _(timer, "ST");
|
||
|
||
timer.tic();
|
||
// Some more code
|
||
timer.toc();
|
||
} // ScopedTimer will stop automatically
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
|
||
# CRPS Learning {#sec-crps-learning}
|
||
|
||
Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. *Journal of Econometrics*, 237(2), 105221.
|
||
|
||
## Introduction
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
The Idea:
|
||
|
||
- Combine multiple forecasts instead of choosing one
|
||
|
||
- Combination weights may vary over **time**, over the **distribution** or **both**
|
||
|
||
2 Popular options for combining distributions:
|
||
|
||
- Combining across quantiles (this paper)
|
||
- Horizontal aggregation, vincentization
|
||
- Combining across probabilities
|
||
- Vertical aggregation
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Time
|
||
|
||
```{r, echo = FALSE, fig.height=6, cache = TRUE}
|
||
par(mfrow = c(3, 3), mar = c(2, 2, 2, 2))
|
||
set.seed(1)
|
||
# Data
|
||
X <- matrix(ncol = 3, nrow = 15)
|
||
X[, 1] <- seq(from = 8, to = 12, length.out = 15) + 0.25 * rnorm(15)
|
||
X[, 2] <- 10 + 0.25 * rnorm(15)
|
||
X[, 3] <- seq(from = 12, to = 8, length.out = 15) + 0.25 * rnorm(15)
|
||
# Weights
|
||
w <- matrix(ncol = 3, nrow = 15)
|
||
w[, 1] <- sin(0.1 * 1:15)
|
||
w[, 2] <- cos(0.1 * 1:15)
|
||
w[, 3] <- seq(from = -2, 0.25, length.out = 15)^2
|
||
w <- (w / rowSums(w))
|
||
# Vis
|
||
plot(X[, 1],
|
||
lwd = 4,
|
||
type = "l",
|
||
ylim = c(8, 12),
|
||
xlab = "",
|
||
ylab = "",
|
||
xaxt = "n",
|
||
yaxt = "n",
|
||
bty = "n",
|
||
col = "#2050f0"
|
||
)
|
||
plot(w[, 1],
|
||
lwd = 4, type = "l",
|
||
ylim = c(0, 1),
|
||
xlab = "",
|
||
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#2050f0"
|
||
)
|
||
text(6, 0.5, TeX("$w_1(t)$"), cex = 2, col = "#2050f0")
|
||
arrows(13, 0.25, 15, 0.0, , lwd = 4, bty = "n")
|
||
plot.new()
|
||
plot(X[, 2],
|
||
lwd = 4,
|
||
type = "l", ylim = c(8, 12),
|
||
xlab = "", ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "purple"
|
||
)
|
||
plot(w[, 2],
|
||
lwd = 4, type = "l",
|
||
ylim = c(0, 1),
|
||
xlab = "",
|
||
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "purple"
|
||
)
|
||
text(6, 0.6, TeX("$w_2(t)$"), cex = 2, col = "purple")
|
||
arrows(13, 0.5, 15, 0.5, , lwd = 4, bty = "n")
|
||
plot(rowSums(X * w), lwd = 4, type = "l", xlab = "", ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#298829")
|
||
plot(X[, 3],
|
||
lwd = 4,
|
||
type = "l", ylim = c(8, 12),
|
||
xlab = "", ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#e423b4"
|
||
)
|
||
plot(w[, 3],
|
||
lwd = 4, type = "l",
|
||
ylim = c(0, 1),
|
||
xlab = "",
|
||
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#e423b4"
|
||
)
|
||
text(6, 0.25, TeX("$w_3(t)$"), cex = 2, col = "#e423b4")
|
||
arrows(13, 0.75, 15, 1, , lwd = 4, bty = "n")
|
||
```
|
||
|
||
## Distribution
|
||
|
||
```{r, echo = FALSE, fig.height=6, cache = TRUE}
|
||
par(mfrow = c(3, 3), mar = c(2, 2, 2, 2))
|
||
set.seed(1)
|
||
# Data
|
||
X <- matrix(ncol = 3, nrow = 31)
|
||
|
||
X[, 1] <- dchisq(0:30, df = 10)
|
||
X[, 2] <- dnorm(0:30, mean = 15, sd = 5)
|
||
X[, 3] <- dexp(0:30, 0.2)
|
||
# Weights
|
||
w <- matrix(ncol = 3, nrow = 31)
|
||
w[, 1] <- sin(0.05 * 0:30)
|
||
w[, 2] <- cos(0.05 * 0:30)
|
||
w[, 3] <- seq(from = -2, 0.25, length.out = 31)^2
|
||
w <- (w / rowSums(w))
|
||
# Vis
|
||
plot(X[, 1],
|
||
lwd = 4,
|
||
type = "l",
|
||
xlab = "",
|
||
ylab = "",
|
||
xaxt = "n",
|
||
yaxt = "n",
|
||
bty = "n",
|
||
col = "#2050f0"
|
||
)
|
||
plot(X[, 2],
|
||
lwd = 4,
|
||
type = "l",
|
||
xlab = "", ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "purple"
|
||
)
|
||
plot(X[, 3],
|
||
lwd = 4,
|
||
type = "l",
|
||
xlab = "", ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#e423b4"
|
||
)
|
||
plot(w[, 1],
|
||
lwd = 4, type = "l",
|
||
ylim = c(0, 1),
|
||
xlab = "",
|
||
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#2050f0"
|
||
)
|
||
text(12, 0.5, TeX("$w_1(x)$"), cex = 2, col = "#2050f0")
|
||
arrows(26, 0.25, 31, 0.0, , lwd = 4, bty = "n")
|
||
plot(w[, 2],
|
||
lwd = 4, type = "l",
|
||
ylim = c(0, 1),
|
||
xlab = "",
|
||
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "purple"
|
||
)
|
||
text(15, 0.5, TeX("$w_2(x)$"), cex = 2, col = "purple")
|
||
arrows(15, 0.25, 15, 0, , lwd = 4, bty = "n")
|
||
plot(w[, 3],
|
||
lwd = 4, type = "l",
|
||
ylim = c(0, 1),
|
||
xlab = "",
|
||
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#e423b4"
|
||
)
|
||
text(20, 0.5, TeX("$w_3(x)$"), cex = 2, col = "#e423b4")
|
||
arrows(5, 0.25, 0, 0, , lwd = 4, bty = "n")
|
||
plot.new()
|
||
plot(rowSums(X * w),
|
||
lwd = 4, type = "l", xlab = "", ylab = "", xaxt = "n",
|
||
yaxt = "n", bty = "n", col = "#298829"
|
||
)
|
||
```
|
||
|
||
:::
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## The Framework of Prediction under Expert Advice
|
||
|
||
### The sequential framework
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Each day, $t = 1, 2, ... T$
|
||
- The **forecaster** receives predictions $\widehat{X}_{t,k}$ from $K$ **experts**
|
||
- The **forecaster** assings weights $w_{t,k}$ to each **expert**
|
||
- The **forecaster** calculates her prediction:
|
||
\begin{equation}
|
||
\widetilde{X}_{t} = \sum_{k=1}^K w_{t,k} \widehat{X}_{t,k}.
|
||
\label{eq_forecast_def}
|
||
\end{equation}
|
||
- The realization for $t$ is observedilities
|
||
- Vertical aggregation
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
- The experts can be institutions, persons, or models
|
||
- The forecasts can be point-forecasts (i.e., mean or median) or full predictive distributions
|
||
- We do not need any assumptions concerning the underlying data
|
||
- @cesa2006prediction
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
---
|
||
|
||
## The Regret
|
||
|
||
Weights are updated sequentially according to the past performance of the $K$ experts.
|
||
|
||
That is, a loss function $\ell$ is needed. This is used to compute the **cumulative regret** $R_{t,k}$
|
||
|
||
\begin{equation}
|
||
R_{t,k} = \widetilde{L}_{t} - \widehat{L}_{t,k} = \sum_{i = 1}^t \ell(\widetilde{X}_{i},Y_i) - \ell(\widehat{X}_{i,k},Y_i)\label{eq:regret}
|
||
\end{equation}
|
||
|
||
The cumulative regret:
|
||
|
||
- Indicates the predictive accuracy of the expert $k$ until time $t$.
|
||
- Measures how much the forecaster *regrets* not having followed the expert's advice
|
||
|
||
Popular loss functions for point forecasting @gneiting2011making:
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
$\ell_2$ loss:
|
||
|
||
\begin{equation}
|
||
\ell_2(x, y) = | x -y|^2 \label{eq:elltwo}
|
||
\end{equation}
|
||
|
||
Strictly proper for *mean* prediction
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
$\ell_1$ loss:
|
||
|
||
\begin{equation}
|
||
\ell_1(x, y) = | x -y| \label{eq:ellone}
|
||
\end{equation}
|
||
|
||
Strictly proper for *median* predictions
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
|
||
## Popular Algorithms and the Risk
|
||
|
||
<br/>
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="58%"}
|
||
|
||
### Popular Aggregation Algorithms
|
||
|
||
<br/>
|
||
|
||
#### The naive combination
|
||
|
||
|
||
\begin{equation}
|
||
w_{t,k}^{\text{Naive}} = \frac{1}{K}\label{eq:naive_combination}
|
||
\end{equation}
|
||
|
||
#### The exponentially weighted average forecaster (EWA)
|
||
|
||
\begin{equation}
|
||
\begin{aligned}
|
||
w_{t,k}^{\text{EWA}} & = \frac{e^{\eta R_{t,k}} }{\sum_{k = 1}^K e^{\eta R_{t,k}}}\\
|
||
& =
|
||
\frac{e^{-\eta \ell(\widehat{X}_{t,k},Y_t)} w^{\text{EWA}}_{t-1,k} }{\sum_{k = 1}^K e^{-\eta \ell(\widehat{X}_{t,k},Y_t)} w^{\text{EWA}}_{t-1,k}}
|
||
\end{aligned}\label{eq:exp_combination}
|
||
\end{equation}
|
||
|
||
Du kannst dann auch easy darauf verweisen: \ref{eq:exp_combination}.
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="38%"}
|
||
|
||
### Optimality
|
||
|
||
In stochastic settings, the cumulative Risk should be analyzed @wintenberger2017optimal:
|
||
\begin{align}
|
||
&\underbrace{\widetilde{\mathcal{R}}_t = \sum_{i=1}^t \mathbb{E}[\ell(\widetilde{X}_{i},Y_i)|\mathcal{F}_{i-1}]}_{\text{Cumulative Risk of Forecaster}} \\
|
||
&\underbrace{\widehat{\mathcal{R}}_{t,k} = \sum_{i=1}^t \mathbb{E}[\ell(\widehat{X}_{i,k},Y_i)|\mathcal{F}_{i-1}]}_{\text{Cumulative Risk of Experts}}
|
||
\label{eq_def_cumrisk}
|
||
\end{align}
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Optimal Convergence
|
||
|
||
<br/>
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### The selection problem
|
||
\begin{equation}
|
||
\frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) \stackrel{t\to \infty}{\rightarrow} a \quad \text{with} \quad a \leq 0.
|
||
\label{eq_opt_select}
|
||
\end{equation}
|
||
The forecaster is asymptotically not worse than the best expert $\widehat{\mathcal{R}}_{t,\min}$.
|
||
|
||
### The convex aggregation problem
|
||
|
||
\begin{equation}
|
||
\frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\pi} \right) \stackrel{t\to \infty}{\rightarrow} b \quad \text{with} \quad b \leq 0 .
|
||
\label{eq_opt_conv}
|
||
\end{equation}
|
||
The forecaster is asymptotically not worse than the best convex combination $\widehat{X}_{t,\pi}$ in hindsight (**oracle**).
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Optimal rates with respect to selection \eqref{eq_opt_select} and convex aggregation \eqref{eq_opt_conv} @wintenberger2017optimal:
|
||
|
||
|
||
\begin{align}
|
||
\frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) & =
|
||
\mathcal{O}\left(\frac{\log(K)}{t}\right)\label{eq_optp_select}
|
||
\end{align}
|
||
|
||
\begin{align}
|
||
\frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\pi} \right) & =
|
||
\mathcal{O}\left(\sqrt{\frac{\log(K)}{t}}\right)
|
||
\label{eq_optp_conv}
|
||
\end{align}
|
||
|
||
Algorithms can statisfy both \eqref{eq_optp_select} and \eqref{eq_optp_conv} depending on:
|
||
|
||
- The loss function
|
||
- Regularity conditions on $Y_t$ and $\widehat{X}_{t,k}$
|
||
- The weighting scheme
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
##
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Optimal Convergence
|
||
|
||
<br/>
|
||
|
||
EWA satisfies optimal selection convergence \eqref{eq_optp_select} in a deterministic setting if:
|
||
|
||
- Loss $\ell$ is exp-concave
|
||
- Learning-rate $\eta$ is chosen correctly
|
||
|
||
Those results can be converted to any stochastic setting @wintenberger2017optimal.
|
||
|
||
Optimal convex aggregation convergence \eqref{eq_optp_conv} can be satisfied by applying the kernel-trick:
|
||
|
||
\begin{align}
|
||
\ell^{\nabla}(x,y) = \ell'(\widetilde{X},y) x
|
||
\end{align}
|
||
|
||
$\ell'$ is the subgradient of $\ell$ at forecast combination $\widetilde{X}$.
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Probabilistic Setting
|
||
|
||
<br/>
|
||
|
||
**An appropriate choice:**
|
||
|
||
\begin{equation*}
|
||
\text{CRPS}(F, y) = \int_{\mathbb{R}} {(F(x) - \mathbb{1}\{ x > y \})}^2 dx \label{eq:crps}
|
||
\end{equation*}
|
||
|
||
It's strictly proper @gneiting2007strictly.
|
||
|
||
Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance varies in parts of the distribution?
|
||
|
||
<i class="fa fa-fw fa-lightbulb" style="color:var(--col_yellow_9);"></i> Utilize this relation:
|
||
|
||
\begin{equation*}
|
||
\text{CRPS}(F, y) = 2 \int_0^{1} \text{QL}_p(F^{-1}(p), y) dp.\label{eq_crps_qs}
|
||
\end{equation*}
|
||
|
||
... to combine quantiles of the probabilistic forecasts individually using the quantile-loss QL.
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## CRPS Learning Optimality
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Almost Optimal Convergence
|
||
|
||
:::: {style="font-size: 90%;"}
|
||
|
||
<i class="fa fa-fw fa-exclamation" style="color:var(--col_orange_10);"></i> QL is convex, but not exp-concave
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Bernstein Online Aggregation (BOA) lets us weaken the exp-concavity condition. It satisfies that there exist a $C>0$ such that for $x>0$ it holds that
|
||
|
||
\begin{equation}
|
||
P\left( \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\pi} \right) \leq C \log(\log(t)) \left(\sqrt{\frac{\log(K)}{t}} + \frac{\log(K)+x}{t}\right) \right) \geq
|
||
1-e^{-x}
|
||
\label{eq_boa_opt_conv}
|
||
\end{equation}
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Almost optimal w.r.t. *convex aggregation* \eqref{eq_optp_conv} @wintenberger2017optimal.
|
||
|
||
The same algorithm satisfies that there exist a $C>0$ such that for $x>0$ it holds that
|
||
\begin{equation}
|
||
P\left( \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) \leq
|
||
C\left(\frac{\log(K)+\log(\log(Gt))+ x}{\alpha t}\right)^{\frac{1}{2-\beta}} \right) \geq
|
||
1-2e^{-x}
|
||
\label{eq_boa_opt_select}
|
||
\end{equation}
|
||
|
||
if $Y_t$ is bounded, the considered loss $\ell$ is convex $G$-Lipschitz and weak exp-concave in its first coordinate.
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Almost optimal w.r.t. *selection* \eqref{eq_optp_select} @gaillard2018efficient.
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> We show that this holds for QL under feasible conditions.
|
||
|
||
:::
|
||
|
||
## Conditions + Lemma
|
||
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
**Lemma 1**
|
||
|
||
\begin{align}
|
||
2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\min}
|
||
& \leq \widehat{\mathcal{R}}^{\text{CRPS}}_{t,\min}
|
||
\label{eq_risk_ql_crps_expert} \\
|
||
2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\pi}
|
||
& \leq \widehat{\mathcal{R}}^{\text{CRPS}}_{t,\pi} .
|
||
\label{eq_risk_ql_crps_convex}
|
||
\end{align}
|
||
|
||
Pointwise can outperform constant procedures
|
||
|
||
QL is convex but not exp-concave:
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Almost optimal convergence w.r.t. *convex aggregation* \eqref{eq_boa_opt_conv} <i class="fa fa-fw fa-check" style="color:var(--col_green_9);"></i> </br>
|
||
|
||
For almost optimal congerence w.r.t. *selection* \eqref{eq_boa_opt_select} we need to check **A1** and **A2**:
|
||
|
||
QL is Lipschitz continuous:
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> **A1** holds <i class="fa fa-fw fa-check" style="color:var(--col_orange_9);"></i>
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
**A1**
|
||
|
||
For some $G>0$ it holds
|
||
for all $x_1,x_2\in \mathbb{R}$ and $t>0$ that
|
||
|
||
$$ | \ell(x_1, Y_t)-\ell(x_2, Y_t) | \leq G |x_1-x_2|$$
|
||
|
||
**A2** For some $\alpha>0$, $\beta\in[0,1]$ it holds
|
||
for all $x_1,x_2 \in \mathbb{R}$ and $t>0$ that
|
||
|
||
\begin{align*}
|
||
\mathbb{E}[
|
||
& \ell(x_1, Y_t)-\ell(x_2, Y_t) | \mathcal{F}_{t-1}] \leq \\
|
||
& \mathbb{E}[ \ell'(x_1, Y_t)(x_1 - x_2) |\mathcal{F}_{t-1}] \\
|
||
& +
|
||
\mathbb{E}\left[ \left. \left( \alpha(\ell'(x_1, Y_t)(x_1 - x_2))^{2}\right)^{1/\beta} \right|\mathcal{F}_{t-1}\right]
|
||
\end{align*}
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Almost optimal w.r.t. *selection* \eqref{eq_optp_select} @gaillard2018efficient.
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Proposition + Theorem
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Conditional quantile risk: $\mathcal{Q}_p(x) = \mathbb{E}[ \text{QL}_p(x, Y_t) | \mathcal{F}_{t-1}]$.
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> convexity properties of $\mathcal{Q}_p$ depend on the
|
||
conditional distribution $Y_t|\mathcal{F}_{t-1}$.
|
||
|
||
**Proposition 1**
|
||
|
||
Let $Y$ be a univariate random variable with (Radon-Nikodym) $\nu$-density $f$, then for the second subderivative of the quantile risk
|
||
$\mathcal{Q}_p(x) = \mathbb{E}[ \text{QL}_p(x, Y) ]$
|
||
of $Y$ it holds for all $p\in(0,1)$ that
|
||
$\mathcal{Q}_p'' = f.$
|
||
Additionally, if $f$ is a continuous Lebesgue-density with $f\geq\gamma>0$ for some constant $\gamma>0$ on its support $\text{spt}(f)$ then
|
||
is $\mathcal{Q}_p$ is $\gamma$-strongly convex.
|
||
|
||
Strong convexity with $\beta=1$ implies weak exp-concavity **A2** <i class="fa fa-fw fa-check" style="color:var(--col_green_9);"></i> @gaillard2018efficient
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> **A1** and **A2** give us almost optimal convergence w.r.t. selection \eqref{eq_boa_opt_select} <i class="fa fa-fw fa-check" style="color:var(--col_green_9);"></i> </br>
|
||
|
||
**Theorem 1**
|
||
|
||
The gradient based fully adaptive Bernstein online aggregation (BOAG) applied pointwise for all $p\in(0,1)$ on $\text{QL}$ satisfies
|
||
\eqref{eq_boa_opt_conv} with minimal CRPS given by
|
||
|
||
$$\widehat{\mathcal{R}}_{t,\pi} = 2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\pi}.$$
|
||
|
||
If $Y_t|\mathcal{F}_{t-1}$ is bounded
|
||
and has a pdf $f_t$ satifying $f_t>\gamma >0$ on its
|
||
support $\text{spt}(f_t)$ then \ref{eq_boa_opt_select} holds with $\beta=1$ and
|
||
|
||
$$\widehat{\mathcal{R}}_{t,\min} = 2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\min}$$.
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
::::
|
||
|
||
:::: {.notes}
|
||
|
||
We apply Bernstein Online Aggregation (BOA). It lets us weaken the exp-concavity condition while almost keeping the optimalities \ref{eq_optp_select} and \ref{eq_optp_conv}.
|
||
|
||
::::
|
||
|
||
## A Probabilistic Example
|
||
|
||
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Simple Example:
|
||
|
||
|
||
\begin{align}
|
||
Y_t & \sim \mathcal{N}(0,\,1) \\
|
||
\widehat{X}_{t,1} & \sim \widehat{F}_{1} = \mathcal{N}(-1,\,1) \\
|
||
\widehat{X}_{t,2} & \sim \widehat{F}_{2} = \mathcal{N}(3,\,4)
|
||
\label{eq:dgp_sim1}
|
||
\end{align}
|
||
|
||
- True weights vary over $p$
|
||
- Figures show the ECDF and calculated weights using $T=25$ realizations
|
||
- Pointwise solution creates rough estimates
|
||
- Pointwise is better than constant
|
||
- Smooth solution is better than pointwise
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## CDFs
|
||
|
||
```{r, echo = FALSE, fig.width=7, fig.height=6, fig.align='center', cache = TRUE}
|
||
source("assets/01_common.R")
|
||
load("assets/crps_learning/01_motivation_01.RData")
|
||
ggplot(df, aes(x = x, y = y, xend = xend, yend = yend)) +
|
||
stat_function(
|
||
fun = pnorm, n = 10000,
|
||
args = list(mean = dev[2], sd = experts_sd[2]),
|
||
aes(col = "Expert 2"), size = 1.5
|
||
) +
|
||
stat_function(
|
||
fun = pnorm, n = 10000,
|
||
args = list(mean = dev[1], sd = experts_sd[1]),
|
||
aes(col = "Expert 1"), size = 1.5
|
||
) +
|
||
stat_function(
|
||
fun = pnorm,
|
||
n = 10000,
|
||
size = 1.5, aes(col = "DGP") # , linetype = "dashed"
|
||
) +
|
||
geom_point(aes(col = "ECDF"), size = 1.5, show.legend = FALSE) +
|
||
geom_segment(aes(col = "ECDF")) +
|
||
geom_segment(data = tibble(
|
||
x_ = -5,
|
||
xend_ = min(y),
|
||
y_ = 0,
|
||
yend_ = 0
|
||
), aes(x = x_, xend = xend_, y = y_, yend = yend_)) +
|
||
theme_minimal() +
|
||
theme(
|
||
text = element_text(size = text_size),
|
||
legend.position = "bottom",
|
||
legend.key.width = unit(1.5, "cm")
|
||
) +
|
||
ylab("Probability p") +
|
||
xlab("Value") +
|
||
scale_colour_manual(NULL, values = c("#969696", "#252525", col_auto, col_blue)) +
|
||
guides(color = guide_legend(
|
||
nrow = 2,
|
||
byrow = FALSE # ,
|
||
# override.aes = list(
|
||
# size = c(1.5, 1.5, 1.5, 1.5)
|
||
# )
|
||
)) +
|
||
scale_x_continuous(limits = c(-5, 7.5))
|
||
```
|
||
|
||
## Weights
|
||
|
||
```{r, echo = FALSE, fig.width=7, fig.height=6, fig.align='center', cache = TRUE}
|
||
source("assets/01_common.R")
|
||
load("assets/crps_learning/01_motivation_02.RData")
|
||
ggplot() +
|
||
geom_line(data = weights[weights$var != "1Optimum", ], size = 1.5, aes(x = prob, y = val, col = var)) +
|
||
geom_line(
|
||
data = weights[weights$var == "1Optimum", ], size = 1.5, aes(x = prob, y = val, col = var) # , linetype = "dashed"
|
||
) +
|
||
theme_minimal() +
|
||
theme(
|
||
text = element_text(size = text_size),
|
||
legend.position = "bottom",
|
||
legend.key.width = unit(1.5, "cm")
|
||
) +
|
||
xlab("Probability p") +
|
||
ylab("Weight w") +
|
||
scale_colour_manual(
|
||
NULL,
|
||
values = c("#969696", col_pointwise, col_p_constant, col_p_smooth),
|
||
labels = modnames[-c(3, 5)]
|
||
) +
|
||
guides(color = guide_legend(
|
||
ncol = 3,
|
||
byrow = FALSE,
|
||
title.hjust = 5,
|
||
# override.aes = list(
|
||
# linetype = c(rep("solid", 5), "dashed")
|
||
# )
|
||
)) +
|
||
ylim(c(0, 1))
|
||
```
|
||
|
||
::::
|
||
|
||
:::
|
||
|
||
:::
|
||
|
||
## The Smoothing Procedures
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Penalized Smoothing
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Penalized cubic B-Splines for smoothing weights:
|
||
|
||
Let $\varphi=(\varphi_1,\ldots, \varphi_L)$ be bounded basis functions on $(0,1)$ Then we approximate $w_{t,k}$ by
|
||
|
||
\begin{align}
|
||
w_{t,k}^{\text{smooth}} = \sum_{l=1}^L \beta_l \varphi_l = \beta'\varphi
|
||
\end{align}
|
||
|
||
with parameter vector $\beta$. The latter is estimated to penalize $L_2$-smoothing which minimizes
|
||
|
||
\begin{equation}
|
||
\| w_{t,k} - \beta' \varphi \|^2_2 + \lambda \| \mathcal{D}^{d} (\beta' \varphi) \|^2_2
|
||
\label{eq_function_smooth}
|
||
\end{equation}
|
||
|
||
with differential operator $\mathcal{D}$
|
||
|
||
Computation is easy, since we have an analytical solution
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
We receive the constant solution for high values of $\lambda$ when setting $d=1$
|
||
|
||
<center>
|
||
<img src="assets/crps_learning/weights_lambda.gif">
|
||
</center>
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Basis Smoothing
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Represent weights as linear combinations of bounded basis functions:
|
||
|
||
\begin{equation}
|
||
w_{t,k} = \sum_{l=1}^L \beta_{t,k,l} \varphi_l = \boldsymbol \beta_{t,k}' \boldsymbol \varphi
|
||
\end{equation}
|
||
|
||
A popular choice are are B-Splines as local basis functions
|
||
|
||
$\boldsymbol \beta_{t,k}$ is calculated using a reduced regret matrix:
|
||
|
||
\begin{equation}
|
||
\underbrace{\boldsymbol r_{t}}_{\text{LxK}} = \frac{L}{P} \underbrace{\boldsymbol B'}_{\text{LxP}} \underbrace{\left({\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)}_{\text{PxK}}
|
||
\end{equation}
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> $\boldsymbol r_{t}$ is transformed from PxK to LxK
|
||
|
||
If $L = P$ it holds that $\boldsymbol \varphi = \boldsymbol{I}$
|
||
For $L = 1$ we receive constant weights
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Weights converge to the constant solution if $L\rightarrow 1$
|
||
|
||
<center>
|
||
<img src="assets/crps_learning/weights_kstep.gif">
|
||
</center>
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
::::
|
||
|
||
---
|
||
|
||
## The Proposed CRPS-Learning Algorithm
|
||
|
||
<br/>
|
||
|
||
::: {style="font-size: 85%;"}
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="43%"}
|
||
|
||
### Initialization:
|
||
|
||
Array of expert predicitons: $\widehat{X}_{t,p,k}$
|
||
|
||
Vector of Prediction targets: $Y_t$
|
||
|
||
Starting Weights: $\boldsymbol w_0=(w_{0,1},\ldots, w_{0,K})$
|
||
|
||
Penalization parameter: $\lambda\geq 0$
|
||
|
||
B-spline and penalty matrices $\boldsymbol B$ and $\boldsymbol D$ on $\mathcal{P}= (p_1,\ldots,p_M)$
|
||
|
||
Hat matrix: $$\boldsymbol{\mathcal{H}} = \boldsymbol B(\boldsymbol B'\boldsymbol B+ \lambda (\alpha \boldsymbol D_1'\boldsymbol D_1 + (1-\alpha) \boldsymbol D_2'\boldsymbol D_2))^{-1} \boldsymbol B'$$
|
||
|
||
Cumulative Regret: $R_{0,k} = 0$
|
||
|
||
Range parameter: $E_{0,k}=0$
|
||
|
||
Starting pseudo-weights: $\boldsymbol \beta_0 = \boldsymbol B^{\text{pinv}}\boldsymbol w_0(\boldsymbol{\mathcal{P}})$
|
||
|
||
|
||
:::
|
||
|
||
::: {.column width="2%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="55%"}
|
||
|
||
### Core:
|
||
|
||
for( t in 1:T ) {
|
||
|
||
$\widetilde{\boldsymbol X}_{t} = \text{Sort}\left( \boldsymbol w_{t-1}'(\boldsymbol P) \widehat{\boldsymbol X}_{t} \right)$ <b style="color: var(--col_grey_7);"># Prediction</b>
|
||
|
||
$\boldsymbol r_{t} = \frac{L}{M} \boldsymbol B' \left({\boldsymbol{QL}}_{\boldsymbol{\mathcal P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\boldsymbol{\mathcal P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)$
|
||
|
||
$\boldsymbol E_{t} = \max(\boldsymbol E_{t-1}, \boldsymbol r_{t}^+ + \boldsymbol r_{t}^-)$
|
||
|
||
$\boldsymbol V_{t} = \boldsymbol V_{t-1} + \boldsymbol r_{t}^{ \odot 2}$
|
||
|
||
$\boldsymbol \eta_{t} =\min\left( \left(-\log(\boldsymbol \beta_{0}) \odot \boldsymbol V_{t}^{\odot -1} \right)^{\odot\frac{1}{2}} , \frac{1}{2}\boldsymbol E_{t}^{\odot-1}\right)$
|
||
|
||
$\boldsymbol R_{t} = \boldsymbol R_{t-1}+ \boldsymbol r_{t} \odot \left( \boldsymbol 1 - \boldsymbol \eta_{t} \odot \boldsymbol r_{t} \right)/2 + \boldsymbol E_{t} \odot \mathbb{1}\{-2\boldsymbol \eta_{t}\odot \boldsymbol r_{t} > 1\}$
|
||
|
||
$\boldsymbol \beta_{t} = K \boldsymbol \beta_{0} \odot \boldsymbol {SoftMax}\left( - \boldsymbol \eta_{t} \odot \boldsymbol R_{t} + \log( \boldsymbol \eta_{t}) \right)$
|
||
|
||
$\boldsymbol w_{t}(\boldsymbol P) = \underbrace{\boldsymbol B(\boldsymbol B'\boldsymbol B+ \lambda (\alpha \boldsymbol D_1'\boldsymbol D_1 + (1-\alpha) \boldsymbol D_2'\boldsymbol D_2))^{-1} \boldsymbol B'}_{\boldsymbol{\mathcal{H}}} \boldsymbol B \boldsymbol \beta_{t}$
|
||
|
||
}
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
:::
|
||
|
||
## Simulation Study
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## BOA
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Data Generating Process of the [simple probabilistic example](#simple_example):
|
||
|
||
\begin{align*}
|
||
Y_t &\sim \mathcal{N}(0,\,1)\\
|
||
\widehat{X}_{t,1} &\sim \widehat{F}_{1}=\mathcal{N}(-1,\,1) \\
|
||
\widehat{X}_{t,2} &\sim \widehat{F}_{2}=\mathcal{N}(3,\,4)
|
||
\end{align*}
|
||
|
||
- Constant solution $\lambda \rightarrow \infty$
|
||
- Pointwise Solution of the proposed BOAG
|
||
- Smoothed Solution of the proposed BOAG
|
||
- Weights are smoothed during learning
|
||
- Smooth weights are used to calculate Regret, adjust weights, etc.
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## QL Deviation
|
||
|
||
Deviation from best attainable QL (1000 runs).
|
||
|
||

|
||
|
||
## CRPS vs. Lambda
|
||
|
||
CRPS Values for different $\lambda$ (1000 runs)
|
||
|
||

|
||
|
||
## Knots
|
||
|
||
CRPS for different number of knots (1000 runs)
|
||
|
||

|
||
|
||
::::
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Comparison to EWA and ML-Poly
|
||
|
||
The same simulation carried out for different algorithms (1000 runs):
|
||
|
||
<center>
|
||
<img src="assets/crps_learning/algos_constant.gif">
|
||
</center>
|
||
|
||
## Study Forget
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="38%"}
|
||
|
||
#### New DGP:
|
||
|
||
\begin{align*}
|
||
Y_t &\sim \mathcal{N}\left(\frac{\sin(0.005 \pi t )}{2},\,1\right) \\
|
||
\widehat{X}_{t,1} &\sim \widehat{F}_{1} = \mathcal{N}(-1,\,1) \\
|
||
\widehat{X}_{t,2} &\sim \widehat{F}_{2} = \mathcal{N}(3,\,4)
|
||
\end{align*}
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Changing optimal weights
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> Single run example depicted aside
|
||
|
||
<i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> No forgetting leads to long-term constant weights
|
||
|
||
<center>
|
||
<img src="assets/crps_learning/forget.png">
|
||
</center>
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="58%"}
|
||
|
||
###
|
||
|
||
|
||
```{r, echo = FALSE, fig.width=7, fig.height=5, fig.align='center', cache = TRUE}
|
||
load("assets/crps_learning/weights_preprocessed.rda")
|
||
mod_labs <- c("Optimum", "No Forget\nPointwise", "No Forget\nP-Smooth", "Forget\nPointwise", "Forget\nP-Smooth")
|
||
names(mod_labs) <- c("Optimum", "nf_ptw", "nf_psmth", "f_ptw", "f_psmth")
|
||
colseq <- c(grey(.99), "orange", "red", "purple", "blue", "darkblue", "black")
|
||
weights_preprocessed %>%
|
||
mutate(w = 1 - w) %>%
|
||
ggplot(aes(t, p, fill = w)) +
|
||
geom_raster(interpolate = TRUE) +
|
||
facet_grid(Mod ~ ., labeller = labeller(Mod = mod_labs)) +
|
||
theme_minimal() +
|
||
theme(
|
||
# plot.margin = unit(c(0.5, 0.5, 0.5, 0.5), "cm"),
|
||
text = element_text(size = 15),
|
||
legend.key.height = unit(1, "inch")
|
||
) +
|
||
scale_x_continuous(expand = c(0, 0)) +
|
||
xlab("Time t") +
|
||
scale_fill_gradientn(
|
||
limits = c(0, 1),
|
||
colours = colseq,
|
||
breaks = seq(0, 1, 0.2)
|
||
) +
|
||
ylab("Weight w")
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
::::
|
||
|
||
## Application Study
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Overview
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="29%"}
|
||
|
||
::: {style="font-size: 85%;"}
|
||
|
||
Data:
|
||
|
||
- Forecasting European emission allowances (EUA)
|
||
- Daily month-ahead prices
|
||
- Jan 13 - Dec 20 (Phase III, 2092 Obs)
|
||
- Rolling Window (length 250 ~ 1 year)
|
||
|
||
Combination methods:
|
||
|
||
- Online
|
||
- Naive, BOAG, EWAG, ML-PolyG, BMA
|
||
- Batch
|
||
- QRlin, QRconv
|
||
|
||
::::
|
||
|
||
:::
|
||
|
||
::: {.column width="2%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="69%"}
|
||
|
||
Tuning paramter grids:
|
||
|
||
- Smoothing Penalty: $\Lambda= \{0\}\cup \{2^x|x\in \{-4,-3.5,\ldots,12\}\}$
|
||
- Learning Rates: $\mathcal{E}= \{2^x|x\in \{-1,-0.5,\ldots,9\}\}$
|
||
|
||
```{r, echo = FALSE, fig.width=10, fig.height=5, fig.align='center', cache = TRUE}
|
||
load("assets/crps_learning/overview_data.rds")
|
||
|
||
data %>%
|
||
ggplot(aes(x = Date, y = value)) +
|
||
geom_line(size = 1, col = cols[9,"blue"]) +
|
||
theme_minimal() +
|
||
ylab("Value") +
|
||
facet_wrap(. ~ name, scales = "free", ncol = 1) +
|
||
theme(
|
||
text = element_text(size = 15),
|
||
strip.background = element_blank(),
|
||
strip.text.x = element_blank()
|
||
) -> p1
|
||
|
||
data %>%
|
||
ggplot(aes(x = value)) +
|
||
geom_histogram(aes(y = ..density..), size = 1, fill = cols[9,"blue"], bins = 50) +
|
||
ylab("Density") +
|
||
xlab("Value") +
|
||
theme_minimal() +
|
||
theme(
|
||
strip.background = element_rect(fill = cols[3,"grey"], colour = cols[3,"grey"]),
|
||
text = element_text(size = text_size)
|
||
) +
|
||
facet_wrap(. ~ name, scales = "free", ncol = 1, strip.position = "right") -> p2
|
||
|
||
overview <- cowplot::plot_grid(plotlist = list(p1, p2), align = "hv", axis = "tblr", rel_widths = c(0.65, 0.35))
|
||
overview
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Experts
|
||
|
||
::: {style="font-size: 90%;"}
|
||
|
||
Simple exponential smoothing with additive errors (**ETS-ANN**):
|
||
|
||
\begin{align*}
|
||
Y_{t} = l_{t-1} + \varepsilon_t \quad \text{with} \quad l_t = l_{t-1} + \alpha \varepsilon_t \quad \text{and} \quad \varepsilon_t \sim \mathcal{N}(0,\sigma^2)
|
||
\end{align*}
|
||
|
||
Quantile regression (**QuantReg**): For each $p \in \mathcal{P}$ we assume:
|
||
|
||
\begin{align*}
|
||
F^{-1}_{Y_t}(p) = \beta_{p,0} + \beta_{p,1} Y_{t-1} + \beta_{p,2} |Y_{t-1}-Y_{t-2}|
|
||
\end{align*}
|
||
|
||
ARIMA(1,0,1)-GARCH(1,1) with Gaussian errors (**ARMA-GARCH**):
|
||
|
||
\begin{align*}
|
||
Y_{t} = \mu + \phi(Y_{t-1}-\mu) + \theta \varepsilon_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \quad \text{and} \quad Z_t \sim \mathcal{N}(0,1)
|
||
\end{align*}
|
||
|
||
ARIMA(0,1,0)-I-EGARCH(1,1) with Gaussian errors (**I-EGARCH**):
|
||
|
||
\begin{align*}
|
||
Y_{t} = \mu + Y_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z, \quad \log(\sigma_t^2) = \omega + \alpha Z_{t-1}+ \gamma (|Z_{t-1}|-\mathbb{E}|Z_{t-1}|) + \beta \log(\sigma_{t-1}^2) \quad \text{and} \quad Z_t \sim \mathcal{N}(0,1)
|
||
\end{align*}
|
||
|
||
ARIMA(0,1,0)-GARCH(1,1) with student-t errors (**I-GARCHt**):
|
||
|
||
\begin{align*}
|
||
Y_{t} = \mu + Y_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \quad \text{and} \quad Z_t \sim t(0,1, \nu)
|
||
\end{align*}
|
||
|
||
::::
|
||
|
||
## Results
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Significance
|
||
|
||
```{r, echo = FALSE, fig.width=7, fig.height=5.5, fig.align='center', cache = TRUE, results='asis'}
|
||
load("assets/crps_learning/bernstein_application_study_estimations+learnings_rev1.RData")
|
||
|
||
quantile_loss <- function(X, y, tau) {
|
||
t(t(y - X) * tau) * (y - X > 0) + t(t(X - y) * (1 - tau)) * (y - X < 0)
|
||
}
|
||
QL <- FCSTN * NA
|
||
for (k in 1:dim(QL)[1]) {
|
||
QL[k, , ] <- quantile_loss(FCSTN[k, , ], as.numeric(yoos), Qgrid)
|
||
}
|
||
|
||
## TABLE AREA
|
||
|
||
KK <- length(mnames)
|
||
TTinit <- 1 ## without first, as all comb. are uniform
|
||
RQL <- apply(QL[1:KK, -c(1:TTinit), ], c(1, 3), mean)
|
||
dimnames(RQL) <- list(mnames, Qgrid)
|
||
RQLm <- apply(RQL, c(1), mean, na.rm = TRUE)
|
||
##
|
||
qq <- apply(QL[1:KK, -c(1:TTinit), ], c(1, 2), mean)
|
||
|
||
library(xtable)
|
||
Pall <- numeric(KK)
|
||
for (i in 1:KK) Pall[i] <- t.test(qq[K + 1, ] - qq[i, ], alternative = "greater")$p.val
|
||
|
||
Mall <- (RQLm - RQLm[K + 1]) * 10000
|
||
Mout <- matrix(Mall[-c(1:(K + 3))], 5, 6)
|
||
dimnames(Mout) <- list(moname, mtname)
|
||
|
||
Pallout <- format(round(Pall, 3), nsmall = 3)
|
||
Pallout[Pallout == "0.000"] <- "<.001"
|
||
Pallout[Pallout == "1.000"] <- ">.999"
|
||
|
||
MO <- K
|
||
IDX <- c(1:K)
|
||
OUT <- t(Mall[IDX])
|
||
OUT.num <- OUT
|
||
class(OUT.num) <- "numeric"
|
||
|
||
xxx <- OUT.num
|
||
xxxx <- OUT
|
||
table <- round(OUT, 3)
|
||
table_col <- OUT
|
||
i.p <- 1
|
||
for (i.p in 1:MO) {
|
||
xmax <- -min(Mall) * 5 # max(Mall)
|
||
xmin <- min(Mall)
|
||
cred <- rev(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, .8, .5)) # , .5,0,0,0,1,1,1) ## red
|
||
cgreen <- rev(c(.5, .5, .55, .6, .65, .7, .75, .8, .85, .9, .95, 1, 1, .9)) # , .5,0,1,1,1,0,0) ## green
|
||
cblue <- rev(c(.55, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5)) # , .5,1,1,0,0,0,1) ## blue
|
||
crange <- c(xmin, xmax) ## range
|
||
## colors in plot:
|
||
fred <- round(approxfun(seq(crange[1], crange[2], length = length(cred)), cred)(pmin(xxx[, i.p], xmax)), 3)
|
||
fgreen <- round(approxfun(seq(crange[1], crange[2], length = length(cgreen)), cgreen)(pmin(xxx[, i.p], xmax)), 3)
|
||
fblue <- round(approxfun(seq(crange[1], crange[2], length = length(cblue)), cblue)(pmin(xxx[, i.p], xmax)), 3)
|
||
tmp <- format(round(xxx[, i.p], 3), nsmall = 3)
|
||
xxxx[, i.p] <- paste("\\cellcolor[rgb]{", fred, ",", fgreen, ",", fblue, "}", tmp, " {\\footnotesize (", Pallout[IDX[i.p]], ")}", sep = "")
|
||
table_col[, i.p] <- rgb(fred, fgreen, fblue, maxColorValue = 1)
|
||
table[, i.p] <- paste0(
|
||
table[, i.p],
|
||
'<sup>(', Pallout[i.p], ")</sup>"
|
||
)
|
||
}
|
||
|
||
table_out <- kbl(
|
||
table,
|
||
align = rep("c", ncol(table)),
|
||
bootstrap_options = c("condensed"),
|
||
escape = FALSE) %>%
|
||
kable_paper(full_width = TRUE) %>%
|
||
row_spec(0:nrow(table), color = cols[9, "grey"])
|
||
|
||
for (j in 1:ncol(table)) {
|
||
table_out <- table_out %>%
|
||
column_spec(j, background = table_col[, j])
|
||
}
|
||
table_out
|
||
```
|
||
|
||
```{r, echo = FALSE, fig.width=7, fig.height=5.5, fig.align='center', cache = TRUE, results='asis'}
|
||
MO <- 6
|
||
OUT <- Mout
|
||
OUT.num <- OUT
|
||
class(OUT.num) <- "numeric"
|
||
|
||
xxx <- OUT.num
|
||
xxxx <- OUT
|
||
i.p <- 1
|
||
table2 <- OUT
|
||
table_col2 <- OUT
|
||
for (i.p in 1:MO) {
|
||
xmax <- -min(Mall) * 5 # max(Mall)
|
||
xmin <- min(Mall)
|
||
cred <- rev(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, .8, .5)) # , .5,0,0,0,1,1,1) ## red
|
||
cgreen <- rev(c(.5, .5, .55, .6, .65, .7, .75, .8, .85, .9, .95, 1, 1, .9)) # , .5,0,1,1,1,0,0) ## green
|
||
cblue <- rev(c(.55, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5, .5)) # , .5,1,1,0,0,0,1) ## blue
|
||
crange <- c(xmin, xmax) ## range
|
||
## colors in plot:
|
||
fred <- round(approxfun(seq(crange[1], crange[2], length = length(cred)), cred)(pmin(xxx[, i.p], xmax)), 3)
|
||
fgreen <- round(approxfun(seq(crange[1], crange[2], length = length(cgreen)), cgreen)(pmin(xxx[, i.p], xmax)), 3)
|
||
fblue <- round(approxfun(seq(crange[1], crange[2], length = length(cblue)), cblue)(pmin(xxx[, i.p], xmax)), 3)
|
||
tmp <- format(round(xxx[, i.p], 3), nsmall = 3)
|
||
xxxx[, i.p] <- paste("\\cellcolor[rgb]{", fred, ",", fgreen, ",", fblue, "}", tmp, " {\\footnotesize (", Pallout[K + 3 + 5 * (i.p - 1) + 1:5], ")}", sep = "")
|
||
# table2[, i.p] <- paste0(tmp, " (", Pallout[K + 3 + 5 * (i.p - 1) + 1:5], ")")
|
||
|
||
table2[, i.p] <- paste0(
|
||
tmp,
|
||
"<sup>(", Pallout[K + 3 + 5 * (i.p - 1) + 1:5], ")</sup>"
|
||
)
|
||
|
||
table_col2[, i.p] <- rgb(fred, fgreen, fblue, maxColorValue = 1)
|
||
} # i.p
|
||
|
||
rownames(table2) <- c("Pointwise", "B-Smooth", "P-Smooth", "B-Constant", "P-Constant")
|
||
rownames(table_col2) <- rownames(table2)
|
||
|
||
table2["B-Smooth", c("QRlin", "QRconv")] <- "-"
|
||
table_col2["B-Smooth", c("QRlin", "QRconv")] <- cols[2, "grey"]
|
||
|
||
idx <- c("Pointwise", "B-Constant", "P-Constant", "B-Smooth", "P-Smooth")
|
||
|
||
table2["P-Smooth", "BOAG"] <- "<strong>-0.182</strong><sup>(0.039)</sup>"
|
||
|
||
table_out2 <- kableExtra::kbl(
|
||
table2[idx, ],
|
||
align = rep("c", ncol(table2)),
|
||
bootstrap_options = c("condensed"),
|
||
escape = FALSE
|
||
) %>%
|
||
kable_paper(full_width = TRUE) %>%
|
||
row_spec(0:nrow(table2[idx, ]), color = cols[9, "grey"])
|
||
|
||
for (j in seq_len(ncol(table2[idx, ]))) {
|
||
table_out2 <- table_out2 %>%
|
||
column_spec(1 + j,
|
||
background = table_col2[idx, j]
|
||
)
|
||
}
|
||
|
||
table_out2 %>%
|
||
column_spec(1, bold = T)
|
||
```
|
||
|
||
<div style="font-size: 0.7em;">
|
||
CRPS difference to Naive. Negative values correspond to better performance (the best value is bold). <br/>
|
||
Additionally, we show the p-value of the DM-test, testing against Naive. The cells are colored with respect to their values (the greener better).
|
||
</div>
|
||
|
||
## QL
|
||
|
||
```{r, echo = FALSE, fig.width=13, fig.height=5.5, fig.align='center', cache = TRUE}
|
||
|
||
##### Performance across probabilities
|
||
M <- length(mnames)
|
||
Msel <- c(1:K, K + 1, K + 1 + 2 + 1:4 * 5 - 2) ## experts + naive + smooth
|
||
modnames <- mnames[Msel]
|
||
|
||
tCOL <- c(
|
||
"#E6CC00", "#CC6600", "#E61A1A", "#99004D", "#F233BF",
|
||
"#666666", "#0000CC", "#1A80E6", "#1AE680", "#00CC00"
|
||
)
|
||
|
||
|
||
t(RQL) %>%
|
||
as_tibble() %>%
|
||
select(Naive) %>%
|
||
mutate(Naive = 0) %>%
|
||
mutate(p = 1:99 / 100) %>%
|
||
pivot_longer(-p, values_to = "Loss differences") -> dummy
|
||
|
||
t(RQL) %>%
|
||
as_tibble() %>%
|
||
select(mnames[Msel]) %>%
|
||
mutate(p = 1:99 / 100) %>%
|
||
pivot_longer(!p & !Naive) %>%
|
||
mutate(`Loss differences` = value - Naive) %>%
|
||
select(-value, -Naive) %>%
|
||
rbind(dummy) %>%
|
||
mutate(
|
||
p = as.numeric(p),
|
||
name = stringr::str_replace(name, "-P-smooth", ""),
|
||
name = factor(name, levels = stringr::str_replace(mnames[Msel], "-P-smooth", ""), ordered = T),
|
||
`Loss differences` = `Loss differences` * 1000
|
||
) %>%
|
||
ggplot(aes(x = p, y = `Loss differences`, colour = name)) +
|
||
geom_line(linewidth = 1) +
|
||
theme_minimal() +
|
||
theme(
|
||
text = element_text(size = text_size),
|
||
legend.position = "bottom"
|
||
) +
|
||
xlab("Probability p") +
|
||
scale_color_manual(NULL, values = tCOL) +
|
||
guides(colour = guide_legend(nrow = 2, byrow = TRUE))
|
||
```
|
||
|
||
## Cumulative Loss Difference
|
||
|
||
```{r, echo = FALSE, fig.width=13, fig.height=5.5, fig.align='center', cache = TRUE}
|
||
DQL <- t(apply(apply(QL[1:KK, -c(1:TTinit), ], c(1, 2), mean), 1, cumsum))
|
||
|
||
rownames(DQL) <- mnames
|
||
|
||
t(DQL) %>%
|
||
as_tibble() %>%
|
||
select(Naive) %>%
|
||
mutate(
|
||
`Difference of cumulative loss` = 0,
|
||
Date = ytime[-c(1:(TT + TTinit + 1))],
|
||
name = "Naive"
|
||
) %>%
|
||
select(-Naive) -> dummy
|
||
|
||
|
||
data <- t(DQL) %>%
|
||
as_tibble() %>%
|
||
select(mnames[Msel]) %>%
|
||
mutate(Date = ytime[-c(1:(TT + TTinit + 1))]) %>%
|
||
pivot_longer(!Date & !Naive) %>%
|
||
mutate(`Difference of cumulative loss` = value - Naive) %>%
|
||
select(-value, -Naive) %>%
|
||
rbind(dummy) %>%
|
||
mutate(
|
||
name = stringr::str_replace(name, "-P-smooth", ""),
|
||
name = factor(name, levels = stringr::str_replace(mnames[Msel], "-P-smooth", ""))
|
||
)
|
||
|
||
data %>%
|
||
ggplot(aes(x = Date, y = `Difference of cumulative loss`, colour = name)) +
|
||
geom_line(size = 1) +
|
||
theme_minimal() +
|
||
theme(
|
||
text = element_text(size = text_size),
|
||
legend.position = "bottom"
|
||
) +
|
||
scale_color_manual(NULL, values = tCOL) +
|
||
guides(colour = guide_legend(nrow = 2, byrow = TRUE))
|
||
```
|
||
|
||
## Weights (BOAG P-Smooth)
|
||
|
||
```{r, echo = FALSE, fig.width=13, fig.height=5.5, fig.align='center', cache = TRUE}
|
||
load("assets/crps_learning/weights_data.RData")
|
||
weights_data %>%
|
||
ggplot(aes(Date, p, fill = w)) +
|
||
geom_raster(interpolate = TRUE) +
|
||
facet_grid(Mod ~ .) +
|
||
theme_minimal() +
|
||
theme(
|
||
plot.margin = unit(c(0.2, 0.2, 0.2, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
legend.key.height = unit(0.9, "inch")
|
||
) +
|
||
ylab("p") +
|
||
scale_fill_gradientn(
|
||
limits = c(0, 1),
|
||
colours = colseq,
|
||
breaks = seq(0, 1, 0.2)
|
||
) +
|
||
scale_x_date(expand = c(0, 0))
|
||
```
|
||
|
||
## Weights (Last)
|
||
|
||
```{r, echo = FALSE, fig.width=13, fig.height=5.5, fig.align='center', cache = TRUE}
|
||
load("assets/crps_learning/weights_example.RData")
|
||
weights %>%
|
||
ggplot(aes(x = p, y = weights, col = Model)) +
|
||
geom_line(size = 1.5) +
|
||
theme_minimal() +
|
||
theme(
|
||
plot.margin = unit(c(0.2, 0.3, 0.2, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
legend.position = "bottom",
|
||
legend.title = element_blank(),
|
||
panel.spacing = unit(1.5, "lines")
|
||
) +
|
||
scale_color_manual(NULL, values = tCOL[1:K]) +
|
||
facet_grid(. ~ K)
|
||
```
|
||
|
||
::::
|
||
|
||
::::
|
||
|
||
# Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices
|
||
|
||
Berrisch, J., & Ziel, F. (2024). *International Journal of Forecasting*, 40(4), 1568-1586.
|
||
|
||
|
||
## Multivariate CRPS Learning
|
||
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="45%"}
|
||
|
||
|
||
We extend the **B-Smooth** and **P-Smooth** procedures to the multivariate setting:
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Penalized Smoothing
|
||
|
||
Let $\boldsymbol{\psi}^{\text{mv}}=(\psi_1,\ldots, \psi_{D})$ and $\boldsymbol{\psi}^{\text{pr}}=(\psi_1,\ldots, \psi_{P})$ be two sets of bounded basis functions on $(0,1)$:
|
||
|
||
\begin{equation*}
|
||
\boldsymbol w_{t,k} = \boldsymbol{\psi}^{\text{mv}} \boldsymbol{b}_{t,k} {\boldsymbol{\psi}^{pr}}'
|
||
\end{equation*}
|
||
|
||
with parameter matix $\boldsymbol b_{t,k}$. The latter is estimated to penalize $L_2$-smoothing which minimizes
|
||
|
||
\begin{align}
|
||
& \| \boldsymbol{\beta}_{t,d, k}' \boldsymbol{\varphi}^{\text{pr}} - \boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}} \|^2_2 + \lambda^{\text{pr}} \| \mathcal{D}_{q} (\boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}}) \|^2_2 + \nonumber \\
|
||
& \| \boldsymbol{\beta}_{t, p, k}' \boldsymbol{\varphi}^{\text{mv}} - \boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}} \|^2_2 + \lambda^{\text{mv}} \| \mathcal{D}_{q} (\boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}}) \|^2_2 \nonumber
|
||
\end{align}
|
||
|
||
with differential operator $\mathcal{D}_q$ of order $q$
|
||
|
||
[{{< fa calculator >}}]{style="color:var(--col_green_10);"} We have an analytical solution.
|
||
|
||
## Basis Smoothing
|
||
|
||
Linear combinations of bounded basis functions:
|
||
|
||
\begin{equation}
|
||
\underbrace{\boldsymbol w_{t,k}}_{D \text{ x } P} = \sum_{j=1}^{\widetilde D} \sum_{l=1}^{\widetilde P} \beta_{t,j,l,k} \varphi^{\text{mv}}_{j} \varphi^{\text{pr}}_{l} = \underbrace{\boldsymbol \varphi^{\text{mv}}}_{D\text{ x }\widetilde D} \boldsymbol \beta_{t,k} \underbrace{{\boldsymbol\varphi^{\text{pr}}}'}_{\widetilde P \text{ x }P} \nonumber
|
||
\end{equation}
|
||
|
||
A popular choice: B-Splines
|
||
|
||
$\boldsymbol \beta_{t,k}$ is calculated using a reduced regret matrix:
|
||
|
||
$\underbrace{\boldsymbol r_{t,k}}_{\widetilde P \times \widetilde D} = \boldsymbol \varphi^{\text{pr}} \underbrace{\left({\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)}_{\text{PxD}}\boldsymbol \varphi^{\text{mv}}$
|
||
|
||
If $\widetilde P = P$ it holds that $\boldsymbol \varphi^{pr} = \boldsymbol{I}$ (pointwise)
|
||
|
||
For $\widetilde P = 1$ we receive constant weights
|
||
|
||
::::
|
||
|
||
:::
|
||
|
||
::: {.column width="2%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="53%"}
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "1000px", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/algorithm.svg")
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Application
|
||
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
#### Data
|
||
|
||
- Day-Ahead electricity price forecasts from @marcjasz2022distributional
|
||
- Produced using probabilistic neural networks
|
||
- 24-dimensional distributional forecasts
|
||
- Distribution assumptions: JSU and Normal
|
||
- 8 experts (4 JSU, 4 Normal)
|
||
- 27th Dec. 2018 to 31st Dec. 2020 (736 days)
|
||
- We extract 99 quantiles (percentiles)
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
#### Setup
|
||
|
||
Evaluation: Exclude first 182 observations
|
||
|
||
Extensions: Penalized smoothing | Forgetting
|
||
|
||
Tuning strategies:
|
||
|
||
- Bayesian Fix
|
||
- Sophisticated Baesian Search algorithm
|
||
- Online
|
||
- Dynamic based on past performance
|
||
- Bayesian Online
|
||
- First Bayesian Fix then Online
|
||
|
||
Computation Time: ~30 Minutes
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Results
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="55%"}
|
||
|
||
```{r, cache = TRUE}
|
||
load("assets/mcrps_learning/naive_table_df.rds")
|
||
|
||
table_naive <- naive_table_df %>%
|
||
as_tibble() %>%
|
||
head(1) %>%
|
||
mutate_all(round, 4) %>%
|
||
mutate_all(sprintf, fmt = "%#.3f") %>%
|
||
kbl(
|
||
bootstrap_options = "condensed",
|
||
escape = FALSE,
|
||
format = "html",
|
||
booktabs = FALSE,
|
||
align = c("c", rep("c", ncol(naive_table_df) - 1))
|
||
) %>%
|
||
kable_paper(full_width = TRUE) %>%
|
||
row_spec(0:1, color = cols[9, "grey"]) %>%
|
||
kable_styling(font_size = 16)
|
||
|
||
|
||
for (i in 1:ncol(naive_table_df)) {
|
||
table_naive <- table_naive %>%
|
||
column_spec(i,
|
||
background = ifelse(
|
||
is.na(naive_table_df["stat", i, drop = TRUE][-ncol(naive_table_df)]),
|
||
cols[5, "grey"],
|
||
col_scale2(
|
||
naive_table_df["stat", i, drop = TRUE][-ncol(naive_table_df)],
|
||
rng_t
|
||
)
|
||
),
|
||
bold = i == which.min(naive_table_df["loss", ])
|
||
)
|
||
}
|
||
|
||
table_naive
|
||
|
||
load("assets/mcrps_learning/performance_data.rds")
|
||
i <- 1
|
||
j <- 1
|
||
for (j in 1:3) {
|
||
for (i in seq_len(nrow(performance_loss_tibble))) {
|
||
if (loss_and_dm[i, j, "p.val"] < 0.001) {
|
||
performance_loss_tibble[i, 2 + j] <- paste0(
|
||
performance_loss_tibble[i, 2 + j],
|
||
'<span class="sup-zero-width">***</span>'
|
||
)
|
||
} else if (loss_and_dm[i, j, "p.val"] < 0.01) {
|
||
performance_loss_tibble[i, 2 + j] <- paste0(
|
||
performance_loss_tibble[i, 2 + j],
|
||
'<span class="sup-zero-width">**</span>'
|
||
)
|
||
} else if (loss_and_dm[i, j, "p.val"] < 0.05) {
|
||
performance_loss_tibble[i, 2 + j] <- paste0(
|
||
performance_loss_tibble[i, 2 + j],
|
||
'<span class="sup-zero-width">*</span>'
|
||
)
|
||
} else if (loss_and_dm[i, j, "p.val"] < 0.1) {
|
||
performance_loss_tibble[i, 2 + j] <- paste0(
|
||
performance_loss_tibble[i, 2 + j],
|
||
'<span class="zero-width">.</span>'
|
||
)
|
||
} else {
|
||
performance_loss_tibble[i, 2 + j] <- paste0(
|
||
performance_loss_tibble[i, 2 + j]
|
||
)
|
||
}
|
||
}
|
||
}
|
||
|
||
table_performance <- performance_loss_tibble %>%
|
||
kbl(
|
||
padding=-1L,
|
||
col.names = c(
|
||
'Description',
|
||
'Parameter Tuning',
|
||
'BOA',
|
||
'ML-Poly',
|
||
'EWA'
|
||
),
|
||
bootstrap_options = "condensed",
|
||
# Dont replace any string, dataframe has to be valid latex code ...
|
||
escape = FALSE,
|
||
format = "html",
|
||
align = c("l", "l", rep("c", ncol(performance_loss_tibble)-2))
|
||
) %>%
|
||
kable_paper(full_width = TRUE) %>%
|
||
row_spec(0:nrow(performance_loss_tibble), color = cols[9, "grey"])
|
||
|
||
# %%
|
||
|
||
for (i in 3:ncol(performance_loss_tibble)) {
|
||
bold_cells <- rep(FALSE, times = nrow(performance_loss_tibble))
|
||
|
||
loss <- loss_and_dm[, i - 2, "loss"]
|
||
|
||
table_performance <- table_performance %>%
|
||
column_spec(i,
|
||
background = c(
|
||
col_scale2(
|
||
loss_and_dm[, i - 2, "stat"],
|
||
rng_t
|
||
)
|
||
),
|
||
bold = loss == min(loss),
|
||
)
|
||
}
|
||
table_performance %>%
|
||
kable_styling(font_size = 16)
|
||
```
|
||
|
||
```{=html}
|
||
<div style="font-size: 0.6em; margin-top: 0.5em;">
|
||
<span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
|
||
<span style="background-color: #66BA6A; padding: 2px 6px;"><-5</span>
|
||
<span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
|
||
<span style="background-color: #91C866; padding: 2px 6px;">-3</span>
|
||
<span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
|
||
<span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
|
||
<span style="background-color: #FFED58; padding: 2px 6px;">0</span>
|
||
<span style="background-color: #FFD145; padding: 2px 6px;">1</span>
|
||
<span style="background-color: #FFB531; padding: 2px 6px;">2</span>
|
||
<span style="background-color: #FC9733; padding: 2px 6px;">3</span>
|
||
<span style="background-color: #F67744; padding: 2px 6px;">4</span>
|
||
<span style="background-color: #EE5250; padding: 2px 6px;">>5</span>
|
||
</div>
|
||
|
||
<div style="font-size: 0.7em;">
|
||
<span style="padding: 2px 6px;">Significance denoted by: </span><span>.</span> p < 0.1; <span>*</span> p < 0.05; <span>**</span> p < 0.01; <span>***</span> p < 0.001;
|
||
</div>
|
||
```
|
||
|
||
:::
|
||
|
||
::: {.column width = "45%"}
|
||
|
||
<br/>
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Constant
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/constant.svg")
|
||
```
|
||
|
||
## Pointwise
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/pointwise.svg")
|
||
```
|
||
|
||
## B Constant PR
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/constant_pr.svg")
|
||
```
|
||
|
||
## B Constant MV
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/constant_mv.svg")
|
||
```
|
||
|
||
## Smooth.Forget
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/smooth_best.svg")
|
||
```
|
||
|
||
::::
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Results
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Chosen Parameters
|
||
|
||
```{r, warning=FALSE, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
|
||
load("assets/mcrps_learning/pars_data.rds")
|
||
pars_data %>%
|
||
ggplot(aes(x = dates, y = value)) +
|
||
geom_rect(aes(
|
||
ymin = 0,
|
||
ymax = value * 1.2,
|
||
xmin = dates[1],
|
||
xmax = dates[182],
|
||
fill = "Burn-In"
|
||
)) +
|
||
geom_line(aes(color = name), linewidth = linesize, show.legend = FALSE) +
|
||
scale_colour_manual(
|
||
values = as.character(cols[5, c("pink", "amber", "green")])
|
||
) +
|
||
facet_grid(name ~ .,
|
||
scales = "free_y",
|
||
# switch = "both"
|
||
) +
|
||
scale_y_continuous(
|
||
trans = "log2",
|
||
labels = scaleFUN
|
||
) +
|
||
theme_minimal() +
|
||
theme(
|
||
# plot.margin = unit(c(0.2, 0.2, 0.2, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
legend.key.width = unit(0.9, "inch"),
|
||
legend.position = "none"
|
||
) +
|
||
ylab(NULL) +
|
||
xlab("date") +
|
||
scale_fill_manual(NULL,
|
||
values = as.character(cols[3, "grey"])
|
||
)
|
||
```
|
||
|
||
## Weights: Hour 16:00-17:00
|
||
|
||
```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
|
||
load("assets/mcrps_learning/weights_h.rds")
|
||
weights_h %>%
|
||
ggplot(aes(date, q, fill = weight)) +
|
||
geom_raster(interpolate = TRUE) +
|
||
facet_grid(
|
||
Expert ~ . # , labeller = labeller(Mod = mod_labs)
|
||
) +
|
||
theme_minimal() +
|
||
theme(
|
||
# plot.margin = unit(c(0.2, 0.2, 0.2, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
legend.key.height = unit(0.9, "inch")
|
||
) +
|
||
scale_x_date(expand = c(0, 0)) +
|
||
scale_fill_gradientn(
|
||
oob = scales::squish,
|
||
limits = c(0, 1),
|
||
values = c(seq(0, 0.4, length.out = 8), 0.65, 1),
|
||
colours = c(
|
||
cols[8, "red"],
|
||
cols[5, "deep-orange"],
|
||
cols[5, "amber"],
|
||
cols[5, "yellow"],
|
||
cols[5, "lime"],
|
||
cols[5, "light-green"],
|
||
cols[5, "green"],
|
||
cols[7, "green"],
|
||
cols[9, "green"],
|
||
cols[10, "green"]
|
||
),
|
||
breaks = seq(0, 1, 0.1)
|
||
) +
|
||
xlab("date") +
|
||
ylab("probability") +
|
||
scale_y_continuous(breaks = c(0.1, 0.5, 0.9))
|
||
```
|
||
|
||
## Weights: Median
|
||
|
||
```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
|
||
load("assets/mcrps_learning/weights_q.rds")
|
||
weights_q %>%
|
||
mutate(hour = as.numeric(hour) - 1) %>%
|
||
ggplot(aes(date, hour, fill = weight)) +
|
||
geom_raster(interpolate = TRUE) +
|
||
facet_grid(
|
||
Expert ~ . # , labeller = labeller(Mod = mod_labs)
|
||
) +
|
||
theme_minimal() +
|
||
theme(
|
||
# plot.margin = unit(c(0.2, 0.2, 0.2, 0.2), "cm"),
|
||
text = element_text(size = text_size),
|
||
legend.key.height = unit(0.9, "inch")
|
||
) +
|
||
scale_x_date(expand = c(0, 0)) +
|
||
scale_fill_gradientn(
|
||
oob = scales::squish,
|
||
limits = c(0, 1),
|
||
values = c(seq(0, 0.4, length.out = 8), 0.65, 1),
|
||
colours = c(
|
||
cols[8, "red"],
|
||
cols[5, "deep-orange"],
|
||
cols[5, "amber"],
|
||
cols[5, "yellow"],
|
||
cols[5, "lime"],
|
||
cols[5, "light-green"],
|
||
cols[5, "green"],
|
||
cols[7, "green"],
|
||
cols[9, "green"],
|
||
cols[10, "green"]
|
||
),
|
||
breaks = seq(0, 1, 0.1)
|
||
) +
|
||
xlab("date") +
|
||
ylab("hour") +
|
||
scale_y_continuous(breaks = c(0, 8, 16, 24))
|
||
```
|
||
|
||
::::
|
||
|
||
## Non-Equidistant Knots
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Knot Placement Illustration
|
||
|
||
```{ojs}
|
||
d3 = require("d3@7")
|
||
```
|
||
|
||
```{ojs}
|
||
bsplineData = FileAttachment("assets/mcrps_learning/basis_functions.csv").csv({ typed: true })
|
||
```
|
||
|
||
```{ojs}
|
||
function updateChartInner(g, x, y, linesGroup, color, line, data) {
|
||
// Update axes with transitions
|
||
x.domain([0, d3.max(data, d => d.x)]);
|
||
g.select(".x-axis").transition().duration(1500).call(d3.axisBottom(x).ticks(10));
|
||
y.domain([0, d3.max(data, d => d.y)]);
|
||
g.select(".y-axis").transition().duration(1500).call(d3.axisLeft(y).ticks(5));
|
||
|
||
// Group data by basis function
|
||
const dataByFunction = Array.from(d3.group(data, d => d.b));
|
||
const keyFn = d => d[0];
|
||
|
||
// Update basis function lines
|
||
const u = linesGroup.selectAll("path").data(dataByFunction, keyFn);
|
||
u.join(
|
||
enter => enter.append("path").attr("fill","none").attr("stroke-width",3)
|
||
.attr("stroke", (_, i) => color(i)).attr("d", d => line(d[1].map(pt => ({x: pt.x, y: 0}))))
|
||
.style("opacity",0),
|
||
update => update,
|
||
exit => exit.transition().duration(1000).style("opacity",0).remove()
|
||
)
|
||
.transition().duration(1000)
|
||
.attr("d", d => line(d[1]))
|
||
.attr("stroke", (_, i) => color(i))
|
||
.style("opacity",1);
|
||
}
|
||
|
||
chart = {
|
||
// State variables for selected parameters
|
||
let selectedMu = 0.5;
|
||
let selectedSig = 1;
|
||
let selectedNonc = 0;
|
||
let selectedTailw = 1;
|
||
const filteredData = () => bsplineData.filter(d =>
|
||
Math.abs(selectedMu - d.mu) < 0.001 &&
|
||
d.sig === selectedSig &&
|
||
d.nonc === selectedNonc &&
|
||
d.tailw === selectedTailw
|
||
);
|
||
const container = d3.create("div")
|
||
.style("max-width", "none")
|
||
.style("width", "100%");;
|
||
const controlsContainer = container.append("div")
|
||
.style("display", "flex")
|
||
.style("gap", "20px");
|
||
// slider controls
|
||
const sliders = [
|
||
{ label: 'Mu', get: () => selectedMu, set: v => selectedMu = v, min: 0.1, max: 0.9, step: 0.2 },
|
||
{ label: 'Sigma', get: () => Math.log2(selectedSig), set: v => selectedSig = 2 ** v, min: -2, max: 2, step: 1 },
|
||
{ label: 'Noncentrality', get: () => selectedNonc, set: v => selectedNonc = v, min: -4, max: 4, step: 2 },
|
||
{ label: 'Tailweight', get: () => Math.log2(selectedTailw), set: v => selectedTailw = 2 ** v, min: -2, max: 2, step: 1 }
|
||
];
|
||
// Build slider controls with D3 data join
|
||
const sliderCont = controlsContainer.selectAll('div').data(sliders).join('div')
|
||
.style('display','flex').style('align-items','center').style('gap','10px')
|
||
.style('flex','1').style('min-width','0px');
|
||
sliderCont.append('label').text(d => d.label + ':').style('font-size','20px');
|
||
sliderCont.append('input')
|
||
.attr('type','range').attr('min', d => d.min).attr('max', d => d.max).attr('step', d => d.step)
|
||
.property('value', d => d.get())
|
||
.on('input', function(event, d) {
|
||
const val = +this.value; d.set(val);
|
||
d3.select(this.parentNode).select('span').text(d.label.match(/Sigma|Tailweight/) ? 2**val : val);
|
||
updateChart(filteredData());
|
||
})
|
||
.style('width', '100%');
|
||
sliderCont.append('span').text(d => (d.label.match(/Sigma|Tailweight/) ? d.get() : d.get()))
|
||
.style('font-size','20px');
|
||
|
||
// Add Reset button to clear all sliders to their defaults
|
||
controlsContainer.append('button')
|
||
.text('Reset')
|
||
.style('font-size', '20px')
|
||
.style('align-self', 'center')
|
||
.style('margin-left', 'auto')
|
||
.on('click', () => {
|
||
// reset state vars
|
||
selectedMu = 0.5;
|
||
selectedSig = 1;
|
||
selectedNonc = 0;
|
||
selectedTailw = 1;
|
||
// update input positions
|
||
sliderCont.selectAll('input').property('value', d => d.get());
|
||
// update displayed labels
|
||
sliderCont.selectAll('span')
|
||
.text(d => d.label.match(/Sigma|Tailweight/) ? (2**d.get()) : d.get());
|
||
// redraw chart
|
||
updateChart(filteredData());
|
||
});
|
||
|
||
// Build SVG
|
||
const width = 1200;
|
||
const height = 450;
|
||
const margin = {top: 40, right: 20, bottom: 40, left: 40};
|
||
const innerWidth = width - margin.left - margin.right;
|
||
const innerHeight = height - margin.top - margin.bottom;
|
||
|
||
// Set controls container width to match SVG plot width
|
||
controlsContainer.style("max-width", "none").style("width", "100%");
|
||
// Distribute each control evenly and make sliders full-width
|
||
controlsContainer.selectAll("div").style("flex", "1").style("min-width", "0px");
|
||
controlsContainer.selectAll("input").style("width", "100%").style("box-sizing", "border-box");
|
||
|
||
// Create scales
|
||
const x = d3.scaleLinear()
|
||
.domain([0, 1])
|
||
.range([0, innerWidth]);
|
||
|
||
const y = d3.scaleLinear()
|
||
.domain([0, 1])
|
||
.range([innerHeight, 0]);
|
||
|
||
// Create a color scale for the basis functions
|
||
const color = d3.scaleOrdinal(d3.schemeCategory10);
|
||
|
||
// Create SVG
|
||
const svg = d3.create("svg")
|
||
.attr("width", "100%")
|
||
.attr("height", "auto")
|
||
.attr("viewBox", [0, 0, width, height])
|
||
.attr("preserveAspectRatio", "xMidYMid meet")
|
||
.attr("style", "max-width: 100%; height: auto;");
|
||
|
||
// Create the chart group
|
||
const g = svg.append("g")
|
||
.attr("transform", `translate(${margin.left},${margin.top})`);
|
||
|
||
// Add axes
|
||
const xAxis = g.append("g")
|
||
.attr("transform", `translate(0,${innerHeight})`)
|
||
.attr("class", "x-axis")
|
||
.call(d3.axisBottom(x).ticks(10))
|
||
.style("font-size", "20px");
|
||
|
||
const yAxis = g.append("g")
|
||
.attr("class", "y-axis")
|
||
.call(d3.axisLeft(y).ticks(5))
|
||
.style("font-size", "20px");
|
||
|
||
// Add a horizontal line at y = 0
|
||
g.append("line")
|
||
.attr("x1", 0)
|
||
.attr("x2", innerWidth)
|
||
.attr("y1", y(0))
|
||
.attr("y2", y(0))
|
||
.attr("stroke", "#000")
|
||
.attr("stroke-opacity", 0.2);
|
||
|
||
// Add gridlines
|
||
g.append("g")
|
||
.attr("class", "grid-lines")
|
||
.selectAll("line")
|
||
.data(y.ticks(5))
|
||
.join("line")
|
||
.attr("x1", 0)
|
||
.attr("x2", innerWidth)
|
||
.attr("y1", d => y(d))
|
||
.attr("y2", d => y(d))
|
||
.attr("stroke", "#ccc")
|
||
.attr("stroke-opacity", 0.5);
|
||
|
||
// Create a line generator
|
||
const line = d3.line()
|
||
.x(d => x(d.x))
|
||
.y(d => y(d.y))
|
||
.curve(d3.curveBasis);
|
||
|
||
// Group to contain the basis function lines
|
||
const linesGroup = g.append("g")
|
||
.attr("class", "basis-functions");
|
||
|
||
// Store the current basis functions for transition
|
||
let currentBasisFunctions = new Map();
|
||
|
||
// Function to update the chart with new data
|
||
function updateChart(data) {
|
||
updateChartInner(g, x, y, linesGroup, color, line, data);
|
||
}
|
||
|
||
// Store the update function
|
||
svg.node().update = updateChart;
|
||
|
||
// Initial render
|
||
updateChart(filteredData());
|
||
|
||
container.node().appendChild(svg.node());
|
||
return container.node();
|
||
}
|
||
```
|
||
|
||
## Knot Placement Details
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Non-central beta distribution @johnson1995continuous:
|
||
|
||
::: {style="font-size: 70%;"}
|
||
|
||
\begin{equation*}
|
||
\mathcal{B}(x, a, b, c) = \sum_{j=0}^{\infty} e^{-c/2} \frac{\left( \frac{c}{2} \right)^j}{j!} I_x \left( a + j , b \right)
|
||
\end{equation*}
|
||
|
||
::::
|
||
|
||
```{r, fig.align="center", echo=FALSE, out.width = "1000px", cache = TRUE}
|
||
knitr::include_graphics("assets/mcrps_learning/knot_placement.svg")
|
||
```
|
||
|
||
<i class="fa fa-fw fa-triangle-exclamation" style="color:var(--col_orange_9);"></i> Penalty and $\lambda$ need to be adjusted accordingly @li2022general
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Using non equidistant knots in `profoc` is straightforward:
|
||
|
||
```{r, echo = TRUE, eval = FALSE, cache = TRUE}
|
||
mod <- online(
|
||
y = Y,
|
||
experts = experts,
|
||
tau = 1:99 / 100,
|
||
b_smooth_pr = list(
|
||
knots = 9,
|
||
mu = 0.3,
|
||
sigma = 1,
|
||
nonc = 0,
|
||
tailweight = 1,
|
||
deg = 3
|
||
)
|
||
)
|
||
```
|
||
|
||
Basis specification `b_smooth_pr` is internally passed to `make_basis_mats()`.
|
||
|
||
<i class="fa fa-fw fa-check" style="color:var(--col_green_9);"></i> Profoc adjusts penatly and $\lambda$
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
::::
|
||
|
||
## Wrap-Up
|
||
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
[{{< fa triangle-exclamation >}}]{style="color:var(--col_red_9);"} Potential Downsides:
|
||
|
||
- Pointwise optimization can induce quantile crossing
|
||
- Can be solved by sorting the predictions
|
||
|
||
[{{< fa magnifying-glass >}}]{style="color:var(--col_orange_9);"} Important:
|
||
|
||
- The choice of the learning rate is crucial
|
||
- The loss function has to meet certain criteria
|
||
|
||
[{{< fa rocket >}}]{style="color:var(--col_green_9);"} Upsides:
|
||
|
||
- Pointwise learning outperforms the Naive solution significantly
|
||
- Online learning is much faster than batch methods
|
||
- Smoothing further improves the predictive performance
|
||
- Asymptotically not worse than the best convex combination
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
The [<i class="fa-brands fa-fw fa-github" style="color:var(--col_grey_10);"></i> profoc](https://profoc.berrisch.biz/) R Package:
|
||
|
||
- Implements all algorithms discussed above
|
||
- Is written using RcppArmadillo <i class="fa fa-fw fa-arrow-right" style="color:var(--col_grey_10);"></i> its fast
|
||
- Accepts vectors for most parameters
|
||
- The best parameter combination is chosen online
|
||
- Implements
|
||
- Forgetting, Fixed Share
|
||
- Different loss functions + gradients
|
||
|
||
Pubications:
|
||
|
||
<i class="fa fa-fw fa-newspaper" style="color:var(--col_grey_10);"></i> Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. CRPS learning. *Journal of Econometrics*, 237(2), 105221.
|
||
|
||
<i class="fa fa-fw fa-newspaper" style="color:var(--col_grey_10);"></i> Berrisch, J., & Ziel, F. [-@BERRISCH20241568]. Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. *International Journal of Forecasting*, 40(4), 1568-1586.
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
# Modeling Volatility and Dependence of European Carbon and Energy Prices {#sec-voldep}
|
||
|
||
Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). *Finance Research Letters*, 52, 103503.
|
||
|
||
---
|
||
|
||
##
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Motivation
|
||
|
||
Understanding European Allowances (EUA) dynamics is important
|
||
for several fields:
|
||
|
||
<i class="fa fa-fw fa-chart-pie" style="color:var(--col_grey_9);"></i> Portfolio & Risk Management,
|
||
|
||
<i class="fa fa-fw fa-timeline" style="color:var(--col_grey_9);"></i> Sustainability Planing
|
||
|
||
<i class="fa fa-fw fa-handshake" style="color:var(--col_grey_9);"></i> Political decisions
|
||
|
||
EUA prices are obviously connected to the energy market
|
||
|
||
How can the dynamics be characterized?
|
||
|
||
Several Questions arise:
|
||
|
||
- Data (Pre)processing
|
||
- Modeling Approach
|
||
- Evaluation
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Data
|
||
|
||
EUA, natural gas, Brent crude oil, coal
|
||
|
||
March 15, 2010, until October 14, 2022
|
||
|
||
Data was normalized w.r.t. $\text{CO}_2$ emissions
|
||
|
||
Emission-adjusted prices reflects one tonne of $\text{CO}_2$
|
||
|
||
We adjusted for inflation by Eurostat's HICP, excluding energy
|
||
|
||
Log transformation of the data to stabilize the variance
|
||
|
||
ADF Test: All series are stationary in first differences
|
||
|
||
Johansen’s likelihood ratio trace test suggests two cointegrating relationships (levels)
|
||
|
||
Johansen’s likelihood ratio trace test suggests no cointegrating relationships (logs)
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Data
|
||
|
||
```{r, echo=FALSE, fig.width = 12, fig.height = 6, fig.align="center", cache = TRUE}
|
||
readr::read_csv("assets/voldep/2022_10_14_eur_ref_co2_adj_hvpi_ex_nrg.csv") %>%
|
||
select(-EUR_USD, -hvpi_x_nrg) %>%
|
||
pivot_longer(-Date) %>%
|
||
mutate(name = factor(name,
|
||
levels = c("EUA", "Oil", "NGas", "Coal")
|
||
)) %>%
|
||
ggplot(aes(x = Date, y = value, color = name)) +
|
||
geom_line(
|
||
linewidth = linesize
|
||
) + # The ggplot2 call got a little simpler now :)
|
||
geom_vline(
|
||
# Beginning of the war
|
||
xintercept = as.Date("2022-02-24"),
|
||
col = as.character(cols[8, "grey"]),
|
||
linetype = "dashed"
|
||
) +
|
||
ylab("Prices EUR") +
|
||
xlab("Date") +
|
||
scale_color_manual(
|
||
values = as.character(c(
|
||
cols[8, col_eua],
|
||
cols[8, col_oil],
|
||
cols[8, col_gas],
|
||
cols[8, col_coal]
|
||
)),
|
||
labels = c(
|
||
"EUA",
|
||
"Oil",
|
||
"NGas",
|
||
"Coal"
|
||
)
|
||
) +
|
||
guides(color = guide_legend(override.aes = list(size = 2))) +
|
||
theme_minimal() +
|
||
theme(
|
||
legend.position = "bottom",
|
||
# legend.key.size = unit(1.5, "cm"),
|
||
legend.title = element_blank(),
|
||
text = element_text(
|
||
size = text_size
|
||
)
|
||
) +
|
||
scale_x_date(
|
||
breaks = as.Date(
|
||
paste0(
|
||
as.character(seq(2010, 2022, 2)),
|
||
"-01-01"
|
||
)
|
||
),
|
||
date_labels = "%Y"
|
||
) +
|
||
xlab(NULL) +
|
||
scale_y_continuous(trans = "log2")
|
||
```
|
||
|
||
## Modeling Approach: Overview
|
||
|
||
</br>
|
||
|
||
### VECM: Vector Error Correction Model
|
||
|
||
- Modeling the expectaion
|
||
- Captures the long-run cointegrating relationship
|
||
- Different cointegrating ranks, including rank zero (no cointegration)
|
||
|
||
### GARCH: Generalized Autoregressive Conditional Heteroscedasticity
|
||
|
||
- Captures dynamics in conditional variance
|
||
|
||
### Copula: Captures the dependence structure
|
||
|
||
- Captures: conditional cross-sectional dependencies
|
||
- Dependence allowed to vary over time
|
||
|
||
|
||
## Modeling Approach: Notation
|
||
|
||
<br/>
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
- Let $\boldsymbol{X}_t$ be a $K$-dimensional vector at time $t$
|
||
- The forecasting target:
|
||
- Conditional joint distribution
|
||
- $F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}$
|
||
- $\mathcal{F}_{t}$ is the sigma field generated by all information available up to and including time $t$
|
||
|
||
Sklars theorem: decompose target into
|
||
- marginal distributions: $F_{X_{k,t}|\mathcal{F}_{t-1}}$ for $k=1,\ldots, K$, and
|
||
- copula function: $C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}$
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Let $\boldsymbol{x}_t= (x_{1,t},\ldots, x_{K,t})^\intercal$ be the realized values
|
||
|
||
It holds that:
|
||
|
||
\begin{align}
|
||
F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}(\boldsymbol{x}_t) = C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}(\boldsymbol{u}_t) \nonumber
|
||
\end{align}
|
||
|
||
with: $\boldsymbol{u}_t =(u_{1,t},\ldots, u_{K,t})^\intercal$, $u_{k,t} = F_{X_{k,t}|\mathcal{F}_{t-1}}(x_{k,t})$
|
||
|
||
For brewity we drop the conditioning on $\mathcal{F}_{t-1}$.
|
||
|
||
The model can be specified as follows
|
||
|
||
\begin{align}
|
||
F(\boldsymbol{x}_t) = C \left[\mathbf{F}(\boldsymbol{x}_t; \boldsymbol{\mu}_t, \boldsymbol{ \sigma }_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda}); \Xi_t, \Theta\right] \nonumber
|
||
\end{align}
|
||
|
||
$\Xi_{t}$ denotes time-varying dependence parameters
|
||
$\Theta$ denotes time-invariant dependence parameters
|
||
|
||
We take $C$ as the $t$-copula
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Modeling Approach: Mean and Variance
|
||
|
||
<br/>
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Individual marginal distributions:
|
||
|
||
$$\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}$$
|
||
|
||
### Generalized non-central t-distributions
|
||
- To account for heavy tails
|
||
- Time varying
|
||
- expectation: $\boldsymbol{\mu}_t = (\mu_{1,t}, \ldots, \mu_{K,t})^{\intercal}$
|
||
- variance: $\boldsymbol{\sigma}_{t}^2 = (\sigma_{1,t}^2, \ldots, \sigma_{K,t}^2)^{\intercal}$
|
||
- Time invariant
|
||
- degrees of freedom: $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_K)^{\intercal}$
|
||
- noncentrality: $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)^{\intercal}$
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### VECM Model
|
||
|
||
\begin{align}
|
||
\Delta \boldsymbol{\mu}_t = \Pi \boldsymbol{x}_{t-1} + \Gamma \Delta \boldsymbol{x}_{t-1} \nonumber
|
||
\end{align}
|
||
|
||
where $\Pi = \alpha \beta^{\intercal}$ is the cointegrating matrix of rank $r$, $0 \leq r\leq K$.
|
||
|
||
### GARCH model
|
||
|
||
\begin{align}
|
||
\sigma_{i,t}^2 = & \omega_i + \alpha^+_{i} (\epsilon_{i,t-1}^+)^2 + \alpha^-_{i} (\epsilon_{i,t-1}^-)^2 + \beta_i \sigma_{i,t-1}^2 \nonumber
|
||
\end{align}
|
||
|
||
where $\epsilon_{i,t-1}^+ = \max\{\epsilon_{i,t-1}, 0\}$ ...
|
||
|
||
Separate coefficients for positive and negative innovations to capture leverage effects.
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Modeling Approach: Dependence
|
||
|
||
<br/>
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Time-varying dependence parameters
|
||
|
||
\begin{align*}
|
||
\Xi_{t} = & \Lambda\left(\boldsymbol{\xi}_{t}\right)
|
||
\\
|
||
\xi_{ij,t} = & \eta_{0,ij} + \eta_{1,ij} \xi_{ij,t-1} + \eta_{2,ij} z_{i,t-1} z_{j,t-1},
|
||
\end{align*}
|
||
|
||
$\xi_{ij,t}$ is a latent process
|
||
|
||
|
||
$z_{i,t}$ denotes the $i$-th standardized residual from time series $i$ at time point $t$
|
||
|
||
|
||
$\Lambda(\cdot)$ is a link function
|
||
- ensures that $\Xi_{t}$ is a valid variance covariance matrix
|
||
- ensures that $\Xi_{t}$ does not exceed its support space and remains semi-positive definite
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Maximum Likelihood Estimation
|
||
|
||
All parameters can be estimated jointly. Using conditional independence:
|
||
\begin{align*}
|
||
L = f_{X_1} \prod_{i=2}^T f_{X_i|\mathcal{F}_{i-1}},
|
||
\end{align*}
|
||
with multivariate conditional density:
|
||
\begin{align*}
|
||
f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu},
|
||
\boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda})
|
||
\end{align*}
|
||
The copula density $c$ can be derived analytically.
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Study Design and Evaluation
|
||
|
||
<br/>
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Rolling-window forecasting study
|
||
|
||
- 3257 observations total
|
||
- Window size: 1000 days (~ four years)
|
||
- Forecasting 30-steps-ahead
|
||
|
||
=> 2227 potential starting points
|
||
|
||
We sample 250 to reduce computational cost
|
||
|
||
We draw $2^{12}= 2048$ trajectories from the joint predictive distribution
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
### Evaluation
|
||
|
||
Forecasts are evaluated by the energy score (ES)
|
||
|
||
\begin{align*}
|
||
\text{ES}_t(F, \mathbf{x}_t) = \mathbb{E}_{F} \left(||\tilde{\mathbf{X}}_t - \mathbf{x}_t||_2\right) - \\ \frac{1}{2} \mathbb{E}_F \left(||\tilde{\mathbf{X}}_t - \tilde{\mathbf{X}}_t'||_2 \right)
|
||
\end{align*}
|
||
|
||
where $\mathbf{x}_t$ is the observed $K$-dimensional realization and $\tilde{\mathbf{X}}_t$, respectively $\tilde{\mathbf{X}}_t'$ are independent random vectors distributed according to $F$
|
||
|
||
For univariate cases the Energy Score becomes the Continuous Ranked Probability Score (CRPS)
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Results
|
||
|
||
::: {.panel-tabset}
|
||
|
||
## Energy Scores
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="55%"}
|
||
|
||
Relative improvement in ES compared to $\text{RW}^{\sigma, \rho}$
|
||
|
||
Cellcolor: w.r.t. test statistic of Diebold-Mariano test (testing wether the model outperformes the benchmark, greener = better).
|
||
|
||
```{r, echo=FALSE, results='asis', width = 'revert-layer', cache = TRUE}
|
||
load("assets/voldep/energy_df.Rdata")
|
||
table_energy <- energy %>%
|
||
kbl(
|
||
digits = 2,
|
||
col.names = c(
|
||
"Model",
|
||
"\\(\\text{ES}^{\\text{All}}_{1-30}\\)",
|
||
"\\(\\text{ES}^{\\text{EUA}}_{1-30}\\)",
|
||
"\\(\\text{ES}^{\\text{Oil}}_{1-30}\\)",
|
||
"\\(\\text{ES}^{\\text{NGas}}_{1-30}\\)",
|
||
"\\(\\text{ES}^{\\text{Coal}}_{1-30}\\)",
|
||
"\\(\\text{ES}^{\\text{All}}_{1}\\)",
|
||
"\\(\\text{ES}^{\\text{All}}_{5}\\)",
|
||
"\\(\\text{ES}^{\\text{All}}_{30}\\)"
|
||
),
|
||
# linesep = "",
|
||
bootstrap_options = "condensed",
|
||
# Dont replace any string, dataframe has to be valid latex code ...
|
||
escape = TRUE,
|
||
format = "html",
|
||
align = c("l", rep("r", ncol(energy) - 1))
|
||
) %>%
|
||
kable_paper(full_width = TRUE) %>%
|
||
row_spec(0:nrow(energy), color = cols[9, "grey"])
|
||
|
||
for (i in 2:ncol(energy)) {
|
||
bold_cells <- rep(FALSE, times = nrow(energy))
|
||
|
||
if (all(energy[-1, i, drop = TRUE] < 0)) {
|
||
bold_cells[1] <- TRUE
|
||
} else {
|
||
bold_cells[which.max(energy[-1, i, drop = TRUE]) + 1] <- TRUE
|
||
}
|
||
|
||
table_energy <- table_energy %>%
|
||
column_spec(i,
|
||
background = c(
|
||
cols[5, "grey"],
|
||
col_scale2(
|
||
energy_dm[, i, drop = TRUE][-1],
|
||
rng_t
|
||
)
|
||
),
|
||
bold = bold_cells
|
||
)
|
||
}
|
||
table_energy %>%
|
||
kable_styling(
|
||
bootstrap_options = c("condensed"),
|
||
font_size = 14
|
||
)
|
||
```
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="41%"}
|
||
|
||
- Benchmarks:
|
||
- $\text{RW}^{\sigma, \rho}$: Random walk with constant volatility and correlation
|
||
- Univariate $\text{ETS}^{\sigma}$ with constant volatility
|
||
- Vector ETS $VES^{\sigma}$ with constant volatility
|
||
|
||
- Heteroscedasticity is a main driver of ES
|
||
- The VECM model without cointegration (essentially a VAR) is the best performing model in terms of ES overall
|
||
- For EUA, the ETS Benchmark is the best performing model in terms of ES
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## CRPS Scores
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="28%"}
|
||
|
||
- CRPS solely evaluates the marginal distributions
|
||
- The cross-sectional dependence is ignored
|
||
|
||
|
||
- VES models deliver poor performance in short horizons
|
||
- For Oil prices the RW Benchmark can't be oupterformed 30 steps ahead
|
||
- Both VECM models generally deliver good performance
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="68%"}
|
||
|
||
Improvement in CRPS of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$ (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
|
||
|
||
```{r, echo=FALSE, results = 'asis', cache = TRUE}
|
||
load("assets/voldep/crps_df.Rdata")
|
||
|
||
table_crps <- crps %>%
|
||
kbl(
|
||
col.names = c("Model", rep(paste0("H", h_sub), 4)),
|
||
digits = 1,
|
||
linesep = "",
|
||
# Dont replace any string, dataframe has to be valid latex code ...
|
||
escape = FALSE,
|
||
format = "html",
|
||
booktabs = FALSE,
|
||
align = c("l", rep("r", ncol(crps) - 1))
|
||
) %>%
|
||
kable_paper(full_width = FALSE)
|
||
|
||
for (i in 2:ncol(crps)) {
|
||
bold_cells <- rep(FALSE, times = nrow(crps))
|
||
|
||
if (all(crps[-1, i, drop = TRUE] < 0)) {
|
||
bold_cells[1] <- TRUE
|
||
} else {
|
||
bold_cells[which.max(crps[-1, i, drop = TRUE]) + 1] <- TRUE
|
||
}
|
||
|
||
table_crps <- table_crps %>%
|
||
column_spec(i,
|
||
background = c(cols[5, "grey"], col_scale2(
|
||
crps_dm[, i, drop = TRUE][-1], rng_t
|
||
)),
|
||
bold = bold_cells
|
||
)
|
||
}
|
||
|
||
table_crps <- table_crps %>%
|
||
add_header_above(c(" ", "EUA" = 3, "Oil" = 3, "NGas" = 3, "Coal" = 3)) %>%
|
||
row_spec(0:nrow(crps), color = cols[9, "grey"])
|
||
|
||
table_crps %>%
|
||
kable_styling(
|
||
bootstrap_options = c("condensed"),
|
||
full_width = FALSE,
|
||
font_size = 16
|
||
)
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## RMSE
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="28%"}
|
||
|
||
RMSE measures the performance of the forecasts at their mean
|
||
|
||
|
||
</br>
|
||
|
||
|
||
- Some models beat the benchmarks at short horizons
|
||
|
||
</br>
|
||
|
||
Conclusion: the Improvements seen before must be attributed to other parts of the multivariate probabilistic predictive distribution
|
||
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="68%"}
|
||
|
||
Improvement in RMSE score of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$ (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
|
||
|
||
```{r, echo=FALSE, results = 'asis', cache = TRUE}
|
||
load("assets/voldep/rmsq_df.Rdata")
|
||
|
||
table_rmsq <- rmsq %>%
|
||
kbl(
|
||
col.names = c("Model", rep(paste0("H", h_sub), 4)),
|
||
digits = 1,
|
||
# Dont replace any string, dataframe has to be valid latex code ...
|
||
escape = FALSE,
|
||
linesep = "",
|
||
format = "html",
|
||
align = c("l", rep("r", ncol(rmsq) - 1))
|
||
) %>%
|
||
kable_paper(full_width = FALSE)
|
||
|
||
for (i in 2:ncol(rmsq)) {
|
||
bold_cells <- rep(FALSE, times = nrow(rmsq))
|
||
|
||
if (all(rmsq[-1, i, drop = TRUE] < 0)) {
|
||
bold_cells[1] <- TRUE
|
||
} else {
|
||
bold_cells[which.max(rmsq[-1, i, drop = TRUE]) + 1] <- TRUE
|
||
}
|
||
|
||
table_rmsq <- table_rmsq %>%
|
||
column_spec(i,
|
||
background = c(cols[5, "grey"], col_scale2(
|
||
rmsq_dm[, i, drop = TRUE][-1], rng_t
|
||
)),
|
||
bold = bold_cells
|
||
)
|
||
}
|
||
|
||
table_rmsq <- table_rmsq %>%
|
||
add_header_above(c(" ", "EUA" = 3, "Oil" = 3, "NGas" = 3, "Coal" = 3)) %>%
|
||
row_spec(0:nrow(rmsq), color = cols[9, "grey"])
|
||
|
||
table_rmsq %>%
|
||
kable_styling(
|
||
bootstrap_options = c("condensed"),
|
||
full_width = FALSE,
|
||
font_size = 14
|
||
)
|
||
```
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Evolution of Linear Dependence $\Xi$
|
||
|
||
```{r, echo=FALSE, fig.width = 12, fig.height = 5.5, fig.align="center", cache = TRUE}
|
||
load("assets/voldep/plot_rho_df.Rdata")
|
||
ggplot() +
|
||
geom_line(
|
||
size = linesize,
|
||
data = plot_data[plot_data$name == "5Oil-Coal", ],
|
||
aes(x = idx, y = value, col = name)
|
||
) +
|
||
geom_line(
|
||
size = linesize,
|
||
data = plot_data[plot_data$name == "2EUA-NGas", ],
|
||
aes(x = idx, y = value, col = name)
|
||
) +
|
||
geom_line(
|
||
size = linesize,
|
||
data = plot_data[plot_data$name == "1EUA-Oil", ],
|
||
aes(x = idx, y = value, col = name)
|
||
) +
|
||
geom_line(
|
||
size = linesize,
|
||
data = plot_data[plot_data$name == "3EUA-Coal", ],
|
||
aes(x = idx, y = value, col = name)
|
||
) +
|
||
geom_line(
|
||
size = linesize,
|
||
data = plot_data[plot_data$name == "6NGas-Coal", ],
|
||
aes(x = idx, y = value, col = name)
|
||
) +
|
||
geom_line(
|
||
size = linesize,
|
||
data = plot_data[plot_data$name == "4Oil-NGas", ],
|
||
aes(x = idx, y = value, col = name)
|
||
) +
|
||
geom_vline(
|
||
xintercept = as.Date("2022-02-24"), # Beginning of the war
|
||
col = as.character(cols[8, "grey"]),
|
||
linetype = "dashed"
|
||
) +
|
||
geom_vline(
|
||
xintercept = as.Date("2022-02-28"), # Central bank of russia was blocked from access to foreign reserves
|
||
col = as.character(cols[8, "grey"]),
|
||
linetype = "dotted"
|
||
) +
|
||
scale_colour_manual(
|
||
values = as.character(
|
||
cols[
|
||
6,
|
||
c("green", "purple", "blue", "red", "orange", "brown")
|
||
]
|
||
),
|
||
labels = c(
|
||
paste0(varnames[comb[, 1] + 1], "-", varnames[comb[, 2] + 1])
|
||
)
|
||
) +
|
||
theme_minimal() +
|
||
theme(
|
||
zoom.x = element_rect(fill = cols[4, "grey"], colour = NA),
|
||
legend.position = "bottom",
|
||
text = element_text(size = text_size)
|
||
) +
|
||
ylab("Correlation") +
|
||
scale_x_date(
|
||
breaks = date_breaks_fct,
|
||
labels = date_labels_fct
|
||
) +
|
||
xlab(NULL) +
|
||
guides(col = guide_legend(
|
||
title = NULL,
|
||
nrow = 1,
|
||
byrow = TRUE,
|
||
override.aes = list(size = 2)
|
||
)) +
|
||
# scale_y_continuous(breaks = seq(-1.5, 1.0, 0.5)) +
|
||
ggforce::facet_zoom(
|
||
xlim = c(
|
||
as.Date("2022-02-02"),
|
||
as.Date("2022-04-30")
|
||
),
|
||
zoom.size = 1.5,
|
||
show.area = TRUE
|
||
)
|
||
```
|
||
|
||
## Predictive Quantiles
|
||
|
||
```{r, echo=FALSE, fig.width = 12, fig.height = 5.5, fig.align="center", cache = TRUE}
|
||
load("assets/voldep/plot_quant_df.Rdata")
|
||
|
||
plot_quant_data %>% ggplot(aes(x = date, y = value)) +
|
||
geom_line(size = 1, aes(col = quant)) +
|
||
geom_vline(
|
||
xintercept = as.Date("2022-02-24"), # Beginning of the war
|
||
col = as.character(cols[8, "grey"]),
|
||
linetype = "dashed"
|
||
) +
|
||
geom_vline(
|
||
xintercept = as.Date("2022-02-28"), # Central bank of russia was blocked from access to foreign reserves
|
||
col = as.character(cols[8, "grey"]),
|
||
linetype = "dotted"
|
||
) +
|
||
scale_colour_manual(
|
||
values = as.character(
|
||
c(
|
||
cols[
|
||
5,
|
||
c(1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15)
|
||
],
|
||
cols[9, 18]
|
||
)
|
||
)
|
||
) +
|
||
facet_grid(var ~ ., scales = "free_y") +
|
||
theme_minimal() +
|
||
theme(
|
||
legend.position = "bottom",
|
||
text = element_text(size = text_size),
|
||
strip.background = element_rect(
|
||
fill = cols[1, 18],
|
||
colour = cols[1, 18]
|
||
)
|
||
) +
|
||
ylab("EUR") +
|
||
scale_x_date(
|
||
breaks = (seq(
|
||
as.Date("2022-02-15"),
|
||
as.Date("2022-03-31"),
|
||
by = "4 day"
|
||
) + 1)[-12],
|
||
date_labels = "%b %d",
|
||
limits = c(as.Date("2022-02-15"), as.Date("2022-03-31"))
|
||
) +
|
||
xlab(NULL) +
|
||
guides(col = guide_legend(
|
||
title = "Quantiles [\\%]",
|
||
nrow = 2,
|
||
byrow = TRUE,
|
||
override.aes = list(size = 2)
|
||
)) +
|
||
scale_y_continuous(
|
||
trans = "log2",
|
||
breaks = y_breaks
|
||
)
|
||
|
||
```
|
||
|
||
::::
|
||
|
||
## Conclusion
|
||
|
||
:::: {.columns}
|
||
|
||
::: {.column width="48%"}
|
||
|
||
Accounting for heteroscedasticity or stabilizing the variance via log transformation is crucial for good performance in terms of ES
|
||
|
||
- Price dynamics emerged way before the russian invaion into ukraine
|
||
- Linear dependence between the series reacted only right after the invasion
|
||
- Improvements in forecasting performance is mainly attributed to:
|
||
- the tails multivariate probabilistic predictive distribution
|
||
- the dependence structure between the marginals
|
||
|
||
:::
|
||
|
||
::: {.column width="4%"}
|
||
|
||
:::
|
||
|
||
::: {.column width="48%"}
|
||
|
||
</br>
|
||
|
||
<center>
|
||
<img src="assets/voldep/frame.png">
|
||
</center>
|
||
|
||
<i class="fa fa-fw fa-newspaper" style="color:var(--col_grey_10);"></i> @berrisch2023modeling
|
||
|
||
:::
|
||
|
||
::::
|
||
|
||
## Contributions and Outlook {#sec-conclusion}
|
||
|
||
|
||
|
||
|
||
|
||
## References |