A thousand minor improvements

2025-06-20 11:30:44 +02:00
parent 2b22506fb8
commit 9bc402b926
6 changed files with 2309 additions and 880 deletions

240
index.qmd

@@ -36,7 +36,8 @@ revealjs-plugins:
# - drop
---
## Outline {.center}
# High-Level View {.center visibility="uncounted"}
<!--
Render with: quarto preview /home/jonathan/git/PHD-Presentation/25_07_phd_defense/index.qmd --no-browser --port 6074
@@ -48,7 +49,7 @@ $$
$$
:::
:::: {style="font-size: 150%;"}
<!-- :::: {style="font-size: 150%;"}
<i class="fa fa-fw fa-rocket" style="color:var(--col_grey_9);"></i> &ensp; [Research Motivation](#motivation)
@@ -60,7 +61,8 @@ $$
<i class="fa fa-fw fa-newspaper" style="color:var(--col_grey_9);"></i> &ensp; [Contributions](#sec-contributions)
:::
:::: -->
```{r, setup, include=FALSE}
# Compile with: rmarkdown::render("crps_learning.Rmd")
@@ -754,7 +756,7 @@ void main(){
::::
# CRPS Learning {#sec-crps-learning}
# CRPS Learning {#sec-crps-learning visibility="uncounted"}
Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. *Journal of Econometrics*, 237(2), 105221.
@@ -843,7 +845,7 @@ plot(w[, 3],
xlab = "",
ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#FFD44EFF"
)
text(6, 0.25, TeX("$w_3(t)$"), cex = 2, col = "#FFD44EFF")
text(6, 0.25, TeX("$w_2(t)$"), cex = 2, col = "#FFD44EFF")
arrows(13, 0.75, 15, 1, , lwd = 4, bty = "n", col = "#414141FF")
```
@@ -941,7 +943,7 @@ chart = {
.style('align-self', 'center')
.style('margin-left', 'auto')
.on('click', () => {
selectedMu = 0.5;
selectedMu = 1;
muSlider.property('value', selectedMu);
muDisplay.text(selectedMu.toFixed(1));
updateChart(filteredData());
@@ -1053,7 +1055,7 @@ chart = {
## The Framework of Prediction under Expert Advice
### The sequential framework
### &nbsp;
:::: {.columns}
@@ -1287,7 +1289,7 @@ $\ell'$ is the subgradient of $\ell$ at forecast combination $\widetilde{X}$.
\text{CRPS}(F, y) = \int_{\mathbb{R}} {(F(x) - \mathbb{1}\{ x > y \})}^2 dx \label{eq:crps}
\end{equation*}
It's strictly proper @gneiting2007strictly.
It's strictly proper [@gneiting2007strictly].
Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance varies in parts of the distribution?
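For a forecast given as a sample rather than a closed-form $F$, the CRPS can be approximated through the identity $\text{CRPS}(F, y) = \mathbb{E}|X - y| - \tfrac{1}{2}\mathbb{E}|X - X'|$. A minimal sketch, assuming an ensemble of draws; `crps_sample` and `draws` are hypothetical names, not part of the slides:

```r
# Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|
# x: numeric vector of draws from the predictive distribution
# y: the realized observation
crps_sample <- function(x, y) {
  term_obs <- mean(abs(x - y))                 # distance to the observation
  term_spread <- mean(abs(outer(x, x, "-")))   # mean pairwise distance of draws
  term_obs - 0.5 * term_spread
}

set.seed(1)
draws <- rnorm(1000)            # hypothetical predictive sample
crps_sample(draws, y = 0.3)

# A degenerate (point) forecast reduces to the absolute error |2 - 0|
crps_sample(rep(2, 5), y = 0)
```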
@@ -2277,7 +2279,7 @@ weights %>%
::::
# Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices
# Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices {visibility="uncounted"}
Berrisch, J., & Ziel, F. (2024). *International Journal of Forecasting*, 40(4), 1568-1586.
@@ -3014,14 +3016,12 @@ Pubications:
::::
# Modeling Volatility and Dependence of European Carbon and Energy Prices {#sec-voldep}
# Modeling Volatility and Dependence of European Carbon and Energy Prices {#sec-voldep visibility="uncounted"}
Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). *Finance Research Letters*, 52, 103503.
---
## &nbsp;
:::: {.columns}
::: {.column width="48%"}
@@ -3037,7 +3037,7 @@ for several fields:
<i class="fa fa-fw fa-handshake" style="color:var(--col_grey_9);"></i> Political decisions
EUA prices are obviously connected to the energy market
EUA prices are connected to energy markets
How can the dynamics be characterized?
@@ -3057,23 +3057,20 @@ Several Questions arise:
### Data
EUA, natural gas, Brent crude oil, coal
Daily Observations: 03/15/2010 - 10/14/2022
March 15, 2010, until October 14, 2022
EUA, Natural Gas, Brent Crude Oil, Coal
Data was normalized w.r.t. $\text{CO}_2$ emissions
- normalized w.r.t. $\text{CO}_2$ emissions
- Adjusted for inflation by Eurostat's HICP, *excluding energy*
Emission-adjusted prices reflects one tonne of $\text{CO}_2$
We adjusted for inflation by Eurostat's HICP, excluding energy
Emission-adjusted prices reflect one tonne of $\text{CO}_2$
Log transformation of the data to stabilize the variance
ADF Test: All series are stationary in first differences
Johansen's likelihood ratio trace test suggests two cointegrating relationships (levels)
Johansen's likelihood ratio trace test suggests no cointegrating relationships (logs)
Johansen's likelihood ratio trace test suggests two cointegrating relationships (only in levels)
:::
@@ -3137,26 +3134,6 @@ readr::read_csv("assets/voldep/2022_10_14_eur_ref_co2_adj_hvpi_ex_nrg.csv") %>%
scale_y_continuous(trans = "log2")
```
## Modeling Approach: Overview
<br/>
### VECM: Vector Error Correction Model
- Modeling the expectation
- Captures the long-run cointegrating relationship
- Different cointegrating ranks, including rank zero (no cointegration)
### GARCH: Generalized Autoregressive Conditional Heteroscedasticity
- Captures dynamics in conditional variance
### Copula: Captures the dependence structure
- Captures: conditional cross-sectional dependencies
- Dependence allowed to vary over time
## Modeling Approach: Notation
<br/>
@@ -3171,9 +3148,10 @@ readr::read_csv("assets/voldep/2022_10_14_eur_ref_co2_adj_hvpi_ex_nrg.csv") %>%
- $F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}$
- $\mathcal{F}_{t}$ is the sigma field generated by all information available up to and including time $t$
Sklar's theorem: decompose target into
- marginal distributions: $F_{X_{k,t}|\mathcal{F}_{t-1}}$ for $k=1,\ldots, K$, and
- copula function: $C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}$
Sklar's theorem: decompose target into
- marginal distributions: $F_{X_{k,t}|\mathcal{F}_{t-1}}$ for $k=1,\ldots, K$, and
- copula function: $C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}$
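
Written out in the notation above, the decomposition reads as follows (a standard statement of Sklar's theorem; the argument $\boldsymbol{x}_t = (x_{1,t}, \ldots, x_{K,t})^{\intercal}$ is an assumed notation):

$$
F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}(\boldsymbol{x}_t) = C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}\left(F_{X_{1,t}|\mathcal{F}_{t-1}}(x_{1,t}), \ldots, F_{X_{K,t}|\mathcal{F}_{t-1}}(x_{K,t})\right)
$$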
:::
@@ -3210,7 +3188,7 @@ We take $C$ as the $t$-copula
::::
## Modeling Approach: Mean and Variance
## Modeling Approach: The General Framework
<br/>
@@ -3222,22 +3200,13 @@ We take $C$ as the $t$-copula
$$\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}$$
### Generalized non-central t-distributions
- To account for heavy tails
- Time varying
- expectation: $\boldsymbol{\mu}_t = (\mu_{1,t}, \ldots, \mu_{K,t})^{\intercal}$
- variance: $\boldsymbol{\sigma}_{t}^2 = (\sigma_{1,t}^2, \ldots, \sigma_{K,t}^2)^{\intercal}$
- Time invariant
- degrees of freedom: $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_K)^{\intercal}$
- noncentrality: $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)^{\intercal}$
Generalized non-central t-distributions
:::
::: {.column width="4%"}
:::
::: {.column width="48%"}
- Time varying: expectation $\boldsymbol{\mu}_t = (\mu_{1,t}, \ldots, \mu_{K,t})^{\intercal}$
- variance: $\boldsymbol{\sigma}_{t}^2 = (\sigma_{1,t}^2, \ldots, \sigma_{K,t}^2)^{\intercal}$
- Time invariant
- degrees of freedom: $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_K)^{\intercal}$
- noncentrality: $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)^{\intercal}$
### VECM Model
@@ -3247,6 +3216,14 @@ $$\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}$$
where $\Pi = \alpha \beta^{\intercal}$ is the cointegrating matrix of rank $r$, $0 \leq r\leq K$.
:::
::: {.column width="4%"}
:::
::: {.column width="48%"}
### GARCH model
\begin{align}
@@ -3255,19 +3232,7 @@ where $\Pi = \alpha \beta^{\intercal}$ is the cointegrating matrix of rank $r$,
where $\epsilon_{i,t-1}^+ = \max\{\epsilon_{i,t-1}, 0\}$ ...
Separate coefficients for positive and negative innovations to capture leverage effects.
:::
::::
## Modeling Approach: Dependence
<br/>
:::: {.columns}
::: {.column width="48%"}
Positive vs. negative innovations (capture leverage effects).
### Time-varying dependence parameters
@@ -3277,39 +3242,15 @@ Separate coefficients for positive and negative innovations to capture leverage
\xi_{ij,t} = & \eta_{0,ij} + \eta_{1,ij} \xi_{ij,t-1} + \eta_{2,ij} z_{i,t-1} z_{j,t-1},
\end{align*}
$\xi_{ij,t}$ is a latent process
$z_{i,t}$ is the $i$-th standardized residual from time series $i$
$\Lambda(\cdot)$ is a link function:
$z_{i,t}$ denotes the $i$-th standardized residual from time series $i$ at time point $t$
$\Lambda(\cdot)$ is a link function
- ensures that $\Xi_{t}$ is a valid variance covariance matrix
- ensures that $\Xi_{t}$ stays within its support and remains positive semi-definite
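
The recursion for the latent process can be sketched for a single pair $(i, j)$. The link used here, $\Lambda(x) = \tanh(x/2)$, is only an assumption chosen because it maps into $(-1, 1)$; the slides do not state which link is used. The parameter values and the names `link` and `simulate_dependence` are hypothetical as well.

```r
# Assumed link function: maps the latent process into (-1, 1)
link <- function(x) tanh(x / 2)

# xi_t = eta0 + eta1 * xi_{t-1} + eta2 * z_{i,t-1} * z_{j,t-1}
# z_i, z_j: standardized residuals of two series
# eta: c(eta0, eta1, eta2); xi0: starting value of the latent process
simulate_dependence <- function(z_i, z_j, eta, xi0 = 0) {
  n <- length(z_i)
  xi <- numeric(n)
  xi[1] <- xi0
  for (t in seq_len(n)[-1]) {
    xi[t] <- eta[1] + eta[2] * xi[t - 1] + eta[3] * z_i[t - 1] * z_j[t - 1]
  }
  link(xi)  # time-varying dependence parameter for the pair (i, j)
}

set.seed(1)
z1 <- rnorm(200)
z2 <- 0.5 * z1 + sqrt(0.75) * rnorm(200)  # hypothetical correlated residuals
rho <- simulate_dependence(z1, z2, eta = c(0.1, 0.8, 0.1))
range(rho)
```

For $K > 2$ a pairwise link alone does not guarantee a positive semi-definite matrix, so this sketch covers the bivariate case only.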
:::
::: {.column width="4%"}
:::
::: {.column width="48%"}
### Maximum Likelihood Estimation
All parameters can be estimated jointly. Using conditional independence:
\begin{align*}
L = f_{X_1} \prod_{i=2}^T f_{X_i|\mathcal{F}_{i-1}},
\end{align*}
with multivariate conditional density:
\begin{align*}
f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu},
\boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda})
\end{align*}
The copula density $c$ can be derived analytically.
:::
::::
## Study Design and Evaluation
@@ -3324,13 +3265,19 @@ The copula density $c$ can be derived analytically.
- 3257 observations total
- Window size: 1000 days (~ four years)
- Forecasting 30-steps-ahead
- We sample 250 of 2227 starting points
- We draw $2^{12}= 2048$ trajectories 30 steps ahead
=> 2227 potential starting points
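
The count of starting points is consistent with subtracting the window size and the forecast horizon from the sample size. A short sketch of that arithmetic and of the subsampling step, assuming this is how the number arises; the seed and variable names are hypothetical:

```r
n_obs   <- 3257   # observations in total
window  <- 1000   # rolling window size in days
horizon <- 30     # steps ahead

# Assumed derivation: every start needs a full window and a full horizon
n_starts <- n_obs - window - horizon
n_starts

# Draw 250 of the potential starting points without replacement
set.seed(42)      # hypothetical seed, for reproducibility only
starts <- sort(sample(seq_len(n_starts), size = 250))
head(starts)
```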
### Estimation
We sample 250 to reduce computational cost
Joint maximum likelihood estimation:
We draw $2^{12}= 2048$ trajectories from the joint predictive distribution
\begin{align*}
f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu},
\boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda})
\end{align*}
The copula density $c$ can be derived analytically.
:::
@@ -3342,7 +3289,7 @@ We draw $2^{12}= 2048$ trajectories from the joint predictive distribution
### Evaluation
Forecasts are evaluated by the energy score (ES)
Our main objective is the Energy Score (ES)
\begin{align*}
\text{ES}_t(F, \mathbf{x}_t) = \mathbb{E}_{F} \left(||\tilde{\mathbf{X}}_t - \mathbf{x}_t||_2\right) - \\ \frac{1}{2} \mathbb{E}_F \left(||\tilde{\mathbf{X}}_t - \tilde{\mathbf{X}}_t'||_2 \right)
@@ -3368,7 +3315,7 @@ For univariate cases the Energy Score becomes the Continuous Ranked Probability
Relative improvement in ES compared to $\text{RW}^{\sigma, \rho}$
Cell color: w.r.t. test statistic of Diebold-Mariano test (testing whether the model outperforms the benchmark, greener = better).
Cell color: w.r.t. test statistic of Diebold-Mariano test (whether the model outperforms the benchmark, greener = better).
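
The energy scores compared in this table can be approximated from simulated trajectories by replacing both expectations with sample means. A minimal sketch for a single forecast horizon, assuming an $M \times K$ matrix of draws; `energy_score`, `X`, and the relative-improvement formula are illustrative assumptions rather than the code behind the table:

```r
# Sample-based energy score: E||X - x|| - 0.5 * E||X - X'||
# X: M x K matrix of draws from the joint predictive distribution
# x_obs: length-K vector of realized values
energy_score <- function(X, x_obs) {
  d_obs  <- sqrt(rowSums(sweep(X, 2, x_obs)^2))  # distance of each draw to x_obs
  d_pair <- as.matrix(dist(X))                   # pairwise Euclidean distances
  mean(d_obs) - 0.5 * mean(d_pair)
}

set.seed(1)
X <- matrix(rnorm(500 * 4), ncol = 4)   # 500 hypothetical draws, K = 4 series
es_model <- energy_score(X, x_obs = rep(0, 4))
es_bench <- energy_score(2 * X, x_obs = rep(0, 4))

# Assumed definition of relative improvement in percent
100 * (1 - es_model / es_bench)
```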
```{r, echo=FALSE, results='asis', width = 'revert-layer', cache = TRUE}
load("assets/voldep/energy_df.Rdata")
@@ -3424,6 +3371,23 @@ table_energy %>%
)
```
```{=html}
<div style="font-size: 0.5em; margin-top: 0.5em;">
<span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
<span style="background-color: #66BA6A; padding: 2px 6px;">&lt;-5</span>
<span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
<span style="background-color: #91C866; padding: 2px 6px;">-3</span>
<span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
<span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
<span style="background-color: #FFED58; padding: 2px 6px;">0</span>
<span style="background-color: #FFD145; padding: 2px 6px;">1</span>
<span style="background-color: #FFB531; padding: 2px 6px;">2</span>
<span style="background-color: #FC9733; padding: 2px 6px;">3</span>
<span style="background-color: #F67744; padding: 2px 6px;">4</span>
<span style="background-color: #EE5250; padding: 2px 6px;">&gt;5</span>
</div>
```
:::
::: {.column width="4%"}
@@ -3438,7 +3402,7 @@ table_energy %>%
- Vector ETS $VES^{\sigma}$ with constant volatility
- Heteroscedasticity is a main driver of ES
- The VECM model without cointegration (essentially a VAR) is the best performing model in terms of ES overall
- The VECM model without cointegration (VAR) is the best performing model in terms of ES overall
- For EUA, the ETS Benchmark is the best performing model in terms of ES
:::
@@ -3467,7 +3431,7 @@ table_energy %>%
::: {.column width="68%"}
Improvement in CRPS of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$ (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
Relative improvement in CRPS compared to $\text{RW}^{\sigma, \rho}$
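
The cell coloring relies on a Diebold-Mariano test statistic. A minimal sketch, assuming one loss value per forecast origin and ignoring the autocorrelation that multi-step forecasts induce (a full analysis would use a HAC variance estimate); `dm_statistic` and its inputs are hypothetical names:

```r
# Simplified Diebold-Mariano statistic on the loss differential
# loss_model, loss_bench: losses (e.g. CRPS) per forecast origin
# Negative values indicate lower loss for the model than for the benchmark
dm_statistic <- function(loss_model, loss_bench) {
  d <- loss_model - loss_bench
  mean(d) / sqrt(var(d) / length(d))
}

set.seed(1)
loss_bench <- rexp(250)          # hypothetical benchmark losses
loss_model <- 0.9 * loss_bench   # model with uniformly lower loss
dm_statistic(loss_model, loss_bench)
```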
```{r, echo=FALSE, results = 'asis', cache = TRUE}
load("assets/voldep/crps_df.Rdata")
@@ -3515,6 +3479,23 @@ table_crps %>%
)
```
```{=html}
<div style="font-size: 0.5em; margin-top: 0.5em;">
<span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
<span style="background-color: #66BA6A; padding: 2px 6px;">&lt;-5</span>
<span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
<span style="background-color: #91C866; padding: 2px 6px;">-3</span>
<span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
<span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
<span style="background-color: #FFED58; padding: 2px 6px;">0</span>
<span style="background-color: #FFD145; padding: 2px 6px;">1</span>
<span style="background-color: #FFB531; padding: 2px 6px;">2</span>
<span style="background-color: #FC9733; padding: 2px 6px;">3</span>
<span style="background-color: #F67744; padding: 2px 6px;">4</span>
<span style="background-color: #EE5250; padding: 2px 6px;">&gt;5</span>
</div>
```
:::
::::
@@ -3527,16 +3508,9 @@ table_crps %>%
RMSE measures the performance of the forecasts at their mean
Some models beat the benchmarks at short horizons
<br/>
- Some models beat the benchmarks at short horizons
<br/>
Conclusion: the improvements seen before must be attributed to other parts of the multivariate probabilistic predictive distribution
Conclusion: the improvements seen before must be attributed to other parts of the multivariate predictive distribution
:::
@@ -3546,7 +3520,7 @@ Conclusion: the Improvements seen before must be attributed to other parts of th
::: {.column width="68%"}
Improvement in RMSE score of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$ (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
Relative improvement in RMSE compared to $\text{RW}^{\sigma, \rho}$
```{r, echo=FALSE, results = 'asis', cache = TRUE}
load("assets/voldep/rmsq_df.Rdata")
@@ -3593,6 +3567,23 @@ table_rmsq %>%
)
```
```{=html}
<div style="font-size: 0.5em; margin-top: 0.5em;">
<span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
<span style="background-color: #66BA6A; padding: 2px 6px;">&lt;-5</span>
<span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
<span style="background-color: #91C866; padding: 2px 6px;">-3</span>
<span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
<span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
<span style="background-color: #FFED58; padding: 2px 6px;">0</span>
<span style="background-color: #FFD145; padding: 2px 6px;">1</span>
<span style="background-color: #FFB531; padding: 2px 6px;">2</span>
<span style="background-color: #FC9733; padding: 2px 6px;">3</span>
<span style="background-color: #F67744; padding: 2px 6px;">4</span>
<span style="background-color: #EE5250; padding: 2px 6px;">&gt;5</span>
</div>
```
:::
::::
@@ -3757,8 +3748,8 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma
- Price dynamics emerged well before the Russian invasion of Ukraine
- Linear dependence between the series reacted only right after the invasion
- Improvements in forecasting performance are mainly attributed to:
- the tails multivariate probabilistic predictive distribution
- the dependence structure between the marginals
- the tails
- the dependence structure between the marginals
:::
@@ -3778,7 +3769,7 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma
::::
---
# Final Remarks {visibility="uncounted"}
## Contributions {#sec-contributions}
@@ -3786,8 +3777,6 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma
::: {.column width="48%"}
<p style="margin:1.5em;"></p>
**Theoretical**
Probabilistic Online Learning:
@@ -3821,8 +3810,6 @@ Applications
::: {.column width="48%"}
<p style="margin:1.5em;"></p>
**Software**
R Packages:
@@ -3852,5 +3839,8 @@ Berrisch, J., Narajewski, M., & Ziel, F. [-@BERRISCH2023100236]:
::::
## Questions! {visibility="uncounted"}
![Artwork by [\@allison_horst](https://allisonhorst.com/)](assets/allisonhorst/hiding.png)
## References {visibility="uncounted"}