A thousand minor improvements

2025-06-20 11:30:44 +02:00
parent 2b22506fb8
commit 9bc402b926
6 changed files with 2309 additions and 880 deletions
--- a/index.qmd
+++ b/index.qmd
@@ -36,7 +36,8 @@ revealjs-plugins:
 #  - drop
 ---

-## Outline {.center}
+# High-Level View {.center visibility="uncounted"}
+

 <!--
 Render with: quarto preview /home/jonathan/git/PHD-Presentation/25_07_phd_defense/index.qmd --no-browser --port 6074
@@ -48,7 +49,7 @@ $$
 $$
 :::

-:::: {style="font-size: 150%;"}
+<!-- :::: {style="font-size: 150%;"}

 <i class="fa fa-fw fa-rocket" style="color:var(--col_grey_9);"></i> &ensp; [Research Motivation](#motivation)

@@ -60,7 +61,8 @@ $$

 <i class="fa fa-fw fa-newspaper" style="color:var(--col_grey_9);"></i> &ensp; [Contributions](#sec-contributions)

-:::
+:::: -->
+

 ```{r, setup, include=FALSE}
 # Compile with: rmarkdown::render("crps_learning.Rmd")
@@ -754,7 +756,7 @@ void main(){
 ::::


-# CRPS Learning {#sec-crps-learning}
+# CRPS Learning {#sec-crps-learning visibility="uncounted"}

 Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. *Journal of Econometrics*, 237(2), 105221.

@@ -843,7 +845,7 @@ plot(w[, 3],
  xlab = "",
  ylab = "", xaxt = "n", yaxt = "n", bty = "n", col = "#FFD44EFF"
 )
-text(6, 0.25, TeX("$w_3(t)$"), cex = 2, col = "#FFD44EFF")
+text(6, 0.25, TeX("$w_2(t)$"), cex = 2, col = "#FFD44EFF")
 arrows(13, 0.75, 15, 1, , lwd = 4, bty = "n", col = "#414141FF")

 ```
@@ -941,7 +943,7 @@ chart = {
    .style('align-self', 'center')
    .style('margin-left', 'auto')
    .on('click', () => {
-      selectedMu = 0.5;
+      selectedMu = 1;
      muSlider.property('value', selectedMu);
      muDisplay.text(selectedMu.toFixed(1));
      updateChart(filteredData());
@@ -1053,7 +1055,7 @@ chart = {

 ## The Framework of Prediction under Expert Advice

-### The sequential framework
+### &nbsp;

 :::: {.columns}

@@ -1287,7 +1289,7 @@ $\ell'$ is the subgradient of $\ell$ at forecast combination $\widetilde{X}$.
  \text{CRPS}(F, y) = \int_{\mathbb{R}} {(F(x) - \mathbb{1}\{ x > y \})}^2 dx \label{eq:crps}
 \end{equation*}

-It's strictly proper @gneiting2007strictly.
+It's strictly proper [@gneiting2007strictly].

 Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance varies in parts of the distribution? 

@@ -2277,7 +2279,7 @@ weights %>%

 ::::

-# Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices
+# Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices {visibility="uncounted"}

 Berrisch, J., & Ziel, F. (2024). *International Journal of Forecasting*, 40(4), 1568-1586.

@@ -3014,14 +3016,12 @@ Pubications:

 ::::

-# Modeling Volatility and Dependence of European Carbon and Energy Prices {#sec-voldep}
+# Modeling Volatility and Dependence of European Carbon and Energy Prices {#sec-voldep visibility="uncounted"}

 Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). *Finance Research Letters*, 52, 103503.

 ---

-## &nbsp;
-
 :::: {.columns}

 ::: {.column width="48%"}
@@ -3037,7 +3037,7 @@ for several fields:

 <i class="fa fa-fw fa-handshake" style="color:var(--col_grey_9);"></i> Political decisions

-EUA prices are obviously connected to the energy market
+EUA prices are connected to energy markets

 How can the dynamics be characterized?

@@ -3057,23 +3057,20 @@ Several Questions arise:

 ### Data

-EUA, natural gas, Brent crude oil, coal 
+Daily Observations:  03/15/2010 - 10/14/2022

-March 15, 2010, until October 14, 2022
+EUA, Natural Gas, Brent Crude Oil, Coal 

-Data was normalized w.r.t. $\text{CO}_2$ emissions
+- Normalized w.r.t. $\text{CO}_2$ emissions
+- Adjusted for inflation by Eurostat's HICP, *excluding energy*

-Emission-adjusted prices reflects one tonne of $\text{CO}_2$
-
-We adjusted for inflation by Eurostat's HICP, excluding energy
+Emission-adjusted prices reflect one tonne of $\text{CO}_2$

 Log transformation of the data to stabilize the variance

 ADF Test: All series are stationary in first differences

-Johansen’s likelihood ratio trace test suggests two cointegrating relationships (levels)
-
-Johansen’s likelihood ratio trace test suggests no cointegrating relationships (logs)
+Johansen’s likelihood ratio trace test suggests two cointegrating relationships (only in levels)

 :::

@@ -3137,26 +3134,6 @@ readr::read_csv("assets/voldep/2022_10_14_eur_ref_co2_adj_hvpi_ex_nrg.csv") %>%
  scale_y_continuous(trans = "log2")
 ```

-## Modeling Approach: Overview
-
-</br>
-
-### VECM: Vector Error Correction Model
-
-  - Modeling the expectaion
-  - Captures the long-run cointegrating relationship
-  - Different cointegrating ranks, including rank zero (no cointegration)
-
-### GARCH: Generalized Autoregressive Conditional Heteroscedasticity
-
-  - Captures dynamics in conditional variance
-
-### Copula: Captures the dependence structure
-
-  - Captures: conditional cross-sectional dependencies
-  - Dependence allowed to vary over time
-
-
 ## Modeling Approach: Notation

 <br/>
@@ -3171,9 +3148,10 @@ readr::read_csv("assets/voldep/2022_10_14_eur_ref_co2_adj_hvpi_ex_nrg.csv") %>%
  - $F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}$ 
  - $\mathcal{F}_{t}$ is the sigma field generated by all information available up to and including time $t$

-Sklars theorem: decompose target into 
-  - marginal distributions: $F_{X_{k,t}|\mathcal{F}_{t-1}}$ for $k=1,\ldots, K$, and
-  - copula function: $C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}$
+Sklar's theorem: decompose target into 
+
+- marginal distributions: $F_{X_{k,t}|\mathcal{F}_{t-1}}$ for $k=1,\ldots, K$, and
+- copula function: $C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}$

 :::

@@ -3210,7 +3188,7 @@ We take $C$ as the $t$-copula

 ::::

-## Modeling Approach: Mean and Variance
+## Modeling Approach: The General Framework

 <br/>

@@ -3222,22 +3200,13 @@ We take $C$ as the $t$-copula

 $$\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}$$

-### Generalized non-central t-distributions
-  - To account for heavy tails
-  - Time varying
-    - expectation: $\boldsymbol{\mu}_t = (\mu_{1,t}, \ldots, \mu_{K,t})^{\intercal}$
-    - variance: $\boldsymbol{\sigma}_{t}^2 = (\sigma_{1,t}^2, \ldots, \sigma_{K,t}^2)^{\intercal}$
-  - Time invariant 
-    - degrees of freedom: $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_K)^{\intercal}$
-    - noncentrality: $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)^{\intercal}$
+Generalized non-central t-distributions

-:::
-
-::: {.column width="4%"}
-
-:::
-
-::: {.column width="48%"}
+- Time varying: expectation $\boldsymbol{\mu}_t = (\mu_{1,t}, \ldots, \mu_{K,t})^{\intercal}$
+  - variance: $\boldsymbol{\sigma}_{t}^2 = (\sigma_{1,t}^2, \ldots, \sigma_{K,t}^2)^{\intercal}$
+- Time invariant 
+  - degrees of freedom: $\boldsymbol{\nu} = (\nu_1, \ldots, \nu_K)^{\intercal}$
+  - noncentrality: $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_K)^{\intercal}$

 ### VECM Model

@@ -3247,6 +3216,14 @@ $$\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}$$

 where $\Pi = \alpha \beta^{\intercal}$  is the cointegrating matrix of rank $r$, $0 \leq r\leq K$.

+:::
+
+::: {.column width="4%"}
+
+:::
+
+::: {.column width="48%"}
+
 ### GARCH model

 \begin{align}
@@ -3255,19 +3232,7 @@ where $\Pi = \alpha \beta^{\intercal}$  is the cointegrating matrix of rank $r$,

 where $\epsilon_{i,t-1}^+ = \max\{\epsilon_{i,t-1}, 0\}$ ...

-Separate coefficients for positive and negative innovations to capture leverage effects.
-
-:::
-
-::::
-
-## Modeling Approach: Dependence
-
-<br/>
-
-:::: {.columns}
-
-::: {.column width="48%"}
+Separate coefficients for positive vs. negative innovations capture leverage effects.

 ### Time-varying dependence parameters

@@ -3277,39 +3242,15 @@ Separate coefficients for positive and negative innovations to capture leverage
    \xi_{ij,t} = & \eta_{0,ij} + \eta_{1,ij} \xi_{ij,t-1} + \eta_{2,ij} z_{i,t-1} z_{j,t-1},
 \end{align*}

-$\xi_{ij,t}$ is a latent process 
+$z_{i,t}$ is the standardized residual of time series $i$ at time $t$

+$\Lambda(\cdot)$ is a link function:

-$z_{i,t}$ denotes the $i$-th standardized residual from time series $i$ at time point $t$
-
-
-$\Lambda(\cdot)$ is a link function
 - ensures that $\Xi_{t}$ is a valid variance covariance matrix
 - ensures that $\Xi_{t}$ does not exceed its support space and remains positive semi-definite

 :::

-::: {.column width="4%"}
-
-:::
-
-::: {.column width="48%"}
-
-### Maximum Likelihood Estimation
-
-All parameters can be estimated jointly. Using conditional independence:
-\begin{align*}
-    L = f_{X_1} \prod_{i=2}^T f_{X_i|\mathcal{F}_{i-1}},
-\end{align*}
-with multivariate conditional density:
-\begin{align*}
-    f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu},
-    \boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda})
-\end{align*}
-The copula density $c$ can be derived analytically.
-
-:::
-
 ::::

 ## Study Design and Evaluation
@@ -3324,13 +3265,19 @@ The copula density $c$ can be derived analytically.

 - 3257 observations total
 - Window size: 1000 days (~ four years)
- Forecasting 30-steps-ahead
+- We sample 250 of 2227 starting points
+- We draw $2^{12}= 2048$ trajectories 30 steps ahead

-=> 2227 potential starting points 
+### Estimation

-We sample 250 to reduce computational cost
+Joint maximum likelihood estimation:

-We draw $2^{12}= 2048$ trajectories from the joint predictive distribution 
+\begin{align*}
+    f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu},
+    \boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda})
+\end{align*}
+
+The copula density $c$ can be derived analytically.

 :::

@@ -3342,7 +3289,7 @@ We draw $2^{12}= 2048$ trajectories from the joint predictive distribution

 ### Evaluation

-Forecasts are evaluated by the energy score (ES)
+Our main objective is the Energy Score (ES)

 \begin{align*}
    \text{ES}_t(F, \mathbf{x}_t) = \mathbb{E}_{F} \left(||\tilde{\mathbf{X}}_t - \mathbf{x}_t||_2\right) - \\ \frac{1}{2} \mathbb{E}_F \left(||\tilde{\mathbf{X}}_t - \tilde{\mathbf{X}}_t'||_2 \right)
@@ -3368,7 +3315,7 @@ For univariate cases the Energy Score becomes the Continuous Ranked Probability

 Relative improvement in ES compared to $\text{RW}^{\sigma, \rho}$

-Cellcolor: w.r.t. test statistic of Diebold-Mariano test (testing wether the model outperformes the benchmark, greener = better).
+Cell color: w.r.t. the test statistic of the Diebold-Mariano test (whether the model outperforms the benchmark, greener = better).

 ```{r, echo=FALSE, results='asis', width = 'revert-layer', cache = TRUE}
 load("assets/voldep/energy_df.Rdata")
@@ -3424,6 +3371,23 @@ table_energy %>%
  )
 ```

+```{=html}
+<div style="font-size: 0.5em; margin-top: 0.5em;">
+  <span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
+  <span style="background-color: #66BA6A; padding: 2px 6px;">&lt;-5</span>
+  <span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
+  <span style="background-color: #91C866; padding: 2px 6px;">-3</span>
+  <span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
+  <span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
+  <span style="background-color: #FFED58; padding: 2px 6px;">0</span>
+  <span style="background-color: #FFD145; padding: 2px 6px;">1</span>
+  <span style="background-color: #FFB531; padding: 2px 6px;">2</span>
+  <span style="background-color: #FC9733; padding: 2px 6px;">3</span>
+  <span style="background-color: #F67744; padding: 2px 6px;">4</span>
+  <span style="background-color: #EE5250; padding: 2px 6px;">&gt;5</span>
+</div>
+```
+
 :::

 ::: {.column width="4%"}
@@ -3438,7 +3402,7 @@ table_energy %>%
  - Vector ETS $VES^{\sigma}$ with constant volatility 

 - Heteroscedasticity is a main driver of ES
- The VECM model without cointegration (essentially a VAR) is the best performing model in terms of ES overall
+- The VECM model without cointegration (VAR) is the best performing model in terms of ES overall
 - For EUA, the ETS Benchmark is the best performing model in terms of ES

 :::
@@ -3467,7 +3431,7 @@ table_energy %>%

 ::: {.column width="68%"}

-Improvement in CRPS of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$  (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
+Relative improvement in CRPS compared to $\text{RW}^{\sigma, \rho}$

 ```{r, echo=FALSE, results = 'asis', cache = TRUE}
 load("assets/voldep/crps_df.Rdata")
@@ -3515,6 +3479,23 @@ table_crps %>%
  )
 ```

+```{=html}
+<div style="font-size: 0.5em; margin-top: 0.5em;">
+  <span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
+  <span style="background-color: #66BA6A; padding: 2px 6px;">&lt;-5</span>
+  <span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
+  <span style="background-color: #91C866; padding: 2px 6px;">-3</span>
+  <span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
+  <span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
+  <span style="background-color: #FFED58; padding: 2px 6px;">0</span>
+  <span style="background-color: #FFD145; padding: 2px 6px;">1</span>
+  <span style="background-color: #FFB531; padding: 2px 6px;">2</span>
+  <span style="background-color: #FC9733; padding: 2px 6px;">3</span>
+  <span style="background-color: #F67744; padding: 2px 6px;">4</span>
+  <span style="background-color: #EE5250; padding: 2px 6px;">&gt;5</span>
+</div>
+```
+
 :::

 ::::
@@ -3527,16 +3508,9 @@ table_crps %>%

 RMSE measures the performance of the forecasts at their mean

+Some models beat the benchmarks at short horizons

-</br>
-
-
- Some models beat the benchmarks at short horizons
-
-</br>
-
-Conclusion: the Improvements seen before must be attributed to other parts of the multivariate probabilistic predictive distribution
-
+Conclusion: the improvements seen before must be attributed to other parts of the multivariate predictive distribution

 :::

@@ -3546,7 +3520,7 @@ Conclusion: the Improvements seen before must be attributed to other parts of th

 ::: {.column width="68%"}

-Improvement in RMSE score of selected models relative to $\textrm{RW}^{\sigma, \rho}_{}$ in % (higher = better). Colored according to the test statistic of a DM-Test comparing to $\textrm{RW}^{\sigma, \rho}_{}$  (greener means lower test statistic i.e., better performance compared to $\textrm{RW}^{\sigma, \rho}_{}$).
+Relative improvement in RMSE compared to $\text{RW}^{\sigma, \rho}$

 ```{r, echo=FALSE, results = 'asis', cache = TRUE}
 load("assets/voldep/rmsq_df.Rdata")
@@ -3593,6 +3567,23 @@ table_rmsq %>%
  )
 ```

+```{=html}
+<div style="font-size: 0.5em; margin-top: 0.5em;">
+  <span style="padding: 2px 6px;">Coloring w.r.t. test statistic: </span>
+  <span style="background-color: #66BA6A; padding: 2px 6px;">&lt;-5</span>
+  <span style="background-color: #7CC168; padding: 2px 6px;">-4</span>
+  <span style="background-color: #91C866; padding: 2px 6px;">-3</span>
+  <span style="background-color: #B0D363; padding: 2px 6px;">-2</span>
+  <span style="background-color: #D8E05E; padding: 2px 6px;">-1</span>
+  <span style="background-color: #FFED58; padding: 2px 6px;">0</span>
+  <span style="background-color: #FFD145; padding: 2px 6px;">1</span>
+  <span style="background-color: #FFB531; padding: 2px 6px;">2</span>
+  <span style="background-color: #FC9733; padding: 2px 6px;">3</span>
+  <span style="background-color: #F67744; padding: 2px 6px;">4</span>
+  <span style="background-color: #EE5250; padding: 2px 6px;">&gt;5</span>
+</div>
+```
+
 :::

 ::::
@@ -3757,8 +3748,8 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma
 - Price dynamics emerged well before the Russian invasion of Ukraine
 - Linear dependence between the series reacted only right after the invasion
 - Improvements in forecasting performance are mainly attributed to:
-  -  the tails multivariate probabilistic predictive distribution
-  -  the dependence structure between the marginals
+  - the tails
+  - the dependence structure between the marginals

 :::

@@ -3778,7 +3769,7 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma

 ::::

---
+# Final Remarks {visibility="uncounted"}

 ## Contributions {#sec-contributions}

@@ -3786,8 +3777,6 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma

 ::: {.column width="48%"}

-<p style="margin:1.5em;"></p>
-
 **Theoretical**

 Probabilistic Online Learning:
@@ -3821,8 +3810,6 @@ Applications

 ::: {.column width="48%"}

-<p style="margin:1.5em;"></p>
-
 **Software**

 R Packages:
@@ -3852,5 +3839,8 @@ Berrisch, J., Narajewski, M., & Ziel, F. [-@BERRISCH2023100236]:

 ::::

+## Questions! {visibility="uncounted"}
+
+![Artwork by [\@allison_horst](https://allisonhorst.com/)](assets/allisonhorst/hiding.png)

 ## References {visibility="uncounted"}
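
Note on the "Study Design and Evaluation" slide touched by this commit (not part of the diff itself): the slide defines the Energy Score through expectations under $F$, while the forecasts are sampled trajectories. A minimal sketch of how the score could be approximated from such samples is the standard Monte Carlo estimator below. The sample size $M$ and the draws $\tilde{\mathbf{x}}_t^{(m)}$ are notation assumed here for illustration; the slides do not state which estimator was actually used. The second line is one reading that reproduces the 2227 starting points from the figures on the slide (total observations minus window size minus forecast horizon); that reading is also an assumption.

```latex
\begin{align*}
    \widehat{\text{ES}}_t
      &= \frac{1}{M} \sum_{m=1}^{M} \left\lVert \tilde{\mathbf{x}}_t^{(m)} - \mathbf{x}_t \right\rVert_2
       - \frac{1}{2 M^2} \sum_{m=1}^{M} \sum_{m'=1}^{M}
         \left\lVert \tilde{\mathbf{x}}_t^{(m)} - \tilde{\mathbf{x}}_t^{(m')} \right\rVert_2 \\
    3257 - 1000 - 30 &= 2227
\end{align*}
```

The first sum estimates $\mathbb{E}_F\lVert \tilde{\mathbf{X}}_t - \mathbf{x}_t \rVert_2$ and the double sum estimates $\mathbb{E}_F\lVert \tilde{\mathbf{X}}_t - \tilde{\mathbf{X}}_t' \rVert_2$, so the expression matches the ES definition on the slide term by term.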