Finalize CRPS + MCRS begin working on VolDep
580
index.qmd
@@ -97,6 +97,8 @@ col_yellow <- "#FCE135"
|
||||
|
||||
# CRPS Learning
|
||||
|
||||
Berrisch, J., & Ziel, F. (2023). *Journal of Econometrics*, 237(2), 105221.
|
||||
|
||||
## Motivation
|
||||
|
||||
:::: {.columns}
|
||||
@@ -1022,21 +1024,17 @@ The same simulation carried out for different algorithms (1000 runs):
|
||||
|
||||
## Study Forget
|
||||
|
||||
::::
|
||||
|
||||
## Simulation Study
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
::: {.column width="38%"}
|
||||
|
||||
**New DGP:**
|
||||
#### New DGP:
|
||||
|
||||
\begin{align}
|
||||
Y_t & \sim \mathcal{N}\left(\frac{\sin(0.005 \pi t )}{2},\,1\right) \\
|
||||
\widehat{X}_{t,1} & \sim \widehat{F}_{1} = \mathcal{N}(-1,\,1) \\
|
||||
\widehat{X}_{t,2} & \sim \widehat{F}_{2} = \mathcal{N}(3,\,4) \label{eq_dgp_sim2}
|
||||
\end{align}
|
||||
\begin{align*}
|
||||
Y_t &\sim \mathcal{N}\left(\frac{\sin(0.005 \pi t )}{2},\,1\right) \\
|
||||
\widehat{X}_{t,1} &\sim \widehat{F}_{1} = \mathcal{N}(-1,\,1) \\
|
||||
\widehat{X}_{t,2} &\sim \widehat{F}_{2} = \mathcal{N}(3,\,4)
|
||||
\end{align*}
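
A minimal sketch of how this DGP could be simulated and how the two experts could be scored. It assumes that the second argument of $\mathcal{N}(\cdot,\cdot)$ is a variance (so expert 2 has standard deviation 2), and the names `n_obs` and `crps_normal` are hypothetical helpers rather than objects from the slides:

```r
set.seed(1)

n_obs <- 1200                              # hypothetical sample length
t_idx <- seq_len(n_obs)

# Time-varying mean of the target: sin(0.005 * pi * t) / 2
mu_y <- sin(0.005 * pi * t_idx) / 2
y <- rnorm(n_obs, mean = mu_y, sd = 1)

# Closed-form CRPS of a normal forecast N(mu, sd^2) at outcome y
crps_normal <- function(y, mu, sd) {
  z <- (y - mu) / sd
  sd * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}

# Expert 1: N(-1, 1); expert 2: N(3, 4), i.e. sd = 2 under the variance assumption
crps_1 <- crps_normal(y, mu = -1, sd = 1)
crps_2 <- crps_normal(y, mu = 3, sd = 2)

# Average CRPS per expert over the whole sample
c(expert_1 = mean(crps_1), expert_2 = mean(crps_2))
```

Because the mean of $Y_t$ oscillates over time, the relative performance of the two fixed experts shifts with $t$.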
|
||||
|
||||
`r fontawesome::fa("arrow-right", fill ="#000000")` Changing optimal weights
|
||||
|
||||
@@ -1044,20 +1042,25 @@ The same simulation carried out for different algorithms (1000 runs):
|
||||
|
||||
`r fontawesome::fa("arrow-right", fill ="#000000")` No forgetting leads to long-term constant weights
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
<center>
|
||||
<img src="assets/crps_learning/forget.png">
|
||||
</center>
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
::: {.column width="4%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="58%"}
|
||||
|
||||
###
|
||||
|
||||
**Weights of expert 2**
|
||||
|
||||
```{r, echo = FALSE, fig.width=7, fig.height=5, fig.align='center', cache = TRUE}
|
||||
load("assets/crps_learning/changing_weights.rds")
|
||||
mod_labs <- c("Optimum", "Pointwise", "Smooth", "Constant")
|
||||
names(mod_labs) <- c("TOptimum", "Pointwise", "Smooth", "Constant")
|
||||
load("assets/crps_learning/weights_preprocessed.rda")
|
||||
mod_labs <- c("Optimum", "No Forget\nPointwise", "No Forget\nP-Smooth", "Forget\nPointwise", "Forget\nP-Smooth")
|
||||
names(mod_labs) <- c("Optimum", "nf_ptw", "nf_psmth", "f_ptw", "f_psmth")
|
||||
colseq <- c(grey(.99), "orange", "red", "purple", "blue", "darkblue", "black")
|
||||
weights_preprocessed %>%
|
||||
mutate(w = 1 - w) %>%
|
||||
@@ -1084,19 +1087,10 @@ weights_preprocessed %>%
|
||||
|
||||
::::
|
||||
|
||||
## Simulation Results
|
||||
|
||||
The simulation using the new DGP carried out for different algorithms (1000 runs):
|
||||
|
||||
<center>
|
||||
<img src="assets/crps_learning/algos_changing.gif">
|
||||
</center>
|
||||
::::
|
||||
|
||||
## Possible Extensions
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
**Forgetting**
|
||||
|
||||
@@ -1117,34 +1111,20 @@ The simulation using the new DGP carried out for different algorithms (1000 runs
|
||||
\label{fixed_share_simple}.
|
||||
\end{align*}
|
||||
|
||||
:::
|
||||
TODO: Move these to the multivariate slides
|
||||
|
||||
::: {.column width="2%"}
|
||||
## Application Study
|
||||
|
||||
:::
|
||||
::: {.panel-tabset}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
**Non-Equidistant Knots**
|
||||
|
||||
- Non-equidistant spline-basis could be used
|
||||
- Potentially improves the tail-behavior
|
||||
- Destroys shrinkage towards constant
|
||||
|
||||
<center>
|
||||
<img src="assets/crps_learning/uneven_grid.gif">
|
||||
</center>
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
## Application Study: Overview
|
||||
## Overview
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="29%"}
|
||||
|
||||
::: {style="font-size: 85%;"}
|
||||
|
||||
Data:
|
||||
|
||||
- Forecasting European emission allowances (EUA)
|
||||
@@ -1160,6 +1140,8 @@ Tuning paramter grids:
|
||||
- Smoothing Penalty: $\Lambda= \{0\}\cup \{2^x|x\in \{-4,-3.5,\ldots,12\}\}$
|
||||
- Learning Rates: $\mathcal{E}= \{2^x|x\in \{-1,-0.5,\ldots,9\}\}$
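
As a small illustration, the two grids above can be generated in R as follows. The variable names `lambda_grid` and `eta_grid` are hypothetical and not taken from the original code:

```r
# Smoothing penalties: {0} united with 2^x for x in {-4, -3.5, ..., 12}
lambda_grid <- c(0, 2^seq(-4, 12, by = 0.5))

# Learning rates: 2^x for x in {-1, -0.5, ..., 9}
eta_grid <- 2^seq(-1, 9, by = 0.5)

length(lambda_grid)  # number of candidate penalties
length(eta_grid)     # number of candidate learning rates
```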
|
||||
|
||||
::::
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
@@ -1203,7 +1185,9 @@ overview
|
||||
|
||||
::::
|
||||
|
||||
## Application Study: Experts
|
||||
## Experts
|
||||
|
||||
::: {style="font-size: 90%;"}
|
||||
|
||||
Simple exponential smoothing with additive errors (**ETS-ANN**):
|
||||
|
||||
@@ -1235,6 +1219,7 @@ ARIMA(0,1,0)-GARCH(1,1) with student-t errors (**I-GARCHt**):
|
||||
Y_{t} = \mu + Y_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z_t, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \quad \text{and} \quad Z_t \sim t(0,1, \nu)
|
||||
\end{align*}
|
||||
|
||||
::::
|
||||
|
||||
## Results
|
||||
|
||||
@@ -1242,6 +1227,8 @@ Y_{t} = \mu + Y_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \
|
||||
|
||||
## Significance
|
||||
|
||||
<br/>
|
||||
|
||||
```{r, echo = FALSE, fig.width=7, fig.height=5.5, fig.align='center', cache = TRUE, results='asis'}
|
||||
load("assets/crps_learning/bernstein_application_study_estimations+learnings_rev1.RData")
|
||||
|
||||
@@ -1494,247 +1481,47 @@ weights %>%
|
||||
|
||||
::::
|
||||
|
||||
## Wrap-Up
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
Potential Downsides:
|
||||
|
||||
- Pointwise optimization can induce quantile crossing
|
||||
- Can be solved by sorting the predictions
|
||||
|
||||
Upsides:
|
||||
|
||||
- Pointwise learning outperforms the Naive solution significantly
|
||||
- Online learning is much faster than batch methods
|
||||
- Smoothing further improves the predictive performance
|
||||
- Asymptotically not worse than the best convex combination
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
Important:
|
||||
|
||||
- The choice of the learning rate is crucial
|
||||
- The loss function has to meet certain criteria
|
||||
|
||||
The [`r fontawesome::fa("github")` profoc](https://profoc.berrisch.biz/) R Package:
|
||||
|
||||
- Implements all algorithms discussed above
|
||||
- Is written using RcppArmadillo `r fontawesome::fa("arrow-right", fill ="#000000")` it's fast
|
||||
- Accepts vectors for most parameters
|
||||
- The best parameter combination is chosen online
|
||||
- Implements
|
||||
- Forgetting, Fixed Share
|
||||
- Different loss functions + gradients
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
<!-- :::: {.notes}
|
||||
|
||||
Execution Times:
|
||||
|
||||
T = 5000
|
||||
|
||||
Opera:
|
||||
|
||||
Ml-Poly > 157 ms
|
||||
Boa > 212 ms
|
||||
|
||||
Profoc:
|
||||
|
||||
Ml-Poly > 17
|
||||
BOA > 16 -->
|
||||
|
||||
# Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices
|
||||
|
||||
---
|
||||
Berrisch, J., & Ziel, F. (2024). *International Journal of Forecasting*, 40(4), 1568-1586.
|
||||
|
||||
## Outline
|
||||
|
||||
<br/>
|
||||
|
||||
**Multivariate CRPS Learning**
|
||||
|
||||
- Introduction
|
||||
- Smoothing procedures
|
||||
- Application to multivariate electricity price forecasts
|
||||
|
||||
**The `profoc` R package**
|
||||
|
||||
- Package overview
|
||||
- Implementation details
|
||||
- Illustrative examples
|
||||
|
||||
## The Framework of Prediction under Expert Advice
|
||||
|
||||
### The sequential framework
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
Each day, $t = 1, 2, \ldots, T$
|
||||
|
||||
- The **forecaster** receives predictions $\widehat{X}_{t,k}$ from $K$ **experts**
|
||||
- The **forecaster** assigns weights $w_{t,k}$ to each **expert**
|
||||
- The **forecaster** calculates her prediction:
|
||||
|
||||
$$\widetilde{X}_{t}=\sum_{k=1}^K w_{t,k}\widehat{X}_{t,k}$$
|
||||
|
||||
- The realization for $t$ is observed
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
- The experts can be institutions, persons, or models
|
||||
- The forecasts can be point-forecasts (i.e., mean or median) or full predictive distributions
|
||||
- We do not need any assumptions concerning the underlying data
|
||||
- @cesa2006prediction
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
## The Regret
|
||||
|
||||
Weights are updated sequentially according to the past performance of the $K$ experts.
|
||||
|
||||
`r fontawesome::fa("arrow-right", fill ="#000000")` A loss function $\ell$ is needed (to compute the **cumulative regret** $R_{t,k}$)
|
||||
|
||||
\begin{equation}
|
||||
R_{t,k} = \widetilde{L}_{t} - \widehat{L}_{t,k} = \sum_{i = 1}^t \ell(\widetilde{X}_{i},Y_i) - \ell(\widehat{X}_{i,k},Y_i)
|
||||
\label{eq_regret}
|
||||
\end{equation}
|
||||
|
||||
The cumulative regret:
|
||||
- Indicates the predictive accuracy of expert $k$ until time $t$.
|
||||
- Measures how much the forecaster *regrets* not having followed the expert's advice
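
A minimal sketch of the cumulative regret for a toy setting. It assumes the $\ell_2$-loss, three simulated experts, and fixed equal weights; these are illustrative choices only and do not represent an update rule from the slides:

```r
set.seed(42)

n <- 200                                   # number of time steps (hypothetical)
K <- 3                                     # number of experts (hypothetical)
y <- rnorm(n)                              # realizations Y_t

# Expert predictions: the truth plus an expert-specific bias and noise (n x K)
experts <- sapply(c(-0.5, 0, 1), function(b) y + b + rnorm(n, sd = 0.5))

# Forecaster's prediction with fixed equal weights (illustrative only)
w <- rep(1 / K, K)
combined <- as.vector(experts %*% w)

l2 <- function(x, y) (x - y)^2

loss_forecaster <- l2(combined, y)         # length n
loss_experts <- l2(experts, y)             # n x K, y is recycled down each column

# R_{t,k}: cumulative sum over time of (forecaster loss - loss of expert k)
regret <- apply(loss_forecaster - loss_experts, 2, cumsum)

regret[n, ]                                # regret against each expert at time n
```

A positive entry means that expert $k$ accumulated a smaller loss than the forecaster up to that time.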
|
||||
|
||||
Popular loss functions for point forecasting @gneiting2011making:
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
- $\ell_2$-loss $\ell_2(x, y) = | x -y|^2$
|
||||
- optimal for mean predictions
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
- $\ell_1$-loss $\ell_1(x, y) = | x -y|$
|
||||
- optimal for median predictions
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
---
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
### Probabilistic Setting
|
||||
|
||||
An appropriate loss:
|
||||
|
||||
\begin{align*}
|
||||
\text{CRPS}(F, y) & = \int_{\mathbb{R}} {(F(x) - \mathbb{1}\{ x > y \})}^2 dx
|
||||
\label{eq_crps}
|
||||
\end{align*}
|
||||
|
||||
It's strictly proper @gneiting2007strictly.
|
||||
|
||||
Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance varies in parts of the distribution?
|
||||
|
||||
`r fontawesome::fa("lightbulb", fill = col_yellow)` Utilize this relation:
|
||||
|
||||
\begin{align*}
|
||||
\text{CRPS}(F, y) = 2 \int_0^{1} \text{QL}_p(F^{-1}(p), y) \, d p.
|
||||
\label{eq_crps_qs}
|
||||
\end{align*}
|
||||
|
||||
... to combine quantiles of the probabilistic forecasts individually using the quantile-loss QL.
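
The relation between the CRPS and the quantile loss can be checked numerically. The sketch below assumes a normal predictive distribution and the pinball form $\text{QL}_p(q, y) = (\mathbb{1}\{y < q\} - p)(q - y)$. The helper names `pinball` and `crps_normal` as well as the chosen values are illustrative:

```r
# Pinball (quantile) loss of the quantile forecast q at level p for outcome y
pinball <- function(q, y, p) ((y < q) - p) * (q - y)

# Closed-form CRPS of a normal forecast N(mu, sd^2)
crps_normal <- function(y, mu, sd) {
  z <- (y - mu) / sd
  sd * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}

mu <- 0; sd <- 1; y <- 0.3                 # illustrative values

# Midpoint rule on (0, 1) for 2 * integral of QL_p(F^{-1}(p), y) dp
n_grid <- 1e5
p <- (seq_len(n_grid) - 0.5) / n_grid
crps_via_ql <- 2 * mean(pinball(qnorm(p, mu, sd), y, p))

crps_closed <- crps_normal(y, mu, sd)

# Both numbers should agree up to the discretization error
c(via_quantile_loss = crps_via_ql, closed_form = crps_closed)
```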
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
### Optimal Convergence
|
||||
|
||||
<br/>
|
||||
|
||||
`r fontawesome::fa("exclamation", fill = col_orange)` exp-concavity of the loss is required for *selection* and *convex aggregation* properties
|
||||
|
||||
`r fontawesome::fa("exclamation", fill = col_orange)` QL is convex, but not exp-concave
|
||||
|
||||
`r fontawesome::fa("arrow-right", fill ="#000000")` The Bernstein Online Aggregation (BOA) lets us weaken the exp-concavity condition.
|
||||
|
||||
Convergence rates of BOA are:
|
||||
|
||||
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t. *selection* @gaillard2018efficient.
|
||||
|
||||
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t. *convex aggregation* @wintenberger2017optimal.
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
## Multivariate CRPS Learning
|
||||
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
Additionally, we extend the **B-Smooth** and **P-Smooth** procedures to the multivariate setting:
|
||||
|
||||
- Basis matrices for reducing
|
||||
- - the probabilistic dimension from $P$ to $\widetilde P$
|
||||
- - the multivariate dimension from $D$ to $\widetilde D$
|
||||
::: {.column width="45%"}
|
||||
|
||||
|
||||
- Hat matrices
|
||||
- - penalized smoothing across P and D dimensions
|
||||
We extend the **B-Smooth** and **P-Smooth** procedures to the multivariate setting:
|
||||
|
||||
We utilize the mean Pinball Score over the entire space for hyperparameter optimization (e.g., $\lambda$)
|
||||
::: {.panel-tabset}
|
||||
|
||||
:::
|
||||
## Penalized Smoothing
|
||||
|
||||
::: {.column width="2%"}
|
||||
Let $\boldsymbol{\psi}^{\text{mv}}=(\psi_1,\ldots, \psi_{D})$ and $\boldsymbol{\psi}^{\text{pr}}=(\psi_1,\ldots, \psi_{P})$ be two sets of bounded basis functions on $(0,1)$:
|
||||
|
||||
:::
|
||||
\begin{equation*}
|
||||
\boldsymbol w_{t,k} = \boldsymbol{\psi}^{\text{mv}} \boldsymbol{b}_{t,k} {\boldsymbol{\psi}^{\text{pr}}}'
|
||||
\end{equation*}
|
||||
|
||||
::: {.column width="48%"}
|
||||
with parameter matrix $\boldsymbol b_{t,k}$. The latter is estimated by penalized $L_2$-smoothing, which minimizes
|
||||
|
||||
*Basis Smoothing*
|
||||
\begin{align}
|
||||
& \| \boldsymbol{\beta}_{t,d, k}' \boldsymbol{\varphi}^{\text{pr}} - \boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}} \|^2_2 + \lambda^{\text{pr}} \| \mathcal{D}_{q} (\boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}}) \|^2_2 + \nonumber \\
|
||||
& \| \boldsymbol{\beta}_{t, p, k}' \boldsymbol{\varphi}^{\text{mv}} - \boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}} \|^2_2 + \lambda^{\text{mv}} \| \mathcal{D}_{q} (\boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}}) \|^2_2 \nonumber
|
||||
\end{align}
|
||||
|
||||
Represent weights as linear combinations of bounded basis functions:
|
||||
with differential operator $\mathcal{D}_q$ of order $q$
|
||||
|
||||
[{{< fa calculator >}}]{style="color:var(--col_green_10);"} We have an analytical solution.
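
To illustrate why a closed-form solution exists, the following sketch solves a simplified, univariate analogue of the problem. It assumes a B-spline basis from `splines::bs()` and replaces the differential operator $\mathcal{D}_q$ with a discrete second-order difference penalty on the coefficients (a P-spline style substitute, not the exact criterion above). All names are hypothetical:

```r
set.seed(7)

# Noisy pointwise weights over the probability grid (illustrative data)
p_grid <- seq(0.01, 0.99, by = 0.01)
w_raw <- plogis(4 * (p_grid - 0.5)) + rnorm(length(p_grid), sd = 0.1)

# Drop spline attributes so that only a plain numeric matrix remains
as_plain <- function(m) matrix(as.numeric(m), nrow(m), ncol(m))

# B-spline basis matrix (length(p_grid) x 20)
B <- as_plain(splines::bs(p_grid, df = 20, intercept = TRUE))

# Second-order difference matrix acting on the coefficients (18 x 20)
D2 <- diff(diag(ncol(B)), differences = 2)

# Closed-form penalized least squares: b = (B'B + lambda * D2'D2)^{-1} B' w
fit_penalized <- function(lambda) {
  b <- solve(crossprod(B) + lambda * crossprod(D2), crossprod(B, w_raw))
  list(coef = as.vector(b), fitted = as.vector(B %*% b))
}

fit_rough <- fit_penalized(0)              # no penalty
fit_smooth <- fit_penalized(100)           # strong penalty, smoother weights
```

Each value of $\lambda$ only requires solving one linear system, which keeps a grid search over smoothing penalties cheap.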
|
||||
|
||||
## Basis Smoothing
|
||||
|
||||
Linear combinations of bounded basis functions:
|
||||
|
||||
\begin{equation}
|
||||
\underbrace{\boldsymbol w_{t,k}}_{D \text{ x } P} = \sum_{j=1}^{\widetilde D} \sum_{l=1}^{\widetilde P} \beta_{t,j,l,k} \varphi^{\text{mv}}_{j} \varphi^{\text{pr}}_{l} = \underbrace{\boldsymbol \varphi^{\text{mv}}}_{D\text{ x }\widetilde D} \boldsymbol \beta_{t,k} \underbrace{{\boldsymbol\varphi^{\text{pr}}}'}_{\widetilde P \text{ x }P} \nonumber
|
||||
@@ -1750,42 +1537,15 @@ If $\widetilde P = P$ it holds that $\boldsymbol \varphi^{pr} = \boldsymbol{I}$
|
||||
|
||||
For $\widetilde P = 1$ we receive constant weights
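
The dimensions involved in the basis representation can be sketched as follows. The example assumes B-spline bases from `splines::bs()`, random coefficients, and illustrative sizes ($D = 24$, $P = 99$); none of these choices are taken from the original code:

```r
set.seed(3)

D <- 24; P <- 99                           # e.g. hours and quantile levels (illustrative)
D_red <- 6; P_red <- 5                     # reduced dimensions (hypothetical)

# Drop spline attributes so that only a plain numeric matrix remains
as_plain <- function(m) matrix(as.numeric(m), nrow(m), ncol(m))

phi_mv <- as_plain(splines::bs(seq_len(D), df = D_red, intercept = TRUE))            # D x D_red
phi_pr <- as_plain(splines::bs(seq_len(P) / (P + 1), df = P_red, intercept = TRUE))  # P x P_red

# Coefficient matrix beta_{t,k} for one expert at one time step (D_red x P_red)
beta <- matrix(runif(D_red * P_red), D_red, P_red)

# Weights on the full grid: w = phi_mv %*% beta %*% t(phi_pr), a D x P matrix
w <- phi_mv %*% beta %*% t(phi_pr)

# Special case P_red = 1 with a constant basis: weights do not vary over P
phi_const <- matrix(1, nrow = P, ncol = 1)
beta_const <- matrix(runif(D_red), D_red, 1)
w_const <- phi_mv %*% beta_const %*% t(phi_const)

dim(w)
```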
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
## Multivariate CRPS Learning
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
**Penalized smoothing:**
|
||||
|
||||
Let $\boldsymbol{\psi}^{\text{mv}}=(\psi_1,\ldots, \psi_{D})$ and $\boldsymbol{\psi}^{\text{pr}}=(\psi_1,\ldots, \psi_{P})$ be two sets of bounded basis functions on $(0,1)$:
|
||||
|
||||
\begin{equation}
|
||||
\boldsymbol w_{t,k} = \boldsymbol{\psi}^{\text{mv}} \boldsymbol{b}_{t,k} {\boldsymbol{\psi}^{\text{pr}}}'
|
||||
\end{equation}
|
||||
|
||||
with parameter matrix $\boldsymbol b_{t,k}$. The latter is estimated by penalized $L_2$-smoothing, which minimizes
|
||||
|
||||
\begin{align}
|
||||
& \| \boldsymbol{\beta}_{t,d, k}' \boldsymbol{\varphi}^{\text{pr}} - \boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}} \|^2_2 + \lambda^{\text{pr}} \| \mathcal{D}_{q} (\boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}}) \|^2_2 + \nonumber \\
|
||||
& \| \boldsymbol{\beta}_{t, p, k}' \boldsymbol{\varphi}^{\text{mv}} - \boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}} \|^2_2 + \lambda^{\text{mv}} \| \mathcal{D}_{q} (\boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}}) \|^2_2 \nonumber
|
||||
\end{align}
|
||||
|
||||
with differential operator $\mathcal{D}_q$ of order $q$
|
||||
|
||||
Computation is easy since we have an analytical solution.
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
::: {.column width="53%"}
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "1000px", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/algorithm.svg")
|
||||
@@ -1841,63 +1601,6 @@ Computation Time: ~30 Minutes
|
||||
|
||||
::::
|
||||
|
||||
## Special Cases
|
||||
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
::: {.panel-tabset}
|
||||
|
||||
## Constant
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/constant.svg")
|
||||
```
|
||||
|
||||
## Constant PR
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/constant_pr.svg")
|
||||
```
|
||||
|
||||
## Constant MV
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/constant_mv.svg")
|
||||
```
|
||||
|
||||
::::
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
::: {.panel-tabset}
|
||||
|
||||
## Pointwise
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/pointwise.svg")
|
||||
```
|
||||
|
||||
## Smooth
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/smooth_best.svg")
|
||||
```
|
||||
|
||||
::::
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
## Results
|
||||
|
||||
:::: {.columns}
|
||||
@@ -2040,7 +1743,41 @@ table_performance %>%
|
||||
|
||||
::: {.column width = "45%"}
|
||||
|
||||
Foo
|
||||
<br/>
|
||||
|
||||
::: {.panel-tabset}
|
||||
|
||||
## Constant
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/constant.svg")
|
||||
```
|
||||
|
||||
## Pointwise
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/pointwise.svg")
|
||||
```
|
||||
|
||||
## B Constant PR
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/constant_pr.svg")
|
||||
```
|
||||
|
||||
## B Constant MV
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/constant_mv.svg")
|
||||
```
|
||||
|
||||
## Smooth.Forget
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
|
||||
knitr::include_graphics("assets/mcrps_learning/smooth_best.svg")
|
||||
```
|
||||
|
||||
::::
|
||||
|
||||
:::
|
||||
|
||||
@@ -2048,7 +1785,11 @@ Foo
|
||||
|
||||
## Results
|
||||
|
||||
```{r, warning=FALSE, fig.align="center", echo=FALSE, fig.width=12, fig.height=6, cache = TRUE}
|
||||
::: {.panel-tabset}
|
||||
|
||||
## Chosen Parameters
|
||||
|
||||
```{r, warning=FALSE, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
|
||||
load("assets/mcrps_learning/pars_data.rds")
|
||||
pars_data %>%
|
||||
ggplot(aes(x = dates, y = value)) +
|
||||
@@ -2085,9 +1826,9 @@ pars_data %>%
|
||||
)
|
||||
```
|
||||
|
||||
## Results: Hour 16:00-17:00
|
||||
## Weights: Hour 16:00-17:00
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=6, cache = TRUE}
|
||||
```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
|
||||
load("assets/mcrps_learning/weights_h.rds")
|
||||
weights_h %>%
|
||||
ggplot(aes(date, q, fill = weight)) +
|
||||
@@ -2125,9 +1866,9 @@ weights_h %>%
|
||||
scale_y_continuous(breaks = c(0.1, 0.5, 0.9))
|
||||
```
|
||||
|
||||
## Results: Median
|
||||
## Weights: Median
|
||||
|
||||
```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=6, cache = TRUE}
|
||||
```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
|
||||
load("assets/mcrps_learning/weights_q.rds")
|
||||
weights_q %>%
|
||||
mutate(hour = as.numeric(hour) - 1) %>%
|
||||
@@ -2166,51 +1907,9 @@ weights_q %>%
|
||||
scale_y_continuous(breaks = c(0, 8, 16, 24))
|
||||
```
|
||||
|
||||
## Profoc R Package
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
### Probabilistic Forecast Combination - profoc
|
||||
|
||||
Available on [Github](https://github.com/BerriJ/profoc) and [CRAN](https://CRAN.R-project.org/package=profoc)
|
||||
|
||||
Main Function: `online()` for online learning.
|
||||
- Works with multivariate and/or probabilistic data
|
||||
- Implements BOA, ML-POLY, EWA (and the gradient versions)
|
||||
- Implements many extensions like smoothing, forgetting, thresholding, etc.
|
||||
- Various loss functions are available
|
||||
- Various methods (`predict`, `update`, `plot`, etc.)
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="2%"}
|
||||
|
||||
:::
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
### Speed
|
||||
|
||||
Large parts of profoc are implemented in C++.
|
||||
|
||||
<center>
|
||||
<img src="assets/mcrps_learning/profoc_langs.png">
|
||||
</center>
|
||||
|
||||
We use `Rcpp`, `RcppArmadillo`, and OpenMP.
|
||||
|
||||
We use `Rcpp` modules to expose a class to R
|
||||
- Offers great flexibility for the end-user
|
||||
- Requires very little knowledge of C++ code
|
||||
- High-Level interface is easy to use
|
||||
|
||||
:::
|
||||
|
||||
::::
|
||||
|
||||
## Profoc - B-Spline Basis
|
||||
## Non-Equidistant Knots
|
||||
|
||||
::: {.panel-tabset}
|
||||
|
||||
@@ -2315,8 +2014,8 @@ chart = {
|
||||
});
|
||||
|
||||
// Build SVG
|
||||
const width = 800;
|
||||
const height = 400;
|
||||
const width = 1200;
|
||||
const height = 450;
|
||||
const margin = {top: 40, right: 20, bottom: 40, left: 40};
|
||||
const innerWidth = width - margin.left - margin.right;
|
||||
const innerHeight = height - margin.top - margin.bottom;
|
||||
@@ -2347,15 +2046,6 @@ chart = {
|
||||
.attr("preserveAspectRatio", "xMidYMid meet")
|
||||
.attr("style", "max-width: 100%; height: auto;");
|
||||
|
||||
// Add chart title
|
||||
// svg.append("text")
|
||||
// .attr("class", "chart-title")
|
||||
// .attr("x", width / 2)
|
||||
// .attr("y", 20)
|
||||
// .attr("text-anchor", "middle")
|
||||
// .attr("font-size", "20px")
|
||||
// .attr("font-weight", "bold");
|
||||
|
||||
// Create the chart group
|
||||
const g = svg.append("g")
|
||||
.attr("transform", `translate(${margin.left},${margin.top})`);
|
||||
@@ -2372,20 +2062,6 @@ chart = {
|
||||
.call(d3.axisLeft(y).ticks(5))
|
||||
.style("font-size", "20px");
|
||||
|
||||
// Add axis labels
|
||||
// g.append("text")
|
||||
// .attr("x", innerWidth / 2)
|
||||
// .attr("y", innerHeight + 35)
|
||||
// .attr("text-anchor", "middle")
|
||||
// .text("x");
|
||||
|
||||
// g.append("text")
|
||||
// .attr("transform", "rotate(-90)")
|
||||
// .attr("x", -innerHeight / 2)
|
||||
// .attr("y", -30)
|
||||
// .attr("text-anchor", "middle")
|
||||
// .text("y");
|
||||
|
||||
// Add a horizontal line at y = 0
|
||||
g.append("line")
|
||||
.attr("x1", 0)
|
||||
@@ -2482,23 +2158,27 @@ TODO: Add actual algorithm to backup slides
|
||||
|
||||
## Wrap-Up
|
||||
|
||||
|
||||
:::: {.columns}
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
The [`r fontawesome::fa("github")` profoc](https://profoc.berrisch.biz/) R Package:
|
||||
[{{< fa triangle-exclamation >}}]{style="color:var(--col_red_9);"} Potential Downsides:
|
||||
|
||||
Profoc is a flexible framework for online learning.
|
||||
- Pointwise optimization can induce quantile crossing
|
||||
- Can be solved by sorting the predictions
|
||||
|
||||
- It implements several algorithms
|
||||
- It implements several loss functions
|
||||
- It implements several extensions
|
||||
- Its high- and low-level interfaces offer great flexibility
|
||||
[{{< fa magnifying-glass >}}]{style="color:var(--col_orange_9);"} Important:
|
||||
|
||||
Profoc is fast.
|
||||
- The choice of the learning rate is crucial
|
||||
- The loss function has to meet certain criteria
|
||||
|
||||
- The core components are written in C++
|
||||
- The core components utilize OpenMP for parallelization
|
||||
[{{< fa rocket >}}]{style="color:var(--col_green_9);"} Upsides:
|
||||
|
||||
- Pointwise learning outperforms the Naive solution significantly
|
||||
- Online learning is much faster than batch methods
|
||||
- Smoothing further improves the predictive performance
|
||||
- Asymptotically not worse than the best convex combination
|
||||
|
||||
:::
|
||||
|
||||
@@ -2508,17 +2188,21 @@ Profoc is fast.
|
||||
|
||||
::: {.column width="48%"}
|
||||
|
||||
Multivariate Extension:
|
||||
The [`r fontawesome::fa("github")` profoc](https://profoc.berrisch.biz/) R Package:
|
||||
|
||||
- Code is available now
|
||||
- [Pre-Print](https://arxiv.org/abs/2303.10019) is available now
|
||||
- Implements all algorithms discussed above
|
||||
- Is written using RcppArmadillo `r fontawesome::fa("arrow-right", fill ="#000000")` it's fast
|
||||
- Accepts vectors for most parameters
|
||||
- The best parameter combination is chosen online
|
||||
- Implements
|
||||
- Forgetting, Fixed Share
|
||||
- Different loss functions + gradients
|
||||
|
||||
Get these slides:
|
||||
Publications:
|
||||
|
||||
<center>
|
||||
<img src="assets/mcrps_learning/web_pres.png">
|
||||
</center>
|
||||
[https://berrisch.biz/slides/23_06_ecmi/](https://berrisch.biz/slides/23_06_ecmi/)
|
||||
[{{< fa newspaper >}}]{style="color:var(--col_grey_7);"} Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. CRPS learning. *Journal of Econometrics*, 237(2), 105221.
|
||||
|
||||
[{{< fa newspaper >}}]{style="color:var(--col_grey_7);"} Berrisch, J., & Ziel, F. [-@BERRISCH20241568]. Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. *International Journal of Forecasting*, 40(4), 1568-1586.
|
||||
|
||||
:::
|
||||
|
||||
@@ -2526,7 +2210,7 @@ Get these slides:
|
||||
|
||||
# Modeling Volatility and Dependence of European Carbon and Energy Prices
|
||||
|
||||
TODO: Add Reference
|
||||
Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). *Finance Research Letters*, 52, 103503.
|
||||
|
||||
---
|
||||
|
||||
|
||||