Add PNGs to version control

2025-05-24 22:30:15 +02:00
parent 5c98bf64ae
commit 37f76d4f1d
12 changed files with 710 additions and 29 deletions


@@ -17,9 +17,13 @@ format:
smaller: true
fig-format: svg
slide-number: true
self-contained-math: true
crossrefs-hover: true
execute:
daemon: false
highlight-style: github
bibliography: assets/library.bib
csl: apa-old-doi-prefix.csl
---
## Outline
@@ -284,7 +288,7 @@ Each day, $t = 1, 2, \dots, T$
- The experts can be institutions, persons, or models
- The forecasts can be point-forecasts (i.e., mean or median) or full predictive distributions
- We do not need any assumptions concerning the underlying data (see the combination sketch below)
- `r Citet(my_bib, "cesa2006prediction")`
- @cesa2006prediction
:::
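A minimal sketch of the combination step $\widetilde{X}_{t}=\sum_{k=1}^K w_{t,k}\widehat{X}_{t,k}$; the numbers and names are illustrative, not part of the slides:

```r
# Convex combination of K expert forecasts with weights w_t (summing to 1):
combine <- function(experts_t, w_t) {
  sum(w_t * experts_t)
}

experts_t <- c(102.5, 98.1, 100.7)  # K = 3 expert forecasts for day t
w_t <- rep(1 / 3, 3)                # uniform weights before any learning
combine(experts_t, w_t)             # the forecaster's prediction for day t
```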
@@ -307,7 +311,7 @@ The cumulative regret:
- Indicates the predictive accuracy of expert $k$ until time $t$.
- Measures how much the forecaster *regrets* not having followed the expert's advice (see the sketch below)
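A hedged sketch of the cumulative regret, using squared error as one of the point-forecasting losses discussed next (all function names are illustrative):

```r
# Cumulative regret of the forecaster against expert k:
# R_{t,k} = sum over i <= t of loss(forecaster_i, y_i) - loss(expert_i, y_i)
squared_loss <- function(x, y) (x - y)^2

cumulative_regret <- function(forecaster, expert_k, y) {
  cumsum(squared_loss(forecaster, y) - squared_loss(expert_k, y))
}
```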
Popular loss functions for point forecasting `r Citet(my_bib, "gneiting2011making")`:
Popular loss functions for point forecasting @gneiting2011making:
:::: {.columns}
@@ -366,7 +370,7 @@ with $q\geq 2$ and $x_{+}$ the vector of positive parts of $x$.
## Optimality
In stochastic settings, the cumulative risk should be analyzed `r Citet(my_bib, "wintenberger2017optimal")`:
In stochastic settings, the cumulative risk should be analyzed @wintenberger2017optimal:
\begin{align}
\underbrace{\widetilde{\mathcal{R}}_t = \sum_{i=1}^t \mathbb{E}[\ell(\widetilde{X}_{i},Y_i)|\mathcal{F}_{i-1}]}_{\text{Cumulative Risk of Forecaster}} \qquad\qquad\qquad \text{ and } \qquad\qquad\qquad
@@ -411,7 +415,7 @@ The forecaster is asymptotically not worse than the best convex combination $\wi
Satisfying the convexity property \eqref{eq_opt_conv} comes at the cost of a slower attainable convergence rate.
According to `r Citet(my_bib, "wintenberger2017optimal")`, an algorithm has optimal rates with respect to selection \eqref{eq_opt_select} and convex aggregation \eqref{eq_opt_conv} if
According to @wintenberger2017optimal, an algorithm has optimal rates with respect to selection \eqref{eq_opt_select} and convex aggregation \eqref{eq_opt_conv} if
\begin{align}
\frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) & =
@@ -432,11 +436,11 @@ Algorithms can satisfy both \eqref{eq_optp_select} and \eqref{eq_optp_conv} dep
## Optimality
According to `r Citet(my_bib, "cesa2006prediction")` EWA \eqref{eq_ewa_general} satisfies the optimal selection convergence \eqref{eq_optp_select} in a deterministic setting if the:
According to @cesa2006prediction EWA \eqref{eq_ewa_general} satisfies the optimal selection convergence \eqref{eq_optp_select} in a deterministic setting if the:
- Loss $\ell$ is exp-concave
- Learning-rate $\eta$ is chosen correctly (see the weight sketch below)
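Since the general form \eqref{eq_ewa_general} is not fully visible in this diff, here is a hedged sketch of the classical EWA weight update; `cum_loss` and the stabilising shift are assumptions:

```r
# EWA weights: w_{t,k} proportional to exp(-eta * cumulative loss of expert k)
ewa_weights <- function(cum_loss, eta) {
  w <- exp(-eta * (cum_loss - min(cum_loss)))  # shift avoids numerical underflow
  w / sum(w)                                    # normalise to a probability vector
}

ewa_weights(cum_loss = c(10.2, 9.8, 11.0), eta = 0.5)  # best expert gets most weight
```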
These results can be transferred to stochastic i.i.d. settings `r Citet(my_bib, "kakade2008generalization")` `r Citet(my_bib, "gaillard2014second")`.
These results can be transferred to stochastic i.i.d. settings @kakade2008generalization, @gaillard2014second.
The optimal convex aggregation convergence \eqref{eq_optp_conv} can be satisfied by applying the kernel trick, which linearizes the loss:
\begin{align}
@@ -465,7 +469,7 @@ We apply Bernstein Online Aggregation (BOA). It lets us weaken the exp-concavity
\label{eq_crps}
\end{align*}
It's strictly proper `r Citet(my_bib, "gneiting2007strictly")`.
It's strictly proper @gneiting2007strictly.
Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance is not uniform over all parts of the distribution?
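A sketch of how the CRPS in \eqref{eq_crps} can be approximated from predictive samples, assuming the `scoringRules` package (which the deck itself does not reference); the sample-based estimator is one of several options:

```r
library(scoringRules)

set.seed(1)
samples <- matrix(rnorm(1000, mean = 0.3, sd = 1), nrow = 1)  # draws from expert k's predictive distribution
crps_sample(y = 0, dat = samples)                              # CRPS evaluated at the observation y = 0
```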
@@ -513,7 +517,7 @@ For convex losses, BOAG satisfies that there exists a $C>0$ such that for $x>0$ i
1-e^{x}
\label{eq_boa_opt_conv}
\end{equation}
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *convex aggregation* \eqref{eq_optp_conv} `r Citet(my_bib, "wintenberger2017optimal")`.
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *convex aggregation* \eqref{eq_optp_conv} @wintenberger2017optimal.
The same algorithm satisfies that there exists a $C>0$ such that for $x>0$ it holds that
\begin{equation}
@@ -551,7 +555,7 @@ for all $x_1,x_2 \in \mathbb{R}$ and $t>0$ that
\mathbb{E}\left[ \left. \left( \alpha(\ell'(x_1, Y_t)(x_1 - x_2))^{2}\right)^{1/\beta} \right|\mathcal{F}_{t-1}\right]
\end{align*}
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *selection* \eqref{eq_optp_select} `r Citet(my_bib, "gaillard2018efficient")`.
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *selection* \eqref{eq_optp_select} @gaillard2018efficient.
:::
@@ -609,7 +613,7 @@ $\mathcal{Q}_p'' = f.$
Additionally, if $f$ is a continuous Lebesgue density with $f\geq\gamma>0$ for some constant $\gamma>0$ on its support $\text{spt}(f)$, then $\mathcal{Q}_p$ is $\gamma$-strongly convex.
Strong convexity with $\beta=1$ implies **A2** `r fontawesome::fa("check", fill ="#ffa600")` `r Citet(my_bib, "gaillard2018efficient")`
Strong convexity with $\beta=1$ implies **A2** `r fontawesome::fa("check", fill ="#ffa600")` @gaillard2018efficient
:::
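For concreteness, a sketch of the quantile (pinball) loss; that this is the $\mathcal{Q}_p$ referred to above is an assumption based on the usual convention:

```r
# Pinball loss at probability level p; its conditional expectation in q is
# minimised at the p-quantile, with second derivative f(q) as stated above.
pinball <- function(q, y, p) {
  ((y < q) - p) * (q - y)
}

pinball(q = 1.2, y = 1.0, p = 0.5)  # loss of a median forecast q for outcome y
```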
@@ -947,7 +951,7 @@ The simulation using the new DGP was carried out for different algorithms (1000 runs
R_{t,k} & = R_{t-1,k}(1-\xi) + \ell(\widetilde{F}_{t},Y_t) - \ell(\widehat{F}_{t,k},Y_t) \label{eq_regret_forget}
\end{align*}
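A one-line sketch of the recursion in \eqref{eq_regret_forget}; the argument names are illustrative:

```r
# Regret update with forgetting: xi in [0, 1) exponentially downweights the past.
update_regret <- function(R_prev, loss_forecaster, loss_expert_k, xi) {
  R_prev * (1 - xi) + loss_forecaster - loss_expert_k
}
```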
**Fixed Shares** `r Citet(my_bib, "herbster1998tracking")`
**Fixed Shares** @herbster1998tracking
- Adding fixed shares to the weights
- Shrinkage towards a constant solution (see the sketch below)
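A sketch of the fixed-shares step of @herbster1998tracking; the mixing-parameter name `alpha` is an assumption:

```r
# Mix the weights with the uniform distribution: shrinkage towards 1/K.
fixed_shares <- function(w, alpha) {
  (1 - alpha) * w + alpha / length(w)
}

fixed_shares(w = c(0.70, 0.25, 0.05), alpha = 0.1)  # no expert's weight can vanish
```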
@@ -1457,7 +1461,7 @@ $$\widetilde{X}_{t}=\sum_{k=1}^K w_{t,k}\widehat{X}_{t,k}$$
- The experts can be institutions, persons, or models
- The forecasts can be point-forecasts (i.e., mean or median) or full predictive distributions
- We do not need any assumptions concerning the underlying data
- `r Citet(my_bib, "cesa2006prediction")`
- @cesa2006prediction
:::
@@ -1478,7 +1482,7 @@ The cumulative regret:
- Indicates the predictive accuracy of expert $k$ until time $t$.
- Measures how much the forecaster *regrets* not having followed the expert's advice
Popular loss functions for point forecasting `r Citet(my_bib, "gneiting2011making")`:
Popular loss functions for point forecasting @gneiting2011making:
:::: {.columns}
@@ -1517,7 +1521,7 @@ An appropriate loss:
\label{eq_crps}
\end{align*}
It's strictly proper `r Citet(my_bib, "gneiting2007strictly")`.
It's strictly proper @gneiting2007strictly.
Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance varies in parts of the distribution?
@@ -1550,9 +1554,9 @@ Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what
Convergence rates of BOA are:
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *selection* `r Citet(my_bib, "gaillard2018efficient")`.
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *selection* @gaillard2018efficient.
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *convex aggregation* `r Citet(my_bib, "wintenberger2017optimal")`.
`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *convex aggregation* @wintenberger2017optimal.
:::
@@ -1657,7 +1661,7 @@ knitr::include_graphics("assets/mcrps_learning/algorithm.svg")
#### Data
- Day-Ahead electricity price forecasts from `r Citet(my_bib, "marcjasz2022distributional")`
- Day-Ahead electricity price forecasts from @marcjasz2022distributional
- Produced using probabilistic neural networks
- 24-dimensional distributional forecasts
- Distribution assumptions: JSU and Normal
@@ -3119,18 +3123,10 @@ Accounting for heteroscedasticity or stabilizing the variance via log transforma
<img src="assets/voldep/frame.png">
</center>
`r fontawesome::fa("newspaper")` `r Citet(my_bib, "berrisch2023modeling")`
`r fontawesome::fa("newspaper")` @berrisch2023modeling
:::
::::
## References
::: {.scrollable}
```{r refs1, echo=FALSE, results="asis"}
PrintBibliography(my_bib, .opts = list(style = "text"))
```
::::
## References