Finalize CRPS + MCRS begin working on VolDep

2025-05-25 17:24:06 +02:00
parent 301eccdd5b
commit e4021df6d2
3 changed files with 164 additions and 448 deletions
--- a/index.qmd
+++ b/index.qmd
@@ -97,6 +97,8 @@ col_yellow <- "#FCE135"

 # CRPS Learning

+Berrisch, J., & Ziel, F. (2023). *Journal of Econometrics*, 237(2), 105221.
+
 ## Motivation

 :::: {.columns}
@@ -1022,21 +1024,17 @@ The same simulation carried out for different algorithms (1000 runs):

 ## Study Forget

-::::
-
-## Simulation Study
-
 :::: {.columns}

-::: {.column width="48%"}
+::: {.column width="38%"}

-**New DGP:**
+#### New DGP:

-\begin{align}
-    Y_t               & \sim \mathcal{N}\left(\frac{\sin(0.005 \pi t )}{2},\,1\right) \\
-    \widehat{X}_{t,1} & \sim      \widehat{F}_{1}  = \mathcal{N}(-1,\,1)              \\
-    \widehat{X}_{t,2} & \sim       \widehat{F}_{2}  = \mathcal{N}(3,\,4) \label{eq_dgp_sim2}
-\end{align}
+\begin{align*}
+  Y_t               &\sim \mathcal{N}\left(\frac{\sin(0.005 \pi t )}{2},\,1\right) \\
+  \widehat{X}_{t,1} &\sim      \widehat{F}_{1}  = \mathcal{N}(-1,\,1)              \\
+  \widehat{X}_{t,2} &\sim       \widehat{F}_{2}  = \mathcal{N}(3,\,4)
+\end{align*}

 `r fontawesome::fa("arrow-right", fill ="#000000")` Changing optimal weights

@@ -1044,20 +1042,25 @@ The same simulation carried out for different algorithms (1000 runs):

 `r fontawesome::fa("arrow-right", fill ="#000000")` No forgetting leads to long-term constant weights

-:::
-
-::: {.column width="2%"}
+<center>
+<img src="assets/crps_learning/forget.png">
+</center>

 :::

-::: {.column width="48%"}
+::: {.column width="4%"}
+
+:::
+
+::: {.column width="58%"}
+
+### &nbsp;

-**Weights of expert 2**

 ```{r, echo = FALSE, fig.width=7, fig.height=5, fig.align='center', cache = TRUE}
-load("assets/crps_learning/changing_weights.rds")
-mod_labs <- c("Optimum", "Pointwise", "Smooth", "Constant")
-names(mod_labs) <- c("TOptimum", "Pointwise", "Smooth", "Constant")
+load("assets/crps_learning/weights_preprocessed.rda")
+mod_labs <- c("Optimum", "No Forget\nPointwise", "No Forget\nP-Smooth", "Forget\nPointwise", "Forget\nP-Smooth")
+names(mod_labs) <- c("Optimum", "nf_ptw", "nf_psmth", "f_ptw", "f_psmth")
 colseq <- c(grey(.99), "orange", "red", "purple", "blue", "darkblue", "black")
 weights_preprocessed %>%
  mutate(w = 1 - w) %>%
@@ -1084,19 +1087,10 @@ weights_preprocessed %>%

 ::::

-## Simulation Results
-
-The simulation using the new DGP carried out for different algorithms (1000 runs):
-
-<center>
-<img src="assets/crps_learning/algos_changing.gif">
-</center>
+::::

 ## Possible Extensions

-:::: {.columns}
-
-::: {.column width="48%"}

 **Forgetting**

@@ -1117,34 +1111,20 @@ The simulation using the new DGP carried out for different algorithms (1000 runs
    \label{fixed_share_simple}.
 \end{align*}

-:::
+TODO: Move these to the multivariate slides

-::: {.column width="2%"}
+## Application Study

-:::
+::: {.panel-tabset}

-::: {.column width="48%"}
-
-**Non-Equidistant Knots**
-
- Non-equidistant spline-basis could be used
- Potentially improves the tail-behavior
- Destroys shrinkage towards constant
-
-<center>
-<img src="assets/crps_learning/uneven_grid.gif">
-</center>
-
-:::
-
-::::
-
-## Application Study: Overview
+## Overview

 :::: {.columns}

 ::: {.column width="29%"}

+::: {style="font-size: 85%;"}
+
 Data:

 - Forecasting European emission allowances (EUA)
@@ -1160,6 +1140,8 @@ Tuning paramter grids:
 - Smoothing Penalty: $\Lambda= \{0\}\cup \{2^x|x\in \{-4,-3.5,\ldots,12\}\}$
 - Learning Rates: $\mathcal{E}= \{2^x|x\in \{-1,-0.5,\ldots,9\}\}$

+::::
+
 :::

 ::: {.column width="2%"}
@@ -1203,7 +1185,9 @@ overview

 ::::

-## Application Study: Experts
+## Experts
+
+::: {style="font-size: 90%;"}

 Simple exponential smoothing with additive errors (**ETS-ANN**):

@@ -1235,6 +1219,7 @@ ARIMA(0,1,0)-GARCH(1,1) with student-t errors (**I-GARCHt**):
 Y_{t} = \mu + Y_{t-1}  + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z_t, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \quad \text{and} \quad Z_t \sim t(0,1, \nu)
 \end{align*}

+::::

 ## Results

@@ -1242,6 +1227,8 @@ Y_{t} = \mu + Y_{t-1}  + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \

 ## Significance

+<br/>
+
 ```{r, echo = FALSE, fig.width=7, fig.height=5.5, fig.align='center', cache = TRUE, results='asis'}
 load("assets/crps_learning/bernstein_application_study_estimations+learnings_rev1.RData")

@@ -1494,247 +1481,47 @@ weights %>%

 ::::

-## Wrap-Up
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
-Potential Downsides:
-
- Pointwise optimization can induce quantile crossing
-  - Can be solved by sorting the predictions
-
-Upsides:
-
- Pointwise learning outperforms the Naive solution significantly
- Online learning is much faster than batch methods
- Smoothing further improves the predictive performance
- Asymptotically not worse than the best convex combination
-
-:::
-
-::: {.column width="2%"}
-
-:::
-
-::: {.column width="48%"}
-
-Important:
-
- The choice of the learning rate is crucial
- The loss function has to meet certain criteria
-
-The [`r fontawesome::fa("github")` profoc](https://profoc.berrisch.biz/) R Package:
-
- Implements all algorithms discussed above
- Is written using RcppArmadillo `r fontawesome::fa("arrow-right", fill ="#000000")` its fast
- Accepts vectors for most parameters
-  - The best parameter combination is chosen online
- Implements 
-  - Forgetting, Fixed Share
-  - Different loss functions + gradients
-
-:::
-
 ::::

-<!-- :::: {.notes}
-
-Execution Times:
-
-T = 5000
-
-Opera:
-
-Ml-Poly > 157 ms
-Boa     > 212 ms
-
-Profoc:
-
-Ml-Poly > 17
-BOA     > 16 -->
-
 # Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices

---
+Berrisch, J., & Ziel, F. (2024). *International Journal of Forecasting*, 40(4), 1568-1586.

-## Outline
-
-</br>
-
-**Multivariate CRPS Learning**
-
- Introduction
- Smoothing procedures
- Application to multivariate electricity price forecasts
-
-**The `profoc` R package**
-
- Package overview
- Implementation details
- Illustrative examples
-
-## The Framework of Prediction under Expert Advice
-
-### The sequential framework
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
-Each day, $t = 1, 2, ... T$
-
- The **forecaster** receives predictions $\widehat{X}_{t,k}$ from $K$ **experts**
- The **forecaster** assings weights $w_{t,k}$ to each **expert**
- The **forecaster** calculates her prediction:
-
-$$\widetilde{X}_{t}=\sum_{k=1}^K w_{t,k}\widehat{X}_{t,k}$$
-
- The realization for $t$ is observed
-
-:::
-
-::: {.column width="2%"}
-
-:::
-
-::: {.column width="48%"}
-
- The experts can be institutions, persons, or models
- The forecasts can be point-forecasts (i.e., mean or median) or full predictive distributions
- We do not need any assumptions concerning the underlying data
- @cesa2006prediction
-
-:::
-
-::::
-
-## The Regret
-
-Weights are updated sequentially according to the past performance of the $K$ experts.
-
-`r fontawesome::fa("arrow-right", fill ="#000000")` A loss function $\ell$ is needed (to compute the **cumulative regret** $R_{t,k}$)
-
-\begin{equation}
-    R_{t,k}  = \widetilde{L}_{t} - \widehat{L}_{t,k} =  \sum_{i = 1}^t \ell(\widetilde{X}_{i},Y_i) - \ell(\widehat{X}_{i,k},Y_i)
-    \label{eq_regret}
-\end{equation}
-
-The cumulative regret:
- Indicates the predictive accuracy of expert $k$ until time $t$.
- Measures how much the forecaster *regrets* not having followed the expert's advice
-
-Popular loss functions for point forecasting @gneiting2011making:
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
- $\ell_2$-loss $\ell_2(x, y) = | x -y|^2$
-  - optimal for mean prediction 
-
-:::
-
-::: {.column width="2%"}
-
-:::
-
-::: {.column width="48%"}
-
- $\ell_1$-loss $\ell_1(x, y) = | x -y|$ 
-  - optimal for median predictions 
-
-:::
-
-::::
-
---
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
-### Probabilistic Setting
-
-An appropriate loss:
-
-\begin{align*}
-    \text{CRPS}(F, y) & = \int_{\mathbb{R}} {(F(x) - \mathbb{1}\{ x > y \})}^2 dx
-    \label{eq_crps}
-\end{align*}
-
-It's strictly proper @gneiting2007strictly.
-
-Using the CRPS, we can calculate time-adaptive weights $w_{t,k}$. However, what if the experts' performance varies in parts of the distribution? 
-
-`r fontawesome::fa("lightbulb", fill = col_yellow)` Utilize this relation:
-
-\begin{align*}
-    \text{CRPS}(F, y) = 2 \int_0^{1}  \text{QL}_p(F^{-1}(p), y) \, d p.
-    \label{eq_crps_qs}
-\end{align*}
-
-... to combine quantiles of the probabilistic forecasts individually using the quantile-loss QL.
-
-:::
-
-::: {.column width="2%"}
-
-:::
-
-::: {.column width="48%"}
-
-### Optimal Convergence
-
-</br>
-
-`r fontawesome::fa("exclamation", fill = col_orange)` exp-concavity of the loss is required for *selection* and *convex aggregation* properties 
-
-`r fontawesome::fa("exclamation", fill = col_orange)` QL is convex, but not exp-concave 
-
-`r fontawesome::fa("arrow-right", fill ="#000000")` The Bernstein Online Aggregation (BOA) lets us weaken the exp-concavity condition.
-
-Convergence rates of BOA are:
-
-`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *selection* @gaillard2018efficient.
-
-`r fontawesome::fa("arrow-right", fill ="#000000")` Almost optimal w.r.t *convex aggregation* @wintenberger2017optimal.
-
-:::
-
-::::

 ## Multivariate CRPS Learning


 :::: {.columns}

-::: {.column width="48%"}
-
-Additionally, we extend the **B-Smooth** and **P-Smooth** procedures to the multivariate setting:
-
- Basis matrices for reducing 
- - the probabilistic dimension from $P$ to $\widetilde P$
- - the multivariate dimension from $D$ to $\widetilde D$
+::: {.column width="45%"}


- Hat matrices
- - penalized smoothing across P and D dimensions
+We extend the **B-Smooth** and **P-Smooth** procedures to the multivariate setting:

-We utilize the mean Pinball Score over the entire space for hyperparameter optimization (e.g, $\lambda$)
+::: {.panel-tabset}

-:::
+## Penalized Smoothing

-::: {.column width="2%"}
+Let $\boldsymbol{\psi}^{\text{mv}}=(\psi_1,\ldots, \psi_{D})$ and $\boldsymbol{\psi}^{\text{pr}}=(\psi_1,\ldots, \psi_{P})$ be two sets of bounded basis functions on $(0,1)$:

-:::
+\begin{equation*}
+  \boldsymbol w_{t,k} = \boldsymbol{\psi}^{\text{mv}} \boldsymbol{b}_{t,k} {\boldsymbol{\psi}^{\text{pr}}}'
+\end{equation*}

-::: {.column width="48%"}
+with parameter matrix $\boldsymbol b_{t,k}$. The latter is estimated by penalized $L_2$-smoothing, which minimizes

-*Basis Smoothing*
+\begin{align}
+   & \| \boldsymbol{\beta}_{t,d, k}' \boldsymbol{\varphi}^{\text{pr}}  - \boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}}  \|^2_2 + \lambda^{\text{pr}}  \| \mathcal{D}_{q}  (\boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}})  \|^2_2 +                       \nonumber \\
+   & \| \boldsymbol{\beta}_{t, p, k}' \boldsymbol{\varphi}^{\text{mv}}  - \boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}}  \|^2_2 + \lambda^{\text{mv}}  \| \mathcal{D}_{q}  (\boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}})  \|^2_2  \nonumber
+\end{align}

-Represent weights as linear combinations of bounded basis functions:
+with differential operator $\mathcal{D}_q$ of order $q$
+
+[{{< fa calculator >}}]{style="color:var(--col_green_10);"} We have an analytical solution.
+
+## Basis Smoothing
+
+Linear combinations of bounded basis functions:

 \begin{equation}
  \underbrace{\boldsymbol w_{t,k}}_{D \text{ x } P} = \sum_{j=1}^{\widetilde D} \sum_{l=1}^{\widetilde P} \beta_{t,j,l,k} \varphi^{\text{mv}}_{j} \varphi^{\text{pr}}_{l} = \underbrace{\boldsymbol \varphi^{\text{mv}}}_{D\text{ x }\widetilde D} \boldsymbol \beta_{t,k} \underbrace{{\boldsymbol\varphi^{\text{pr}}}'}_{\widetilde P \text{ x }P} \nonumber
@@ -1750,42 +1537,15 @@ If $\widetilde P = P$ it holds that $\boldsymbol \varphi^{pr} = \boldsymbol{I}$

 For $\widetilde P = 1$ we receive constant weights

-:::
-
 ::::

-## Multivariate CRPS Learning
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
-**Penalized smoothing:**
-
-Let $\boldsymbol{\psi}^{\text{mv}}=(\psi_1,\ldots, \psi_{D})$ and $\boldsymbol{\psi}^{\text{pr}}=(\psi_1,\ldots, \psi_{P})$ be two sets of bounded basis functions on $(0,1)$:
-
-\begin{equation}
-  \boldsymbol w_{t,k} = \boldsymbol{\psi}^{\text{mv}} \boldsymbol{b}_{t,k} {\boldsymbol{\psi}^{pr}}'
-\end{equation}
-
-with parameter matix $\boldsymbol b_{t,k}$. The latter is estimated to penalize $L_2$-smoothing which minimizes
-
-\begin{align}
-   & \| \boldsymbol{\beta}_{t,d, k}' \boldsymbol{\varphi}^{\text{pr}}  - \boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}}  \|^2_2 + \lambda^{\text{pr}}  \| \mathcal{D}_{q}  (\boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}})  \|^2_2 +                       \nonumber \\
-   & \| \boldsymbol{\beta}_{t, p, k}' \boldsymbol{\varphi}^{\text{mv}}  - \boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}}  \|^2_2 + \lambda^{\text{mv}}  \| \mathcal{D}_{q}  (\boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}})  \|^2_2  \nonumber
-\end{align}
-
-with differential operator $\mathcal{D}_q$ of order $q$
-
-Computation is easy since we have an analytical solution.
-
 :::

 ::: {.column width="2%"}

 :::

-::: {.column width="48%"}
+::: {.column width="53%"}

 ```{r, fig.align="center", echo=FALSE, out.width = "1000px", cache = TRUE}
 knitr::include_graphics("assets/mcrps_learning/algorithm.svg")
@@ -1841,63 +1601,6 @@ Computation Time: ~30 Minutes

 ::::

-## Special Cases 
-
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
-::: {.panel-tabset}
-
-## Constant
-
-```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
-knitr::include_graphics("assets/mcrps_learning/constant.svg")
-```
-
-## Constant PR
-
-```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
-knitr::include_graphics("assets/mcrps_learning/constant_pr.svg")
-```
-
-## Constant MV
-
-```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
-knitr::include_graphics("assets/mcrps_learning/constant_mv.svg")
-```
-
-::::
-
-:::
-
-::: {.column width="2%"}
-
-:::
-
-::: {.column width="48%"}
-
-::: {.panel-tabset}
-
-## Pointwise
-
-```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
-knitr::include_graphics("assets/mcrps_learning/pointwise.svg")
-```
-
-## Smooth
-
-```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
-knitr::include_graphics("assets/mcrps_learning/smooth_best.svg")
-```
-
-::::
-
-:::
-
-::::
-
 ## Results

 :::: {.columns}
@@ -2040,7 +1743,41 @@ table_performance %>%

 ::: {.column width = "45%"}

-Foo
+<br/>
+
+::: {.panel-tabset}
+
+## Constant
+
+```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
+knitr::include_graphics("assets/mcrps_learning/constant.svg")
+```
+
+## Pointwise
+
+```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
+knitr::include_graphics("assets/mcrps_learning/pointwise.svg")
+```
+
+## B Constant PR
+
+```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
+knitr::include_graphics("assets/mcrps_learning/constant_pr.svg")
+```
+
+## B Constant MV
+
+```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
+knitr::include_graphics("assets/mcrps_learning/constant_mv.svg")
+```
+
+## Smooth.Forget
+
+```{r, fig.align="center", echo=FALSE, out.width = "400", cache = TRUE}
+knitr::include_graphics("assets/mcrps_learning/smooth_best.svg")
+```
+
+::::

 :::

@@ -2048,7 +1785,11 @@ Foo

 ## Results

-```{r, warning=FALSE, fig.align="center", echo=FALSE, fig.width=12, fig.height=6, cache = TRUE}
+::: {.panel-tabset}
+
+## Chosen Parameters
+
+```{r, warning=FALSE, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
 load("assets/mcrps_learning/pars_data.rds")
 pars_data %>%
    ggplot(aes(x = dates, y = value)) +
@@ -2085,9 +1826,9 @@ pars_data %>%
    )
 ```

-## Results: Hour 16:00-17:00
+## Weights: Hour 16:00-17:00

-```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=6, cache = TRUE}
+```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
 load("assets/mcrps_learning/weights_h.rds")
 weights_h %>%
        ggplot(aes(date, q, fill = weight)) +
@@ -2125,9 +1866,9 @@ weights_h %>%
    scale_y_continuous(breaks = c(0.1, 0.5, 0.9))
 ```

-## Results: Median
+## Weights: Median

-```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=6, cache = TRUE}
+```{r, fig.align="center", echo=FALSE, fig.width=12, fig.height=5.5, cache = TRUE}
 load("assets/mcrps_learning/weights_q.rds")
 weights_q %>%
    mutate(hour = as.numeric(hour) - 1) %>%
@@ -2166,51 +1907,9 @@ weights_q %>%
    scale_y_continuous(breaks = c(0, 8, 16, 24))
 ```

-## Profoc R Package
-
-:::: {.columns}
-
-::: {.column width="48%"}
-
-### Probabilistic Forecast Combination - profoc 
-
-Available on [Github](https://github.com/BerriJ/profoc) and [CRAN](https://CRAN.R-project.org/package=profoc)
-
-Main Function: `online()` for online learning.
- Works with multivariate and/or probabilistic data
- Implements BOA, ML-POLY, EWA (and the gradient versions)
- Implements many extensions like smoothing, forgetting, thresholding, etc.
- Various loss functions are available 
- Various methods (`predict`, `update`, `plot`, etc.)
-
-:::
-
-::: {.column width="2%"}
-
-:::
-
-::: {.column width="48%"}
-
-### Speed
-
-Large parts of profoc are implemented in C++.
-
-<center>
-<img src="assets/mcrps_learning/profoc_langs.png">
-</center>
-
-We use `Rcpp`, `RcppArmadillo`, and OpenMP.
-
-We use `Rcpp` modules to expose a class to R
- Offers great flexibility for the end-user
- Requires very little knowledge of C++ code
- High-Level interface is easy to use
-
-:::
-
 ::::

-## Profoc - B-Spline Basis
+## Non-Equidistant Knots

 ::: {.panel-tabset}

@@ -2315,8 +2014,8 @@ chart = {
    });

   // Build SVG
-   const width = 800;
-   const height = 400;
+   const width = 1200;
+   const height = 450;
   const margin = {top: 40, right: 20, bottom: 40, left: 40};
   const innerWidth = width - margin.left - margin.right;
   const innerHeight = height - margin.top - margin.bottom;
@@ -2347,15 +2046,6 @@ chart = {
     .attr("preserveAspectRatio", "xMidYMid meet")
     .attr("style", "max-width: 100%; height: auto;");
   
-   // Add chart title
-  //  svg.append("text")
-  //    .attr("class", "chart-title")
-  //    .attr("x", width / 2)
-  //    .attr("y", 20)
-  //    .attr("text-anchor", "middle")
-  //    .attr("font-size", "20px")
-  //    .attr("font-weight", "bold");
-   
   // Create the chart group
   const g = svg.append("g")
     .attr("transform", `translate(${margin.left},${margin.top})`);
@@ -2372,20 +2062,6 @@ chart = {
     .call(d3.axisLeft(y).ticks(5))
     .style("font-size", "20px");
   
-   // Add axis labels
-  //  g.append("text")
-  //    .attr("x", innerWidth / 2)
-  //    .attr("y", innerHeight + 35)
-  //    .attr("text-anchor", "middle")
-  //    .text("x");
-   
-  //  g.append("text")
-  //    .attr("transform", "rotate(-90)")
-  //    .attr("x", -innerHeight / 2)
-  //    .attr("y", -30)
-  //    .attr("text-anchor", "middle")
-  //    .text("y");
-   
   // Add a horizontal line at y = 0
   g.append("line")
     .attr("x1", 0)
@@ -2482,23 +2158,27 @@ TODO: Add actual algorithm to backup slides

 ## Wrap-Up

+
 :::: {.columns}

 ::: {.column width="48%"}

-  The [`r fontawesome::fa("github")` profoc](https://profoc.berrisch.biz/) R Package:
+[{{< fa triangle-exclamation >}}]{style="color:var(--col_red_9);"} Potential Downsides:

-Profoc is a flexible framework for online learning.
+- Pointwise optimization can induce quantile crossing
+  - Can be solved by sorting the predictions

- It implements several algorithms
- It implements several loss functions
- It implements several extensions
- Its high- and low-level interfaces offer great flexibility
+[{{< fa magnifying-glass >}}]{style="color:var(--col_orange_9);"} Important:

-Profoc is fast.
+- The choice of the learning rate is crucial
+- The loss function has to meet certain criteria

- The core components are written in C++
- The core components utilize OpenMP for parallelization
+[{{< fa rocket >}}]{style="color:var(--col_green_9);"} Upsides:
+
+- Pointwise learning outperforms the Naive solution significantly
+- Online learning is much faster than batch methods
+- Smoothing further improves the predictive performance
+- Asymptotically not worse than the best convex combination

 :::

@@ -2508,17 +2188,21 @@ Profoc is fast.

 ::: {.column width="48%"}

-Multivariate Extension:
+The [`r fontawesome::fa("github")` profoc](https://profoc.berrisch.biz/) R Package:

- Code is available now
- [Pre-Print](https://arxiv.org/abs/2303.10019) is available now
+- Implements all algorithms discussed above
+- Is written using RcppArmadillo `r fontawesome::fa("arrow-right", fill ="#000000")` it's fast
+- Accepts vectors for most parameters
+  - The best parameter combination is chosen online
+- Implements 
+  - Forgetting, Fixed Share
+  - Different loss functions + gradients

-Get these slides:
+Publications:

-<center>
-<img src="assets/mcrps_learning/web_pres.png">
-</center>
-[https://berrisch.biz/slides/23_06_ecmi/](https://berrisch.biz/slides/23_06_ecmi/)
+[{{< fa newspaper >}}]{style="color:var(--col_grey_7);"} Berrisch, J., & Ziel, F. [-@BERRISCH2023105221]. CRPS learning. *Journal of Econometrics*, 237(2), 105221.
+
+[{{< fa newspaper >}}]{style="color:var(--col_grey_7);"} Berrisch, J., & Ziel, F. [-@BERRISCH20241568]. Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. *International Journal of Forecasting*, 40(4), 1568-1586.

 :::

@@ -2526,7 +2210,7 @@ Get these slides:

 # Modeling Volatility and Dependence of European Carbon and Energy Prices

-TODO: Add Reference
+Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). *Finance Research Letters*, 52, 103503.

 ---