diff --git a/.gitignore b/.gitignore index 6f7f8bc..1923afb 100644 --- a/.gitignore +++ b/.gitignore @@ -83,6 +83,5 @@ data/* # Ignore html files for now # TODO: Remove later -index.html index_cache index_files diff --git a/index.html b/index.html new file mode 100644 index 0000000..8721b74 --- /dev/null +++ b/index.html @@ -0,0 +1,3513 @@ + +
+ + + + + + + + + + + + + ++ University of Duisburg-Essen, House of Energy, Climate, and Finance +
+2025-06-30
+\[ +\newcommand{\A}{{\mathbb A}} +\]
+Artwork by @allison_horst
| + + | ++Energy market liberalization created complex, interconnected trading systems + | +
| + + | ++Renewable energy transition introduces uncertainty and volatility from weather-dependent generation + | +
| + + | ++Traditional point forecasts are inadequate for modern energy markets with increasing uncertainty + | +
| + + | ++Risk inherently is a probabilistic concept + | +
| + + | ++Probabilistic forecasting essential for risk management, planning and decision making in volatile energy environments + | +
| + + | ++Online learning methods needed for fast-updating models with streaming energy data + | +

| + + | ++Berrisch, J., & Ziel, F. (2023). CRPS learning. Journal of Econometrics, 237(2), 105221. + | +
| + + | ++Berrisch, J., & Ziel, F. (2024). Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. International Journal of Forecasting, 40(4), 1568–1586. + | +
| + + | ++Hirsch, S., Berrisch, J., & Ziel, F. (2024). Online Distributional Regression. arXiv preprint arXiv:2407.08750. + | +
| + + | ++Berrisch, J., & Ziel, F. (2022). Distributional modeling and forecasting of natural gas prices. Journal of Forecasting, 41(6), 1065–1086. + | +
| + + | ++Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). Modeling volatility and dependence of European carbon and energy prices. Finance Research Letters, 52, 103503. + | +
| + + | ++Berrisch, J., Narajewski, M., & Ziel, F. (2023). High-resolution peak demand estimation using generalized additive models and deep neural networks. Energy and AI, 13, 100236. + | +
| + + | ++Berrisch, J. (2025). rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support. arXiv preprint arXiv:2501.15856. + | +
| + + | ++Berrisch, J., & Ziel, F. (2023). CRPS learning. Journal of Econometrics, 237(2), 105221. + | +
| + + | ++Berrisch, J., & Ziel, F. (2024). Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. International Journal of Forecasting, 40(4), 1568–1586. + | +
| + + | ++Hirsch, S., Berrisch, J., & Ziel, F. (2024). Online Distributional Regression. arXiv preprint arXiv:2407.08750. + | +
| + + | ++Berrisch, J., & Ziel, F. (2022). Distributional modeling and forecasting of natural gas prices. Journal of Forecasting, 41(6), 1065–1086. + | +
| + + | ++Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). Modeling volatility and dependence of European carbon and energy prices. Finance Research Letters, 52, 103503. + | +
| + + | ++Berrisch, J., Narajewski, M., & Ziel, F. (2023). High-resolution peak demand estimation using generalized additive models and deep neural networks. Energy and AI, 13, 100236. + | +
| + + | ++Berrisch, J. (2025). rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support. arXiv preprint arXiv:2501.15856. + | +
| + + | ++Berrisch, J., & Ziel, F. (2023). CRPS learning. Journal of Econometrics, 237(2), 105221. + | +
| + + | ++Berrisch, J., & Ziel, F. (2024). Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. International Journal of Forecasting, 40(4), 1568–1586. + | +
| + + | ++Hirsch, S., Berrisch, J., & Ziel, F. (2024). Online Distributional Regression. arXiv preprint arXiv:2407.08750. + | +
| + + | ++Berrisch, J., & Ziel, F. (2022). Distributional modeling and forecasting of natural gas prices. Journal of Forecasting, 41(6), 1065–1086. + | +
| + + | ++Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). Modeling volatility and dependence of European carbon and energy prices. Finance Research Letters, 52, 103503. + | +
| + + | ++Berrisch, J., Narajewski, M., & Ziel, F. (2023). High-resolution peak demand estimation using generalized additive models and deep neural networks. Energy and AI, 13, 100236. + | +
| + + | ++Berrisch, J. (2025). rcpptimer: Rcpp Tic-Toc Timer with OpenMP Support. arXiv preprint arXiv:2501.15856. + | +
Reduces estimation time by 2-3 orders of magnitude
+Maintainins competitive forecasting accuracy
+Real-World Validation in Energy Markets
+ +Predict high-resolution electricity peaks using only low-resolution data
+Combine GAMs and DNN’s for superior accuracy
+Won Western Power Distribution Competition
+Won Best-Student-Presentation Award
+Berrisch, J., & Ziel, F. (2023). Journal of Econometrics, 237(2), 105221.
+The Idea:
+Combine multiple forecasts instead of choosing one
Combination weights may vary over time, over the distribution or both
2 Popular options for combining distributions:
+Each day, \(t = 1, 2, ... T\)
+Weights are updated sequentially according to the past performance of the \(K\) experts.
+That is, a loss function \(\ell\) is needed. This is used to compute the cumulative regret \(R_{t,k}\)
+\[\begin{equation} + R_{t,k} = \widetilde{L}_{t} - \widehat{L}_{t,k} = \sum_{i = 1}^t \ell(\widetilde{X}_{i},Y_i) - \ell(\widehat{X}_{i,k},Y_i)\label{eq:regret} +\end{equation}\]
+The cumulative regret:
+Popular loss functions for point forecasting Gneiting (2011):
+\(\ell_2\) loss:
+\[\begin{equation} + \ell_2(x, y) = | x -y|^2 \label{eq:elltwo} +\end{equation}\]
+Strictly proper for mean prediction
+\(\ell_1\) loss:
+\[\begin{equation} + \ell_1(x, y) = | x -y| \label{eq:ellone} +\end{equation}\]
+Strictly proper for median predictions
+\[\begin{equation} +w_{t,k}^{\text{Naive}} = \frac{1}{K}\label{eq:naive_combination} +\end{equation}\]
+\[\begin{equation} + \begin{aligned} + w_{t,k}^{\text{EWA}} & = \frac{e^{\eta R_{t,k}} }{\sum_{k = 1}^K e^{\eta R_{t,k}}}\\ + & = + \frac{e^{-\eta \ell(\widehat{X}_{t,k},Y_t)} w^{\text{EWA}}_{t-1,k} }{\sum_{k = 1}^K e^{-\eta \ell(\widehat{X}_{t,k},Y_t)} w^{\text{EWA}}_{t-1,k}} + \end{aligned}\label{eq:exp_combination} +\end{equation}\]
+In stochastic settings, the cumulative Risk should be analyzed Wintenberger (2017): \[\begin{align} + &\underbrace{\widetilde{\mathcal{R}}_t = \sum_{i=1}^t \mathbb{E}[\ell(\widetilde{X}_{i},Y_i)|\mathcal{F}_{i-1}]}_{\text{Cumulative Risk of Forecaster}} \\ + &\underbrace{\widehat{\mathcal{R}}_{t,k} = \sum_{i=1}^t \mathbb{E}[\ell(\widehat{X}_{i,k},Y_i)|\mathcal{F}_{i-1}]}_{\text{Cumulative Risk of Experts}} + \label{eq_def_cumrisk} +\end{align}\]
+\[\begin{equation} + \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) \stackrel{t\to \infty}{\rightarrow} a \quad \text{with} \quad a \leq 0. + \label{eq_opt_select} +\end{equation}\] The forecaster is asymptotically not worse than the best expert \(\widehat{\mathcal{R}}_{t,\min}\).
+\[\begin{equation} + \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\pi} \right) \stackrel{t\to \infty}{\rightarrow} b \quad \text{with} \quad b \leq 0 . + \label{eq_opt_conv} +\end{equation}\] The forecaster is asymptotically not worse than the best convex combination \(\widehat{X}_{t,\pi}\) in hindsight (oracle).
+Optimal rates with respect to selection \(\eqref{eq_opt_select}\) and convex aggregation \(\eqref{eq_opt_conv}\) Wintenberger (2017):
+\[\begin{align} + \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) & = + \mathcal{O}\left(\frac{\log(K)}{t}\right)\label{eq_optp_select} +\end{align}\]
+\[\begin{align} + \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\pi} \right) & = + \mathcal{O}\left(\sqrt{\frac{\log(K)}{t}}\right) + \label{eq_optp_conv} +\end{align}\]
+Algorithms can statisfy both \(\eqref{eq_optp_select}\) and \(\eqref{eq_optp_conv}\) depending on:
+EWA satisfies optimal selection convergence \(\eqref{eq_optp_select}\) in a deterministic setting if:
+Those results can be converted to any stochastic setting Wintenberger (2017).
+Optimal convex aggregation convergence \(\eqref{eq_optp_conv}\) can be satisfied by applying the kernel-trick:
+\[\begin{align} +\ell^{\nabla}(x,y) = \ell'(\widetilde{X},y) x +\end{align}\]
+\(\ell'\) is the subgradient of \(\ell\) at forecast combination \(\widetilde{X}\).
+An appropriate choice:
+\[\begin{equation*} + \text{CRPS}(F, y) = \int_{\mathbb{R}} {(F(x) - \mathbb{1}\{ x > y \})}^2 dx \label{eq:crps} +\end{equation*}\]
+It’s strictly proper Gneiting & Raftery (2007).
+Using the CRPS, we can calculate time-adaptive weights \(w_{t,k}\). However, what if the experts’ performance varies in parts of the distribution?
+Utilize this relation:
+\[\begin{equation*} + \text{CRPS}(F, y) = 2 \int_0^{1} \text{QL}_p(F^{-1}(p), y) dp.\label{eq_crps_qs} +\end{equation*}\]
+… to combine quantiles of the probabilistic forecasts individually using the quantile-loss QL.
+QL is convex, but not exp-concave
+Bernstein Online Aggregation (BOA) lets us weaken the exp-concavity condition. It satisfies that there exist a \(C>0\) such that for \(x>0\) it holds that
+\[\begin{equation} + P\left( \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\pi} \right) \leq C \log(\log(t)) \left(\sqrt{\frac{\log(K)}{t}} + \frac{\log(K)+x}{t}\right) \right) \geq + 1-e^{-x} + \label{eq_boa_opt_conv} +\end{equation}\]
+Almost optimal w.r.t. convex aggregation \(\eqref{eq_optp_conv}\) Wintenberger (2017).
+The same algorithm satisfies that there exist a \(C>0\) such that for \(x>0\) it holds that \[\begin{equation} + P\left( \frac{1}{t}\left(\widetilde{\mathcal{R}}_t - \widehat{\mathcal{R}}_{t,\min} \right) \leq + C\left(\frac{\log(K)+\log(\log(Gt))+ x}{\alpha t}\right)^{\frac{1}{2-\beta}} \right) \geq + 1-2e^{-x} + \label{eq_boa_opt_select} +\end{equation}\]
+if \(Y_t\) is bounded, the considered loss \(\ell\) is convex \(G\)-Lipschitz and weak exp-concave in its first coordinate.
+Almost optimal w.r.t. selection \(\eqref{eq_optp_select}\) Gaillard & Wintenberger (2018).
+We show that this holds for QL under feasible conditions.
+Lemma 1
+\[\begin{align} + 2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\min} + & \leq \widehat{\mathcal{R}}^{\text{CRPS}}_{t,\min} + \label{eq_risk_ql_crps_expert} \\ + 2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\pi} + & \leq \widehat{\mathcal{R}}^{\text{CRPS}}_{t,\pi} . + \label{eq_risk_ql_crps_convex} +\end{align}\]
+Pointwise can outperform constant procedures
+QL is convex but not exp-concave:
+ Almost optimal convergence w.r.t. convex aggregation \(\eqref{eq_boa_opt_conv}\)
For almost optimal congerence w.r.t. selection \(\eqref{eq_boa_opt_select}\) we need to check A1 and A2:
+QL is Lipschitz continuous:
+A1 holds
+A1
+For some \(G>0\) it holds for all \(x_1,x_2\in \mathbb{R}\) and \(t>0\) that
+\[ | \ell(x_1, Y_t)-\ell(x_2, Y_t) | \leq G |x_1-x_2|\]
+A2 For some \(\alpha>0\), \(\beta\in[0,1]\) it holds for all \(x_1,x_2 \in \mathbb{R}\) and \(t>0\) that
+\[\begin{align*} + \mathbb{E}[ + & \ell(x_1, Y_t)-\ell(x_2, Y_t) | \mathcal{F}_{t-1}] \leq \\ + & \mathbb{E}[ \ell'(x_1, Y_t)(x_1 - x_2) |\mathcal{F}_{t-1}] \\ + & + + \mathbb{E}\left[ \left. \left( \alpha(\ell'(x_1, Y_t)(x_1 - x_2))^{2}\right)^{1/\beta} \right|\mathcal{F}_{t-1}\right] +\end{align*}\]
+Almost optimal w.r.t. selection \(\eqref{eq_optp_select}\) Gaillard & Wintenberger (2018).
+Conditional quantile risk: \(\mathcal{Q}_p(x) = \mathbb{E}[ \text{QL}_p(x, Y_t) | \mathcal{F}_{t-1}]\).
+convexity properties of \(\mathcal{Q}_p\) depend on the conditional distribution \(Y_t|\mathcal{F}_{t-1}\).
+Proposition 1
+Let \(Y\) be a univariate random variable with (Radon-Nikodym) \(\nu\)-density \(f\), then for the second subderivative of the quantile risk \(\mathcal{Q}_p(x) = \mathbb{E}[ \text{QL}_p(x, Y) ]\) of \(Y\) it holds for all \(p\in(0,1)\) that \(\mathcal{Q}_p'' = f.\) Additionally, if \(f\) is a continuous Lebesgue-density with \(f\geq\gamma>0\) for some constant \(\gamma>0\) on its support \(\text{spt}(f)\) then is \(\mathcal{Q}_p\) is \(\gamma\)-strongly convex.
+Strong convexity with \(\beta=1\) implies weak exp-concavity A2 Gaillard & Wintenberger (2018)
+ A1 and A2 give us almost optimal convergence w.r.t. selection \(\eqref{eq_boa_opt_select}\)
Theorem 1
+The gradient based fully adaptive Bernstein online aggregation (BOAG) applied pointwise for all \(p\in(0,1)\) on \(\text{QL}\) satisfies \(\eqref{eq_boa_opt_conv}\) with minimal CRPS given by
+\[\widehat{\mathcal{R}}_{t,\pi} = 2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\pi}.\]
+If \(Y_t|\mathcal{F}_{t-1}\) is bounded and has a pdf \(f_t\) satifying \(f_t>\gamma >0\) on its support \(\text{spt}(f_t)\) then \(\ref{eq_boa_opt_select}\) holds with \(\beta=1\) and
+\[\widehat{\mathcal{R}}_{t,\min} = 2\overline{\widehat{\mathcal{R}}}^{\text{QL}}_{t,\min}\].
+Simple Example:
+\[\begin{align} + Y_t & \sim \mathcal{N}(0,\,1) \\ + \widehat{X}_{t,1} & \sim \widehat{F}_{1} = \mathcal{N}(-1,\,1) \\ + \widehat{X}_{t,2} & \sim \widehat{F}_{2} = \mathcal{N}(3,\,4) + \label{eq:dgp_sim1} +\end{align}\]
+Penalized cubic B-Splines for smoothing weights:
+Let \(\varphi=(\varphi_1,\ldots, \varphi_L)\) be bounded basis functions on \((0,1)\) Then we approximate \(w_{t,k}\) by
+\[\begin{align} +w_{t,k}^{\text{smooth}} = \sum_{l=1}^L \beta_l \varphi_l = \beta'\varphi +\end{align}\]
+with parameter vector \(\beta\). The latter is estimated to penalize \(L_2\)-smoothing which minimizes
+\[\begin{equation} + \| w_{t,k} - \beta' \varphi \|^2_2 + \lambda \| \mathcal{D}^{d} (\beta' \varphi) \|^2_2 + \label{eq_function_smooth} +\end{equation}\]
+with differential operator \(\mathcal{D}\)
+Computation is easy, since we have an analytical solution
+We receive the constant solution for high values of \(\lambda\) when setting \(d=1\)
+
+Represent weights as linear combinations of bounded basis functions:
+\[\begin{equation} + w_{t,k} = \sum_{l=1}^L \beta_{t,k,l} \varphi_l = \boldsymbol \beta_{t,k}' \boldsymbol \varphi +\end{equation}\]
+A popular choice are are B-Splines as local basis functions
+\(\boldsymbol \beta_{t,k}\) is calculated using a reduced regret matrix:
+\[\begin{equation} + \underbrace{\boldsymbol r_{t}}_{\text{LxK}} = \frac{L}{P} \underbrace{\boldsymbol B'}_{\text{LxP}} \underbrace{\left({\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)}_{\text{PxK}} +\end{equation}\]
+\(\boldsymbol r_{t}\) is transformed from PxK to LxK
+If \(L = P\) it holds that \(\boldsymbol \varphi = \boldsymbol{I}\) For \(L = 1\) we receive constant weights
+Weights converge to the constant solution if \(L\rightarrow 1\)
+
+Array of expert predicitons: \(\widehat{X}_{t,p,k}\)
+Vector of Prediction targets: \(Y_t\)
+Starting Weights: \(\boldsymbol w_0=(w_{0,1},\ldots, w_{0,K})\)
+Penalization parameter: \(\lambda\geq 0\)
+B-spline and penalty matrices \(\boldsymbol B\) and \(\boldsymbol D\) on \(\mathcal{P}= (p_1,\ldots,p_M)\)
+Hat matrix: \[\boldsymbol{\mathcal{H}} = \boldsymbol B(\boldsymbol B'\boldsymbol B+ \lambda (\alpha \boldsymbol D_1'\boldsymbol D_1 + (1-\alpha) \boldsymbol D_2'\boldsymbol D_2))^{-1} \boldsymbol B'\]
+Cumulative Regret: \(R_{0,k} = 0\)
+Range parameter: \(E_{0,k}=0\)
+Starting pseudo-weights: \(\boldsymbol \beta_0 = \boldsymbol B^{\text{pinv}}\boldsymbol w_0(\boldsymbol{\mathcal{P}})\)
+for( t in 1:T ) {
+\(\widetilde{\boldsymbol X}_{t} = \text{Sort}\left( \boldsymbol w_{t-1}'(\boldsymbol P) \widehat{\boldsymbol X}_{t} \right)\) # Prediction
+\(\boldsymbol r_{t} = \frac{L}{M} \boldsymbol B' \left({\boldsymbol{QL}}_{\boldsymbol{\mathcal P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\boldsymbol{\mathcal P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)\)
+\(\boldsymbol E_{t} = \max(\boldsymbol E_{t-1}, \boldsymbol r_{t}^+ + \boldsymbol r_{t}^-)\)
+\(\boldsymbol V_{t} = \boldsymbol V_{t-1} + \boldsymbol r_{t}^{ \odot 2}\)
+\(\boldsymbol \eta_{t} =\min\left( \left(-\log(\boldsymbol \beta_{0}) \odot \boldsymbol V_{t}^{\odot -1} \right)^{\odot\frac{1}{2}} , \frac{1}{2}\boldsymbol E_{t}^{\odot-1}\right)\)
+\(\boldsymbol R_{t} = \boldsymbol R_{t-1}+ \boldsymbol r_{t} \odot \left( \boldsymbol 1 - \boldsymbol \eta_{t} \odot \boldsymbol r_{t} \right)/2 + \boldsymbol E_{t} \odot \mathbb{1}\{-2\boldsymbol \eta_{t}\odot \boldsymbol r_{t} > 1\}\)
+\(\boldsymbol \beta_{t} = K \boldsymbol \beta_{0} \odot \boldsymbol {SoftMax}\left( - \boldsymbol \eta_{t} \odot \boldsymbol R_{t} + \log( \boldsymbol \eta_{t}) \right)\)
+\(\boldsymbol w_{t}(\boldsymbol P) = \underbrace{\boldsymbol B(\boldsymbol B'\boldsymbol B+ \lambda (\alpha \boldsymbol D_1'\boldsymbol D_1 + (1-\alpha) \boldsymbol D_2'\boldsymbol D_2))^{-1} \boldsymbol B'}_{\boldsymbol{\mathcal{H}}} \boldsymbol B \boldsymbol \beta_{t}\)
+}
+Data Generating Process of the simple probabilistic example:
+\[\begin{align*} + Y_t &\sim \mathcal{N}(0,\,1)\\ + \widehat{X}_{t,1} &\sim \widehat{F}_{1}=\mathcal{N}(-1,\,1) \\ + \widehat{X}_{t,2} &\sim \widehat{F}_{2}=\mathcal{N}(3,\,4) +\end{align*}\]
+Deviation from best attainable QL (1000 runs).
+
CRPS Values for different \(\lambda\) (1000 runs)
+
CRPS for different number of knots (1000 runs)
+
The same simulation carried out for different algorithms (1000 runs):
+
+\[\begin{align*} + Y_t &\sim \mathcal{N}\left(\frac{\sin(0.005 \pi t )}{2},\,1\right) \\ + \widehat{X}_{t,1} &\sim \widehat{F}_{1} = \mathcal{N}(-1,\,1) \\ + \widehat{X}_{t,2} &\sim \widehat{F}_{2} = \mathcal{N}(3,\,4) +\end{align*}\]
+Changing optimal weights
+Single run example depicted aside
+No forgetting leads to long-term constant weights
+
+Data:
+Combination methods:
+Tuning paramter grids:
+Simple exponential smoothing with additive errors (ETS-ANN):
+\[\begin{align*} +Y_{t} = l_{t-1} + \varepsilon_t \quad \text{with} \quad l_t = l_{t-1} + \alpha \varepsilon_t \quad \text{and} \quad \varepsilon_t \sim \mathcal{N}(0,\sigma^2) +\end{align*}\]
+Quantile regression (QuantReg): For each \(p \in \mathcal{P}\) we assume:
+\[\begin{align*} +F^{-1}_{Y_t}(p) = \beta_{p,0} + \beta_{p,1} Y_{t-1} + \beta_{p,2} |Y_{t-1}-Y_{t-2}| +\end{align*}\]
+ARIMA(1,0,1)-GARCH(1,1) with Gaussian errors (ARMA-GARCH):
+\[\begin{align*} +Y_{t} = \mu + \phi(Y_{t-1}-\mu) + \theta \varepsilon_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \quad \text{and} \quad Z_t \sim \mathcal{N}(0,1) +\end{align*}\]
+ARIMA(0,1,0)-I-EGARCH(1,1) with Gaussian errors (I-EGARCH):
+\[\begin{align*} +Y_{t} = \mu + Y_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z, \quad \log(\sigma_t^2) = \omega + \alpha Z_{t-1}+ \gamma (|Z_{t-1}|-\mathbb{E}|Z_{t-1}|) + \beta \log(\sigma_{t-1}^2) \quad \text{and} \quad Z_t \sim \mathcal{N}(0,1) +\end{align*}\]
+ARIMA(0,1,0)-GARCH(1,1) with student-t errors (I-GARCHt):
+\[\begin{align*} +Y_{t} = \mu + Y_{t-1} + \varepsilon_t \quad \text{with} \quad \varepsilon_t = \sigma_t Z, \quad \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \quad \text{and} \quad Z_t \sim t(0,1, \nu) +\end{align*}\]
+| ETS | +QuantReg | +ARMA-GARCH | +I-EGARCH | +I-GARCHt | +
|---|---|---|---|---|
| 2.101(>.999) | +1.358(>.999) | +0.52(0.993) | +0.511(0.999) | +-0.037(0.406) | +
| + | BOAG | +EWAG | +ML-PolyG | +BMA | +QRlin | +QRconv | +
|---|---|---|---|---|---|---|
| Pointwise | +-0.170(0.055) | +-0.089(0.175) | +-0.141(0.112) | +0.032(0.771) | +3.482(>.999) | +-0.019(0.309) | +
| B-Constant | +-0.118(0.146) | +-0.049(0.305) | +-0.090(0.218) | +0.038(0.834) | +4.002(>.999) | +0.539(0.996) | +
| P-Constant | +-0.138(0.020) | +-0.070(0.137) | +-0.133(0.026) | +0.039(0.851) | +5.275(>.999) | +0.009(0.683) | +
| B-Smooth | +-0.173(0.062) | +-0.065(0.276) | +-0.141(0.118) | +-0.042(0.386) | +- | +- | +
| P-Smooth | +-0.182(0.039) | +-0.107(0.121) | +-0.160(0.065) | +0.040(0.804) | +3.495(>.999) | +-0.012(0.369) | +
CRPS difference to Naive. Negative values correspond to better performance (the best value is bold).
Additionally, we show the p-value of the DM-test, testing against Naive. The cells are colored with respect to their values (the greener better).
Berrisch, J., & Ziel, F. (2024). International Journal of Forecasting, 40(4), 1568-1586.
+We extend the B-Smooth and P-Smooth procedures to the multivariate setting:
+Let \(\boldsymbol{\psi}^{\text{mv}}=(\psi_1,\ldots, \psi_{D})\) and \(\boldsymbol{\psi}^{\text{pr}}=(\psi_1,\ldots, \psi_{P})\) be two sets of bounded basis functions on \((0,1)\):
+\[\begin{equation*} + \boldsymbol w_{t,k} = \boldsymbol{\psi}^{\text{mv}} \boldsymbol{b}_{t,k} {\boldsymbol{\psi}^{pr}}' +\end{equation*}\]
+with parameter matix \(\boldsymbol b_{t,k}\). The latter is estimated to penalize \(L_2\)-smoothing which minimizes
+\[\begin{align} + & \| \boldsymbol{\beta}_{t,d, k}' \boldsymbol{\varphi}^{\text{pr}} - \boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}} \|^2_2 + \lambda^{\text{pr}} \| \mathcal{D}_{q} (\boldsymbol b_{t, d, k}' \boldsymbol{\psi}^{\text{pr}}) \|^2_2 + \nonumber \\ + & \| \boldsymbol{\beta}_{t, p, k}' \boldsymbol{\varphi}^{\text{mv}} - \boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}} \|^2_2 + \lambda^{\text{mv}} \| \mathcal{D}_{q} (\boldsymbol b_{t, p, k}' \boldsymbol{\psi}^{\text{mv}}) \|^2_2 \nonumber +\end{align}\]
+with differential operator \(\mathcal{D}_q\) of order \(q\)
+We have an analytical solution.
+Linear combinations of bounded basis functions:
+\[\begin{equation} + \underbrace{\boldsymbol w_{t,k}}_{D \text{ x } P} = \sum_{j=1}^{\widetilde D} \sum_{l=1}^{\widetilde P} \beta_{t,j,l,k} \varphi^{\text{mv}}_{j} \varphi^{\text{pr}}_{l} = \underbrace{\boldsymbol \varphi^{\text{mv}}}_{D\text{ x }\widetilde D} \boldsymbol \beta_{t,k} \underbrace{{\boldsymbol\varphi^{\text{pr}}}'}_{\widetilde P \text{ x }P} \nonumber +\end{equation}\]
+A popular choice: B-Splines
+\(\boldsymbol \beta_{t,k}\) is calculated using a reduced regret matrix:
+\(\underbrace{\boldsymbol r_{t,k}}_{\widetilde P \times \widetilde D} = \boldsymbol \varphi^{\text{pr}} \underbrace{\left({\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widetilde{\boldsymbol X}_{t},Y_t)- {\boldsymbol{QL}}_{\mathcal{P}}^{\nabla}(\widehat{\boldsymbol X}_{t},Y_t)\right)}_{\text{PxD}}\boldsymbol \varphi^{\text{mv}}\)
+If \(\widetilde P = P\) it holds that \(\boldsymbol \varphi^{pr} = \boldsymbol{I}\) (pointwise)
+For \(\widetilde P = 1\) we receive constant weights
+Evaluation: Exclude first 182 observations
+Extensions: Penalized smoothing | Forgetting
+Tuning strategies:
+Computation Time: ~30 Minutes
+| JSU1 | +JSU2 | +JSU3 | +JSU4 | +Norm1 | +Norm2 | +Norm3 | +Norm4 | +Naive | +
|---|---|---|---|---|---|---|---|---|
| 1.487 | +1.444 | +1.499 | +1.374 | +1.414 | +1.535 | +1.420 | +1.422 | +1.295 | +
| Description | +Parameter Tuning | +BOA | +ML-Poly | +EWA | +
|---|---|---|---|---|
| Constant | ++ | 1.2933 | +1.2966 | +1.3188 | +
| Pointwise | ++ | 1.2936 | +1.3010 | +1.3101 | +
| FTL | ++ | 1.3752 | +1.3692 | +1.3863 | +
| B Constant Pr | ++ | 1.2936 | +1.3000 | +1.3432 | +
| B Constant Mv | ++ | 1.2918 | +1.2945 | +1.3076 | +
| Forget | +Bayesian Fix | +1.2930 | +1.2956 | +1.3096 | +
| Full | +Bayesian Fix | +1.2905 | +1.2902 | +1.2870. | +
| Smooth.forget | +Bayesian Fix | +1.2911 | +1.2912 | +1.2869. | +
| Smooth | +Bayesian Fix | +1.2918 | +1.2917 | +1.2873. | +
| Forget | +Bayesian Online | +1.2855** | +1.2961 | +1.3098 | +
| Full | +Bayesian Online | +1.2919 | +1.2873. | +1.2873. | +
| Smooth.forget | +Bayesian Online | +1.2845** | +1.2862* | +1.2864. | +
| Smooth | +Bayesian Online | +1.2918 | +1.2918 | +1.2874. | +
| Forget | +Sampling Online | +1.2855** | +1.2961 | +1.3114 | +
| Full | +Sampling Online | +1.2886 | +1.2861* | +1.2873. | +
| Smooth.forget | +Sampling Online | +1.2845*** | +1.2867* | +1.2866. | +
| Smooth | +Sampling Online | +1.2918 | +1.2917 | +1.2877. | +
function updateChartInner(g, x, y, linesGroup, color, line, data) {
+ // Update axes with transitions
+ x.domain([0, d3.max(data, d => d.x)]);
+ g.select(".x-axis").transition().duration(1500).call(d3.axisBottom(x).ticks(10));
+ y.domain([0, d3.max(data, d => d.y)]);
+ g.select(".y-axis").transition().duration(1500).call(d3.axisLeft(y).ticks(5));
+
+ // Group data by basis function
+ const dataByFunction = Array.from(d3.group(data, d => d.b));
+ const keyFn = d => d[0];
+
+ // Update basis function lines
+ const u = linesGroup.selectAll("path").data(dataByFunction, keyFn);
+ u.join(
+ enter => enter.append("path").attr("fill","none").attr("stroke-width",3)
+ .attr("stroke", (_, i) => color(i)).attr("d", d => line(d[1].map(pt => ({x: pt.x, y: 0}))))
+ .style("opacity",0),
+ update => update,
+ exit => exit.transition().duration(1000).style("opacity",0).remove()
+ )
+ .transition().duration(1000)
+ .attr("d", d => line(d[1]))
+ .attr("stroke", (_, i) => color(i))
+ .style("opacity",1);
+}
+
+chart = {
+ // State variables for selected parameters
+ let selectedMu = 0.5;
+ let selectedSig = 1;
+ let selectedNonc = 0;
+ let selectedTailw = 1;
+ const filteredData = () => bsplineData.filter(d =>
+ Math.abs(selectedMu - d.mu) < 0.001 &&
+ d.sig === selectedSig &&
+ d.nonc === selectedNonc &&
+ d.tailw === selectedTailw
+ );
+ const container = d3.create("div")
+ .style("max-width", "none")
+ .style("width", "100%");;
+ const controlsContainer = container.append("div")
+ .style("display", "flex")
+ .style("gap", "20px");
+ // slider controls
+ const sliders = [
+ { label: 'Mu', get: () => selectedMu, set: v => selectedMu = v, min: 0.1, max: 0.9, step: 0.2 },
+ { label: 'Sigma', get: () => Math.log2(selectedSig), set: v => selectedSig = 2 ** v, min: -2, max: 2, step: 1 },
+ { label: 'Noncentrality', get: () => selectedNonc, set: v => selectedNonc = v, min: -4, max: 4, step: 2 },
+ { label: 'Tailweight', get: () => Math.log2(selectedTailw), set: v => selectedTailw = 2 ** v, min: -2, max: 2, step: 1 }
+ ];
+ // Build slider controls with D3 data join
+ const sliderCont = controlsContainer.selectAll('div').data(sliders).join('div')
+ .style('display','flex').style('align-items','center').style('gap','10px')
+ .style('flex','1').style('min-width','0px');
+ sliderCont.append('label').text(d => d.label + ':').style('font-size','20px');
+ sliderCont.append('input')
+ .attr('type','range').attr('min', d => d.min).attr('max', d => d.max).attr('step', d => d.step)
+ .property('value', d => d.get())
+ .on('input', function(event, d) {
+ const val = +this.value; d.set(val);
+ d3.select(this.parentNode).select('span').text(d.label.match(/Sigma|Tailweight/) ? 2**val : val);
+ updateChart(filteredData());
+ })
+ .style('width', '100%');
+ sliderCont.append('span').text(d => (d.label.match(/Sigma|Tailweight/) ? d.get() : d.get()))
+ .style('font-size','20px');
+
+ // Add Reset button to clear all sliders to their defaults
+ controlsContainer.append('button')
+ .text('Reset')
+ .style('font-size', '20px')
+ .style('align-self', 'center')
+ .style('margin-left', 'auto')
+ .on('click', () => {
+ // reset state vars
+ selectedMu = 0.5;
+ selectedSig = 1;
+ selectedNonc = 0;
+ selectedTailw = 1;
+ // update input positions
+ sliderCont.selectAll('input').property('value', d => d.get());
+ // update displayed labels
+ sliderCont.selectAll('span')
+ .text(d => d.label.match(/Sigma|Tailweight/) ? (2**d.get()) : d.get());
+ // redraw chart
+ updateChart(filteredData());
+ });
+
+ // Build SVG
+ const width = 1200;
+ const height = 450;
+ const margin = {top: 40, right: 20, bottom: 40, left: 40};
+ const innerWidth = width - margin.left - margin.right;
+ const innerHeight = height - margin.top - margin.bottom;
+
+ // Set controls container width to match SVG plot width
+ controlsContainer.style("max-width", "none").style("width", "100%");
+ // Distribute each control evenly and make sliders full-width
+ controlsContainer.selectAll("div").style("flex", "1").style("min-width", "0px");
+ controlsContainer.selectAll("input").style("width", "100%").style("box-sizing", "border-box");
+
+ // Create scales
+ const x = d3.scaleLinear()
+ .domain([0, 1])
+ .range([0, innerWidth]);
+
+ const y = d3.scaleLinear()
+ .domain([0, 1])
+ .range([innerHeight, 0]);
+
+ // Create a color scale for the basis functions
+ const color = d3.scaleOrdinal(d3.schemeCategory10);
+
+ // Create SVG
+ const svg = d3.create("svg")
+ .attr("width", "100%")
+ .attr("height", "auto")
+ .attr("viewBox", [0, 0, width, height])
+ .attr("preserveAspectRatio", "xMidYMid meet")
+ .attr("style", "max-width: 100%; height: auto;");
+
+ // Create the chart group
+ const g = svg.append("g")
+ .attr("transform", `translate(${margin.left},${margin.top})`);
+
+ // Add axes
+ const xAxis = g.append("g")
+ .attr("transform", `translate(0,${innerHeight})`)
+ .attr("class", "x-axis")
+ .call(d3.axisBottom(x).ticks(10))
+ .style("font-size", "20px");
+
+ const yAxis = g.append("g")
+ .attr("class", "y-axis")
+ .call(d3.axisLeft(y).ticks(5))
+ .style("font-size", "20px");
+
+ // Add a horizontal line at y = 0
+ g.append("line")
+ .attr("x1", 0)
+ .attr("x2", innerWidth)
+ .attr("y1", y(0))
+ .attr("y2", y(0))
+ .attr("stroke", "#000")
+ .attr("stroke-opacity", 0.2);
+
+ // Add gridlines
+ g.append("g")
+ .attr("class", "grid-lines")
+ .selectAll("line")
+ .data(y.ticks(5))
+ .join("line")
+ .attr("x1", 0)
+ .attr("x2", innerWidth)
+ .attr("y1", d => y(d))
+ .attr("y2", d => y(d))
+ .attr("stroke", "#ccc")
+ .attr("stroke-opacity", 0.5);
+
+ // Create a line generator
+ const line = d3.line()
+ .x(d => x(d.x))
+ .y(d => y(d.y))
+ .curve(d3.curveBasis);
+
+ // Group to contain the basis function lines
+ const linesGroup = g.append("g")
+ .attr("class", "basis-functions");
+
+ // Store the current basis functions for transition
+ let currentBasisFunctions = new Map();
+
+ // Function to update the chart with new data
+ function updateChart(data) {
+ updateChartInner(g, x, y, linesGroup, color, line, data);
+ }
+
+ // Store the update function
+ svg.node().update = updateChart;
+
+ // Initial render
+ updateChart(filteredData());
+
+ container.node().appendChild(svg.node());
+ return container.node();
+}Non-central beta distribution Johnson et al. (1995):
+\[\begin{equation*} + \mathcal{B}(x, a, b, c) = \sum_{j=0}^{\infty} e^{-c/2} \frac{\left( \frac{c}{2} \right)^j}{j!} I_x \left( a + j , b \right) +\end{equation*}\]
+Penalty and \(\lambda\) need to be adjusted accordingly Li & Cao (2022)
+Using non equidistant knots in profoc is straightforward:
Basis specification b_smooth_pr is internally passed to make_basis_mats().
Profoc adjusts penatly and \(\lambda\)
+Potential Downsides:
+Important:
+Upsides:
+The profoc R Package:
+Pubications:
+Berrisch, J., & Ziel, F. (2023). CRPS learning. Journal of Econometrics, 237(2), 105221.
+Berrisch, J., & Ziel, F. (2024). Multivariate probabilistic CRPS learning with an application to day-ahead electricity prices. International Journal of Forecasting, 40(4), 1568-1586.
+Berrisch, J., Pappert, S., Ziel, F., & Arsova, A. (2023). Finance Research Letters, 52, 103503.
+Understanding European Allowances (EUA) dynamics is important for several fields:
+Portfolio & Risk Management,
+Sustainability Planing
+Political decisions
+EUA prices are obviously connected to the energy market
+How can the dynamics be characterized?
+Several Questions arise:
+EUA, natural gas, Brent crude oil, coal
+March 15, 2010, until October 14, 2022
+Data was normalized w.r.t. \(\text{CO}_2\) emissions
+Emission-adjusted prices reflects one tonne of \(\text{CO}_2\)
+We adjusted for inflation by Eurostat’s HICP, excluding energy
+Log transformation of the data to stabilize the variance
+ADF Test: All series are stationary in first differences
+Johansen’s likelihood ratio trace test suggests two cointegrating relationships (levels)
+Johansen’s likelihood ratio trace test suggests no cointegrating relationships (logs)
+Sklars theorem: decompose target into - marginal distributions: \(F_{X_{k,t}|\mathcal{F}_{t-1}}\) for \(k=1,\ldots, K\), and - copula function: \(C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}\)
+Let \(\boldsymbol{x}_t= (x_{1,t},\ldots, x_{K,t})^\intercal\) be the realized values
+It holds that:
+\[\begin{align} F_{\boldsymbol{X}_t|\mathcal{F}_{t-1}}(\boldsymbol{x}_t) = C_{\boldsymbol{U}_{t}|\mathcal{F}_{t - 1}}(\boldsymbol{u}_t) \nonumber \end{align}\]
+with: \(\boldsymbol{u}_t =(u_{1,t},\ldots, u_{K,t})^\intercal\), \(u_{k,t} = F_{X_{k,t}|\mathcal{F}_{t-1}}(x_{k,t})\)
+For brevity we drop the conditioning on \(\mathcal{F}_{t-1}\).
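The decomposition can be read in the sampling direction: draw dependent uniforms \(\boldsymbol{u}_t\) from a copula, then push each component through the inverse of its marginal distribution. The sketch below assumes a bivariate Gaussian copula purely because it can be written with the Python standard library; it is not the copula of the model presented here, and the marginals, parameter values, and function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(rho, n, seed=0):
    """Draw n pairs (u1, u2) from a bivariate Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The probability integral transform maps each normal draw to (0, 1)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def apply_inverse_marginals(pairs, inverse_cdfs):
    """Map copula draws u_k to x_k = F_k^{-1}(u_k), one inverse CDF per component."""
    return [tuple(q(u) for q, u in zip(inverse_cdfs, pair)) for pair in pairs]


# Hypothetical marginals: Exponential(1) and N(3, 2^2)
inverse_cdfs = [lambda u: -math.log(1.0 - u), NormalDist(3.0, 2.0).inv_cdf]
u_draws = sample_gaussian_copula(rho=0.8, n=2000)
x_draws = apply_inverse_marginals(u_draws, inverse_cdfs)
```

The dependence lives entirely in `u_draws`, so the marginals can be swapped without touching the copula, which is the practical appeal of the decomposition.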
+The model can be specified as follows
+\[\begin{align} F(\boldsymbol{x}_t) = C \left[\mathbf{F}(\boldsymbol{x}_t; \boldsymbol{\mu}_t, \boldsymbol{ \sigma }_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda}); \Xi_t, \Theta\right] \nonumber \end{align}\]
+\(\Xi_{t}\) denotes time-varying dependence parameters; \(\Theta\) denotes time-invariant dependence parameters
+We take \(C\) as the \(t\)-copula
+\[\mathbf{F} = (F_1, \ldots, F_K)^{\intercal}\]
+\[\begin{align} \Delta \boldsymbol{\mu}_t = \Pi \boldsymbol{x}_{t-1} + \Gamma \Delta \boldsymbol{x}_{t-1} \nonumber \end{align}\]
+where \(\Pi = \alpha \beta^{\intercal}\) is the cointegrating matrix of rank \(r\), \(0 \leq r\leq K\).
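A one-step evaluation of this mean equation might look as follows. This is a sketch under stated assumptions: the function names are hypothetical, \(\Delta \boldsymbol{\mu}_t\) is read as \(\boldsymbol{\mu}_t - \boldsymbol{x}_{t-1}\), and plain lists stand in for matrices to keep the example dependency-free.

```python
def cointegration_matrix(alpha, beta):
    """Pi = alpha * beta^T for K x r matrices alpha and beta (r may be 0)."""
    k = len(alpha)
    r = len(alpha[0]) if alpha else 0
    return [[sum(alpha[i][m] * beta[j][m] for m in range(r)) for j in range(k)]
            for i in range(k)]


def mat_vec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def vecm_mean(x_prev, dx_prev, alpha, beta, gamma):
    """mu_t = x_{t-1} + Pi x_{t-1} + Gamma dx_{t-1} (assumed reading of Delta mu_t)."""
    pi = cointegration_matrix(alpha, beta)
    delta_mu = [p + g for p, g in zip(mat_vec(pi, x_prev), mat_vec(gamma, dx_prev))]
    return [x + d for x, d in zip(x_prev, delta_mu)]
```

With rank \(r = 0\) and \(\Gamma = 0\) the mean collapses to the previous observation, i.e. a random walk.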
+\[\begin{align} \sigma_{i,t}^2 = & \omega_i + \alpha^+_{i} (\epsilon_{i,t-1}^+)^2 + \alpha^-_{i} (\epsilon_{i,t-1}^-)^2 + \beta_i \sigma_{i,t-1}^2 \nonumber \end{align}\]
+where \(\epsilon_{i,t-1}^+ = \max\{\epsilon_{i,t-1}, 0\}\) …
+Separate coefficients for positive and negative innovations to capture leverage effects.
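The variance recursion can be traced with a few lines of code. This is a minimal sketch with a hypothetical function name; it assumes \(\epsilon^-_{i,t-1} = \min\{\epsilon_{i,t-1}, 0\}\), which the text above leaves elided, and takes the initial variance as a user-supplied value.

```python
def asymmetric_garch_variance(eps, omega, alpha_pos, alpha_neg, beta, sigma2_init):
    """Return [sigma2_0, ..., sigma2_T] for innovations eps = [eps_0, ..., eps_{T-1}]."""
    sigma2 = [sigma2_init]
    for e in eps:
        e_pos = max(e, 0.0)  # positive part of the innovation
        e_neg = min(e, 0.0)  # negative part (assumed definition)
        sigma2.append(omega
                      + alpha_pos * e_pos ** 2
                      + alpha_neg * e_neg ** 2
                      + beta * sigma2[-1])
    return sigma2


# Illustrative parameter values only, not estimates from the study
path = asymmetric_garch_variance([1.0, -1.0], omega=0.1, alpha_pos=0.05,
                                 alpha_neg=0.2, beta=0.7, sigma2_init=1.0)
```

With `alpha_neg > alpha_pos`, a negative shock of the same size raises next-period variance more than a positive one, which is the asymmetry the separate coefficients are meant to capture.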
+\[\begin{align*} \Xi_{t} = & \Lambda\left(\boldsymbol{\xi}_{t}\right) \\ \xi_{ij,t} = & \eta_{0,ij} + \eta_{1,ij} \xi_{ij,t-1} + \eta_{2,ij} z_{i,t-1} z_{j,t-1}, \end{align*}\]
+\(\xi_{ij,t}\) is a latent process
+\(z_{i,t}\) denotes the standardized residual of time series \(i\) at time \(t\)
+\(\Lambda(\cdot)\) is a link function that ensures \(\Xi_{t}\) is a valid variance-covariance matrix, i.e. it does not leave its support and remains positive semi-definite
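For a single pair \((i, j)\), the latent recursion and the link can be sketched as below. The concrete form of \(\Lambda\) is not given in this section, so the sketch assumes a hyperbolic tangent link, which maps any real \(\xi_{ij,t}\) into \((-1, 1)\) and therefore always yields a valid correlation in the bivariate case. The function name and parameter values are hypothetical.

```python
import math


def correlation_path(z_i, z_j, eta0, eta1, eta2, xi_init=0.0):
    """Iterate xi_t = eta0 + eta1 * xi_{t-1} + eta2 * z_{i,t-1} * z_{j,t-1}.

    Returns the correlations tanh(xi_t); tanh is an assumed link function.
    """
    xi = xi_init
    path = []
    for zi, zj in zip(z_i, z_j):
        xi = eta0 + eta1 * xi + eta2 * zi * zj  # latent process update
        path.append(math.tanh(xi))              # map to (-1, 1)
    return path


# Illustrative inputs: residuals move together first, then in opposite directions
rho = correlation_path([1.0, -1.0], [1.0, 1.0], eta0=0.0, eta1=0.5, eta2=0.1)
```

In higher dimensions an elementwise link of this kind is not enough to guarantee positive semi-definiteness, so a different construction would be needed there.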
+All parameters can be estimated jointly. Using conditional independence: \[\begin{align*} L = f_{X_1} \prod_{i=2}^T f_{X_i|\mathcal{F}_{i-1}}, \end{align*}\] with multivariate conditional density: \[\begin{align*} f_{\mathbf{X}_t}(\mathbf{x}_t | \mathcal{F}_{t-1}) = c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda});\Xi_t, \Theta\right] \cdot \\ \prod_{i=1}^K f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda}) \end{align*}\] The copula density \(c\) can be derived analytically.
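Taking logs turns the product into a sum, which is the form a numerical optimizer would typically work with. Writing the time index as \(t\) and using only the symbols defined above, the decomposition reads:

\[\begin{align*} \log L = \log f_{X_1} + \sum_{t=2}^T \left\{ \log c\left[\mathbf{F}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda});\Xi_t, \Theta\right] + \sum_{i=1}^K \log f_{X_{i,t}}(\mathbf{x}_t;\boldsymbol{\mu}_t, \boldsymbol{\sigma}_{t}^2, \boldsymbol{\nu}, \boldsymbol{\lambda}) \right\} \end{align*}\]

This is only a restatement of the likelihood in log form: the copula term carries the dependence parameters and the inner sum collects the \(K\) marginal log-densities.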
+=> 2227 potential starting points
+We sample 250 to reduce computational cost
+We draw \(2^{11}= 2048\) trajectories from the joint predictive distribution
+Forecasts are evaluated by the energy score (ES)
+\[\begin{align*} \text{ES}_t(F, \mathbf{x}_t) = \mathbb{E}_{F} \left(||\tilde{\mathbf{X}}_t - \mathbf{x}_t||_2\right) - \\ \frac{1}{2} \mathbb{E}_F \left(||\tilde{\mathbf{X}}_t - \tilde{\mathbf{X}}_t'||_2 \right) \end{align*}\]
+where \(\mathbf{x}_t\) is the observed \(K\)-dimensional realization and \(\tilde{\mathbf{X}}_t\), respectively \(\tilde{\mathbf{X}}_t'\) are independent random vectors distributed according to \(F\)
+For univariate cases the Energy Score becomes the Continuous Ranked Probability Score (CRPS)
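When the predictive distribution is available only through simulated trajectories, both expectations in the energy score can be replaced by sample averages. The following is a minimal sketch of that plug-in estimator with a hypothetical function name; it is not necessarily the estimator used in the study.

```python
import math


def energy_score(samples, observation):
    """Sample-based energy score for draws from F and one observed vector.

    samples:      list of K-dimensional tuples drawn from the forecast distribution
    observation:  the realized K-dimensional vector
    """
    m = len(samples)
    # First expectation: mean distance between draws and the observation
    term_obs = sum(math.dist(s, observation) for s in samples) / m
    # Second expectation: mean distance between all pairs of draws
    term_spread = sum(math.dist(a, b) for a in samples for b in samples) / (m * m)
    return term_obs - 0.5 * term_spread


# Univariate toy case: the result coincides with the CRPS of the empirical distribution
es_toy = energy_score([(0.0,), (2.0,)], (1.0,))
```

The pairwise term costs \(O(m^2)\) distance evaluations, which becomes noticeable with a few thousand trajectories per forecast.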
+Relative improvement in ES compared to \(\text{RW}^{\sigma, \rho}\)
+Cell color: based on the test statistic of the Diebold-Mariano test (testing whether the model outperforms the benchmark, greener = better).
| Model | \(\text{ES}^{\text{All}}_{1-30}\) | \(\text{ES}^{\text{EUA}}_{1-30}\) | \(\text{ES}^{\text{Oil}}_{1-30}\) | \(\text{ES}^{\text{NGas}}_{1-30}\) | \(\text{ES}^{\text{Coal}}_{1-30}\) | \(\text{ES}^{\text{All}}_{1}\) | \(\text{ES}^{\text{All}}_{5}\) | \(\text{ES}^{\text{All}}_{30}\) |
|---|---|---|---|---|---|---|---|---|
| \(\textrm{RW}^{\sigma, \rho}_{}\) | 161.96 | 10.06 | 37.94 | 146.73 | 13.22 | 5.56 | 13.28 | 34.29 |
| \(\textrm{RW}^{\sigma_t, \rho_t}_{}\) | 9.40 | 3.75 | -0.41 | 11.39 | 4.13 | 10.34 | 9.10 | 7.59 |
| \(\textrm{RW}^{\sigma, \rho_t}_{\textrm{ncp}, \textrm{log}}\) | 12.04 | 6.16 | -0.56 | 14.33 | 7.35 | 9.22 | 9.82 | 10.02 |
| \(\textrm{RW}^{\sigma, \rho}_{\textrm{log}}\) | 12.10 | 6.25 | -0.59 | 14.44 | 7.31 | 9.04 | 9.66 | 9.91 |
| \(\textrm{VECM}^{\textrm{r0}, \sigma_t, \rho_t}_{\textrm{lev}, \textrm{ncp}}\) | 9.68 | -0.72 | 0.32 | 11.74 | 3.70 | 10.82 | 10.50 | 8.21 |
| \(\textrm{VECM}^{\textrm{r0}, \sigma, \rho_t}_{\textrm{log}}\) | 12.15 | 6.10 | -0.70 | 14.57 | 7.80 | 8.05 | 9.99 | 10.04 |
| \(\textrm{ETS}^{\sigma}\) | 9.94 | 5.75 | 0.08 | 13.05 | 7.83 | 6.96 | 7.74 | 6.21 |
| \(\textrm{ETS}^{\sigma}_{\textrm{log}}\) | 8.12 | 7.80 | -0.51 | 11.17 | 8.54 | 5.05 | 6.14 | 2.66 |
| \(\textrm{VES}^{\sigma}\) | 5.50 | -4.43 | -3.22 | 6.29 | 4.68 | -25.99 | -2.42 | 3.07 |
| \(\textrm{VES}^{\sigma}_{\textrm{log}}\) | 7.68 | 3.31 | -4.34 | 9.07 | 8.30 | -22.11 | 1.07 | 4.32 |
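The two quantities behind such a table, the relative improvement over a benchmark and a Diebold-Mariano test statistic, could be computed from per-window scores roughly as follows. This is a sketch under assumptions: the improvement is taken to be \(100 \cdot (1 - \overline{\text{ES}}_{\text{model}} / \overline{\text{ES}}_{\text{benchmark}})\), the test statistic uses a plain sample variance without any autocorrelation correction, and the function names and example scores are hypothetical.

```python
import math
from statistics import mean, stdev


def relative_improvement(model_scores, benchmark_scores):
    """Improvement of the mean score over the benchmark in percent (higher = better)."""
    return 100.0 * (1.0 - mean(model_scores) / mean(benchmark_scores))


def dm_statistic(model_scores, benchmark_scores):
    """Diebold-Mariano type statistic on the loss differences model - benchmark.

    Negative values indicate that the model has the lower average loss.
    Assumes serially uncorrelated loss differences (no HAC correction).
    """
    diffs = [m - b for m, b in zip(model_scores, benchmark_scores)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Made-up scores for four forecast windows
model = [1.0, 2.0, 1.5, 1.0]
benchmark = [2.0, 2.5, 2.5, 2.0]
improvement = relative_improvement(model, benchmark)
statistic = dm_statistic(model, benchmark)
```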
Improvement in CRPS of selected models relative to \(\textrm{RW}^{\sigma, \rho}_{}\) in % (higher = better). Colored according to the test statistic of a DM-Test comparing to \(\textrm{RW}^{\sigma, \rho}_{}\) (greener means lower test statistic i.e., better performance compared to \(\textrm{RW}^{\sigma, \rho}_{}\)).
| Model | EUA H1 | EUA H5 | EUA H30 | Oil H1 | Oil H5 | Oil H30 | NGas H1 | NGas H5 | NGas H30 | Coal H1 | Coal H5 | Coal H30 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\textrm{RW}^{\sigma, \rho}_{}\) | 0.4 | 0.9 | 2.1 | 1.5 | 3.4 | 9.1 | 4.7 | 11.6 | 29.8 | 0.3 | 0.9 | 2.8 |
| \(\textrm{RW}^{\sigma_t, \rho_t}_{}\) | 5.6 | 6.0 | 2.8 | 2.1 | 2.7 | -0.8 | 12.6 | 10.5 | 9.6 | 10.7 | 6.5 | 2.1 |
| \(\textrm{RW}^{\sigma, \rho_t}_{\textrm{ncp}, \textrm{log}}\) | 5.1 | 8.7 | 5.0 | 0.7 | 0.8 | -0.4 | 11.4 | 11.5 | 12.4 | 8.0 | 7.3 | 6.7 |
| \(\textrm{RW}^{\sigma, \rho}_{\textrm{log}}\) | 4.7 | 8.9 | 5.2 | 0.0 | 0.3 | -0.6 | 11.2 | 11.4 | 12.4 | 7.7 | 7.5 | 6.6 |
| \(\textrm{VECM}^{\textrm{r0}, \sigma_t, \rho_t}_{\textrm{lev}, \textrm{ncp}}\) | 3.6 | 0.6 | -1.6 | 2.7 | 3.0 | 0.0 | 13.1 | 12.2 | 10.4 | 11.8 | 7.2 | 1.5 |
| \(\textrm{VECM}^{\textrm{r0}, \sigma, \rho_t}_{\textrm{log}}\) | 4.2 | 8.9 | 5.1 | 0.2 | 0.4 | -0.8 | 9.9 | 11.8 | 12.7 | 7.8 | 7.9 | 7.3 |
| \(\textrm{ETS}^{\sigma}\) | 0.2 | 6.8 | 5.7 | 1.1 | 0.9 | -0.2 | 10.9 | 11.3 | 10.9 | 7.5 | 6.7 | 5.6 |
| \(\textrm{ETS}^{\sigma}_{\textrm{log}}\) | 1.0 | 8.6 | 8.0 | 0.1 | 0.7 | -0.6 | 8.9 | 9.4 | 7.1 | 7.3 | 7.8 | 6.7 |
| \(\textrm{VES}^{\sigma}\) | -38.5 | -6.4 | -5.4 | -33.3 | -6.1 | -2.4 | -26.6 | -2.6 | 3.6 | -37.5 | -5.5 | 4.7 |
| \(\textrm{VES}^{\sigma}_{\textrm{log}}\) | -32.4 | 2.8 | 1.8 | -30.4 | -6.2 | -3.2 | -22.0 | 1.8 | 5.4 | -27.0 | 2.3 | 6.4 |
RMSE measures the performance of the forecasts at their mean
+Conclusion: the improvements seen before must be attributed to other parts of the multivariate probabilistic predictive distribution
+Improvement in RMSE score of selected models relative to \(\textrm{RW}^{\sigma, \rho}_{}\) in % (higher = better). Colored according to the test statistic of a DM-Test comparing to \(\textrm{RW}^{\sigma, \rho}_{}\) (greener means lower test statistic i.e., better performance compared to \(\textrm{RW}^{\sigma, \rho}_{}\)).
| Model | EUA H1 | EUA H5 | EUA H30 | Oil H1 | Oil H5 | Oil H30 | NGas H1 | NGas H5 | NGas H30 | Coal H1 | Coal H5 | Coal H30 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(\textrm{RW}^{\sigma, \rho}_{}\) | 0.9 | 2.0 | 5.0 | 2.9 | 6.4 | 16.7 | 17.8 | 42.8 | 85.4 | 0.9 | 2.9 | 7.0 |
| \(\textrm{RW}^{\sigma_t, \rho_t}_{}\) | -0.1 | -0.1 | 0.7 | 0.0 | -0.3 | -0.1 | -0.2 | 0.3 | 1.3 | -0.2 | 0.0 | -1.8 |
| \(\textrm{RW}^{\sigma, \rho_t}_{\textrm{ncp}, \textrm{log}}\) | -270.5 | -154.1 | -139.9 | 0.5 | -0.5 | -2.9 | -0.8 | 0.7 | -1.6 | 0.3 | -31.2 | -24.5 |
| \(\textrm{RW}^{\sigma, \rho}_{\textrm{log}}\) | -705.0 | -265.4 | -125.2 | 0.6 | 0.2 | -0.2 | -0.4 | 0.1 | -1.6 | -0.9 | -0.3 | -8.3 |
| \(\textrm{VECM}^{\textrm{r0}, \sigma_t, \rho_t}_{\textrm{lev}, \textrm{ncp}}\) | -0.9 | 0.2 | 0.5 | 0.5 | 0.2 | 0.0 | -0.4 | 0.7 | 0.2 | 1.4 | 0.1 | 0.2 |
| \(\textrm{VECM}^{\textrm{r0}, \sigma, \rho_t}_{\textrm{log}}\) | -271.5 | -191.3 | -114.3 | 1.7 | -12.3 | -3.6 | -0.6 | 1.6 | -4.1 | 0.0 | -0.8 | -6.7 |
| \(\textrm{ETS}^{\sigma}\) | -0.3 | 0.3 | 1.6 | 0.7 | 0.1 | -0.1 | 0.1 | -0.1 | 0.2 | -2.4 | -3.9 | 2.5 |
| \(\textrm{ETS}^{\sigma}_{\textrm{log}}\) | -1.0 | 0.4 | 1.6 | 0.9 | 0.0 | -0.1 | -1.9 | -1.9 | -13.9 | -0.3 | -3.6 | -1.8 |
| \(\textrm{VES}^{\sigma}\) | -37.4 | -8.9 | -6.0 | -27.9 | -7.4 | -2.8 | -27.2 | -9.5 | -2.4 | -41.7 | -1.2 | 1.6 |
| \(\textrm{VES}^{\sigma}_{\textrm{log}}\) | -37.6 | -9.2 | -7.8 | -26.8 | -7.3 | -3.0 | -27.0 | -6.8 | -3.5 | -41.2 | -2.2 | -0.3 |
Accounting for heteroscedasticity or stabilizing the variance via log transformation is crucial for good performance in terms of ES
++
+Theoretical
+Probabilistic Online Learning:
+ Aggregation
Regression
+
+Practical
+Applications
+ Energy Commodities
Electricity Prices
Electricity Load
+
+Well received by the academic community:
+of papers already published
+104 citations since 2020 (Google Scholar)
++
+Software
+R Packages:
+Python Packages:
+Contributions to other projects:
+ RcppArmadillo
gamlss
NixOS/nixpkgs
OpenPrinting/foomatic-db
Awards:
+Berrisch, J., Narajewski, M., & Ziel, F. (2023):
+ Won Western Power Distribution Competition
Won Best-Student-Presentation Award