Title: | Optimal Out-of-Sample Forecast Evaluation and Testing under Stationarity |
---|---|
Description: | Package 'ACV' (short for Affine Cross-Validation) offers an improved time-series cross-validation loss estimator which utilizes both in-sample and out-of-sample forecasting performance via a carefully constructed affine weighting scheme. Under the assumption of stationarity, the estimator is the best linear unbiased estimator of the out-of-sample loss. Besides that, the package also offers improved versions of Diebold-Mariano and Ibragimov-Muller tests of equal predictive ability which deliver more power relative to their conventional counterparts. For more information, see the accompanying article Stanek (2021) <doi:10.2139/ssrn.3996166>. |
Authors: | Filip Stanek [aut, cre] |
Maintainer: | Filip Stanek <[email protected]> |
License: | GPL (>=3) |
Version: | 1.0.2 |
Built: | 2025-03-05 03:47:09 UTC |
Source: | https://github.com/stanek-fi/acv |
Function estimateL() estimates the out-of-sample loss of a given algorithm on a specified time-series. By default, it uses the optimal weighting scheme, which also exploits in-sample performance to deliver a more precise estimate than the conventional estimator.
estimateL(
  y, algorithm, m, h = 1, v = 1, xreg = NULL,
  lossFunction = function(y, yhat) { (y - yhat)^2 },
  method = "optimal", Phi = NULL, bw = NULL, rhoLimit = 0.99, ...
)
Argument | Description |
---|---|
y | Univariate time-series object. |
algorithm | Algorithm which is to be applied to the time-series. The object which the algorithm produces should respond to |
m | Length of the window on which the algorithm should be trained. |
h | Number of predictions made after a single training of the algorithm. |
v | Number of periods by which the estimation window progresses forward once the predictions are generated. |
xreg | Matrix of exogenous regressors supplied to the algorithm (if applicable). |
lossFunction | Loss function used to compute contrasts (defaults to squared error). |
method | Can be set to either |
Phi | User can also directly supply |
bw | Bandwidth for the long run variance estimator. If |
rhoLimit | Parameter |
... | Other parameters passed to the algorithm. |
A list containing the loss estimate and its estimated variance, along with auxiliary information such as the matrix of contrasts Phi and the weights used in the computation.
set.seed(1)
y <- rnorm(40)
m <- 36
h <- 1
v <- 1
estimateL(y, forecast::Arima, m = m, h = h, v = v)
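The lossFunction argument takes a function of the realized values and the forecasts, so losses other than the default squared error can be plugged in. The following is a minimal sketch, assuming the 'ACV' and 'forecast' packages are installed; absLoss is a hypothetical helper introduced here for illustration, and the names of the elements in the returned list are deliberately not assumed.

```r
# Hypothetical absolute-error loss with the same (y, yhat) signature
# as the default squared-error loss.
absLoss <- function(y, yhat) abs(y - yhat)

set.seed(1)
y <- rnorm(40)

# Only run the estimation when both packages are available.
if (requireNamespace("ACV", quietly = TRUE) &&
    requireNamespace("forecast", quietly = TRUE)) {
  res <- ACV::estimateL(y, forecast::Arima, m = 36, h = 1, v = 1,
                        lossFunction = absLoss)
  # Inspect the top-level structure of the returned list.
  str(res, max.level = 1)
}
```

Using str() on the result is a convenient way to see which components (estimate, variance, Phi, weights) are present before relying on any of them by name.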
Function testL() tests the null hypothesis of equal predictive ability of algorithm1 and algorithm2 on time-series y. By default, it uses the optimal weighting scheme, which also exploits in-sample performance to deliver more power than the conventional tests.
testL(
  y, algorithm1, algorithm2, m, h = 1, v = 1, xreg = NULL,
  lossFunction = function(y, yhat) { (y - yhat)^2 },
  method = "optimal", test = "Diebold-Mariano", Ha = "!=0",
  Phi = NULL, bw = NULL, groups = 2, rhoLimit = 0.99, ...
)
Argument | Description |
---|---|
y | Univariate time-series object. |
algorithm1 | First algorithm which is to be applied to the time-series. The object which the algorithm produces should respond to |
algorithm2 | Second algorithm. See above. |
m | Length of the window on which the algorithm should be trained. |
h | Number of predictions made after a single training of the algorithm. |
v | Number of periods by which the estimation window progresses forward once the predictions are generated. |
xreg | Matrix of exogenous regressors supplied to the algorithm (if applicable). |
lossFunction | Loss function used to compute contrasts (defaults to squared error). |
method | Can be set to either |
test | Type of the test which is to be executed. Can attain values |
Ha | Alternative hypothesis. Can attain values |
Phi | User can also directly supply |
bw | Applicable to |
groups | Applicable to |
rhoLimit | Parameter |
... | Other parameters passed to algorithms. |
A list containing the loss differential estimate and the associated p-value, along with auxiliary information such as the matrix of contrast differentials Phi and the weights used in the computation.
set.seed(1)
y <- rnorm(40)
m <- 36
h <- 1
v <- 1
algorithm1 <- function(y) {
  forecast::Arima(y, order = c(1, 0, 0))
}
algorithm2 <- function(y) {
  forecast::Arima(y, order = c(2, 0, 0))
}
testL(y, algorithm1, algorithm2, m = m, h = h, v = v)
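The same wrapper pattern can compare a model against a simple benchmark on serially dependent data. The sketch below is illustrative only: the simulated AR(1) series, the window length m = 50, and the mean-only benchmark are assumptions chosen for the example, and it relies on the default test and alternative shown in the usage above.

```r
set.seed(1)
# Simulated AR(1) series (base R); converted to a plain numeric vector
# to match the input type used in the example above.
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = 60))

# Candidate: AR(1). Benchmark: constant mean (ARIMA(0,0,0) with intercept).
algorithm1 <- function(y) forecast::Arima(y, order = c(1, 0, 0))
algorithm2 <- function(y) forecast::Arima(y, order = c(0, 0, 0))

# Only run the test when both packages are available.
if (requireNamespace("ACV", quietly = TRUE) &&
    requireNamespace("forecast", quietly = TRUE)) {
  res <- ACV::testL(y, algorithm1, algorithm2, m = 50, h = 1, v = 1)
  # Inspect the top-level structure of the returned list.
  str(res, max.level = 1)
}
```

With 60 observations and m = 50 there are only a handful of out-of-sample periods, which is the kind of short-evaluation-sample setting where exploiting in-sample contrasts is most relevant.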
Function tsACV() computes contrasts between forecasts produced by a given algorithm and the original time-series on which the algorithm is trained. These contrasts can then be used to estimate the loss of the algorithm. Unlike the similar tsCV() function from the 'forecast' package, tsACV() also records in-sample contrasts, as these can be leveraged to produce more accurate out-of-sample loss estimates.
tsACV(
  y, algorithm, m, h = 1, v = 1, xreg = NULL,
  lossFunction = function(y, yhat) { (y - yhat)^2 },
  ...
)
Argument | Description |
---|---|
y | Univariate time-series object. |
algorithm | Algorithm which is to be applied to the time-series. The object which the algorithm produces should respond to |
m | Length of the window on which the algorithm should be trained. |
h | Number of predictions made after a single training of the algorithm. |
v | Number of periods by which the estimation window progresses forward once the predictions are generated. |
xreg | Matrix of exogenous regressors supplied to the algorithm (if applicable). |
lossFunction | Loss function used to compute contrasts (defaults to squared error). |
... | Other parameters passed to the algorithm. |
Matrix of computed contrasts Phi. Each row corresponds to a particular period of the y time-series and each column corresponds to a particular location of the training window.
set.seed(1)
y <- rnorm(40)
m <- 36
h <- 1
v <- 1
tsACV(y, forecast::Arima, m = m, h = h, v = v)
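To make the rows-as-periods, columns-as-windows layout more concrete, the base R sketch below marks which periods would be in-sample and which out-of-sample for each training window, using the same n, m, h and v as the example above. It is an illustration under an assumed layout (first window starting at period 1, each later window shifted by v), not a description of the package's internals; the layout matrix and the starts vector are hypothetical names.

```r
n <- 40  # length of the series
m <- 36  # training window length
h <- 1   # forecasts per window
v <- 1   # shift between consecutive windows

# Assumed window starts: keep shifting by v while m training periods
# and h forecast periods still fit inside the series.
starts <- seq(1, n - m - h + 1, by = v)

# One row per period, one column per window location.
layout <- matrix(NA_character_, nrow = n, ncol = length(starts),
                 dimnames = list(period = as.character(seq_len(n)),
                                 window = as.character(seq_along(starts))))

for (j in seq_along(starts)) {
  s <- starts[j]
  layout[s:(s + m - 1), j] <- "in"            # periods used for training
  layout[(s + m):(s + m + h - 1), j] <- "out" # periods being forecast
}

dim(layout)
layout[35:40, ]  # last rows, where the out-of-sample periods sit
```

Comparing dim(layout) with the dimensions of the matrix returned by tsACV() for the same settings is a quick way to check whether this assumed layout matches the package's actual output.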