Mean absolute scaled error

In statistics, the mean absolute scaled error (MASE) is a measure of the accuracy of forecasts. It is the mean absolute error of the forecast values, divided by the mean absolute error of the in-sample one-step naive forecast. It was proposed in 2005 by statistician Rob J. Hyndman and Professor of Decision Sciences Anne B. Koehler, who described it as a "generally applicable measurement of forecast accuracy without the problems seen in the other measurements."^[1] The mean absolute scaled error has favorable properties compared with other measures of forecast error, such as the root-mean-square deviation, and is therefore recommended for comparing the accuracy of forecasts.^[2]

Rationale

The mean absolute scaled error has the following desirable properties:^[3]

Scale invariance: The mean absolute scaled error is independent of the scale of the data, so it can be used to compare forecasts across data sets with different scales.
Predictable behavior as $y_{t}\rightarrow 0$: Percentage forecast accuracy measures such as the mean absolute percentage error (MAPE) rely on division by $y_{t}$, which skews the distribution of the MAPE for values of $y_{t}$ near or equal to 0. This is especially problematic for data sets whose scales do not have a meaningful 0, such as temperature in Celsius or Fahrenheit, and for intermittent-demand data sets, where $y_{t}=0$ occurs frequently.
Symmetry: The mean absolute scaled error penalizes positive and negative forecast errors equally, and penalizes errors in large forecasts and small forecasts equally. In contrast, the MAPE and median absolute percentage error (MdAPE) fail both of these criteria, while the "symmetric" sMAPE and sMdAPE^[4] fail the second criterion.
Interpretability: The mean absolute scaled error can be easily interpreted, as values greater than one indicate that in-sample one-step forecasts from the naive method perform better than the forecast values under consideration.
Asymptotic normality of the MASE: The Diebold-Mariano test for one-step forecasts is used to test the statistical significance of the difference between two sets of forecasts.^[5]^[6]^[7] To perform hypothesis testing with the Diebold-Mariano test statistic, it is desirable for $DM\sim N(0,1)$ , where $DM$ is the value of the test statistic. The DM statistic for the MASE has been empirically shown to approximate this distribution, while the mean relative absolute error (MRAE), MAPE and sMAPE do not.^[2]
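The first two properties can be illustrated with a minimal Python sketch. The function names (`mae`, `mase`, `mape`) and the data are made up for illustration and are not taken from the cited sources. The sketch rescales a small intermittent-demand series by a factor of 1000 and checks that the MASE is unchanged, while the MAPE cannot be computed because one of the actual values is zero.

```python
def mae(actual, forecast):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mase(train, actual, forecast):
    """Forecast MAE scaled by the in-sample one-step naive MAE."""
    naive_mae = sum(
        abs(train[t] - train[t - 1]) for t in range(1, len(train))
    ) / (len(train) - 1)
    return mae(actual, forecast) / naive_mae


def mape(actual, forecast):
    """Mean absolute percentage error; divides by each actual value."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up intermittent-demand data containing zeros.
train = [0, 2, 0, 1, 0, 3]
actual = [0, 2]
forecast = [1, 1]

base = mase(train, actual, forecast)

# Rescale every value by 1000 (a change of units); the MASE should not change.
scaled = mase(
    [1000 * y for y in train],
    [1000 * y for y in actual],
    [1000 * y for y in forecast],
)

# The MAPE divides by an actual value of 0 and therefore fails here.
try:
    mape(actual, forecast)
    mape_defined = True
except ZeroDivisionError:
    mape_defined = False

print(abs(base - scaled) < 1e-12, mape_defined)
```

Because the numerator and the denominator are both in the units of the data, the scaling factor cancels in the ratio.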

Non-seasonal time series

For a non-seasonal time series,^[8] the mean absolute scaled error is estimated by

$\mathrm {MASE} =\mathrm {mean} \left({\frac {\left|e_{j}\right|}{{\frac {1}{T-1}}\sum _{t=2}^{T}\left|Y_{t}-Y_{t-1}\right|}}\right)={\frac {{\frac {1}{J}}\sum _{j}\left|e_{j}\right|}{{\frac {1}{T-1}}\sum _{t=2}^{T}\left|Y_{t}-Y_{t-1}\right|}}$ ^[3]

where $e_{j}$ in the numerator is the forecast error for a given period (with $J$ the number of forecasts), defined as the actual value ($Y_{j}$) minus the forecast value ($F_{j}$) for that period: $e_{j}=Y_{j}-F_{j}$. The denominator is the mean absolute error of the one-step "naive forecast method" on the training set (here defined as $t=1,\ldots ,T$),^[8] which uses the actual value from the prior period as the forecast: $F_{t}=Y_{t-1}$.^[9]
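As a rough illustration of the formula above, the following Python sketch computes the non-seasonal MASE. The function name `mase_nonseasonal`, its argument names, and the numbers are hypothetical choices made for this example; `train` plays the role of $Y_{1},\ldots ,Y_{T}$, and `actual` and `forecast` hold the $J$ values of $Y_{j}$ and $F_{j}$.

```python
def mase_nonseasonal(train, actual, forecast):
    """MASE with the one-step naive forecast F_t = Y_{t-1} as the scaling benchmark."""
    T = len(train)
    if T < 2:
        raise ValueError("need at least two training observations")
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")

    # Denominator: in-sample MAE of the naive forecast, averaged over T - 1 terms.
    scale = sum(abs(train[t] - train[t - 1]) for t in range(1, T)) / (T - 1)
    if scale == 0:
        # The case where all historical data are equal: the MASE is undefined.
        raise ValueError("all training values are equal")

    # Numerator: MAE of the forecast errors e_j = Y_j - F_j.
    errors = [y - f for y, f in zip(actual, forecast)]
    return sum(abs(e) for e in errors) / len(errors) / scale


# Made-up data: five training observations and two forecast periods.
train = [10, 12, 11, 13, 15]
actual = [16, 14]
forecast = [15, 15]

score = mase_nonseasonal(train, actual, forecast)
print(round(score, 4))  # 0.5714
```

Here the naive in-sample errors are 2, 1, 2 and 2 (mean 1.75) and the forecast errors are 1 and −1 (mean absolute value 1), so the result is 1/1.75 ≈ 0.571. A value below one means the forecasts have a smaller mean absolute error than the in-sample naive forecasts.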

Seasonal time series

For a seasonal time series, the mean absolute scaled error is estimated in a manner similar to the method for non-seasonal time series:

$\mathrm {MASE} =\mathrm {mean} \left({\frac {\left|e_{j}\right|}{{\frac {1}{T-m}}\sum _{t=m+1}^{T}\left|Y_{t}-Y_{t-m}\right|}}\right)={\frac {{\frac {1}{J}}\sum _{j}\left|e_{j}\right|}{{\frac {1}{T-m}}\sum _{t=m+1}^{T}\left|Y_{t}-Y_{t-m}\right|}}$ ^[8]

The main difference from the non-seasonal method is that the denominator is the mean absolute error of the one-step "seasonal naive forecast method" on the training set,^[8] which uses the actual value from the prior season as the forecast: $F_{t}=Y_{t-m}$,^[9] where $m$ is the seasonal period.
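The seasonal variant can be sketched in Python in the same spirit. The function name `seasonal_mase` and the quarterly data are invented for illustration; the only change from a non-seasonal computation is that the in-sample differences are taken at lag $m$ and averaged over $T-m$ terms, so setting `m=1` recovers the non-seasonal measure.

```python
def seasonal_mase(train, actual, forecast, m=1):
    """MASE scaled by the in-sample seasonal naive forecast F_t = Y_{t-m}."""
    T = len(train)
    if m < 1 or T <= m:
        raise ValueError("need more training observations than the seasonal period")
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")

    # Denominator: MAE of the seasonal naive forecast over t = m+1, ..., T
    # (indices m, ..., T-1 when counting from zero).
    scale = sum(abs(train[t] - train[t - m]) for t in range(m, T)) / (T - m)
    if scale == 0:
        raise ValueError("seasonal naive forecast is exact in-sample; MASE undefined")

    # Numerator: MAE of the forecasts being evaluated.
    forecast_mae = sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)
    return forecast_mae / scale


# Made-up quarterly data: two years of history, two forecast periods.
train = [10, 20, 30, 40, 12, 22, 33, 44]
actual = [14, 24]
forecast = [12, 22]

seasonal = seasonal_mase(train, actual, forecast, m=4)
nonseasonal = seasonal_mase(train, actual, forecast, m=1)
print(round(seasonal, 4), round(nonseasonal, 4))  # 0.7273 0.1556
```

On this strongly seasonal series the lag-1 differences are large, so scaling by the ordinary naive forecast makes the same forecasts look far better than scaling by the seasonal naive forecast does. The choice of $m$ therefore matters when interpreting the value.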

This scale-free error metric "can be used to compare forecast methods on a single series and also to compare forecast accuracy between series."^[1] It is well suited to intermittent-demand series (data sets containing a large number of zeros) because it never gives infinite or undefined values,^[1] except in the irrelevant case where all historical data are equal.^[3]

When comparing forecasting methods, the method with the lowest MASE is the preferred method.

Non-time series data

For non-time series data, the mean of the data ($\bar {Y}$) can be used as the "base" forecast.^[10]

$\mathrm {MASE} =\mathrm {mean} \left({\frac {\left|e_{j}\right|}{{\frac {1}{J}}\sum _{j=1}^{J}\left|Y_{j}-{\bar {Y}}\right|}}\right)={\frac {{\frac {1}{J}}\sum _{j}\left|e_{j}\right|}{{\frac {1}{J}}\sum _{j}\left|Y_{j}-{\bar {Y}}\right|}}$

In this case the MASE is the mean absolute error divided by the mean absolute deviation of the data about its mean.
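A minimal Python sketch of this variant follows. The function name `mase_non_time_series` and the data are hypothetical, and the mean is taken over the same $J$ observations that are being predicted, as in the formula above.

```python
def mase_non_time_series(actual, predicted):
    """MAE of the predictions divided by the mean absolute deviation of the data."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    J = len(actual)
    y_bar = sum(actual) / J

    # Denominator: mean absolute deviation about the mean (the "base" forecast).
    mad = sum(abs(y - y_bar) for y in actual) / J
    if mad == 0:
        raise ValueError("all observations are equal; MASE undefined")

    # Numerator: mean absolute error of the predictions.
    mae = sum(abs(y - f) for y, f in zip(actual, predicted)) / J
    return mae / mad


# Made-up data: mean 5, absolute deviations 3, 1, 1, 3 (MAD = 2), errors all 1.
actual = [2, 4, 6, 8]
predicted = [3, 3, 7, 7]

ratio = mase_non_time_series(actual, predicted)
print(ratio)  # 0.5
```

A value below one indicates that the predictions have a smaller mean absolute error than always predicting the mean of the data.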

References

[1] Hyndman, R. J. (2006). "Another look at measures of forecast accuracy". Foresight. Issue 4 (June 2006), p. 46.
[2] Franses, Philip Hans (2016-01-01). "A note on the Mean Absolute Scaled Error". International Journal of Forecasting. 32 (1): 20–22. doi:10.1016/j.ijforecast.2015.03.008. hdl:1765/78815.
[3] Hyndman, R. J.; Koehler, A. B. (2006). "Another look at measures of forecast accuracy". International Journal of Forecasting. 22 (4): 679–688. doi:10.1016/j.ijforecast.2006.03.001.
[4] Makridakis, Spyros (1993-12-01). "Accuracy measures: theoretical and practical concerns". International Journal of Forecasting. 9 (4): 527–529. doi:10.1016/0169-2070(93)90079-3. S2CID 153403127.
[5] Diebold, Francis X.; Mariano, Roberto S. (1995). "Comparing predictive accuracy". Journal of Business and Economic Statistics. 13 (3): 253–263. doi:10.1080/07350015.1995.10524599.
[6] Diebold, Francis X.; Mariano, Roberto S. (2002). "Comparing predictive accuracy" (PDF). Journal of Business and Economic Statistics. 20 (1): 134–144. doi:10.1198/073500102753410444. S2CID 12090811.
[7] Diebold, Francis X. (2015). "Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold–Mariano tests" (PDF). Journal of Business and Economic Statistics. 33 (1): 1. doi:10.1080/07350015.2014.983236.
[8] "2.5 Evaluating forecast accuracy | OTexts". www.otexts.org. Retrieved 2016-05-15.
[9] Hyndman, Rob; et al. (2008). Forecasting with Exponential Smoothing: The State Space Approach. Berlin: Springer-Verlag. ISBN 978-3-540-71916-8.
[10] Hyndman, Rob. "Alternative to MAPE when the data is not a time series". Cross Validated. Retrieved 2022-10-11.

