Metrics - Core metrics and classes
Function/Class Documentation
Module containing verification and performance metrics
With the exception of the ContingencyNxN and Contingency2x2 classes, the inputs for all metrics are assumed to be array-like and 1D. Bad values are assumed to be stored as NaN and these are excluded in metric calculations.
Author: Steve Morley
Institution: Los Alamos National Laboratory
Copyright (c) 2017, Triad National Security, LLC All rights reserved.
- verify.metrics.MASE(predicted, observed)
Mean Absolute Scaled Error
- Parameters:
predicted (array-like) – predicted data for which to calculate MASE
observed (array-like or float) – observation vector (or scalar climatological value) to use as reference value
- Returns:
out – the mean absolute scaled error of the data set
- Return type:
float
See also
Notes
References: R.J. Hyndman and A.B. Koehler, Another look at measures of forecast accuracy, Intl. J. Forecasting, 22, pp. 679-688, 2006.
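As an illustration only, the following sketch computes a mean absolute scaled error in the sense of Hyndman and Koehler (2006). It assumes the scaling term is the mean absolute one-step change in the observations, that is, the error of a naive persistence forecast. The name `mase_sketch` is hypothetical, and the sketch does not attempt to reproduce how `verify.metrics.MASE` treats a scalar `observed` or NaN values.

```python
def mase_sketch(predicted, observed):
    """Mean absolute scaled error (illustrative; vector observations only)."""
    n = len(observed)
    # Scale: mean absolute one-step change in the observations,
    # i.e. the mean absolute error of a naive (persistence) forecast.
    scale = sum(abs(observed[i] - observed[i - 1]) for i in range(1, n)) / (n - 1)
    # Mean absolute error of the prediction, expressed in units of that scale.
    mae = sum(abs(p - o) for p, o in zip(predicted, observed)) / n
    return mae / scale


observed = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
predicted = [4.0, 5.0, 4.0, 7.0, 7.0, 8.0]
print(mase_sketch(predicted, observed))
```

A value below 1 indicates that the prediction has a smaller mean absolute error than the naive forecast used for scaling.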
- verify.metrics.RMSE(data, climate=None)
Calculate the root mean squared error of a data set relative to a reference value
- Parameters:
data (array-like) – data for which to calculate the root mean squared error; the default reference is persistence
climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.
- Returns:
out – the root-mean-squared error of the data set relative to the chosen reference
- Return type:
float
See also
Notes
The chosen reference can be persistence, a provided climatological mean (scalar) or a provided climatology (observation vector).
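The three reference choices can be sketched as follows. This is a minimal illustration, not the library's code: the name `rmse_sketch` is hypothetical, and it assumes that "persistence" means each value is compared with the value immediately before it. NaN handling is omitted.

```python
import math


def rmse_sketch(data, climate=None):
    """Root mean squared error against one of three references (illustrative)."""
    if climate is None:
        # Assumed meaning of persistence: compare each value with its predecessor.
        errors = [data[i] - data[i - 1] for i in range(1, len(data))]
    elif isinstance(climate, (int, float)):
        # Scalar climatological mean.
        errors = [d - climate for d in data]
    else:
        # Climatology supplied as an observation vector of the same length.
        errors = [d - c for d, c in zip(data, climate)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


data = [1.0, 2.0, 4.0, 7.0]
print(rmse_sketch(data))                        # relative to persistence
print(rmse_sketch(data, climate=3.5))           # relative to a scalar mean
print(rmse_sketch(data, climate=[1, 2, 3, 4]))  # relative to an observation vector
```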
- verify.metrics.Sn(data, scale=True, correct=True)
Sn statistic, a robust measure of scale
- Parameters:
data (array-like) – data to calculate Sn statistic for
scale (boolean) – Scale so that the output is the same as the standard deviation if the distribution is normal (default=True)
correct (boolean) – Set a correction factor (default=True)
- Returns:
Sn – the Sn statistic
- Return type:
float
See also
Notes
Sn is more efficient than the median absolute deviation, and is not constructed with the assumption of a symmetric distribution, because it does not measure distance from an assumed central location. To quote RC1993, “…Sn looks at a typical distance between observations, which is still valid at asymmetric distributions.”
[RC1993] P.J. Rousseeuw and C. Croux, “Alternatives to the Median Absolute Deviation”, J. Amer. Stat. Assoc., 88 (424), pp. 1273-1283, 1993. Equation 2.1, but note that they use “low” and “high” medians: Sn = c * 1.1926 * LOMED_{i} ( HIMED_{j} (|x_i - x_j|) )
Note that the implementation of the original formulation is slow for large n. As the original formulation is identical to using a true median for odd-length series, we do so here automatically to gain a significant speedup.
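The original formulation quoted above can be sketched directly with the low and high medians from Python's `statistics` module. This is a naive O(n^2) illustration under stated assumptions: the name `sn_sketch` is hypothetical, the finite-sample correction factor c is left out (treated as 1), and NaN values are not handled.

```python
from statistics import median_high, median_low


def sn_sketch(data, scale=True):
    """Naive Sn: low median over i of the high median over j of |x_i - x_j|."""
    # For each point, a typical distance to the other points (high median).
    inner = [median_high([abs(xi - xj) for xj in data]) for xi in data]
    # Low median of those typical distances.
    sn = median_low(inner)
    # 1.1926 makes Sn consistent with the standard deviation for normal data.
    return 1.1926 * sn if scale else sn


data = [1, 2, 3, 4, 100]
print(sn_sketch(data, scale=False))
print(sn_sketch(data))
```

The single large value in `data` has little influence on the result, which is the point of a robust scale estimate.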
- verify.metrics.absPercError(predicted, observed)
Absolute percentage error
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
- Returns:
perc – Array of absolute percentage errors
- Return type:
array
- verify.metrics.accuracy(data, climate=None)
Convenience function to calculate a selection of unscaled accuracy measures
- Parameters:
data (array-like) – Array-like (list, numpy array, etc.) of predictions
climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.
- Returns:
out – Dictionary containing unscaled accuracy measures:
MSE - mean squared error
RMSE - root mean squared error
MAE - mean absolute error
MdAE - median absolute error
- Return type:
dict
See also
- verify.metrics.bias(predicted, observed)
Scale-dependent bias as measured by the mean error
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
- Returns:
bias – Mean error of prediction
- Return type:
float
- verify.metrics.forecastError(predicted, observed, full=True)
forecast error, defined using the sign convention of J&S ch. 5
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
full (boolean, optional) – Switch determining nature of return value. When it is True (the default) the function returns the errors as well as the predicted and observed values as numpy arrays of floats, when False only the array of forecast errors is returned.
- Returns:
err (array) – the forecast error
pred (array) – Optional return array of predicted values as floats, included if full is True
obse (array) – Optional return array of observed values as floats, included if full is True
Notes
J&S: Jolliffe and Stephenson (Ch. 5)
- verify.metrics.logAccuracy(predicted, observed, base=10, mask=True)
Log Accuracy Ratio, defined as log(predicted/observed) or log(predicted)-log(observed)
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
base (number, optional) – Base to use for logarithmic transform (allows 10, 2, and ‘e’) (default=10)
mask (boolean, optional) – Switch to set masking behaviour. If True (default) the function will mask out NaN and negative values, and will return a masked array. If False, the presence of negative numbers will raise a ValueError and NaN will propagate through the calculation.
- Returns:
logacc – Array of log accuracy ratios
- Return type:
array or masked array
Notes
Using base 2 is computationally much faster, so unless the base is important to interpretation we recommend using that.
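A minimal sketch of the log accuracy ratio, assuming strictly positive inputs and a numeric base. The name `log_accuracy_sketch` is hypothetical, and the masking behaviour described for the `mask` parameter is not reproduced here.

```python
import math


def log_accuracy_sketch(predicted, observed, base=10):
    """Log of the accuracy ratio predicted/observed, element by element."""
    # log_base(p / o) computed as ln(p / o) / ln(base).
    return [math.log(p / o) / math.log(base) for p, o in zip(predicted, observed)]


# Over- and under-prediction by the same factor give equal and opposite values.
print(log_accuracy_sketch([10.0, 1.0, 2.0], [1.0, 10.0, 2.0]))
```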
- verify.metrics.meanAPE(predicted, observed, mfunc=<function mean>)
mean absolute percentage error
- Parameters:
predicted (array-like) – predicted data for which to calculate the mean absolute percentage error
observed (array-like or float) – observation vector (or scalar climatological value) to use as reference value
mfunc (function) – function to calculate mean (default=np.mean)
- Returns:
mape – the mean absolute percentage error
- Return type:
float
- verify.metrics.meanAbsError(data, climate=None)
mean absolute error of a data set relative to some reference value
- Parameters:
data (array-like) – data for which to calculate the mean absolute error; the default reference is persistence
climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.
- Returns:
out – the mean absolute error of the data set relative to the chosen reference
- Return type:
float
See also
Notes
The chosen reference can be persistence, a provided climatological mean (scalar) or a provided climatology (observation vector).
- verify.metrics.meanPercentageError(predicted, observed)
Order-dependent bias as measured by the mean percentage error
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
- Returns:
mpe – Mean percentage error of prediction
- Return type:
float
- verify.metrics.meanSquaredError(data, climate=None)
Mean squared error of a data set relative to a reference value
- Parameters:
data (array-like) – data to calculate mean squared error, default reference is persistence
climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.
- Returns:
out – the mean-squared-error of the data set relative to the chosen reference
- Return type:
float
See also
Notes
The chosen reference can be persistence, a provided climatological mean (scalar), or a provided climatology (observation vector).
- verify.metrics.medAbsDev(series, scale=False, median=False)
Computes the median absolute deviation from the median
- Parameters:
series (array-like) – Input data
scale (boolean) – Scale so that median absolute deviation is the same as the standard deviation for normal distributions (default=False)
median (boolean) – Return the median of the series as well as the median absolute deviation (default=False)
- Returns:
mad (float) – median absolute deviation
perc50 (float) – median of series, optional output
- verify.metrics.medAbsError(data, climate=None)
median absolute error of a data set relative to some reference value
- Parameters:
data (array-like) – data to calculate median absolute error, default reference is persistence
climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.
- Returns:
out – the median absolute error of the data set relative to the chosen reference
- Return type:
float
See also
Notes
The chosen reference can be persistence, a provided climatological mean (scalar) or a provided climatology (observation vector).
- verify.metrics.medSymAccuracy(predicted, observed, mfunc=<function median>, method=None)
Scaled measure of accuracy that is not biased to over- or under-predictions.
- Parameters:
predicted (array-like) – predicted data for which to calculate the median symmetric accuracy
observed (array-like or float) – observation vector (or scalar climatological value) to use as reference value
mfunc (function) – function for calculating the median (default=np.median)
method (string, optional) – Method to use for calculating the median symmetric accuracy (MSA). Options are ‘log’, which uses the median of the re-exponentiated absolute log accuracy; ‘UPE’, which calculates MSA using the unsigned percentage error; and None (default), in which case the method is implemented as described in the Notes. The UPE method has reduced accuracy compared to the other methods and is included primarily for testing purposes.
- Returns:
msa – the median symmetric accuracy
- Return type:
float
Notes
The accuracy ratio is given by (prediction/observation). To avoid the bias inherent in mean/median percentage error metrics, we use the log of the accuracy ratio (which is symmetric about 0 for changes of the same factor). Specifically, the Median Symmetric Accuracy is found by calculating the median of the absolute log accuracy, and re-exponentiating: g = exp( median( |ln(pred) - ln(obs)| ) )
This can be expressed as a symmetric percentage error by shifting by one unit and multiplying by 100: MSA = 100*(g-1)
It can also be shown that this is identically equivalent to the median unsigned percentage error, where the unsigned relative error is given by: (y’ - x’)/x’
where y’ is always the larger of the (observation, prediction) pair, and x’ is always the smaller.
Reference: Morley, S.K., Brito, T.V., and Welling, D.T. (2018), Measures of Model Performance Based on the Log Accuracy Ratio, Space Weather, 16(1), pp. 69-88, doi: 10.1002/2017SW001669.
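The two routes to the median symmetric accuracy described in the notes can be sketched side by side. The names `msa_log` and `msa_upe` are hypothetical, inputs are assumed to be strictly positive sequences of equal length, and NaN handling is omitted.

```python
import math
from statistics import median


def msa_log(predicted, observed):
    """MSA = 100 * (g - 1), with g = exp(median(|ln(pred) - ln(obs)|))."""
    abs_log_q = [abs(math.log(p) - math.log(o)) for p, o in zip(predicted, observed)]
    g = math.exp(median(abs_log_q))
    return 100.0 * (g - 1.0)


def msa_upe(predicted, observed):
    """MSA as 100 times the median unsigned relative error (y' - x') / x'."""
    # y' is the larger and x' the smaller of each (observation, prediction) pair.
    upe = [(max(p, o) - min(p, o)) / min(p, o) for p, o in zip(predicted, observed)]
    return 100.0 * median(upe)


predicted = [2.0, 1.0, 8.0]
observed = [1.0, 2.0, 2.0]
print(msa_log(predicted, observed))
print(msa_upe(predicted, observed))
```

For this odd-length example both routes agree to within rounding. When the median averages the two middle values of an even-length series, the two routes can differ slightly, because exponentiating an average is not the same as averaging exponentials.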
- verify.metrics.median(data, ws=None)
Weighted median
- Parameters:
data (array) – Array of data values
ws (None or array) – None, which implies equal weighting, or an array of weights.
- Returns:
wmedian – (Weighted) median of input series
- Return type:
float
- verify.metrics.medianLogAccuracy(predicted, observed, mfunc=<function median>, base=10)
Order-dependent bias as measured by the median of the log accuracy ratio
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
mfunc (function, optional) – Function to use for central tendency (default: numpy.median)
base (number, optional) – Base to use for logarithmic transform (default: 10)
- Returns:
mla – Median log accuracy of prediction
- Return type:
float
Notes
Reference: Morley, S.K. (2016), Alternatives to accuracy and bias metrics based on percentage errors for radiation belt modeling applications, Los Alamos National Laboratory Report, LA-UR-15-24592.
- verify.metrics.nRMSE(predicted, observed)
normalized root mean squared error of a data set relative to a reference value
- Parameters:
predicted (array-like) – predicted data for which to calculate the normalized root mean squared error
observed (array-like or float) – observation vector (or scalar climatological value) to use as reference value
- Returns:
out – the normalized root-mean-squared-error of the data set relative to the observations
- Return type:
float
See also
Notes
The chosen reference can be an observation vector or a provided climatological mean (scalar). This definition is due to Yu and Ridley (2008).
References: Yu, Y., and A. J. Ridley (2008), Validation of the space weather modeling framework using ground-based magnetometers, Space Weather, 6, S05002, doi:10.1029/2007SW000345.
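A minimal sketch of a normalized RMSE, assuming the normalization is by the sum of squared observations (our reading of the Yu and Ridley definition, stated here as an assumption rather than taken from this page). The name `nrmse_sketch` is hypothetical; only vector observations are handled and NaN values are not excluded.

```python
import math


def nrmse_sketch(predicted, observed):
    """sqrt( sum((pred - obs)^2) / sum(obs^2) ), an assumed normalization."""
    squared_error = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    squared_obs = sum(o ** 2 for o in observed)
    return math.sqrt(squared_error / squared_obs)


print(nrmse_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 2.0]))
```

Under this normalization a perfect prediction gives 0, and a prediction of zero everywhere gives 1.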
- verify.metrics.normSn(data, **kwargs)
Computes the normalized Sn statistic, a scaled measure of spread.
- Parameters:
data (array-like) – data to calculate normSn statistic for
**kwargs (dict) – Optional keyword arguments (see Sn)
- Returns:
normSn – the normalized Sn statistic
- Return type:
float
See also
Notes
We here scale the Sn estimator by the median, giving a non-symmetric alternative to the robust coefficient of variation (rCV).
- verify.metrics.percBetter(predict1, predict2, observed)
The percentage of cases when method A was closer to actual than method B
- Parameters:
predict1 (array-like) – Array-like (list, numpy array, etc.) of predictions from model A
predict2 (array-like) – Array-like (list, numpy array, etc.) of predictions from model B
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
- Returns:
percBetter – The percentage of observations where method A was closer to observation than method B
- Return type:
float
Notes
For example, if we want to know whether a new forecast performs better than a reference forecast…
Examples
>>> import verify
>>> data = [3,4,5,6,7,8]
>>> p_ref = [5.5]*6  # mean prediction
>>> p_good = [4,5,4,7,7,8]  # "good" model prediction
>>> verify.percBetter(p_good, p_ref, data)
66.66666666666666
That is, two-thirds (66.67%) of the predictions have a lower absolute error in p_good than in p_ref.
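The counting behind this number can be reproduced without the library. The sketch below is illustrative only: `perc_better_sketch` is a hypothetical name, and counting a tie as "not better" is an assumption.

```python
def perc_better_sketch(predict1, predict2, observed):
    """Percentage of cases where predict1 has the smaller absolute error."""
    # Strict inequality: ties are not counted as wins (an assumption).
    wins = sum(
        abs(a - o) < abs(b - o)
        for a, b, o in zip(predict1, predict2, observed)
    )
    return 100.0 * wins / len(observed)


data = [3, 4, 5, 6, 7, 8]
p_ref = [5.5] * 6            # mean prediction
p_good = [4, 5, 4, 7, 7, 8]  # "good" model prediction
print(perc_better_sketch(p_good, p_ref, data))
```

Here p_good wins in four of the six cases; it loses for the observations 5 and 6, where the mean prediction is off by only 0.5.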
- verify.metrics.percError(predicted, observed)
Percentage Error
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
- Returns:
perc – Array of forecast errors expressed as a percentage
- Return type:
array
- verify.metrics.rCV(predicted)
robust coefficient of variation
- Parameters:
predicted (array-like) – Predicted input
- Returns:
rcv – robust coefficient of variation (see notes)
- Return type:
float
Notes
Computes the “robust coefficient of variation”, i.e. median absolute deviation divided by the median
By analogy with the coefficient of variation, which is the standard deviation divided by the mean, rCV gives the median absolute deviation (aka rSD) divided by the median, thereby providing a scaled measure of precision/spread.
- verify.metrics.rSD(predicted)
robust standard deviation
- Parameters:
predicted (array-like) – Predicted input
- Returns:
rsd – robust standard deviation, the scaled med abs dev
- Return type:
float
Notes
Computes the “robust standard deviation”, i.e. the median absolute deviation times a correction factor
The median absolute deviation (medAbsDev) scaled by a factor of 1.4826 recovers the standard deviation when applied to a normal distribution. However, unlike the standard deviation the medAbsDev has a high breakdown point and is therefore considered a robust estimator.
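The robustness described above can be seen in a short sketch that scales the median absolute deviation by 1.4826. The names `mad_sketch` and `rsd_sketch` are hypothetical and NaN handling is omitted.

```python
from statistics import median, stdev


def mad_sketch(series):
    """Median absolute deviation from the median."""
    centre = median(series)
    return median([abs(x - centre) for x in series])


def rsd_sketch(series):
    """Robust standard deviation: MAD scaled by 1.4826."""
    return 1.4826 * mad_sketch(series)


clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
spoiled = [1, 2, 3, 4, 5, 6, 7, 8, 900]  # one gross outlier

# The robust estimate is unchanged by the outlier; the standard deviation is not.
print(rsd_sketch(clean), rsd_sketch(spoiled))
print(stdev(clean), stdev(spoiled))
```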
- verify.metrics.scaledAccuracy(predicted, observed)
Calculate scaled and relative accuracy measures
- Parameters:
predicted (array-like) – Array-like (list, numpy array, etc.) of predictions
observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity
- Returns:
out – Dictionary containing scaled or relative accuracy measures:
nRMSE - normalized root mean squared error
MASE - mean absolute scaled error
MAPE - mean absolute percentage error
MdAPE - median absolute percentage error
MdSymAcc - median symmetric accuracy
- Return type:
dict
See also
- verify.metrics.scaledError(predicted, observed)
Scaled errors, see Hyndman and Koehler (2006)
- Parameters:
predicted (array-like) – predicted data for which to calculate scaled error
observed (array-like or float) – observation vector (or scalar climatological value) to use as reference value
- Returns:
q – the scaled error
- Return type:
float
Notes
References: R.J. Hyndman and A.B. Koehler, Another look at measures of forecast accuracy, Intl. J. Forecasting, 22, pp. 679-688, 2006.
See also
- verify.metrics.skill(A_data, A_ref, A_perf=0)
Generic forecast skill score for quantifying forecast improvement
- Parameters:
A_data (float) – Accuracy measure of data set
A_ref (float) – Accuracy measure for reference forecast
A_perf (float, optional) – Accuracy measure for “perfect forecast” (Default = 0)
- Returns:
ss_ref – Forecast skill for the given forecast, relative to the reference, using the chosen accuracy measure
- Return type:
float
Notes
See section 7.1.4 of Wilks [2006] (Statistical methods in the atmospheric sciences) for details.
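A minimal sketch of the generic skill score in the form given by Wilks, (A - A_ref) / (A_perf - A_ref), expressed here as a percentage. The name `skill_sketch` is hypothetical, and whether the library returns a fraction or a percentage is an assumption of this sketch.

```python
def skill_sketch(a_data, a_ref, a_perf=0.0):
    """Improvement over the reference as a percentage of the possible improvement."""
    return 100.0 * (a_data - a_ref) / (a_perf - a_ref)


# Example with an error-type accuracy measure (e.g. mean squared error),
# for which the perfect value is 0.
print(skill_sketch(2.0, 8.0))   # forecast better than the reference
print(skill_sketch(8.0, 8.0))   # no improvement over the reference
print(skill_sketch(12.0, 8.0))  # forecast worse than the reference
```

A perfect forecast scores 100, a forecast no better than the reference scores 0, and a forecast worse than the reference scores below 0.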
- verify.metrics.symmetricSignedBias(predicted, observed)
Symmetric signed bias, expressed as a percentage
- Parameters:
predicted (array-like) – List of predicted values
observed (array-like) – List of observed values
- Returns:
bias – symmetric signed bias, as a percentage
- Return type:
float
Notes
Reference: Morley, S.K., Brito, T.V., and Welling, D.T. (2018), Measures of Model Performance Based on the Log Accuracy Ratio, Space Weather, 16(1), pp. 69-88, doi: 10.1002/2017SW001669.
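As an illustration, the symmetric signed percentage bias can be sketched from the median log accuracy ratio M = median(ln(pred/obs)) as 100 * sign(M) * (exp(|M|) - 1). This formula is our reading of the cited paper and is not stated on this page; the name `sspb_sketch` is hypothetical, and inputs are assumed strictly positive with no NaN values.

```python
import math
from statistics import median


def sspb_sketch(predicted, observed):
    """Symmetric signed bias (percent) from the median log accuracy ratio."""
    mdlq = median([math.log(p / o) for p, o in zip(predicted, observed)])
    # The sign carries the direction of the bias, the magnitude its size.
    return 100.0 * math.copysign(1.0, mdlq) * (math.exp(abs(mdlq)) - 1.0)


observed = [1.0, 2.0, 3.0]
print(sspb_sketch([2.0, 4.0, 6.0], observed))  # over-prediction by a factor of 2
print(sspb_sketch([0.5, 1.0, 1.5], observed))  # under-prediction by a factor of 2
```

Over- and under-prediction by the same factor give values of equal magnitude and opposite sign, which is the symmetry the metric is named for.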