Metrics - Core metrics and classes

Function/Class Documentation

Module containing verification and performance metrics

With the exception of the ContingencyNxN and Contingency2x2 classes, the inputs for all metrics are assumed to be array-like and 1D. Bad values are assumed to be stored as NaN and these are excluded in metric calculations.

Author: Steve Morley

Institution: Los Alamos National Laboratory

Copyright (c) 2017, Triad National Security, LLC All rights reserved.

verify.metrics.MASE(predicted, observed)

Mean Absolute Scaled Error

Parameters:
  • predicted (array-like) – predicted data for which to calculate MASE

  • observed (float) – observation vector (or climatological value (scalar)) to use as reference value

Returns:

out – the mean absolute scaled error of the data set

Return type:

float

See also

scaledError

Notes

References: R.J. Hyndman and A.B. Koehler, Another look at measures of forecast accuracy, Intl. J. Forecasting, 22, pp. 679-688, 2006.
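As an illustration of the quantity described by Hyndman and Koehler (2006), the following is a minimal pure-Python sketch, not the library's implementation. It assumes `observed` is a vector (not a scalar), that the error is defined as predicted minus observed, and that the scale is the mean absolute one-step difference of the observations. The name `mase_sketch` is hypothetical.

```python
def mase_sketch(predicted, observed):
    """Mean absolute scaled error, following Hyndman and Koehler (2006)."""
    if len(predicted) != len(observed) or len(observed) < 2:
        raise ValueError("need equal-length series with at least two points")
    # Scale: mean absolute error of the naive (persistence) forecast
    naive = [abs(b - a) for a, b in zip(observed[:-1], observed[1:])]
    scale = sum(naive) / len(naive)
    # Scaled errors (assumed convention: predicted minus observed)
    scaled = [(p - o) / scale for p, o in zip(predicted, observed)]
    # MASE is the mean of the absolute scaled errors
    return sum(abs(q) for q in scaled) / len(scaled)


obs = [1.0, 2.0, 3.0, 4.0]
print(mase_sketch([1.5, 2.5, 3.5, 4.5], obs))  # 0.5
```

A value below 1 means the prediction has a smaller mean absolute error than the naive forecast that simply repeats the previous observation.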

verify.metrics.RMSE(data, climate=None)

Calculate the root mean squared error of a data set relative to a reference value

Parameters:
  • data (array-like) – data for which to calculate the root mean squared error; the default reference is persistence

  • climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.

Returns:

out – the root-mean-squared error of the data set relative to the chosen reference

Return type:

float

Notes

The chosen reference can be persistence, a provided climatological mean (scalar) or a provided climatology (observation vector).
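The three reference choices can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's code: persistence is taken to mean comparing each value with the one before it, and the helper name `rmse_sketch` is hypothetical. NaN handling is omitted.

```python
import math


def rmse_sketch(data, climate=None):
    """RMSE relative to persistence, a scalar mean, or a reference vector."""
    if climate is None:
        # Persistence (assumed): each value is compared with its predecessor
        diffs = [b - a for a, b in zip(data[:-1], data[1:])]
    elif isinstance(climate, (int, float)):
        # Climatological mean: one scalar reference for every point
        diffs = [x - climate for x in data]
    else:
        # Climatology: an observation vector of the same length
        if len(climate) != len(data):
            raise ValueError("data and climate must have the same length")
        diffs = [x - c for x, c in zip(data, climate)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


series = [1.0, 2.0, 3.0, 4.0]
print(rmse_sketch(series))               # 1.0 (relative to persistence)
print(rmse_sketch(series, climate=2.5))  # about 1.118 (relative to the mean)
print(rmse_sketch(series, climate=series))  # 0.0 (perfect match)
```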

verify.metrics.Sn(data, scale=True, correct=True)

Sn statistic, a robust measure of scale

Parameters:
  • data (array-like) – data to calculate Sn statistic for

  • scale (boolean) – Scale so that the output matches the standard deviation when the distribution is normal (default=True)

  • correct (boolean) – Set a correction factor (default=True)

Returns:

Sn – the Sn statistic

Return type:

float

See also

medAbsDev

Notes

Sn is more efficient than the median absolute deviation, and is not constructed with the assumption of a symmetric distribution, because it does not measure distance from an assumed central location. To quote RC1993, “…Sn looks at a typical distance between observations, which is still valid at asymmetric distributions.”

[RC1993] P.J. Rousseeuw and C. Croux, “Alternatives to the Median Absolute Deviation”, J. Amer. Stat. Assoc., 88 (424), pp. 1273-1283, 1993. Equation 2.1, but note that they use “low” and “high” medians: Sn = c * 1.1926 * LOMED_{i} ( HIMED_{j} (|x_i - x_j|) )

Note that the implementation of the original formulation is slow for large n. As the original formulation is identical to using a true median for odd-length series, we do so here automatically to gain a significant speedup.
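The original low/high-median formulation can be written directly as a short O(n²) sketch. This is illustrative only and is not the library's optimized code: the finite-sample correction factor c is omitted (taken as 1), NaN handling is left out, and the name `sn_sketch` is hypothetical.

```python
def sn_sketch(data):
    """Naive Sn = 1.1926 * LOMED_i( HIMED_j |x_i - x_j| ), with c taken as 1."""
    x = list(data)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    himed_idx = n // 2            # 0-based index of the high median
    lomed_idx = (n + 1) // 2 - 1  # 0-based index of the low median
    inner = []
    for xi in x:
        # Distances from x_i to every point (including itself), sorted
        dists = sorted(abs(xi - xj) for xj in x)
        inner.append(dists[himed_idx])
    return 1.1926 * sorted(inner)[lomed_idx]


print(sn_sketch([1, 2, 3, 4, 5]))  # 1.1926
```

For odd n the two index expressions coincide, which is why a true median can be substituted without changing the result.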

verify.metrics.absPercError(predicted, observed)

Absolute percentage error

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

Returns:

perc – Array of absolute percentage errors

Return type:

array

verify.metrics.accuracy(data, climate=None)

Convenience function to calculate a selection of unscaled accuracy measures

Parameters:
  • data (array-like) – Array-like (list, numpy array, etc.) of predictions

  • climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.

Returns:

out – Dictionary containing unscaled accuracy measures:
  • MSE - mean squared error
  • RMSE - root mean squared error
  • MAE - mean absolute error
  • MdAE - median absolute error

Return type:

dict

verify.metrics.bias(predicted, observed)

Scale-dependent bias as measured by the mean error

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

Returns:

bias – Mean error of prediction

Return type:

float

verify.metrics.forecastError(predicted, observed, full=True)

forecast error, defined using the sign convention of J&S ch. 5

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

  • full (boolean, optional) – Switch determining nature of return value. When it is True (the default) the function returns the errors as well as the predicted and observed values as numpy arrays of floats, when False only the array of forecast errors is returned.

Returns:

  • err (array) – the forecast error

  • pred (array) – Optional return array of predicted values as floats, included if full is True

  • obse (array) – Optional return array of observed values as floats, included if full is True

Notes

J&S: Jolliffe and Stephenson (Ch. 5)
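A minimal sketch of the sign convention, assuming it is error = predicted minus observed (so a positive error is an over-prediction). This assumption is not stated explicitly in this section; the helper names are hypothetical and this is not the library's code.

```python
def forecast_error_sketch(predicted, observed):
    """Element-wise forecast error, assumed to be predicted minus observed."""
    return [float(p) - float(o) for p, o in zip(predicted, observed)]


def bias_sketch(predicted, observed):
    """Mean error: the scale-dependent bias under the same convention."""
    err = forecast_error_sketch(predicted, observed)
    return sum(err) / len(err)


print(forecast_error_sketch([2, 3], [1, 5]))  # [1.0, -2.0]
print(bias_sketch([2, 3], [1, 5]))            # -0.5
```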

verify.metrics.logAccuracy(predicted, observed, base=10, mask=True)

Log Accuracy Ratio, defined as log(predicted/observed) or log(predicted)-log(observed)

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

  • base (number, optional) – Base to use for logarithmic transform (allows 10, 2, and ‘e’) (default=10)

  • mask (boolean, optional) – Switch to set masking behaviour. If True (default) the function will mask out NaN and negative values, and will return a masked array. If False, the presence of negative numbers will raise a ValueError and NaN will propagate through the calculation.

Returns:

logacc – Array of log accuracy ratios

Return type:

array or masked array

Notes

Using base 2 is computationally much faster, so unless the base is important to interpretation we recommend using that.
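The definition log(predicted) - log(observed) can be sketched without NumPy as below. This is an illustration, not the library's code: it has no masking option and simply raises on non-positive input. The name `log_accuracy_sketch` is hypothetical.

```python
import math


def log_accuracy_sketch(predicted, observed, base=10):
    """Log accuracy ratio log(predicted/observed) in base 10, 2, or 'e'."""
    logfuncs = {10: math.log10, 2: math.log2, 'e': math.log}
    if base not in logfuncs:
        raise ValueError("base must be 10, 2, or 'e'")
    log = logfuncs[base]
    out = []
    for p, o in zip(predicted, observed):
        if p <= 0 or o <= 0:
            raise ValueError("log accuracy ratio needs positive values")
        out.append(log(p) - log(o))
    return out


# Over-prediction by a factor of 10 gives +1, under-prediction gives -1
print(log_accuracy_sketch([10, 100], [1, 1000]))  # [1.0, -1.0]
```

Over- and under-prediction by the same factor give values of equal magnitude and opposite sign, which is the symmetry the log-based metrics rely on.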

verify.metrics.meanAPE(predicted, observed, mfunc=<function mean>)

mean absolute percentage error

Parameters:
  • predicted (array-like) – predicted data for which to calculate mean absolute percentage error

  • observed (float) – observation vector (or climatological value (scalar)) to use as reference value

  • mfunc (function) – function to calculate mean (default=np.mean)

Returns:

mape – the mean absolute percentage error

Return type:

float
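A minimal sketch of the mean absolute percentage error, assuming the percentage error is 100 * (predicted - observed) / observed. The helper names are hypothetical and the library's NaN handling and `mfunc` argument are not reproduced.

```python
def abs_perc_error_sketch(predicted, observed):
    """Absolute percentage errors, assuming 100 * |pred - obs| / |obs|."""
    return [100.0 * abs((p - o) / o) for p, o in zip(predicted, observed)]


def mean_ape_sketch(predicted, observed):
    """Mean of the absolute percentage errors."""
    ape = abs_perc_error_sketch(predicted, observed)
    return sum(ape) / len(ape)


print(mean_ape_sketch([200.0], [100.0]))  # 100.0 (factor-of-two over-prediction)
print(mean_ape_sketch([50.0], [100.0]))   # 50.0 (factor-of-two under-prediction)
```

The two printed values show the asymmetry of percentage errors: over- and under-prediction by the same factor are penalized differently.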

verify.metrics.meanAbsError(data, climate=None)

mean absolute error of a data set relative to some reference value

Parameters:
  • data (array-like) – data for which to calculate the mean absolute error; the default reference is persistence

  • climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.

Returns:

out – the mean absolute error of the data set relative to the chosen reference

Return type:

float

Notes

The chosen reference can be persistence, a provided climatological mean (scalar) or a provided climatology (observation vector).

verify.metrics.meanPercentageError(predicted, observed)

Order-dependent bias as measured by the mean percentage error

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

Returns:

mpe – Mean percentage error of prediction

Return type:

float

verify.metrics.meanSquaredError(data, climate=None)

Mean squared error of a data set relative to a reference value

Parameters:
  • data (array-like) – data to calculate mean squared error, default reference is persistence

  • climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.

Returns:

out – the mean-squared-error of the data set relative to the chosen reference

Return type:

float

See also

RMSE, meanAbsError

Notes

The chosen reference can be persistence, a provided climatological mean (scalar), or a provided climatology (observation vector).

verify.metrics.medAbsDev(series, scale=False, median=False)

Computes the median absolute deviation from the median

Parameters:
  • series (array-like) – Input data

  • scale (boolean) – Scale so that median absolute deviation is the same as the standard deviation for normal distributions (default=False)

  • median (boolean) – Return the median of the series as well as the median absolute deviation (default=False)

Returns:

  • mad (float) – median absolute deviation

  • perc50 (float) – median of series, optional output

verify.metrics.medAbsError(data, climate=None)

median absolute error of a data set relative to some reference value

Parameters:
  • data (array-like) – data to calculate median absolute error, default reference is persistence

  • climate (array-like or float, optional) – Array-like (list, numpy array, etc.) or float of observed values of scalar quantity. If climate is None (default) then the accuracy is assessed relative to persistence.

Returns:

out – the median absolute error of the data set relative to the chosen reference

Return type:

float

Notes

The chosen reference can be persistence, a provided climatological mean (scalar) or a provided climatology (observation vector).

verify.metrics.medSymAccuracy(predicted, observed, mfunc=<function median>, method=None)

Scaled measure of accuracy that is not biased to over- or under-predictions.

Parameters:
  • predicted (array-like) – predicted data for which to calculate median symmetric accuracy

  • observed (float) – observation vector (or climatological value (scalar)) to use as reference value

  • mfunc (function) – function for calculating the median (default=np.median)

  • method (string, optional) – Method to use for calculating the median symmetric accuracy (MSA). Options are ‘log’ which uses the median of the re-exponentiated absolute log accuracy, ‘UPE’ which calculates MSA using the unsigned percentage error, and None (default), in which case the method is implemented as described in the Notes. The UPE method has reduced accuracy compared to the other methods and is included primarily for testing purposes.

Returns:

msa – the median symmetric accuracy

Return type:

float

Notes

The accuracy ratio is given by (prediction/observation). To avoid the bias inherent in mean/median percentage error metrics we use the log of the accuracy ratio, which is symmetric about 0 for changes of the same factor. Specifically, the Median Symmetric Accuracy is found by calculating the median of the absolute log accuracy, and re-exponentiating: g = exp( median( |ln(pred) - ln(obs)| ) )

This can be expressed as a symmetric percentage error by shifting by one unit and multiplying by 100: MSA = 100*(g-1)

It can also be shown that this is identically equivalent to the median unsigned percentage error, where the unsigned relative error is given by: (y’ - x’)/x’

where y’ is always the larger of the (observation, prediction) pair, and x’ is always the smaller.

Reference: Morley, S.K., Brito, T.V., and Welling, D.T. (2018), Measures of Model Performance Based on the Log Accuracy Ratio, Space Weather, 16(1), pp. 69-88, doi: 10.1002/2017SW001669.
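The two routes to the median symmetric accuracy described in these notes can be compared with a small pure-Python sketch. It is illustrative only, assumes strictly positive inputs, and uses hypothetical function names rather than the library's implementation.

```python
import math
from statistics import median


def msa_log_sketch(predicted, observed):
    """MSA = 100 * (exp(median(|ln(pred) - ln(obs)|)) - 1)."""
    abs_log_q = [abs(math.log(p) - math.log(o))
                 for p, o in zip(predicted, observed)]
    return 100.0 * (math.exp(median(abs_log_q)) - 1.0)


def msa_upe_sketch(predicted, observed):
    """MSA as the median unsigned percentage error (y' - x') / x'."""
    upe = []
    for p, o in zip(predicted, observed):
        hi, lo = max(p, o), min(p, o)  # y' is the larger, x' the smaller
        upe.append((hi - lo) / lo)
    return 100.0 * median(upe)


pred = [2.0, 1.0, 4.0]
obs = [1.0, 2.0, 1.0]
print(msa_log_sketch(pred, obs))  # close to 100
print(msa_upe_sketch(pred, obs))  # 100.0
```

In this example the typical prediction is off by a factor of two in either direction, and both routes report a symmetric error of about 100%.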

verify.metrics.median(data, ws=None)

Weighted median

Parameters:
  • data (array) – Array of data values

  • ws (None or array) – None, which implies equal weighting, or an array of weights.

Returns:

wmedian – (Weighted) median of input series

Return type:

float

verify.metrics.medianLogAccuracy(predicted, observed, mfunc=<function median>, base=10)

Order-dependent bias as measured by the median of the log accuracy ratio

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

  • mfunc (function, optional) – Function to use for central tendency (default: numpy.median)

  • base (number, optional) – Base to use for logarithmic transform (default: 10)

Returns:

mla – Median log accuracy of prediction

Return type:

float

Notes

Reference: Morley, S.K. (2016), Alternatives to accuracy and bias metrics based on percentage errors for radiation belt modeling applications, Los Alamos National Laboratory Report, LA-UR-15-24592.

verify.metrics.nRMSE(predicted, observed)

normalized root mean squared error of a data set relative to a reference value

Parameters:
  • predicted (array-like) – predicted data for which to calculate normalized root mean squared error

  • observed (float) – observation vector (or climatological value (scalar)) to use as reference value

Returns:

out – the normalized root-mean-squared-error of the data set relative to the observations

Return type:

float

See also

RMSE

Notes

The chosen reference can be an observation vector or a provided climatological mean (scalar). This definition is due to Yu and Ridley (2008).

References: Yu, Y., and A. J. Ridley (2008), Validation of the space weather modeling framework using ground-based magnetometers, Space Weather, 6, S05002, doi:10.1029/2007SW000345.
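A hedged sketch of the normalization, assuming it takes the form sqrt( sum((pred - obs)^2) / sum(obs^2) ). That form is an assumption made here for illustration and should be checked against Yu and Ridley (2008) and the library source. The name `nrmse_sketch` is hypothetical, and `observed` is taken to be a vector.

```python
import math


def nrmse_sketch(predicted, observed):
    """Assumed form: sqrt(sum((pred - obs)^2) / sum(obs^2))."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    num = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    den = sum(o ** 2 for o in observed)
    return math.sqrt(num / den)


print(nrmse_sketch([3.0, 4.0], [3.0, 4.0]))  # 0.0 (perfect prediction)
print(nrmse_sketch([0.0, 0.0], [3.0, 4.0]))  # 1.0 (predicting zero everywhere)
```

Under this assumed form a perfect prediction scores 0 and a prediction of zero everywhere scores 1.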

verify.metrics.normSn(data, **kwargs)

Computes the normalized Sn statistic, a scaled measure of spread.

Parameters:
  • data (array-like) – data to calculate normSn statistic for

  • **kwargs (dict) – Optional keyword arguments (see Sn)

Returns:

normSn – the normalized Sn statistic

Return type:

float

See also

rCV

Notes

We here scale the Sn estimator by the median, giving a non-symmetric alternative to the robust coefficient of variation (rCV).

verify.metrics.percBetter(predict1, predict2, observed)

The percentage of cases when method A was closer to actual than method B

Parameters:
  • predict1 (array-like) – Array-like (list, numpy array, etc.) of predictions from model A

  • predict2 (array-like) – Array-like (list, numpy array, etc.) of predictions from model B

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

Returns:

percBetter – The percentage of observations where method A was closer to observation than method B

Return type:

float

Notes

For example, if we want to know whether a new forecast performs better than a reference forecast…

Examples

>>> import verify
>>> data = [3,4,5,6,7,8]
>>> p_ref = [5.5]*6 #mean prediction
>>> p_good = [4,5,4,7,7,8] #"good" model prediction
>>> verify.percBetter(p_good, p_ref, data)
66.66666666666666

That is, two-thirds (66.67%) of the predictions have a lower absolute error in p_good than in p_ref.

verify.metrics.percError(predicted, observed)

Percentage Error

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

Returns:

perc – Array of forecast errors expressed as a percentage

Return type:

array

verify.metrics.rCV(predicted)

robust coefficient of variation

Parameters:

predicted (array-like) – Predicted input

Returns:

rcv – robust coefficient of variation (see notes)

Return type:

float

Notes

Computes the “robust coefficient of variation”, i.e. median absolute deviation divided by the median

By analogy with the coefficient of variation, which is the standard deviation divided by the mean, rCV gives the median absolute deviation (aka rSD) divided by the median, thereby providing a scaled measure of precision/spread.

verify.metrics.rSD(predicted)

robust standard deviation

Parameters:

predicted (array-like) – Predicted input

Returns:

rsd – robust standard deviation, the scaled med abs dev

Return type:

float

Notes

Computes the “robust standard deviation”, i.e. the median absolute deviation times a correction factor

The median absolute deviation (medAbsDev) scaled by a factor of 1.4826 recovers the standard deviation when applied to a normal distribution. However, unlike the standard deviation the medAbsDev has a high breakdown point and is therefore considered a robust estimator.
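The robustness claim can be illustrated with a short sketch using only the standard library. The function name `med_abs_dev_sketch` is hypothetical and this is not the library's implementation; NaN handling is omitted.

```python
from statistics import median, stdev


def med_abs_dev_sketch(series, scale=False):
    """Median absolute deviation from the median, optionally times 1.4826."""
    values = list(series)
    med = median(values)
    mad = median([abs(x - med) for x in values])
    return 1.4826 * mad if scale else mad


data = [1, 2, 3, 4, 100]  # one gross outlier
print(med_abs_dev_sketch(data))              # 1
print(med_abs_dev_sketch(data, scale=True))  # 1.4826
print(stdev(data))                           # about 43.6, dominated by the outlier
```

The single outlier inflates the sample standard deviation by more than an order of magnitude, while the scaled median absolute deviation is unaffected.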

verify.metrics.scaledAccuracy(predicted, observed)

Calculate scaled and relative accuracy measures

Parameters:
  • predicted (array-like) – Array-like (list, numpy array, etc.) of predictions

  • observed (array-like) – Array-like (list, numpy array, etc.) of observed values of scalar quantity

Returns:

out – Dictionary containing scaled or relative accuracy measures:
  • nRMSE - normalized root mean squared error
  • MASE - mean absolute scaled error
  • MAPE - mean absolute percentage error
  • MdAPE - median absolute percentage error
  • MdSymAcc - median symmetric accuracy

Return type:

dict

verify.metrics.scaledError(predicted, observed)

Scaled errors, see Hyndman and Koehler (2006)

Parameters:
  • predicted (array-like) – predicted data for which to calculate scaled error

  • observed (float) – observation vector (or climatological value (scalar)) to use as reference value

Returns:

q – the scaled error

Return type:

float

Notes

References: R.J. Hyndman and A.B. Koehler, Another look at measures of forecast accuracy, Intl. J. Forecasting, 22, pp. 679-688, 2006.

See also

MASE

verify.metrics.skill(A_data, A_ref, A_perf=0)

Generic forecast skill score for quantifying forecast improvement

Parameters:
  • A_data (float) – Accuracy measure of data set

  • A_ref (float) – Accuracy measure for reference forecast

  • A_perf (float, optional) – Accuracy measure for “perfect forecast” (Default = 0)

Returns:

ss_ref – Forecast skill for the given forecast, relative to the reference, using the chosen accuracy measure

Return type:

float

Notes

See section 7.1.4 of Wilks [2006] (Statistical methods in the atmospheric sciences) for details.
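A minimal sketch of the generic skill score in the form (A - A_ref) / (A_perf - A_ref), expressed here as a fraction. Skill scores are often quoted as percentages; whether the library returns a fraction or a percentage is not stated in this section, so treat the scaling as an assumption. The name `skill_sketch` is hypothetical.

```python
def skill_sketch(a_data, a_ref, a_perf=0.0):
    """Generic skill score as a fraction: (A - A_ref) / (A_perf - A_ref)."""
    if a_perf == a_ref:
        raise ValueError("reference and perfect accuracy must differ")
    return (a_data - a_ref) / (a_perf - a_ref)


# Using an error measure such as MSE, where a perfect forecast scores 0
print(skill_sketch(0.5, 1.0))  # 0.5  (half the reference error)
print(skill_sketch(0.0, 1.0))  # 1.0  (perfect forecast)
print(skill_sketch(2.0, 1.0))  # -1.0 (worse than the reference)
```

Positive values indicate improvement over the reference forecast, zero indicates no improvement, and negative values indicate that the forecast is worse than the reference.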

verify.metrics.symmetricSignedBias(predicted, observed)

Symmetric signed bias, expressed as a percentage

Parameters:
  • predicted (array-like) – List of predicted values

  • observed (array-like) – List of observed values

Returns:

bias – symmetric signed bias, as a percentage

Return type:

float

Notes

Reference: Morley, S.K., Brito, T.V., and Welling, D.T. (2018), Measures of Model Performance Based on the Log Accuracy Ratio, Space Weather, 16(1), pp. 69-88, doi: 10.1002/2017SW001669.
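A sketch of the symmetric signed bias, assuming the form given in the cited paper: 100 * sign(M) * (exp(|M|) - 1), where M is the median of ln(predicted/observed). The function name is hypothetical, inputs are assumed strictly positive, and this is not the library's code.

```python
import math
from statistics import median


def symmetric_signed_bias_sketch(predicted, observed):
    """100 * sign(M) * (exp(|M|) - 1), with M = median(ln(pred / obs))."""
    log_q = [math.log(p) - math.log(o) for p, o in zip(predicted, observed)]
    m = median(log_q)
    return 100.0 * math.copysign(1.0, m) * (math.exp(abs(m)) - 1.0)


over = symmetric_signed_bias_sketch([2.0, 4.0, 8.0], [1.0, 1.0, 1.0])
under = symmetric_signed_bias_sketch([1.0, 1.0, 1.0], [2.0, 4.0, 8.0])
print(over, under)  # close to 300 and -300
```

Swapping the roles of prediction and observation changes only the sign of the result, which is the symmetry the metric is designed to have.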