Hogan, Robin J.; Ferro, Christopher A. T.; Jolliffe, Ian T.; Stephenson, David B. (2010)
Publisher: American Meteorological Society
Languages: English
Types: Article
In the forecasting of binary events, verification measures that are “equitable” were defined by Gandin and Murphy to satisfy two requirements: 1) they award all random forecasting systems, including those that always issue the same forecast, the same expected score (typically zero), and 2) they are expressible as the linear weighted sum of the elements of the contingency table, where the weights are independent of the entries in the table, apart from the base rate. The authors demonstrate that the widely used “equitable threat score” (ETS), as well as numerous others, satisfies neither of these requirements and only satisfies the first requirement in the limit of an infinite sample size. Such measures are referred to as “asymptotically equitable.” In the case of ETS, the expected score of a random forecasting system is always positive and only falls below 0.01 when the number of samples is greater than around 30. Two other asymptotically equitable measures are the odds ratio skill score and the symmetric extreme dependency score, which are more strongly inequitable than ETS, particularly for rare events; for example, when the base rate is 2% and the sample size is 1000, random but unbiased forecasting systems yield an expected score of around −0.5, reducing in magnitude to −0.01 or smaller only for sample sizes exceeding 25 000. This presents a problem since these nonlinear measures have other desirable properties, in particular being reliable indicators of skill for rare events (provided that the sample size is large enough). A potential way to reconcile these properties with equitability is to recognize that Gandin and Murphy’s two requirements are independent, and the second can be safely discarded without losing the key advantages of equitability that are embodied in the first. 
This enables inequitable and asymptotically equitable measures to be scaled to make them equitable, while retaining their nonlinearity and other properties such as being reliable indicators of skill for rare events. It also opens up the possibility of designing new equitable verification measures.
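To make the finite-sample claim about ETS concrete, the following Python sketch computes the expected ETS of a random forecasting system exactly, by summing over every possible contingency table. It is an illustration under stated assumptions, not the authors' code: it uses the conventional form ETS = (a − a_r)/(a + b + c − a_r) with a_r = (a + b)(a + c)/n, where a, b, c, d are hits, false alarms, misses and correct negatives; it models a random system as forecasts drawn independently of the observations; and it assigns a score of 0 to the rare tables where ETS is undefined (0/0). The function names and that last convention are choices made for this example.

```python
from math import factorial


def ets(a, b, c, d):
    """Equitable threat score from hits a, false alarms b, misses c, correct negatives d."""
    n = a + b + c + d
    a_r = (a + b) * (a + c) / n      # hits expected by chance given the margins
    denom = a + b + c - a_r
    if denom == 0:
        # Assumption for this sketch: score undefined (0/0) tables as 0.
        return 0.0
    return (a - a_r) / denom


def expected_random_ets(n, base_rate=0.5, forecast_rate=0.5):
    """Exact expected ETS when forecasts are independent of observations."""
    p_a = base_rate * forecast_rate                # probability of a hit
    p_b = (1 - base_rate) * forecast_rate          # probability of a false alarm
    p_c = base_rate * (1 - forecast_rate)          # probability of a miss
    p_d = (1 - base_rate) * (1 - forecast_rate)    # probability of a correct negative
    total = 0.0
    for a in range(n + 1):
        for b in range(n - a + 1):
            for c in range(n - a - b + 1):
                d = n - a - b - c
                # Multinomial probability of observing this particular table.
                coef = factorial(n) // (
                    factorial(a) * factorial(b) * factorial(c) * factorial(d)
                )
                prob = coef * p_a**a * p_b**b * p_c**c * p_d**d
                total += prob * ets(a, b, c, d)
    return total


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        print(n, expected_random_ets(n))
```

Running the loop for increasing n shows how the expected score of a skill-less system approaches zero only as the sample grows, which is the sense in which the abstract calls ETS "asymptotically equitable."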
  • References:

    • Baldwin, M. E., and J. S. Kain, 2006: Sensitivity of several performance measures to displacement error, bias, and event frequency. Wea. Forecasting, 21, 636-648.
    • Brill, K. F., 2009: A general analytic method for assessing sensitivity to bias of performance measures for dichotomous forecasts. Wea. Forecasting, 24, 307-318.
    • Donaldson, R. J., R. M. Dyer, and M. J. Kraus, 1975: An objective evaluator of techniques for predicting severe weather events. Preprints, Ninth Conf. on Severe Local Storms, Norman, OK, Amer. Meteor. Soc., 321-326.
    • Doswell, C. A., III, R. Davies-Jones, and D. L. Keller, 1990: On summary measures of skill in rare event forecasting based on contingency tables. Wea. Forecasting, 5, 576-586.
    • Finley, J. P., 1884: Tornado predictions. Amer. Meteor. J., 1, 85-88.
    • Gandin, L. S., and A. H. Murphy, 1992: Equitable skill scores for categorical forecasts. Mon. Wea. Rev., 120, 361-370.
    • Gilbert, G. K., 1884: Finley's tornado predictions. Amer. Meteor. J., 1, 166-172.
    • Gringorten, I. I., 1967: Verification to determine and measure forecasting skill. J. Appl. Meteor., 6, 742-747.
    • Heidke, P., 1926: Calculation of the success and goodness of strong wind forecasts in the storm warning service. Geogr. Ann. Stockholm, 8, 301-349.
    • Hilliker, J. L., 2004: The sensitivity of the number of correctly forecasted events to the threat score: A practical application. Wea. Forecasting, 19, 646-650.
    • Hogan, R. J., E. J. O'Connor, and A. J. Illingworth, 2009: Verification of cloud fraction forecasts. Quart. J. Roy. Meteor. Soc., 135, 1494-1511.
    • Jolliffe, I. T., 2008: The impenetrable hedge: A note on propriety, equitability and consistency. Meteor. Appl., 15, 25-29.
    • Livezey, R. E., 2003: Categorical events. Forecast Verification: A Practitioner's Guide in Atmospheric Science, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 77-96.
    • Manzato, A., 2005: An odds ratio parameterization for ROC diagram and skill score indices. Wea. Forecasting, 20, 918-930.
    • Marzban, C., 1998: Scalar measures of performance in rare-event situations. Wea. Forecasting, 13, 753-763.
    • Marzban, C., and V. Lakshmanan, 1999: On the uniqueness of Gandin and Murphy's equitable performance measures. Mon. Wea. Rev., 127, 1134-1136.
    • Mason, I. B., 1989: Dependence of the critical success index on sample climate and threshold probability. Aust. Meteor. Mag., 37, 75-81.
    • Mason, I. B., 2003: Binary events. Forecast Verification: A Practitioner's Guide in Atmospheric Science, I. T. Jolliffe and D. B. Stephenson, Eds., Wiley, 37-76.
    • Mesinger, F., and T. L. Black, 1992: On the impact on forecast accuracy of the step-mountain (eta) vs. sigma coordinate. Meteor. Atmos. Phys., 50, 47-60.
    • Murphy, A. H., 1991: Forecast verification: Its complexity and dimensionality. Mon. Wea. Rev., 119, 1590-1601.
    • Murphy, A. H., 1996: The Finley affair: A signal event in the history of forecast verification. Wea. Forecasting, 11, 3-20.
    • Murphy, A. H., and H. Daan, 1985: Forecast evaluation. Probability, Statistics, and Decision Making in the Atmospheric Sciences, A. H. Murphy and R. W. Katz, Eds., Westview Press, 379-437.
    • Peirce, C. S., 1884: The numerical measure of the success of predictions. Science, 4, 453-454.
    • Primo, C., and A. Ghelli, 2009: The affect of the base rate on the extreme dependency score. Meteor. Appl., 16, 533-535.
    • Schaefer, J. T., 1990: The critical success index as an indicator of forecasting skill. Wea. Forecasting, 5, 570-575.
    • Severini, T. A., 2005: Elements of Distribution Theory. Cambridge University Press, 515 pp.
    • Stephenson, D. B., 2000: Use of the “odds ratio” for diagnosing forecast skill. Wea. Forecasting, 15, 221-232.
    • Stephenson, D. B., B. Casati, C. A. T. Ferro, and C. A. Wilson, 2008: The extreme dependency score: A non-vanishing measure for forecasts of rare events. Meteor. Appl., 15, 41-50.
    • Yule, G. U., 1900: On the association of attributes in statistics. Philos. Trans. Roy. Soc. London, 194A, 257-319.