David M. W. Powers - resources relating to the bookmaker algorithm.
The original Bookmaker paper (6 pages) and poster (A4) derive Informedness from the idea of an edge in gambling or trading:
244224 bmpaper.doc - ICCS technical paper July 2003
206442 BMPaper.pdf - ICCS technical paper July 2003
509808 BMPoster.pdf - ICCS tutorial poster July 2003
The later ECAI 2008 paper (2 pages) and poster (A4) introducing/comparing Markedness and Informedness versus related measures, and summarizing key Correlation, Significance and Confidence Interval formulae from Technical Report SIE-07-001:
56409 ECAI-Evaluation_Evaluation-Short.pdf – ECAI short paper, July 2008 (2p)
467514 ECAI-Evaluation
Evaluation-poster-A4.pdf – ECAI tutorial poster, July 2008 (1p)
296599 ECAIacc-Evaluation_Evaluation.pdf – 5p version of ECAI paper, July 2008
133242 ECAIrej-Significance_Confidence.pdf – 5p sequel to ECAI paper, July 2008
Excel spreadsheets (as discussed in ICCS paper and usable in doc version):
84480 BMExcel.xls – 2x2 case, 3x3 case, 13x13 worksheet
27136 bmsig.xls - shows 2x2 case + significance estimates
27136 bmsmall.xls - shows 2x2 case + mean F&G factors
28160 bmsym.xls - shows 2x2 case + misinformedness case
29184 bmtriple.xls - shows 3x3 case + mean F&G factors
28672 bmwtsym.xls - shows 2x2 case + weighted F&G factors
2834 bm.m – MATLAB/Octave script for bookmaker – abbreviated statistics (ICCS version)
6517 bookmaker.m - MATLAB/Octave script for bookmaker + sig/conf/etc. (ECAI version)
Brief PowerPoint presentation (abstract as slide 5) motivating Informedness and Markedness and showing their connection to Correlation and Chi-squared Significance (HCSNet 2007, Abstracts p. 77 and Speedpapers p. 29):
2957875 EvaluationEvaluation_HCS
Technical Report SIE-07-001 (draft to be submitted), showing the full derivation and analysis of Informedness and Markedness and relating them to Recall, Precision, Correlation and Chi-squared Significance, as well as to ROC (Receiver Operating Characteristics) analysis, AUC (Area Under the Curve), DeltaP, Regression, etc.:
507993 Evaluation_SIETR_2up.pdf – 2up version (12 sheets)
488472 Evaluation_SIETR.pdf – standard version (24 pages)
In summary, Recall reflects at chance-level performance the Bias of the predictor towards positive labels, and subtracting off the Bias and renormalizing as a probability gives the probability of an informed prediction (versus a guessed prediction), that is, Informedness – in the binary case this corresponds to DeltaP’ or to 2AUC-1. Conversely, Precision reflects at chance-level performance the Prevalence of the positive case in the dataset, and subtracting off the Prevalence and renormalizing as a probability gives the probability of a marked prediction (versus chance association), that is, Markedness – in the binary case this corresponds to DeltaP. The Geometric Mean of Informedness and Markedness is the Pearson Correlation (in magnitude). All three can be regarded as different normalizations of the Chi-squared statistic.
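The relationships in this summary can be checked numerically on a 2x2 contingency table. The following is a minimal Python sketch, not the author's bm.m or bookmaker.m scripts. The function name binary_measures and the example counts are hypothetical. It assumes the usual binary definitions Informedness = Recall + Inverse Recall - 1 and Markedness = Precision + Inverse Precision - 1, and the uncorrected 2x2 Chi-squared statistic, which equals N times the squared correlation.

```python
from math import sqrt


def binary_measures(tp, fp, fn, tn):
    """Evaluation measures for a 2x2 contingency table of counts.

    Illustrative helper (hypothetical name), assuming every row and
    column total is non-zero.
    """
    n = tp + fp + fn + tn
    prevalence = (tp + fn) / n        # proportion of real positives
    bias = (tp + fp) / n              # proportion of predicted positives
    recall = tp / (tp + fn)           # true positive rate
    inverse_recall = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    inverse_precision = tn / (tn + fn)
    informedness = recall + inverse_recall - 1
    markedness = precision + inverse_precision - 1
    # Pearson correlation of the two binary variables, from the counts
    correlation = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Chi-squared for the 2x2 table, without continuity correction
    chi_squared = n * correlation ** 2
    return {
        "prevalence": prevalence,
        "bias": bias,
        "recall": recall,
        "precision": precision,
        "informedness": informedness,
        "markedness": markedness,
        "correlation": correlation,
        "chi_squared": chi_squared,
    }


# Hypothetical counts: an informative predictor
informed = binary_measures(tp=40, fp=10, fn=20, tn=30)

# Hypothetical counts: predictions independent of the real class (chance level)
chance = binary_measures(tp=30, fp=20, fn=30, tn=20)
```

For the first table, Informedness is about 0.417, Markedness is 0.4, and their geometric mean matches the correlation of about 0.408; Chi-squared equals N times their product. For the chance-level table, Precision equals Prevalence (0.6), Recall equals Bias (0.5), and both Informedness and Markedness are zero.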