David M. W. Powers - resources relating to the Bookmaker algorithm. The original Bookmaker paper (6 pages) and poster (A4) derive Informedness from the idea of an edge in gambling or trading:

244224 bmpaper.doc - ICCS technical paper, July 2003

206442 BMPaper.pdf - ICCS technical paper, July 2003

509808 BMPoster.pdf - ICCS tutorial *poster*, July 2003

The later ECAI 2008 paper (2 pages) and poster (A4) introducing/comparing Markedness and Informedness versus related measures, and summarizing key Correlation, Significance and Confidence Interval formulae from Technical Report SIE-07-001:

56409 ECAI-Evaluation_Evaluation-Short.pdf - ECAI short paper, July 2008 (2p)

467514 ECAI-Evaluation Evaluation-poster-A4.pdf - ECAI tutorial *poster*, July 2008 (1p)

296599 ECAIacc-Evaluation_Evaluation.pdf - 5p version of ECAI paper, July 2008

133242 ECAIrej-Significance_Confidence.pdf - 5p sequel to ECAI paper, July 2008

Excel spreadsheets (as discussed in ICCS paper and usable in doc version):

84480 BMExcel.xls - 2x2 case, 3x3 case, 13x13 worksheet

27136 bmsig.xls - shows 2x2 case + significance estimates

27136 bmsmall.xls - shows 2x2 case + mean F&G factors

28160 bmsym.xls - shows 2x2 case + misinformedness case

29184 bmtriple.xls - shows 3x3 case + mean F&G factors

28672 bmwtsym.xls - shows 2x2 case + weighted F&G factors

2834 bm.m - matlab/octave script for bookmaker, abbreviated statistics (ICCS version)

6517 bookmaker.m - matlab/octave script for bookmaker + sig/conf/etc. (ECAI version)

Brief PowerPoint presentation (abstract as slide 5) motivating Informedness and Markedness and showing their connection to Correlation and Chi-squared Significance (HCSNet 2007, Abstracts p77 and Speedpapers p29):

2957875 EvaluationEvaluation_HCS

Technical Report SIE-07-001, showing the full derivation and analysis of Informedness and Markedness and relating them to Recall, Precision, Correlation and Chi-squared Significance (draft to be submitted), as well as to ROC analysis (Receiver Operating Characteristics), AUC (Area Under the Curve), DeltaP, Regression, etc.:

507993 Evaluation_SIETR_2up.pdf - 2up version (12 sheets)

488472 Evaluation_SIETR.pdf - standard version (24 pages)

In summary, at chance-level performance Recall simply reflects the Bias of the predictor towards positive labels; subtracting off the Bias and renormalizing as a probability gives Informedness, the probability of an informed prediction (versus a guessed prediction). In the binary case this corresponds to DeltaP' or to 2AUC-1. Conversely, at chance-level performance Precision simply reflects the Prevalence of the positive case in the dataset; subtracting off the Prevalence and renormalizing as a probability gives Markedness, the probability of a marked prediction (versus a chance association). In the binary case this corresponds to DeltaP. The Geometric Mean of Informedness and Markedness is the Pearson Correlation, and all three can be regarded as different normalizations of the Chi-squared statistic.
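The relationships among these quantities can be checked numerically for the 2x2 case. The following Python sketch is not taken from the bm.m or bookmaker.m scripts listed above. It assumes the usual contingency-table definitions: Informedness = Recall + Inverse Recall - 1 and Markedness = Precision + Inverse Precision - 1. The function name `bookmaker_2x2` and the example counts are hypothetical and chosen only for illustration.

```python
import math

def bookmaker_2x2(tp, fp, fn, tn):
    """Summary statistics for a 2x2 contingency table (illustrative sketch).

    tp, fp, fn, tn are raw counts; every row and column total is assumed
    to be non-zero so that all ratios are defined.
    """
    n = tp + fp + fn + tn
    prevalence = (tp + fn) / n        # proportion of real positives
    bias = (tp + fp) / n              # proportion of predicted positives
    recall = tp / (tp + fn)           # true positive rate
    inv_recall = tn / (tn + fp)       # true negative rate
    precision = tp / (tp + fp)        # positive predictive value
    inv_precision = tn / (tn + fn)    # negative predictive value

    informedness = recall + inv_recall - 1
    markedness = precision + inv_precision - 1

    # Phi (Pearson correlation for a 2x2 table) and the chi-squared statistic.
    det = tp * tn - fp * fn
    margins = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    correlation = det / math.sqrt(margins)
    chi2 = n * det ** 2 / margins

    return {
        "n": n, "prevalence": prevalence, "bias": bias,
        "recall": recall, "precision": precision,
        "informedness": informedness, "markedness": markedness,
        "correlation": correlation, "chi2": chi2,
    }

# Hypothetical example table: better than chance.
s = bookmaker_2x2(tp=40, fp=10, fn=20, tn=30)

# Chance-corrected forms: subtract the chance level, then renormalize.
informedness_from_recall = (s["recall"] - s["bias"]) / (1 - s["prevalence"])
markedness_from_precision = (s["precision"] - s["prevalence"]) / (1 - s["bias"])

# Geometric mean of Informedness and Markedness (both positive here).
geometric_mean = math.sqrt(s["informedness"] * s["markedness"])

# A table with independent rows and columns (pure guessing):
# Recall equals Bias, Precision equals Prevalence, and both measures vanish.
chance = bookmaker_2x2(tp=30, fp=20, fn=30, tn=20)
```

Under these assumptions, `informedness_from_recall` matches `s["informedness"]`, `markedness_from_precision` matches `s["markedness"]`, `geometric_mean` matches `s["correlation"]`, and `s["chi2"]` equals `s["n"]` times the product of Informedness and Markedness. When the two measures have opposite signs or are both negative, the sign has to be handled separately before taking the square root; the sketch does not cover that case.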