Bookmaker Archive

David M. W. Powers - resources relating to the Bookmaker algorithm. The original Bookmaker paper (6 pages) and poster (A4) derive Informedness from the idea of an edge in gambling or trading.

244224 bmpaper.doc - ICCS technical paper July 2003
206442 BMPaper.pdf - ICCS technical paper July 2003
509808 BMPoster.pdf - ICCS tutorial poster July 2003

The later ECAI 2008 paper (2 pages) and poster (A4), introducing Markedness and Informedness, comparing them with related measures, and summarizing key Correlation, Significance and Confidence Interval formulae from Technical Report SIE-07-001:

56409 ECAI-Evaluation_Evaluation-Short.pdf - ECAI short paper, July 2008 (2p)
467514 ECAI-Evaluation Evaluation-poster-A4.pdf - ECAI tutorial poster, July 2008 (1p)
296599 ECAIacc-Evaluation_Evaluation.pdf - 5p version of ECAI paper, July 2008
133242 ECAIrej-Significance_Confidence.pdf - 5p sequel to ECAI paper, July 2008

Excel spreadsheets (as discussed in the ICCS paper and usable in the doc version):

84480 BMExcel.xls - 2x2 case, 3x3 case, 13x13 worksheet
27136 bmsig.xls - shows 2x2 case + significance estimates
27136 bmsmall.xls - shows 2x2 case + mean F&G factors
28160 bmsym.xls - shows 2x2 case + misinformedness case
29184 bmtriple.xls - shows 3x3 case + mean F&G factors
28672 bmwtsym.xls - shows 2x2 case + weighted F&G factors

2834 bm.m - matlab/octave script for bookmaker - abbreviated statistics (ICCS version)
6517 bookmaker.m - matlab/octave script for bookmaker + sig/conf/etc. (ECAI version)

Brief PowerPoint presentation (with the abstract as slide 5) motivating Informedness and Markedness and showing their connection to Correlation and Chi-squared Significance (HCSNet 2007, Abstracts p. 77 and Speedpapers p. 29):

2957875 EvaluationEvaluation_HCS_2007.pdf

Technical Report SIE-07-001, giving the full derivation and analysis of Informedness and Markedness and relating them to Recall, Precision, Correlation and Chi-squared Significance (draft to be submitted), as well as to ROC (Receiver Operating Characteristic) analysis, AUC (Area Under the Curve), DeltaP, Regression, etc.:

507993 Evaluation_SIETR_2up.pdf - 2up version (12 sheets)
488472 Evaluation_SIETR.pdf - standard version (24 pages)

In summary, at chance-level performance Recall simply reflects the Bias of the predictor towards positive labels; subtracting off the Bias and renormalizing as a probability, (Recall - Bias) / (1 - Prevalence), gives Informedness, the probability of an informed prediction (versus a guessed prediction). In the binary case this corresponds to DeltaP' or to 2AUC-1. Conversely, at chance-level performance Precision simply reflects the Prevalence of the positive case in the dataset; subtracting off the Prevalence and renormalizing as a probability, (Precision - Prevalence) / (1 - Bias), gives Markedness, the probability of a marked prediction (versus a chance association). In the binary case this corresponds to DeltaP. The Geometric Mean of Informedness and Markedness is the Pearson Correlation. All three can be regarded as different normalizations of the Chi-squared statistic.
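The relationships in this summary can be checked numerically on a 2x2 contingency table. The following is a minimal Python sketch, not taken from the bm.m or bookmaker.m scripts listed above. It assumes the usual binary definitions Informedness = Recall + Inverse Recall - 1 and Markedness = Precision + Inverse Precision - 1, and uses the uncorrected Pearson Chi-squared statistic. The function name `summarize` and the counts are illustrative only, and the sketch assumes no row or column total of the table is zero.

```python
import math


def summarize(tp, fp, fn, tn):
    """Summary statistics for a 2x2 contingency table of counts.

    tp/fp/fn/tn are true/false positive/negative counts (illustrative names).
    Assumes every row and column total is non-zero.
    """
    n = tp + fp + fn + tn
    real_pos, real_neg = tp + fn, fp + tn
    pred_pos, pred_neg = tp + fp, fn + tn

    recall = tp / real_pos
    inverse_recall = tn / real_neg
    precision = tp / pred_pos
    inverse_precision = tn / pred_neg

    # Pearson correlation of the two binary variables (phi coefficient).
    correlation = (tp * tn - fp * fn) / math.sqrt(
        real_pos * real_neg * pred_pos * pred_neg
    )

    return {
        "prevalence": real_pos / n,   # proportion of real positives
        "bias": pred_pos / n,         # proportion of predicted positives
        "recall": recall,
        "precision": precision,
        "informedness": recall + inverse_recall - 1,
        "markedness": precision + inverse_precision - 1,
        "correlation": correlation,
        "chi_squared": n * correlation ** 2,  # no continuity correction
    }


# An informed predictor (hypothetical counts, N = 100).
s = summarize(tp=30, fp=10, fn=20, tn=40)
geometric_mean = math.sqrt(s["informedness"] * s["markedness"])
print(geometric_mean, abs(s["correlation"]))  # the two values agree
print(s["chi_squared"], 100 * s["informedness"] * s["markedness"])  # so do these

# A chance-level predictor with the same Prevalence (0.5) and Bias (0.4):
# each cell is N * (row proportion) * (column proportion).
c = summarize(tp=20, fp=20, fn=30, tn=30)
print(c["precision"], c["prevalence"])      # Precision equals Prevalence
print(c["recall"], c["bias"])               # Recall equals Bias
print(c["informedness"], c["markedness"])   # both are zero at chance level
```

The geometric mean is compared with the magnitude of the correlation because Informedness and Markedness share the sign of the correlation, so their product is never negative.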