lichess.org

Please implement the Chess Play Quality Index

Yes, a very interesting idea that might be used to expose cheaters.
I find this most troubling about the Quality of play index:
"Moves where both the move played and the move suggested by the computer had an evaluation outside the interval [-2.00, 2.00], are discarded. (In clearly won positions players are tempted to play a simple safe move instead of a stronger, but risky one. Such "inferior" moves are, from a practical viewpoint, perfectly justified. Similarly, in lost positions players sometimes deliberately play an objectively worse move.)"
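To make the quoted rule concrete, here is a minimal Python sketch of the discard step. Only the [-2.00, 2.00] interval and the "both evaluations outside it" condition come from the quote. The `MoveEval` type, the function names, the convention that evaluations are in pawns from the mover's point of view, and the use of a plain average of "best minus played" are assumptions made for illustration, not the index's published definition.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MoveEval:
    # Hypothetical record: engine evaluations in pawns, from the mover's side.
    played: float  # evaluation after the move actually played
    best: float    # evaluation after the engine's suggested move


def is_discarded(move: MoveEval, bound: float = 2.0) -> bool:
    """True when BOTH evaluations fall outside [-bound, bound]."""
    return abs(move.played) > bound and abs(move.best) > bound


def average_loss(moves: List[MoveEval], bound: float = 2.0) -> Optional[float]:
    """Mean of (best - played) over the moves that survive the filter.

    Averaging the loss is an assumption; returns None if nothing is kept.
    """
    kept = [m for m in moves if not is_discarded(m, bound)]
    if not kept:
        return None
    return sum(m.best - m.played for m in kept) / len(kept)
```

Note that a move is kept if even one of the two evaluations is inside the interval, so a blunder that turns a +2.5 position into +1.5 still counts.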

Discarding data based on an arbitrarily chosen threshold suggests a hidden flaw in the model. Also, the Quality of play index is one-dimensional (which greatly oversimplifies the data) and averages many values together (which makes the data less informative than if statistical models were applied to it). See http://en.wikipedia.org/wiki/Mathematical_statistics#Special_distributions for a handful of well-tested models.

My suggestions are met almost entirely with skepticism, but I still recommend:
http://en.lichess.org/forum/lichess-feedback/quality-rating-system-#7
@Toadofsky - I don't think this necessarily indicates a flaw, rather an approximation for a required reparameterization. The model could be improved to be more accurate, of course (e.g., a nonlinear transformation on the evaluations, continuously diminishing weight, etc.), but that doesn't mean it isn't usable as it is. It can definitely help calibrate lichess ratings against official ratings, and ratings on other sites.

Your link to an indiscriminate list of probability distributions is odd, to say the least. You don't just randomly pick a distribution from a bag of "well-tested models", you use a distribution which actually models whatever phenomenon you are investigating. Each distribution is different.
@Toadofsky, There are plenty of systems that reduce one's chess prowess to a number (or a fixed set of numbers, which is basically the same thing). The problem with these systems is that they're disjoint, and the need to compare them arises often.

The Elo rating system is established as the de facto standard, so it is interesting to find a rating/quality system that is shown to correlate well with it.
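One simple way to check such a claim would be to compute the Pearson correlation between players' Elo ratings and their quality scores. The sketch below is only that: the `pearson` helper is a hypothetical name, and no real rating data is assumed or included.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

If the quality score is an average loss, where lower means better play, a good match with Elo would show up as a strongly negative coefficient rather than a positive one.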

What you consider to be the model's simplicity is actually one of its strong points: It's relatively cheap to calculate.
