Cricket consistency
Normblog also links to this Wisden article on cricketers' consistency. Essentially it says that an average can conceal large variations, and (implicitly) that a batsman who scores 1, 1, 100 is worse than one who scores 34, 34, 34, so consistency should be taken into account. The way in which this is done is to divide the average by the standard deviation of the batsman's innings, to get a "batting index". On this measure Bradman falls from 1st to 8th.
Leaving aside the issue of not outs (which is raised in the article) and the argument over whether consistency really is such a good thing, or to what degree it is a good thing, the measure seems very flawed. There seem to be no good reasons, and many bad ones, to divide the average by the standard deviation.
A simple example shows what odd results it is likely to throw up. A batsman with this scoring history - 60, 55, 59, 64, 61, 56, 58 - has an average of 59, an s.d. of about 2.83, and hence a consistency index of about 20.9. One who scores 22, 23, 22, 23, 22, 23, 22 has an average of 22.4, an s.d. of about 0.5, and hence a consistency index of about 45, or more than twice as high. But clearly player one is the better player.
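For anyone who wants to check the arithmetic, here is a rough sketch of the calculation. It assumes the population standard deviation (dividing by n rather than n - 1), since that is what reproduces the figures above; whether Wisden used that or the sample version is not something I know. The function name `batting_index` is just my own label.

```python
from statistics import mean, pstdev


def batting_index(scores):
    """Average divided by the population standard deviation of the scores."""
    return mean(scores) / pstdev(scores)


player_one = [60, 55, 59, 64, 61, 56, 58]
player_two = [22, 23, 22, 23, 22, 23, 22]

for name, scores in (("player one", player_one), ("player two", player_two)):
    print(
        f"{name}: average={mean(scores):.1f}, "
        f"s.d.={pstdev(scores):.2f}, "
        f"index={batting_index(scores):.1f}"
    )
```

Player two comes out with an index more than twice player one's, despite averaging well under half as many runs.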
The problem is that the "batting index" is no such thing - it is a consistency index. Wasn't that meant to be the standard deviation itself? It was, and indeed the standard deviation tells us the spread of runs each batsman gets. So one with an s.d. of 50 has a wider spread than one with an s.d. of 5. But it's not obvious this is a good measure of what we mean by consistency. To give an example, a batsman who scores 0, 15, 0, 15, 0, 15 has a lower standard deviation (7.5) than one who scores 75, 100, 75, 100, 75, 100 (12.5). But is he more consistent?
One way to compare standard deviations which might be better would be to divide the standard deviation by the mean (the coefficient of variation). This is in fact the inverse of what the Wisden writer has done, which doesn't really matter, as it just means his figures are the reciprocals of what we want. Thus using his figures (so higher means more consistent) we can say that Kallis is a more "consistent" player than Bradman, and indeed so are the other six batsmen ranked above Bradman.
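To see what dividing by the mean does, here is a small sketch using the two made-up batsmen from the 0, 15 and 75, 100 example. Again it assumes the population standard deviation, and the names `coefficient_of_variation`, `low_scorer` and `high_scorer` are mine, not anything from the Wisden piece.

```python
from statistics import mean, pstdev


def coefficient_of_variation(scores):
    """Population standard deviation divided by the mean (lower = steadier)."""
    return pstdev(scores) / mean(scores)


low_scorer = [0, 15, 0, 15, 0, 15]
high_scorer = [75, 100, 75, 100, 75, 100]

for name, scores in (("low scorer", low_scorer), ("high scorer", high_scorer)):
    cv = coefficient_of_variation(scores)
    print(
        f"{name}: s.d.={pstdev(scores):.1f}, "
        f"s.d./mean={cv:.2f}, "
        f"mean/s.d.={1 / cv:.1f}"
    )
```

The low scorer has the smaller raw standard deviation but much the larger spread relative to his average, which is closer to what most of us would mean by inconsistent.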
But it has told us nothing about which is the better player, even if we agree that 'better' is some combination of runs and consistency, as we haven't looked at their averages. When we do, we find that all we know is that Kallis is slightly more consistent in scoring his 56 on average than Bradman is in scoring his 99.
How do we then trade off consistency and averages? We could now divide the average by the index, but there is no particularly good reason to do so. It might be best simply to look at the distribution. But importantly, what is remarkable about the Wisden table is just how similar the adjusted s.d.s of the leading batsmen are, suggesting there is very little to choose between them in terms of consistency. In fact so little that it's pretty obvious that Bradman would still top the league on most people's definition of the optimum mix.