I have just had a glass of a very nice chardonnay from Limoux in southern France, and in order to convey to you how good it was, I’m letting you know that I give it a score of 59! Fantastic! Oh, by the way, that score is on my 54-point scale that ranges from a minimum of 12 to a maximum of 66.
It’s just wrong, isn’t it? But why? When a newspaper or magazine wine columnist awards a wine a score of 95 points, we tend to assume that it is a great wine. We all interpret this in much the same way. That is, it is a wine that has been rated in the top 5% of all wines; a wine that leaves behind all those in the 70s, 80s, and early 90s. But when was the last time you saw a wine with a rating of even 80, much less one with a 75 score? Perhaps these wines never make the guides, magazine or newspaper columns, or blogs. Alternatively, maybe these wines don’t exist. In a system such as this, popularized by American wine critic Robert Parker, there is a dramatic change in meaning as one moves below the 90s. While a rating of 90 means a perfectly acceptable wine, a rating below 80 effectively means vinegar.
And? Well, it is important to recall that this system is supposed to be based on 100 points, that is, percentages. Think about marks at school or university. How would it feel if you scored 75 out of 100 in an exam and were considered a failure as a result? Your sense of outrage would be fuelled by the understandable belief that coming in the top quarter of potential marks meant that you were much better than those who must have scored below you.
Considering for a moment what scales are supposed to do, it is obvious that they provide an idea of the magnitude of quantities: distance, weight, height, temperature. By and large, we never have problems with these types of scales, a function perhaps of their longevity as useful tools and the fact that the quantities they measure are perceived to be properties of the ‘real, objective world’. It is when we start to measure the intangible, the subjective – thoughts, attitudes, feelings, perceptions – that things become a little trickier.
Why give things a number at all? Why can’t I just tell you that my glass of chardonnay was “bloody good”? Does a Robert Parker type score really tell you more than this? Of course it does, you might argue, because it uses 100 points and that gives far more discrimination between wines than a few categories. That’s true – or should be – but only if the scoring system uses the whole range of the scale. In other words, when rating wines, it is important to have wines that score 32, or 59, or 66. The use of the whole range gives meaning to the differences between scores. In simple terms, if the whole scale is used, you can be confident that a difference of one point between two wines – wherever it falls on the scale – means a 1% difference. If I never use a score below 70, however, then the range of 70 to 100, or 30 points, defines the length of the scale, assuming that 70 means terrible and 100 means as good as wine gets (but see below). Now, a one-point difference between two wines becomes a little more than a 3% difference.
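The arithmetic behind that last comparison can be sketched in a few lines. This is only an illustration: the function name and its parameters are invented here, and it assumes that the lowest and highest scores actually used define the effective length of the scale.

```python
def percent_per_point(lowest_used: float, highest_used: float) -> float:
    """Return how much of the effective scale, in percent, one point covers.

    Assumption for this sketch: the scores actually in use define the
    effective length of the scale.
    """
    span = highest_used - lowest_used
    if span <= 0:
        raise ValueError("highest_used must be greater than lowest_used")
    return 100.0 / span


# Whole scale in use: one point is 1% of the scale.
full_scale = percent_per_point(0, 100)

# Nothing below 70 is ever awarded: one point is a little over 3%.
compressed_scale = percent_per_point(70, 100)

print(f"0 to 100:  {full_scale:.2f}% per point")
print(f"70 to 100: {compressed_scale:.2f}% per point")
```

The narrower the range of scores that is really used, the more weight each single point has to carry.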
Parker actually explains his scale by grouping scores into bands of 10 points, except for the 90s, which are split into 90-95 and 96-100. These bands start at 50-59, which as a group are deemed ‘unacceptable’ wines. It is evident, therefore, that Parker’s scale is potentially only a 50-point scale or, if we exclude unacceptable wines, a 40-point scale. And if we are then putting wines into categories (of either 10 or 5 points), perhaps it is really a 5-category scale, or a 6-category one if we include unacceptable wines. Yet it also seems that we can rate within categories. So, any “average wine with little distinction except that it is soundly made. In short a straightforward, innocuous wine” (that is, 70-79) can be rated perhaps as more (e.g., a 71) or less (e.g., a 78) innocuous.
This is not to say that a 100-point scale, even if used as such, is perfect. Imagine a scenario in which you tasted some great wines – for example, Burgundies from a recent good year such as 2002. Fantastic La Tâche – 96! Fabulous Romanée-Conti – 97! Superb Grands Échezeaux – 98! But then I have slipped in an older wine from one of the great vintages of the last century. And it is Magnificent – easily 4 points higher than the Grands Échezeaux, which would put it at 102 on a scale that stops at 100. Oops. This is known as a ceiling effect, and at least part of the problem is due to the scale being so compressed. If all really good wines have to be given a score somewhere in the 90s (because apparently that’s where they all live), then when comparing great wines, I am reduced suddenly to a 10-point scale! And, unlike the group Spinal Tap with their amplifiers, I cannot really just add a few points to my scale if something nicer comes along. To be fair, this is a wider problem than just wine rating (imagine rating different samples of Swiss chocolate), but such ceiling effects are reduced appreciably by using the entire scale, irrespective of what is being rated.
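The ceiling effect in the Burgundy example can be shown as a small sketch. The scores are the ones given above; the function name, and the idea of separating an "intended" score from a "recorded" one, are assumptions made purely for illustration.

```python
def record_score(intended: int, scale_max: int = 100) -> int:
    """Clamp the score a taster would like to give to the top of the scale."""
    return min(intended, scale_max)


# Scores from the example above.
burgundies = {"La Tâche": 96, "Romanée-Conti": 97, "Grands Échezeaux": 98}

# The older wine is judged 4 points better than the best of the three.
older_wine_intended = burgundies["Grands Échezeaux"] + 4
older_wine_recorded = record_score(older_wine_intended)

# The part of the perceived difference that the scale cannot express.
points_lost = older_wine_intended - older_wine_recorded

print(f"intended {older_wine_intended}, recorded {older_wine_recorded}, "
      f"lost {points_lost}")
```

Because the recorded score cannot go past the maximum, the perceived 4-point gap shrinks to 2 points on paper.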
Part of the problem with wine ratings is that it is commonly believed that the numbers have an independent meaning – that is, they signify something about a wine, independent of other wines. Such beliefs are a carry-over from a defect-based style of judging that was common in wine judging in past years and is still seen today in dairy judging. In such quality control type judging, a high number often means a product that is relatively free of defects. But this is not how scales of this type work. It isn’t even the way Parker-type scales are applied, as wines with obvious defects simply do not undergo the rating process (although the Wine Spectator magazine’s category from 75-79 is defined as including wines that are drinkable, yet still have some minor flaws).
Another aspect of wine ratings as they are commonly practiced is the view that wines can exist not only without defects, but also as perfect examples of their type. In other words, the practice of wine ratings clings to the idea of a Platonic, objective ideal of a perfect wine. If you are a wine judge, you may have even encountered wines that live in the highest 90s. But for the rest of us, we may not know a 99 if we drank a magnum of it … even if we did think that it was very good.
A major reason why we want to assign numbers to things is that it allows comparison. In science, it frequently allows very rigorous comparison via the use of statistical analysis. But we can only do this if we know that our scale has certain properties. In rating food likes or sensory properties, scales like the wine 100-point scale are often used and the resultant data can be used to statistically compare products with one another. We can seldom talk about these properties in the same way we talk about weight, for example. It is difficult to make statements that one product is liked twice as much as another using the scales that are commonly used. However, we can usually talk about relative degrees of difference: so, a difference between 70 and 90 on a scale ought to mean the same as the difference between 50 and 70. Without using the whole scale, however, it is not certain that such judgments could be made.
It is relatively common to compare foods from different manufacturers, if they are of the same type. We can compare Swiss milk chocolate with those from other countries, for example. But how do we compare a French pinot noir with one from New Zealand or the USA unless they are all made to be like one another? With wines, there are difficulties comparing on an equivalent basis, or like with like: different methods of growth, production, climate, season, intention, aging and so on. Which particular combination produces the Platonic ideal wine?
Preference, and to some extent quality (if we eliminate defects), is subjective. When a wine columnist gives a score of 96, they are valuing certain aspects of the wine in question. We might all agree that high acidity makes a wine tough to drink now, but what about astringency? A dry, puckering sensation might be a characteristic of some high quality red wines, but there are other wines that are judged as high in quality that are much softer in the mouth. You might trust Parker to tell you whether a wine was unbalanced or excessively astringent or that it will age well. You might even let him tell you that there were peach notes, and chocolate aromas, or that a wine was too redolent of “green bananas still on the tree”, but surely it is up to you as a wine drinker to decide whether it is a style that suits you.
I recently saw for the first time a wine columnist give two scores – one out of 100, which he labeled “empiric” (sic), and another out of 10 for ‘subjective’ assessment, by which he presumably meant how much he liked it. And this belated acknowledgement that perhaps describing or rating a wine ought to have something to do with preference highlights a major issue with wine ratings. Do you like the same wines as Robert Parker? The high impact of such an influential critic carries with it the implication that you ought.
At heart, wine ratings are based on the idea that a perfect wine can be achieved, and that its perfection is independent of what wine consumers think. At least part of the unnecessary complexity and inconsistency of such ratings, as well as their absence of scientific rigor, comes from this notion. This seems odd given that wines are drunk to give pleasure, and that pleasure may vary from person to person. On one level, I can therefore make a case that a $10 chardonnay can be just as good as a $100 chardonnay if your palate says that it is.