Sunday, September 18, 2011

Double-blind Experiments for Dummies: Wine - Part 3

This is the final part of my wine-tasting experimentation trilogy, following Part 1 and Part 2.

Here, I'll be presenting the results, some basic analysis and a conclusion, including an evaluation of this experiment's credibility.

I've replaced comments with the actual wine (to make comparison easier), now that I've seen how poorly we all filled in comments. The only useful part of that data is that the 3 who wrote comments for the first tasting said that the flavour didn't last long. I can show the full comments if anyone wants, but I highly doubt anyone will be interested.

Issue #9: Withholding data, for whatever reason, that could play a part in understanding how to interpret the study.


| Code | Tasting # | Score out of 10 | Guess | Actual |
|------|-----------|-----------------|-------|--------|

| Wine | Average Score |
|------|---------------|
| Rawson's Retreat | 5.33 |
| Koonunga Hill | 4.33 |
| Bin 407 | 2.67 |

My Wife

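| Code | Tasting # | Score out of 10 | Guess | Actual |
|------|-----------|-----------------|-------|--------|
| CAB | 7 | Didn't drink | NA | |
| BCA | 8 | Didn't drink | NA | |
| ABC | 9 | Didn't drink | NA | |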

| Wine | Average Score |
|------|---------------|
| Rawson's Retreat | 6.5 |
| Koonunga Hill | 4.5 |
| Bin 407 | 6.5 |

Issue #10: Incomplete results being used could weight the results incorrectly, but I'm still using them anyway, as I don't have a big dataset... (that's what she said)

Friend B

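| Code | Tasting # | Score out of 10 | Guess | Actual |
|------|-----------|-----------------|-------|--------|
| CAB | 7 | Didn't score | KH | KH |
| BCA | 8 | Didn't score | RR | RR |
| ABC | 9 | Didn't score | 407 | 407 |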

| Wine | Average Score |
|------|---------------|
| Rawson's Retreat | 5.5 |
| Koonunga Hill | 7.5 |
| Bin 407 | 7 |

Issue #11: Friend B is sure he picked the 407 correctly all 3 times (though his results say twice). I'm pretty sure I got the procedure right, meaning he would be mistaken, but I'm raising the doubt anyway.

Friend K

| Code | Tasting # | Score out of 10 | Guess | Actual |
|------|-----------|-----------------|-------|--------|

| Wine | Average Score |
|------|---------------|
| Rawson's Retreat | 6 |
| Koonunga Hill | 5.67 |
| Bin 407 | 6.33 |


If I now average the scores across people, we'll be able to see which wine was preferred.

| Wine | Average Score |
|------|---------------|
| Rawson's Retreat | 5.83 |
| Koonunga Hill | 5.5 |
| Bin 407 | 5.63 |
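
For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the overall figures can be recomputed from the four per-person averages. The label "Taster 1" is a placeholder of mine for the first, unnamed table, and the function name is made up for this example. I'm assuming an unweighted mean with ordinary round-half-up to two decimals, which is what matches the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-person average scores copied from the tables in this post.
# "Taster 1" is a placeholder label: the first table has no name attached.
PER_PERSON = {
    "Taster 1": {"Rawson's Retreat": "5.33", "Koonunga Hill": "4.33", "Bin 407": "2.67"},
    "My Wife":  {"Rawson's Retreat": "6.5",  "Koonunga Hill": "4.5",  "Bin 407": "6.5"},
    "Friend B": {"Rawson's Retreat": "5.5",  "Koonunga Hill": "7.5",  "Bin 407": "7"},
    "Friend K": {"Rawson's Retreat": "6",    "Koonunga Hill": "5.67", "Bin 407": "6.33"},
}


def average_across_people(per_person):
    """Unweighted mean of each wine's per-person averages, rounded half-up to 2 dp."""
    wines = list(next(iter(per_person.values())).keys())
    result = {}
    for wine in wines:
        scores = [Decimal(person[wine]) for person in per_person.values()]
        mean = sum(scores) / len(scores)
        # Decimal avoids binary floating-point surprises when rounding x.xx5 values.
        result[wine] = mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return result


overall = average_across_people(PER_PERSON)
for wine, score in overall.items():
    print(f"{wine}: {score}")
```

Run as-is, this gives 5.83, 5.50 and 5.63, the same figures as the table. Note that it treats every person equally, even those who skipped tastings, which is exactly the weighting problem raised in the issues below.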

Issue #12: Weighting across people like this has at least 2 sub-issues:
  1. As was mentioned in issue #10, some people didn't vote for all tastings, so their results won't be accounted for properly
  2. There was no strict guidance on what a 0 means, compared to a 10. I said that I gave the first wine a 5, because I have no idea about wines, and needed to have somewhere to go, both up and down. Others may have had their own ways of deciding.
Issue #13: Results gathered from so few rounds from so few people are not going to be statistically significant, though I don't know how to calculate it to prove the lack of significance (having long ago forgotten most of the statistics I learned in uni).
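
For the curious, one way the significance question could be approached is a small permutation test. This is only a sketch of a possible method, not something I did on the night: it assumes the per-person averages above are a fair summary (they aren't, see issue #10), and it asks how often a gap between the best and worst wine at least as large as ours would turn up if each person's three averages were shuffled among the wines at random. The names `SCORES`, `spread` and `permutation_p_value` are invented for the example.

```python
from itertools import permutations, product

# Per-person average scores in hundredths of a point (integers keep comparisons exact).
# Column order: Rawson's Retreat, Koonunga Hill, Bin 407.
SCORES = [
    (533, 433, 267),  # first taster (unnamed in the post)
    (650, 450, 650),  # My Wife
    (550, 750, 700),  # Friend B
    (600, 567, 633),  # Friend K
]


def spread(rows):
    """Gap between the highest and lowest per-wine totals across all tasters."""
    totals = [sum(column) for column in zip(*rows)]
    return max(totals) - min(totals)


def permutation_p_value(rows):
    """Share of within-taster shufflings whose spread is at least the observed one."""
    observed = spread(rows)
    # Every way of reassigning each taster's three scores to the three wines.
    arrangements = list(product(*(permutations(row) for row in rows)))
    as_extreme = sum(1 for arrangement in arrangements if spread(arrangement) >= observed)
    return as_extreme / len(arrangements), len(arrangements)


p_value, n_arrangements = permutation_p_value(SCORES)
print(f"{n_arrangements} arrangements checked, p-value = {p_value:.3f}")
```

A small p-value (conventionally below 0.05) would suggest the wines really were scored differently; a large one means a gap like ours is what random shuffling produces anyway. With only four tasters the test has very little power, so treat the output as illustration rather than proof.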


As can be seen from the 13 issues I highlighted with the experiment, which ranged in severity, scientifically rigorous experimentation is very difficult to do. Part of why I wrote this series of posts was to highlight that. When reading a headline or summary in the Herald Sun, The Register or The Daily Mail, it's very easy to just believe what is presented as fact, because "they" did a study. However, unless you read and understand the detailed reports, and unless you find some way to know that people haven't tweaked things a bit or failed to disclose the experiment's shortcomings, you can't just believe what you read.

Back to the wine-tasting... The experiment was set up quite well, with no serious issues there. The conduct of the experiment, however, would disqualify it from being a proper double-blind experiment, and any scientist who agreed to pass it for a peer-reviewed journal would probably practise homoeopathy on the side. A proper double-blind experiment needs to be repeatable, and the social interactions that took place in this experiment negate that possibility and call the results into question. If the test subjects had been in separate rooms, with an objective measure against which to score, and each tasting had only been given after the previous one was scored, the results would hold far more weight, but the night would have been less fun. It's a balancing act.

That said... Hahaha, the $7 Rawson's Retreat and the $12 Koonunga Hill beat the $52 Bin 407. Admittedly, if I hadn't been in the experiment, the Bin 407 would have won, but considering this whole experiment was to decide whether it was worth it for me to spend the money on expensive wine, I think that's been pretty well answered... No.
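
The "if I hadn't been in the experiment" claim can be checked with a quick leave-one-out calculation. This is a sketch under one assumption: that the first, unnamed results table is mine, which I've labelled "Taster 1 (assumed author)" below. The function name is made up for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-person average scores copied from the tables in this post.
# Assumption: the first, unnamed table belongs to the author.
PER_PERSON = {
    "Taster 1 (assumed author)": {"Rawson's Retreat": "5.33", "Koonunga Hill": "4.33", "Bin 407": "2.67"},
    "My Wife":  {"Rawson's Retreat": "6.5", "Koonunga Hill": "4.5",  "Bin 407": "6.5"},
    "Friend B": {"Rawson's Retreat": "5.5", "Koonunga Hill": "7.5",  "Bin 407": "7"},
    "Friend K": {"Rawson's Retreat": "6",   "Koonunga Hill": "5.67", "Bin 407": "6.33"},
}


def averages_without(per_person, excluded):
    """Mean per-wine score over everyone except `excluded`, rounded half-up to 2 dp."""
    kept = [scores for name, scores in per_person.items() if name != excluded]
    wines = list(kept[0].keys())
    return {
        wine: (sum(Decimal(scores[wine]) for scores in kept) / len(kept)).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        for wine in wines
    }


without_author = averages_without(PER_PERSON, "Taster 1 (assumed author)")
winner = max(without_author, key=without_author.get)
for wine, score in without_author.items():
    print(f"{wine}: {score}")
print(f"Top-scoring wine without the author: {winner}")
```

With my scores removed, the averages come out at 6.00 for Rawson's Retreat, 5.89 for Koonunga Hill and 6.61 for the Bin 407, so the expensive bottle does take the lead.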


  1. Wow, I didn't realise I had made so many correct guesses! I was sure I had mixed up the 407 with the Rawson's a couple of times. By the end it was getting really hard to tell the difference because they all started blending into one unpleasant taste. That's why I stopped.

  2. I think that the lesson learnt here is that the more you know about wine, the more you will need to spend to enjoy it.

    Sounds like not taking a wine tasting course and sticking to $3 bottles of clean-skin wine might be a clever thing to do! :)