Watch this next?: the $1 million data mining quality challenge
Quality is often measured in terms of accuracy (or Accurateness, according to this definition). For a shopping site, the more closely you can predict what a customer likes, the more you can sell and the more inclined your customer will be to buy more and to stay loyal to you as a supplier. As any gambler knows, prediction is not a science, but historical results can provide a best guess. Can you put a price on this? Who knows. One company has offered a prize for making it better, though: one million dollars.
People with similar goals tend mostly to think alike. Robert Sabourin regularly runs a testing experiment, where he describes a piece of software and asks a group of testers for test ideas to check functionality, error conditions and issues like performance. When he analyses the data, there is a large degree of overlap, and surprisingly few unique ideas.
This is the principle behind suggested purchases on many shopping sites. Data mining based on the purchase histories of other consumers who looked at the same item allows relevant items to be suggested. Given that this is based on a binary choice of buy or don’t buy, it is much more straightforward than selections based on personal preferences. It can still go wrong, as it did on May 19 for this customer, who had a pet nail groomer suggested as a purchase with a video tennis game!
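To make the idea concrete, here is a minimal sketch of a "customers who bought this also bought" counter. It is only an illustration under stated assumptions: the basket data and the function names (build_co_purchase_counts, suggest) are hypothetical, and no real shopping site is claimed to work exactly this way.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # set() so an item repeated within one basket is counted once
        for item, other in permutations(set(basket), 2):
            counts[item][other] += 1
    return counts


def suggest(counts, item, top_n=3):
    """Return up to top_n items most often bought together with item."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


# Hypothetical purchase histories, invented for this example.
baskets = [
    ["tennis game", "console"],
    ["tennis game", "console", "controller"],
    ["tennis game", "pet nail groomer"],
]
counts = build_co_purchase_counts(baskets)
print(suggest(counts, "tennis game", top_n=1))
```

With so little data, a single odd basket is enough to put the pet nail groomer on the suggestion list for the tennis game, which is roughly how a mismatched suggestion like the one above can come about.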
Movie rental company Netflix uses an algorithm called Cinematch to suggest movies to customers. This is based on a ranking score from one to five stars, but the predicted ratings can sometimes be off by a full star from the customer's subsequent rating. In order to improve this, Netflix have offered a million dollar prize for a better algorithm. Basically anyone with an internet connection can enter. A data set of 100 million anonymous (maybe not; is privacy one of our quality criteria?) movie ratings over 5 years is available as test data. The competition has been going since October 2006. All algorithms are checked against a set accuracy criterion (known as root mean square error). To win the prize, a ten percent improvement on Cinematch is required. The new algorithms are currently edging towards nine percent. There has been one $50,000 annual progress prize so far, and the competition will run till at least October 2011. Given the current progress, I think they should expect to be able to award the prize well before then, but I guess they hope the big prize will see ongoing improvements until 2011. The leaderboard is public and you can watch the improvements as they edge towards the goal.
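For anyone wondering what the accuracy criterion actually measures, here is a minimal sketch of root mean square error and of the percentage improvement over a baseline. The ratings below are made up for illustration, the function names are my own, and this is not the competition's scoring code.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and actual star ratings."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def improvement_percent(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse * 100.0


# Made-up ratings: two predictions are off by one star, two are exact.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 2.0]
print(rmse(predicted, actual))          # sqrt((1 + 0 + 1 + 0) / 4) = sqrt(0.5)
print(improvement_percent(1.0, 0.9))    # a drop from 1.0 to 0.9 is about 10%
```

Because the errors are squared before averaging, a prediction that is off by two stars costs four times as much as one that is off by a single star, so the big misses dominate the score.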
The data set is interesting as it follows the classic 80:20 rule, with a small number of movies having most of the ratings. This rule also typically holds for software functionality with a small subset of functionality having most of the defects.
The different approaches have varied from straight mathematical to behavioral economics to psychology. These were highlighted in a recent article about the leading entrants. The curious (but gratifying) thing is the leaders seem more excited by the intellectual challenge than the money. If you want to go for the prize, read this data mining guide or read the approach of an early leader.
The devil’s advocate in me says “All this advanced mathematics and science and computing power is going just to help suggest a movie to someone? Just point them at some film critic ratings or top 100 movie lists, then put the computing power to SETI or something else. I’ll keep the prize money for safekeeping... [grin]”
[Update: At 17:15:25 on March 18, 2008, team BellKor’s algorithm scored 0.8657, which is an improvement of 9.01%.]