(Supposed) All-time top 10 (non-fatal) IT disasters
This link-packed post is based on a project failures blog post, which is itself based on a list of IT failures (with even more links to more detail). It has some well-known failures, some obscure ones, and some which don’t really qualify at all (exploding batteries? Y2K?). For some reason they chose to exclude failures that led to human fatalities, so it is a very clean list (ever hear of Therac-25?).
Brian Marick talks about the importance of examples in building software. Examples help people understand more clearly. If you are training or mentoring people on testing or software quality, this list has some great examples. For the classic “1 line code fix does not need testing” argument, look at the AT&T network collapse. For the newer “component reuse leads to greater reliability” argument, look at Ariane 5. The Mars lander is a classic example of integration bugs that even the simplest design review should have found. The LA Airport failure is a great example of how important risk planning is. The Airbus failure highlights the importance of configuration management. And the Russian missile defence snafu makes it clear why humans should be in control of computers and not the other way around. Let us give thanks that the world has some intelligent users like Lt Col Stanislav Petrov!
