I guess headlines like “An error found in one paragraph of a 3000 page IPCC report; climate science unaffected” wouldn’t sell many newspapers. And so instead, the papers spin out the story that a few mistakes undermine the whole IPCC process. As if newspapers never ever make mistakes. Well, of course, scientists are supposed to be much more careful than sloppy journalists, so “shock horror, those clever scientists made a mistake. Now we can’t trust them” plays well to certain audiences.
And yet there are bound to be errors; the key question is whether any of them impact any important results in the field. The error with the Himalayan glaciers in the Working Group II report is interesting because Working Group I got it right. And the erroneous paragraph in WGII quite clearly contradicts itself. Stupid mistake, that should be pretty obvious to anyone reading that paragraph carefully. There’s obviously room for improvement in the editing and review process. But does this tell us anything useful about the overall quality of the review process?
There are errors in just about every book, newspaper, and blog post I’ve ever read. People make mistakes. Editorial processes catch many of them. Some get through. But few of these things have the kind of systematic review that the IPCC reports went through. Indeed, as large, detailed, technical artifacts, with extensive expert review, the IPCC reports are much less like normal books, and much more like large software systems. So, how many errors get through a typical review process for software? Is the IPCC doing better than this?
Even the best software testing and review practices in the world let errors through. Some examples (expressed in number of faults experienced in operation, per thousand lines of code):
- Worst military systems: 55 faults/KLoC
- Best military systems: 5 faults/KLoC
- Agile software development (XP): 1.4 faults/KLoC
- The Apache web server (open source): 0.5 faults/KLoC
- NASA Space shuttle: 0.1 faults/KLoC
Because of the extensive review processes, the shuttle flight software is purported to be the most expensive in the world, in terms of dollars per line of code. Yet still about 1 error every ten thousand lines of code gets through the review and testing process. Thankfully none of those errors have ever caused a serious accident. When I worked for NASA on the Shuttle software verification in the 1990’s, they were still getting reports of software anomalies with every shuttle flight, and releasing a software update every 18 months (this, for an operational vehicle that had been flying for two decades, with only 500,000 lines of flight code!).
The IPCC reports consist of around 3000 pages, and approaching 100 lines of text per page. Let’s assume I can equate a line of text with a line of code (which seems reasonable, when you look a the information density of each line in the IPCC reports) – that would make them as complex as a 300,000 line software system. If the IPCC review process is as thorough as NASA’s, then we should still expect around 30 significant errors made it through the review process. We’ve heard of two recently – does this mean we have to endure another 28 stories, spread out over the next few months, as the drone army of denialists toils through trying to find more mistakes? Actually, it’s probably worse than that…
The IPCC writing, editing and review processes are carried out entirely by unpaid volunteers. They don’t have automated testing and static analysis tools to help – human reviewers are the only kind of review available. So they’re bound to do much worse than NASA’s flight software. I would expect there to be 100s of errors in the reports, even with the best possible review processes in the world. Somebody point me to a technical review process anywhere that can do better than this, and I’ll eat my hat. Now, what was the point of all those newspaper stories again? Oh, yes, sensationalism sells.