Well, this is what it comes down to. Code reviews on national TV. Who would have thought it? And, by the standards of a Newsnight code review, the code in question doesn’t look so good. Well, it’s not surprising it doesn’t. It’s the work of a single untrained programmer, working in an academic environment, trying to reconstruct someone else’s data analysis. And given the way in which the CRU files were stolen, we can be pretty sure this is not a random sample of code from the CRU; it’s handpicked to be one of the worst examples.
Watch the clip from about 2:00. They compare the code with some NASA code, although we’re not told what exactly. Well, duh. If you compare the experimental code written by one scientist on his own, which has clearly not been through any code review, with that produced by NASA’s engineering processes, of course it looks messy. For any programmers reading this: How many of you can honestly say that you’d come out looking good if I trawled through your files, picked the worst piece of code lying around in there, and reviewed it on national TV? And the “software engineer” on the program says it’s “below the standards you would expect in any commercial software”. Well, I’ve seen a lot of commercial software. It’s a mix of good, bad, and ugly. If you’re deliberate with your sampling technique, you can find a lot worse out there.
Does any of this matter? Well, a number of things bug me about how this is being presented in the media and blogosphere:
- The first, obviously, is the ridiculous conclusion that many people seem to be drawing: that poor code quality in one deliberately selected program file somehow invalidates all of climate science. As cdavid points out towards the end of this discussion, if you’re going to do that, then you pretty much have to throw out most results in every field of science over the past few decades for the same reason. Bad code is endemic in science.
- The slightly more nuanced, but equally specious, conclusion that bugs in this code mean that research results at the CRU must be wrong. Eric Raymond picks out an example he calls blatant data-cooking, but is quite clearly fishing for results, because he ignores the fact that the correction he picks on is never used in the code, except in parts that are commented out. He’s quote mining for effect, and given Raymond’s political views, it’s not surprising. Just for fun, someone quote mined Raymond’s own code, and was horrified at what he found. Clearly we have to avoid all open source code immediately because of this…? The problem, of course, is that none of these quote miners have gone to the trouble to establish what this particular code is, why it was written, and what it was used for.
- The widely repeated assertion that this just proves that scientific software must be made open source, so that a broader community of people can review it and improve it.
It’s this last point that bothers me most, because at first sight, it seems very reasonable. But actually, it’s a red herring. To understand why, we need to pick apart two different arguments:
- An argument that when a paper is published, all of the code and data on which it is based should be released so that other scientists (who have the appropriate background) can re-run it and validate the results. In fields with complex, messy datasets, this is exceedingly hard, but might be achievable with good tools. The complete toolset needed to do this does not exist today, so just calling for making the code open source is pointless. Much climate code is already open source, but that doesn’t mean anyone in another lab can repeat a run and check the results. The problems of reproducibility have very little to do with whether the code is open – the key problem is to capture the entire scientific workflow and all data provenance. This is very much an active line of research, and we have a long way to go. In the absence of this, we rely on other scientists testing the results with other methods, rather than repeating the same tests. Which is the way it’s done in most branches of science.
- An argument that there is a big community of open source programmers out there who could help. This is based on a fundamental misconception about why open source software development works. It matters how the community is organised, and how contributions to the code are controlled by a small group of experts. It matters that it works as a meritocracy, where programmers need to prove their ability before they are accepted into the inner developer group. And most of all, it matters that the developers are the domain experts. For example, the developers who built the Linux kernel are world-class experts on operating systems and computer architecture. Quite often they don’t realize just how high their level of expertise is, because they hang out with others who have the same level of expertise. Likewise, it takes years of training to understand the dynamics of atmospheric physics in order to be able to contribute to the development of a climate simulation model. There is not a big pool of people with the appropriate expertise to contribute to open source climate model development, nor is there ever likely to be, unless we expand our PhD programs in climatology dramatically (I’m sure the nay-sayers would like that!).
We do know that most of the heavy duty climate models are built at large government research centres, rather than at universities. Dave Randall explains why this is: the operational overhead of developing, testing and maintaining a Global Climate Model is far too high for university-based researchers. The universities use (parts of) the models, and do further data analysis on both observational data and outputs from the big models. Much of this is the work of individual PhD students or postdocs. Which means that the argument that all code written at all stages of climate research must meet some gold standard of code quality is about as sensible as saying no programmer should ever be allowed to throw together a script to test out if some idea works. Of course bad code will get written in a hurry. What matters is that as a particular line of research matures, the coding practices associated with it should mature too. And we have plenty of evidence that this is true of climate science: the software practices used at the Hadley Centre for their climate models are better than most commercial software practices. Furthermore, they manage to produce code that appears to be less buggy than just about any other code anywhere (although we’re still trying to validate this result, and understand what it means).
None of this excuses bad code written by scientists. But the sensible response to this problem is to figure out how to train scientists to be better programmers, rather than argue that some community of programmers other than scientists can take on the job instead. The idea of open source climate software is great, but it won’t magically make the code better.
This is what happens when we rely on media hype and word of mouth to publicize a scientific field. As we can see, the media can be just as damaging as it is helpful. It is worth watching Richard Dawkins’ documentary film “The Enemies Of Reason” (available on Google video).
Surely open source developers who work on things like the Linux kernel are world-class experts on operating systems and computer architectures precisely _because_ they’ve had first-hand experience working on that code? This includes making all the usual coding mistakes and also understanding why certain approaches don’t work through trying them out – basically doing all the right things necessary to really understand a subject in detail beyond the superficial. By the same token, we can better educate a larger community of interested individuals about a variety of scientific disciplines by allowing them to engage directly with the practice of that science. One relatively easy way to do this (easy in the sense that it fits in with current practice within both communities) is to develop scientific codes using the open source model – including allowing contributions from non-scientific developers. Indeed, it’s likely the open source developers and scientists will both learn a great deal from each other by this process.
Hi Steve,
Perhaps we should turn over cyber-security to the scientists as well? 🙂
Wading through the hay of all the straw men you so enthusiastically knock down (which I admit is fun to read), I am still left wondering. There is a knowledge domain called simulation-based engineering science. It includes a pretty well defined approach to V&V (includes method of manufactured solutions, etc.). IMHO, this “community of programmers” would be able to “take on the job instead” of the scientists. And since confidence is the crucial issue, their independence would be invaluable.
The point that I think you are not directly addressing is the “accreditation” problem. V&V assumes that an accredited authority performs the V&V. If the confidence in some portion of climate science has been called into question, climate science cannot logically use its authority to reestablish its authority. Outside remediation is needed from a trusted authority. For example, for the “hockey-stick” issue, IIRC, outside statisticians were brought in.
[That’s an awfully big “if” in your final paragraph. There are plenty of allegations, but no evidence at all that even one single published result is affected by all this. -Steve]
Dan – excellent points. I agree that developers learn a lot about the domain from participating in open source projects, just as climate scientists learn a lot from building their own models. But my point was mainly about the entry criteria. Torvalds began work on Linux as a computer science student, and later wrote his master’s thesis on it. Most other contributors have degrees in computer science and many years’ experience in computing. The equivalent entry criteria for climate models would be a degree in meteorology or atmospheric physics, and many years’ experience in climate research.
The idea of getting a broader audience to understand what climate models are by participating in building them is a pet project of mine (see: http://www.easterbrook.ca/steve/?p=482). But setting them to work on an existing climate model wouldn’t work; we’d have to build our own simplified model, aimed at elucidation of the principles, rather than doing cutting edge climate science.
I’d care a lot less about seeing all the source and data if I could just ignore climate scientists and shop elsewhere. But since I’m expected to hand over $$$ and change my lifestyle because of this research, your arguments ring hollow.
Regarding comment #4, I don’t personally have to understand all the code and methods myself. It’s enough that people I trust are saying good things about it. In the case of open source encryption software, for example, I can be certain that if there was a way to break the encryption people would be shouting about it.
[You can shop elsewhere – there are thousands of climate scientists across the world. If you don’t like the CRU folks, go to any one of a large number of climate science labs elsewhere (start here: http://www.realclimate.org/index.php/data-sources/). An analogy: Imagine your doctor told you that you have to change your eating habits, or your heart is unlikely to last out the year. You would go and get a second opinion from another doctor. And maybe a third. But when every qualified doctor tells you the same thing, do you finally accept their advice, or do you go around claiming that all doctors are corrupt? – Steve]
A couple of other things occur to me.
“How many of you can honestly say that you’d come out looking good if I trawled through your files, picked the worst piece of code lying around in there, and reviewed it on national TV?”
I’ve written poor quality code. I’ve never written poor quality open source code, because I know that other people might see it. This is an advantage of writing open source code — it improves quality.
“There are plenty of allegations, but no evidence at all that even one single published result is affected by all this.”
There are plenty of questions that would be answered by open sourcing the code. For example, one chap is scrabbling around trying to make sense of station adjustments. Questions like that would go away more quickly if people could see how such adjustments came about.
At the moment it’s plausible to claim that such problems invalidate anything that relies on these adjusted station records.
[The idea of open source on its own does not solve this problem. You have to build a knowledgeable community, and have a way of dealing with deliberate denial of service attacks on scientists’ time. People like Watts aren’t going to stop making up spurious attacks on the science just because the data is open. He’s been trying to find errors in station adjustments for years, and has found nothing wrong yet. That isn’t science, it’s more like the Spanish Inquisition. – Steve]
One minor point on that BBC Newsnight item: the software engineer comments on the lack of audit history in the source. That might have just about been a reasonable observation a decade or more ago but now most people would keep the code in a version control system of some sort and so have the audit history there rather than in a header comment.
Fun follow-up to the BBC story: John Graham-Cumming, the software engineer quoted in the story, downloaded some of the newly available data, found a discrepancy in some temperatures in Oceania, reported it to the Met Office, and received a gracious response.