Here’s a question I’ve been asking a few people lately, ever since I asserted that climate models are big expensive scientific instruments: How expensive are we talking about? Unfortunately, it’s almost impossible to calculate. The effort of creating a climate model is tangled up with the scientific research, such that you can’t even reliably determine how much of a particular scientist’s time is “model development” and how much is “doing science”. The problem is that you can’t build the model without a lot of that “doing science” part, because the model is the result of a lot of thinking, experimentation, theory building, testing hypotheses, analyzing simulation results, and discussions with other scientists. Many pieces of the model are based on the equations or empirical results in published research papers; even if you’re not doing the research yourself, you still have to keep up with the literature, understand the state-of-the-art, and know which bits of research are mature enough to incorporate into the model.
So, my first cut, which will be an over-estimate, is that *all* of the effort at a climate modeling lab is necessary to build the model. Labs vary in size, but a typical climate modeling lab is of the order of 200 people (including scientists, technicians, and admin support). And most of the models I’ve looked at have been under steady development for twenty years or more. So that gives us a starting point of 200 × 20 = 4,000 person-years. Luckily, most scientists care more about science than salary, so they’re much cheaper than software professionals. Given that we’ll have a mix of postdocs and senior scientists, let’s say the average salary is around $150,000 per year, including benefits and other overheads. That’s $600 million.
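To make the arithmetic explicit, here’s the same back-of-envelope calculation as a few lines of Python (the numbers are just the rough estimates above, not measured data):

```python
# Upper bound: treat the whole lab's effort as model development.
lab_size = 200          # people (scientists, technicians, admin support) -- rough estimate
years = 20              # years of sustained development
avg_salary = 150_000    # $/person-year, including benefits and overheads -- rough estimate

person_years = lab_size * years          # 4,000 person-years
salary_cost = person_years * avg_salary  # $600,000,000
print(f"{person_years:,} person-years -> ${salary_cost/1e6:.0f} million in salary")
```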
Oh, and that doesn’t include the cost of equipping and operating a tier-2 supercomputing facility, as the climate model runs will easily keep such a facility fully loaded full time (and we’ll need to factor in the cost of replacing the supercomputer every few years to take advantage of performance increases). In most cases, the supercomputing facilities are shared with other scientific users of high-performance computing. But there is one centre that’s dedicated to climate modeling, the DKRZ in Hamburg, which has an annual budget of around 30 million euros. Let’s pretend euros are dollars and call that $30 million per year, which over 20 years gives us another $600 million. The latest supercomputer at DKRZ, Blizzard, cost 35 million euros. Let’s say we replace it every five years and throw in some more money for many terabytes of data storage; that gets us to around $200 million for hardware.
Grand total: $1.4 billion.
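Putting those pieces together (and continuing to pretend euros are dollars), the upper-bound tally looks like this:

```python
# Upper-bound tally, treating euros as dollars for simplicity.
salary_cost = 4_000 * 150_000     # $600M: 4,000 person-years at $150k each
facility_cost = 30_000_000 * 20   # $600M: ~$30M/year to operate a DKRZ-sized centre
hardware_cost = 200_000_000       # ~$200M: a ~$35M supercomputer every 5 years, plus storage

grand_total = salary_cost + facility_cost + hardware_cost
print(f"Grand total: ${grand_total/1e9:.1f} billion")   # -> $1.4 billion
```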
Now, I said that’s an over-estimate. Over lunch today I quizzed some of the experts here at IPSL in Paris, and they thought that 1,000 person-years (50 people for 20 years) was a better estimate of the actual model development effort. This seems reasonable – it means that only 1/4 of the research at my 200-person research institute directly contributes to model development; the rest is science that uses the model but isn’t essential for developing it. So that brings the salary figure down to $150 million. I probably have to do the same conversion for the supercomputing facilities – let’s say about 1/4 of the supercomputing capacity is reserved for model development and testing. That also feels about right: 5–10% of the capacity is reserved for test processes (e.g. the ones that run automatically every day as part of the build-and-test process), and a further 10–20% might be used for validation runs on development versions of the model.
That brings the grand total down to $350 million.
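The revised figure is the same sketch with smaller fractions: 1,000 person-years of direct development effort, and roughly a quarter of the computing budget attributed to development and testing:

```python
# Revised estimate: only part of the lab's effort and computing goes into model development.
dev_person_years = 1_000                  # 50 people for 20 years
salary_cost = dev_person_years * 150_000  # $150M
computing_share = 0.25                    # fraction of facility + hardware used for development/testing
computing_cost = computing_share * (600_000_000 + 200_000_000)   # $200M

print(f"Revised total: ${(salary_cost + computing_cost)/1e6:.0f} million")   # -> $350 million
```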
Now, it has been done for less than this. For example, the Canadian Climate Centre, CCCma, has a modeling team one-tenth this size, although they do share a lot of code with the Canadian Meteorological Service. And their model isn’t as full-featured as some of the other GCMs (it also has a much smaller user base). As with other software projects, the costs don’t scale linearly with functionality: a team of 5 software developers can achieve much more than 1/10th of what a team of 50 can (cf. The Mythical Man-Month). Oh, and the computing costs won’t come down much at all – the CCCma model is no more efficient than other models. So we’re still likely to be above the $100 million mark.
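To see why a tenth of the team doesn’t mean a tenth of the output, here’s a toy Brooks-style illustration; the overhead figure is invented purely for illustration, not a measurement of any real team:

```python
# Toy illustration of Brooks' argument: larger teams spend a growing share of their
# time on coordination, so output per person falls as the team grows.
def effective_output(team_size, overhead_per_pair=0.0005):
    """Nominal output minus a penalty proportional to the number of communication paths."""
    pairs = team_size * (team_size - 1) / 2
    return team_size * max(0.0, 1.0 - overhead_per_pair * pairs)

print(effective_output(5))    # ~5.0 "person-units" of output
print(effective_output(50))   # ~19.4 -- far less than 10x the small team
```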
Now, there are probably other ways of figuring it – so far we’ve only looked at the total cumulative investment in one of today’s world-leading climate models. What about replacement costs? If we had to build a new model from scratch, using what we already know (rather than doing all the research over again), how much would that cost? Well, nobody has ever done this, but there are a few experiences we could draw on. For example, the Max Planck Institute has been developing a new model from scratch, ICON, which uses an icosahedral grid and hence needs a new approach to the dynamics. The project has been going for 8 years. It started with just a couple of people, and has ramped up to about a dozen. But they’re still a long way from being done, and they’re re-using a lot of the physics code from their old model, ECHAM. On the other hand, it’s an entirely new approach to the grid structure, so a lot of the early work was pure research.
Where does that leave us? It’s really a complete guess, but I would suggest a team of 10 people (half of them scientists, half scientific programmers) could re-implement the old model from scratch (including all the testing and validation) in around 5 years. Unfortunately, climate science is a fast moving field. What we’d get at the end of 5 years is a model that, scientifically speaking, is 5 years out of date. Unless of course we also paid for a large research effort to bring the latest science into the model while we were constructing it, but then we’re back where we started. I think this means you can’t replace a state-of-the-art climate model for much less than the original development costs.
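For what it’s worth, the nominal salary bill for that thought experiment is tiny compared to the cumulative figures above – which is exactly why the “out of date” caveat matters:

```python
# Nominal cost of the re-implementation thought experiment: 10 people for 5 years,
# using the same rough $150k/person-year figure (computing costs not included).
reimpl_cost = 10 * 5 * 150_000    # $7.5M in salary
print(f"Re-implementation salary bill: ${reimpl_cost/1e6:.1f} million")
# ...but the result is a model that's 5 years behind the science, unless you also fund
# the ongoing research -- at which point you're back to the full development cost.
```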
What’s the conclusion? The bottom line is that the development cost of a climate model is in the hundreds of millions of dollars.
I’ll do it for $50M. And a beer. 🙂
A five-year plan leaving the product five years behind would be a waterfall approach, wouldn’t it? It seems to me that the research scientists are most needed in the requirements phase, which can be run periodically, feeding change requests into an architecture/design group run by the scientific programmers. The end product is never really an end product that way, and you reserve the specialized talents for the groups that use them best. Each product release must pass the tests the original code passed, and you have your anchor to the reality the scientists would have written had they done the whole job.
Alfred: the challenge, as Steve points out elsewhere, is how you validate climate models: what tests can you set that the code passes or fails? Some bits can be done – mass balances, for instance: your model mustn’t lose mass, and the dynamics can be verified – but parameterizations will change the whole model output, and it really is a research question to decide whether the model has “passed” or not.
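To make that concrete, here’s the kind of check that *can* be automated (a hypothetical sketch – the field names and tolerance are invented, not taken from any real model’s test suite):

```python
import numpy as np

def total_mass(density, cell_volumes):
    """Total mass on the model grid: sum of density * cell volume."""
    return np.sum(density * cell_volumes)

def check_mass_conservation(density_before, density_after, cell_volumes, rel_tol=1e-10):
    """A conservation check: total mass should not drift over a timestep
    (beyond floating-point noise). Parameterization quality can't be tested this way."""
    m0 = total_mass(density_before, cell_volumes)
    m1 = total_mass(density_after, cell_volumes)
    assert abs(m1 - m0) <= rel_tol * abs(m0), f"mass drifted: {m0} -> {m1}"
```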
I’m working on one climate model (EC-Earth, new to the IPCC process), and during ocean spinup we saw the North Atlantic dip into a near ice-age state at “1850”. It recovered, and looks “normal” now. Now obviously this didn’t happen in real life (did it? how good are our observations from that period? :-)), but it took serious work to check whether this invalidated the model, or is a real potential state of our ocean. We decided it was, and we use it as one ensemble member state for testing projections (with caveat notes attached).
But we know model runs are sensitive to initial conditions: do we run the model multiple times and take the ensemble average before applying unit and system tests?
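One way to read that question in code (again a hypothetical sketch – `run_model`, the perturbation size, and the diagnostic are placeholders, not part of any real test harness):

```python
import numpy as np

def ensemble_mean_diagnostic(run_model, base_state, n_members=10, seed=0):
    """Run the model from slightly perturbed initial conditions and average the result,
    so that system-level checks compare the ensemble mean rather than a single noisy run."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_members):
        perturbed = base_state + rng.normal(scale=1e-4, size=base_state.shape)
        results.append(run_model(perturbed))
    return np.mean(results, axis=0)

# A system test might then assert that the ensemble-mean diagnostic stays within
# an expected range, rather than checking any individual realization.
```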
I understand how testing must be a research question for new models. However, when old models exist and one wants to convert, can’t we use the old model to test against, as well as the research opinion? I’m doubtful that the grunt coders have to be all that knowledgeable about the science, beyond the basics, for testing the assertions the science staff insists upon.
It’s been many, many years since I’ve seen a waterfall-type project attempted in my line of business, let alone succeed. I almost consider it an archaic form. Labor requirements can be divided among people, among groups, and across the calendar, so this strikes me as an organizational limitation and not a fundamental one.
Alfred: You’ve misunderstood the point of my comment. I’m certainly not advocating a waterfall style approach. My point about a five year re-implementation project was a thought experiment to estimate the costs, not a practical proposition. If you attempt to separate the model development from the science, you might get something that’s elegant from a software point of view, but scientifically useless. The whole point is that the models are scientific thinking tools, created by scientists to experiment with their understanding of earth system processes.