The Muir Russell report came out today, and I just finished reading the thing. It should be no surprise to anyone paying attention that it completely demolishes the allegations that have been made about the supposed bad behaviour of the CRU research team. But overall, I’m extremely disappointed, because the report completely misses the wood for the trees. It devotes over 100 pages to a painstaking walk through every single allegation made against the CRU, assessing the evidence for each, and demolishing them one after another. The worst it can find to say about the CRU is that it hasn’t been out there in the lead over the last decade in responding to the new FoI laws, adapting to the rise of the blogosphere, and adapting to the changing culture of openness for scientific data. The report makes a number of recommendations for improvements in processes and practices at the CRU, and so can be taken as mildly critical, especially of CRU governance. But in so doing, it never really acknowledges the problems a small research unit (varying between 3.5 and 5 FTE staff over the last decade) would have in finding the resources and funding to be an early adopter in open data and public communication, while somehow managing to do cutting edge research in its area of expertise too. Sheesh!

But my biggest beef with the report is that nowhere, in 100 pages of report plus 60 pages of appendices, does it ever piece together the pattern represented by the set of allegations it investigates. Which means it achieves nothing more than being one more exoneration in a very long list of exonerations of climate scientists. It will do nothing to stop the flood of hostile attacks on science, because it never once considers the nature of those attacks. Let’s survey some of the missed opportunities…

I’m pleased to see the report cite some of the research literature on the nature of electronic communication (e.g. the early work of Sara Kiesler et al), but it’s a real pity they didn’t read much of this literature. One problem recognized even in early studies of email communication is the requesters/informers imbalance. Electronic communication makes it much easier for large numbers of people to offload information retrieval tasks onto others, and receivers of such requests find it hard to figure out which requests they are obliged to respond to. They end up being swamped. Which is exactly what happened with that (tiny) research unit in the UK, when a bunch of self-styled auditors went after them.

And similar imbalances pervade everything. For example on p42, we have:

“There continues to be a scientific debate about the reality, causes and uncertainties of climate change that is conducted through the conventional mechanisms of peer-reviewed publication of results, but this has been paralleled by a more vociferous, more polarised debate in the blogosphere and in popular books. In this the protagonists tend to be divided between those who believe that climate is changing and that human activities are contributing strongly to it, and those that are sceptical of this view. This strand of debate has been more passionate, more rhetorical, highly political and one in which each side frequently doubts the motives and impugns the honesty of the other, a conflict that has fuelled many of the views expressed in the released CRU emails, and one that has also been dramatically fuelled by them.” (page 42, para 26)

But the imbalance is clear. This highly rhetorical debate in the blogosphere occurs between, on the one hand, a group of climate scientists with many years’ training, and whose expertise is considerable (and the report makes a good job of defending their expertise), and on the other hand, a bunch of amateurs, most of whom have no understanding of how science works, and who are unable to distinguish scientific arguments from ideology. And the failure to recognise this imbalance leads the report to conclude that a suitable remedy is to:

“…urge all scientists to learn to communicate their work in ways that the public can access and understand; and to be open in providing the information that will enable the debate, wherever it occurs, to be conducted objectively.” (page 42, para 28)

No, no, no. As I said very strongly earlier this year, this is naive and irresponsible. No scientist can be an effective communicator in a world where people with vested interests will do everything they can to destroy his or her reputation.

Chapter 6 of the report, on the land station temperature record, ought to shut Steve McIntyre up forever. But of course it won’t, because he’s not interested in truth, only in the dogged determination to find fault with climate scientists’ work no matter what. Here are some beautiful quotes:

“To carry out the analysis we obtained raw primary instrumental temperature station data. This can be obtained either directly from the appropriate National Meteorological Office (NMO) or by consulting the World Weather Records (WWR) …[web links elided] … Anyone working in this area would have knowledge of the availability of data from these sources.” (Page 46, paras 13-14)

“Any independent researcher may freely obtain the primary station data. It is impossible for a third party to withhold access to the data.” (Page 48, para 20).

…well, anyone, that is, except McIntyre and followers, who continue to insist, despite all evidence to the contrary, that climate scientists are withholding station data.

And on sharing the code, the report is equally dismissive of the allegations:

“The computer code required to read and analyse the instrumental temperature data is straightforward to write based upon the published literature.  It amounts a few hundred lines of executable code (i.e. ignoring spaces and comments). Such code could be written by any research unit which is competent to reproduce or test the CRUTEM analysis.  For the trial analysis of the Review Team, the code was written in less than two days and produced results similar to other independent analyses. No information was required from CRU to do this.” (Page 51, para 33)

I like the “any research unit which is competent to reproduce or test the CRUTEM analysis” bit. A lovely British way of saying that  the people making allegations about lack of openness are incompetent. And here’s another wonderful British understatement, referring to ongoing criticism of Briffa’s 1992 work:

“We find it unreasonable that this issue, pertaining to a publication in 1992, should continue to be misrepresented widely to imply some sort of wrongdoing or sloppy science.” (page 62, para 32)

Unreasonable? Unreasonable? It’s an outrage, an outrage I tell you!! (translation provided for those who don’t speak British English).

And there’s that failure to address the imbalance again. In examining the allegations from Boehmer-Christiansen, editor of the notoriously low-quality journal Energy and Environment, that the CRU researchers tried to interfere with the peer-review process, we get the following bits of evidence: an email sent by Boehmer-Christiansen to a variety of people with the subject line “Please take note of potetially [sic] serious scientific fraud by CRU and Met Office.”, and Jones’ eventual reply to her head of department: “I don’t think there is anything more you can do. I have vented my frustration and have had a considered reply from you”, which leads to the finding:

“We see nothing in these exchanges or in Boehmer-Christiansen’s evidence that supports any allegation that CRU has directly and improperly attempted to influence the journal that she edits. Jones’ response to her accusation of scientific fraud was appropriate, measured and restrained.” (page 66, para 14).

Again, a missed opportunity to comment on the imbalance here. Boehmer-Christiansen is able to make wild and completely unfounded accusations of fraud, and nobody investigates her, while Jones’ reactions to the allegations are endlessly dissected, and in the end everything’s okay, because his response was “appropriate, measured and restrained”. No, that doesn’t make it okay. It means someone failed to ask some serious questions about how and why people like Boehmer-Christiansen can be allowed to get away with continual smearing of respected climate scientists.

So, an entire 160 pages, in which the imbalance is never once questioned – the imbalance between the behaviour that’s expected of climate scientists, and the crap that the denialists are allowed to get away with. Someone has to put a stop to their nonsense, but unfortunately, Muir Russell ducked the responsibility.

Postscript: my interest in software engineering issues makes me unable to let this one pass without comment. The final few pages of the report criticize the CRU for poor software development standards:

“We found that, in common with many other small units across a range of universities and disciplines, CRU saw software development as a necessary part of a researcher‘s role, but not resourced in any professional sense.  Small pieces of software were written as required, with whatever level of skill the specific researcher happened to possess.  No formal standards were in place for: Software specification and implementation; Code reviews; and Software testing” (page 103, para 30).

I don’t dispute this – it is common across small units, and it ought to be fixed. However, it’s a real shame the report doesn’t address the lack of resources and funding for this. But wait. Scroll back a few pages…

“The computer code required to read and analyse the instrumental temperature data is straightforward to write […] It amounts a few hundred lines of executable code […]  For the trial analysis of the Review Team, the code was written in less than two days and produced results similar to other independent analyses.” (page 51, para 33)

Er, several hundred lines of code written in less than 2 days? What, with full software specification, code review, and good quality testing standards? I don’t think so. Ironic that the review team can criticize the CRU software practices, while taking the same approach themselves. Surely they must have spotted the irony?? But, apparently not. The hypocrisy that’s endemic across the software industry strikes again: everyone has strong opinions about what other groups ought to be doing, but nobody practices what they preach.

The IPCC schedule impacts nearly all aspects of climate science. At the start of this week’s CCSM workshop, Thomas Stocker from the University of Bern, and co-chair of working group 1 of the IPCC, gave an overview of the road toward the fifth assessment report (AR5), due to be released in 2013.

First, Thomas reminded us that the IPCC does not perform science (its job is to assess the current state of the science), but increasingly it stimulates science. This causes some tension though, as curiosity-driven research must remain the priority for the scientific community.

The highly politicized environment also poses a huge risk. There are some groups actively seeking to discredit climate science and damage the IPCC, which means that the rigor of the IPCC procedures is now particularly important. One important lesson from the last year is that there is no procedure for correcting serious errors in the assessment reports. Minor errors are routine, and are handled by releasing errata. But this process broke down for bigger issues such as the Himalayan glacier error.

Despite the critics, climate science is about as transparent as a scientific field can be. Anyone can download a climate model and see what’s in there. The IPCC process is founded on four key values (thanks to the advocacy of Susan Solomon): Rigor, Robustness, Transparency, and Comprehensiveness. However, there are clearly practical limits to transparency. For example, it’s not possible to open up lead author meetings, because the scientists need to be able to work together in a constructive atmosphere, rather than “having miscellaneous bloggers in the room”!

The structure of the IPCC remains the same: three working groups (WG1 on the physical science basis, WG2 on impacts and adaptation, and WG3 on mitigation), along with a task force on GHG inventories.

The most important principles for the IPCC are in articles 2 and 3:

2. “The role of the IPCC is to assess on a comprehensive, objective, open and transparent basis the scientific, technical and socio-economic information relevant to understanding the scientific basis of risk of human-induced climate change, its potential impacts and options for adaptation and mitigation. IPCC reports should be neutral with respect to policy, although they may need to deal objectively with scientific, technical and socio-economic factors relevant to the application of particular policies.

3. Review is an essential part of the IPCC process. Since the IPCC is an intergovernmental body, review of IPCC documents should involve both peer review by experts and review by governments.”

A series of meetings have already occurred in preparation for AR5:

  • Mar 2009: An expert meeting on the science of alternative greenhouse gas metrics. They met and produced a report.
  • Sept 2009: An expert meeting on detection and attribution, which produced a report and a good practice guidance paper [which itself is a great introduction to how attribution studies are done].
  • Jan 2010: An expert meeting at NCAR on assessing and combining multi-model projections. The report from this meeting is due in a few weeks, and will also include a good practice guide.
  • Jun 2010: A workshop on sea level rise and ice sheet instability, which was needed because of the widespread recognition that AR4 was weak on this issue, perhaps too cautious.
  • And in a couple of weeks, in July 2010, a workshop on consistent treatment of uncertainties and risks. This is a cross-Working Group meeting, at which they hope to make progress on getting all three working groups to use the same approach. In the AR4, WG1 developed a standardized language for describing uncertainty, but other working groups have not yet.

Thomas then identified some important emerging questions leading up to AR5.

  1. Trends and rates of observed climate change, and in particular, the question of whether climate change has accelerated. Many recent papers and reports indicate that it has; the IPCC needs to figure out how to assess this, especially as there are mixed signals. For example, the decadal trend is accelerating in Arctic sea ice extent, but the global temperature anomaly has not accelerated over this time period.
  2. Stability of the Western and Eastern Antarctic ice sheets (WAIS and EAIS). There has been much more dynamic change at the margins of these ice sheets, with accelerating mass loss, as observed by GRACE. The assessment needs to look into whether these really are accelerating trends, or if it’s just an artefact of the limited duration of measurements.
  3. Irreversibilities and abrupt change: how robust and accurate is our understanding? For example, what long term commitments have been made already in sea level rise? And what about commitments in the hydrological cycle, where some regions (Africa, Europe) might go beyond the range of observed drought within the next couple of decades, and this may be unavoidable.
  4. Clouds and Aerosols, which will have their own entire chapter in AR5. There are still big uncertainties here. For example, low level clouds are a positive feedback in the north-east Pacific, yet all but one model are unable to simulate this.
  5. Carbon and other biogeochemical cycles. New ice core reconstructions were published just after AR4, and give us more insights into regional carbon cycle footprints caused by abrupt climate change in the past. For example, the ice cores show clear changes in soil moisture and total carbon stored  in the Amazon region.
  6. Near-term and long-term projections, for example the question of how reliable the decadal projections are. This is a difficult area. Some people say we already have seamless prediction (from decades to centuries), but Thomas is not yet convinced. For example, there are alarming new results on the number of extreme hot days across southern Europe that need to be assessed – these appear to challenge assumptions about the decadal trends.
  7. Regional issues – e.g. frequency and severity of impacts. Traditionally, the IPCC reports have taken an encyclopedic approach: take each region, and list the impacts in each. Instead, for AR5, the plan is to start with the physical processes, and then say something about sensitivity within each region to these processes.

Here’s an overview of the planned structure of the AR5 WG1 report:

  • Intro
  • 4 chps on observations and paleoclimate
  • 2 chps on process understanding (biogeochemistry and clouds/aerosols)
  • 3 chps from forcing to attributions
  • 2 chps on future climate change and predictability (near term and long term)
  • 2 integration chapters (one on sea level rise, and one on regional issues)

Some changes are evident from AR4. Observations have become more important. They grew to 3 chapters in AR4, and will stay the same in AR5. There will be another crack at paleoclimate, and new chapters on: sea level rise (a serious omission in AR4); clouds and aerosols; the carbon cycle; and regional change. There is also a proposal to produce an atlas which will include a series of maps summarizing the regional issues.

The final draft of the WG1 report is due in May 2013, with a final plenary in Sept 2013. WG2 will finish in March 2014, and WG3 in April 2014. Finally, the IPCC Synthesis Report is to be done no later than 12 months after the WG1 report, i.e. by September 2014. There has been pressure to create a process that incorporates new science throughout 2014 into the synthesis report; however, Thomas has successfully opposed this, on the basis that it will cause far more controversy if the synthesis report is not consistent with the WG reports.

The deadlines for published research to be included in the assessment are as follows. Papers need to be submitted for publication by 31 July 2012, and must be in press by 15 March 2013. The IPCC has to be very strict about this, because there are people out there who have nothing better to do than to wade through all the references in AR4 and check that all of them appeared before the cutoff date.

Of course, these dates are very relevant to the CCSM workshop audience. Thomas urged everyone not to leave this to the last minute; journal editors and reviewers will be swamped if everyone tries to get their papers published just prior to the deadline [although I suspect this is inevitable?].

Finally, there is a significant challenge in communication coming up. For AR5 we’re expecting to see a much broader model diversity than in previous assessments, partly because there are more models (and more variants), and partly because the models now include a broader range of earth system processes. This will almost certainly mean a bigger model spread, and hence a likely increase in uncertainty. It will be a significant challenge to communicate the reasons for this to policymakers and a lay audience. Thomas argues that we must not be ashamed to present how science works – that in some cases the uncertainties multiply, during which the spread of projections grows, and then when we get the models more constrained by observations they converge again. But this also poses problems in how we do model elimination and model weighting in ensemble projections. For example, if a particular model shows no sea ice in the year 2000, it probably should be excluded as this is clearly wrong. But how do we set clear criteria for this?

I thought I wouldn’t blog any more about the CRU emails story, but this one is very close to my heart, so I can’t pass it up. Brian Angliss, over at Scholars and Rogues, has written an excellent piece on the lack of context in the stolen emails, and the reliability of any conclusions that might be based on them. To support his analysis, he quotes extensively from the paper “The Secret Life of Bugs” by Jorge Aranda and Gina Venolia from last year’s ICSE, in which they convincingly demonstrated that electronic records of discussions about software bugs are frequently unreliable, and that there is a big difference between the recorded discussions and what you find when you actually track down the participants and ask them directly.

BTW Jorge will be defending his PhD thesis in a couple of weeks, and it’s full of interesting ideas about how software teams develop a shared understanding of the software they develop, and the implications that this has on team organisation. I’ll be mining it for ideas to explore in my own studies of climate modellers later this year…

After catching the start of yesterday’s Centre for Environment Research Day, I headed around the corner to catch the talk by Ray Pierrehumbert on “Climate Ethics, Climate Justice”. Ray is here all week giving the 2010 Noble lectures, “New Worlds, New Climates”. His theme for the series is the new perspectives we get about Earth’s climate from the discovery of hundreds of new planets orbiting nearby stars, advances in knowledge about solar system planets, and advances in our knowledge of the early evolution of Earth, especially new insights into the snowball earth. I missed the rest of the series, but made it today, and I’m glad I did, because the talk was phenomenal.

Ray began by pointing out that climate ethics might not seem to fit with the theme of the rest of the series, but it does, because future climate change will, in effect, make the earth into a different planet. And the scary thing is we don’t know too much about what that planet will be like. Which then brings us to questions of responsibility, particularly the question of how much we should be spending to avoid this.

Figure 1 from Rockstrom et al, Nature 461, 472-475 (24 Sept 2009). Original caption: The inner green shading represents the proposed safe operating space for nine planetary systems. The red wedges represent an estimate of the current position for each variable. The boundaries in three systems (rate of biodiversity loss, climate change and human interference with the nitrogen cycle), have already been exceeded.

Humans are a form of life, and are altering the climate in a major way. Some people talk about humans now having an impact of “geological proportions” on the planet. But in fact, we’re a force of far greater than geological proportions: we’re releasing around 20 times as much carbon per year as nature does (for example via volcanoes). We may cause a major catastrophe. And we need to consider not just CO2, but many other planetary boundaries – all biogeochemical boundaries.

But this is nothing new – this is what life does – it alters the planet. The mother of all planet altering lifeforms is blue-green algae. It radically changed atmospheric chemistry, even affecting composition of rocks. If the IPCC had been around at the end of the Archean Eon (2500 million years ago) to consider how much photosynthesis should be allowed, it would have been a much bigger question than we face today. The biosphere (eventually!) recovers from such catastrophes. There are plenty of examples: oxygenation by cyanobacteria, snowball earth, permo-triassic mass extinction (90% of species died out) and the KT dinosaur killer asteroid (although the latter wasn’t biogeochemically driven). So the earth does just fine in the long run, and such catastrophes often cause interesting things to happen, e.g. opening up new niches for new species to evolve (e.g. humans!).

But normally these changes take tens of millions of years, and whichever species were at the top of the heap before usually lose out: the new kind of planet favours new kinds of species.

So what is new with the current situation? Most importantly we have foresight and we know about what we’re doing to the planet. This means we have to decide what kind of climate the planet will have, and we can’t avoid that decision, because even deciding to do nothing about it is a decision. We cannot escape the responsibility. For example, we currently have a climate that humans evolved to exist in. The conservative thing is to decide not to rock the boat – to keep the climate we evolved in. On the other hand we could decide a different climate would be preferable, and work towards it – e.g. would things be better (on balance) if the world were a little warmer or a little cooler? So we have to decide how much warming is tolerable. And we must consider irreversible decisions – e.g. preserving irreplaceable treasures (e.g. species that will go extinct). Or we could put the human species at the centre of the issue, and observe that (as far as we know) the human species is unique as the only intelligent life in the universe; the welfare of the human species might be paramount. So the question then becomes: how should we preserve a world worth living in for humanity?

So far, we’re not doing any better than cyanobacteria. We consume resources and reproduce until everything is filled up and used up. Okay, we have a few successes, for example in controlling acid rain and CFCs. But on balance, we don’t do much better than the bacteria.

Consider carbon accounting. You can buy carbon credits, sometimes expressed in terms of tonnes of CO2, sometimes in terms of tonnes of carbon. From a physics point of view, it’s much easier to think in terms of carbon atoms, because it’s the carbon in various forms that matters – e.g. dissolved in the oceans making them more acidic, in CO2 in the atmosphere, etc. We’re digging up this carbon in various forms (coal, oil, gas) and releasing it into the atmosphere. Most of this came from biological sources in the first place, but has been buried over very long (geological) timescales. So, we can do the accounting in terms of billions of tonnes (Gt) of carbon. The pre-industrial atmosphere contained 600Gt carbon. Burning another 600Gt would be enough to double atmospheric concentrations (except that we have to figure out how much stays in the atmosphere, how much is absorbed by the oceans, etc). World cumulative emissions show an exponential growth over the last century. We are currently at 300Gt cumulative emissions from fossil fuel. 1000Gt of cumulative emissions is an interesting threshold, because that’s about enough to warm the planet by 2°C (which is the EU’s stated upper limit). A straight projection of the current exponential trend takes us to 5000GtC by 2100. It’s not clear there is enough coal to get us there, but it is dangerous to assume that we’ll run out of resources before this. The worst scenario: we get to 5000GtC, wreck the climate, just as we run out of fossil fuels, so civilization collapses, at a time when we no longer have a tolerable climate to live in.
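To make the accounting concrete, here is a rough sketch in Python. It does two things: converts between tonnes of CO2 and tonnes of carbon using standard atomic masses, and works out what growth rate is implied if cumulative emissions really do follow a pure exponential from 300GtC now to 5000GtC in 2100. The start year of 2010 and the pure-exponential shape are my assumptions for illustration, not figures from Ray’s slides, and all the function names are made up for this sketch.

```python
import math

# Mass fraction of carbon in CO2, from standard atomic masses (C = 12.011, O = 15.999).
CARBON_FRACTION_OF_CO2 = 12.011 / (12.011 + 2 * 15.999)


def co2_to_carbon(tonnes_co2):
    """Convert a mass of CO2 into the mass of carbon it contains."""
    return tonnes_co2 * CARBON_FRACTION_OF_CO2


def carbon_to_co2(tonnes_carbon):
    """Convert a mass of carbon into the mass of CO2 it would form."""
    return tonnes_carbon / CARBON_FRACTION_OF_CO2


# Figures quoted in the talk (cumulative fossil fuel emissions, in GtC).
CUMULATIVE_NOW_GTC = 300.0
CUMULATIVE_2100_GTC = 5000.0
TWO_DEGREE_THRESHOLD_GTC = 1000.0

# Assumption: "currently" means 2010, the year of the talk.
START_YEAR = 2010
END_YEAR = 2100


def implied_growth_rate(start_total, end_total, years):
    """Continuous growth rate that takes start_total to end_total in the given years."""
    return math.log(end_total / start_total) / years


def years_until(target_total, start_total, rate):
    """Years for an exponentially growing total to reach target_total."""
    return math.log(target_total / start_total) / rate


growth_rate = implied_growth_rate(
    CUMULATIVE_NOW_GTC, CUMULATIVE_2100_GTC, END_YEAR - START_YEAR
)
threshold_year = START_YEAR + years_until(
    TWO_DEGREE_THRESHOLD_GTC, CUMULATIVE_NOW_GTC, growth_rate
)

print(f"1 tonne of carbon is about {carbon_to_co2(1.0):.2f} tonnes of CO2")
print(f"Implied growth in cumulative emissions: {growth_rate:.1%} per year")
print(f"1000GtC threshold crossed around {threshold_year:.0f}")
```

Under these assumptions a tonne of carbon corresponds to roughly 3.7 tonnes of CO2, cumulative emissions grow at about 3% per year, and the 1000GtC threshold is crossed in the late 2040s. Treat those as back-of-envelope numbers only; they depend entirely on the assumed start year and the assumed exponential shape.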

Of course, such exponential growth can never continue indefinitely. To demonstrate the point, Ray showed the video of The Impossible Hamster Club. The key question is whether we will voluntarily stop this growth in carbon emissions, and if we don’t, at what point will natural limits kick in and stop the growth for us?

There are four timescales for CO2 drawdown:

  • Uptake by the ocean mixed layer – a few decades
  • Uptake by the deep ocean – a few centuries
  • Carbonate dissolution (laying down new sediments on the ocean bed) – a few millennia
  • Silicate weathering (reaction between rocks and CO2 in the atmosphere that creates limestone) – a few hundred millennia.

Ray then showed the results of some simulations using the Hamburg carbon cycle model. The scenario they used is a ramp up to peak emissions in 2010, followed by a drop to either 4, 2, or 1Gt per year from then on. The graph of atmospheric concentrations out to the year 4000 shows that holding emissions stable at 2Gt/yr still causes concentrations to ramp up to 1000ppm. Even reducing to 1Gt/yr leads to an increase to around 600ppm by the year 4000. The obvious conclusion is that we have to reduce net emissions to approximately zero in order to keep the climate stable over the next few thousand years.
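A quick bit of arithmetic shows why even the low-emission scenarios keep pushing concentrations up. The sketch below is not the Hamburg carbon cycle model, and it ignores uptake by the oceans and sediments entirely; it just totals the carbon emitted after the 2010 peak in each scenario. I’m assuming the scenario rates are in Gt of carbon per year, and the function name is my own.

```python
# Scenario window described in the talk: emissions peak in 2010, results shown out to 4000.
PEAK_YEAR = 2010
END_YEAR = 4000

# For comparison: carbon in the pre-industrial atmosphere, as quoted in the talk (GtC).
PREINDUSTRIAL_ATMOSPHERE_GTC = 600


def carbon_emitted_after_peak(rate_gt_per_year, start_year=PEAK_YEAR, end_year=END_YEAR):
    """Total carbon emitted at a constant annual rate between two years.

    Assumption: the rate is in GtC per year. No carbon cycle uptake is modelled.
    """
    return rate_gt_per_year * (end_year - start_year)


for rate in (4, 2, 1):
    total = carbon_emitted_after_peak(rate)
    multiple = total / PREINDUSTRIAL_ATMOSPHERE_GTC
    print(
        f"{rate} Gt/yr -> {total} Gt emitted by {END_YEAR}, "
        f"{multiple:.1f}x the pre-industrial atmospheric carbon"
    )
```

Even the 1Gt/yr scenario adds up to nearly 2000Gt over the period, more than three times the carbon in the pre-industrial atmosphere. How much of that stays in the atmosphere is exactly what the carbon cycle model is needed for.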

What does a cumulative emissions total of 5000GtC mean for our future? Peak atmospheric concentrations will reach over 2000ppm, and stay there for around 10,000 years, then slowly reduce on a longer timescale because of silicate weathering. Global mean temperature rises by around 10°C. Most likely, the Greenland and West Antarctic ice sheets will melt completely (it’s not clear what it would take to melt the East Antarctic). So what we do this century will affect us for tens of thousands of years. Paul Crutzen coined the term anthropocene to label this new era in which humans started altering the climate. In the distant future, the change at the start of the anthropocene will look as dramatic as other geological shifts – certainly bigger than the change at the end of the KT extinction.

This makes geoengineering by changing the earth’s albedo an abomination (Ray mentioned as an example the view put forward in that awful book SuperFreakonomics). It’s morally reprehensible, because it leads to the Damocles world. The sword hanging over us is that for the next 10,000 years, we’re committed to doing the sulphur seeding every two years, and continuing to do so no matter what unfortunate consequences, such as drought, happen as side effects.

But we will need longwave geoengineering – some way of removing CO2 from the atmosphere to deal with the last gigatonne or so of emissions, because these will be hard to get rid of no matter how big the push to renewable energy sources. That suggests we do need a big research program on air capture techniques.

So, the core questions for climate ethics are:

  • What is the right amount to spend to reduce emissions?
  • How should costs be divided up (e.g. US, Europe, Asia, etc)?
  • How to figure the costs of inaction?
  • When should it be spent?

There is often a confusion between fairness and expedience (e.g. Cass Sunstein, an Obama advisor, makes this mistake in his work on climate change justice). The argument goes that a carbon tax that falls primarily on the first world is, in effect, a wealth transfer to the developing world. It’s a form of foreign aid, therefore hard to sell politically to Americans, and therefore unfair. But the real issue is not about what’s expedient, the issue is about the right thing to do.

Not all costs can be measured by money, which makes cost-benefit analysis a poor tool for reasoning about climate change. For example, how can we account for loss of life, loss of civil liberties, etc in a cost/benefit analysis? Take for example the effect of capital punishment on crime reduction versus the injustice of executing the innocent. This cannot be a cost/benefit decision, it’s a question of social values. In economic theory the “contingent valuation” of non-market costs and benefits is hopelessly broken. Does it make sense to trade off polar bear extinction against Arctic oil revenue by assigning a monetary value to polar bears? A democratic process must make these value judgements – we cannot push them off to economic analysis in terms of cost-benefits. The problem is that the costs and benefits of planetary scale processes are not additive. Therefore cost/benefit is not a suitable tool for making value decisions.

Similar problems apply to the use of (growth in) GDP, which economists use as a proxy for a nation’s welfare. Bastiat introduced the idea of the broken window fallacy – the idea that damage to people’s property boosts GDP because it increases the need for work to be done to fix it, and hence increases money circulation. This argument is often used by conservatives to pooh-pooh the idea of green jobs – what’s good for jobs doesn’t necessarily make people better off. But right now the entire economy is made out of broken windows: Hummers, McMansions, video screens in fastfood joints,… all of it is consumption that boosts GDP without actually improving life for anyone. (Perhaps we should try measuring gross national happiness instead, like the Bhutanese).

And then there’s discounting – how do we compare the future with the present? The usual technique is to exponentially downweight future harms according to how far in the future they are. The rationale is you could equally well put the money in the bank, and collect interest to pay for future harms (i.e. generate a “richer future”, rather than spend the money now on mitigating the problem). But certain things cannot be replaced by money (e.g. human life, species extinction). Therefore they cannot be discounted. And of course, economists make the 9 billion tonne hamster mistake – they assume the economy can keep on growing forever. [Note: Ray has more to say on cost-benefit and discounting in his slides, which he skipped over in the talk through lack of time]
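To make the discounting argument concrete, here is a minimal sketch of exponential downweighting. The function name, the harm of 1,000,000 (in arbitrary money units) and the discount rates are illustrative assumptions of mine, not figures from Ray's talk.

```python
def present_value(future_harm: float, annual_rate: float, years: float) -> float:
    """Discount a harm that occurs `years` from now back to today's value.

    This is the standard compound-interest form: the amount you would need
    to bank today, at `annual_rate`, to cover the harm when it arrives.
    """
    return future_harm / (1.0 + annual_rate) ** years


if __name__ == "__main__":
    harm = 1_000_000.0  # hypothetical future harm, arbitrary money units
    for rate in (0.01, 0.03, 0.05):  # hypothetical discount rates
        for years in (50, 100, 200):
            pv = present_value(harm, rate, years)
            print(f"rate={rate:.0%} years={years:>3} present value={pv:,.0f}")
```

Even at modest rates, harms a century or more away shrink to a small fraction of their face value, which is why the choice of discount rate dominates the result, and why harms that cannot be replaced by money do not fit the scheme at all.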

Fairness is a major issue. How do we approach this? For example, retributive justice – punish the bad guys? You broke it, you fix it? Whoever suffers the least from fixing it moves first? Consider everyone to be equal? Well, the Canadian climate policy appears to be: wait to see what Obama does, and do the same, unless we can get away with doing less.

What about China vs. the US, the two biggest emitters of greenhouse gases? The graph of annual CO2 emissions shows that China overtook the US in the last few years (while, for example, France held its emissions constant). But you have to normalize the emissions per capita, and then the picture looks very different. And here’s an interesting observation: China has per capita emissions very close to those of France, but doesn’t have the French standard of living. Therefore there is clearly room for China to improve its standard of living without increasing per capita emissions, which means that emissions controls do not necessarily hold back development.

But because it’s cumulative emissions that really matter, we have to look at each nation’s cumulative per capita emissions. The calculation is tricky because we have to account for population growth. It turns out that the US has a bigger population growth problem than China, which, when added to the cumulative emissions, means the US has a much bigger responsibility to act. If we take the target of 1000GtC as the upper limit on cumulative emissions (to stay within the 2°C temperature rise), and allocate that equally to everyone, based on 2006 population figures, we get about 100 tonnes of carbon per capita as a lifetime allowance. The US has an overdraft on this limit (because the US has used up more than this), while China still has a carbon balance (it’s used up less). In other words, in terms of the thing that matters most, cumulative emissions, the US has used up more than its fair share of a valuable resource (slide 43 from Ray’s talk):

This graph shows the cumulative emissions per (2006) capita for the US and China. If we take 100 tonnes as the lifetime limit for each person (to keep within the global 1000Gt target), then the US has already used more than its fair share, and China has used much less.
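The overdraft idea is simple bookkeeping, and a rough sketch may help. The 100 tonne lifetime allowance is the figure quoted above; the function names and the two cumulative figures passed in are placeholders of mine for illustration, not values read off Ray's graph.

```python
LIFETIME_ALLOWANCE_TC = 100.0  # tonnes of carbon per capita, as quoted in the text


def carbon_balance(cumulative_per_capita_tc: float,
                   allowance_tc: float = LIFETIME_ALLOWANCE_TC) -> float:
    """Remaining per capita allowance; a negative value is an overdraft."""
    return allowance_tc - cumulative_per_capita_tc


def describe(name: str, cumulative_per_capita_tc: float) -> str:
    """Summarize whether a nation is in credit or in overdraft."""
    balance = carbon_balance(cumulative_per_capita_tc)
    if balance < 0:
        return f"{name}: overdraft of {-balance:.0f} tC per capita"
    return f"{name}: {balance:.0f} tC per capita remaining"


if __name__ == "__main__":
    # Placeholder inputs only: one nation over the allowance, one under it.
    print(describe("Nation A", 150.0))
    print(describe("Nation B", 20.0))
```

The point of writing it this way is that the moral argument depends only on the sign of the balance: a nation in overdraft owes something back, whatever the exact numbers turn out to be.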

This analysis makes it clear what the climate justice position is. The Chinese might argue that just to protect themselves and their climate, China might need to do something more than its fair share. In terms of a negotiation, arguing about everyone taking action together might be expedient. But the right thing to do for the US is not just to reduce emissions to zero immediately, but to pay back that overdraft.

Some interesting questions from the audience:

Q: On geoengineering – why rule out attempts to change the albedo of the earth by sulphate particle seeding when we might need an “all of the above” approach? A: Ray’s argument is largely about what happens if it fails. For example, if the Dutch dykes fail, in the worst case, the Dutch could move elsewhere. If global geoengineering fails, we don’t have an elsewhere to move to. Also, if you stop, you get hit all at once with the accumulated temperature rise. This makes Levitt’s suggestion of “burn it all and geoengineer to balance” morally reprehensible.

Q: Could you say more about the potential for air capture? A: It’s a very intriguing idea. All the schemes being trialed right now capture carbon in the form of CO2 gas, which would then need to be put down into mineral form somehow. A more interesting approach is to capture CO2 directly in mineral form, e.g. limestone. It’s not obviously crazy, and if it works it would help. It’s more like insurance, and investing in research in this is likely to provide a backup plan in a way that albedo alteration does not.

Q: What about other ways of altering the albedo? A: Suggestions such as painting roofs & parking lots white will help reduce urban heat, mitigate the effect of heatwaves, and also reduce the use of air conditioners. Which is good, but it’s essentially a regional effect. The overall effect on the global scale is probably negligible. So it’s a good idea because it only has a regional impact.

Q: About nuclear – will we need it? A: Ray says probably yes. If it comes down to a choice between nuclear vs. coal, the choice has to be nuclear.

Finally, I should mention Ray has a new book coming out: Principles of Planetary Climate, and is a regular contributor to

Note: This started as a comment on a thread at RealClimate about the Guardian’s investigation of the CRU emails fiasco. The Guardian has, until recently, had an outstandingly good record on its climate change reporting. It commissioned Fred Pearce to do a detailed investigation into the emails, and he published his results in a 12-part series. While some parts of it are excellent, other parts demonstrate a complete misunderstanding of how science works, especially the sections dealing with the peer-review process. These were just hopelessly wrong, as demonstrated by Ben Santer’s rebuttal of the specific allegations. In parallel, George Monbiot, who I normally respect as one of the few journalists who really understands the science, has been arguing for Phil Jones to resign as head of the CRU at East Anglia, on the basis that his handling of the FOI requests was unprofessional. Monbiot has repeated this more recently, as can be seen in this BBC clip, where he is hopelessly ineffective in combating Delingpole’s nonsense, because he’s unwilling to defend the CRU scientists adequately.

The problem with both Pearce’s investigation and Monbiot’s criticisms of Prof Jones is that neither has any idea of what academic research looks like from the inside, nor how scientists normally talk to one another. The following is my attempt to explain this context, and in particular why scientists talking freely among themselves might seem rude or worse. Enough people liked my comment at RC that I decided to edit it a little and post it here (the original has already been reposted at ClimateSight and Prof Mandia’s blog). I should add one disclaimer: I don’t mean to suggest here that scientists are not nice people – the climate scientists I’ve gotten to know over the past few years are some of the nicest people you could ever ask to meet. It’s just that scientists are extremely passionate about the integrity of their work, and don’t take kindly to people pissing them around. Okay, now read on…

Once we’ve gotten past the quote-mining and distortion, the worst that can be said about the CRU emails is that the scientists sometimes come across as rude or dismissive, and say things in the emails that really aren’t very nice. However, the personal email messages between senior academics in any field are frequently not very nice. We tend to be very blunt about what appears to us as ignorance, and intolerant of anything that wastes our time, or distracts us from our work. And when we think (rightly or wrongly) that the peer review process has let another crap paper through, we certainly don’t hold back in expressing our opinions to one another. Which is of course completely different to how we behave when we meet one another. Most scientists distinguish clearly between the intellectual cut and thrust (in which we’re sometimes very rude about one another’s ideas) and our social interactions (in which we all get together over a beer and bitch about the downsides of academic life). Occasionally, there’s someone who is unable to separate the two, and takes the intellectual jabs personally, but such people are rare enough in most scientific fields that the rest of us know exactly who they are, and try to avoid them at conferences.

Part of this is due to the nature of academic research. Most career academics have large egos and very thick skins. I think the tenure process and the peer review process filter out those who don’t. We’re all jostling to get our work published and recognised, often by pointing out how flawed everyone else’s work is. But we also care deeply about intellectual rigor, and preserving the integrity of the published body of knowledge. And we also know that many key career milestones are dependent on being respected (and preferably liked) by others in the field: for example, the more senior people who might get asked to write recommendation letters for us, for tenure and promotion and honors, or the scientists with competing theories who will get asked to peer review our papers.

Which means in public (e.g. in conference talks and published papers) our criticisms of others are usually carefully coded to appear polite and respectful. A published paper might talk about making “an improvement on the methodology of Bloggs et al”. Meanwhile, in private, when talking to our colleagues, we’re more likely to say that Bloggs’ work is complete rubbish, and should never have been published in the first place, and anyway everyone knows Bloggs didn’t do any of the work himself, and the only decent bits are due to his poor, underpaid postdoc, who never gets any credit for her efforts. (Yes, academics like to gossip about one another just as regular people do). This kind of blunt rudeness is common in private emails, especially when we’re discussing other scientists behind their backs with likeminded colleagues. Don’t be fooled by the more measured politeness in public: when we think an idea is wrong, we’ll tear it to shreds.

Now, in climate science, all our conventions are being broken. Private email exchanges are being made public. People who have no scientific training and/or no prior exposure to the scientific culture are attempting to engage in a discourse with scientists, and neither side understands the other. People are misquoting scientists, and trying to trip them up with loaded questions. And, occasionally, resorting to death threats. Outside of the scientific community, most people just don’t understand how science works, and so don’t know how to make sense of what’s going on.

And scientists don’t really know how to engage with these strange outsiders. Scientists normally only interact with other scientists. We live rather sheltered lives; they don’t call it the ivory tower for nothing. When scientists are attacked for political reasons, we mistake it for an intellectual discussion over brandy in the senior common room. Scientists have no training for political battles, and so our responses often look rude or dismissive to outsiders. Which in turn gets interpreted as unprofessional behaviour by those who don’t understand how scientists talk. And unlike commercial organisations and politicians, universities don’t engage professional PR firms to make us look good, and we academics would be horrified if they did: horrified at the expense, and horrified by the idea that our research might need to be communicated on anything other than its scientific merits.

Journalists like Monbiot, despite all his brilliant work in keeping up with the science and trying to explain it to the masses, just haven’t ever experienced academic culture from the inside. Hence his call, which he keeps repeating, for Phil Jones to resign, on the basis that Phil reacted unprofessionally to FOI requests. But if you keep provoking a scientist with nonsense, you’ll get a hostile response. Any fool knows you don’t get data from a scientist by using FOI requests, you do it by stroking their ego a little, or by engaging them with a compelling research idea that you need the data to pursue. And in the rare cases where this doesn’t work, you do some extra work yourself to reconstruct the data you need using other sources, or you test your hypothesis using a different approach (because it’s the research result we care about, not any particular dataset). So to a scientist, anyone stupid enough to try to get scientific data through repeated FOI requests quite clearly deserves our utter contempt. Jones was merely expressing (in private) a sentiment that most scientists would share – an extreme frustration with people who clearly don’t get it.

The same misunderstandings occur when outsiders look at how we talk about the peer-review process. Outsiders tend to think that all published papers are somehow equal in merit, and that peer-review is a magical process that only lets the truth through (hint: we refer to it more often as a crap-shoot). Scientists know that while some papers are accepted because they are brilliant, others are accepted because it’s hard to tell whether they are any good, and publication might provoke other scientists to do the necessary followup work. We know some published papers are worth reading, and some should be ignored. So, we’re natural skeptics – we tend to think that most new published results are likely to be wrong, and we tend to accept them only once they’ve been repeatedly tested and refined.

We’re used to having our own papers rejected from time to time, and we learn how to deal with it – quite clearly the reviewers were stupid, and we’ll show them by getting it published elsewhere (remember, big ego, thick skin). We’re also used to seeing the occasional crap paper get accepted (even into our most prized journals), and again we understand that the reviewers were stupid, and the journal editors incompetent, and we waste no time in expressing that. And if there’s a particularly egregious example, everyone in the community will know about it, everyone will agree it’s bad, and some of us will start complaining loudly about the idiot editor who let it through. Yet at the same time, we’re all reviewers, and some of us are editors, so it’s understood that the people we’re calling stupid and incompetent are our colleagues. And a big part of calling them stupid or incompetent is to get them to be more rigorous next time round, and it works because no honest scientist wants to be seen as lacking rigor. What looks to the outsider like a bunch of scientists trying to subvert some gold standard of scientific truth is really just scientists trying to goad one another into doing a better job in what we all know is a messy, noisy process.

The bottom line is that scientists will always tend to be rude to ignorant and lazy people, because we expect to see in one another a driving desire to master complex ideas and to work damn hard at it. Unfortunately the outside world (and many journalists) interpret that rudeness as unprofessional conduct. And because they don’t see it every day (like we do!) they’re horrified.

Some people have suggested that scientists need to wise up, and learn how to present themselves better on the public stage. Indeed, the Guardian published an editorial calling for the emergence of new leaders from the scientific community who can explain the science. This is naive and irresponsible. It completely ignores the nature of the current wave of attacks on scientists, and what motivates those attacks. No scientist can be an effective communicator in a world where people with vested interests will do everything they can to destroy his or her reputation. The scientific community doesn’t have the resources to defend itself in this situation, and quite frankly it shouldn’t have to. What we really need is for newspaper editors, politicians, and business leaders to start acting responsibly, make the effort to understand what the science is saying, make the effort to understand what is really driving these swiftboat-style attacks on scientists, and then shift the discourse from endless dissection of scientists’ emails onto useful, substantive discussions of the policy choices we’re faced with.

[Update: Joe Romm has reposted this at ClimateProgress, and it’s generated some very interesting discussion, including a response from George Monbiot that’s worth reading]

[Update 2: 31/3/2010 The UK Parliament released its findings last night, and completely exonerates Prof. Jones and the CRU. It does, however, suggest that the UEA should bear responsibility for any mistakes that were made over how the FoI requests were handled, and it makes a very strong call for more openness with data and software from the climate science community]

[Update 3: 7/4/2010 A followup post in which I engage George Monbiot in a lengthy debate (and correct some possible misimpressions from the above post)]

[Update 4: 27/4/2010 This post was picked up by Physics Today]

Nature News runs some very readable articles on climate science, but is unfortunately behind a paywall. Which is a shame because they really should be widely read. Here’s a couple of recent beauties:

The Real Holes in Climate Science, (published 21 Jan 2010) points out that climate change denialists keep repeating long debunked myths about things they believe undermine the science. Meanwhile, in the serious scientific literature, there are some important open questions over real uncertainties in the science (h/t to AH). These are discussed openly in the IPCC reports (see for example, the 59 robust findings and 55 uncertainties listed in section 6 of the Technical Summary for WG1). None of these uncertainties pose a serious challenge to our basic understanding of climate change, but they do prevent absolute certainty about any particular projection. Not only that, many of these uncertainties suggest a strong application of the precautionary principle, because many of them suggest the potential for the IPCC to be underestimating the seriousness of climate change. The Nature News article identifies the following as particularly relevant:

  • Regional predictions. While the global models do a good job of simulating global trends in temperature, they often do poorly on fine-grained regional projections. Geographic features, such as mountain ridges, which mark the boundary of different climatic zones, occur at scales much smaller than the typical grids in GCMs, which means the GCMs get these zonal boundaries wrong, especially when coarse-grain predictions are downscaled.
  • Precipitation. As the IPCC report made clear, many of the models disagree even on the sign of the change in rainfall over much of the globe, especially for winter projections. The differences are due to uncertainties over convection processes. Worryingly, studies of recent trends (published after the IPCC report was compiled) indicate the models are underestimating precipitation changes, such as the drying of the subtropics.
  • Aerosols. Estimates of the effect on climate from airborne particles (mainly from industrial pollution) vary by an order of magnitude. Some aerosols (e.g. sulphates) induce a cooling effect by reflecting sunlight, while others (e.g. black carbon) produce a warming effect by absorbing sunlight. The extent to which these aerosols are masking the warming we’re already ‘owed’ from increased greenhouse gases is hard to determine.
  • Temperature reconstructions prior to the 20th century. The Nature News article discusses at length the issues in the tree ring data used as one of the proxies for reconstructing past temperature records, prior to the instrumental data from the last 150 years. The question of what causes the tree ring data to diverge from instrumental records in recent decades is obviously an interesting question, but to me it seems to be of marginal importance to climate science.

The Climate Machine, (published 24 Feb 2010) describes the Hadley Centre’s HadGEM-2 as an example of the current generation of earth system models, and discusses the challenges of capturing more and more earth systems into the models (h/t to JH). The article quotes many of the modelers I’ve been interviewing about their software development processes. Of particular interest is the discussion about the growing complexity of these models, once other earth systems processes are added: clouds, trees, tundra, land ice, and … pandas (the inclusion of pandas in the models is an in-joke in the modeling community). There is likely to be a limit to the growth of this complexity, simply because the task of managing the contributions of a growing (and diversifying) group of experts gets harder and harder. The article also points out that one interesting result is likely to be an increase in some uncertainty ranges from these models in the next IPCC report, due to the additional variability introduced from these additional earth system processes.

I would post copies of the full articles, but I’m bound to get takedown emails from Macmillan publishing. But I guess they’re unlikely to object if I respond to emails requesting copies from me for research and education purposes…

Stephen Schneider’s book, Science as a Contact Sport, makes fascinating reading, as he really gets his teeth into the disinformation campaign against climate science. However, the book was written before the denialist industry really cranked things up in the last few months, and now he’s angrier than ever, as is clear in this report yesterday about threats of violence against climate scientists (h/t to LG). By coincidence, I spoke to Schneider by phone yesterday – we were interviewing him as part of our analysis of the use of models such as C-ROADS in tools for online discussion, such as the collaboratorium. He’s very interested in such tools, partly because they have the potential to create a new generation of much more well-informed people (he noted that many of the people participating in the discussions in the collaboratorium are students), and partly because we need to find a much better way to get the science into the hands of the policymakers.

One of the things he said stuck out, in particular because it answers the question posed by Andrew Weaver at the end of the article above. Weaver says “good scientists are saying to themselves, ‘Why would I want to participate in the IPCC?’”. Steve Schneider told me he has a simple response to this – scientists have to keep doing the assessments and writing the reports, because you never know when they will be needed. When we get another climate shock (like Katrina, or the heatwaves in Europe in 2003), the media will suddenly look for the latest assessment report, and we have to have them ready. At that moment, all the effort is worthwhile. He pointed out this happened for the crisis over the ozone hole; when the media finally took notice, the scientific assessments were ready to hand, and it mattered. That’s why it’s important to keep at it.

I’m giving a talk today to a group of high school students. Most of the talk focusses on climate models, and the kinds of experiments you can do with them. But I thought I’d start with a little bit of history, to demonstrate some key points in the development of our understanding of climate change. Here are some of the slides I put together (drawing heavily on Spencer Weart’s The Discovery of Global Warming for inspiration). Comments on these slides are welcome.

I plan to start with this image:

Spaceship Earth

…and ask some general questions like:

  • What do you think of when you see this image?
  • Where did all that energy come from?
  • Where does all that energy go? (remember, energy cannot be created or destroyed, only transformed…)
  • What happens when you add up the energy needs of 6 billion people?
  • and, introducing the spaceship earth metaphor: Who’s driving this spaceship, and are the life support systems working properly?…

For millions of years, the planet had a natural control system that kept the climate relatively stable. We appear to have broken it. Now we’ve got to figure out how to control it ourselves, before we do irreversible damage. We’re not about to crash this spaceship, but we could damage its life support systems if we don’t figure out how to control it properly.

I then show some graphs showing temperature changes through pre-history, together with graphs of the recent temperature rise, as a prelude to a little history. Here are my history slides:

John Tyndall, Svante Arrhenius, Vilhelm Bjerknes, Roger Revelle, Charles Keeling, Jule Charney

In the last year, there were three major attempts to assess the current state of the science of climate change, as an update to the 2007 IPCC reports (which are already looking a little dated). They have very similar names, so I thought it might be useful to disambiguate them:

  • The Copenhagen Synthesis Report was put together at the University of Copenhagen to summarize a conference on “Climate change: Global Risks, Challenges and Decisions” that was held in Copenhagen in March 2009. The report has some great summaries of the research presented at the conference, and puts it all together to identify six key messages:
    1. Observations show that many key climate indicators are changing near the upper boundary of the IPCC range of projections;
    2. We have a lot more evidence now on how vulnerable societies and ecosystems are to temperature rises;
    3. Rapid mitigation strategies are needed because we now know that weaker targets for 2020 will make it much more likely we will cross tipping points and make it much harder to meet long term targets;
    4. There are serious equity issues because the impacts of climate change will be felt by those least able to afford to protect themselves;
    5. Action on climate change will have many useful benefits, including improvements in health, revitalization of ecosystems, and job growth in the sustainable energy sector;
    6. Many societal barriers need to be overcome, including existing social and economic policies that subsidize fossil fuel production and consumption, weak institutions and lack of political leadership.
  • The Copenhagen Prognosis was released in December 2009, put together as a joint publication of the Stockholm Environment Institute and the Potsdam Institute for Climate Impact Research. It focuses on the evidence behind the key issues for an international climate treaty, especially the target of limiting warming to 2°C, and the political actions necessary to do this. The key messages of the report are:
    1. The 2ºC limit is a scientifically meaningful one, because of the evidence about the damage caused by rises above this level;
    2. Even rises below 2°C will have devastating impacts on vulnerable communities and ecosystems (and for this reason, 80 nations have endorsed the idea of setting a global target to be “as far below 1.5ºC as possible”);
    3. Analysis of potential tipping points shows that currently discussed political targets will be unable to protect the world from devastating climate impacts and self-amplifying warming;
    4. Global greenhouse gas emissions must decline very rapidly after 2015, and reach net zero emissions by mid-century, if we want a good (75%) chance of staying below 2ºC of warming;
    5. The challenge is great, but not impossible – such a reduction in greenhouse gases appears to be technically feasible, economically affordable, and possibly even profitable (but only if we start quickly);
    6. The challenge will be especially hard for developing countries, who will need serious assistance from developed countries to make the necessary transitions;
    7. This will require unprecedented levels of North-South cooperation;
    8. Equitable allocation of carbon dioxide budgets suggest that industrialized nations must reach zero net emissions (or even negative emissions) in the 2020-2030 timeframe;
    9. Securing a safe climate for generations to come is now in the hands of just one generation, which means we need a new ethical paradigm for addressing this;
    10. The challenge isn’t only about reducing emissions – it will require a shift to sustainable management of land, water and biodiversity throughout the world’s ecosystems;
    11. To achieve the transformation, we’ll need all of: new policy instruments, new institutions for policy development and enforcement, a global climate fund, feed-in tariff systems, market incentives, technological innovations,
  • The Copenhagen Diagnosis was also released in December 2009. It was put together by 26 leading climate scientists, coordinated by the University of New South Wales, and intended as an update to the IPCC Working Group I report on the physical science basis. The report concentrates on how knowledge of the physical science has changed since the IPCC assessment report, pointing out:
    1. Greenhouse gas emissions have surged, with emissions in 2008 40% higher than in 1990;
    2. Temperatures have increased at a rate of 0.19°C per decade over the past 25 years, in line with model forecasts;
    3. Satellite and ice measurements show the Greenland and Antarctic ice sheets are losing mass at an increasing rate, and mountain glacier melting is accelerating;
    4. Arctic sea ice has declined much more rapidly than the models predicted: in 2007-2009 the area of Arctic sea ice was 40% lower than the IPCC projections.
    5. Satellite measurements show sea level rise to be 3.4mm/year over the last 15 years, which is about 80% above IPCC projections. This rise matches the observed loss of ice.
    6. Revised projections now suggest sea level rise will be double what the IPCC 2007 assessment reported by 2100, putting it at least 1 meter for unmitigated emissions, with an upper estimate of 2 meters; furthermore, sea levels will continue to rise for centuries, even after global temperatures have stabilized.
    7. Irreversible damage is likely to occur to continental ice sheets, the Amazon rainforest, the West African Monsoon, etc, due to reaching tipping points; many of these tipping points will be crossed before we realize it.
    8. If global warming is to be limited to 2ºC above pre-industrial levels, global emissions need to peak between 2015 and 2020, and then decline rapidly, eventually reaching a decarbonized society with net zero emissions.

Here’s a letter I’ve sent to the Guardian newspaper. I wonder if they’ll print it? [Update – I’ve marked a few corrections since sending it. Darn]

Professor Darrel Ince, writing in the Guardian on February 5th, reflects on lessons from the emails and documents stolen from the Climatic Research Unit at the University of East Anglia. Prof Ince uses an example from the stolen emails to argue that there are serious concerns about software quality and openness in climate science, and goes on to suggest that this alleged lack of openness is unscientific. Unfortunately, Prof Ince makes a serious error of science himself – he bases his entire argument on a single data point, without asking whether the example is in any way representative.

The emails and files from the CRU that were released to the public are quite clearly a carefully chosen selection, where the selection criterion appears to be those that might cause maximum embarrassment to the climate scientists. I’m quite sure that I could find equally embarrassing examples of poor software on the computers of Prof Ince and his colleagues. The Guardian has been conducting a careful study of claims that have been made about these emails, and has shown that the allegations that have been made about defects in the climate science are unfounded. However, these investigations haven’t covered the issues that Prof Ince raises, so it is worth examining them in more detail.

The Harry README file does appear to be a long struggle by a junior scientist to get some poor quality software to work. Does this indicate that there is a systemic problem of software quality in climate science? To answer that question, we would need more data. Let me offer one more data point, representing the other end of the spectrum. Two years ago I carried out a careful study of the software development methods used for the main climate simulation models developed at the UK Met Office. I was expecting to see many of the problems Prof Ince describes, because such problems are common across the entire software industry. However, I was extremely impressed with the care and rigor with which the climate models are constructed, and the extensive testing they are subjected to. In many ways, this process achieves a higher quality code than the vast majority of commercial software that I have studied, which includes the spacecraft flight control code developed by NASA’s contractors. [My results were published here:].

The climate models are developed over many years, by a large team of scientists, through a process of scientific experimentation. The scientists understand that their models are approximations of complex physical processes in the Earth’s atmosphere and oceans. They build their models through a process of iterative refinement. They run the models, and compare them with observational data, to look for the places where the models perform poorly. They then create hypotheses for how to improve the model, and then run experiments: using the previous version of the model as a control, and the new version as the experimental case, they compare both runs with the observational data to determine whether the hypothesis was correct. By a continual process of making small changes, and experimenting with the results, they end up testing their models far more effectively than most commercial software developers. And through careful use of tools to keep track of this process, they can reproduce past experiments on old versions of the model whenever necessary. The main climate models are also subjected to extensive model intercomparison tests, as part of the IPCC assessment process. Models from different labs are run on the same scenarios, and the results compared in detail, to explore the strengths and weaknesses of each model.

Like many software industries, different types of climate software are verified to different extents, representing choices of where to apply limited resources. The main climate models are tested extensively, as I described above. But often scientists need to develop other programs for occasional data analysis tasks. Sometimes, they do this rather haphazardly (which appears to be the case with the Harry file). Many of these tasks are tentative in nature, and correspond to the way software engineers regularly throw a piece of code together to try out an idea. What matters is that, if the idea matures, and leads to results that are published or shared with other scientists, the results are checked out carefully by other scientists. Getting hold of the code and re-running it is usually a poor way of doing this (I’ve found over the years that replicating someone else’s experiment is fraught with difficulties, and not exclusively because of problems with code quality). A much better approach is for other scientists to write their own code, and check independently whether the results are confirmed. This avoids the problem of everyone relying on one particular piece of software, as we can never be sure any software is entirely error-free.

The claim that many climate scientists have refused to publish their computer programs is also specious. I compiled a list last summer of how to access the code for the 23 main models used in the IPCC report. Although only a handful are fully open source, most are available free under fairly light licensing arrangements. For our own research we have asked for and obtained the full code, version histories, and bug databases from several centres, with no difficulties (other than the need for a little patience as the appropriate licensing agreements were sorted out). Climate and weather forecasting code has a number of potential commercial applications, so the modeling centres use a license agreement that permits academic research, but prohibits commercial use. This is no different from what would be expected when we obtain code from any commercial organization.

Professor Ince mentions Hatton’s work, which is indeed an impressive study, and one of the few that has been carried out on scientific code. And it is quite correct that there is a lot of shoddy scientific software out there. We’ve applied some of Hatton’s research methods to climate model software, and have found that, by standard software quality metrics, the climate models are consistently good quality code. Unfortunately, it is not clear that standard software engineering quality metrics apply well to this code. Climate models aren’t built to satisfy a specification, but to address a scientific problem where the answer is not known in advance, and where only approximate solutions are possible. Many standard software testing techniques don’t work in this domain, and it is a shame that the software engineering research community has almost completely ignored this problem – we desperately need more research into this.

Prof Ince also echoes a belief that seems to be common across the academic software community that releasing the code will solve the quality problems seen in the specific case of the Harry file. This is a rather dubious claim. There is no evidence that, in general, open source software is any less buggy than closed source software. Dr Xu at the University of Notre Dame studied thousands of open source software projects, and found that the majority had nobody other than the original developer using them, while a very small number of projects had attracted a big community of developers. This pattern would be true of scientific software: the problem isn’t lack of openness, it’s lack of time – most of the code thrown together to test out an idea by a particular scientist is only of interest to that one scientist. If a result is published and other scientists think it’s interesting and novel, they attempt to replicate the result themselves. Sometimes they ask for the original code (and in my experience, are nearly always given it). But in general, they write their own versions, because what matters isn’t independent verification of the code, but independent verification of the scientific results.

I am encouraged that my colleagues in the software engineering research community are starting to take an interest in studying the methods by which climate science software is developed. I fully agree that this is an important topic, and have been urging my colleagues to address it for a number of years. I do hope that they take the time to study the problem more carefully though, before drawing conclusions about overall software quality of climate code.

Prof Steve Easterbrook, University of Toronto

Update: The Guardian never published my letter, but I did find a few other rebuttals to Ince’s article in various blogs. Davec’s is my favourite!

A reader writes to me from New Zealand, arguing that climate science isn’t a science at all because there is no possibility to conduct experiments. This misconception appears to be common, even among some distinguished scientists, who presumably have never taken the time to read many published papers in climatology. The misconception arises because people assume that climate science is all about predicting future climate change, and because such predictions are for decades/centuries into the future, and we only have one planet to work with, we can’t check to see if these predictions are correct until it’s too late to be useful.

In fact, predictions of future climate are really only a by-product of climate science. The science itself concentrates on improving our understanding of the processes that shape climate, by analyzing observations of past and present climate, and testing how well we understand them. For example, detection/attribution studies focus on the detection of changes in climate that are outside the bounds of natural variability (using statistical techniques), and determining how much of the change can be attributed to each of a number of possible forcings (e.g. changes in: greenhouse gases, land use, aerosols, solar variation, etc). Like any science, the attribution is done by creating hypotheses about possible effects of each forcing, and then testing those hypotheses. Such hypotheses can be tested by looking for contradictory evidence (e.g. other episodes in the past where the forcing was present or absent, to test how well the hypothesis explains these too). They can also be tested by encoding each hypothesis in a climate model, and checking how well it simulates the observed data.

I’m not a climate modeler, but I have conducted anthropological studies of how climate modelers work. Climate models are developed slowly and carefully over many years, as scientific instruments. One of the most striking aspects of climate model development is that it is an experimental science in the strongest sense. What do I mean?

Well, a climate model is a detailed theory of some subset of the earth’s physical processes. Like all theories, it is a simplification that focusses on those processes that are salient to a particular set of scientific questions, and approximates or ignores those processes that are less salient. Climate modelers use their models as experimental instruments. They compare the model run with the observational record for some relevant historical period. They then come up with a hypothesis to explain any divergences between the run and the observational record, and make a small improvement to the model that the hypothesis predicts will reduce the divergence. They then run an experiment in which the old version of the model acts as a control, and the new version is the experimental case. By comparing the two runs with the observational record, they determine whether the predicted improvement was achieved (and whether the change messed anything else up in the process). After a series of such experiments, the modelers will eventually either accept the change to the model as an improvement to be permanently incorporated into the model code, or they discard it because the experiments failed (i.e. they failed to give the expected improvement). By doing this day after day, year after year, the models get steadily more sophisticated, and steadily better at simulating real climatic processes.
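To make the control-versus-experiment idea concrete, here is a minimal sketch in Python. It is only an illustration of the comparison described above, not how any modeling centre actually scores its runs: the function names, the choice of root-mean-square error as the measure of divergence, and all of the numbers are assumptions made up for the example.

```python
import math

def rmse(simulated, observed):
    """Root-mean-square difference between a model run and the observations."""
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))

def evaluate_change(control_run, experiment_run, observations):
    """Score the old model (control) and the changed model (experiment)
    against the same observational record."""
    control_error = rmse(control_run, observations)
    experiment_error = rmse(experiment_run, observations)
    return {
        "control_error": control_error,
        "experiment_error": experiment_error,
        "improved": experiment_error < control_error,
    }

# Made-up values purely for illustration (think temperature anomalies in ºC).
observations = [0.10, 0.15, 0.22, 0.30, 0.41]
control_run = [0.05, 0.08, 0.12, 0.20, 0.28]     # old version of the model
experiment_run = [0.09, 0.13, 0.20, 0.31, 0.39]  # version with the small change

result = evaluate_change(control_run, experiment_run, observations)
print(result)
```

In practice a single number like this would never be the whole story: the modelers also check whether the change made anything else worse, which would mean repeating the comparison across many variables and regions.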

This experimental approach has another interesting effect: the software appears to be tested much more thoroughly than most commercial software. Whether this actually delivers higher quality code is an interesting question; however, it is clear that the approach is much more thorough than most industry practices for software regression testing.

In a blog post that was picked up by the Huffington Post, Bill Gates writes about why we need innovation, not insulation. He sets up the piece as a choice of emphasis between two emissions targets: 30% reduction by 2025, and 80% reduction by 2050. He argues that the latter target is much more important, and hence we should focus on big R&D efforts to innovate our way to zero-carbon energy sources for transportation and power generation. In doing so, he pours scorn on energy conservation efforts, arguing, in effect, that they are a waste of time. Which means Bill Gates didn’t do his homework.

What matters is not some arbitrary target for any given year. What matters is the path we choose to get there. This is a prime example of the communications failure over climate change. Non-scientists don’t bother to learn the basic principles of climate science, and scientists completely fail to get the most important ideas across in a way that helps people make good judgements about strategy.

The key problem in climate change is not the actual emissions in any given year. It’s the cumulative emissions over time. The carbon we emit by burning fossil fuels doesn’t magically disappear. About half is absorbed by the oceans (making them more acidic). The rest cycles back and forth between the atmosphere and the biosphere, for centuries. And there is also tremendous lag in the system. The ocean warms up very slowly, so it takes decades for the Earth to reach a new equilibrium temperature once concentrations in the atmosphere stabilize. This means even if we could immediately stop adding CO2 to the atmosphere today, the earth would keep warming for decades, and wouldn’t cool off again for centuries. It’s going to be tough adapting to the warming we’re already committed to. For every additional year that we fail to get emissions under control we compound the problem.

What does this mean for targets? It means that it matters much more how soon we get started on reducing emissions than what the eventual destination is at any particular future year. Because any reduction in annual emissions achieved in the next few years means that we save that amount of emissions every year going forward. The longer we take to get the emissions under control, the harder we make the problem.

A picture might help:

Three different emissions pathways to give 67% chance of limiting global warming to 2ºC (From the Copenhagen Diagnosis, Figure 22)

The graph shows three different scenarios, each with the same cumulative emissions (i.e. the area under each curve is the same). If we get emissions to peak next year (the green line), it’s a lot easier to keep cumulative emissions under control. If we delay, and allow emissions to continue to rise until 2020, then we can forget about 80% reductions by 2050. We’ll have set ourselves the much tougher task of 100% emissions reductions by 2040!
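The arithmetic behind this is easy to play with. The sketch below is a toy version of the argument, not a reconstruction of the Copenhagen Diagnosis figure: it assumes emissions rise in a straight line until the peak year and then fall in a straight line to zero, and the starting emissions, growth rate and budget are made-up numbers in arbitrary units. Holding the area under the curve fixed, it works out when emissions must hit zero and how fast they must fall.

```python
def decline_needed(peak_year, start_year=2010, start_emissions=30.0,
                   growth_per_year=0.6, budget=750.0):
    """Return (year emissions must reach zero, required cut per year)
    for a pathway that peaks in peak_year and stays within the budget.

    All default values are hypothetical and chosen only for illustration.
    """
    years_rising = peak_year - start_year
    peak_emissions = start_emissions + growth_per_year * years_rising

    # Area under the rising segment (a trapezoid).
    emitted_while_rising = 0.5 * (start_emissions + peak_emissions) * years_rising
    remaining = budget - emitted_while_rising
    if remaining <= 0:
        raise ValueError("the budget is used up before emissions peak")

    # The declining segment is a triangle: area = peak * duration / 2.
    years_declining = 2.0 * remaining / peak_emissions
    return peak_year + years_declining, peak_emissions / years_declining

for peak in (2011, 2015, 2020):
    zero_year, cut_per_year = decline_needed(peak)
    print(f"peak in {peak}: zero by about {zero_year:.0f}, "
          f"cutting {cut_per_year:.2f} units every year")
```

With any numbers of this general shape the pattern is the same: each year of delay uses up more of the fixed budget at a higher emissions rate, so the date for reaching zero moves earlier and the required annual cut gets steeper.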

The thing is, there are plenty of good analyses of how to achieve early emissions reductions by deploying existing technology. Anyone who argues we should put our hopes in some grand future R&D effort to invent new technologies clearly does not understand the climate science. Or perhaps can’t do calculus.

Weather and climate are different. Weather varies tremendously from day to day, week to week, season to season. Climate, on the other hand is average weather over a period of years; it can be thought of as the boundary conditions on the variability of weather. We might get an extreme cold snap, or a heatwave at a particular location, but our knowledge of the local climate tells us that these things are unusual, temporary phenomena, and sooner or later things will return to normal. Forecasting the weather is therefore very different from forecasting changes in the climate. One is an initial value problem, and the other is a boundary value problem. Let me explain.

Good weather forecasts depend upon an accurate knowledge of the current state of the weather system. You gather as much data as you can about current temperatures, winds, clouds, etc., feed them all into a simulation model and then run it forward to see what happens. This is hard because the weather is an incredibly complex system. The amount of information needed is huge: both the data and the models are incomplete and error-prone. Despite this, weather forecasting has come a long way over the past few decades. Through a daily process of generating forecasts, comparing them with what happened, and thinking about how to reduce errors, we have incredibly accurate 1- and 3-day temperature forecasts. Accurate forecasts of rain, snow, and so on for a specific location are a little harder because of the chance that the rainfall will be in a slightly different place (e.g. a few kilometers away) or at a slightly different time than the model forecasts, even if the overall amount of precipitation is right. Hence, daily forecasts give fairly precise temperatures, but put probabilistic values on things like rain (Probability of Precipitation, PoP), based on knowledge of the uncertainty factors in the forecast. The probabilities are known because we have a huge body of previous forecasts to compare with.

The limit on useful weather forecasts seems to be about one week. There are inaccuracies and missing information in the inputs, and the models are only approximations of the real physical processes. Hence, the whole process is error prone. At first these errors tend to be localized, which means the forecast for the short term (a few days) might be wrong in places, but is good enough in most of the region we’re interested in to be useful. But the longer we run the simulation for, the more these errors multiply, until they dominate the computation. At this point, running the simulation for longer is useless. 1-day forecasts are much more accurate than 3-day forecasts, which are better than 5-day forecasts, and beyond that it’s not much better than guessing. However, steady improvements mean that 3-day forecasts are now as accurate as 2-day forecasts were a decade ago. Weather forecasting centres are very serious about reviewing the accuracy of their forecasts, and set themselves annual targets for accuracy improvements.

A number of things help in this process of steadily improving forecasting accuracy. Improvements to the models help, as we get better and better at simulating physical processes in the atmosphere and oceans. Advances in high performance computing help too – faster supercomputers mean we can run the models at a higher resolution, which means we get more detail about where exactly energy (heat) and mass (winds, waves) are moving. But all of these improvements are dwarfed by the improvements we get from better data gathering. If we had more accurate data on current conditions, and could get it into the models faster, we could get big improvements in the forecast quality. In other words, weather forecasting is an “initial value” problem. The biggest uncertainty is knowledge of the initial conditions.

One result of this is that weather forecasting centres (like the UK Met Office) can get an instant boost to forecasting accuracy whenever they upgrade to a faster supercomputer. This is because the weather forecast needs to be delivered to a customer (e.g. a newspaper or TV station) by a fixed deadline. If the models can be made to run faster, the start of the run can be delayed, giving the meteorologists more time to collect newer data on current conditions, and more time to process this data to correct for errors, and so on. For this reason, the national weather forecasting services around the world operate many of the world’s fastest supercomputers.

Hence weather forecasters are strongly biased towards data collection as the most important problem to tackle. They tend to regard computer models as useful, but of secondary importance to data gathering. Of course, I’m generalizing – developing the models is also a part of meteorology, and some meteorologists devote themselves to modeling, coming up with new numerical algorithms, faster implementations, and better ways of capturing the physics. It’s quite a specialized subfield.

Climate science has the opposite problem. Using pretty much the same model as for numerical weather prediction, climate scientists will run the model for years, decades or even centuries of simulation time. After the first few days of simulation, the similarity to any actual weather conditions disappears. But over the long term, day-to-day and season-to-season variability in the weather is constrained by the overall climate. We sometimes describe climate as “average weather over a long period”, but in reality it is the other way round – the climate constrains what kinds of weather we get.

For understanding climate, we no longer need to worry about the initial values, we have to worry about the boundary values. These are the conditions that constrain the climate over the long term: the amount of energy received from the sun, the amount of energy radiated back into space from the earth, the amount of energy absorbed or emitted from oceans and land surfaces, and so on. If we get these boundary conditions right, we can simulate the earth’s climate for centuries, no matter what the initial conditions are. The weather itself is a chaotic system, but it operates within boundaries that keep the long term averages stable. Of course, a particularly weird choice of initial conditions will make the model behave strangely for a while, at the start of a simulation. But if the boundary conditions are right, eventually the simulation will settle down into a stable climate. (This effect is well known in chaos theory: the butterfly effect expresses the idea that the system is very sensitive to initial conditions, and attractors are what cause a chaotic system to exhibit a stable pattern over the long term)

To handle this potential for initial instability, climate modellers create “spin-up” runs: pick some starting state, run the model for say 30 years of simulation, until it has settled down to a stable climate, and then use the state at the end of the spin-up run as the starting point for science experiments. In other words, the starting state for a climate model doesn’t have to match real weather conditions at all; it just has to be a plausible state within the bounds of the particular climate conditions we’re simulating.
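A toy chaotic system shows the same two behaviours. The sketch below uses the logistic map, which has nothing to do with atmospheric physics and is chosen here only because it is about the simplest chaotic system there is; the parameter value 3.9, the starting points and the run lengths are all arbitrary choices for illustration. The fixed parameter plays the role of a boundary condition, the starting value plays the role of the initial conditions, and the discarded first steps stand in for a spin-up run.

```python
def step(x, r=3.9):
    """One iteration of the logistic map; r stays fixed, like a boundary condition."""
    return r * x * (1.0 - x)

def trajectory(x0, steps, r=3.9):
    """The full sequence of states, analogous to day-by-day 'weather'."""
    xs = [x0]
    for _ in range(steps):
        xs.append(step(xs[-1], r))
    return xs

def long_run_mean(x0, spin_up=1_000, steps=100_000, r=3.9):
    """Average state after discarding a spin-up period, analogous to 'climate'."""
    x = x0
    for _ in range(spin_up):   # let the system forget its starting state
        x = step(x, r)
    total = 0.0
    for _ in range(steps):
        x = step(x, r)
        total += x
    return total / steps

# "Weather": two starts that differ by one part in ten billion soon diverge.
a = trajectory(0.2, 200)
b = trajectory(0.2 + 1e-10, 200)
largest_gap = max(abs(p - q) for p, q in zip(a, b))
print("largest gap between the two trajectories:", largest_gap)

# "Climate": very different starts give almost the same long-run average.
mean_a = long_run_mean(0.2)
mean_b = long_run_mean(0.7)
print("long-run means:", mean_a, mean_b)
```

The individual trajectories become useless as predictions of each other within a few dozen steps, yet the long-run averages agree closely, which is the sense in which the starting state does not matter once the spin-up is over.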

To explore the role of these boundary values on climate, we need to know whether a particular combination of boundary conditions keeps the climate stable, or tends to change it. Conditions that tend to change it are known as forcings. But the impact of these forcings can be complicated to assess because of feedbacks. Feedbacks are responses to the forcings that then tend to amplify or diminish the change. For example, increasing the input of solar energy to the earth would be a forcing. If this then led to more evaporation from the oceans, causing increased cloud cover, this could be a feedback, because clouds have a number of effects: they reflect more sunlight back into space (because they are whiter than the land and ocean surfaces they cover) and they trap more of the surface heat (because water vapour is a strong greenhouse gas). The first of these is a negative feedback (it reduces the surface warming from increased solar input) and the second is a positive feedback (it increases the surface warming by trapping heat). To determine the overall effect, we need to set the boundary conditions to match what we know from observational data (e.g. from detailed measurements of solar input, measurements of greenhouse gases, etc). Then we run the model and see what happens.
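For intuition only, the way competing feedbacks combine can be sketched with the standard textbook linear approximation, in which the response without feedbacks is divided by one minus the sum of the feedback factors. This is a back-of-envelope device, not what a climate model does (the models simulate the physics and the feedbacks emerge from it), and the feedback factors below are invented numbers, not measured values.

```python
def response_with_feedbacks(no_feedback_response, feedback_factors):
    """Linear feedback approximation: response = base / (1 - sum of factors).

    Positive factors amplify the change, negative factors damp it.
    The approximation only makes sense while the net factor is below 1.
    """
    net = sum(feedback_factors)
    if net >= 1.0:
        raise ValueError("net feedback factor must be below 1 for a stable response")
    return no_feedback_response / (1.0 - net)

base = 1.0                 # hypothetical warming with no feedbacks, in ºC
reflect_sunlight = -0.2    # hypothetical negative feedback (brighter cloud tops)
trap_heat = 0.4            # hypothetical positive feedback (more heat trapped)

print(response_with_feedbacks(base, [reflect_sunlight]))             # damped
print(response_with_feedbacks(base, [trap_heat]))                    # amplified
print(response_with_feedbacks(base, [reflect_sunlight, trap_heat]))  # net effect
```

Even in this crude form, the point stands that the sign and size of the net effect cannot be read off from either feedback alone, which is why the overall effect has to be worked out by running the model.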

Observational data is again important, but this time for making sure we get the boundary values right, rather than the initial values. Which means we need different kinds of data too – in particular, longer term trends rather than instantaneous snapshots. But this time, errors in the data are dwarfed by errors in the model. If the algorithms are off even by a tiny amount, the simulation will drift over a long climate run, and it stops resembling the earth’s actual climate. For example, a tiny error in calculating where the mass of air leaving one grid square goes could mean we lose a tiny bit of mass on each time step. For a weather forecast, the error is so small we can ignore it. But over a century long climate run, we might end up with no atmosphere left! So a basic test for climate models is that they conserve mass and energy over each timestep.
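A quick calculation shows how a leak that is invisible in a forecast becomes ruinous over a century. The numbers here are assumptions picked for illustration: a 30-minute timestep and a loss of one part in a million of the atmosphere's mass on every step. The second function is a hypothetical example of the kind of conservation check mentioned above, comparing total mass before and after a step.

```python
def mass_remaining(loss_per_step, steps):
    """Fraction of the original mass left after losing a fixed fraction each step."""
    return (1.0 - loss_per_step) ** steps

def conserves_mass(cells_before, cells_after, tolerance=1e-12):
    """Check that total mass over all grid cells is unchanged by a timestep,
    to within a relative tolerance."""
    total_before = sum(cells_before)
    total_after = sum(cells_after)
    return abs(total_after - total_before) <= tolerance * abs(total_before)

STEPS_PER_DAY = 48            # assumes a 30-minute timestep
LOSS_PER_STEP = 1e-6          # hypothetical leak: one millionth per step

forecast_steps = 5 * STEPS_PER_DAY          # a five-day weather forecast
century_steps = 100 * 365 * STEPS_PER_DAY   # a century-long climate run

print("left after 5 days:   ", mass_remaining(LOSS_PER_STEP, forecast_steps))
print("left after 100 years:", mass_remaining(LOSS_PER_STEP, century_steps))

# Moving mass between cells is fine; losing some of it is not.
print(conserves_mass([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
print(conserves_mass([1.0, 2.0, 3.0], [1.0, 2.0, 2.9]))
```

Over five days the loss is a tiny fraction of a percent, well within the noise of a forecast; over a century of steps the same leak removes most of the atmosphere.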

Climate models have also improved in accuracy steadily over the last few decades. We can now use the known forcings over the last century to obtain a simulation that tracks the temperature record amazingly well. These simulations demonstrate the point nicely. They don’t correspond to any actual weather, but show patterns in both small and large scale weather systems that mimic what the planet’s weather systems actually do over the year (look at August – see the daily bursts of rainfall in the Amazon, the Gulf Stream sending rain to the UK all summer long, and the cyclones forming off the coast of Japan by the middle of the month). And these patterns aren’t programmed into the model – it is all driven by sets of equations derived from the basic physics. This isn’t a weather forecast, because on any given day, the actual weather won’t look anything like this. But it is an accurate simulation of typical weather over time (i.e. climate). And, as was the case with weather forecasts, some bits are better than others – for example the Indian monsoons tend to be less well-captured than the North Atlantic Oscillation.

At first sight, numerical weather prediction and climate models look very similar. They model the same phenomena (e.g. how energy moves around the planet via airflows in the atmosphere and currents in the ocean), using the same computational techniques (e.g., three dimensional models of fluid flow on a rotating sphere). And quite often they use the same program code. But the problems are completely different: one is an initial value problem, and one is a boundary value problem.

Which also partly explains why a small minority of (mostly older, mostly male) meteorologists end up being climate change denialists. They fail to understand the difference in the two problems, and think that climate scientists are misusing the models. They know that the initial value problem puts serious limits on our ability to predict the weather, and assume the same limit must prevent the models being used for studying climate. Their experience tells them that weaknesses in our ability to get detailed, accurate, and up-to-date data about current conditions are the limiting factor for weather forecasting, and they assume this limitation must be true of climate simulations too.

Ultimately, such people tend to suffer from “senior scientist” syndrome: a lifetime of immersion in their field gives them tremendous expertise in that field, which in turn causes them to over-estimate how well their expertise transfers to a related field. They can become so heavily invested in a particular scientific paradigm that they fail to understand that a different approach is needed for different problem types. This isn’t the same as the Dunning-Kruger effect, because the people I’m talking about aren’t incompetent. So perhaps we need a new name. I’m going to call it the Dyson-effect, after one of its worst sufferers.

I should clarify that I’m certainly not stating that meteorologists in general suffer from this problem (the vast majority quite clearly don’t), nor am I claiming this is the only reason why a meteorologist might be skeptical of climate research. Nor am I claiming that any specific meteorologists (or physicists such as Dyson) don’t understand the difference between initial value and boundary value problems. However, I do think that some scientists’ ideological beliefs tend to bias them to be dismissive of climate science because they don’t like the societal implications, and the Dyson-effect disinclines them to finding out what climate science actually does.

I am, however, arguing that if more people understood this distinction between the two types of problem, we could get past silly soundbites about “we can’t even forecast the weather…” and “climate models are garbage in garbage out”, and have a serious conversation about how climate science works.

Update: Zeke has a more detailed post on the role of parameterizations in climate models.

Well, this is what it comes down to. Code reviews on national TV. Who would have thought it? And, by the standards of a Newsnight code review, the code in question doesn’t look so good. Well, it’s not surprising it doesn’t. It’s the work of one, untrained programmer, working in an academic environment, trying to reconstruct someone else’s data analysis. And given the way in which the CRU files were stolen, we can be pretty sure this is not a random sample of code from the CRU; it’s handpicked to be one of the worst examples.

Watch the clip from about 2:00. They compare the code with some NASA code, although we’re not told what exactly. Well, duh. If you compare the experimental code written by one scientist on his own, which has clearly not been through any code review, with that produced by NASA’s engineering processes, of course it looks messy. For any programmers reading this: How many of you can honestly say that you’d come out looking good if I trawled through your files, picked the worst piece of code lying around in there, and reviewed it on national TV? And the “software engineer” on the program says it’s “below the standards you would expect in any commercial software”. Well, I’ve seen a lot of commercial software. It’s a mix of good, bad, and ugly. If you’re deliberate with your sampling technique, you can find a lot worse out there.

Does any of this matter? Well, a number of things bug me about how this is being presented in the media and blogosphere:

  • The first, obviously, is the ridiculous conclusion that many people seem to be making that poor code quality in one, deliberately selected program file somehow invalidates all of climate science. As cdavid points out towards the end of this discussion, if you’re going to do that, then you pretty much have to throw out most results in every field of science over the past few decades for the same reason. Bad code is endemic in science.
  • The slightly more nuanced, but equally specious, conclusion that bugs in this code mean that research results at the CRU must be wrong. Eric Raymond picks out an example he calls blatant data-cooking, but is quite clearly fishing for results, because he ignores the fact that the correction he picks on is never used in the code, except in parts that are commented out. He’s quote mining for effect, and given Raymond’s political views, it’s not surprising. Just for fun, someone quote mined Raymond’s own code, and was horrified at what he found. Clearly we have to avoid all open source code immediately because of this…? The problem, of course, is that none of these quote miners have gone to the trouble to establish what this particular code is, why it was written, and what it was used for.
  • The widely repeated assertion that this just proves that scientific software must be made open source, so that a broader community of people can review it and improve it.

It’s this last point that bothers me most, because at first sight, it seems very reasonable. But actually, it’s a red herring. To understand why, we need to pick apart two different arguments:

  1. An argument that when a paper is published, all of the code and data on which it is based should be released so that other scientists (who have the appropriate background) can re-run it and validate the results. In fields with complex, messy datasets, this is exceedingly hard, but might be achievable with good tools. The complete toolset needed to do this does not exist today, so just calling for making the code open source is pointless. Much climate code is already open source, but that doesn’t mean anyone in another lab can repeat a run and check the results. The problems of reproducibility have very little to do with whether the code is open – the key problem is to capture the entire scientific workflow and all data provenance. This is very much an active line of research, and we have a long way to go. In the absence of this, we rely on other scientists testing the results with other methods, rather than repeating the same tests. Which is the way it’s done in most branches of science.
  2. An argument that there is a big community of open source programmers out there who could help. This is based on a fundamental misconception about why open source software development works. It matters how the community is organised, and how contributions to the code are controlled by a small group of experts. It matters that it works as a meritocracy, where programmers need to prove their ability before they are accepted into the inner developer group. And most of all, it matters that the developers are the domain experts. For example, the developers who built the Linux kernel are world-class experts on operating systems and computer architecture. Quite often they don’t realize just how high their level of expertise is, because they hang out with others who also have the same level of expertise. Likewise, it takes years of training to understand the dynamics of atmospheric physics in order to be able to contribute to the development of a climate simulation model. There is not a big pool of people with the appropriate expertise to contribute to open source climate model development, and nor is there ever likely to be, unless we expand our PhD programs in climatology dramatically (I’m sure the nay-sayers would like that!).

We do know that most of the heavy-duty climate models are built at large government research centres, rather than at universities. Dave Randall explains why this is: the operational overhead of developing, testing and maintaining a Global Climate Model is far too high for university-based researchers. The universities use (parts of) the models, and do further data analysis on both observational data and outputs from the big models. Much of this is the work of individual PhD students or postdocs. Which means that the argument that all code written at all stages of climate research must meet some gold standard of code quality is about as sensible as saying no programmer should ever be allowed to throw together a script to test out if some idea works. Of course bad code will get written in a hurry. What matters is that as a particular line of research matures, the coding practices associated with it should mature too. And we have plenty of evidence that this is true of climate science: the software practices used at the Hadley Centre for their climate models are better than most commercial software practices. Furthermore, they manage to produce code that appears to be less buggy than just about any other code anywhere (although we’re still trying to validate this result, and understand what it means).

None of this excuses bad code written by scientists. But the sensible response to this problem is to figure out how to train scientists to be better programmers, rather than argue that some community of programmers other than scientists can take on the job instead. The idea of open source climate software is great, but it won’t magically make the code better.

Our paper, Engineering the Software for Understanding Climate Change, finally appeared today in IEEE Computing in Science and Engineering. The rest of the issue looks interesting too – a special issue on software engineering in computational science. Kudos to Greg and Andy for pulling it together.

Update: As the final paper is behind a paywall, folks might find this draft version useful. The final published version was edited for journal house style, and shortened to fit page constraints. Needless to say, I prefer my original draft…