Susan Leigh Star passed away in her sleep this week, coincidentally on Ada Lovelace day. As I didn’t get a chance to do a Lovelace post, I’m writing this one belatedly, as a tribute to Leigh.

Leigh Star (sometimes also known as L*) had a huge influence on my work back in the early 90s. I met her when she was in the UK, at a time when there was a growing community of folks at Sussex, Surrey, and Xerox Europarc, interested in CSCW. We organised a series of workshops on CSCW in London, at the behest of the UK funding councils. Leigh spoke at the workshop that I chaired, and she subsequently contributed a chapter entitled “Cooperation Without Consensus in Scientific Problem Solving” to our book, CSCW: Cooperation or Conflict. Looks like the book is out of print, and I really want to read Leigh’s chapter again, so I hope I haven’t lost my copy – the only chapter I still have electronically is our introduction.

Anyway, Leigh pioneered a new kind of sociology of scientific work practices, looking at the mechanisms by which coordination and sharing occur across disciplinary boundaries. Perhaps one of her most famous observations is the concept of boundary objects, which I described in detail last year in response to seeing coordination issues arise between geophysicists trying to consolidate their databases. The story of the geologists realizing they didn’t share a common definition of the term “bedrock” would have amused and fascinated her.

It was Leigh’s work on this that first switched me on to the value of sociological studies as a way of understanding the working practices of scientists, and she taught me a lot about how to use ethnographic techniques to study how people use and develop technical infrastructures. I’ve remained fascinated by her ideas ever since. For those wanting to know more about her work, I could suggest this interview with her from 2008, or better yet, buy her book on how classification schemes work, or perhaps read this shorter paper on the Ethnography of Infrastructure. She had just moved to the i-school at U Pittsburgh last year, so I assumed she still had many years of active research ahead of her. I’m deeply saddened that I didn’t get another chance to meet with her.

Leigh – we’ll miss you!

Note: This started as a comment on a thread at RealClimate about the Guardian’s investigation of the CRU emails fiasco. The Guardian has, until recently, had an outstandingly good record on its climate change reporting. It commissioned Fred Pearce to do a detailed investigation into the emails, and he published his results in a 12-part series. While some parts of it are excellent, other parts demonstrate a complete misunderstanding of how science works, especially the sections dealing with the peer-review process. These were just hopelessly wrong, as demonstrated by Ben Santer’s rebuttal of the specific allegations. In parallel, George Monbiot, who I normally respect as one of the few journalists who really understands the science, has been arguing for Phil Jones to resign as head of the CRU at East Anglia, on the basis that his handling of the FOI requests was unprofessional. Monbiot has repeated this more recently, as can be seen in this BBC clip, where he is hopelessly ineffective in combating Delingpole’s nonsense, because he’s unwilling to defend the CRU scientists adequately.

The problem with both Pearce’s investigation and Monbiot’s criticisms of Prof Jones is that neither has any idea of what academic research looks like from the inside, nor how scientists normally talk to one another. The following is my attempt to explain this context, and in particular why scientists talking freely among themselves might seem rude or worse. Enough people liked my comment at RC that I decided to edit it a little and post it here (the original has already been reposted at ClimateSight and Prof Mandia’s blog). I should add one disclaimer: I don’t mean to suggest here that scientists are not nice people – the climate scientists I’ve gotten to know over the past few years are some of the nicest people you could ever ask to meet. It’s just that scientists are extremely passionate about the integrity of their work, and don’t take kindly to people pissing them around. Okay, now read on…

Once we’ve gotten past the quote-mining and distortion, the worst that can be said about the CRU emails is that the scientists sometimes come across as rude or dismissive, and say things in the emails that really aren’t very nice. However, the personal email messages between senior academics in any field are frequently not very nice. We tend to be very blunt about what appears to us as ignorance, and intolerant of anything that wastes our time, or distracts us from our work. And when we think (rightly or wrongly) that the peer review process has let another crap paper through, we certainly don’t hold back in expressing our opinions to one another. Which is of course completely different to how we behave when we meet one another. Most scientists distinguish clearly between the intellectual cut and thrust (in which we’re sometimes very rude about one another’s ideas) and our social interactions (in which we all get together over a beer and bitch about the downsides of academic life). Occasionally, there’s someone who is unable to separate the two, and takes the intellectual jabs personally, but such people are rare enough in most scientific fields that the rest of us know exactly who they are, and try to avoid them at conferences.

Part of this is due to the nature of academic research. Most career academics have large egos and very thick skins. I think the tenure process and the peer review process filter out those who don’t. We’re all jostling to get our work published and recognised, often by pointing out how flawed everyone else’s work is. But we also care deeply about intellectual rigor, and preserving the integrity of the published body of knowledge. And we also know that many key career milestones are dependent on being respected (and preferably liked) by others in the field: for example, the more senior people who might get asked to write recommendation letters for us, for tenure and promotion and honors, or the scientists with competing theories who will get asked to peer review our papers.

Which means in public (e.g. in conference talks and published papers) our criticisms of others are usually carefully coded to appear polite and respectful. A published paper might talk about making “an improvement on the methodology of Bloggs et al”. Meanwhile, in private, when talking to our colleagues, we’re more likely to say that Bloggs’ work is complete rubbish, and should never have been published in the first place, and anyway everyone knows Bloggs didn’t do any of the work himself, and the only decent bits are due to his poor, underpaid postdoc, who never gets any credit for her efforts. (Yes, academics like to gossip about one another just as regular people do). This kind of blunt rudeness is common in private emails, especially when we’re discussing other scientists behind their backs with like-minded colleagues. Don’t be fooled by the more measured politeness in public: when we think an idea is wrong, we’ll tear it to shreds.

Now, in climate science, all our conventions are being broken. Private email exchanges are being made public. People who have no scientific training and/or no prior exposure to the scientific culture are attempting to engage in a discourse with scientists, and neither side understands the other. People are misquoting scientists, and trying to trip them up with loaded questions. And, occasionally, resorting to death threats. Outside of the scientific community, most people just don’t understand how science works, and so don’t know how to make sense of what’s going on.

And scientists don’t really know how to engage with these strange outsiders. Scientists normally only interact with other scientists. We live rather sheltered lives; they don’t call it the ivory tower for nothing. When scientists are attacked for political reasons, we mistake it for an intellectual discussion over brandy in the senior common room. Scientists have no training for political battles, and so our responses often look rude or dismissive to outsiders. Which in turn gets interpreted as unprofessional behaviour by those who don’t understand how scientists talk. And unlike commercial organisations and politicians, universities don’t engage professional PR firms to make us look good, and we academics would be horrified if they did: horrified at the expense, and horrified by the idea that our research might need to be communicated on anything other than its scientific merits.

Journalists like Monbiot, despite all his brilliant work in keeping up with the science and trying to explain it to the masses, just haven’t ever experienced academic culture from the inside. Hence his call, which he keeps repeating, for Phil Jones to resign, on the basis that Phil reacted unprofessionally to FOI requests. But if you keep provoking a scientist with nonsense, you’ll get a hostile response. Any fool knows you don’t get data from a scientist by using FOI requests, you do it by stroking their ego a little, or by engaging them with a compelling research idea that you need the data to pursue. And in the rare cases where this doesn’t work, you do some extra work yourself to reconstruct the data you need using other sources, or you test your hypothesis using a different approach (because it’s the research result we care about, not any particular dataset). So to a scientist, anyone stupid enough to try to get scientific data through repeated FOI requests quite clearly deserves our utter contempt. Jones was merely expressing (in private) a sentiment that most scientists would share – an extreme frustration with people who clearly don’t get it.

The same misunderstandings occur when outsiders look at how we talk about the peer-review process. Outsiders tend to think that all published papers are somehow equal in merit, and that peer-review is a magical process that only lets the truth through (hint: we refer to it more often as a crap-shoot). Scientists know that while some papers are accepted because they are brilliant, others are accepted because it’s hard to tell whether they are any good, and publication might provoke other scientists to do the necessary followup work. We know some published papers are worth reading, and some should be ignored. So, we’re natural skeptics – we tend to think that most new published results are likely to be wrong, and we tend to accept them only once they’ve been repeatedly tested and refined.

We’re used to having our own papers rejected from time to time, and we learn how to deal with it – quite clearly the reviewers were stupid, and we’ll show them by getting it published elsewhere (remember, big ego, thick skin). We’re also used to seeing the occasional crap paper get accepted (even into our most prized journals), and again we understand that the reviewers were stupid, and the journal editors incompetent, and we waste no time in expressing that. And if there’s a particularly egregious example, everyone in the community will know about it, everyone will agree it’s bad, and some of us will start complaining loudly about the idiot editor who let it through. Yet at the same time, we’re all reviewers, and some of us are editors, so it’s understood that the people we’re calling stupid and incompetent are our colleagues. And a big part of calling them stupid or incompetent is to get them to be more rigorous next time round, and it works because no honest scientist wants to be seen as lacking rigor. What looks to the outsider like a bunch of scientists trying to subvert some gold standard of scientific truth is really just scientists trying to goad one another into doing a better job in what we all know is a messy, noisy process.

The bottom line is that scientists will always tend to be rude to ignorant and lazy people, because we expect to see in one another a driving desire to master complex ideas and to work damn hard at it. Unfortunately the outside world (and many journalists) interpret that rudeness as unprofessional conduct. And because they don’t see it every day (like we do!) they’re horrified.

Some people have suggested that scientists need to wise up, and learn how to present themselves better on the public stage. Indeed, the Guardian published an editorial calling for the emergence of new leaders from the scientific community who can explain the science. This is naive and irresponsible. It completely ignores the nature of the current wave of attacks on scientists, and what motivates those attacks. No scientist can be an effective communicator in a world where people with vested interests will do everything they can to destroy his or her reputation. The scientific community doesn’t have the resources to defend itself in this situation, and quite frankly it shouldn’t have to. What we really need is for newspaper editors, politicians, and business leaders to start acting responsibly, make the effort to understand what the science is saying, make the effort to understand what is really driving these swiftboat-style attacks on scientists, and then shift the discourse from endless dissection of scientists’ emails onto useful, substantive discussions of the policy choices we’re faced with.

[Update: Joe Romm has reposted this at ClimateProgress, and it’s generated some very interesting discussion, including a response from George Monbiot that’s worth reading]

[Update 2: 31/3/2010 The UK Parliament released its findings last night, and completely exonerates Prof. Jones and the CRU. It does, however, suggest that the UEA should bear responsibility for any mistakes that were made over how the FoI requests were handled, and it makes a very strong call for more openness with data and software from the climate science community]

[Update 3: 7/4/2010 A followup post in which I engaged George Monbiot in a lengthy debate (and corrected some possible misimpressions from the above post)]

[Update 4: 27/4/2010 This post was picked up by Physics Today]

This week I attended a Dagstuhl seminar on New Frontiers for Empirical Software Engineering. It was a select gathering, with many great people, which meant lots of fascinating discussions, and not enough time to type up all the ideas we’ve been bouncing around. I was invited to run a working group on the challenges to empirical software engineering posed by climate change. I started off with a quick overview of the three research themes we identified at the OOPSLA workshop in the fall:

  • Climate Modeling, which we could characterize as a kind of end-user software development, embedded in a scientific process;
  • Global collective decision-making, which involves creating the software infrastructure for collective curation of sources of evidence in a highly charged political atmosphere;
  • Green Software Engineering, including carbon accounting for the software systems lifecycle (development, operation and disposal), but where we have no existing measurement framework, and a tendency to make unsupported claims (aka greenwashing).

Inevitably, we spent most of our time this week talking about the first topic – software engineering of computational models, as that’s the closest to the existing expertise of the group, and the most obvious place to start.

So, here’s a summary of our discussions. The bright ideas are due to the group (Vic Basili, Lionel Briand, Audris Mockus, Carolyn Seaman and Claes Wohlin), while the mistakes in presenting them here are all mine.

A lot of our discussion was focussed on the observation that climate modeling (and software for computational science in general) is a very different kind of software engineering than most of what’s discussed in the SE literature. It’s like we’ve identified a new species of software engineering, which appears to be an outlier (perhaps an entirely new phylum?). This discovery (and the resulting comparisons) seems to tell us a lot about the other species that we thought we already understood.

The SE research community hasn’t really tackled the question of how the different contexts in which software development occurs might affect software development practices, nor when and how it’s appropriate to attempt to generalize empirical observations across different contexts. In our discussions at the workshop, we came up with many insights for mainstream software engineering, which means this is a two-way street: plenty of opportunity for re-examination of mainstream software engineering, as well as learning how to study SE for climate science. I should also say that many of our comparisons apply to computational science in general, not just climate science, although we used climate modeling for many specific examples.

We ended up discussing three closely related issues:

  1. How do we characterize/distinguish different points in this space (different species of software engineering)? We focussed particularly on how climate modeling is different from other forms of SE, but we also attempted to identify factors that would distinguish other species of SE from one another. We identified lots of contextual factors that seem to matter. We looked for external and internal constraints on the software development project that seem important. External constraints are things like resource limitations, or particular characteristics of customers or the environment where the software must run. Internal constraints are those that are imposed on the software team by itself, for example, choices of working style, project schedule, etc.
  2. Once we’ve identified what we think are important distinguishing traits (or constraints), how do we investigate whether these are indeed salient contextual factors? Do these contextual factors really explain observed differences in SE practices, and if so how? We need to consider how we would determine this empirically. What kinds of study are needed to investigate these contextual factors? How should the contextual factors be taken into account in other empirical studies?
  3. Now imagine we have already characterized this space of species of SE. What measures of software quality attributes (e.g. defect rates, productivity, portability, changeability…) are robust enough to allow us to make valid comparisons between species of SE? Which metrics can be applied in a consistent way across vastly different contexts? And if none of the traditional software engineering metrics (e.g. for quality, productivity, …) can be used for cross-species comparison, how can we do such comparisons?

In my study of the climate modelers at the UK Met Office Hadley centre, I had identified a list of potential success factors that might explain why the climate modelers appear to be successful (i.e. to the extent that we are able to assess it, they appear to build good quality software with low defect rates, without following a standard software engineering process). My list was:

  • Highly tailored software development process – software development is tightly integrated into scientific work;
  • Single Site Development – virtually all coupled climate models are developed at a single site, managed and coordinated at a single site, once they become sufficiently complex [edited – see Bob’s comments below], usually a government lab as universities don’t have the resources;
  • Software developers are domain experts – they do not delegate programming tasks to programmers, which means they avoid the misunderstandings of the requirements common in many software projects;
  • Shared ownership and commitment to quality, which means that the software developers are more likely to make contributions to the project that matter over the long term (in contrast to, say, offshored software development, where developers are only likely to do the tasks they are immediately paid for);
  • Openness – the software is freely shared with a broad community, which means that there are plenty of people examining it and identifying defects;
  • Benchmarking – there are many groups around the world building similar software, with regular, systematic comparisons on the same set of scenarios, through model inter-comparison projects (this trait could be unique – we couldn’t think of any other type of software for which this is done so widely).
  • Unconstrained Release Schedule – as there is no external customer, software releases are unhurried, and occur only when the software is considered stable and tested enough.

At the workshop we identified many more distinguishing traits, any of which might be important:

  • A stable architecture, defined by physical processes: atmosphere, ocean, sea ice, land scheme,…. All GCMs have the same conceptual architecture, and it is unchanged since modeling began, because it is derived from the natural boundaries in physical processes being simulated [edit: I mean the top level organisation of the code, not the choice of numerical methods, which do vary across models – see Bob’s comments below]. This is used as an organising principle both for the code modules, and also for the teams of scientists who contribute code. However, the modelers don’t necessarily derive some of the usual benefits of stable software architectures, such as information hiding and limiting the impacts of code changes, because the modules have very complex interfaces between them.
  • The modules and integrated system each have independent lives, owned by different communities. For example, a particular ocean model might be used uncoupled by a large community, and also be integrated into several different coupled climate models at different labs. The communities who care about the ocean model on its own will have different needs and priorities than each of the communities who care about the coupled models. Hence, the inter-dependence has to be continually re-negotiated. Some other forms of software have this feature too: Audris mentioned voice response systems in telecoms, which can be used stand-alone, and also in integrated call centre software; Lionel mentioned some types of embedded control systems onboard ships, where the modules are used independently on some ships, and as part of a larger integrated command and control system on others.
  • The software has huge societal importance, but the impact of software errors is very limited. First, a contrast: for automotive software, a software error can immediately lead to death, or huge expense, legal liability, etc., as cars are recalled. What would be the impact of software errors in climate models? An error may affect some of the experiments performed on the model, with perhaps the most serious consequence being the need to withdraw published papers (although I know of no cases where this has happened because of software errors rather than methodological errors). Because there are many other modeling groups, and scientific results are filtered through processes of replication, and systematic assessment of the overall scientific evidence, the impact of software errors on, say, climate policy is effectively nil. I guess it is possible that systematic errors are being made by many different climate modeling groups in the same way, but these wouldn’t be coding errors – they would be errors in the understanding of the physical processes and how best to represent them in a model.
  • The programming language of choice is Fortran, and is unlikely to change for very good reasons. The reasons are simple: there is a huge body of legacy Fortran code, everyone in the community knows and understands Fortran (and for many of them, only Fortran), and Fortran is ideal for much of the work of coding up the mathematical formulae that represent the physics. Oh, and performance matters enough that the overhead of object oriented languages makes them unattractive. Several climate scientists have pointed out to me that it probably doesn’t matter what language they use, the bulk of the code would look pretty much the same – long chunks of sequential code implementing a series of equations. Which means there’s really no push to discard Fortran.
  • Existence and use of shared infrastructure and frameworks. An example used by pretty much every climate model is MPI. However, unlike Fortran, which is generally liked (if not loved), everyone universally hates MPI. If there was something better they would use it. [OpenMP doesn’t seem to have any bigger fanclub]. There are also frameworks for structuring climate models and coupling the different physics components (more on these in a subsequent post). Use of frameworks is an internal constraint that will distinguish some species of software engineering, although I’m really not clear how it will relate to choices of software development process. More research needed.
  • The software developers are very smart people. Typically with PhDs in physics or related geosciences. When we discussed this in the group, we all agreed this is a very significant factor, and that you don’t need much (formal) process with very smart people. But we couldn’t think of any existing empirical evidence to support such a claim. So we speculated that we needed a multi-case case study, with some cases representing software built by very smart people (e.g. climate models, the Linux kernel, Apache, etc), and other cases representing software built by …. stupid people. But we felt we might have some difficulty recruiting subjects for such a study (unless we concealed our intent), and we would probably get into trouble once we tried to publish the results 🙂
  • The software is developed by users for their own use, and this software is mission-critical for them. I mentioned this above, but want to add something here. Most open source projects are built by people who want a tool for their own use, but that others might find useful too. The tools are built on the side (i.e. not part of the developers’ main job performance evaluations) but most such tools aren’t critical to the developers’ regular work. In contrast, climate models are absolutely central to the scientific work on which the climate scientists’ job performance depends. Hence, we described them as mission-critical, but only in a personal kind of way. If that makes sense.
  • The software is used to build a product line, rather than an individual product. All the main climate models have a number of different model configurations, representing different builds from the codebase (rather than say just different settings). In the extreme case, the UK Met Office produces several operational weather forecasting models and several research climate models from the same unified codebase, although this is unusual for a climate modeling group.
  • Testing focuses almost exclusively on integration testing. In climate modeling, there is very little unit testing, because it’s hard to specify an appropriate test for small units in isolation from the full simulation. Instead the focus is on very extensive integration tests, with daily builds, overnight regression testing, and a rigorous process of comparing the output from runs before and after each code change. In contrast, most other types of software engineering focus instead on unit testing, with elaborate test harnesses to test pieces of the software in isolation from the rest of the system. In embedded software, the testing environment usually needs to simulate the operational environment; the most extreme case I’ve seen is the software for the international space station, where the only end-to-end software integration was the final assembly in low earth orbit.
  • Software development activities are completely entangled with a wide set of other activities: doing science. This makes it almost impossible to assess software productivity in the usual way, and even impossible to estimate the total development cost of the software. We tried this as a thought experiment at the Hadley Centre, and quickly gave up: there is no sensible way of drawing a boundary to distinguish some set of activities that could be regarded as contributing to the model development, from other activities that could not. The only reasonable path to assessing productivity that we can think of must focus on time-to-results, or time-to-publication, rather than on software development and delivery.
  • Optimization doesn’t help. This is interesting, because one might expect climate modelers to put a huge amount of effort into optimization, given that century-long climate simulations still take weeks/months on some of the world’s fastest supercomputers. In practice, optimization, where it is done, tends to be an afterthought. The reason is that the model is changed so frequently that hand optimization of any particular model version is not useful. Plus the code has to remain very understandable, so very clever designed-in optimizations tend to be counter-productive.
  • There are very few resources available for software infrastructure. Most of the funding is concentrated on the frontline science (and the costs of buying and operating supercomputers). It’s very hard to divert any of this funding to software engineering support, so development of the software infrastructure is sidelined and sporadic.
  • …and last but not least, a very politically charged atmosphere. A large number of people actively seek to undermine the science, and to discredit individual scientists, for political (ideological) or commercial (revenue protection) reasons. We discussed how much this directly impacts the climate modellers, and I have to admit I don’t really know. My sense is that all of the modelers I’ve interviewed are shielded to a large extent from the political battles (I never asked them about this). Those scientists who have been directly attacked (e.g. Mann, Jones, Santer) tend to be scientists more involved in creation and analysis of datasets, rather than GCM developers. However, I also think the situation is changing rapidly, especially in the last few months, and climate scientists of all types are starting to feel more exposed.
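To make the integration-testing trait in the list above a little more concrete, here is a minimal Python sketch of the "compare the output from runs before and after each code change" idea. Everything in it is an assumption for illustration: the function name `compare_runs`, the field names, and the representation of model output as plain lists of floats are all hypothetical, and this is not a description of any modeling centre's actual tooling. With both tolerances left at zero the check demands exactly equal values; a non-zero tolerance relaxes it.

```python
import math

def compare_runs(baseline, candidate, rel_tol=0.0, abs_tol=0.0):
    """Compare two runs, each a dict mapping a field name to a list of floats.

    Returns a list of (field, index, baseline_value, candidate_value) tuples,
    one per mismatch. With both tolerances at 0.0, values must be exactly equal.
    """
    mismatches = []
    for name in sorted(set(baseline) | set(candidate)):
        if name not in baseline or name not in candidate:
            # A field present in only one of the runs is itself a difference.
            mismatches.append((name, None, None, None))
            continue
        a, b = baseline[name], candidate[name]
        if len(a) != len(b):
            # Report differing lengths instead of comparing element by element.
            mismatches.append((name, None, len(a), len(b)))
            continue
        for i, (x, y) in enumerate(zip(a, b)):
            if not math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append((name, i, x, y))
    return mismatches

# Hypothetical output from a run before and a run after a code change.
before = {"sst": [288.0, 289.5], "ice": [0.1, 0.2]}
after = {"sst": [288.0, 289.5000001], "ice": [0.1, 0.2]}

exact_diffs = compare_runs(before, after)
loose_diffs = compare_runs(before, after, rel_tol=1e-6)
```

In this toy example the exact comparison flags the one changed value, while the looser tolerance lets it through. Which of the two is appropriate would depend on whether a given code change is expected to alter the model's results at all.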

We also speculated about some other contextual factors that might distinguish different software engineering species, not necessarily related to our analysis of computational science software. For example:

  • Existence of competitors;
  • Whether software is developed for single-person-use versus intended for broader user base;
  • Need for certification (and different modes by which certification might be done, for example where there are liability issues, and the need to demonstrate due diligence)
  • Whether software is expected to tolerate and/or compensate for hardware errors. For example, for automotive software, much of the complexity comes from building fault-tolerance into the software because correcting hardware problems introduced in design or manufacture is prohibitively expensive. We pondered how often hardware errors occur in supercomputer installations, and whether if they did it would affect the software. I’ve no idea of the answer to the first question, but the second is readily handled by the checkpoint and restart features built into all climate models. Audris pointed out that given the volumes of data being handled (terabytes per day), there are almost certainly errors introduced in storage and retrieval (i.e. bits getting flipped), and enough that standard error correction would still miss a few. However, there’s enough noise in the data that in general, such things probably go unnoticed, although we speculated what would happen when the most significant bit gets flipped in some important variable.
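As a rough illustration of the bit-flip speculation in the last item above, the Python sketch below flips a single bit in a double-precision floating point number. It assumes the variable is stored as an IEEE 754 64-bit float; the helper name `flip_bit` and the sample value are hypothetical, chosen only for the example.

```python
import struct

def flip_bit(value, bit):
    """Flip one bit of a 64-bit float (bit 0 = least significant, bit 63 = sign)."""
    # Reinterpret the float's 8 bytes as an unsigned 64-bit integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    # Toggle the chosen bit, then reinterpret the bytes as a float again.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped

temperature = 288.15  # hypothetical value, e.g. a temperature in kelvin

low_flip = flip_bit(temperature, 0)        # lowest bit of the mantissa
sign_flip = flip_bit(temperature, 63)      # sign bit (the most significant bit)
exponent_flip = flip_bit(temperature, 62)  # top bit of the exponent
```

Flipping the lowest mantissa bit changes the value by an amount far below any plausible noise level, which fits the guess that most such errors go unnoticed. Flipping the sign bit negates the value, and flipping the top exponent bit shrinks this particular value to something vanishingly close to zero, either of which would presumably be hard to miss in an important variable.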

More interestingly, we talked about what happens when these contextual factors change over time. For example, the emergence of a competitor where there was none previously, or the creation of a new regulatory framework where none existed. Or even, in the case of health care, when change in the regulatory framework relaxes a constraint – such as the recent US healthcare bill, under which it (presumably) becomes easier to share health records among medical professionals if knowledge of pre-existing conditions is no longer a critical privacy concern. An example from climate modeling: software that was originally developed as part of a PhD project intended for use by just one person eventually grows into a vast legacy system, because it turns out to be a really useful model for the community to use. And another: the move from single site development (which is how nearly all climate models were developed) to geographically distributed development, now that it’s getting increasingly hard to get all the necessary expertise under one roof, because of the increasing diversity of science included in the models.

We think there are lots of interesting studies to be done of what happens to the software development processes for different species of software when such contextual factors change.

Finally, we talked a bit about the challenge of finding metrics that are valid across the vastly different contexts of the various software engineering species we identified. Experience with trying to measure defect rates in climate models suggests that it is much harder to make valid comparisons than is generally presumed in the software literature. There really has not been any serious consideration of these various contextual factors and their impact on software practices in the literature, and hence we might need to re-think a lot of the ways in which claims for generality are handled in empirical software engineering studies. We spent some time talking about the specific case of defect measurements, but I’ll save that for a future post.

Kate asked the question last week “How do you stay sane” (while fighting the misinformation campaigns and worrying about our prospects for averting dangerous climate change). Kate’s post reminded me of a post I did last year on climate trauma, and specifically the essay by Gillian Caldwell, in which she compares the emotional burnout that many of us feel when dealing with climate change with other types of psychological trauma. I originally read this at a time when I was overdoing it, working late into the evenings, going to bed exhausted, and then finding myself unable to sleep because my head was buzzing with everything I’d just been working on. Gillian’s essay struck a chord.

I took on board many of the climate trauma survival tips, and in particular, I started avoiding climate related work in the evenings. My blogging rate went down and I started sleeping and exercising properly again. But good habits can be hard to maintain, and I realise in the last few months I was overdoing it again. As it was March break last week, we took a snap decision to take some time off, and took the kids skiing in Quebec. We even managed to fit in trips to Ottawa and Montreal en route, as the kids hadn’t been to either city.

The trip was great, but wasn’t 100% effective as a complete break. I was reminded of climate change throughout: I didn’t need a coat in Ottawa (in March!!) and we picnicked outdoors in Montreal (in March!!). There’s no snow left in the Laurentides (except on the ski slopes); and we found ourselves skiing in hot sunshine (which meant by mid-afternoon the slopes were covered in piles of wet slush). The ski operators told us they normally stay open through mid-April, but that looks extremely unlikely this year. And sure enough, I return to the news that Canada has experienced the warmest winter ever recorded, and we’re on course for the hottest year ever. It can’t be good news for the ski industry.

And it’s not good news for me because I’m now back to blogging late into the evening again…

14. March 2010 · 6 comments · Categories: humour

The Subversion book (it's turtles all the way down)

Here’s the funniest comment from when I visited NCAR the other week. We were talking over dinner about how just about anything the scientists say and do now will be twisted out of context, to try and prove a conspiracy. Never mind “tricks” and “data manipulation”. What happens when the ignoranti find out that the tool used to manage the code for the climate models is called Subversion?

13. March 2010 · 1 comment · Categories: ICSE 2010

We’re gearing up our plans for the second international workshop on software research and climate change (WSRCC-2), to be held in Cape Town on May 3 (in conjunction with ICSE-2010). The workshop follows from a successful WSRCC-1 we held in the fall at Oopsla/Onward! (See also my summary of the brainstorming session).

One of the biggest challenges for the workshop in Cape Town is to accommodate participation by people who can’t be there. After all, there is irony in the size of the carbon footprints for many of us to travel all the way to South Africa, and many of the organizing committee members felt it’s too far to travel. We’ve ruled out the idea of video-conferencing (our experience is that the technology and bandwidth at conference centres just isn’t reliable enough). However, after a little brainstorming, we came up with some interesting ideas:

  • Invite people to submit youtube-style videos, to be posted on the conference website. The best of these will be shown in a session at the workshop;
  • Make full use of twitter and friendfeed to connect with remote participants, perhaps projecting the feeds up on the screen during the workshop. (tags are ready – twitter: #wsrcc-2; friendfeed: wsrcc-2-may-2010);
  • Have one session at the workshop opened up to audio conferencing. The second afternoon session would work best for this, as it will permit participation from most timezones: it’ll be evening in India, afternoon in Europe, and morning in N. & S. America. And I’m led to believe that the Aussies and Japanese are always happy to stay up all night anyway…
  • And I was keen to experiment with embodied social proxies, but I don’t think we’ll be able to get the kit together for this year…

Anyway, I’d be interested in more ideas, and encourage everyone to participate, either physically or remotely. The draft program is up already.

Oh, and I’m really looking forward to the closing keynote at ICSE this year: Sir David King, talking about Planning for Climate Change in the 21st Century.

Nature news runs some very readable articles on climate science, but is unfortunately behind a paywall. Which is a shame because they really should be widely read. Here’s a couple of recent beauties:

The Real Holes in Climate Science, (published 21 Jan 2010) points out that climate change denialists keep repeating long debunked myths about things they believe undermine the science. Meanwhile, in the serious scientific literature, there are some important open questions over real uncertainties in the science (h/t to AH). These are discussed openly in the IPCC reports (see for example, the 59 robust findings and 55 uncertainties listed in section 6 of the Technical Summary for WG1). None of these uncertainties pose a serious challenge to our basic understanding of climate change, but they do prevent absolute certainty about any particular projection. Not only that, many of these uncertainties suggest a strong application of the precautionary principle, because many of them suggest the potential for the IPCC to be underestimating the seriousness of climate change. The Nature News article identifies the following as particularly relevant:

  • Regional predictions. While the global models do a good job of simulating global trends in temperature, they often do poorly on fine-grained regional projections. Geographic features, such as mountain ridges, which mark the boundary of different climatic zones, occur at scales much smaller than the typical grids in GCMs, which means the GCMs get these zonal boundaries wrong, especially when coarse-grain predictions are downscaled.
  • Precipitation. As the IPCC report made clear, many of the models disagree even on the sign of the change in rainfall over much of the globe, especially for winter projections. The differences are due to uncertainties over convection processes. Worryingly, studies of recent trends (published after the IPCC report was compiled) indicate the models are underestimating precipitation changes, such as the drying of the subtropics.
  • Aerosols. Estimates of the effect on climate from airborne particles (mainly from industrial pollution) vary by an order of magnitude. Some aerosols (e.g. sulphates) induce a cooling effect by reflecting sunlight, while others (e.g. black carbon) produce a warming effect by absorbing sunlight. The extent to which these aerosols are masking the warming we’re already ‘owed’ from increased greenhouse gases is hard to determine.
  • Temperature reconstructions prior to the 20th century. The Nature News article discusses at length the issues in the tree ring data used as one of the proxies for reconstructing past temperature records, prior to the instrumental data from the last 150 years. The question of what causes the tree ring data to diverge from instrumental records in recent decades is obviously an interesting question, but to me it seems to be of marginal importance to climate science.

The Climate Machine (published 24 Feb 2010) describes the Hadley Centre’s HadGEM-2 as an example of the current generation of earth system models, and discusses the challenges of capturing more and more earth systems into the models (h/t to JH). The article quotes many of the modelers I’ve been interviewing about their software development processes. Of particular interest is the discussion about the growing complexity of these models, once other earth systems processes are added: clouds, trees, tundra, land ice, and … pandas (the inclusion of pandas in the models is an in-joke in the modeling community). There is likely to be a limit to the growth of this complexity, simply because the task of managing the contributions of a growing (and diversifying) group of experts gets harder and harder. The article also points out that one interesting result is likely to be an increase in some uncertainty ranges from these models in the next IPCC report, due to the additional variability introduced from these additional earth system processes.

I would post copies of the full articles, but I’m bound to get takedown emails from Macmillan publishing. But I guess they’re unlikely to object if I respond to emails requesting copies from me for research and education purposes…

I’ve just ordered the book “A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming” by Paul Edwards. It’s out next month, and I’m looking forward to reading it. I found out about the book a couple of weeks ago, while idly browsing Paul’s website while on the phone with him. What I didn’t realise, until today, is that Spencer Weart’s wonderful account of the history of general circulation models (an absolute must read!), which I’ve dipped into many times, is based originally on Paul’s work. Small world, huh?

On March 30, David Mackay, author of Sustainable Energy without the Hot Air, will be giving the J Tuzo Wilson lecture in the dept of Physics (Details of the time/location here). Here’s the abstract for his talk:

How easy is it to get off our fossil fuel habit? What do the fundamental limits of physics say about sustainable energy? Could a typical “developed” country live on its own renewables? The technical potential of renewables is often said to be “huge” – but we need to know how this “huge” resource compares with another “huge”: our huge power consumption. The public discussion of energy policy needs numbers, not adjectives. In this talk I will express power consumption and sustainable production in a single set of personal, human-friendly units. Getting off fossil fuels is not going to be easy, but it is possible.

The book itself is brilliant (and freely available online). But David’s visit is even more relevant, because it will give us a chance to show him a tool our group has been developing to facilitate and share the kinds of calculations that David does so well in the book.

We started from the question of how to take “back of the envelope” calculations and make them explicitly shareable over the web. And not just shareable, but to turn them into structured objects that can be discussed, updated, linked to evidence and so on (in much the same way that wikipedia entries are). Actually, the idea started with Jono’s calculations for the carbon footprint of “paper vs. screen”. When he first showed me his results, we got into a discussion of how other people might validate his calculations, and customize them for different contexts (e.g. for different hardware setups, different parts of the world with different energy mixes, etc). He came up with a graphical layout for the calculations, and we speculated how we would apply version control to this, make it a live calculator (so that changes in the input assumptions propagate like they would in a spreadsheet), and give each node its own URL, so that it can be attached to discussions, sources of evidence, etc. We brainstormed a long list of other features we’d want in such a tool, and we’re now busy creating a first prototype.
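
To give a flavour of the “live calculator” idea, here is a minimal Python sketch. It is not our prototype, just an illustration under stated assumptions: the class names, the BASE_URL, and all of the numbers are hypothetical placeholders. Each node is either an input assumption or a formula over other nodes, and derived values are recomputed on demand, so a change to an assumption shows up everywhere downstream.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

BASE_URL = "https://example.org/calc"  # hypothetical address, for illustration


@dataclass
class Node:
    """One step in a calculation: an input assumption or a derived quantity."""
    name: str
    value: Optional[float] = None                     # set for assumptions
    inputs: List[str] = field(default_factory=list)   # names of upstream nodes
    formula: Optional[Callable[..., float]] = None    # set for derived nodes

    @property
    def url(self) -> str:
        # Each node gets its own address, so discussions and evidence can link to it.
        return f"{BASE_URL}/{self.name}"


class Calculation:
    """A small graph of nodes, evaluated on demand (pull-based)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def assume(self, name: str, value: float) -> None:
        """Add or update an input assumption."""
        self.nodes[name] = Node(name, value=value)

    def derive(self, name: str, inputs: List[str], formula: Callable[..., float]) -> None:
        """Add a quantity computed from other nodes."""
        self.nodes[name] = Node(name, inputs=list(inputs), formula=formula)

    def value(self, name: str) -> float:
        """Evaluate a node, recomputing from the current assumptions."""
        node = self.nodes[name]
        if node.formula is None:
            return node.value
        return node.formula(*(self.value(i) for i in node.inputs))


# Placeholder numbers only; these are not Jono's figures.
calc = Calculation()
calc.assume("screen_power_watts", 50.0)
calc.assume("reading_hours", 2.0)
calc.assume("grid_kg_co2_per_kwh", 0.5)
calc.derive("energy_kwh", ["screen_power_watts", "reading_hours"],
            lambda watts, hours: watts * hours / 1000.0)
calc.derive("emissions_kg_co2", ["energy_kwh", "grid_kg_co2_per_kwh"],
            lambda kwh, intensity: kwh * intensity)

before = calc.value("emissions_kg_co2")
calc.assume("grid_kg_co2_per_kwh", 0.2)  # customize for a cleaner energy mix
after = calc.value("emissions_kg_co2")
print(before, after, calc.nodes["emissions_kg_co2"].url)
```

Recomputing on demand keeps the sketch short. A real tool would more likely cache results and push changes through the graph, and would need version history on each node, which this sketch leaves out.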

What kind of tool is it? My short description is that it is a crowd-sourced carbon calculator. I find existing carbon calculators very frustrating, because I can’t play with the assumptions in the calculations. Effectively, they are closed-source.

At the time we came up with these ideas, we were also working on modeling the analysis in David Mackay’s book (JP shows some preliminary results, here and here), to see if we could come up with a way of comparing his results with other books that also attempt to lay out solutions to climate change. We created a domain model (as a UML class diagram), which was big and ugly, and a strategic actor goal model (using i*), which helped to identify key stakeholders, but didn’t capture the main content of Mackay’s analysis. So we tried modeling a chapter of the book as a calculation in Jonathan’s style, and it worked remarkably well. So we realized we needed to actually build the tool. And the rest, as they say, is history. Or at least will be, once we have a demo-able prototype…

Stephen Schneider’s book, Science as a Contact Sport, makes fascinating reading, as he really gets his teeth into the disinformation campaign against climate science. However, the book was written before the denialist industry really cranked things up in the last few months, and now he’s angrier than ever, as is clear in this report yesterday about threats of violence against climate scientists (h/t to LG). By coincidence, I spoke to Schneider by phone yesterday – we were interviewing him as part of our analysis of the use of models such as C-ROADS in tools for online discussion, such as the collaboratorium. He’s very interested in such tools, partly because they have the potential to create a new generation of much more well-informed people (he noted that many of the people participating in the discussions in the collaboratorium are students), and partly because we need to find a much better way to get the science into the hands of the policymakers.

One of the things he said stuck out, in particular because it answers the question posed by Andrew Weaver at the end of the article above. Weaver says “good scientists are saying to themselves, ‘Why would I want to participate in the IPCC?'”. Steve Schneider told me he has a simple response to this – scientists have to keep doing the assessments and writing the reports, because you never know when they will be needed. When we get another climate shock (like Katrina, or the heatwaves in Europe in 2003), the media will suddenly look for the latest assessment report, and we have to have them ready. At that moment, all the effort is worthwhile. He pointed out this happened for the crisis over the ozone hole; when the media finally took notice, the scientific assessments were ready to hand, and it mattered. That’s why it’s important to keep at it.

I’m proposing a new graduate course for our department, to be offered next January (after I return from sabbatical). For the course calendar, I’m required to describe it in fewer than 150 words. Here’s what I have so far:

Climate Change Informatics

This introductory course will explore the contribution of computer science to the challenge of climate change, including: the role of computational models in understanding earth systems, the numerical methods at the heart of these models, and the software engineering techniques by which they are built, tested and validated; challenges in management of earth system data, such as curation, provenance, meta-data description, openness and reproducibility; tools for communication of climate science to broader audiences, such as simulations, games, educational software, collective intelligence tools, and the challenges of establishing reputation and trustworthiness for web-based information sources; decision-support tools for policymaking and carbon accounting, including the challenges of data collection, visualization, and trade-off analysis; the design of green IT, such as power-aware computing, smart controllers and the development of the smart grid.

Here’s the rationale:

This is an elective course. The aim is to bring a broad range of computer science graduate students together, to explore how their skills and knowledge in various areas of computer science can be applied to a societal grand challenge problem. The course will equip the students with a basic understanding of the challenges in tackling climate change, and will draw a strong link between the students’ disciplinary background and a series of inter-disciplinary research questions. The course crosscuts most areas of computer science.

And my suggested assessment modes:

  • Class participation: 10%
  • Term Paper 1 (essay/literature review): 40%
  • Term Paper 2 (software design or implementation): 40%
  • Oral Presentation or demo: 10%

Comments are most welcome – the proposal has to get through various committees before the final approval by the school of graduate studies. There’s plenty of room to tweak it in that time.

I like playing with data. One of my favourite tools is Gapminder, which allows you to plot graphs with any of a large number of country-by-country indicators, and even animate the graphs to see how they change over time. For example, looking at their CO2 emissions data, I could plot CO2 emissions against population (notice the yellow and red dots at the top: the US and China respectively – both with similar total annual emissions, but the US much worse on emissions per person). Press the ‘play’ button to see everyone’s emissions grow year-by-year, and play around with different indicators.

Gapminder looks good, but it’s lacking a narrative – these various graphs are only really interesting when used to tell a story. You get some sense of how to add narrative with the videos of presentations based on Gapminder, for example, this gapcast, which creates a narrative around the CO2 emissions data for the US and China.

But narrative on its own isn’t enough. We also need a way to challenge such narratives. For example, the gapcast above makes it clear that China’s gross annual emissions caught up with the US in the last couple of years, largely because of China’s reliance on coal as a cheap source of electricity. But what it doesn’t tell you is that a significant chunk (one fifth) of China’s emissions are due to carbon outsourcing: creation of goods and services exported to the west. In other words, one fifth of China’s emissions really ought to be counted as belonging to the US and Europe, because it’s our desire for cheap stuff that leads to all that coal being burnt. Without this information, the Gapminder graphs are misleading.

The only tool I’ve come across so far for challenging narratives in this way is: the blog. Many of my favourite blog posts are written as reactions (challenges) to someone else’s narrative. Which leads me to suggest that the primary value of a blog isn’t so much the contents per se, but the way each post creates new links between existing chunks of information, and adds commentary to those links. Now if only I had a tool for visualizing those links, so I could get an overview of who’s commenting on what, without having to read through thousands of blog posts…

09. March 2010 · 1 comment · Categories: advocacy

I blogged a couple of weeks ago about Skeptical Science, and in particular, the new iPhone app. Now there’s another site: Truth Fights Back, which is funded by US senator Kerry, and therefore has a strong US-centric approach, but I won’t hold that against the site, as both the design and content are excellent. Maybe the climate science community did need a swift kick in the pants to get its act together on communicating with the public.

(h/t to MT)

A few more late additions to my posts last month on climate science resources for kids:

  • NASA’s Climate Kids is a lively set of tools for younger kids, with games, videos (the ‘Climate Tales’ videos are wonderfully offbeat), and even information on future careers to help the planet.
  • Climate4Classrooms, put together by the British Council, includes a set of learning modules for kids ages 11+. Looks like a very nice set of resources.
  • And my kids came back from a school book fair last week with the DK Eyewitness book Climate Change, which is easily the best kids book I’ve seen yet (we have several other books in this series, and they’re all excellent). It’s a visual feast with photos and graphics, but it doesn’t skimp on the science, nor the policy implications, nor the available clean energy technologies – in fact it seems to cover everything! The parts that caught my eye (and are done very well) include a page on climate models, and a page entitled “What scares the scientists”, on climate tipping points.
08. March 2010 · 3 comments · Categories: blogging

Over the weekend, this blog quietly celebrated its first birthday. It was a nice moment to reflect on Serendipity’s first few words, back in March 2009:

Oh, and of course, Serendipity got a few birthday presents: a new “popular posts” page (see the menu bar at the top), a great new look on the iPhone, a new page navigation bar, and a live blogroll.