Well, this is what it comes down to. Code reviews on national TV. Who would have thought it? And, by the standards of a Newsnight code review, the code in question doesn’t look so good. Well, it’s not surprising it doesn’t. It’s the work of one untrained programmer, working in an academic environment, trying to reconstruct someone else’s data analysis. And given the way in which the CRU files were stolen, we can be pretty sure this is not a random sample of code from the CRU; it’s handpicked to be one of the worst examples.

Watch the clip from about 2:00. They compare the code with some NASA code, although we’re not told what exactly. Well, duh. If you compare the experimental code written by one scientist on his own, which has clearly not been through any code review, with that produced by NASA’s engineering processes, of course it looks messy. For any programmers reading this: How many of you can honestly say that you’d come out looking good if I trawled through your files, picked the worst piece of code lying around in there, and reviewed it on national TV? And the “software engineer” on the program says it’s “below the standards you would expect in any commercial software”. Well, I’ve seen a lot of commercial software. It’s a mix of good, bad, and ugly. If you’re deliberate with your sampling technique, you can find a lot worse out there.

Does any of this matter? Well, a number of things bug me about how this is being presented in the media and blogosphere:

  • The first, obviously, is the ridiculous conclusion that many people seem to be making that poor code quality in one deliberately selected program file somehow invalidates all of climate science. As cdavid points out towards the end of this discussion, if you’re going to do that, then you pretty much have to throw out most results in every field of science over the past few decades for the same reason. Bad code is endemic in science.
  • The slightly more nuanced, but equally specious, conclusion that bugs in this code mean that research results at the CRU must be wrong. Eric Raymond picks out an example he calls blatant data-cooking, but he is quite clearly fishing for results: he ignores the fact that the correction he picks on is never actually used, except in parts of the code that are commented out. He’s quote mining for effect, and given Raymond’s political views, it’s not surprising. Just for fun, someone quote mined Raymond’s own code, and was horrified at what he found. Clearly we have to avoid all open source code immediately because of this…? The problem, of course, is that none of these quote miners have gone to the trouble to establish what this particular code is, why it was written, and what it was used for.
  • The widely repeated assertion that this just proves that scientific software must be made open source, so that a broader community of people can review it and improve it.

It’s this last point that bothers me most, because at first sight, it seems very reasonable. But actually, it’s a red herring. To understand why, we need to pick apart two different arguments:

  1. An argument that when a paper is published, all of the code and data on which it is based should be released so that other scientists (who have the appropriate background) can re-run it and validate the results. In fields with complex, messy datasets, this is exceedingly hard, but might be achievable with good tools. The complete toolset needed to do this does not exist today, so just calling for making the code open source is pointless. Much climate code is already open source, but that doesn’t mean anyone in another lab can repeat a run and check the results. The problems of reproducibility have very little to do with whether the code is open – the key problem is to capture the entire scientific workflow and all data provenance. This is very much an active line of research, and we have a long way to go. In the absence of this, we rely on other scientists testing the results with other methods, rather than repeating the same tests. Which is the way it’s done in most branches of science.
  2. An argument that there is a big community of open source programmers out there who could help. This is based on a fundamental misconception about why open source software development works. It matters how the community is organised, and how contributions to the code are controlled by a small group of experts. It matters that it works as a meritocracy, where programmers need to prove their ability before they are accepted into the inner developer group. And most of all, it matters that the developers are the domain experts. For example, the developers who built the Linux kernel are world-class experts on operating systems and computer architecture. Quite often they don’t realize just how high their level of expertise is, because they hang out with others who also have the same level of expertise. Likewise, it takes years of training to understand the dynamics of atmospheric physics in order to be able to contribute to the development of a climate simulation model. There is not a big pool of people with the appropriate expertise to contribute to open source climate model development, and nor is there ever likely to be, unless we expand our PhD programs in climatology dramatically (I’m sure the nay-sayers would like that!).
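The “data provenance” problem in the first point is concrete enough to sketch. Here is a minimal, hypothetical illustration (the file names and manifest fields are my invention, not any existing tool) of what capturing provenance for a single analysis run might involve:

```python
import hashlib
import json
import sys
import time

def sha256(path):
    """Checksum a file so a later re-run can verify it saw identical input."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def record_provenance(inputs, params, out="provenance.json"):
    """Write a manifest tying a result to the exact data and settings used."""
    manifest = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "python": sys.version,
        "parameters": params,
        "inputs": {p: sha256(p) for p in inputs},
    }
    with open(out, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest
```

A real solution would also have to capture the code version, the toolchain, and every intermediate dataset in the workflow, which is why this remains an active research area rather than a solved problem.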

We do know that most of the heavy-duty climate models are built at large government research centres, rather than at universities. Dave Randall explains why this is: the operational overhead of developing, testing and maintaining a Global Climate Model is far too high for university-based researchers. The universities use (parts of) the models, and do further data analysis on both observational data and outputs from the big models. Much of this is the work of individual PhD students or postdocs. Which means that the argument that all code written at all stages of climate research must meet some gold standard of code quality is about as sensible as saying no programmer should ever be allowed to throw together a script to test out if some idea works. Of course bad code will get written in a hurry. What matters is that as a particular line of research matures, the coding practices associated with it should mature too. And we have plenty of evidence that this is true of climate science: the software practices used at the Hadley Centre for their climate models are better than most commercial software practices. Furthermore, they manage to produce code that appears to be less buggy than just about any other code anywhere (although we’re still trying to validate this result, and understand what it means).

None of this excuses bad code written by scientists. But the sensible response to this problem is to figure out how to train scientists to be better programmers, rather than argue that some community of programmers other than scientists can take on the job instead. The idea of open source climate software is great, but it won’t magically make the code better.

Justyna sent me a pointer to another group of people exploring an interesting challenge for computing and software technology: The Crisis Mappers Net. I think I can characterize this as another form of collective intelligence, harnessed to mobile networks and visual analytics, to provide rapid response to humanitarian emergencies. And of course, after listening to George Monbiot in the debate last night, I’m convinced that over the coming decades, the crises to be tackled will increasingly be climate related (forest fires, floods, droughts, extreme weather events, etc).

Criteria for tools that communicate climate science to a broader audience (click for bigger)

I gave my talk last night to TorCHI on Usable Climate Science. I think it went down well, especially considering that I hadn’t finished preparing the slides, and had just gotten off the plane from Seattle. I’ll post the slides soon, once I have a chance to tidy them up. But, judging by the questions and comments, one slide in particular seemed to resonate.

I put this together when trying to organize my thoughts about what’s wrong with a number of existing tools/websites in the space of climate science communication. I’ll post the critique of existing tools soon, but I guess I should first explain the criteria:

  • Trustworthy (i.e. the audience must be able to trust the information content):
    • Collective Curation captures the idea that a large community of people is responsible for curating the information content. The extreme example is, of course, Wikipedia.
    • Open means that we can get inside and see how it’s all put together. Open source and open data probably need no explanation, but I also want to get across the idea of “open reasoning” – for example, users need access to the calculations and assumptions built into any tool that gives recommendations for energy choices.
    • Provenance means that we know where the information came from, and can trace it back to source. Especially important is the ability to trace back to peer-reviewed scientific literature, or to trusted experts.
    • And the tool should help to build a community by connecting people with one another, through sharing of their knowledge.
  • Appropriate (i.e. the form and content of the information must be appropriate to the intended audience(s)):
    • Accessible to the audience – information must build on what people already know, and be provided in a form that allows them to assimilate it (Vygotsky’s Zone of Proximal Development captures this idea well).
    • Contextualized means that the tool provides information that is appropriate to the audience’s specific context, or can be customized for that context. For example, information about energy choices depends on location.
    • Zoomable means that different users can zoom in for more detailed information if they wish. I particularly like the idea of infinite zoom shown off well in this demo. But I don’t just mean visually zoomable – I mean zoomable in terms of information detail, so people who want to dive into the detailed science can if they wish.
  • Effective (i.e. actually works at communicating information and stimulating action):
    • Narrative force is something that seems to be missing from most digital media – the tool must tell a story rather than just provide information.
    • Get the users to form the right mental models so that they understand the science as more than just facts and figures, and understand how to think about the risks.
    • Support exploration to allow users to follow their interests. Most web-based tools are good at this, but often at the expense of narrative force.
    • Give the big picture. For climate change this is crucial – we need to encourage systems thinking if we’re ever going to get good at collective decision making.
  • Compelling (i.e. something that draws people in):
    • Cool, because coolness is how viral marketing works. If it’s cool people will tell others about it.
    • Engaging, so that people want to use it and are drawn in by it.
    • Fun and Entertaining, because we’re often in danger of being too serious. This is especially important for stuff targeted at kids. If it’s not as much fun as the latest video games, then we’re already losing their attention.

During the talk, one of the audience members suggested adding actionable to my list, i.e. it actually leads to appropriate action, changes in behaviour, etc. I’m kicking myself for forgetting this, and can’t now decide whether it belongs under effective, or is an entirely new category. I’d welcome suggestions.

I’m visiting Microsoft this week, and am fascinated to discover the scope and expertise in climate change at Microsoft Research (MSR), particularly through their Earth, Energy and Environment theme (also known as E3).

Microsoft External Research (MER) is the part of MSR that builds collaborative research relationships with academic and other industrial partners. It is currently headed by Tony Hey, who was previously director of the UK’s e-science initiative (and obviously, as a fellow Brit, he’s a delight to chat to). Tony is particularly passionate about the need to communicate science to the broader public.

The E3 initiative within MER is headed by Dan Fay, who has a fascinating blog, where I found a pointer to a thought-provoking essay by Bill Gail (of the Virtual Earth project) in the Bulletin of the American Meteorological Society on Achieving Climate Sustainability. Bill opens up the broader discussion of what climate sustainability actually means (beyond the narrow focus on physical properties such as emissions of greenhouse gases). The core of his essay is the observation that humanity has now replaced nature as the controller of the entire climate system, despite the fact that we’re hopelessly ill-equipped either philosophically or politically to take on this role right now (this point was also made very effectively at the end of Gwynne Dyer’s book, and in Michael Tobis’ recent talk on the Cybernetics of Climate). More interestingly, Bill argues that we began to assume this role much earlier than most people think: about 7,000 years ago at the dawn of agricultural society, when we first started messing around with existing ecosystems.

The problem I have with Bill’s paper, though, is that he wants to expand the scope of the climate policy framework at a time when even the limited, weak framework we have is under attack from a concerted misinformation campaign. Back to that point about public understanding of the science: we have to teach the public the unavoidable physical facts about greenhouse gases first, to get at least a broad consensus on the urgent need to move to a zero-carbon economy. We can’t start the broader discussion about longer-term climate sustainability until we establish a broad public understanding of the physics of greenhouse gases.

Random Hacks of Kindness is a codejam sponsored by the World Bank, Google, Microsoft and Yahoo!, aimed at building useful software for important social/humanitarian causes. The upcoming event in the Bay Area in November is focussed on software for disaster relief.

However, they’re also proposing to run a 4-day codejam at the COP15 meeting in Copenhagen in December, aimed at building useful software for tackling climate change. I’ve submitted a few ideas of my own, plus our categorization of software challenges. Here are some of my suggestions:

  • Make the IPCC website more accessible. E.g. provide a visual index of the figures and charts in the reports; develop “guided tours” through the material for different kinds of users, based on their various interests; provide pointers into key sections that respond to common misunderstandings.
  • Provide simple dynamic visualizations of the key physical processes. Along the lines of the tutorial developed by John Sterman, but perhaps with less text and more freedom to play with the model.
  • Provide a simpler, web-based interface to the Java Climate model, that allows policymakers to quickly see the effects of different policy options.
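To give a sense of how small the starting point for the second suggestion could be: a zero-dimensional energy balance model captures the core physics (sunlight in, thermal radiation out) in a handful of lines. This is my own toy sketch, not the Sterman tutorial, and the emissivity value is a crude tuning knob rather than a measured quantity:

```python
# Zero-dimensional energy balance: the planet warms or cools until
# absorbed sunlight balances outgoing thermal radiation.
SIGMA = 5.67e-8   # Stefan-Boltzmann constant, W/m^2/K^4
S0 = 1361.0       # solar constant, W/m^2
ALBEDO = 0.3      # fraction of incoming sunlight reflected

def step(temp_k, emissivity=0.61, heat_capacity=1e8, dt=86400.0):
    """Advance the surface temperature by one time step (seconds).

    emissivity < 1 is a crude stand-in for the greenhouse effect:
    lowering it traps more outgoing radiation and warms the planet.
    """
    absorbed = S0 * (1 - ALBEDO) / 4          # averaged over the sphere
    emitted = emissivity * SIGMA * temp_k**4  # outgoing thermal radiation
    return temp_k + dt * (absorbed - emitted) / heat_capacity

def equilibrium(temp_k=273.0, **kw):
    """Iterate until the temperature stops changing."""
    while True:
        new = step(temp_k, **kw)
        if abs(new - temp_k) < 1e-6:
            return new
        temp_k = new
```

Wrapping that loop in an interactive slider for emissivity or albedo would already make a playable visualization: lower the emissivity (a stand-in for adding greenhouse gases) and watch the equilibrium temperature climb.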

What else?

I’m teaching our introductory software engineering course this term, for which the students will be working on a significant software development project over the term. The main aim of the course is to get the students thinking about and using good software development practices and tools, and we organise the term project as an agile development effort, with a number of small iterations during the term. The students have to figure out for themselves what to build at each iteration.

For a project, I’ve challenged the students to design new uses for the Canadian Climate Change Scenarios Network. This service makes available the data on possible future climate change scenarios from the IPCC datasets, for a variety of end users. The current site allows users to run basic queries over the data set, and have the results returned either as raw data, or in a variety of visualizations. The main emphasis is on regional scenarios for Canada, so the service offers some basic downscaling, and the ability to couple the scenarios with other regional data sources, such as data from weather monitoring stations in the region. However, to use the current service, you need to know quite a bit about the nature of the data: it asks you which models you’re interested in; which years you want data for (assumes you know something about 30-year averages); which scenarios you want (assumes you know something about the standard IPCC scenarios); which region you want (in latitude and longitude); and which variables you want (assumes you know something about what these variables measure). The current design reflects the needs of the primary user group for which the service was developed – (expert) researchers working on climate impacts and adaptation.
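To make the gap concrete, a query against the current service amounts to something like the following sketch, where every field assumes specialist knowledge. The field names, lookup tables, and helper function here are invented for illustration; this is not the actual CCCSN interface:

```python
# Hypothetical illustration only: these field names and lookup tables
# are invented, not the actual CCCSN interface.
EXPERT_QUERY = {
    "model": "CGCM3",            # assumes you know the climate models
    "scenario": "SRES-A2",       # assumes you know the IPCC scenarios
    "period": (2041, 2070),      # assumes the 30-year-average convention
    "region": {"lat": (42.0, 45.0), "lon": (-83.0, -79.0)},  # raw lat/lon
    "variables": ["tas", "pr"],  # assumes you know what these measure
}

def friendly_query(place, topic):
    """Sketch of a translation layer: map a plain-language request
    onto the expert query. The lookup tables are toy placeholders."""
    regions = {"southern ontario": {"lat": (42.0, 45.0), "lon": (-83.0, -79.0)}}
    topics = {"growing season": ["tas", "pr"]}  # temperature, precipitation
    query = dict(EXPERT_QUERY)
    query["region"] = regions[place.lower()]
    query["variables"] = topics[topic.lower()]
    return query
```

Something like this translation layer, built out for a real audience with real vocabulary, is roughly the shape of what I’m asking the students to design.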

The challenge for the students on my course is to extend the service for new user groups. For example, farmers who want to know something about likely effects of climate change on growing seasons, rainfall and heat stress in their local area. High school students studying climate and weather. Politicians who want to understand what the latest science tells us about the impacts of climate change on the constituencies they represent. Activists who want to present a simple clear message to policymakers about the need for policy changes. And so on.

I have around 60 students on the course, working in teams of 4. I’m hoping that the various teams will come up with a variety of ideas for how to make this dataset useful to new user groups, and I’ve challenged them to be imaginative. But more suggestions are always welcome…

Next Wednesday, we’re organising demos of our students’ summer projects, prior to the Science 2.0 conference. The demos will be in BA1200 (in the Bahen Centre), Wed July 29, 10am-12pm. All welcome!

Here are the demos to be included (running order hasn’t been determined yet – we’ll probably pull names out of a hat…):

  • Basie (demo’d by Bill Konrad, Eran Henig and Florian Shkurti)
    Basie is a lightweight, web-based software project forge with an emphasis on inter-component communication.  It integrates revision control, issue tracking, mailing lists, wikis, status dashboards, and other tools that developers need to work effectively in teams.  Our mission is to make Basie simple enough for undergraduate students to master in ten minutes, but powerful enough to support large, distributed teams.
  • BreadCrumbs (demo’d by Brent Mombourquette).
    When researching, the context in which a relevant piece of information is found is often overlooked. However, the journey is as important as the destination. BreadCrumbs is a Firefox extension designed to capture this journey, and therefore the context, by maintaining a well-structured and dynamic graph of an Internet browsing session. It keeps track of both the chronological order in which websites are visited and the link-by-link path. In addition, through providing simple tools to leave notes to yourself, an accurate record of your thought process and reasoning for browsing the documents that you did can be preserved with limited overhead. The resulting session can then be saved and revisited at a later date, with little to no time spent trying to recall the relevance or semantic relations of documents in an unordered bookmark folder, for example. It can also be used to provide information to a colleague, by not just pointing them to a series of web pages, but by providing them a trail to follow and embedded personal notes. BreadCrumbs maintains the context so that you can focus on the content.
  • Feature Diagram Tool (demo’d by Ebenezer Hailemariam)
    We present a software tool to assist software developers work with legacy code. The tool reverse engineers “dependency diagrams” from Java code through which developers can perform refactoring actions. The tool is a plug-in for the Eclipse integrated development environment.
  • MarkUs (demo’d by Severin Gehwolf, Nelle Varoquaux and Mike Conley)
    MarkUs is a Web application that recreates the ease and flexibility of grading assignments with pen on paper. Graders fill in a marking scheme and directly annotate students’ work.  MarkUs also provides support for other aspects of assignment delivery and management.  For example, it allows students or instructors to form groups for assignment collaboration, and allows students to upload their work for grading. Instructors can also create and manage group or solo assignments, and assign graders to mark and annotate the students’ work quickly and easily.
  • MyeLink: drawing connections between OpenScience lab notes (demo’d by Maria Yancheva)
    A MediaWiki extension which facilitates connections between related wiki pages, notes, and authors. Suitable for OpenScience research communities who maintain a wiki collection of experiment pages online. Provides search functionality on the basis of both structure and content of pages, as well as a user interface allowing the customization of options and displaying an embedded preview of results.
  • TracSNAP – Trac Social Network Analysis Plugin (demo’d by Ainsley Lawson and Sarah Strong)
    TracSNAP is a suite of simple tools to help contributors make use of information about the social aspect of their Trac coding project. It tries to help you to: Find out which other developers you should be talking to, by giving contact suggestions based on commonality of file edits; Recognize files that might be related to your current work, by showing you which files are often committed at the same time as your files; Get a feel for who works on similar pieces of functionality based on discussion in bug and feature tickets, and by edits in common; Visualize your project’s effective social network with graphs of who talks to whom; Visualize coupling between files based on how often your colleagues edit them together.
  • VizExpress (demo’d by Samar Sabie)
    Graphs are effective visualizations because they present data quickly and easily. vizExpress is a MediaWiki extension that inserts user-customized tables and graphs in wiki pages without having to deal with complicated wiki syntax. When editing a wiki page, the extension adds a special toolbar icon for opening the vizExpress wizard. You can provide data to the wizard by browsing to a local Excel or CSV file, or by typing (or copying/pasting) data. You can choose from eight graph types and eight graph-coloring schemes, and apply further formatting such as titles, dimensions, limits, and legend position. Once a graph is inserted in a page, you can easily edit it by restarting the wizard or modifying a simple vizExpress tag.
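The co-commit heuristic TracSNAP describes (files that are often committed together) is simple to sketch. Here is my own minimal version, not TracSNAP’s actual code; it just counts how often pairs of files appear in the same commit:

```python
from collections import Counter
from itertools import combinations

def co_commit_counts(commits):
    """Count how often each pair of files appears in the same commit.

    `commits` is a list of file-path lists, one per commit; the pair
    counts are a crude proxy for logical coupling between files.
    """
    pairs = Counter()
    for files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

def suggest_related(pairs, path, top=3):
    """Return the files most often changed together with `path`."""
    scores = Counter()
    for (a, b), n in pairs.items():
        if a == path:
            scores[b] += n
        elif b == path:
            scores[a] += n
    return [f for f, _ in scores.most_common(top)]
```

Given a version-control log parsed into per-commit file lists, `suggest_related` yields the files most entangled with the one you are editing: a crude but useful signal for what to retest, and for whose edits overlap with yours.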

[Update: the session was a great success, and some of the audience have blogged about it already: e.g. Cameron Neylon]