When I was visiting MPI-M earlier this month, I blogged about the difficulty of documenting climate models. The problem is particularly pertinent to questions of model validity and reproducibility, because the code itself is the result of a series of methodological choices by the climate scientists, which are entrenched in their design choices, and eventually become inscrutable. And when the code gets old, we lose access to these decisions. I suggested we need a kind of literate programming, which sprinkles the code among the relevant human representations (typically bits of physics, formulas, numerical algorithms, published papers), so that the emphasis is on explaining what the code does, rather than preparing it for a compiler to digest.

The problem with literate programming (at least in the way it was conceived) is that it requires programmers to give up using the program code as their organising principle, and maybe to give up traditional programming languages altogether. But there’s a much simpler way to achieve the same effect: provide an organising structure for existing programming languages and tools, one that mixes in non-code objects in an intuitive way. Imagine you had an infinitely large sheet of paper, and could zoom in and out, and scroll in any direction. Your chunks of code are laid out on the paper, in a spatial arrangement that means something to you, such that the layout helps you navigate. Bits of documentation, published papers, design notes, data files, parameterization schemes, etc can be placed on the sheet, near the code they are relevant to. When you zoom in on a chunk of code, the sheet becomes a code editor; when you zoom in on a set of math formulae, it becomes a LaTeX editor; and when you zoom in on a document, it becomes a word processor.
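
To make the metaphor a little more concrete, here’s a minimal sketch (purely hypothetical – not based on any existing tool, and in Python only for illustration) of the kind of object model such a sheet might need: every artefact on the sheet has a position and a type, and zooming in on it dispatches to the appropriate kind of editor.

```python
# Hypothetical sketch of a zoomable-canvas object model (illustrative only).
from dataclasses import dataclass

@dataclass
class CanvasItem:
    """An artefact placed somewhere on the infinite sheet."""
    name: str
    kind: str    # e.g. "code", "latex", "document", "dataset"
    x: float     # position on the sheet
    y: float

# Each kind of artefact maps to the editor the sheet turns into when you zoom in.
EDITORS = {
    "code": "code editor",
    "latex": "LaTeX editor",
    "document": "word processor",
    "dataset": "data viewer",
}

def zoom_in(item: CanvasItem) -> str:
    """Return which editor the sheet should become for this item."""
    return EDITORS.get(item.kind, "generic viewer")

if __name__ == "__main__":
    sheet = [
        CanvasItem("radiation_scheme.f90", "code", 10.0, 4.0),
        CanvasItem("radiative_transfer_eqns.tex", "latex", 11.5, 4.2),
        CanvasItem("design_notes.doc", "document", 12.0, 5.0),
    ]
    for item in sheet:
        print(f"Zooming in on {item.name}: open {zoom_in(item)}")
```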

Well, Code Canvas, a tool under development in Rob DeLine’s group at Microsoft Research, does most of this already. The code is laid out as though it was one big UML diagram, but as you zoom in you move fluidly into a code editor. The whole thing appeals to me because I’m a spatial thinker. Traditional IDEs drive me crazy, because they separate the navigation views from the code, and force me to jump from one pane to another to navigate. In the process, they hide the inherent structure of a large code base, and constrain me to see only a small chunk at a time. Which means these tools create an artificial separation between higher level views (e.g. UML diagrams) and the code itself, sidelining the diagrammatic representations. I really like the idea of moving seamlessly back and forth between the big picture views and actual chunks of code.

Code Canvas is still an early prototype, and doesn’t yet have the ability to mix in other forms of documentation (e.g. LaTeX) on the sheet (or at least not in any demo Microsoft are willing to show off), but the potential is there. I’d like to explore how we take an idea like this and customize it for scientific code development, where there is less of a strict separation of code and data than in other forms of programming, and where the link to published papers and draft reports is important. The infinitely zoomable paper could provide an intuitive unifying tool to bring all these different types of object together in one place, to be managed as a set. And the use of spatial memory to help navigate will be helpful when the set of things gets big.

I’m also interested in exploring the idea of using this metaphor for activities that don’t involve coding – for example complex decision-support for sustainability, where you need to move between spreadsheets, graphs & charts, model runs, and so on. I would lay out the basic decision task as a graph on the sheet, with sources of evidence connecting into the decision steps where they are needed. The sources of evidence could be text, graphs, spreadsheet models, live datafeeds, etc. And as you zoom in over each type of object, the sheet turns into the appropriate editor. As you zoom out, you get to see how the sources of evidence contribute to the decision-making task. Hmmm. Need a name for this idea. How about DecisionCanvas?

Update: Greg also pointed me to CodeBubbles and Intentional Software

Well, my intention to liveblog from interesting sessions is blown – the network connection in the meeting rooms is hopeless. One day, some conference will figure out how to provide reliable internet…

Yesterday I attended an interesting session in the afternoon on climate services. Much of the discussion was building on work done at the third World Climate Conference (WCC-3) in August, which set out to develop a framework for the provision of climate services. These would play a role akin to local, regional and global weather forecasting services, but focussing on risk management and adaptation planning for the impacts of climate change. Most important is the emphasis on combining observation and monitoring services and research and modeling services (both of which already exist) with a new climate services information system (I assume this would be distributed across multiple agencies around the world) and a system of user interfaces to deliver the information in the forms needed by different audiences. Rasmus at RealClimate discusses some of the scientific challenges.

My concern in reading the outcomes of WCC-3 was that it was all focussed on a one-way flow of information, with insufficient attention to understanding who the different users would be, and what they really need. I needn’t have worried – the AGU session demonstrated that there are plenty of people focussing on exactly this issue. I got the impression that there’s a massive international effort quietly putting in place the risk management and planning tools needed for us to deal with the impacts of a rapidly changing climate, but which is completely ignored by a media still obsessed with the “is it happening?” pseudo-debate. The extent of this planning for expected impacts would make a much more compelling media story, and one that matters, on a local scale, to everyone.

Some highlights from the session:

Mark Svoboda from the National Drought Mitigation Centre at the University of Nebraska, talking about drought planning in the US. He pointed out that drought tends to get ignored compared to other kinds of natural disasters (tornados, floods, hurricanes), presumably because it doesn’t happen within a daily news cycle. However, drought dwarfs the damage costs in the US from all other kinds of natural disasters except hurricanes. One problem is that population growth has been highest in the regions most subject to drought, especially in the southwest US. The NDMC monitoring program includes the only repository of drought impacts. Their US Drought Monitor has been very successful, but the next generation of tools needs better sources of data on droughts, so they are working on adding a drought reporter, doing science outreach, working with kids, etc. Even more important is improving the drought planning process, hence a series of workshops on drought management tools.

Tony Busalacchi from the Earth System Science Interdisciplinary Centre at the University of Maryland. Through a series of workshops in the CIRUN project, they’ve identified the need for forecasting tools, especially around risks such as sea level rise. In particular, there is a need for actionable information, but no service currently provides this. A climate information system is needed for policymakers, on scales of seasons to decades, tailorable to regions, and with the ability to explore “what-if” questions. Building this will require coupling models that have not been used together before, and synthesizing new datasets.

Robert Webb from NOAA, in Boulder, on experimental climate information services to support risk management. The key to risk assessment is understanding that it spans multiple timescales. Users of such services do not distinguish between weather and climate – they need to know about extreme weather events, and they need to know how such risks change over time. Climate change matters because of its impacts. Presenting the basic science and predictions of temperature change is irrelevant to most people – it’s the impacts that matter (his key quote: “It’s the impacts, stupid!”). Examples: water – droughts and floods, changes in snowpack, river stream flow, fire outlooks, and planning issues (urban, agriculture, health). He’s been working with the Climate Change and Western Water Group (CCAWWG) to develop a strategy on water management. How do you get people to plan and adapt? The key is to get people to think in terms of scenarios rather than deterministic forecasts.

Guy Brasseur from the German Climate Services Center, in Hamburg. The German adaptation strategy was developed by the German federal government, which appears to be way ahead of the US agencies in developing climate services. Guy emphasized the need for seamless prediction – a uniform ensemble system that builds from climate monitoring of the recent past and present, forward into the future, at different regional scales and timescales. Guy called for an Apollo-sized program to develop the infrastructure for this.

Kristen Averyt from the University of Colorado, talking about her “Climate services machine” (I need to get hold of the image for this – it was very nice). She’s been running workshops on Colorado-specific services, with breakout sessions focussed on the impacts and utility of climate information. She presented some evaluations of the success of these workshops, including a climate literacy test they have developed. For example, at one workshop the attendees got 63% of the answers correct at the beginning (and the wrong answers tended to cluster, indicating some important misperceptions). I need to get hold of this test – it sounds interesting. Kristen’s main point was that these workshops play an important role in reaching out to people of all ages, including kids, and getting them to understand how climate change will affect them.

Overall, the main message of this session was that while there have been lots of advances in our understanding of climate, these are still not being used for planning and decision-making.

First proper day of the AGU conference, and I managed to get to the (free!) breakfast for Canadian members, which was so well attended that the food ran out early. Do I read this as a great showing for Canadians at AGU, or just that we’re easily tempted with free food?

Anyway, on to the first of three poster sessions we’re involved in this week. This first poster was on TracSNAP, the tool that Ainsley and Sarah worked on over the summer:

Our tracSNAP poster for the AGU meeting.

The key idea in this project is that large teams of software developers find it hard to maintain an awareness of one another’s work, and cannot easily identify the appropriate experts for different sections of the software they are building. In our observations of how large climate models are built, we noticed it’s often hard to keep up to date with what changes other people are working on, and how those changes will affect things. TracSNAP builds on previous research that attempts to visualize the social network of a large software team (e.g. who talks to whom), and relate that to couplings between code modules that team members are working on. Information about the intra-team communication patterns (e.g. emails, chat sessions, bug reports, etc) can be extracted automatically from project repositories, as can information about dependencies in the code. TracSNAP extracts data automatically from the project repository to provide answers to questions such as “Who else recently worked on the module I am about to start editing?”, and “Who else should I talk to before starting a task?”. The tool extracts hidden connections in the software by examining modules that were checked into the repository together (even though they don’t necessarily refer to each other), and offers advice on how to approach key experts by identifying intermediaries in the social network. It’s still a very early prototype, but I think it has huge potential. Ainsley is continuing to work on evaluating it on some existing climate models, to check that we can pull out of the repositories the data we think we can.
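
To give a flavour of the kind of analysis involved (this is not TracSNAP’s actual code – just a minimal Python sketch, with made-up commits and file names), here’s how co-change coupling and a simple “who should I ask” query might be computed from repository history:

```python
# Illustrative sketch only: mining co-change coupling and expertise from
# commit history. The commits below are invented; a real tool would pull
# them automatically from the project repository.
from collections import Counter
from itertools import combinations

commits = [
    ("alice", ["ice_thermo.f90", "ice_albedo.f90"]),
    ("bob",   ["ocn_mixing.f90"]),
    ("alice", ["ice_thermo.f90", "ice_albedo.f90", "ice_grid.f90"]),
    ("carol", ["ice_albedo.f90", "atm_radiation.f90"]),
]

def cochange_counts(commits):
    """Count how often each pair of files is checked in together."""
    pairs = Counter()
    for _author, files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

def experts_for(filename, commits):
    """Rank authors by how often they have touched a given file."""
    return Counter(author for author, files in commits if filename in files)

if __name__ == "__main__":
    print("Hidden couplings:", cochange_counts(commits).most_common(3))
    print("Who to ask about ice_albedo.f90:",
          experts_for("ice_albedo.f90", commits).most_common())
```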

The poster session we were in, “IN11D. Management and Dissemination of Earth and Space Science Models”, seemed a little disappointing as there were only three posters (a fourth poster presenter hadn’t made it to the meeting). But what we lacked in quantity, we made up in quality. Next to my poster was David Bailey’s: “The CCSM4 Sea Ice Component and the Challenges and Rewards of Community Modeling”. I was intrigued by the second part of his title, so we got chatting about this. Supporting a broader community in climate modeling has a cost, and we talked about how university labs just cannot afford this overhead. However, it also comes with a number of benefits, particularly the existence of a group of people from different backgrounds who all take on some ownership of model development, and can come together to develop a consensus on how the model should evolve. With the CCSM, most of this happens in face-to-face meetings, particularly the twice-yearly user meetings. We also talked a little about the challenges of integrating the CICE sea ice model from Los Alamos with CCSM, especially given that CICE is also used in the Hadley model. Making it work in both models required some careful thinking about the interface, and hence more focus on modularity. David also mentioned people are starting to use the term kernelization as a label for the process of taking physics routines and packaging them so that they can be interchanged more easily.
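
I don’t know the details of how the CICE interfaces were restructured, but the general idea behind kernelization might look something like this minimal (entirely hypothetical) sketch: the physics routine sits behind a narrow interface, so that different host models can drive the same kernel without caring about its internals.

```python
# Hypothetical sketch of "kernelization": wrapping a physics routine behind
# a narrow interface so different host models can drive it interchangeably.
# The "physics" here is a toy placeholder, not anything from CICE or CCSM.
from abc import ABC, abstractmethod

class SeaIceKernel(ABC):
    """The minimal interface a host model needs from a sea ice component."""
    @abstractmethod
    def step(self, air_temp_c: float, sst_c: float, dt_seconds: float) -> float:
        """Advance ice thickness (m) by one timestep and return the new value."""

class ToySeaIce(SeaIceKernel):
    """Stand-in for a real sea ice model."""
    def __init__(self, thickness_m: float = 1.0):
        self.thickness_m = thickness_m

    def step(self, air_temp_c, sst_c, dt_seconds):
        # Crude placeholder: grow when the air and ocean are cold, melt when warm.
        growth_rate = -1e-7 * (air_temp_c + sst_c)   # metres per second
        self.thickness_m = max(0.0, self.thickness_m + growth_rate * dt_seconds)
        return self.thickness_m

def host_model_driver(kernel: SeaIceKernel):
    """Any host model only needs to know about the SeaIceKernel interface."""
    for day in range(3):
        thickness = kernel.step(air_temp_c=-20.0, sst_c=-1.8, dt_seconds=86400)
        print(f"day {day}: ice thickness = {thickness:.3f} m")

if __name__ == "__main__":
    host_model_driver(ToySeaIce())
```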

Dennis Shea’s poster, “Processing Community Model Output: An Approach to Community Accessibility”, was also interesting. To tackle the problem of making output from the CCSM more accessible to the broader CCSM community, the decision was taken to standardize on netCDF for the data, and to develop and support a standard data analysis toolset, based on the NCAR Command Language. NCAR runs regular workshops on the use of these data formats and tools, as part of its broader community support efforts (which of course illustrates David’s point about universities not being able to afford such support).
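
NCAR’s supported toolset is built on NCL, but just to illustrate what standardizing on a self-describing format like netCDF buys you, here’s a minimal Python sketch (using the netCDF4 library; the filename and variable name are made up) showing how little code is needed to inspect and analyse model output once everyone agrees on the format:

```python
# Illustrative sketch only: reading self-describing model output.
# The filename and variable name below are invented, and NCAR's own
# supported workflow uses the NCAR Command Language (NCL), not Python.
from netCDF4 import Dataset

with Dataset("ccsm_output_example.nc") as nc:   # hypothetical output file
    print(nc.variables.keys())                  # discover what the file contains
    ts = nc.variables["TS"]                     # e.g. a surface temperature field
    print(ts.units, ts.shape)                   # metadata travels with the data
    print("global mean:", ts[:].mean())         # a trivial analysis step
```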

The missing poster also looked interesting: Charles Zender from UC Irvine, comparing climate modeling practices with open source software practices. Judging from his abstract, Charles makes many of the same observations we made in our CiSE paper, so I was looking forward to comparing notes with him. Next time, I guess.

Poster sessions at these meetings are both wonderful and frustrating. Wonderful because you can wander down aisles of posters and very quickly sample a large slice of research, and chat to the poster owners in a freeform format (which is usually much better than sitting through a talk). Frustrating because poster owners don’t stay near their posters very long (I certainly didn’t – too much to see and do!), which means you get to note an interesting piece of work, and then never manage to track down the author to chat (and if you’re like me, you also forget to write down contact details for posters you noticed). However, I did manage to make notes on two to follow up on:

  • Joe Galewsky caught my attention with a provocative title: “Integrating atmospheric and surface process models: Why software engineering is like having weasels rip your flesh”
  • I briefly caught Brendan Billingsley of NSIDC as he was taking his poster down. It caught my eye because it was a reflection on software reuse in the Searchlight tool.

I wasn’t going to post anything about the CRU emails story (apart from my attempt at humour), because I think it’s a non-story. I’ve read a few of the emails, and it looks no different to the studies I’ve done of how climate science works. It’s messy. It involves lots of observational data sets, many of which contain errors that have to be corrected. Luckily, we have a process for that – it’s called science. It’s carried out by a very large number of people, any of whom might make mistakes. But it’s self-correcting, because they all review each other’s work, and one of the best ways to get noticed as a scientist is to identify and correct a problem in someone else’s work. There is, of course, a weakness in the way corrections to a previously published paper are usually handled. The corrections appear as letters and other papers in various journals, and don’t connect up well to the original paper. Which means you have to know an area well to understand which papers have stood the test of time and which have been discredited. Outsiders to a particular subfield won’t be able to tell the difference. They’ll also have a tendency to seize on what they think are errors, but which have actually already been addressed in the literature.

If you want all the gory details about the CRU emails, read the RealClimate posts (here and here) and the lengthy discussion threads that follow from them. Many of the scientists involved in the CRU emails comment in these threads, and the resulting picture of the complexity of the data and analysis that they have to deal with is very interesting. But of course, none of this will change anyone’s minds. If you’re convinced global warming is a huge conspiracy, you’ll just see scientists trying to circle the wagons and spin the story their way. If you’re convinced that AGW is now accepted as scientific fact, then you’ll see scientists tearing their hair out at the ignorance of their detractors. If you’re not sure about this global warming stuff, you’ll see a lively debate about the details of the science, none of which makes much sense on its own. Don’t venture in without a good map and compass. (Update: I’ve no idea who’s running this site, but it’s a brilliant deconstruction of the allegations about the CRU emails).

But one issue has arisen in many discussions about the CRU emails that touches strongly on the research we’re doing in our group. Many people have looked at the emails and concluded that if the CRU had been fully open with its data and code right from the start, there would be no issue now. This is, of course, a central question in Alicia’s research on open science. While open science is a great idea in principle, in practice there are many hurdles, including the fear of being “scooped”, the need to give appropriate credit, the problems of metadata definition and of data provenance, the cost of curation, the fact that software has a very short shelf-life, and so on. For the CRU dataset at the centre of the current kerfuffle, someone would have to go back to all the data sources, and re-negotiate agreements about how the data can be used. Of course, the anti-science crowd just think that’s an excuse.

However, for climate scientists there is another problem, which is the workload involved in being open. Gavin, at RealClimate, raises this issue in response to a comment that just putting the model code online is not sufficient. For example, if someone wanted to reproduce a graph that appears in a published paper, they’d need much more: the script files, the data sets, the parameter settings, the spinup files, and so on. Michael Tobis argues that much of this can be solved with good tools and lots of automated scripts. Which would be fine if we were talking about how to help other scientists replicate the results.
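
To make Michael’s point a little more concrete, here’s a minimal sketch (hypothetical – not tied to any particular model’s workflow) of the kind of run manifest such automated scripts might capture, so that a published figure could be traced back to the exact code version, parameter settings and input files that produced it:

```python
# Hypothetical sketch of a "run manifest" for reproducibility.
# Nothing here is specific to any real climate model's workflow.
import hashlib
import json
import subprocess
import time

def sha256_of(path):
    """Fingerprint an input file so we can tell later whether it changed."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def capture_manifest(input_files, parameters, out="run_manifest.json"):
    manifest = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        # Record exactly which code revision produced the results.
        "code_version": subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True).stdout.strip(),
        "parameters": parameters,
        "inputs": {path: sha256_of(path) for path in input_files},
    }
    with open(out, "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

# Example usage (file and parameter names are made up):
# capture_manifest(["forcing.nc", "spinup_restart.nc"],
#                  {"co2_ppm": 380, "timestep_s": 1800})
```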

Unfortunately, in the special case of climate science, that’s not what we’re talking about. A significant factor in the reluctance of climate scientists to release code and data is to protect themselves from denial-of-service attacks. There is a very well-funded and PR-savvy campaign to discredit climate science. Most scientists just don’t understand how to respond to this. Firing off hundreds of requests to CRU to release data under the freedom of information act, despite each such request being denied for good legal reasons, is the equivalent of frivolous lawsuits. But even worse, once datasets and codes are released, it is very easy for an anti-science campaign to tie the scientists up in knots trying to respond to their attempts to poke holes in the data. If the denialists were engaged in an honest attempt to push the science ahead, this would be fine (although many scientists would still get frustrated – they are human too).

But in reality, the denialists don’t care about the science at all; their aim is a PR campaign to sow doubt in the minds of the general public. In the process, they effect a denial-of-service attack on the scientists – the scientists can’t get on with doing their science because their time is taken up responding to frivolous queries (and criticisms) about specific features of the data. And their failure to respond to each and every such query will be trumpeted as an admission that an alleged error is indeed an error. In such an environment, it is perfectly rational not to release data and code – it’s better to pull up the drawbridge and get on with the drudgery of real science in private. That way the only attacks are complaints about lack of openness. Such complaints are bothersome, but much better than the alternative.

In this case, because the science is vitally important for all of us, it’s actually in the public interest that climate scientists be allowed to withhold their data. Which is really a tragic state of affairs. The forces of anti-science have a lot to answer for.

Update: Robert Grumbine has a superb post this morning on why openness and reproducibility are intrinsically hard in this field.

Brad points out that much of my discussion of a research agenda in climate change informatics focusses heavily on strategies for emissions reduction (aka Mitigation) and neglects the equally important topic of ensuring communities can survive the climate changes that are inevitable (aka Adaptation). Which is an important point. When I talk about the goal of keeping temperatures to below a 2°C rise, it’s equally important to acknowledge that we’ve almost certainly already lost any chance of keeping peak temperature rise much below 2°C.

Which means, of course, that we have some serious work to do, in understanding the impact of climate change on existing infrastructure, and in integrating an awareness of the likely climate change issues into new planning and construction projects. This is, of course, what Brad’s Adaptation and Impacts research division focusses on. There are some huge challenges to do with how we take the data we have (e.g. see the datasets in the CCCSN), downscale these to provide more localized forecasts, and then figure out how to incorporate these into decision making.
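
As a (grossly) simplified illustration of what downscaling can involve, here’s a sketch of the delta method: take the coarse model’s projected change and apply it to a higher-resolution observed climatology. Real statistical or dynamical downscaling is far more sophisticated; the numbers and grid sizes here are invented.

```python
# Minimal sketch of delta-method downscaling (invented numbers, for illustration).
import numpy as np

# Coarse GCM projection: mean temperature change (deg C) on a 2x2 grid.
gcm_change = np.array([[2.1, 2.4],
                       [1.9, 2.2]])

# Higher-resolution observed climatology on a 4x4 grid (e.g. station-based).
obs_climatology = 15.0 + np.random.default_rng(0).normal(0.0, 1.5, (4, 4))

# Spread each coarse cell's change over the fine cells it covers,
# then add it to the observed climatology.
fine_change = np.kron(gcm_change, np.ones((2, 2)))
downscaled_projection = obs_climatology + fine_change

print(np.round(downscaled_projection, 1))
```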

One existing tool to point out is the World Bank’s ADAPT, which is intended to help analyze projects in the planning stage, and identify risks related to climate change adaptation. This is quite a different decision-making task from the ones addressed by the emissions reduction tools I’ve been looking at. But just as important.

Here’s an interesting competition (with cash prizes), organized by the Usability Professionals’ Association, to develop a new concept or product, using user-centred design principles, that aims to cut energy consumption or reduce pollution.

And here’s another: A Video Game Creation Contest to create a playable video game that uses earth observations to help address environmental problems.