I’m too busy writing today to post much, but here’s a couple of videos I found hilarious, to fill in the time…
The Now Show [Hat tip for this one to ClimateExtremist]
Applying systems thinking to computing, climate and sustainability
I posted a while back the introduction to a research proposal in climate change informatics. And I also posted a list of potential research areas, and a set of criteria by which we might judge climate informatics tools. But I didn’t say what kinds of things we might want climate informatics tools to do. Here’s my first attempt, based on a slide I used at the end of my talk on usable climate science:
What I was trying to lay out on this slide was a wide range of possible activities for which we could build software tools, combining good visualizations, collaborative support, and compelling user interface design. If we are to improve the quality of the public discourse on climate change, and support the kind of collective decision making that leads to effective action, we need better tools for all four of these areas:
A reader writes to me from New Zealand, arguing that climate science isn’t a science at all because there is no possibility to conduct experiments. This misconception appears to be common, even among some distinguished scientists, who presumably have never taken the time to read many published papers in climatology. The misconception arises because people assume that climate science is all about predicting future climate change, and because such predictions are for decades/centuries into the future, and we only have one planet to work with, we can’t check to see if these predictions are correct until it’s too late to be useful.
In fact, predictions of future climate are really only a by-product of climate science. The science itself concentrates on improving our understanding of the processes that shape climate, by analyzing observations of past and present climate, and testing how well we understand them. For example, detection/attribution studies focus on the detection of changes in climate that are outside the bounds of natural variability (using statistical techniques), and determining how much of the change can be attributed to each of a number of possible forcings (e.g. changes in: greenhouse gases, land use, aerosols, solar variation, etc). Like any science, the attribution is done by creating hypotheses about possible effects of each forcing, and then testing those hypotheses. Such hypotheses can be tested by looking for contradictory evidence (e.g. other episodes in the past where the forcing was present or absent, to test how well the hypothesis explains these too). They can also be tested by encoding each hypothesis in a climate model, and checking how well it simulates the observed data.
I’m not a climate modeler, but I have conducted anthropological studies of how climate modelers work. Climate models are developed slowly and carefully over many years, as scientific instruments. One of the most striking aspects of climate model development is that it is an experimental science in the strongest sense. What do I mean?
Well, a climate model is a detailed theory of some subset of the earth’s physical processes. Like all theories, it is a simplification that focusses on those processes that are salient to a particular set of scientific questions, and approximates or ignores those processes that are less salient. Climate modelers use their models as experimental instruments. They compare the model run with the observational record for some relevant historical period. They then come up with a hypothesis to explain any divergences between the run and the observational record, and make a small improvement to the model that the hypothesis predicts will reduce the divergence. They then run an experiment in which the old version of the model acts as a control, and the new version is the experimental case. By comparing the two runs with the observational record, they determine whether the predicted improvement was achieved (and whether the change messed anything else up in the process). After a series of such experiments, the modelers will eventually either accept the change to the model as an improvement to be permanently incorporated into the model code, or they discard it because the experiments failed (i.e. they failed to give the expected improvement). By doing this day after day, year after year, the models get steadily more sophisticated, and steadily better at simulating real climatic processes.
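The control-versus-experiment comparison described above can be sketched in a few lines of Python. This is only an illustration, not how any modelling centre actually scores its runs: the choice of root-mean-square error as the measure of divergence, the function names, the variable names and all of the numbers are assumptions made up for the example.

```python
import math

def rmse(simulated, observed):
    """Root-mean-square error between a model run and the observational record."""
    if len(simulated) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))

def compare_runs(control, experiment, observed):
    """For each observed variable, return (control error, experiment error)."""
    return {
        name: (rmse(control[name], obs), rmse(experiment[name], obs))
        for name, obs in observed.items()
    }

# Made-up numbers, purely for illustration.
observed = {
    "temperature": [0.10, 0.15, 0.22, 0.30, 0.41],
    "precipitation": [2.0, 2.1, 1.9],
}
control = {  # old version of the model
    "temperature": [0.05, 0.08, 0.15, 0.20, 0.28],
    "precipitation": [2.1, 2.2, 2.0],
}
experiment = {  # version with the proposed change
    "temperature": [0.09, 0.13, 0.21, 0.27, 0.38],
    "precipitation": [2.3, 2.4, 2.2],
}

report = compare_runs(control, experiment, observed)
for name, (old_err, new_err) in report.items():
    verdict = "improved" if new_err < old_err else "got worse"
    print(f"{name}: control={old_err:.3f} experiment={new_err:.3f} ({verdict})")
```

In this invented case the change brings the temperature closer to the observations but pushes the precipitation further away, which is exactly the kind of side effect the modelers look for before accepting a change.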
This experimental approach has another interesting effect: the software appears to be tested much more thoroughly than most commercial software. Whether this actually delivers higher quality code is an interesting question; however, it is clear that the approach is much more thorough than most industry practices for software regression testing.
I’m delighted to announce that my student, Jonathan Lung, has started a blog. Jonathan’s PhD is on how we reduce energy consumption in computing. Unlike much work on green IT, he’s decided to focus on the human behavioural aspects of this, rather than hardware optimization. His first two posts are fascinating:
As Jorge points out, this almost completes my set of grad student bloggers. We’ve been experimenting with blogging as a way of structuring research – a kind of open notebook science. Personally, I find it extremely helpful as a way of forcing me to write down ideas (rather than just thinking them), and for furthering discussion of ideas through the comments. And, just as importantly, it’s a way of letting other researchers know about what you’re working on – grad students’ future careers depend on them making a name for themselves in their chosen research community.
Of course, there’s a downside: grad students tend to worry about being “scooped”, by having someone else take their ideas, do the studies, and publish them first. My stock response is something along the lines of “research is 99% perspiration and 1% inspiration” – the ideas themselves, while important, are only a tiny part of doing research. It’s the investigation of the background literature and the implementation (design an empirical study, build a tool, develop a new theory, …etc) that matters. Give the same idea to a bunch of different grad students, and they will all do very different things with it, all of which (if the students are any good) ought to be publishable.
On balance, I think the benefits of blogging your way through grad school vastly outweigh the risks. Now if only my students updated their blogs more regularly… (hint, hint).
Interesting article by Andrew Jones entitled Are we taking supercomputing code seriously?:
Part of the problem is that in their rush to do science, scientists fail to spot the software for what it is: the analogue of the experimental instrument. Consequently, it needs to be treated with the same respect that a physical experiment would receive.
Any reputable physical experiment would ensure the instruments are appropriate to the job and have been tested. They would be checked for known error behaviour in the parameter regions of study, and chosen for their ability to give a satisfactory result within a useful timeframe and budget. Those same principles should apply to a software model.
In a blog post that was picked up by the Huffington Post, Bill Gates writes about why we need innovation, not insulation. He sets up the piece as a choice of emphasis between two emissions targets: 30% reduction by 2025, and 80% reduction by 2050. He argues that the latter target is much more important, and hence we should focus on big R&D efforts to innovate our way to zero-carbon energy sources for transportation and power generation. In doing so, he pours scorn on energy conservation efforts, arguing, in effect, that they are a waste of time. Which means Bill Gates didn’t do his homework.
What matters is not some arbitrary target for any given year. What matters is the path we choose to get there. This is a prime example of the communications failure over climate change. Non-scientists don’t bother to learn the basic principles of climate science, and scientists completely fail to get the most important ideas across in a way that helps people make good judgements about strategy.
The key problem in climate change is not the actual emissions in any given year. It’s the cumulative emissions over time. The carbon we emit by burning fossil fuels doesn’t magically disappear. About half is absorbed by the oceans (making them more acidic). The rest cycles back and forth between the atmosphere and the biosphere, for centuries. And there is also tremendous lag in the system. The ocean warms up very slowly, so it takes decades for the Earth to reach a new equilibrium temperature once concentrations in the atmosphere stabilize. This means even if we could immediately stop adding CO2 to the atmosphere today, the earth would keep warming for decades, and wouldn’t cool off again for centuries. It’s going to be tough adapting to the warming we’re already committed to. For every additional year that we fail to get emissions under control we compound the problem.
What does this mean for targets? It means that how soon we get started on reducing emissions matters much more than the eventual destination at any particular future year, because any reduction in annual emissions achieved in the next few years means that we save that amount of emissions every year going forward. The longer we take to get the emissions under control, the harder we make the problem.
A picture might help:

Three different emissions pathways to give 67% chance of limiting global warming to 2°C (From the Copenhagen Diagnosis, Figure 22)
The graph shows three different scenarios, each with the same cumulative emissions (i.e. the area under each curve is the same). If we get emissions to peak next year (the green line), it’s a lot easier to keep cumulative emissions under control. If we delay, and allow emissions to continue to rise until 2020, then we can forget about 80% reductions by 2050. We’ll have set ourselves the much tougher task of 100% emissions reductions by 2040!
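The arithmetic behind "same area under each curve" is easy to check with a toy calculation. The sketch below is not the Copenhagen Diagnosis analysis: it assumes emissions rise in a straight line until the peak and then fall in a straight line to zero, and the budget, starting rate and growth figures are invented round numbers. It only shows how a fixed cumulative budget forces steeper cuts the longer the peak is delayed.

```python
def pathway(peak_delay, budget=1000.0, start_rate=35.0, growth=0.5):
    """Toy emissions pathway with a fixed cumulative budget.

    Emissions start at start_rate, rise by `growth` each year for
    peak_delay years, then fall linearly to zero. All default values are
    illustrative assumptions, not real data.
    Returns (peak rate, years of decline, required cut per year).
    """
    emitted_before_peak = start_rate * peak_delay + 0.5 * growth * peak_delay ** 2
    peak_rate = start_rate + growth * peak_delay
    remaining = budget - emitted_before_peak
    if remaining <= 0:
        raise ValueError("the budget is used up before emissions peak")
    # The decline is a triangle: area = 0.5 * peak_rate * decline_years.
    decline_years = 2.0 * remaining / peak_rate
    return peak_rate, decline_years, peak_rate / decline_years

for delay in (0, 5, 10):
    peak, years, cut = pathway(delay)
    print(f"peak after {delay:2d} years: {years:5.1f} years to reach zero, "
          f"cutting {cut:.2f} units every year")
```

With these made-up numbers, waiting ten years to peak roughly halves the time left to reach zero and more than doubles the size of the cut needed every single year.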
The thing is, there are plenty of good analyses of how to achieve early emissions reductions by deploying existing technology. Anyone who argues we should put our hopes in some grand future R&D effort to invent new technologies clearly does not understand the climate science. Or perhaps can’t do calculus.

Here’s the abstract for a paper (that I haven’t written) on how to write an abstract:
How to Write an Abstract
The first sentence of an abstract should clearly introduce the topic of the paper so that readers can relate it to other work they are familiar with. However, an analysis of abstracts across a range of fields shows that few follow this advice, nor do they take the opportunity to summarize previous work in their second sentence. A central issue is the lack of structure in standard advice on abstract writing, so most authors don’t realize the third sentence should point out the deficiencies of this existing research. To solve this problem, we describe a technique that structures the entire abstract around a set of six sentences, each of which has a specific role, so that by the end of the first four sentences you have introduced the idea fully. This structure then allows you to use the fifth sentence to elaborate a little on the research, explain how it works, and talk about the various ways that you have applied it, for example to teach generations of new graduate students how to write clearly. This technique is helpful because it clarifies your thinking and leads to a final sentence that summarizes why your research matters.
[I’m giving my talk on how to write a thesis to our grad students soon. Can you tell?]
Update 16 Oct 2011: This page gets lots of hits from people googling for “how to write an abstract”. So I should offer a little more constructive help for anyone still puzzling what the above really means. It comes from my standard advice for planning a PhD thesis (but probably works just as well for scientific papers, essays, etc.).
The key trick is to plan your argument in six sentences, and then use these to structure the entire thesis/paper/essay. The six sentences are:
The abstract I started with summarizes my approach to abstract writing as an abstract. But I suspect I might have been trying to be too clever. So here’s a simpler one:
(1) In widgetology, it’s long been understood that you have to glomp the widgets before you can squiffle them. (2) But there is still no known general method to determine when they’ve been sufficiently glomped. (3) The literature describes several specialist techniques that measure how wizzled or how whomped the widgets have become during glomping, but all of these involve slowing down the glomping, and thus risking a fracturing of the widgets. (4) In this thesis, we introduce a new glomping technique, which we call googa-glomping, that allows direct measurement of whifflization, a superior metric for assessing squiffle-readiness. (5) We describe a series of experiments on each of the five major types of widget, and show that in each case, googa-glomping runs faster than competing techniques, and produces glomped widgets that are perfect for squiffling. (6) We expect this new approach to dramatically reduce the cost of squiffled widgets without any loss of quality, and hence make mass production viable.
When I was visiting MPI-M earlier this month, I blogged about the difficulty of documenting climate models. The problem is particularly pertinent to questions of model validity and reproducibility, because the code itself is the result of a series of methodological choices by the climate scientists, which are entrenched in their design choices, and eventually become inscrutable. And when the code gets old, we lose access to these decisions. I suggested we need a kind of literate programming, which sprinkles the code among the relevant human representations (typically bits of physics, formulas, numerical algorithms, published papers), so that the emphasis is on explaining what the code does, rather than preparing it for a compiler to digest.
The problem with literate programming (at least in the way it was conceived) is that it requires programmers to give up using the program code as their organising principle, and maybe to give up traditional programming languages altogether. But there’s a much simpler way to achieve the same effect. It’s to provide an organising structure for existing programming languages and tools, but which mixes in non-code objects in an intuitive way. Imagine you had an infinitely large sheet of paper, and could zoom in and out, and scroll in any direction. Your chunks of code are laid out on the paper, in a spatial arrangement that means something to you, such that the layout helps you navigate. Bits of documentation, published papers, design notes, data files, parameterization schemes, etc can be placed on the sheet, near to the code that they are relevant to. When you zoom in on a chunk of code, the sheet becomes a code editor; when you zoom in on a set of math formulae, it becomes a LaTeX editor, and when you zoom in on a document it becomes a word processor.
Well, Code Canvas, a tool under development in Rob Deline’s group at Microsoft Research does most of this already. The code is laid out as though it was one big UML diagram, but as you zoom in you move fluidly into a code editor. The whole thing appeals to me because I’m a spatial thinker. Traditional IDEs drive me crazy, because they separate the navigation views from the code, and force me to jump from one pane to another to navigate. In the process, they hide the inherent structure of a large code base, and constrain me to see only a small chunk at a time. Which means these tools create an artificial separation between higher level views (e.g. UML diagrams) and the code itself, sidelining the diagrammatic representations. I really like the idea of moving seamlessly back and forth between the big picture views and actual chunks of code.
Code Canvas is still an early prototype, and doesn’t yet have the ability to mix in other forms of documentation (e.g. LaTeX) on the sheet (or at least not in any demo Microsoft are willing to show off), but the potential is there. I’d like to explore how we take an idea like this and customize it for scientific code development, where there is less of a strict separation of code and data than in other forms of programming, and where the link to published papers and draft reports is important. The infinitely zoomable paper could provide an intuitive unifying tool to bring all these different types of object together in one place, to be managed as a set. And the use of spatial memory to help navigate will be helpful, when the set of things gets big.
I’m also interested in exploring the idea of using this metaphor for activities that don’t involve coding – for example complex decision-support for sustainability, where you need to move between spreadsheets, graphs & charts, model runs, and so on. I would lay out the basic decision task as a graph on the sheet, with sources of evidence connecting into the decision steps where they are needed. The sources of evidence could be text, graphs, spreadsheet models, live datafeeds, etc. And as you zoom in over each type of object, the sheet turns into the appropriate editor. As you zoom out, you get to see how the sources of evidence contribute to the decision-making task. Hmmm. Need a name for this idea. How about DecisionCanvas?
Update: Greg also pointed me to CodeBubbles and Intentional Software
Many moons ago, I talked about the danger of being distracted by our carbon footprints. I argued that the climate crisis cannot be solved by voluntary action by the (few) people who understand what we’re facing. The problem is systemic, and so adequate responses must be systemic too.
In the years since 9/11, it’s gotten steadily more frustrating to fly, as the lines build up at the security checkpoints, and we have to put more and more of what we’re wearing through the scanners. This doesn’t dissuade people from flying, but it does make them much more grumpy about it. And it doesn’t make them any safer, either. Bruce Schneier calls it “Security Theatre”: countermeasures that make it look like something is being done at the airport, but which make no difference to actual security. Bruce runs a regular competition to think up a movie plot that will create a new type of fear and hence enable the marketing of a new type of security theatre countermeasure.
Now Jon Udell joins the dots and points out that we have an equivalent problem in environmentalism: Carbon Theatre. Except that he doesn’t quite push the concept far enough. In Jon’s version, carbon theatre is competitions and online quizzes and so on, in which we talk about how we’re going to reduce our carbon footprints more than the next guy, rather than actually getting on and doing things that make a difference.
I think carbon theatre is more insidious than that. It’s the very idea that an appropriate response to climate change is to make personal sacrifices. Like giving up flying. And driving. And running the air conditioner. And so on. The problem is, we approach these things like a dieter approaches the goal of losing weight. We make personal sacrifices that are simply not sustainable. For most people, dieting doesn’t work. It doesn’t work because, although the new diet might be healthier, it’s either less convenient or less enjoyable. Which means sooner or later, you fall off the wagon, because it’s simply not possible to maintain the effort and sacrifice indefinitely.
Carbon theatre means focussing on carbon footprint reduction without fixing the broader system that would make such changes sustainable. You can’t build a solution to climate change by asking people to give up the conveniences of modern life. Oh, sure, you can get people to set personal goals, and maybe even achieve them (temporarily). But if it requires a continual effort to sustain, you haven’t achieved anything. If it involves giving up things that you enjoy, and that others around you continue to enjoy, then it’s not a sustainable change.
I’ve struggled for many years to justify the fact that I fly a lot. A few long-haul flights in a year add enough to my carbon footprint that just about anything else I do around the house is irrelevant. Apparently a lot of scientists worry about this too. When I blogged about the AGU meeting, the first comment worried about the collective carbon footprint of all those scientists flying to the meeting. George Marshall worries that this undermines the credibility of climate scientists (or maybe he’s even arguing that it means climate scientists still don’t really believe their own results). Somehow all these people seem to think it’s more important for climate scientists to give up flying than it is for, say, investment bankers or oil company executives. Surely that’s completely backwards??
This is, of course, the wrong way to think about the problem. If climate scientists unilaterally give up flying, it will make no discernible difference to the global emissions of the airline industry. And it will make the scientists a lot less effective, because it’s almost impossible to do good science without the networking and exchange of ideas that goes on at scientific conferences. And even if we advocate that everyone who really understands the magnitude of the climate crisis also gives up flying, it still doesn’t add up to a useful solution. We end up giving the impression that if you believe that climate change is a serious problem you have to make big personal sacrifices. Which makes it just that much harder for many people to accept that we do have a problem.
For example, I’ve tried giving up short haul flights in favour of taking the train. But often the train is more expensive and more hassle. If there is no direct train service to my destination, it’s difficult to plan a route and buy tickets, and the trains are never timed to connect in the right way. By making the switch, I’m inconveniencing myself, for no tangible outcome. I’d be far more effective getting together with others who understand the problem, and fixing the train system to make it cheaper and easier. Or helping existing political groups who are working towards this goal. If we make the train cheaper and easier than flying, it will be easy to persuade large numbers of people to switch as well.
So, am I arguing that working on our carbon footprints is a waste of time? Well, yes and no. It’s a waste of time if you’re doing it by giving up stuff that you’d rather not give up. However, it is worth it if you find a way to do it that could be copied by millions of other people with very little effort. In other words, if it’s not (massively) repeatable and sustainable, it’s probably a waste of time. We need changes that scale up, and we need to change the economic and policy frameworks to support such changes. That won’t happen if the people who understand what needs doing focus inwards on their own personal footprints. We have to think in terms of whole systems.
There is a caveat: sacrifices such as temporarily giving up flying are worthwhile if done as a way of understanding the role of flying in our lives, and the choices we make about travel; they might also be worthwhile if done as part of a coordinated political campaign to draw attention to a problem. But as a personal contribution to carbon reduction? That’s just carbon theatre.
Weather and climate are different. Weather varies tremendously from day to day, week to week, season to season. Climate, on the other hand is average weather over a period of years; it can be thought of as the boundary conditions on the variability of weather. We might get an extreme cold snap, or a heatwave at a particular location, but our knowledge of the local climate tells us that these things are unusual, temporary phenomena, and sooner or later things will return to normal. Forecasting the weather is therefore very different from forecasting changes in the climate. One is an initial value problem, and the other is a boundary value problem. Let me explain.
Good weather forecasts depend upon an accurate knowledge of the current state of the weather system. You gather as much data as you can about current temperatures, winds, clouds, etc., feed them all into a simulation model and then run it forward to see what happens. This is hard because the weather is an incredibly complex system. The amount of information needed is huge: both the data and the models are incomplete and error-prone. Despite this, weather forecasting has come a long way over the past few decades. Through a daily process of generating forecasts, comparing them with what happened, and thinking about how to reduce errors, we have incredibly accurate 1- and 3-day temperature forecasts. Accurate forecasts of rain, snow, and so on for a specific location are a little harder because of the chance that the rainfall will be in a slightly different place (e.g. a few kilometers away) or at a slightly different time than the model forecasts, even if the overall amount of precipitation is right. Hence, daily forecasts give fairly precise temperatures, but put probabilistic values on things like rain (Probability of Precipitation, PoP), based on knowledge of the uncertainty factors in the forecast. The probabilities are known because we have a huge body of previous forecasts to compare with.
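One way to picture what "comparing with previous forecasts" buys you is a simple check: take all the past days on which the forecast said, say, 30% chance of rain, and see how often it actually rained. The Python sketch below does that bookkeeping. It is an assumption about the general idea, not a description of how any forecasting centre computes PoP, and the forecast history is invented.

```python
from collections import defaultdict

def observed_frequencies(history):
    """Group past forecasts by stated probability and return how often it rained.

    `history` is an iterable of (stated_probability, it_rained) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # probability -> [rainy days, all days]
    for probability, it_rained in history:
        counts[probability][0] += int(it_rained)
        counts[probability][1] += 1
    return {p: rainy / total for p, (rainy, total) in sorted(counts.items())}

# Invented history: ten days forecast at 30%, ten days forecast at 70%.
history = (
    [(0.3, True)] * 4 + [(0.3, False)] * 6 +
    [(0.7, True)] * 7 + [(0.7, False)] * 3
)

for stated, actual in observed_frequencies(history).items():
    print(f"forecast said {stated:.0%}, it rained on {actual:.0%} of those days")
```

If the stated probabilities and the observed frequencies line up over a long history, the probabilities can be trusted; in this made-up history the 30% forecasts turn out slightly too low.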
The limit on useful weather forecasts seems to be about one week. There are inaccuracies and missing information in the inputs, and the models are only approximations of the real physical processes. Hence, the whole process is error prone. At first these errors tend to be localized, which means the forecast for the short term (a few days) might be wrong in places, but is good enough in most of the region we’re interested in to be useful. But the longer we run the simulation for, the more these errors multiply, until they dominate the computation. At this point, running the simulation for longer is useless. 1-day forecasts are much more accurate than 3-day forecasts, which are better than 5-day forecasts, and beyond that it’s not much better than guessing. However, steady improvements mean that 3-day forecasts are now as accurate as 2-day forecasts were a decade ago. Weather forecasting centres are very serious about reviewing the accuracy of their forecasts, and set themselves annual targets for accuracy improvements.
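The way small input errors multiply until they dominate can be demonstrated with a much simpler chaotic system than the weather. The sketch below uses the logistic map, a standard textbook example of chaos. Using it as a stand-in for a forecast model is an assumption made for illustration only, as are the starting value and the error threshold. Two runs start almost identically, and we count the steps until they disagree noticeably.

```python
def logistic_step(x, r=4.0):
    """One step of the logistic map, which is chaotic for r = 4."""
    return r * x * (1.0 - x)

def steps_until_divergence(x0, initial_error, threshold=0.1, max_steps=1000):
    """Count steps until two runs that started `initial_error` apart differ
    by more than `threshold`. Returns None if that never happens."""
    a, b = x0, x0 + initial_error
    for step in range(1, max_steps + 1):
        a, b = logistic_step(a), logistic_step(b)
        if abs(a - b) > threshold:
            return step
    return None

for error in (1e-3, 1e-6, 1e-9):
    steps = steps_until_divergence(0.2, error)
    print(f"initial error {error:g}: runs disagree after {steps} steps")
```

Making the initial error a million times smaller buys only a modest number of extra steps before the two runs part company, which is the toy version of why better initial data extends the useful forecast range by days rather than months.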
A number of things help in this process of steadily improving forecasting accuracy. Improvements to the models help, as we get better and better at simulating physical processes in the atmosphere and oceans. Advances in high performance computing help too – faster supercomputers mean we can run the models at a higher resolution, which means we get more detail about where exactly energy (heat) and mass (winds, waves) are moving. But all of these improvements are dwarfed by the improvements we get from better data gathering. If we had more accurate data on current conditions, and could get it into the models faster, we could get big improvements in the forecast quality. In other words, weather forecasting is an “initial value” problem. The biggest uncertainty is knowledge of the initial conditions.
One result of this is that weather forecasting centres (like the UK Met Office) can get an instant boost to forecasting accuracy whenever they upgrade to a faster supercomputer. This is because the weather forecast needs to be delivered to a customer (e.g. a newspaper or TV station) by a fixed deadline. If the models can be made to run faster, the start of the run can be delayed, giving the meteorologists more time to collect newer data on current conditions, and more time to process this data to correct for errors, and so on. For this reason, the national weather forecasting services around the world operate many of the world’s fastest supercomputers.
Hence weather forecasters are strongly biased towards data collection as the most important problem to tackle. They tend to regard computer models as useful, but of secondary importance to data gathering. Of course, I’m generalizing – developing the models is also a part of meteorology, and some meteorologists devote themselves to modeling, coming up with new numerical algorithms, faster implementations, and better ways of capturing the physics. It’s quite a specialized subfield.
Climate science has the opposite problem. Using pretty much the same model as for numerical weather prediction, climate scientists will run the model for years, decades or even centuries of simulation time. After the first few days of simulation, the similarity to any actual weather conditions disappears. But over the long term, day-to-day and season-to-season variability in the weather is constrained by the overall climate. We sometimes describe climate as “average weather over a long period”, but in reality it is the other way round – the climate constrains what kinds of weather we get.
For understanding climate, we no longer need to worry about the initial values, we have to worry about the boundary values. These are the conditions that constrain the climate over the long term: the amount of energy received from the sun, the amount of energy radiated back into space from the earth, the amount of energy absorbed or emitted from oceans and land surfaces, and so on. If we get these boundary conditions right, we can simulate the earth’s climate for centuries, no matter what the initial conditions are. The weather itself is a chaotic system, but it operates within boundaries that keep the long term averages stable. Of course, a particularly weird choice of initial conditions will make the model behave strangely for a while, at the start of a simulation. But if the boundary conditions are right, eventually the simulation will settle down into a stable climate. (This effect is well known in chaos theory: the butterfly effect expresses the idea that the system is very sensitive to initial conditions, and attractors are what cause a chaotic system to exhibit a stable pattern over the long term)
To handle this potential for initial instability, climate modellers create “spin-up” runs: pick some starting state, run the model for say 30 years of simulation, until it has settled down to a stable climate, and then use the state at the end of the spin-up run as the starting point for science experiments. In other words, the starting state for a climate model doesn’t have to match real weather conditions at all; it just has to be a plausible state within the bounds of the particular climate conditions we’re simulating.
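A spin-up run can be illustrated with the simplest possible climate model: a single global temperature, warmed by absorbed sunlight and cooled by radiation to space. This zero-dimensional energy balance model is nothing like a real climate model (it has no weather and no chaos), and the albedo, effective emissivity, heat capacity and time step below are rough assumed values chosen only so that the numbers look plausible. It does show the key point: wildly different starting states end up in the same place, and only a change to the boundary conditions moves that place.

```python
SIGMA = 5.670374419e-8      # Stefan-Boltzmann constant, W m^-2 K^-4
SECONDS_PER_YEAR = 3.156e7

def spin_up(initial_temp, years=30.0, solar=1361.0, albedo=0.3,
            emissivity=0.61, heat_capacity=2.0e8, dt_years=0.05):
    """Run a toy energy balance model forward and return the final temperature (K).

    solar, albedo and emissivity play the role of boundary conditions;
    initial_temp is the arbitrary starting state. The default values are
    illustrative assumptions.
    """
    temp = initial_temp
    dt = dt_years * SECONDS_PER_YEAR
    absorbed = solar * (1.0 - albedo) / 4.0           # W m^-2, averaged over the globe
    for _ in range(int(round(years / dt_years))):
        emitted = emissivity * SIGMA * temp ** 4      # W m^-2 radiated to space
        temp += dt * (absorbed - emitted) / heat_capacity
    return temp

cold_start = spin_up(250.0)
warm_start = spin_up(320.0)
brighter_sun = spin_up(250.0, solar=1370.0)
print(f"from 250 K: {cold_start:.2f} K, from 320 K: {warm_start:.2f} K")
print(f"with a slightly brighter sun: {brighter_sun:.2f} K")
```

After 30 simulated years the cold and warm starts are indistinguishable, so either end state would do as the starting point for an experiment; changing the solar input, by contrast, shifts the temperature the model settles at.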
To explore the role of these boundary values on climate, we need to know whether a particular combination of boundary conditions keep the climate stable, or tend to change it. Conditions that tend to change it are known as forcings. But the impact of these forcings can be complicated to assess because of feedbacks. Feedbacks are responses to the forcings that then tend to amplify or diminish the change. For example, increasing the input of solar energy to the earth would be a forcing. If this then led to more evaporation from the oceans, causing increased cloud cover, this could be a feedback, because clouds have a number of effects: they reflect more sunlight back into space (because they are whiter than the land and ocean surfaces they cover) and they trap more of the surface heat (because water vapour is a strong greenhouse gas). The first of these is a negative feedback (it reduces the surface warming from increased solar input) and the second is a positive feedback (it increases the surface warming by trapping heat). To determine the overall effect, we need to set the boundary conditions to match what we know from observational data (e.g. from detailed measurements of solar input, measurements of greenhouse gases, etc). Then we run the model and see what happens.
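How positive and negative feedbacks combine can be sketched with the standard linear feedback approximation, in which the final response is the direct response divided by one minus the sum of the feedback factors. That formula is a textbook simplification brought in here for illustration, not something a climate model computes directly, and the feedback values below are invented rather than measured cloud effects.

```python
def response_with_feedbacks(direct_change, feedback_factors):
    """Final change after feedbacks, using the linear approximation
    direct_change / (1 - sum of feedback factors).

    Positive factors amplify the change, negative factors damp it.
    """
    total = sum(feedback_factors)
    if total >= 1.0:
        raise ValueError("feedbacks sum to 1 or more: no stable response")
    return direct_change / (1.0 - total)

direct = 1.0  # direct warming from the forcing, arbitrary units
print(response_with_feedbacks(direct, []))            # no feedbacks
print(response_with_feedbacks(direct, [0.5]))         # one amplifying feedback
print(response_with_feedbacks(direct, [-1.0]))        # one damping feedback
print(response_with_feedbacks(direct, [0.5, -0.25]))  # both at once
```

The last case is the cloud situation in miniature: the outcome depends on the relative sizes of the two effects, which is why the model has to be run to find out.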
Observational data is again important, but this time for making sure we get the boundary values right, rather than the initial values. Which means we need different kinds of data too – in particular, longer term trends rather than instantaneous snapshots. But this time, errors in the data are dwarfed by errors in the model. If the algorithms are off even by a tiny amount, the simulation will drift over a long climate run, and it stops resembling the earth’s actual climate. For example, a tiny error in calculating where the mass of air leaving one grid square goes could mean we lose a tiny bit of mass on each time step. For a weather forecast, the error is so small we can ignore it. But over a century-long climate run, we might end up with no atmosphere left! So a basic test for climate models is that they conserve mass and energy over each time step.
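A toy version of that conservation test might look like the sketch below. It moves "air" around a ring of ten grid cells with a simple upwind scheme, and the `leak` parameter is a deliberately planted bug (a small fraction of the mass leaving each cell never arrives in the next one). The grid size, the leak size, and the function names are all assumptions for illustration; real model dynamics are vastly more complicated.

```python
def step(mass, courant=0.5, leak=0.0):
    """Move a fraction `courant` of each cell's mass into the next cell on a ring.

    With leak=0.0, whatever leaves one cell arrives in its neighbour, so the
    total is conserved. A non-zero leak loses a little of the mass in transit.
    """
    new_mass = []
    for i in range(len(mass)):
        outflow = courant * mass[i]
        inflow = courant * mass[i - 1] * (1.0 - leak)  # mass[-1] wraps around the ring
        new_mass.append(mass[i] - outflow + inflow)
    return new_mass


def fraction_remaining(steps, leak=0.0):
    """Fraction of the initial total mass still present after `steps` time steps."""
    mass = [5.0] + [1.0] * 9
    total_at_start = sum(mass)
    for _ in range(steps):
        mass = step(mass, leak=leak)
    return sum(mass) / total_at_start


if __name__ == "__main__":
    print("correct scheme, long run:", fraction_remaining(100_000))
    print("leaky scheme, short run: ", fraction_remaining(100, leak=1e-4))
    print("leaky scheme, long run:  ", fraction_remaining(100_000, leak=1e-4))
```

Over a short run the leaky scheme looks fine, which is the weather forecasting situation; over a long run nearly all the mass is gone. Checking that the total is unchanged after every step catches the bug straight away.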
Climate models have also improved in accuracy steadily over the last few decades. We can now use the known forcings over the last century to obtain a simulation that tracks the temperature record amazingly well. These simulations demonstrate the point nicely. They don’t correspond to any actual weather, but show patterns in both small and large scale weather systems that mimic what the planet’s weather systems actually do over the year (look at August – see the daily bursts of rainfall in the Amazon, the Gulf Stream sending rain to the UK all summer long, and the cyclones forming off the coast of Japan by the middle of the month). And these patterns aren’t programmed into the model – it is all driven by sets of equations derived from the basic physics. This isn’t a weather forecast, because on any given day, the actual weather won’t look anything like this. But it is an accurate simulation of typical weather over time (i.e. climate). And, as was the case with weather forecasts, some bits are better than others – for example the Indian monsoons tend to be less well-captured than the North Atlantic Oscillation.
At first sight, numerical weather prediction and climate models look very similar. They model the same phenomena (e.g. how energy moves around the planet via airflows in the atmosphere and currents in the ocean), using the same computational techniques (e.g., three dimensional models of fluid flow on a rotating sphere). And quite often they use the same program code. But the problems are completely different: one is an initial value problem, and one is a boundary value problem.
Which also partly explains why a small minority of (mostly older, mostly male) meteorologists end up being climate change denialists. They fail to understand the difference in the two problems, and think that climate scientists are misusing the models. They know that the initial value problem puts serious limits on our ability to predict the weather, and assume the same limit must prevent the models being used for studying climate. Their experience tells them that weaknesses in our ability to get detailed, accurate, and up-to-date data about current conditions are the limiting factor for weather forecasting, and they assume this limitation must be true of climate simulations too.
Ultimately, such people tend to suffer from “senior scientist” syndrome: a lifetime of immersion in their field gives them tremendous expertise in that field, which in turn causes them to over-estimate how well their expertise transfers to a related field. They can become so heavily invested in a particular scientific paradigm that they fail to understand that a different approach is needed for different problem types. This isn’t the same as the Dunning-Kruger effect, because the people I’m talking about aren’t incompetent. So perhaps we need a new name. I’m going to call it the Dyson-effect, after one of its worst sufferers.
I should clarify that I’m certainly not stating that meteorologists in general suffer from this problem (the vast majority quite clearly don’t), nor am I claiming this is the only reason why a meteorologist might be skeptical of climate research. Nor am I claiming that any specific meteorologists (or physicists such as Dyson) don’t understand the difference between initial value and boundary value problems. However, I do think that some scientists’ ideological beliefs tend to bias them to be dismissive of climate science because they don’t like the societal implications, and the Dyson-effect disinclines them from finding out what climate science actually does.
I am, however, arguing that if more people understood this distinction between the two types of problem, we could get past silly soundbites about “we can’t even forecast the weather…” and “climate models are garbage in garbage out”, and have a serious conversation about how climate science works.
Update: Zeke has a more detailed post on the role of parameterizations in climate models.
I picked up Stephen Schneider’s “Science as a Contact Sport” to read on travel this week. I’m not that far into it yet (it’s been a busy trip), but was struck by a comment in chapter 1 about how he got involved in climate modeling. In the late 1960’s, he was working on his PhD thesis in plasma physics, and (in his words) “knew how to calculate magneto-hydro-dynamic shocks at 20,000 times the speed of sound”, with “one-and-a-half dimensional models of ionized gases” (Okay, I admit it, I have no idea what that means, but it sounds impressive)…
…Anyway, along comes Joe Smagorinsky from Princeton, to give a talk on the challenges of modeling the atmosphere as a three-dimensional fluid flow problem on a rotating sphere, and Schneider is immediately fascinated by both the mathematical challenges and the potential of this as important and useful research. He goes on to talk about the early modeling work and the mis-steps made in the early 1970’s on figuring out whether the global cooling from aerosols would be stronger than the global warming from greenhouse gases, and getting the relative magnitudes wrong by running the model without including the stratosphere. And how global warming denialists today like to repeat the line about “first you predicted global cooling, then you predicted global warming…” without understanding that this is exactly how science proceeds, by trying stuff, making mistakes, and learning from them. Or as Ms. Frizzle would say, “Take chances! Make Mistakes! Get Messy!” (No, Schneider doesn’t mention Magic School Bus in the book. He’s too old for that).
Anyway, I didn’t get much further reading the chapter, because my brain decided to have fun with the evocative phrase “modeling the atmosphere as a three-dimensional fluid flow problem on a rotating sphere”, which is perhaps the most succinct description I’ve heard yet of what a climate model is. And what would happen if Ms. Frizzle got hold of this model and encouraged her kids to “get messy” with it. What would they do?
Let’s assume the kids can run the model, and play around with its settings. Let’s assume that they have some wonderfully evocative ways of viewing the outputs too, such as these incredible animations of precipitation from a model (my favourite is “August”) from NCAR, and where greenhouse gases go after we emit them (okay, the latter was real data, rather than a model, but you get the idea).
What experiments might the kids try with the model? How about:
Now compare your answers with what the rest of the class got. And discuss what we’ve learned. [And finally, for the advanced students – look at the model software code, and point to the bits that are responsible for each outcome… Okay, I’m just kidding about that bit. We’d need literate code for that].
Okay, this seems like a worthwhile project. We’d need to wrap a desktop-runnable model in a simple user interface with the appropriate switches and dials. But is there any model out there that would come anywhere close to being useable in a classroom situation for this kind of exercise?
(feel free to suggest more experiments in the comments…)
This week I’m visiting the Max Planck Institute for Meteorology (MPI-M) in Hamburg. I gave my talk yesterday on the Hadley study, and it led to some fascinating discussions about software practices used for model building. One of the topics that came up in the discussion afterwards was how this kind of software development compares with agile software practices, and in particular the reliance on face-to-face communication, rather than documentation. Like many software projects, climate modellers struggle to keep good, up-to-date documentation, but generally feel they should be doing better. The problem of course, is that traditional forms of documentation (e.g. large, stand-alone descriptions of design and implementation details) are expensive to maintain, and of questionable value – the typical experience is that you wade through the documentation and discover that despite all the details, it never quite answers your question. Such documents are often produced in a huge burst of enthusiasm for the first release of the software, but then never touched again through subsequent releases. And as the code in the climate models evolves steadily over decades, the chances of any stand-alone documentation keeping up are remote.
An obvious response is that the code itself should be self-documenting. I’ve looked at a lot of climate model code, and readability is somewhat variable (to put it politely). This could be partially addressed with more attention to coding standards, although it’s not clear how familiar you would have to be with the model already to be able to read the code, even with very good coding standards. Initiatives like Clear Climate Code intend to address this problem, by re-implementing climate tools as open source projects in Python, with a strong focus on making the code as understandable as possible. Michael Tobis and I have speculated recently about how we’d scale up this kind of initiative to the development of coupled GCMs.
But readable code won’t fill the need for a higher level explanation of the physical equations and their numerical approximations used in the model, along with rationale for algorithm choices. These are often written up in various forms of (short) white papers when the numerical routines are first developed, and as these core routines rarely change, this form of documentation tends to remain useful. The problem is that these white papers tend to have no official status (or perhaps at best, they appear as technical reports), and are not linked in any usable way to distributions of the source code. The idea of literate programming was meant to solve this problem, but it never took off, probably because it demands that programmers must tear themselves away from using programming languages as their main form of expression, and start thinking about how to express themselves to other human beings. Given that most programmers define themselves in terms of the programming languages they are fluent in, the tyranny of the source code is unlikely to disappear anytime soon. In this respect, climate modelers have a very different culture from most other kinds of software development teams, so perhaps this is an area where the ideas of literate programming could take root.
Lack of access to these white papers could also be solved by publishing them as journal papers (thus instantly making them citeable objects). However, scientific journals tend not to publish descriptions of the designs of climate models, unless they are accompanied with new scientific results from the models. There are occasional exceptions (e.g. see the special issue of the Journal of Climate devoted to the MPI-M models). But things are changing, with the recent appearance of two new journals:
The problem is related to another dilemma in climate modeling groups: acknowledgement for the contributions of those who devote themselves to model development rather than to doing “publishable science”. Most of the code development is done by scientists whose performance is assessed by their publication record. Some modeling centres have created job positions such as “programmers” or “systems staff”, although most people hired into these roles have a very strong geosciences background. A growing recognition of the importance of their contributions represents a major culture change in the climate modeling community over the last decade.
The highlight of the whole conference for me was the Wednesday afternoon session on Methodologies of Climate Model Confirmation and Interpretation, and the poster session the following morning on the same topic, at which we presented Jon’s poster. Here are my notes from the Wednesday session.
Before I dive in, I will offer a preamble for people unfamiliar with recent advances in climate models (or more specifically, GCMs) and how they are used in climate science. Essentially, these are massive chunks of software that simulate the flow of mass and energy in the atmosphere and oceans (using a small set of physical equations), and then couple these to simulations of biological and chemical processes, as well as human activity. The climate modellers I’ve spoken to are generally very reluctant to have their models used to generate predictions of future climate – the models are built to help improve our understanding of climate processes, rather than to make forecasts for planning purposes. I was rather struck by the attitude of the modellers at the Hadley centre at the meetings I sat in on last summer in the early planning stages for the next IPCC reports – basically, it was “how can we get the requested runs out of the way quickly so that we can get back to doing our science”. Fundamentally, there is a significant gap between the needs of planners and policymakers for detailed climate forecasts (preferably with the uncertainties quantified), and the kinds of science that the climate models support.
Climate models do play a major role in climate science, but sometimes that role is over-emphasized. Hansen lists climate models third in his sources of understanding of climate change, after (1) paleoclimate and (2) observations of changes in the present and recent past. This seems about right – the models help to refine our understanding and ask “what if…” questions, but are certainly only one of many sources of evidence for AGW.
Two trends in climate modeling over the past decade or so are particularly interesting: the push towards higher and higher resolution models (which thrash the hell out of supercomputers), and the use of ensembles:
Much of the concern is over the potential for “big surprises” – the chance that actual changes in the future will lie well outside the confidence intervals of these probabilistic forecasts (to understand why this is likely, you’ll have to read on to the detailed notes). And much of the concern is with the potential for surprises where the models dramatically under-estimate climate change and its impacts. Climate models work well at simulating 20th Century climate. But the more the climate changes in the future, the less certain we can be that the models capture the relevant processes accurately. Which is ironic, really: if the climate wasn’t changing so dramatically, climate models could give very confident predictions of 21st century climate. It’s at the upper end of projected climate changes where the most uncertainty lies, and this is the scary stuff. It worries the heck out of many climatologists.
Much of the question is to do with adequacy for answering particular questions about climate change. Climate models are very detailed hypotheses about climate processes. They don’t reproduce past climate precisely (because of many simplifications). But they do simulate past climate reasonably well, and hence are scientifically useful. It turns out that investigating areas of divergence (either from observations, or from other models) leads to interesting new insights (and potential model improvements).
Okay, with that as an introduction, on to my detailed notes from the session (be warned: it’s a long post).
I’m still only halfway through getting my notes from the AGU meeting turned into blog posts. But the Christmas vacation intervened. I’m hoping to get the second half of the conference blogged over the next week or so, but in the meantime, I thought I’d share these tidbits:

Wednesday morning also saw the poster session “IN31B – Emerging Issues in e-Science: Collaboration, Provenance, and the Ethics of Data”. I was presenting Alicia’s poster on open science and reproducibility:
The poster summarizes Alicia’s master’s thesis work – a qualitative study of what scientists think about open science and reproducibility, and how they use these terms (Alicia’s thesis will be available real soon now). The most interesting outcome of the study for me was the realization that innocent sounding terms such as “replication” mean very different things to different scientists. For example, when asked how many experiments in their field are replicated, and how many should be replicated, the answers are all over the map. One reason is that the term “experiment” can have vastly different meanings to different people, from a simple laboratory procedure that might take an hour or so, to a journal-paper sized activity spanning many months. Another reason is that it’s not always clear what it means to “replicate” an experiment. To some people it means following the original experimental procedure exactly to try to generate the same results, while to others, replication includes different experiments intended to test the original result in a different way.
Once you’ve waded through the different meanings, there still seems to be a range of opinion on the desirability of frequent replication. In many fields (including my field, software engineering) there are frequent calls for more replication, along with complaints about the barriers (e.g. some journals won’t accept papers reporting replications because they’re not ‘original’ enough). However, on the specific question of how many published studies should be replicated, an answer other than “100%” is quite defensible: some published experiments are dead-ends (research questions that should not be pursued further), and some are just bad experiments (experimental designs that in hindsight were deeply flawed). And then there’s the opportunity cost – instead of replicating an experiment for a very small knowledge gain, it’s often better to design a different experiment to probe new aspects of the same theory, for a much larger knowledge gain. We reflected on some of these issues in our ICSE’2008 paper On the Difficulty of Replicating Human Subjects Studies in Software Engineering.
Anyway, I digress. Alicia’s study also revealed a number of barriers to sharing data, suggesting that some of the stronger calls for open science and reproducibility standards are, at least currently, too impractical. At a minimum, we need better tools for capturing data provenance and scientific workflows. But more importantly, we need to think more about the balance of effort – a scientist who has spent many years developing a dataset needs the appropriate credit for this effort (currently, we only tend to credit the published papers based on the data), and perhaps even some rights to exploit the dataset for their own research first, before sharing. And for large, complex datasets, there’s the balance between ‘user support’ as other people try to use the data and have many questions about it, versus getting on with your own research. I’ve already posted about an extreme case in climate science, where such questions can be used strategically in a kind of denial of service attack. The bottom line is that while in principle, openness and reproducibility are important cornerstones of scientific process, in practice there are all sorts of barriers, most of which are poorly understood.
Alicia’s poster generated a huge amount of interest, and I ended up staying around the poster area for much longer than I expected, having all sorts of interesting conversations. Many people stopped by to ask questions about the results described on the poster, especially the tables (which seemed to catch everyone’s attention). I had a fascinating chat with Paulo Pinheiro da Silva, from UT El Paso, whose Cyber-Share project is probing many of these issues, especially the question of whether knowledge provenance and semantic web techniques can be used to help establish trust in scientific artefacts (e.g. datasets). We spent some time discussing what is good and bad about current metadata projects, and the greater challenge of capturing the tacit knowledge scientists have about their datasets. Also chatted briefly with Peter Fox, of Rensselaer, who has some interesting example use cases for where scientists need to do search based on provenance rather than (or in addition to) content.
This also meant that I didn’t get anywhere near enough time to look at the other posters in the session. All looked interesting, so I’ll list them here to remind me to follow up on them: