Following my post last week about Fortran coding standards for climate models, Tim reminded me of a much older paper that was very influential in the creation (and sharing) of coding standards across climate modeling centers:

The paper is the result of a series of discussions in the mid-1980s across many different modeling centers (the paper lists 11 labs) about how to facilitate sharing of code modules. To simplify things, the paper assumes that the things being shared are parameterization modules that operate in a single column of the model. Of course, this was back in the 1980s, which means the models were primarily atmospheric models, rather than the more comprehensive earth system models of today. The dynamical core of the model handles most of the horizontal processes (e.g. wind), which means that most of the remaining physical processes (the subject of these parameterizations) affect what happens vertically within a single column, e.g. by affecting the radiative or convective transfer of heat between layers. Plugging in a new parameterization module becomes much easier if this assumption holds: the module only needs to be called once per time step per column, and because it doesn’t interact with other columns, it doesn’t mess up the vectorization. The paper describes a number of coding conventions, effectively providing an interface specification for single-column parameterizations.
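
To make this concrete, here is a minimal sketch (in modern Fortran, and not taken from the paper itself) of what a plug-compatible single-column interface might look like; the routine name, argument list, and units are hypothetical illustrations of the style, not the paper’s actual conventions:

    ! A hypothetical single-column parameterization: it sees one vertical
    ! column at a time, and returns tendencies rather than updating the model
    ! state itself, so it has no side effects outside its own column.
    subroutine convective_adjustment(nlev, dt, pres, temp, qv, dtemp_dt, dqv_dt)
      implicit none
      integer, intent(in)  :: nlev            ! number of vertical levels
      real,    intent(in)  :: dt              ! model time step (s)
      real,    intent(in)  :: pres(nlev)      ! pressure profile (Pa)
      real,    intent(in)  :: temp(nlev)      ! temperature profile (K)
      real,    intent(in)  :: qv(nlev)        ! specific humidity profile (kg/kg)
      real,    intent(out) :: dtemp_dt(nlev)  ! temperature tendency (K/s)
      real,    intent(out) :: dqv_dt(nlev)    ! humidity tendency (kg/kg/s)

      ! (the actual physics would go here; placeholder only)
      dtemp_dt = 0.0
      dqv_dt   = 0.0
    end subroutine convective_adjustment

    ! The driver calls it once per column per time step, and the loop over
    ! columns stays trivially vectorizable because the columns are independent:
    !   do i = 1, ncols
    !     call convective_adjustment(nlev, dt, p(:,i), t(:,i), q(:,i), &
    !                                dtend(:,i), qtend(:,i))
    !   end do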

An interesting point about this paper is that it popularized the term “plug compatibility” amongst the modeling community, along with the (implicit) broader goal of designing all models to be plug-compatible (although it cites Pielke & Arritt for the origin of the term). Unfortunately, the goal still seems very elusive. While most modelers will accept that plug-compatibility is desirable, a few people I’ve spoken to are very skeptical that it’s actually possible. Perhaps the strongest statement on this is from:

  • Randall DA. A University Perspective on Global Climate Modeling. Bulletin of the American Meteorological Society. 1996;77(11):2685-2690.
    p2687: “It is sometimes suggested that it is possible to make a plug-compatible global model so that an “outside” scientist can “easily make changes”. With a few exceptions (e.g. radiation codes), however, this is a fantasy, and I am surprised that such claims are not greeted with more skepticism.”

He goes on to describe instances where parameterizations have been transplanted from one model to another, but likens the process to a major organ transplant, only more painful. The problem is that the various processes of the earth system interact in complex ways, and these complex interactions have to be handled properly in the code. As Randall puts it: “…the reality is that a global model must have a certain architectural unity or it will fail”. In my interviews with climate modelers, I’ve heard many tales of it taking months, and sometimes years, of effort to take a code module contributed by someone outside the main modeling group and make it work properly in the model.

So plug compatibility and code sharing sound great in principle. In practice, no amount of interface specification and coding standards can reduce the essential complexity of earth system processes.

Note: most of the above is about plug compatibility of parameterization modules (i.e. code packages that live within the green boxes on the Bretherton diagram). More progress has been made (especially in the last decade) in standardizing the interfaces between major earth system components (i.e. the arrows on the Bretherton diagram). That’s where standardized couplers come in – see my post on the high level architecture of earth system models for an introduction. The IS-ENES workshop on coupling technologies in December will be an interesting overview of the state of the art here, although I won’t be able to attend, as it clashes with the AGU meeting.
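
As a rough illustration of the coupler idea, here is a toy sketch (it is not the API of OASIS, ESMF, or any other real coupling framework, and all the names are made up): components exchange named fields through the coupler, rather than calling each other directly.

    ! A toy coupler: components put and get fields by agreed names, so the
    ! atmosphere never needs to know anything about the ocean's internals.
    module toy_coupler
      implicit none
      private
      public :: coupler_put, coupler_get

      ! one exchanged field, held on the coupler's side between calls
      type :: exchange_field
        character(len=32) :: name = ''
        real, allocatable :: values(:)
      end type exchange_field

      type(exchange_field) :: registry(16)   ! fixed-size table is fine for a sketch
      integer :: nfields = 0

    contains

      ! a component hands a field to the coupler under an agreed name
      subroutine coupler_put(name, field)
        character(len=*), intent(in) :: name
        real,             intent(in) :: field(:)
        integer :: i
        do i = 1, nfields
          if (trim(registry(i)%name) == trim(name)) then
            registry(i)%values = field
            return
          end if
        end do
        nfields = nfields + 1
        registry(nfields)%name   = name
        registry(nfields)%values = field
      end subroutine coupler_put

      ! another component asks the coupler for a field by the same name
      subroutine coupler_get(name, field)
        character(len=*),  intent(in)  :: name
        real, allocatable, intent(out) :: field(:)
        integer :: i
        do i = 1, nfields
          if (trim(registry(i)%name) == trim(name)) then
            field = registry(i)%values
            return
          end if
        end do
      end subroutine coupler_get

    end module toy_coupler

    ! e.g. the ocean component calls  call coupler_put('sea_surface_temperature', sst)
    ! and the atmosphere later calls  call coupler_get('sea_surface_temperature', sst_in)
    ! (regridding and time interpolation, the genuinely hard parts, are omitted)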

7 Comments

  1. Alright, interchanging modules between climate models from different research groups cannot be done easily, because things like the overall grid layout and time synchronization may be too different, and reconciling them needs concepts beyond the usual modularization techniques – but if we estimate that on average 200 people work part time (1/4 of their time) on one and the same model, I still don’t understand how they do it without some modularization 🙂

    How do they handle concurrency, for example? Isn’t that an overall architectural problem that one cannot refactor into existing code?

    BTW, the book “A Climate Modelling Primer” by Kendal McGuffie and Ann Henderson-Sellers mentions a – still active, it would seem – project developing a modular framework, the “Earth System Modeling Framework”, http://www.earthsystemmodeling.org/. I haven’t looked at it yet, though.

  2. The ESMF effort is indeed still active. But it runs into exactly the problems mentioned above about the essential complexity of the system to be modeled. The more straightforward part is, as Steve mentions, the idea (at least the idea) of having couplers between different modules. There is less agreement as to what constitutes a different module.

    Even with the couplers, though, you run into issues with the framework, because it does, in principle, extend into the module level. Where it does so, ESMF is making choices about how problems should be solved. To the (rather large) extent that the ESMF community is also a modeling community, the choices are reasonable. But reasonable today, and to that community, does not necessarily mean correct for now, nor correct for all future model development. It’s not an actual illustration any more, but suppose ESMF had been done in the early 90s, when vector processors were the mainstay (the Kalnay document, for instance, has some mentions along these lines, iirc). The framework would have specified things in a manner friendly to vector processing, which would have left us somewhat trapped when parallel processing came in.

  3. The problem is that the various processes of the earth system interact in complex ways, and these complex interactions have to be handled properly in the code. As Randall puts it: “…the reality is that a global model must have a certain architectural unity or it will fail”.

    I’m not at all sure I believe this. Yes it is true that different processes interact in a complex way. But this has nothing to do with plug-compatibility; it has nothing to do with coding standards at all. Changing any interesting module is always fraught, unless it has been developed with the existing model; not for coding reasons, but because of what it does (physically, so to speak).

  4. William: I think that’s what I was trying to say (but you’ve explained it better). And I *think* it’s what Randall means by architectural unity, as I don’t think he’s talking about software at all; I think he’s talking about unity in the systems of equations and how they are operationalized.

  5. @steve
    But still, if there are 200 people working on the same climate model at the same time, how do they manage to cooperate?

    In a usual software project you’d have to split the 200 people into at least 10 subgroups, each with a team leader (although leading 20 developers in this highly specialized area would already be too much), and have the team leaders meet at least once a week to synchronize. And then they’d need some kind of architecture or component model or the like to even know what they are talking about.

    Maybe this is naive, but I really don’t have any idea how these research groups are organized….

  6. Tim: very good questions. First, think of it more like an open source community, rather than an industrial software development team. A large number of people contribute code, but most of them are doing this only occasionally, each contributing when their specialist knowledge is relevant. And most of these contributions are opportunistic – someone sees a way of improving the model, makes the necessary changes on a code branch, experiments with it for a while, and when it seems to be both valid and robust enough, offers to fold it back into the trunk. A much smaller team (typically less than a dozen) takes on the responsibility for accepting these changes, managing a continuous integration and test process, and watching for interactions between the contributed changes. They then produce baseline releases of the model at various intervals. The exact details (such as specific roles, number of people involved, and release schedule) vary tremendously from lab to lab.

    Some other factors help: very low staff turnover (in most cases the scientists working with a particular model have worked with it for years and years), very high domain expertise, and a community who are basically writing code for their own use.

  7. This reminds me of nothing so much as the core/dev arrangement in (e.g.) BSD development. http://www.freebsd.org/administration.html
