To get myself familiar with the models at each of the climate centers I’m visiting this summer, I’ve tried to find high-level architectural diagrams of the software structure. Unfortunately, there seem to be very few such diagrams around. Climate scientists tend to think of their models in terms of a set of equations, and differentiate between models on the basis of which particular equations each implements. Hence, their documentation doesn’t contain the kinds of views on the software that a software engineer might expect. It presents the equations, often followed by comments about the numerical algorithms that implement them. This also means they don’t find automated documentation tools such as Doxygen very helpful, because they don’t want to describe their models in terms of code structure (the folks at MPI-M here do use Doxygen, but it doesn’t give them the kind of documentation they most want).

But for my benefit, as I’m a visual thinker, and perhaps to better explain to others what is in these huge hunks of code, I need diagrams. There are some schematics like this around (taken from an MPI-M project site):

[Schematic of the coupled model components, from an MPI-M project site]

But it’s not quite what I want. It shows the major components:

  • ECHAM – atmosphere dynamics and physics,
  • HAM – aerosols,
  • MESSy – atmospheric chemistry,
  • MPI-OM – ocean dynamics and physics,
  • HAMOCC – ocean biogeochemistry,
  • JSBACH – land surface processes,
  • HD – hydrology,
  • and the coupler, PRISM,

…but it only shows a few of the connectors, and many of the arrows are unlabeled. I need something that more clearly distinguishes the different kinds of connector, and perhaps shows where various subcomponents fit in (in part because I want to think about why particular compositional choices have been made).

The closest I can find to what I need is the Bretherton diagram, produced back in the mid-1980s to explain what earth system science is all about:

The Bretherton Diagram of earth system processes (click to see bigger, as this is probably not readable!)

It’s not a diagram of an earth system model per se, but rather of the set of systems that such a model might simulate. There’s a lot of detail here, but it does clearly show the major systems (orange rectangles – these roughly correspond to model components) and subsystems (green rectangles), along with data sources and sinks (the brown ovals) and the connectors (pale blue rectangles, representing the data passed between components).

The diagram allows me to make a number of points. First, we can distinguish between two types of model:

  • a Global Climate Model, also known as a General Circulation Model (GCM), or Atmosphere-Ocean coupled model (AO-GCM), which only simulates the physical and dynamic processes in the atmosphere and ocean. Where a GCM does include parts of the other processes, it is typically only to supply appropriate boundary conditions.
  • an Earth System Model (ESM), which also includes the terrestrial and marine biogeochemical processes, snow and ice dynamics, atmospheric chemistry, aerosols, and so on – i.e. it includes simulations of most of the rest of the diagram.

Over the past decade, AO-GCMs have steadily evolved to become ESMs, although there are many intermediate forms around. In the last IPCC assessment, nearly all the models used for the assessment runs were AO-GCMs. For the next assessment, many of them will be ESMs.

Second, perhaps obviously, the diagram doesn’t show any infrastructure code. Some of it is substantial – for example, an atmosphere-ocean coupler is a sizeable component in its own right, often performing elaborate data transformations, such as re-gridding, interpolation, and synchronization. But this does reflect the way in which scientists often neglect the infrastructure code, because it is not really relevant to the science.
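
To make it concrete what this infrastructure code actually does, here’s a toy sketch (my own illustration in Python, not code from OASIS or any real coupler) of the re-gridding step: interpolating a field from the ocean’s grid onto the atmosphere’s grid. Real couplers do this in 2-D, conservatively, with land/sea masks, and they also manage the time synchronization between components.

```python
import numpy as np

def regrid_1d(field_src, lat_src, lat_dst):
    """Interpolate a field from a source latitude grid onto a destination grid.

    Real couplers work in 2-D, conserve global integrals, and respect
    land/sea masks; plain linear interpolation is just the simplest stand-in.
    """
    return np.interp(lat_dst, lat_src, field_src)

# Ocean model output on a fine grid; the atmosphere expects a coarser one.
lat_ocean = np.linspace(-89.75, 89.75, 360)                    # 0.5 degree grid
lat_atmos = np.linspace(-88.75, 88.75, 72)                     # 2.5 degree grid
sst_ocean = 300.0 - 30.0 * np.sin(np.radians(lat_ocean)) ** 2  # toy SST field (K)

sst_for_atmosphere = regrid_1d(sst_ocean, lat_ocean, lat_atmos)
```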

Third, the diagram treats all the connectors in the same way, because, at some level, they are all just data fields, representing physical quantities (mass, energy) that cross subsystem boundaries. However, there’s a wide range of different ways in which these connectors are implemented – in some cases binding the components tightly together with complex data sharing and control coupling, and in other cases keeping them very loose. The implementation choices are based on a mix of historical accident, expediency, program performance concerns, and the sheer complexity of the physical boundaries between the actual earth subsystems. For example, within an atmosphere model, the dynamical core (which computes the basic thermodynamics of air flow) is distinct from the radiation code (which computes how visible light, along with other parts of the spectrum, is scattered or absorbed by the various layers of air) and the moist processes (i.e. humidity and clouds). But the complexity of the interactions between these processes is sufficiently high that they are tightly bound together – it’s not possible to treat any of these parts as swappable components (at least in the current generation of models), although during development, some parts can be run in isolation for unit testing, e.g. the dynamical core is tested in isolation, but then most other subcomponents depend on it.
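
As a caricature of why these pieces are so hard to separate, here’s a sketch (mine, in Python, with invented names rather than anything from ECHAM): all three processes read and write the same in-memory state arrays and must run in a fixed order within each timestep, so swapping one of them out means agreeing on the whole state layout rather than a narrow interface. The dynamical core can still be exercised on its own, which is roughly what testing it in isolation amounts to.

```python
import numpy as np

class AtmosphereState:
    """Shared prognostic fields; every process reads and writes these in place."""
    def __init__(self, nlev, ncol):
        self.temperature = np.full((nlev, ncol), 250.0)   # K
        self.humidity    = np.full((nlev, ncol), 1e-3)    # kg/kg
        self.wind_u      = np.zeros((nlev, ncol))         # m/s

def dynamical_core_step(state, dt):
    # Toy stand-in for advection and thermodynamics.
    state.wind_u *= 0.99

def radiation_step(state, dt):
    # Heating depends on the same temperature/humidity arrays the dynamical
    # core just updated -- hence the fixed ordering and shared state layout.
    state.temperature += dt * 1e-6 * (1.0 + 100.0 * state.humidity)

def moist_processes_step(state, dt):
    # Condensation removes humidity and releases latent heat, in place.
    condensed = 0.01 * state.humidity
    state.humidity -= condensed
    state.temperature += (2.5e6 / 1004.0) * condensed

def timestep(state, dt):
    # The fixed call sequence over shared state *is* the interface.
    dynamical_core_step(state, dt)
    radiation_step(state, dt)
    moist_processes_step(state, dt)

# "Testing the dynamical core in isolation" amounts to driving it on its own:
test_state = AtmosphereState(nlev=10, ncol=64)
dynamical_core_step(test_state, dt=600.0)
```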

On the other hand, the interface between atmosphere and ocean is relatively simple — it’s the ocean surface — and as this also represents the interface between two distinct scientific disciplines (atmospheric physics and oceanography), atmosphere models and ocean models are always (?) loosely coupled. It’s common now for the two to operate on different grids (different resolution, or even different shape), and the translation of the various data to be passed between them is handled by a coupler. Some schematic diagrams do show how the coupler is connected:

Atmosphere-Ocean coupling via the OASIS coupler (source: Figure 4.2 in the MPI-Met PRISM Earth System Model Adaptation Guide)

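Schematically, the coupled run loop behind a diagram like this looks something like the sketch below (my own toy Python, with invented component and method names, not the PRISM/OASIS API): each component advances on its own grid and timestep, and the coupler mediates the exchange of fields at agreed coupling intervals.

```python
class ToyComponent:
    """Stand-in for an atmosphere or ocean model; invented, minimal interface."""
    def __init__(self, name, dt):
        self.name, self.dt, self.time = name, dt, 0.0
        self.surface_field = 0.0
    def step(self):
        self.time += self.dt
    def export(self):
        return self.surface_field
    def import_(self, value):
        self.surface_field = value

class ToyCoupler:
    """Stand-in for OASIS: in reality this call re-grids, interpolates, averages."""
    def exchange(self, value):
        return value   # identity here; the re-gridding would live in this call

atmosphere = ToyComponent("atmosphere", dt=1800.0)   # 30-minute steps
ocean      = ToyComponent("ocean",      dt=7200.0)   # 2-hour steps
coupler    = ToyCoupler()

COUPLING_INTERVAL = 4            # atmosphere steps per coupling exchange
for step in range(48):           # one model day of atmosphere steps
    atmosphere.step()
    if (step + 1) % COUPLING_INTERVAL == 0:
        ocean.import_(coupler.exchange(atmosphere.export()))    # fluxes down
        ocean.step()
        atmosphere.import_(coupler.exchange(ocean.export()))    # SST/ice back up
```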

Other interfaces are harder to define than the atmosphere-ocean interface. For example, the atmosphere and the terrestrial processes are harder to decouple: Which parts of the water cycle should be handled by the atmosphere model and which should be handled by the land surface model? Which module should handle evaporation from plants and soil? In some models, such as ECHAM, the land surface is embedded within the atmosphere model, and called as a subroutine at each time step. In part this is a historical accident – the original atmosphere model had no vegetation processes, but used a soil heat and moisture parameterization as a boundary condition. The land surface model, JSBACH, was developed by pulling out as much of this code as possible and developing it into a separate vegetation model, which is sometimes run as a standalone model by the land surface community. But it still shares some of the atmosphere infrastructure code for data handling, so it’s not as loosely coupled as the ocean is. By contrast, in CESM, the land surface model is a distinct component, interacting with the atmosphere model only via the coupler. This facilitates the switching of different land and/or atmosphere components into the coupled scheme, and also allows the land surface model to have a different grid.
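
The difference between the two compositional styles is easy to caricature (again my own sketch, with invented names, not ECHAM/JSBACH or CESM code): in one, the land surface is just a call inside the atmosphere’s timestep with direct access to its data structures; in the other, it is a peer component that only sees whatever the coupler hands it.

```python
# Style 1: embedded.  The land surface scheme is called from inside the
# atmosphere timestep and works directly on the atmosphere's own state
# (shared grid, shared data handling) -- roughly the ECHAM/JSBACH situation.
def land_surface_scheme(atm_state):
    atm_state["soil_moisture"] -= 0.1 * atm_state["evaporation"]

def atmosphere_step_embedded(atm_state):
    atm_state["evaporation"] = 0.01 * atm_state["humidity"]   # toy physics
    land_surface_scheme(atm_state)        # direct access, no interface layer

# Style 2: separate component.  The land model only ever sees what the coupler
# passes to it, and can live on its own grid -- roughly the CESM arrangement.
def coupled_step(atm_state, land_state, regrid):
    atm_state["evaporation"] = 0.01 * atm_state["humidity"]
    forcing = regrid({"evaporation": atm_state["evaporation"]})
    land_state["soil_moisture"] -= 0.1 * forcing["evaporation"]

atm  = {"humidity": 0.008, "evaporation": 0.0, "soil_moisture": 0.3}
land = {"soil_moisture": 0.3}
atmosphere_step_embedded(atm)
coupled_step(atm, land, regrid=lambda fields: fields)   # identity re-gridding
```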

The interface between the ocean model and the sea ice model is also tricky, not least because the area covered by the ice varies with the seasonal cycle. So if you use a coupler to keep the two components separate, the coupler needs information about which grid points contain ice and which do not at each timestep, and it has to alter its behaviour accordingly. For this reason, the sea ice is often treated as a subroutine of the ocean model, which avoids having to expose all this information to the coupler. But again we have the same trade-off: working through the coupler keeps the two as self-contained components that can be swapped for other compatible models as needed, but at the cost of increasing the complexity of the coupler interfaces, reducing information hiding, and making future changes harder.
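
Here’s a toy illustration (mine, with invented names and numbers) of what the coupler ends up having to do if the sea ice is kept as a separate component: the ice fraction is itself a prognostic field, so it has to travel through the coupler at every coupling step, and every downstream flux calculation has to be weighted by it.

```python
import numpy as np

def blend_surface_fluxes(ice_fraction, flux_over_ice, flux_over_water):
    """What a coupler must do if sea ice is a separate component: weight the
    surface fluxes by an ice-fraction mask that changes every coupling step."""
    return ice_fraction * flux_over_ice + (1.0 - ice_fraction) * flux_over_water

# The ice mask is itself a prognostic field, so it has to be exchanged each step.
ice_fraction    = np.array([0.0, 0.4, 0.9, 1.0])        # per ocean grid cell
flux_over_ice   = np.array([0.0, -5.0, -20.0, -30.0])   # W/m2, toy numbers
flux_over_water = np.array([50.0, 40.0, 10.0, 5.0])     # W/m2, toy numbers

net_flux_to_ocean = blend_surface_fluxes(ice_fraction, flux_over_ice, flux_over_water)
```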

Similar challenges occur for:

  • the coupling between the atmosphere and the atmospheric chemistry (which handles chemical processes as gases and various types of pollution are mixed up by atmospheric dynamics).
  • the coupling between the ocean and marine biogeochemistry (which handles the way ocean life absorbs and emits various chemicals while floating around on ocean currents).
  • the coupling between the land surface processes and terrestrial hydrology (which includes rivers, lakes, wetlands and so on). Oh, and between both of these and the atmosphere, as water moves around so freely. Oh, and the ocean as well, because we have to account for how outflows from rivers enter the ocean at coastlines all around the world.
  • …and so on, as we incorporate more and more of the earth system into the models.

Overall, it seems that the complexity of the interactions between the various earth system processes is so high that traditional approaches to software modularity don’t work. Information hiding is hard to do, because these processes are so tightly intertwined. A full object-oriented approach would be a radical departure from how these models are built currently, with the classes built on the data objects (the pale blue boxes in the Bretherton diagram) rather than the processes (the green boxes). But the computational demands of the processes in the green boxes are so high that the only way to make them efficient is to give them full access to the low-level data structures. So any attempt to abstract away these processes from the data objects they operate on will lead to a model that is too inefficient to be useful.
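
To make that contrast concrete, here’s a toy sketch (mine, not from any model) of the two decompositions: in the object-oriented version the data fields become the classes and every process has to go through their interfaces; in the style the models actually use, the processes are plain routines with direct access to the underlying arrays.

```python
import numpy as np

# Decomposition 1: classes built on the data objects (the pale blue boxes).
class TemperatureField:
    def __init__(self, values):
        self._values = np.asarray(values, dtype=float)
    def add_heating(self, rate, dt):
        # Every process has to go through accessor methods like this one.
        self._values = self._values + rate * dt

# Decomposition 2: processes as routines with full access to the raw arrays
# (the green boxes), which is much closer to how the models are written today.
def radiation_heating(temperature, humidity, dt):
    temperature += dt * 1e-6 * (1.0 + 100.0 * humidity)   # in-place update

t = TemperatureField(np.full(1000, 250.0))
t.add_heating(rate=1e-6, dt=600.0)

temperature = np.full(1000, 250.0)
humidity    = np.full(1000, 1e-3)
radiation_heating(temperature, humidity, dt=600.0)
```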

Which brings me back to the question of how to draw pictures of the architecture so that I can compare the coupling and modularity of different models. I’m thinking the best approach might be to start with the Bretherton diagram, and then overlay it to show how various subsystems are grouped into components, and which connectors are handled by a separate coupler.
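
As a first step, the grouping could be written down in a machine-readable form and the overlay generated from it. Here’s a fragment of what I have in mind (Python, and just a sketch: the component assignments and connector types below are the ones described in this post, and a real inventory would need checking against the model source):

```python
# A fragment of a machine-readable "overlay": which subsystems are grouped into
# which software components, and how each connector is realised.  The entries
# reflect the description in this post, nothing more.
components = {
    "ECHAM":  {"subsystems": ["atmosphere dynamics", "atmosphere physics"]},
    "JSBACH": {"subsystems": ["land surface processes"],
               "embedded_in": "ECHAM"},   # called as a subroutine each timestep
    "MPI-OM": {"subsystems": ["ocean dynamics", "ocean physics"]},
}

connectors = [
    {"from": "ECHAM",  "to": "MPI-OM", "via": "coupler",
     "fields": ["surface fluxes"]},
    {"from": "MPI-OM", "to": "ECHAM",  "via": "coupler",
     "fields": ["SST", "sea ice"]},
    {"from": "ECHAM",  "to": "JSBACH", "via": "shared data structures",
     "fields": ["lowest-level atmosphere state"]},
]

# From here it is a short step to a diagram, e.g. by emitting Graphviz edges:
for c in connectors:
    style = "solid" if c["via"] == "coupler" else "dashed"
    print(f'"{c["from"]}" -> "{c["to"]}" [style={style}, label="{c["via"]}"];')
```

Solid edges for coupler-mediated connectors, dashed edges for shared data structures: exactly the distinction that’s missing from the schematics above.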

Postscript: While looking for good diagrams, I came across this incredible collection of visualizations of various aspects of sustainability, some of which are brilliant, while others are just kooky.

7 Comments

  1. Hi Steve,

    You mentioned the evolution of AO-GCMs to ESMs over the past decade. A few possible explanations spring to mind: lack of human resources, lack of computing power, lack of knowledge, and changing goals/purposes for the model. Have you noticed whether any of these or other reasons have been the predominant reason for this transition? Alternatively, are AO-GCMs and ESMs simply labels that we use to classify existing models, such that what we call ESMs are and have always been the terminal software evolutionary goal of all/most of the models you have looked at? That is, is this “transition” an artifact of the fact that the previous sample a decade ago was merely a snapshot of an early stage of models that had not yet evolved the features considered necessary to be called an ESM?

  2. I think you are right that the models are intrinsically related in complex ways that make both architectural decisions and depictions complicated. When the phenomena themselves are so intertwined, the model can hardly avoid being so as well.

    But, if I understand correctly, model components are also interconnected in ways that are not inherent to the physical processes. I am thinking here of the parameterizations determined when tuning a model, and basically configuring all the components to work together nicely and perform well. Parameterizations are introduced because we do not understand a certain physical process or are unable to simulate it at the small scales needed – an artifact of our imperfect modeling capabilities. So even if model components could be more cleanly isolated in the software architecture, they would still be tied to each other in that each component’s parameterizations are tuned to work with just a particular collection of components. This is significant because it complicates the vision of being able to swap model components around easily in plug-and-play style.

  3. If it’s any consolation, most open source projects have no architecture description of any kind — we hope http://third-bit.com/blog/archives/3847.html will plug a few holes.

  4. Have you talked to anyone at ECMWF about their current efforts at producing an OOP version of the IFS model? They’re investigating writing an object-based version in either Fortran 2003/2008 or C++, and looking at information hiding issues, etc.

    Unfortunately for the idea of recoding in Fortress, etc., the language(s) of choice look like being Fortran and C++ on the compiled side, and Python on the dynamic side. See the results of a discussion at the previous HPC workshop at ECMWF, in 2008. Even the future of Fortran looks iffy: despite the huge amount of code written in it, getting fully-featured optimised compilers is hard; it was embarrassing for the Fortran 2008 committee that almost no one had fully implemented Fortran 2003. Given this, there is a tilt towards using C++. A bit of a chicken-and-egg situation in terms of getting adoption of new (compiled) languages.

  5. I really like this paragraph:
    “Overall, it seems that the complexity of the interactions between the various earth system processes is so high that traditional approaches to software modularity don’t work. Information hiding is hard to do, because these processes are so tightly intertwined. A full object-oriented approach would be a radical departure from how these models are built currently, with the classes built on the data objects (the pale blue boxes in the Bretherton diagram) rather than the processes (the green boxes). But the computational demands of the processes in the green boxes are so high that the only way to make them efficient is to give them full access to the low-level data structures. So any attempt to abstract away these processes from the data objects they operate on will lead to a model that is too inefficient to be useful.”

    Emphasis added. I think we’ve known this for a while, but it hasn’t been put into words so succinctly. Wish we had this statement 10 years ago!

  6. Pingback: When did ignorance become a badge of honour? | Serendipity

  7. “But the computational demands of the processes in the green boxes are so high that the only way to make them efficient is to give them full access to the low-level data structures.”

    As a computational physicist who started programming in the days of punch cards, I can state that we have always known this. And the advantages of object-oriented programming have never quite compensated for the performance loss.

    I have a question: How does one verify and validate an ESM code?

  8. Pingback: Causal loop diagram smorgasbord « MetaSD

  9. Pingback: Plug-compatibility and climate models | Serendipity

  10. Hi Steve,

    I am looking into how I can decouple the aerosol computations (HAM) from ECHAM. Is HAM hard-wired into ECHAM? What approach should one take if he/she wishes to unplug HAM and integrate it into another ESM? Any help will be greatly appreciated!!

    Best wishes,

  11. Pingback: Why Systems Thinking? | Serendipity
