On Thursday, Kaitlin presented her poster at the AGU meeting, showing the results of the study she did with us over the summer. The poster generated a lot of interest, especially her visualizations of the different model architectures. Click on the thumbnail to see the full poster at the AGU site:

A few things to note when looking at the diagrams:

  • Each diagram shows the components of a model, scaled to their relative size by lines of code (a rough sketch of how such counts might be produced appears after this list). However, the models are not to scale with one another: the smallest, UVic’s, is only a tenth of the size of the biggest, CESM. Someone asked what accounts for that difference in size. The UVic model is an EMIC rather than a GCM; it has a very simplified atmosphere model that does not include atmospheric dynamics, which makes it easier to run for very long simulations (e.g. to study paleoclimate). CESM, on the other hand, is a community model, with a large number of contributors across the scientific community. (See Randall and Held’s point/counterpoint article in last month’s IEEE Software for a discussion of how these fit into different model development strategies.)
  • The diagrams show the couplers (in grey), again sized according to number of lines of code. A coupler handles data re-gridding (when the scientific components use different grids) and temporal aggregation (when the scientific components run on different time steps), along with other data handling (a toy sketch of these two jobs appears after this list). Couplers are often invisible in the diagrams scientists create of their models, because they are part of the infrastructure code; however, Kaitlin’s diagrams show how substantial they are in comparison with the scientific modules. The European models all use the same coupler, following a decade-long effort to develop it as a shared code resource.
  • Note that there are many design choices associated with the use of a coupler, as sometimes it’s easier to connect components directly rather than through the coupler, and the choice may be driven by performance impact, flexibility (e.g. ‘plug-and-play’ compatibility) and legacy code issues. Sea ice presents an interesting example, because its extent varies over the course of a model run. So somewhere there must be code that keeps track of which grid cells have ice, and then routes the fluxes from the ocean and atmosphere to the sea ice component for those cells (sketched after this list). This could be done in the coupler, or in any of the three scientific modules. In the GFDL model, sea ice is treated as an interface to the ocean, so all atmosphere-ocean fluxes pass through it, whether there’s ice in a particular cell or not.
  • The relative size of the scientific components is a reasonable proxy for functionality (or, if you like, scientific complexity/maturity). Hence, the diagrams give clues about where each lab has placed its emphasis in terms of scientific development, whether by deliberate choice, or because of availability (or unavailability) of different areas of expertise. The differences between the models from different labs show some strikingly different choices here, for example between models that are clearly atmosphere-centric, versus models that have a more balanced set of earth system components.
  • One comment we received in discussions around the poster was about the places where we have shown sub-components in some of the models. Some modeling groups are more explicit than others about naming sub-components and indicating them in the code. Hence, our ability to identify these might depend more on naming practices than on any fundamental architectural differences.
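
To make the lines-of-code scaling in the first bullet concrete, here is a minimal sketch of how component sizes might be measured, assuming each scientific component lives in its own source directory. The directory names, file extensions, and comment convention are illustrative assumptions, not the actual layout of any of the models on the poster, and not necessarily how Kaitlin produced her counts.

```python
import os

# Hypothetical layout: one directory per scientific component.
# These names are illustrative only, not the structure of any real model.
COMPONENT_DIRS = {
    "atmosphere": "src/atm",
    "ocean": "src/ocn",
    "land": "src/lnd",
    "sea_ice": "src/ice",
    "coupler": "src/cpl",
}

def count_sloc(path, extensions=(".f90", ".F90", ".f")):
    """Count non-blank, non-comment Fortran source lines under a directory."""
    total = 0
    for root, _, files in os.walk(path):
        for name in files:
            if name.endswith(extensions):
                with open(os.path.join(root, name), errors="ignore") as src:
                    for line in src:
                        stripped = line.strip()
                        if stripped and not stripped.startswith("!"):
                            total += 1
    return total

sizes = {comp: count_sloc(path) for comp, path in COMPONENT_DIRS.items()}
grand_total = sum(sizes.values()) or 1  # avoid division by zero on an empty tree
for comp, sloc in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{comp:10s} {sloc:8d} lines ({100 * sloc / grand_total:4.1f}%)")
```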
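
To give a flavour of what a coupler does, here is a toy sketch of the two jobs described in the second bullet: re-gridding and temporal aggregation. Real couplers use conservative remapping with precomputed weights and handle many more fields and grids; the nearest-neighbour sampling and simple averaging below are illustrative only, and the grid sizes and time steps in the example are invented.

```python
import numpy as np

def regrid(field, dst_shape):
    """Toy re-gridding: map a 2-D field onto another rectangular grid by
    nearest-neighbour sampling. Real couplers use conservative remapping
    with precomputed interpolation weights."""
    src = np.asarray(field)
    rows = np.linspace(0, src.shape[0] - 1, dst_shape[0]).round().astype(int)
    cols = np.linspace(0, src.shape[1] - 1, dst_shape[1]).round().astype(int)
    return src[np.ix_(rows, cols)]

def aggregate_fluxes(fluxes, src_step, dst_step):
    """Toy temporal aggregation: average fluxes from a fast component
    (e.g. a 30-minute atmosphere step) onto a slower component's
    coupling interval (e.g. a 3-hour ocean step)."""
    per_window = int(dst_step / src_step)              # fast steps per slow step
    fluxes = np.asarray(fluxes, dtype=float)
    usable = (len(fluxes) // per_window) * per_window  # drop any incomplete window
    return fluxes[:usable].reshape(-1, per_window).mean(axis=1)

# Example: pass a 2-degree atmosphere field to a 1-degree ocean grid, and
# average one day of 30-minute fluxes onto a 3-hour coupling interval.
atm_field = np.random.rand(90, 180)
ocn_field = regrid(atm_field, (180, 360))
coupled_fluxes = aggregate_fluxes(np.random.rand(48), src_step=0.5, dst_step=3.0)
```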
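
The sea ice routing question from the third bullet can also be sketched. This is a hypothetical coupler-side implementation, not the code of any of the models on the poster; in the GFDL scheme mentioned above, the masking step would be skipped and every atmosphere-ocean flux would pass through the sea ice interface regardless of ice cover.

```python
import numpy as np

def route_surface_fluxes(atm_flux, ocn_flux, ice_fraction, threshold=0.0):
    """Hypothetical coupler-side routing of surface fluxes.

    Cells where ice_fraction exceeds the threshold send their atmosphere
    and ocean fluxes to the sea ice component; ice-free cells exchange
    fluxes between atmosphere and ocean directly."""
    icy = ice_fraction > threshold                  # mask of cells holding ice
    to_sea_ice = {
        "from_atm": np.where(icy, atm_flux, 0.0),
        "from_ocn": np.where(icy, ocn_flux, 0.0),
    }
    atm_to_ocn_direct = np.where(icy, 0.0, atm_flux)
    return to_sea_ice, atm_to_ocn_direct

# Example on a tiny 2x3 grid: only the first two cells carry ice.
ice_fraction = np.array([[0.8, 0.3, 0.0],
                         [0.0, 0.0, 0.0]])
atm_flux = np.full((2, 3), 100.0)   # e.g. downward shortwave, W/m^2
ocn_flux = np.full((2, 3), -5.0)    # e.g. heat flux from the ocean surface
to_sea_ice, direct = route_surface_fluxes(atm_flux, ocn_flux, ice_fraction)
```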

I’m sure Kaitlin will blog more of her reflections on the poster (and AGU in general) once she’s back home.

1 Comment

  1. This may be a little off-topic, but I am continually astounded at the amount of domain knowledge that is required to make effective use of scientific and engineering simulation systems.

  2. Pingback: UVIC_ESCM: Main Flow Chart « The Whiteboard
