First proper day of the AGU conference, and I managed to get to the (free!) breakfast for Canadian members, which was so well attended that the food ran out early. Do I read this as a great showing for Canadians at AGU, or just that we’re easily tempted with free food?

Anyway, on to the first of three poster sessions we’re involved in this week. This first poster was on TracSNAP, the tool that Ainsley and Sarah worked on over the summer:

Our tracSNAP poster for the AGU meeting. Click for fullsize.

The key idea in this project is that large teams of software developers find it hard to maintain an awareness of one another’s work, and cannot easily identify the appropriate experts for different sections of the software they are building. In our observations of how large climate models are built, we noticed it’s often hard to keep up to date with what changes other people are working on, and how those changes will affect things. TracSNAP builds on previous research that attempts to visualize the social network of a large software team (e.g. who talks to whom), and relate that to couplings between the code modules that team members are working on. Information about intra-team communication patterns (e.g. emails, chat sessions, bug reports, etc.) can be extracted automatically from project repositories, as can information about dependencies in the code. TracSNAP uses this data to answer questions such as “Who else recently worked on the module I am about to start editing?” and “Who else should I talk to before starting a task?”. The tool uncovers hidden connections in the software by looking for modules that were checked into the repository together (even though they don’t necessarily refer to each other), and offers advice on how to approach key experts by identifying intermediaries in the social network. It’s still a very early prototype, but I think it has huge potential. Ainsley is continuing to work on evaluating it on some existing climate models, to check that we can actually pull the data we need out of the repositories.
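
To give a flavour of the co-change idea, here’s a rough sketch (my own illustration, not TracSNAP’s actual implementation, which works against the project repository) that counts how often pairs of files get checked in together, using a local git history as the data source:

    # Sketch only: mine a local git history for files that change together.
    # TracSNAP does something analogous against the project's own repository.
    import subprocess
    from collections import Counter
    from itertools import combinations

    def co_change_counts(repo_path="."):
        """Count how often each pair of files is modified in the same commit."""
        log = subprocess.run(
            ["git", "-C", repo_path, "log", "--name-only", "--pretty=format:%H"],
            capture_output=True, text=True, check=True,
        ).stdout

        counts = Counter()
        commit_files = []

        def flush():
            # record one co-change for every pair of files in the current commit
            for pair in combinations(sorted(set(commit_files)), 2):
                counts[pair] += 1

        for line in log.splitlines():
            line = line.strip()
            if not line:
                continue
            if len(line) == 40 and all(c in "0123456789abcdef" for c in line):
                flush()                 # a new commit hash closes off the previous one
                commit_files = []
            else:
                commit_files.append(line)
        flush()                         # don't forget the last commit

        return counts

    if __name__ == "__main__":
        # the most frequently co-changed pairs are candidates for hidden coupling
        for (a, b), n in co_change_counts(".").most_common(10):
            print(f"{n:3d}  {a}  <->  {b}")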

The poster session we were in, “IN11D. Management and Dissemination of Earth and Space Science Models”, seemed a little disappointing, as there were only three posters (a fourth poster presenter hadn’t made it to the meeting). But what we lacked in quantity, we made up for in quality. Next to my poster was David Bailey’s: “The CCSM4 Sea Ice Component and the Challenges and Rewards of Community Modeling”. I was intrigued by the second part of his title, so we got chatting about it. Supporting a broader community in climate modeling has a cost, and we talked about how university labs just cannot afford this overhead. However, it also comes with a number of benefits, particularly the existence of a group of people from different backgrounds who all take on some ownership of model development, and who can come together to develop a consensus on how the model should evolve. With the CCSM, most of this happens in face-to-face meetings, particularly the twice-yearly user meetings. We also talked a little about the challenges of integrating the CICE sea ice model from Los Alamos with CCSM, especially given that CICE is also used in the Hadley model. Making it work in both models required some careful thinking about the interface, and hence more focus on modularity. David also mentioned that people are starting to use the term kernelization as a label for the process of taking physics routines and packaging them so that they can be interchanged more easily.
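
A toy example helps pin the term down. Here’s a rough sketch (entirely my own invention, not CICE’s real interface) of what kernelization means in practice: the physics sits behind a small, model-agnostic interface, and each host model adapts its own state to that interface rather than calling into the physics directly:

    # Toy illustration of "kernelization" -- none of these names come from CICE.
    from dataclasses import dataclass

    @dataclass
    class SeaIceState:
        # the minimal state the kernel needs, in kernel-defined units
        thickness: float       # m
        concentration: float   # fraction, 0..1
        surface_temp: float    # K

    def sea_ice_step(state: SeaIceState, air_temp: float, dt: float) -> SeaIceState:
        """Stand-in for a packaged physics kernel: the same code, callable from any host."""
        # deliberately trivial physics: grow ice when the air is below freezing
        growth = 1e-7 * (273.15 - air_temp) * dt
        return SeaIceState(
            thickness=max(0.0, state.thickness + growth),
            concentration=state.concentration,
            surface_temp=min(state.surface_temp, air_temp),
        )

    # Each host model only has to map its own fields onto SeaIceState and back,
    # so the kernel never needs to know whether it is running inside CCSM or
    # the Hadley model.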

Dennis Shea’s poster, “Processing Community Model Output: An Approach to Community Accessibility”, was also interesting. To tackle the problem of making output from the CCSM more accessible to the broader CCSM community, the decision was taken to standardize on netCDF for the data, and to develop and support a standard data analysis toolset based on the NCAR Command Language. NCAR runs regular workshops on the use of these data formats and tools, as part of its broader community support efforts (and of course, this illustrates David’s point about universities not being able to afford to provide such support).
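
NCL is the supported toolset, but the point about standardization can be made with any netCDF-aware library. Here’s a tiny Python sketch (the file name and the variable name “TS” are hypothetical stand-ins) of the kind of access a standard, self-describing format enables:

    # Not the NCAR Command Language, but the same idea in Python via the
    # netCDF4 library: the metadata travels with the data, so any tool can
    # find a variable and its units without side agreements.
    from netCDF4 import Dataset

    with Dataset("ccsm_history_file.nc") as nc:          # hypothetical file name
        ts = nc.variables["TS"]                          # hypothetical variable name
        print(getattr(ts, "long_name", "?"), getattr(ts, "units", "?"), ts.shape)
        # assuming dimensions (time, lat, lon); an unweighted mean, just for show
        print("Unweighted mean of first time slice:", float(ts[0, :, :].mean()))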

The missing poster also looked interesting: Charles Zender from UC Irvine, comparing climate modeling practices with open source software practices. Judging from his abstract, Charles makes many of the same observations we made in our CiSE paper, so I was looking forward to comparing notes with him. Next time, I guess.

Poster sessions at these meetings are both wonderful and frustrating. Wonderful because you can wander down aisles of posters and very quickly sample a large slice of research, and chat to the poster owners in a freeform format (which is usually much better than sitting through a talk). Frustrating because poster owners don’t stay near their posters very long (I certainly didn’t – too much to see and do!), which means you get to note an interesting piece of work, and then never manage to track down the author to chat (and if you’re like me, you also forget to write down contact details for the posters you noticed). However, I did manage to make notes on two to follow up on:

  • Joe Galewsky caught my attention with a provocative title: “Integrating atmospheric and surface process models: Why software engineering is like having weasels rip your flesh”
  • I briefly caught Brendan Billingsley of NSIDC as he was taking his poster down. It caught my eye because it was a reflection on software reuse in the Searchlight tool.
