Okay, I’ve had a few days to reflect on the session on Software Engineering for the Planet that we ran at ICSE last week. First, I owe a very big thank you to everyone who helped – to Spencer for co-presenting and lots of follow-up work; to my grad students, Jon, Alicia, Carolyn, and Jorge for rehearsing the material with me and suggesting many improvements, and for helping advertise and run the brainstorming session; and of course to everyone who attended and participated in the brainstorming for lots of energy, enthusiasm and positive ideas.

First action as a result of the session was to set up a Google group, SE-for-the-planet, as a starting point for coordinating further conversations. I’ve posted the talk slides and brainstorming notes there. Feel free to join the group, and help us build the momentum.

Now, I’m contemplating a whole bunch of immediate action items. I welcome comments on these and any other ideas for immediate next steps:

  • Plan a follow-up workshop at a major SE conference in the fall, and another at ICSE next year (waiting a full year was considered by everyone to be too slow).
  • I should give my part of the talk at U of T in the next few weeks, and we should film it and get it up on the web. 
  • Write a short white paper based on the talk, and fire it off to NSF and other funding agencies, to get funding for community building workshops.
  • Write a short challenge statement, to which researchers can respond with project ideas to bring to the next workshop.
  • Write up a vision paper based on the talk for CACM and/or IEEE Software.
  • Take the talk on the road (a la Al Gore), and offer to give it at any university that has a large software engineering research group (assuming I can come to terms with the increased personal carbon footprint 😉).
  • Broaden the talk to a more general computer science audience and repeat most of the above steps.
  • Write a short book (pamphlet) on this, to be used to introduce the topic in undergraduate CS courses, such as computers and society, project courses, etc.

Phew, that will keep me busy for the rest of the week…

Oh, and I managed to post my ICSE photos at last.

In the last session yesterday, Inez Fung gave the Charney Lecture: Progress in Earth System Modeling since the ENIAC Calculation. But I missed it as I had to go pick up the kids. She has a recent paper that seems to cover some of the same ground. And this morning, Joanie Kleypas gave the Rachel Carson Lecture: Ocean Acidification and Coral Reef Ecosystems: A Simple Concept with Complex Findings. She also has a recent paper covering what I assume was in her talk (again, I missed it!). Both lectures were recorded, so I’m looking forward to watching them once the AGU posts them.

I made it to the latter half of the session on Standards-Based Interoperability. I missed Stefano Nativi’s talk on the requirements analysis for GIS systems, but there’s lots of interesting stuff on his web page to explore. However, I did catch Olga Wilhelmi presenting the results of a community workshop at NCAR on GIS for Weather, Climate and Impacts. She asked some interesting questions about the gathering of user requirements, and we chatted after the session about how users find the data they need (here’s an interesting set of use cases). I also chatted with Ben Domenico from Unidata/UCAR about open science. We were complaining about how hard it is at a conference like this to get people to put their presentation slides on the web. It turns out that some journals in the geosciences have explicit policies to reject papers if any part of the results has already been presented on the web (including in blogs, powerpoints, etc). Ben’s feeling is that these print media are effectively dead, and he had some interesting thoughts about moving to electronic publishing, although we both worried that some of these restrictive policies might live on in online peer-review venues. (Ben is part of the THREDDS project, which is attempting to improve the way that scientists find and access datasets.)

Down at the ESSI poster session, I bumped into Peter Fox, whom I’d met at the EGU meeting last month. We both chatted to Benjamin Branch about his poster on spatial thinking and earth sciences, and especially how educators approach this. Ben’s PhD thesis looks at all the institutional barriers that prevent changes in high school curricula, all of which militate against the nurturing of cross-disciplinary skills (like spatial reasoning) necessary for understanding global climate change. We brainstormed some ideas for overcoming these barriers, including putting cool tools in the students’ hands (e.g. Google Maps mashups of interesting data sets; or an idea that Jon had for a Lego-style constructor kit for building simplified climate models). I also speculated that if the education policy in the US prevents this kind of initiative, we should do it in another country, build it to a major success, and then import it back into the US as a best practice model. Oh, well, I can dream.

Next I chatted to Dicky Allison from Woods Hole, and Tom Yoksas from Unidata/UCAR. Dicky’s poster is on the MapServer project, and Tom shared with us the slides from his talk yesterday on the RAMADDA project, which is intended as a publishing platform for geosciences data. We spent some time playing with the RAMADDA data server, and Tom encouraged us to play with it more, and send comments back on our experiences. Again, most of the discussion was about how to facilitate access to these data sets, how to keep the user interface as simple as possible, and the need for instant access – e.g. grabbing datasets from a server while travelling to a conference, without having to have all the tools and data loaded on a large disk first. Oh, and Tom explained the relationship between NCAR and UCAR, but it’s too complicated to repeat here.

Here’s an aside. Browsing the UCAR pages, I just found the Climate Modeller’s Commandments. Nice.

This afternoon, I attended the session “A Meeting of the Models”, on the use of Multi-model Ensembles for weather and climate prediction. First speaker was Peter Houtekamer, talking about the Canadian Ensemble Prediction Systems (EPS). The key idea of an ensemble is that it samples across the uncertainty in the initial conditions. However, challenges arise from the incomplete understanding of model error. So the interesting questions are how to sample adequately across the space, to get a better ensemble spread. The NCEP Short-Range Ensemble Forecast System (SREF) is claimed to be the first real-time operational regional ensemble prediction system in the world. Even grander is TIGGE, in which the output of lots of operational EPSs is combined into an archive. The volume of the database is large (100s of ensemble members), even though you really only need something like 20-40 members to get converging scores (he cites Talagrand for this) (aside: Talagrand diagrams are an interesting way of visualizing model spread). NAEFS combines the 20-member American (NCEP) and 20-member Canadian (MSC) operational ensemble forecasts, to get a 40-member ensemble. Nice demonstration of how NAEFS outperforms both of the individual ensembles from which it is constructed. Multi-centre ensembles improve the sampling of model error, but impose a big operational cost: data exchange protocols, telecommunications costs, etc. As more centres are added, there are likely to be diminishing returns.
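
For anyone who hasn't met a Talagrand diagram (also known as a rank histogram) before, here's a minimal sketch of the idea in Python. It is my own toy illustration, not anything shown in the talk: the function names, the synthetic Gaussian data, and the spread values are all assumptions made up for the example. For each forecast case you count how many ensemble members fall below the observed value, then tally those ranks. A roughly flat histogram suggests the ensemble spread matches the real uncertainty; a U shape suggests the ensemble is too narrow.

```python
import random


def rank_histogram(ensembles, observations):
    """Tally the rank of each observation within its ensemble.

    ensembles: list of forecast cases, each a list of member values.
    observations: one observed value per case.
    Returns a list of n_members + 1 counts (rank 0 means the observation
    was below every member, rank n_members means it was above every member).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


def synthetic_cases(n_cases, n_members, spread, rng):
    """Hypothetical toy data, purely for illustration.

    The observation is drawn with unit standard deviation around a shared
    centre; the members are drawn around the same centre with standard
    deviation `spread`. spread=1.0 gives a well calibrated ensemble,
    spread<1.0 an ensemble that is too narrow (underdispersive).
    """
    ensembles, observations = [], []
    for _ in range(n_cases):
        centre = rng.gauss(0.0, 1.0)
        observations.append(centre + rng.gauss(0.0, 1.0))
        ensembles.append(
            [centre + rng.gauss(0.0, spread) for _ in range(n_members)]
        )
    return ensembles, observations


rng = random.Random(0)  # fixed seed so the run is repeatable
calibrated = rank_histogram(*synthetic_cases(5000, 20, 1.0, rng))
underdispersive = rank_histogram(*synthetic_cases(5000, 20, 0.4, rng))

print("calibrated:     ", calibrated)
print("underdispersive:", underdispersive)
```

In the second histogram the two end bins should dominate, because the observation lands outside the whole ensemble far more often than it would if the spread were right. That is the visual signature of the model-error sampling problem described above.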