Summer projects: I posted yesterday on social network tools for computational scientists. Greg has posted a whole list of additional suggestions.

Here, I will elaborate on another of these ideas: the electronic lab notebook. For computational scientists, wiki pages are an obvious substitute for traditional lab notebooks, because each description of an experiment can then be linked directly with the corresponding datasets, configuration files, visualizations of results, scientific papers, related experiments, etc. (In the most radical version, Open Notebook Science, the lab notebook is completely open for anyone to see. But the toolset would be the same whether it was open to anyone, or just shared with select colleagues.)

In my study of the software practices at the UK Met Office last summer, I noticed that some of the scientists carefully document each experiment via a new wiki page, but the process is laborious in a standard wiki, involving a lot of cut-and-paste to create a suitable page structure. For this reason, many scientists don’t keep good records of their experiments. An obvious improvement would be to generate a basic wiki page automatically each time a model run is configured, and populate it with information about the run, and links to the relevant data files. The scientists could then add further commentary via a standard wiki editor.
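To make the idea concrete, here is a minimal sketch of what such a page generator might look like. Everything in it is an assumption for illustration: the `render_run_page` function, the field names in the run description, and the `[[...]]` link syntax are hypothetical, and are not taken from the Met Office's actual tools or wiki.

```python
from datetime import date


def render_run_page(run: dict) -> str:
    """Build a wiki-page stub for one model run.

    `run` is a hypothetical description of a configured run; the keys
    used here (run_id, owner, ...) are assumptions, not a real schema.
    """
    lines = [
        f"= Experiment {run['run_id']} =",
        f"* Configured on: {run['configured_on'].isoformat()}",
        f"* Owner: {run['owner']}",
        f"* Configuration file: [[{run['config_file']}]]",
        "",
        "== Parameters ==",
    ]
    # List parameters alphabetically so pages for different runs are easy to compare.
    for name, value in sorted(run["parameters"].items()):
        lines.append(f"* {name} = {value}")

    lines += ["", "== Output data =="]
    for path in run["output_files"]:
        lines.append(f"* [[{path}]]")

    # Leave an empty section for the scientist's own commentary.
    lines += ["", "== Commentary ==", "(add notes on the results here)"]
    return "\n".join(lines)


example_run = {
    "run_id": "exp-001",
    "configured_on": date(2009, 4, 1),
    "owner": "a.scientist",
    "config_file": "configs/exp-001.cfg",
    "parameters": {"resolution": "N96", "co2_ppm": 560},
    "output_files": ["output/exp-001/temperature.nc"],
}

page = render_run_page(example_run)
print(page)
```

The generated stub would then be posted to the wiki by whatever mechanism the wiki offers, and the scientist only has to fill in the commentary section.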

Of course, an even better solution is to capture all information about a particular run of the model (including subsequent commentary on the results) as meta-data in the configuration file, so that no wiki pages are needed: lab notebook pages are just user-friendly views of the configuration file. I think that’s probably a longer term project, and links in with the observation that existing climate model configuration tools are hard to use anyway and need to be re-invented. Let’s leave that one aside for the moment…

A related problem is better support for navigating and linking existing lab book pages. For example, in the process of writing up a scientific paper, a scientist might need to search for the descriptions of a number of individual experiments, select some of the data, create new visualizations for use in the paper, and so on. Recording this trail would improve reproducibility, by capturing the necessary links to source data in case the visualizations used in the paper need to be altered or recreated. Some of this requires a detailed analysis of the specific workflows used in a particular lab (which reminds me I need to write up what I know of the Met Office’s workflows), but I think some of it can be achieved by simple generic tools (e.g. browser plugins) that help capture the trail as it happens, and perhaps edit and annotate it afterwards.
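As a rough illustration of what "capturing the trail" could mean in practice, here is a small sketch of a data structure for it. The `Trail` and `TrailStep` classes, the action names and the file names are all hypothetical; a real tool would need to hook into the browser or the analysis scripts to record the steps automatically.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TrailStep:
    action: str                                   # e.g. "search", "select-data", "plot"
    target: str                                   # page, dataset or figure this step produced
    sources: list = field(default_factory=list)   # inputs the step relied on
    note: str = ""                                # annotation added afterwards


@dataclass
class Trail:
    title: str
    steps: list = field(default_factory=list)

    def record(self, action, target, sources=(), note=""):
        """Append one step to the trail as it happens."""
        step = TrailStep(action, target, list(sources), note)
        self.steps.append(step)
        return step

    def sources_for(self, target):
        """Trace back every upstream source that a target depends on."""
        found, pending = [], [target]
        while pending:
            current = pending.pop()
            for step in self.steps:
                if step.target == current:
                    for src in step.sources:
                        if src not in found:
                            found.append(src)
                            pending.append(src)
        return found

    def to_json(self):
        """Serialize the whole trail so it can be stored next to the paper."""
        return json.dumps(asdict(self), indent=2)


trail = Trail("figures for the paper")
trail.record("select-data", "subset.nc", sources=["exp-001/temperature.nc"])
trail.record("plot", "fig2.png", sources=["subset.nc"], note="annual means only")

print(trail.sources_for("fig2.png"))
print(trail.to_json())
```

If a figure in the paper later needs to be altered or recreated, `sources_for` gives the chain of data it was built from, which is the link back to the original experiments that the trail is meant to preserve.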

I’m sure some of these tools must exist already, but I don’t know of them. Feel free to send me pointers…

Had an interesting conversation this afternoon with Brad Bass. Brad is a prof in the Centre for Environment at U of T, and was one of the pioneers of the use of models to explore adaptations to climate change. His agent-based simulations explore how systems react to environmental change, e.g. exploring population balance among animals, insects, the growth of vector-borne diseases, and even entire cities. One of his models is Cobweb, an open-source platform for agent-based simulations.

He’s also involved in the Canadian Climate Change Scenarios Network, which takes outputs from the major climate simulation models around the world, and extracts information on the regional effects on Canada, particularly relevant for scientists who want to know about variability and extremes on a regional scale.

We also talked a lot about educating kids, and kicked around some ideas for how you could give kids simplified simulation models to play with (along the lines that Jon was exploring as a possible project), to get them doing hands-on experimentation with the effects of climate change. We might get one of our summer students to explore this idea, and Brad has promised to come talk to them in May once they start with us.

Oh, and Brad is also an expert on green roofs, and will be demonstrating them to grade 5 kids at the Kids World of Energy Festival.

I just spent the last two hours chewing the fat with Mark Klein at MIT and Mark Tovey at Carleton, talking about all sorts of ideas, but loosely focussed on how distributed collaborative modeling efforts can help address global change issues (e.g. climate, peak oil, sustainability).

MK has a project, Climate Interactive [update: Mark tells me I got the wrong project – it should be The Climate Collaboratorium. Climate Interactive is from a different group at MIT], which is exploring how climate simulation tools can be hooked up to discussions around decision making, which is one of the ideas we kicked around in our brainstorming sessions here.

MT has been exploring how you take ideas from distributed cognition and scale them up to much larger teams of people. He has put together a wonderful one-pager that summarizes many interesting ideas on how mass collaboration can be applied in this space.

This conversation is going to keep me going for days on stuff to explore and blog about:

And lots of interesting ideas for new projects…

A group of us at the lab, led by Jon Pipitone, has been meeting every Tuesday lunchtime (well almost every Tuesday) for a few months, to brainstorm ideas for how software engineers can contribute to addressing the climate crisis. Jon has been blogging some of our sessions (here, here and here).

This week we attempted to create a matrix, where the rows are “challenge problems” related to the climate crisis, and the columns are the various research areas of software engineering (e.g. requirements analysis, formal methods, testing, etc…). One reason to do this is to figure out how to run a structured brainstorming session with a bigger set of SE researchers (e.g. at ICSE). Having sketched out the matrix, we then attempted to populate one row with ideas for research projects. I thought the exercise went remarkably well. One thing I took away from it was that it was pretty easy to think up research projects to populate many of the cells in the matrix (I had initially thought the matrix might be rather sparse by the time we were done).

We also decided that it would be helpful to characterize each of the rows a little more, so that SE researchers who are unfamiliar with some of the challenges would understand each challenge enough to stimulate some interesting discussions. So, here is an initial list of challenges (I added some links where I could). Note that I’ve grouped them according to who the immediate audience is for any tools, techniques, practices, etc.

  1. Help the climate scientists to develop a better understanding of climate processes.
  2. Help the educators to teach kids about climate science – how the science is done, and how we know what we know about climate change.
    • Support hands-on computational science (e.g. an online climate lab with building blocks to support construction of simple simulation models)
    • Global warming games
  3. Help the journalists & science writers to raise awareness of the issues around climate change for a broader audience.
    • Better public understanding of climate processes
    • Better public understanding of how climate science works
    • Visualizations of complex earth systems
    • Connect data generators (e.g. scientists) with potential users (e.g. bloggers)
  4. Help the policymakers to design, implement and adjust a comprehensive set of policies for reducing greenhouse gas emissions.
  5. Help the political activists who put pressure on governments to change their policies, or to get better leaders elected when the current ones don’t act.
    • Social networking tools for activists
    • Tools for persuasion (e.g. visualizations) and community building (e.g. Essence)
  6. Help individuals and communities to lower their carbon footprints.
  7. Help the engineers who are developing new technologies for renewable energy and energy efficiency systems.
    • Green IT
    • Smart energy grids
    • Waste reduction
    • Renewable energy
    • Town planning
    • Green buildings/architecture
    • Transportation systems (better public transit, electric cars, etc)
    • etc

We had a discussion today with the grad students taking my class on empirical research methods, on the role of blogging by researchers. Some students thought that it was a bad idea to post their research ideas on their blogs, because other people might steal them. This is, of course, a perennial fear amongst grad students – that someone else will do the same research and publish it first. I argued strongly that it doesn’t happen, for two reasons:

  1. the idea is only a tiny part of the research – it’s what you do with the idea that really matters. Bill Buxton has a whole talk on this, the summary of which is:  The worst thing in the world is a precious idea; The worst person to have on your team is someone who thinks his idea is precious; Good ideas are cheap, they are not precious; The key is not to come up with ideas but to cultivate the adoption of ideas.
  2. even if someone else works on the same idea, they will approach it in a different way, and both projects will be a contribution to knowledge (and therefore be worthy of publication).

After the class, Simon sent me a pointer to Michael Nielsen’s blog post on the importance of scientists sharing their ideas via blogs. It’s great reading.

Note: I’m particularly chuffed about the relevance of Nielsen’s post to climate science, as the Navier-Stokes equations he mentions in his example lie at the heart of climate simulation models.