I posted some initial ideas for projects for our summer students a while back. I’m pleased to say that the students have been making great progress in the last few weeks (despite, or perhaps because of, the fact that I haven’t been around much). Here’s what they’ve been up to:

Sarah Strong and Ainsley Lawson have been exploring how to take the ideas on visualizing the social network of a software development team (as embodied in tools such as Tesseract), and apply them as simple extensions to code browsers / version control tools. The aim is to see if we can add some value in the form of better awareness of who is working on related code, but without asking the scientists to adopt entirely new tools. Our initial target users are the climate scientists at the UK Met Office Hadley Centre, who currently use SVN/Trac as their code management environment.

Brent Mombourquette has been working on a Firefox extension that will capture the browsing history as a graph (pages and traversed links), which can then be visualized, saved, annotated, and shared with others. The main idea is to support the way in which scientists search/browse for resources (e.g. published papers on a particular topic), and to allow them to recall their exploration path to remember the context in which they obtained these resources. I should mention the key idea goes all the way back to Vannevar Bush’s memex.
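The core data structure here is small: a directed graph of visited pages plus user annotations, serialized so an exploration path can be saved and shared. Here’s a minimal Python sketch of that idea (the class, method names, and URLs are all made up for illustration; the real extension would of course live inside the browser):

```python
from collections import defaultdict
import json

class BrowseGraph:
    """Record pages as nodes and traversed links as directed edges."""

    def __init__(self):
        self.edges = defaultdict(set)   # from_url -> set of to_urls
        self.notes = {}                 # url -> user annotation

    def visit(self, from_url, to_url):
        """Record that the user followed a link from one page to another."""
        self.edges[from_url].add(to_url)

    def annotate(self, url, note):
        """Attach a free-text note to a page, to capture context."""
        self.notes[url] = note

    def to_json(self):
        """Serialize the exploration path for saving or sharing."""
        return json.dumps({
            "edges": {u: sorted(vs) for u, vs in self.edges.items()},
            "notes": self.notes,
        }, indent=2)

# Toy usage: a two-page exploration path with one annotation.
g = BrowseGraph()
g.visit("scholar.example/search?q=memex", "papers.example/bush1945")
g.annotate("papers.example/bush1945", "Bush's original 'As We May Think' essay")
print(g.to_json())
```

The interesting design questions start once the graph exists: how to visualize it compactly, and how to merge graphs shared by different people.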

Maria Yancheva has been exploring the whole idea of electronic lab notebooks. She has been studying the workflows used by the climate scientists when they configure and run their simulation models, and considering how a more structured form of wiki might help them. She has selected OpenWetWare as a good starting point, and is investigating how to add extensions to MediaWiki to make OWW more suitable for computational science, especially for keeping track of model runs.

Samar Sabie has also been looking at MediaWiki extensions, specifically to find a way to add visualizations into wiki pages and blogs as simply as possible. The problem is that currently, adding something as simple as a table of data to a page requires extensive work with the markup language. The long-term aim is to enable the insertion of dynamic visualizations (such as those at Many Eyes), but the starting point is to make it ridiculously simple to insert a data table, link it to a graph, and select appropriate parameters to make the graph look good, so that users can subsequently change the appearance in useful ways (which means cut and paste from Excel spreadsheets won’t be good enough).
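To make the “ridiculously simple” goal concrete: one obvious first step is to accept exactly what a user pastes from a spreadsheet (tab-separated text) and generate the wiki table markup for them. A rough Python sketch, assuming standard MediaWiki table syntax and treating the first pasted row as headers (the helper name and sample data are hypothetical):

```python
def paste_to_wikitable(pasted):
    """Convert a tab-separated block (as pasted from a spreadsheet)
    into MediaWiki table markup, with the first row as a header row."""
    rows = [line.split("\t") for line in pasted.strip().splitlines()]
    header, body = rows[0], rows[1:]
    out = ['{| class="wikitable"']
    out.append("! " + " !! ".join(header))   # header cells
    for row in body:
        out.append("|-")                     # row separator
        out.append("| " + " || ".join(row))  # data cells
    out.append("|}")
    return "\n".join(out)

# Toy usage: two rows of data pasted from a spreadsheet.
print(paste_to_wikitable("Year\tCO2 (ppm)\n1990\t354\n2000\t369"))
```

The harder part, of course, is keeping the table linked to a live graph afterwards, rather than producing dead markup the way this sketch does.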

Oh, and they’ve all been regularly blogging their progress, so we’re practicing the whole open notebook science thingy.

Okay, I’ve had a few days to reflect on the session on Software Engineering for the Planet that we ran at ICSE last week. First, I owe a very big thank you to everyone who helped – to Spencer for co-presenting and lots of follow up work; to my grad students, Jon, Alicia, Carolyn, and Jorge for rehearsing the material with me and suggesting many improvements, and for helping advertise and run the brainstorming session; and of course to everyone who attended and participated in the brainstorming for lots of energy, enthusiasm and positive ideas.

First action as a result of the session was to set up a Google group, SE-for-the-planet, as a starting point for coordinating further conversations. I’ve posted the talk slides and brainstorming notes there. Feel free to join the group, and help us build the momentum.

Now, I’m contemplating a whole bunch of immediate action items. I welcome comments on these and any other ideas for immediate next steps:

  • Plan a follow-up workshop at a major SE conference in the fall, and another at ICSE next year (waiting a full year was considered by everyone to be too slow).
  • Give my part of the talk at U of T in the next few weeks, and film it and get it up on the web.
  • Write a short white paper based on the talk, and fire it off to NSF and other funding agencies, to get funding for community-building workshops.
  • Write a short challenge statement, to which researchers can respond with project ideas to bring to the next workshop.
  • Write up a vision paper based on the talk for CACM and/or IEEE Software.
  • Take the talk on the road (à la Al Gore), and offer to give it at any university that has a large software engineering research group (assuming I can come to terms with the increased personal carbon footprint 😉).
  • Broaden the talk to a more general computer science audience and repeat most of the above steps.
  • Write a short book (pamphlet) on this, to be used to introduce the topic in undergraduate CS courses, such as computers and society, project courses, etc.

Phew, that will keep me busy for the rest of the week…

Oh, and I managed to post my ICSE photos at last.

As a fan of Edward Tufte’s books on the power of beautiful visualizations of qualitative and quantitative data, I’m keen on the idea of exploring new ways of visualizing the climate change challenge, in part because many key policymakers are unlikely ever to read the detailed reports on the science, whereas a few simple, compelling graphics might capture their attention.

I like the visualizations collected by the UNEP, especially their summary of climate processes and effects, their strategic options curve, the map of political choices, summary of emissions by sector, a guide to emissions assessment, trends in sea level rise, and CO2 emissions per capita. I should also point out that the IPCC reports are full of great graphics too, but there’s no easy visual index – you have to read the reports.

Now these are all very nice, and (presumably) the work of professional graphic artists. But they’re all static. The scientist in me wants to play with them. I want to play around with different scales on the axes. I want to select from among different data series. And I want to do this in a web browser that’s directly linked to the data sources, so that I don’t have to mess around with the data directly, nor worry about how the data is formatted.

What I have in mind is something like Gapminder. This allows you to play with the data, create new views, and share them with others. Many Eyes is similar, but goes one step further in allowing a community to create entirely new kinds of visualization, and enhance each other’s, in a social networking style. Now, if I can connect up some of these to the climate data sets collected by the IPCC, all sorts of interesting things might happen. Except that the IPCC data sets don’t have enough descriptive metadata for non-experts to make sense of them. But fixing that’s another project.

Oh, and the periodic table of visualization methods is pretty neat as a guide to what’s possible.

Update: (via Shelly): Worldmapper is an interesting way of visualizing international comparisons.

(via Grist) A new report from the World Bank on effects of storm surges and extreme weather as a result of global warming. (See an overview in the NY Times, and the draft report). 

(via Gillian) A report in the Lancet on the impacts on health, which begins with the sentence “Climate Change is the biggest global health threat of the 21st Century”. (See an overview in New Scientist, and the Editorial and full report in the Lancet). But to me, this is the most interesting bit: a roadmap for applied research in health and climate change.

And while we’re on the topic of research roadmaps, here’s one on Psychology and Climate Change, from the Australian Psychological Association.

Update: And another one from WWF and ETNOA – a roadmap on how the ICT sector can contribute to emissions reduction.

I like these roadmaps – send more!

This summer, we have a group of undergrad students working with us, who will try building some of the tools we have identified as potentially useful for climate scientists. We’re just getting started this week, so it’s not clear what we’ll actually build yet, but I think I can guarantee we’ll end up with one of two outcomes: either we build something that is genuinely useful, or we learn a lot about what doesn’t work and why not.

Here’s the first project idea. It responds to the observation that large climate models (and indeed any large-scale scientific simulation) undergo continuous evolution, as a variety of scientists contribute code over a long period of time (decades, in some cases). There is no well-defined specification for the system, nor do the scientists even know ahead of time exactly what the software should do. Coordinating contributions to this code then becomes a problem. If you want to make a change to some particular routine, it can be hard to know who else is working on related code, what potential impacts your change might have, and sometimes it is hard even to know who to go and ask about these things – who’s the expert?

A similar problem occurs in many other types of software project, and there is a fascinating line of research that exploits the social network to visualize how the efforts of different people interact. It draws on work in sociology on social network analysis – basically the idea that you can treat a large group of people and their social interactions as a graph, which can then be visualized in interesting ways, and analyzed for its structural properties, to identify things like distance (as in six degrees of separation), and structural cohesion. For software engineering purposes, we can automatically construct two distinct graphs:

  1. A graph of social interactions (e.g. who talks to whom). This can be constructed by extracting records of electronic communication from the project database – email records, bug reports, bulletin boards, etc. Of course, this misses verbal interactions, which makes it more suitable for geographically distributed projects, but there are ways of adding some of this missing information if needed (e.g. if we can mine people’s calendars, meeting agendas, etc).
  2. A graph of code dependencies (which bits of code are related). This can include simply which routines call which other routines. More interestingly, it can include information such as which bits of code were checked into the repository at the same time by the same person, which bits of code are linked to the same bug report, etc.

Comparing these two graphs offers insight into socio-technical congruence – how well the social network (who talks to whom) matches the technical dependencies in the code. Which then leads to all sorts of interesting ideas for tools.
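The comparison itself can be sketched very simply. One common definition of congruence (due to Cataldo and colleagues) is the proportion of technical dependencies between people that are matched by an actual communication link. A minimal Python sketch, assuming both graphs have already been mined into lists of person-to-person pairs (the names here are invented):

```python
def congruence(technical_deps, social_links):
    """Socio-technical congruence: the fraction of technical dependencies
    between people that are matched by an actual communication link.
    Both inputs are iterables of (person, person) pairs; direction ignored."""
    tech = {frozenset(pair) for pair in technical_deps}
    social = {frozenset(pair) for pair in social_links}
    if not tech:
        return 1.0  # no coordination needed, so trivially congruent
    return len(tech & social) / len(tech)

# Toy example: Ann/Bob and Bob/Cary work on interdependent code,
# but only Ann and Bob actually talk to each other.
tech = [("ann", "bob"), ("bob", "cary")]
social = [("ann", "bob"), ("ann", "dee")]
print(congruence(tech, social))  # one of two dependencies matched: 0.5
```

A tool would then highlight the unmatched pairs (Bob and Cary, here) as people who probably ought to be talking but aren’t.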

For added difficulty, we have to assume that our target users (climate scientists) are programming in Fortran, and are not using integrated programming environments, although we can assume they have good version control tools (e.g. Subversion) and good bug tracking tools (e.g. Trac).

Well, this is a little off topic, but we (Janice, Dana, Peggy and I) have been invited to run this year’s International Advanced School of Empirical Software Engineering, in Florida in October. We’ve planned the day around the content of our book chapter on Selecting Empirical Research Methods for Software Engineering Research, which appeared in the book Guide to Advanced Empirical Software Engineering. It’s going to be a lot of fun!

At many discussions about the climate crisis that I’ve had with professional colleagues, the conversation inevitably turns to how we (as individuals) can make a difference by reducing our personal carbon emissions. So sure, our personal choices matter. And we shouldn’t stop thinking about them. And there is plenty of advice out there on how to green your home, and how to make good shopping decisions, and so on. Actually, there is way too much advice out there on how to live a greener life. It’s overwhelming. And plenty of it is contradictory. Which leads to two unfortunate messages: (1) we’re supposed to fix global warming through our individual personal choices and (2) this is incredibly hard because there is so much information to process to do it right.

The climate crisis is huge, and systemic. It cannot be solved through voluntary personal lifestyle choices; it needs systemic changes throughout society as a whole. As Bill McKibben says:

“the number one thing is to organize politically; number two, do some political organizing; number three, get together with your neighbors and organize; and then if you have energy left over from all of that, change the light bulb.”

Now, part of getting politically organized is getting educated. Another part is connecting with people. We computer scientists are generally not very good at political action, but we are remarkably good at inventing tools that allow people to get connected. And we’re good at inventing tools for managing, searching and visualizing information, which helps with the ‘getting educated’ part and the ‘persuading others’ part.

So, I don’t want to have more conversations about reducing our personal carbon footprints. I want to have conversations about how we can apply our expertise as computer scientists and software engineers in new and creative ways. Instead of thinking about your footprint, think about your delta (okay, I might need a better name for it): what expertise and skills do you have that most others don’t, and how can they be applied to good effect to help?

In honour of Ada Lovelace day, I decided to write a post today about Prof Julia Slingo, the new chief scientist at the UK Met Office. News of Julia’s appointment came out in the summer last year during my visit to the Met Office, coincidentally on the same day that I met her, at a workshop on the HiGEM project (where, incidentally, I saw some very cool simulations of ocean temperatures). Julia’s role at the meeting was to represent the sponsor (NERC – the UK equivalent of Canada’s NSERC), but what impressed me about her talk was both her detailed knowledge of the project, and the way she nurtured it – she’ll make a great chief scientist.

Julia’s research has focussed on tropical variability, particularly improving our understanding of the monsoons, but she’s also played a key role in earth system modeling, and especially in the exploration of high resolution models. But best of all, she’s just published a very readable account of the challenges in developing the next generation of climate models. Highly recommended for a good introduction to the state of the art in climate modeling.

First, a couple of local ones in May.

Then, this one looks interesting: The World Climate Conference, in Geneva at the end of August. It looks like most of the program will be invited, but they will be accepting abstracts for a poster session. Given that the theme is to do with how climate information is generated and used, it sounds very appropriate.

Followed almost immediately by EnviroInfo2009, in Berlin, in September. I guess the field I want to name “Climate Informatics” would be a subfield of environmental informatics. Paper deadline is April 6.

Finally, there’s the biggy in Copenhagen in December, where, hopefully, the successor to the Kyoto agreement will be negotiated.

A group of us at the lab, led by Jon Pipitone, has been meeting every Tuesday lunchtime (well almost every Tuesday) for a few months, to brainstorm ideas for how software engineers can contribute to addressing the climate crisis. Jon has been blogging some of our sessions (here, here and here).

This week we attempted to create a matrix, where the rows are “challenge problems” related to the climate crisis, and the columns are the various research areas of software engineering (e.g. requirements analysis, formal methods, testing, etc…). One reason to do this is to figure out how to run a structured brainstorming session with a bigger set of SE researchers (e.g. at ICSE). Having sketched out the matrix, we then attempted to populate one row with ideas for research projects. I thought the exercise went remarkably well. One thing I took away from it was that it was pretty easy to think up research projects to populate many of the cells in the matrix (I had initially thought the matrix might be rather sparse by the time we were done).

We also decided that it would be helpful to characterize each of the rows a little more, so that SE researchers who are unfamiliar with some of the challenges would understand each challenge enough to stimulate some interesting discussions. So, here is an initial list of challenges (I added some links where I could). Note that I’ve grouped them according to who the immediate audience is for any tools, techniques, and practices.

  1. Help the climate scientists to develop a better understanding of climate processes.
  2. Help the educators to teach kids about climate science – how the science is done, and how we know what we know about climate change.
    • Support hands-on computational science (e.g. an online climate lab with building blocks to support construction of simple simulation models)
    • Global warming games
  3. Help the journalists & science writers to raise awareness of the issues around climate change for a broader audience.
    • Better public understanding of climate processes
    • Better public understanding of how climate science works
    • Visualizations of complex earth systems
    • Connect data generators (e.g. scientists) with potential users (e.g. bloggers)
  4. Help the policymakers to design, implement and adjust a comprehensive set of policies for reducing greenhouse gas emissions.
  5. Help the political activists who put pressure on governments to change their policies, or to get better leaders elected when the current ones don’t act.
    • Social networking tools for activists
    • Tools for persuasion (e.g. visualizations) and community building (e.g. Essence)
  6. Help individuals and communities to lower their carbon footprints.
  7. Help the engineers who are developing new technologies for renewable energy and energy efficiency systems.
    • Green IT
    • Smart energy grids
    • Waste reduction
    • Renewable energy
    • Town planning
    • Green buildings/architecture
    • Transportation systems (better public transit, electric cars, etc.)
    • etc.

Next month, I’ll be attending the European Geosciences Union’s General Assembly, in Austria. It will be my first trip to a major geosciences conference, and I’m looking forward to rubbing shoulders with thousands of geoscientists.

My colleague, Tim, will be presenting a poster in the Climate Prediction: Models, Diagnostics, and Uncertainty Analysis session on the Thursday, and I’ll be presenting a talk on the last day in the session on Earth System Modeling: Strategies and Software. My talk is entitled Are Earth System model software engineering practices fit for purpose? A case study.

While I’m there, I’ll also be taking in the Ensembles workshop that Tim is organising, and attending some parts of the Seamless Assessment session, to catch up with more colleagues from the Hadley Centre. Sometime soon I’ll write a blog post on what ensembles and seamless assessment are all about (for now, it will just have to sound mysterious…)

The rest of the time, I plan to talk to as many climate modellers as I can from other centres, as part of my quest for comparison studies for the one we did at the Hadley Centre.

I’ve been pondering starting a blog for way too long. Time for action. To explain what I think I’ll be blogging about, I put together the following blurb, for a conference session at the International Conference on Software Engineering. I’ll probably end up revising it for the conference, but it will do for a kickoff to the blog:

This year, the ICSE organisers have worked hard to make the conference “greener” – to reduce our impact on the environment. Partly this is in response to the growing worldwide awareness that we need to take more care of the natural environment. But partly it is driven by a deeper and more urgent concern. During this century, we will have to face up to a crisis that will make the current economic turmoil look like a walk in the park. Climate change is accelerating, outpacing the most pessimistic predictions of climate scientists. Its effects will touch everything, including the flooding of low-lying lands and coastal cities, the disruption of fresh water supplies for most of the world, the loss of agricultural lands, more frequent and severe extreme weather events, mass extinctions, and the destruction of entire ecosystems. And there are no easy solutions. We need concerted systematic change in how we live, to stabilize the concentration of greenhouse gases that drive climate change. Not to give up the conveniences of modern life, but to re-engineer them so that we no longer depend on fossil fuels to power our lives. The challenge is massive and urgent – a planetary emergency. The type of emergency that requires all hands on deck. Scientists, engineers, policymakers, professionals, no matter what their discipline, need to ask how their skills and experience can contribute.

We, as software engineering researchers and software practitioners, have many important roles to play. Software is part of the problem, as every new killer application drives up our demand for more energy. But it is also a major part of the solution. Our information systems help provide the data we need to support intelligent decision making, from individuals trying to reduce their energy consumption, to policymakers trying to design effective governmental policies. Our control systems allow us to make smarter use of the available power, and provide the adaptability and reliability to power our technological infrastructure in the face of a more diverse set of renewable energy sources. Less obviously, the software engineering community has many other contributions to make. We have developed practices and tools to analyze, build and evolve some of the most complex socio-technical systems ever created, and to coordinate the efforts of large teams of engineers. We have developed abstractions that help us to understand complex systems, to describe their structure and behaviour, and to understand the effects of change on those systems. These tools and practices are likely to be useful in our struggle to address the climate crisis, often in strange and surprising ways. For example, can we apply the principles of information hiding and modularity to our attempts to develop coordinated solutions to climate change? What is the appropriate architectural pattern for an integrated set of climate policies? How can we model the problem requirements so that the stakeholders can understand them? How do we debug strategies for emissions reduction when they don’t work out as intended?

This conference session is intended to kick start a discussion about the contributions that software engineering can make to tackling the climate crisis. Our aim is to build a community of concerned professionals, and find new ways to apply our skills and experience to the problem. We will attempt to map out a set of ideas for action, and identify potential roadblocks. We will start to build a broad research agenda, to capture the potential contributions of software engineering research. The session will begin with a short summary of the latest lessons from climate science, and a concrete set of examples of existing software engineering research efforts applied to climate change. We will include an open discussion, and structured brainstorming sessions to map out an agenda for action. We invite everyone to come to the session, and take up this challenge.

Okay, so how does that sound as a call to arms?