Okay, I’ve had a few days to reflect on the session on Software Engineering for the Planet that we ran at ICSE last week. First, I owe a very big thank you to everyone who helped – to Spencer for co-presenting and lots of follow up work; to my grad students, Jon, Alicia, Carolyn, and Jorge for rehearsing the material with me and suggesting many improvements, and for helping advertise and run the brainstorming session; and of course to everyone who attended and participated in the brainstorming for lots of energy, enthusiasm and positive ideas.

First action as a result of the session was to set up a google group, SE-for-the-planet, as a starting point for coordinating further conversations. I’ve posted the talk slides and brainstorming notes there. Feel free to join the group, and help us build the momentum.

Now, I’m contemplating a whole bunch of immediate action items. I welcome comments on these and any other ideas for immediate next steps:

  • Plan a follow up workshop at a major SE conference in the fall, and another at ICSE next year (waiting a full year was considered by everyone to be too slow).
  • I should give my part of the talk at U of T in the next few weeks, and we should film it and get it up on the web. 
  • Write a short white paper based on the talk, and fire it off to NSF and other funding agencies, to get funding for community building workshops
  • Write a short challenge statement, to which researchers can respond with project ideas to bring to the next workshop.
  • Write up a vision paper based on the talk for CACM and/or IEEE Software
  • Take the talk on the road (a la Al Gore), and offer to give it at any university that has a large software engineering research group (assuming I can come to terms with the increased personal carbon footprint 😉)
  • Broaden the talk to a more general computer science audience and repeat most of the above steps.
  • Write a short book (pamphlet) on this, to be used to introduce the topic in undergraduate CS courses, such as computers and society, project courses, etc.

Phew, that will keep me busy for the rest of the week…

Oh, and I managed to post my ICSE photos at last.

The American Geophysical Union’s Joint Assembly is in Toronto this week. It’s a little slim on climate science content compared to the EGU meeting, but I’m taking in a few sessions as it’s local and convenient. Yesterday I managed to visit some of the climate science posters. I also caught the last talk of the session on connecting space and planetary science, and learned that the solar cycles have a significant temperature impact on the upper atmosphere, but no obvious effect on the lower atmosphere, though more research is needed to understand the impact on climate simulations. (Heather Andres’ poster has some more detail on this).

This morning, I attended the session on Regional Scale Climate Change. I’m learning that understanding the relationship between temperature change and increased tropical storm activity is complicated, because tropical storms seem to react to complex patterns of temperature change, rather than just the temperature itself. I’m also learning that you can use statistical downscaling from the climate models to get finer grained regional simulations of the changes in rainfall, e.g. over the US, leading to predictions of increased precipitation over much of the US in the winters and decreased in the summers. However, you have to be careful, because the models don’t capture seasonal variability well in some parts of the continent. A particular challenge for regional climate predictions is that some places (e.g. the Caribbean islands) are just too small to show up in the grids used in General Circulation Models (GCMs), which means we need more work on Regional Models to get the necessary resolution.
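
Just to make the downscaling idea concrete, here’s my own minimal sketch (not the method presented in the session – the station, numbers, and choice of a simple linear regression are all made up): fit a statistical relationship between coarse GCM grid-cell output and local observations over a historical period, then apply it to the GCM’s future output.

```python
# Minimal sketch of statistical downscaling (illustrative only; not the
# method from the talks). Train a regression from coarse GCM grid-cell
# precipitation to observed precipitation at a station inside that cell,
# then apply it to future GCM output. All data here is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Hypothetical training data: 30 winter means from one GCM grid cell,
# paired with observations at a station inside the cell (mm/day).
gcm_hist = rng.normal(3.0, 0.5, size=(30, 1))
station_obs = 1.4 * gcm_hist[:, 0] + rng.normal(0.0, 0.3, size=30)

model = LinearRegression().fit(gcm_hist, station_obs)

# Apply the fitted relationship to (hypothetical) future GCM output.
gcm_future = rng.normal(3.4, 0.5, size=(30, 1))
station_future = model.predict(gcm_future)

print("Estimated local winter precipitation change: "
      f"{station_future.mean() - station_obs.mean():+.2f} mm/day")
```

The caveat from the session applies to any scheme like this: if the GCM gets the seasonal variability wrong over a region, the fitted relationship is being applied outside the conditions it was trained on.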

The final talk is Noah Diffenbaugh’s, on an ensemble approach to regional climate forecasts. He’s using the IPCC’s A1B scenario (but notes that in the last few years, emissions have exceeded those for this scenario). The model is nested – a high resolution regional model (25km) is nested within a GCM (CCSM3, at T85 resolution), but the information flows only in one direction, from the GCM to the RCM. As far as I can tell, the reason it’s one way is because the GCM run is pre-computed; specifically, it is taken by averaging 5 existing runs of the CCSM3 model from the IPCC AR4 dataset, and generating 6-hourly 3D atmosphere fields to drive the regional model. The runs show that by 2030-2039, we should expect 6-8 heat stress events per decade across the whole of the south-west US (where a heat stress event is the kind of thing that should only hit once per decade). Interestingly, the warming is greater in the south-eastern US, but because the south-western states are already closer to the threshold temperature for heat stress events, they get more heatwaves. Noah also showed some interesting validation images, to demonstrate that the regional model reproduces 20th Century temperatures over the US much better than the GCM does.
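
If I’ve understood the metric correctly, it can be illustrated in a few lines (entirely synthetic numbers, not Noah’s data): take the seasonal temperature that was only exceeded about once per decade in a baseline period, and count how often a warmer future decade exceeds it.

```python
# Sketch of the "heat stress event" metric as I understand it from the talk
# (synthetic data, my own illustration; the temperatures are made up).
import numpy as np

rng = np.random.default_rng(1)

baseline = rng.normal(30.0, 1.5, size=50)    # 50 years of summer-mean temps (°C)
future = rng.normal(31.5, 1.5, size=10)      # one warmer future decade

threshold = np.quantile(baseline, 0.9)       # exceeded ~once per decade historically
events = int((future > threshold).sum())     # exceedances in the future decade
print(f"Threshold: {threshold:.1f} °C, heat stress events in future decade: {events}")
```

Even a modest shift in the mean pushes the exceedance count well above one per decade, which is why regions already sitting close to the threshold see the biggest increase in events.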

Noah also talked a little about the role of the 2°C threshold used in climate negotiations, particularly at the Copenhagen meeting. The politicians don’t like that the climate scientists are expressing uncertainty about the 2°C threshold. But there has to be uncertainty, because the models show that even below 2 degrees, there are some serious regional impacts, in this case on the US. His take home message is that we need to seriously question greenhouse gas mitigation targets. One of the questioners pointed out that there is also some confusion over what baseline the 2°C is supposed to be measured against (e.g. whether it is above pre-industrial temperatures).

After lunch, I attended the session on Breakthrough Ideas and Technologies for a Planet at Risk II. The first talk was by Lewis Gilbert on monitoring and managing a planet at risk. First, he noted that really, the planet itself isn’t at risk – destroying it is still outside our capacity. Life will survive. Humans will survive (at least for a while). But it’s the quality of that survival that is in question. Some definitions of sustainability (he has quibbles with them all): first, Brundtland’s – future generations should be able to meet their own needs; Natural Capital – future generations should have a standard of living better than or equal to our own; and Gilbert’s own – the existence of a set of possible futures that are acceptable in some satisficing sense. But all of these definitions are based on human values and human life. So the concept of sustainability has human concerns deeply embedded in it. The rest of his talk was a little vague – he described a state space, E, with multiple dimensions (e.g. physical, such as CO2 concentrations; sociological, such as infant mortality in Somalia; biological, such as amphibian counts in the Sierra Nevada), in which we can talk about quality of human life as some function of the vectors. The question then becomes what are the acceptable and unacceptable regions of E. But I’m not sure how this helps any.

Alan Robock talked about geoengineering. He’s conducted studies of the effect of seeding sulphur particles into the atmosphere, using NASA’s climate model. In particular, injecting them over the arctic, where there is the most temperature change, and the least impact on humans. His studies show that the seeding does have a significant impact on temperature, but as soon as you stop the seeding, the global warming quickly rises to where it would have been. So basically, once you start, you can’t stop. Also, you get other effects: e.g. a reduction of the tropical monsoons, and a reduction of precipitation. Here’s an alternative: could it be done by just seeding in the arctic summer (when the temperature rise matters), and not in the winter? E.g. seed in April, May and June, or just in April, rather than year round. He’s exploring options like these with the model. Interesting aside: Rolling Stone Magazine, Nov 3, 2006, “Dr Evil’s plan to stop Global Warming”. There was a meeting convened by NASA, at which Alan started to create a long list of risks associated with geoengineering (and he has a newer paper updating the list currently in submission).

George Shaw talked about biogeologic carbon sequestration. First, he demolished the idea that peak oil / peak coal etc will save us, by calculating the amount of carbon that can easily be extracted from known fossil fuel reserves. Carbon capture ideas include iron fertilization of the oceans, which stimulates plankton growth, which extracts carbon from the atmosphere. Cyanobacteria also extract carbon: e.g. attach an algae farm to every power station smoke stack. However, to make any difference, the algae farm for one power plant might have to be 40-50 square km. He then described a specific case study of taking the Salton Basin area in southern California and filling it up with an algae farm. This would remove a chunk of agricultural land, but would probably make money under the current carbon trading schemes.

Roel Snieder gave a talk “Facing the Facts and Living Our Values”. Interesting graph on energy efficiency, which shows that 60% of the energy we use is lost. He also presented a version of the graph showing cost of intervention against emissions reduction, pointing out that sequestration is the most expensive choice of all. Another nice point about understanding the facts: how much CO2 gas is produced by burning all the coal in one railroad car? The answer is about 3 times the weight of the coal, but most people would say only a few ounces, because gases are very light. He also has a neat public lecture, and encouraged the audience to get out and give similar lectures to the public.
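
The factor of three follows straight from the chemistry: carbon has an atomic mass of about 12 and CO2 a molecular mass of about 44, so each tonne of carbon burned yields roughly 3.7 tonnes of CO2 (a bit less per tonne of coal, since coal isn’t pure carbon). A back-of-envelope check, with assumed numbers for the car size and carbon content:

```python
# Back-of-envelope check of the "about 3 times the weight of the coal" claim.
# The car size and carbon content are my assumptions, not figures from the talk.
M_C, M_CO2 = 12.0, 44.0        # approximate molar masses (g/mol)
coal_mass = 100.0              # tonnes of coal per railroad car (assumed)
carbon_fraction = 0.75         # assumed carbon content of the coal

co2_mass = coal_mass * carbon_fraction * (M_CO2 / M_C)
print(f"CO2 produced: {co2_mass:.0f} tonnes, "
      f"or {co2_mass / coal_mass:.1f}x the mass of the coal")
# -> 275 tonnes, i.e. about 2.8x -- consistent with "about 3 times"
```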

Eric Barron: Beyond Climate Science. It’s a mistake for the climate science community to say that “the science is settled” and that we just need to move on to mitigation strategies. There are still five things we need:

  1. A true climate service – an authoritative, credible, user-centric source of information on climate (models and data). E.g. advice on resettlement of threatened towns, advice on forestry management, etc.
  2. Deliberately expand the family of forecasting elements. Some natural expansion of forecasting is occurring, but the geoscience community needs to push this forward deliberately.
  3. Invest in stage 2 science – social sciences and the human dimension of climate change (the physical science budget dwarfs the social sciences budget).
  4. Deliberately tackle the issue of scale and the demand for an integrated approach.
  5. Evolve from independent research groups to environmental “intelligence” centres. Cohesive regional observation and modeling framework. And must connect vigorously with users and decision-makers.

Key point: we’re not ready. Characterizes the research community as a cottage industry of climate modellers. Interesting analogy: health sciences, which is almost entirely a “point-of-service” community that reacts to people coming in the door, with no coherent forecasting service. Finally, some examples of forecasting spread of west nile disease, lyme disease, etc.

ICSE proper finished on Friday, but a few brave souls stayed around for more workshops on Saturday. There were two workshops in adjacent rooms that had a big topic overlap: SE Foundations for End-user programming (SEE-UP) and Software Engineering for Computational Science and Engineering (SECSE, pronounced “sexy”). I attended the latter, but chatted to some people attending the former during the breaks – seems we could have merged the two workshops for interesting effect. At SECSE, the first talk was by Greg Wilson, talking about the results of his survey of computational scientists. Some interesting comments about the qualitative data he showed, including the strong confidence exhibited in most of the responses (people who believe they are more effective at using computers than their colleagues). This probably indicates a self-selection bias, but it would be interesting to probe the extent of this. Also, many of them take a “toolbox” perspective – they treat the computer as a set of tools, and associate effectiveness with how well people understand the different tools, and how much they take the time to understand them. Oh and many of them mention that using a Mac makes them more effective. Tee Hee.

Next up: Judith Segal, talking about organisational and process issues – particularly the iterative, incremental approach scientists take to building software, with only cursory requirements analysis and only cursory testing. The model works because the programmers are the users – they build software for themselves, and because the software is developed (initially) only to solve a specific problem, they can ignore maintainability and usability. Of course, the software often does escape from the lab and get used by others, which creates a large risk of incorrect, poorly designed software producing incorrect results. For the scientific communities Judith has been working with, there’s a cultural issue too – the scientists don’t value software skills, because they’re focussed on scientific skills and understanding. Also, openness is a problem because they are busy competing for publications and funding. But this is clearly not true of all scientific disciplines, as the climate scientists I’m familiar with are very different: for them computational skills are right at the core of their discipline, and they are much more collaborative than competitive.

Roscoe Bartlett, from Sandia Labs, presenting “Barely Sufficient Software Engineering: 10 Practices to Improve Your CSE Software”. It’s a good list: agile (incremental) development, code management, mailing lists, checklists, and making the source code the primary source of documentation. Most important was the idea of “barely sufficient”: mindless application of formal software engineering processes to computational science doesn’t make any sense.

Carlton Crabtree described a study design to investigate the role of agile and plan-driven development processes among scientific software development projects. They are particularly interested in exploring the applicability of the Boehm and Turner model as an analytical tool. They’re also planning to use grounded theory to explore the scientists’ own perspectives, although I don’t quite get how they will reconcile the constructivist stance of grounded theory (it’s intended as a way of exploring the participants’ own perspectives) with the use of a pre-existing theoretical framework, such as the Boehm and Turner model.

Jeff Overbey, on refactoring Fortran. First, he started with a few thoughts on the history of Fortran (the language that everyone keeps thinking will die out, but never does. Some reference to zombies in here…). Jeff pointed out that languages only ever accumulate features (because removing features breaks backwards compatibility), so they just get more complex and harder to use with each update to the language standard. So, he’s looking at whether you can remove old language features using refactoring tools. This is especially useful for the older language features that encourage bad software engineering practices. Jeff then demo’d his tool. It’s neat, but is currently only available as an Eclipse plugin. If there were an emacs version, I could get lots of climate scientists to use this. [Note: in the discussion, Greg recommended the book Working Effectively with Legacy Code.]

Next up: Roscoe again, this time on integration strategies. The software integration issues he describes are very familiar to me, and he outlined an “almost” continuous integration process, which makes a lot of sense. However, some of the things he describes as challenges don’t seem to be problems in the environment I’m familiar with (the climate scientists at the Hadley Centre). I need to follow up on this.

Last talk before the break: Wen Yu, talking about the use of program families for scientific computation, including a specific application for finite element method computations.

After an infusion of coffee, Ritu Arora, talking about the application of generative programming for scientific applications. She used a checkpointing example as a proof-of-concept, and created a domain specific language for describing checkpointing needs. Checkpointing is interesting, because it tends to be a cross cutting concern; generating code for this and automatically weaving it into the code is likely to be a significant benefit. Initial results are good: the automatically generated code had similar performance profiles to hand generated checkpointing code.
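
To see why checkpointing counts as a cross-cutting concern, here’s a hand-rolled sketch of my own (not Ritu Arora’s generative tool, and all the names are made up): the checkpoint/restart logic wraps the numerical step function without touching it, which is exactly the kind of code her DSL-plus-weaving approach aims to generate automatically.

```python
# Sketch of checkpointing as a cross-cutting concern (my illustration,
# not Arora's tool): the save/restore logic is kept entirely outside the
# numerical kernel.
import os
import pickle

def checkpointed(path, every=1000):
    """Wrap a step function so simulation state is saved every `every` steps."""
    def decorator(step):
        def run(state, n_steps):
            start = 0
            if os.path.exists(path):              # resume from a checkpoint
                with open(path, "rb") as f:
                    start, state = pickle.load(f)
            for i in range(start, n_steps):
                state = step(state)               # the actual science
                if (i + 1) % every == 0:          # the cross-cutting part
                    with open(path, "wb") as f:
                        pickle.dump((i + 1, state), f)
            return state
        return run
    return decorator

@checkpointed("relax.ckpt", every=1000)
def relax(state):
    # Stand-in for a real computation step.
    return [(x + 1.0) * 0.5 for x in state]

final_state = relax([0.0, 10.0, 20.0], n_steps=5000)
```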

Next: Daniel Hook on testing for code trustworthiness. He started with some nice definitions and diagrams that distinguish some of the key terminology e.g. faults (mistakes in the code) versus errors (outcomes that affect the results). Here’s a great story: he walked into a glass storefront window the other day, thinking it was a door. The fault was mistaking a window for a door, and the error was about three feet. Two key problems: the oracle problem (we often have only approximate or limited oracles for what answers we should get) and the tolerance problem (there’s no objective way to say that the results are close enough to the expected results so that we can say they are correct). Standard SE techniques often don’t apply. For example, the use of mutation testing to check the quality of a test set doesn’t work on scientific code because of the tolerance problem – the mutant might be closer to the expected result than the unmutated code. So, he’s exploring a variant and it’s looking promising. The project is called matmute.
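
Here’s a minimal sketch of the tolerance problem (my own illustration, not Daniel Hook’s matmute): a test for numerical code has to accept any answer within some tolerance of an approximate oracle, and that acceptance window is precisely what lets a mutant slip through.

```python
# Sketch of the oracle/tolerance problem in testing scientific code
# (my illustration, not the matmute project).
import math

def trapezoid(f, a, b, n):
    """Trapezoid-rule integration: only ever an approximate answer."""
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + i * h) for i in range(1, n)) + f(b) / 2)

def test_integral():
    expected = 2.0                            # analytic integral of sin on [0, pi]
    actual = trapezoid(math.sin, 0.0, math.pi, 1000)
    # Exact equality would fail, so the test accepts anything within a tolerance...
    assert math.isclose(actual, expected, rel_tol=1e-4)

test_integral()

# ...and that tolerance is why classic mutation testing struggles here: a
# mutant that perturbs the code slightly (say, summing one fewer interval)
# can still land inside the tolerance, so the test set never "kills" it,
# even though the test set isn't at fault.
```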

David Woollard, from JPL, talking about inserting architectural constraints into legacy (scientific) code. David has been doing some interesting work with assessing the applicability of workflow tools to computational science.

Parmit Chilana from U Washington. She’s working mainly with bioinformatics researchers, comparing the work practices of practitioners with researchers. The biologists understand the scientific relevance, but not the technical implementation; the computer scientists understand the tools and algorithms, but not the biological relevance. She’s clearly demonstrated the need for domain expertise during the design process, and explored several different ways to bring both domain expertise and usability expertise together (especially when the two types of expert are hard to get because they are in great demand).

After lunch, the last talk before we break out for discussion: Val Maxville, preparing scientists for scaleable software development. Val gave a great overview of the challenges for software development at iVEC. AuScope looks interesting – an integration of geosciences data across Australia. For each of the different projects, Val assessed how much they have taken practices from the SWEBOK – how much they have applied them, and how much they value them. And she finished with some thoughts on the challenges for software engineering education for this community, including balancing between generic and niche content, and balancing between ‘on demand’ and a more planned skills development process.

And because this is a real workshop, we spent the rest of the afternoon in breakout groups having fascinating discussions. This was the best part of the workshop, but of course required me to put away the blogging tools and get involved (so I don’t have any notes…!). I’ll have to keep everyone in suspense.

Friday, the last day of the main conference, kicked off with Pamela Zave’s keynote “Software Engineering for the Next Internet”. Unfortunately I missed the first few minutes of the talk. But I regret that, because this was an excellent keynote. Why do I say that? Because Pamela demonstrated a beautiful example of what I want to call “software systems thinking”. By analyzing them from a software engineering perspective, she demonstrated how some of the basic protocols of the internet (e.g. the Session Initiation Protocol, SIP), and the standardization process by which they are developed, are broken in interesting ways. The reason they are broken is that they ignore software engineering principles. I thought the analysis was compelling: both thorough in terms of the level of detail, and elegant in the simplicity of the analysis.

Here are some interesting tidbits:

  • A corner case is a possible behaviour that emerges from the interaction of unanticipated constraints. It is undesirable, and designers typically declare it to be rare and unimportant, without any evidence. Understanding and dealing with corner cases is important for assessing the robustness of a design.
  • The IETF standards process is an extreme (pathological?) case of bottom up thinking. It sets an artificial conflict between generality and simplicity, because any new needs are dealt with by adding more features and more documents to the standard. Generality is always achieved by making the design more complex. Better abstractions, and some more top down analysis can provide simple and general designs (and Pamela demonstrated a few)
  • How did the protocols get to be this broken? Most network functions are provided by cramming them into the IP layer. This is believed to be more efficient, and in the IETF design process, efficiency always takes precedence over separation of concerns.
  • We need a successor to the end-to-end principle. Each application should run on a stack of overlays that exactly meets its requirements. Overlays have to be composable. The bottom underlay runs on a virtual network which gets a predictable slice of the real network resources. Of course, there are still some tough technical challenges in designing the overlay hierarchy.

So, my reflections. Why did I like this talk so much? First it had an appealing balance of serious detail (with clear explanations) and new ideas that are based on an understanding of the big picture. Probably it helps that she’s talking about an analysis approach using techniques that I’m very familiar with (some basic software engineering design principles: modularity, separation of concerns, etc), and applies them to a problem that I’m really not familiar with at all (detailed protocol design). So that combination allows me to follow most of the talk (because I understand the way she approaches the problem), but tells me a lot of things that are new and interesting (because the domain is new to me).

She ended with a strong plug for domain-specific research. It’s more fun and more interesting! I agree wholeheartedly with that. Much of software engineering research is ultimately disappointing because in trying to be too generic it ends up being vague and wishy washy. And it misses good pithy examples.

So, having been very disappointed with Steve McConnell’s opening keynote yesterday, I’m pleased to report that the keynotes got steadily better over the week. Thursday’s keynote was by Carlo Ghezzi, entitled Reflections on 40+ years of software engineering research and beyond: An Insider’s View. He started with a little bit of history of the SE research community and the ICSE conference, but the main part of the talk was a trawl through the data from the conference over the years, motivated by questions such as “how international are we as a community?”, “how diverse?” (e.g. academia, industry…), and “how did the research areas included in ICSE evolve?”. For example, there has been a clear trend in the composition of the program committee, from being N. American dominated (80% at the first ICSE) to now approximately equal N. American and European, with some from Asia and elsewhere. However, there is a startling trend in the industry vs. academia mix. The attendees at the first conference were 80% industry and only 20% academics. This has steadily changed: the conference is now 90% academics. The number of accepted papers each year has remained fairly steady (the average is 44), but with a strong growth in submissions over the past 15 years from 150 to 400, which gives us a paper acceptance rate now well below 15%. This is clearly good for the academics – the low acceptance rate keeps the quality of the accepted papers high, and makes the conference the top choice as a publication venue. But a strong academic research program clearly does not attract practitioners to attend.

In Carlo’s analysis of research areas, I was struck by the graph of number of papers on programming languages, which looks like a pair of vampire teeth – a huge spike in this area in the early days of ICSE, then nothing for years, and again a huge spike in the last couple of years. A truly interesting and surprising result.

Towards the end of the talk, Carlo got onto the question of how we could identify our best products. He talked about the strengths and weaknesses of quantitative measures such as citation count (difficult as it’s a moving target, and you have to account for journal/conference versions), number of downloads from the ACM digital library over 12 months, etc. He drew a lot on a report by the Joint Committee on Quantitative Assessment of Research. He also mentioned Meyer’s viewpoint article in CACM April 2009, and of course, Parnas’s somewhat less nuanced “Stop the numbers game”. Why is the problem of quantitative assessment of research becoming so hot today? It’s being increasingly used to rank journals and conferences and individuals. Many stakeholders now need to evaluate research, and peer review is considered to be expensive and subjective, while numeric metrics are considered to be simple and objective. The Joint Committee report says that, to the contrary, numeric metrics are simple and misleading. From the report: much of modern bibliometrics is flawed; the meaning of a citation can be even more subjective than peer review; and citation counts are only valid if reinforced by other judgements.

Carlo’s final message was that we have to care about the impact of our research: understanding, measuring, and improving it. Because if we don’t, others will (governments, funding agencies, universities, etc). Okay, that’s a good argument. I’ve been skeptical of SIGSOFT’s Impact Project in the past, largely because I think the process by which research ideas filter into industrial practice is much more complex, and takes much longer, than everyone seems to expect. But I guess taking control of the assessment of impact is the obvious way to address this issue.

After the break, Jorge presented his paper on the Secret Life of Bugs. He did a great job presenting the work, to an absolutely packed room, and I had lots of people comment afterwards on how much they enjoyed the paper. I beamed with pride.

But for most of the day, I was busy trying to finish off my talk “Software Engineering for the Planet”, in time for the session at 2pm. Many thanks to Spencer, Jon, Carolyn and Alicia for helping me polish it prior to delivery. I’ll get the slides up on the web soon. I think the session went very well – the questions and discussions afterwards were very encouraging – most people seemed to immediately get the key message (that we should stop focussing our energies on personal green choices, and instead figure out how our professional skills and experience can be used to address the climate crisis). Aran posted a quick summary of the session, and some afterthoughts. Now we’ve got to do the community building, and keep the momentum going. [Aran said he doesn’t think I’ll get much research done in the next few months. He might be right, but I can just declare that this is now my research…]

Okay, the main conference started today, and we kick off with the keynote speaker – Steve McConnell talking about “The 10 most powerful ideas in software engineering”. Here are my thoughts: when magazines are short of ideas for an upcoming issue, they resort to the cheap journalist’s trick of inventing top ten lists. It makes for easy reading filler that never really engages the brain. Unfortunately, this approach also leads to dull talks. The best keynotes have a narrative thread. They tell a story. They build up ideas in interesting new ways. The top ten format kills this kind of narrative stone dead (except perhaps when used in parody). Okay, so I didn’t like the format, but what about the content? Steve walked through ten basic concepts that we’ve been teaching in our introductory software engineering courses for years, so I learned nothing new. Maybe this would be okay as a talk to junior programmers who missed out on software engineering courses in school. For ICSE keynotes, I expect a lot more – I’d have liked at least some sharper insights, or better marshalling of the evidence. I’m afraid I have to add this to my long list of poor ICSE keynotes. Which is okay, because ICSE keynotes always suck – even when the chosen speakers are normally brilliant thinkers and presenters. Maybe I’ll be proved wrong later this week… For what it’s worth, here’s his top ten list (which he said were in no particular order):

  1. Software Development work is performed by human beings. Human factors make a huge difference in the performance of a project.
  2. Incrementalism is essential. The benefits are feedback, feedback, and feedback! (on the software, on the development process, on the developer capability). And making small mistakes that prevent bigger mistakes later.
  3. I’ve no idea what number 3 was. Please excuse my inattention.
  4. Cost to fix a defect increases over time, because of the need to fix all the collateral and downstream consequences of the error.
  5. There’s an important kernel of truth in the waterfall model. Essentially, there are three intellectual phases: discovery, invention, construction. They are sequential, but also overlapping.
  6. Software estimates can be improved over time, by reducing their uncertainty as the project progresses.
  7. The most powerful form of reuse is full reuse – i.e. not just code and design, but all aspects of process.
  8. Risk management is important.
  9. Different kinds of software call for different kinds of software development (the toolbox approach). This was witty: he showed a series of pictures of different kinds of saw, then newsflash: software development is as difficult as sawing.
  10. The software engineering body of knowledge (SWEBOK)

Next up, the first New Ideas and Emerging Results session. This is a new track at this year’s ICSE, and the intent is to have a series of short talks, with a poster session at the end of the day. Although I’m surprised how hard it was to get a paper accepted: of 118 submissions, they selected only 21 for presentation (an 18% acceptance rate). The organisers also encouraged the presenters to use the Pecha Kucha format: 20 slides on an automated timer, with 20 seconds per slide. Just to make it more fun and more dynamic.

I’m disappointed to report that none of the speakers this morning took up this challenge, although Andrew Begel’s talk on social networking for programmers was very interesting (and similar to some of our ideas for summer projects this year). The fourth talk, by Abram Hindle, also didn’t use the Pecha Kucha format, but made up for it with a brilliant and beautiful set of slides that explain how to form interesting time series analysis visualizations of software projects by mining the change logs.

Buried in the middle of the session was an object lesson in the misuse of empirical methods. I won’t name the guilty parties, but let me describe the flaw in their study design. Two teams were assigned a problem to analyze; one team was given a systems architecture and the other team wasn’t. To measure the effect of being given this architecture on the requirements analysis, the authors asked experts to rate each of several hundred requirements generated by each of the teams, and then used a statistical test to see whether the requirements from one team were rated differently from the other’s. Unsurprisingly, they discovered a statistically significant difference. Unfortunately, the analysis is completely invalid, because they made a classic unit of analysis error. The unit of analysis for the experimental design is the team, because it was teams that were assigned the different treatments. But the statistical test was applied to individual requirements, and there was no randomization of these requirements – all the requirements from a given team have to be taken as a single unit. The analysis that was performed in this study merely shows that the requirements came from two different teams, which we knew already. It shows nothing at all about the experimental hypothesis. I guess the peer review process has to let a few klunkers through.
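
To make the flaw concrete, here’s a small simulation (entirely made-up data, nothing to do with the actual study): even when the treatment has no effect at all, a t-test over individual requirements will typically report a “significant” difference, because ratings within a team share a common team effect and aren’t independent samples.

```python
# Simulation of the unit-of-analysis error (synthetic data, not the study's).
# The "treatment" has no effect here, yet a per-requirement t-test will
# usually report a significant difference between the two teams.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Each team has its own baseline quality, unrelated to the treatment.
team_effect = rng.normal(0.0, 0.5, size=2)

# A few hundred expert ratings of requirements per team.
ratings_with_arch = 3.0 + team_effect[0] + rng.normal(0.0, 1.0, size=300)
ratings_without = 3.0 + team_effect[1] + rng.normal(0.0, 1.0, size=300)

t, p = stats.ttest_ind(ratings_with_arch, ratings_without)
print(f"t = {t:.2f}, p = {p:.3g}")   # typically p << 0.05, despite no treatment effect

# The correct unit of analysis is the team: one team per condition means
# n = 1 per group, which supports no statistical inference about the treatment.
```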

Well, we reach the end of the session and nobody did the Pecha Kucha thing. Never mind – my talk is first up in the next NIER session this afternoon, and I will take the challenge. Should be hilarious. On the plus side, I was impressed with the quality of all the talks – they all managed to pack in key ideas, make them interesting, and stick to the 6 minute time slot.

So, I made it to ICSE at last. I’m way behind on blogging this one: the students from our group have been here for several days, busy blogging their experiences. So far, the internet connection is way too weak for liveblogging, so I’ll have to make do with post-hoc summaries.

I spent the morning at the Socio-Technical Congruence (STC) workshop. The workshop is set up with discussants giving prepared responses to each full paper presentation, and I love the format. The discussants basically riff on ideas that the original paper made them think of, which ends up being more interesting than the original paper. For example, Peri Tarr clarified how to tell whether something counts as a design pattern. A design pattern is a (1) proven solution to a (2) commonly occurring problem in a (3) particular context. To assess whether an observed “pattern” is actually a design pattern, you need to probe whether all three of these things are in place. For example, the patterns that Marcelo had identified do express implemented solutions, but he has not yet identified the problems/concerns they solve, and the contexts in which the patterns are applicable.

Andy Begel’s discussion included a tour through learning theory (I’ve no idea why, but I enjoyed the ride!). On a single slide, he took us through the traditional “empty container” model of learning, Piaget’s constructivism, Vygotsky’s social learning, Papert’s constructionism, Van Maanen & Schein’s newcomer socialization, Hutchins’ distributed cognition, and Lave & Wenger’s legitimate peripheral participation. Whew. Luckily, I’m familiar with all of these except the Van Maanen & Schein stuff – I’m looking forward to reading that. Oh, and an interesting book recommendation: “Anything that’s worth knowing is really complex”, from Wolfram’s A New Kind of Science. Then Andy posed some interesting questions: how long can software live? How big can it get? How many people can work on it? And he proposed we should design for long-term social structures, rather than modular architecture.

We then spent some time discussing whether designing the software architecture is the same thing as designing the social structure. Audris suggested that while software architecture people tend not to talk about the social dimension, in fact they are secretly designing it. If the two get out of synch, people are very adaptable – they find a way of working around the mismatch. Peri pointed out that technology also adapts to people. They are different things, with feedback loops that affect each other. It’s an emergent, adaptive thing.

And someone mentioned Rob DeLine’s keynote on the weekend at CHASE, in which he pointed out that only about 20% of ICSE papers mention the human dimension, and argued that we should seek to flip the ratio. To make it 80%, we should insist that papers that ignore the people aspects have to prove that people are irrelevant to the problem being addressed. Nice!

After lots of catching up with ICSE regulars over lunch, I headed over to the last session of the Michael Jackson festschrift, to hear Michael’s talk. He kicked off with some quotes that he admitted he can’t take credit for: “description should precede invention”, and Tony Hoare’s “there are two ways to make a system: (1) make it so complicated that it has no obvious deficiencies, or (2) make it so simple that it obviously has no deficiencies”. And another which may or may not be original: “Understanding is a process, not a state”. And another interesting book recommendation: Personal Knowledge by Michael Polanyi.

So, here’s the core of MJ’s talk: every “contrivance” has an operational principle, which specifies how the characteristic parts fulfill their function. Further, knowledge of physics, chemistry, etc, is not sufficient to understand and recognise the operating principle. E.g. describing a clock – the description of the mechanism is not a scientific description. While the physical sciences have made great strides, our description of contrivances has not. The operational principle answers questions like “What is it?”, “What is it for?”, and “How do the parts interact to achieve the purpose?”. To supplement this, the mathematical and scientific knowledge describes the underlying laws, the context necessary for success (e.g. a pendulum clock only works in the appropriate gravitational field, and must be completely upright – it won’t work on the moon, on a ship, etc), the part properties necessary for success, possible improvements, specific failures and causes, and the feasibility of a proposed contrivance.

MJ then goes on to show how problem decomposition works:

(1) Problem decomposition –  by breaking out the problem frames: e.g. for an elevator: provide prioritized lift service, brake on danger, provide information display for users.

(2) Instrumental decomposition – building manager specifies priority rules, system uses priority rules to determine operation.

The sources of complexity are the intrinsic complexity of each subproblem, plus the interaction of subproblems. But he calls for the use of free decomposition (meaning free as in unconstrained). For initial description purposes, there are no constraints on how the subproblems will interact; the only driver is that we’re looking for simple operating principles.

Finally, he identified some composition concerns: interleaving (edit priority rules vs lift service); requirements elaboration (e.g. book loans vs member status); requirements conflict (inter-library loan vs member loan); switching (lift service vs emergency action); and domain sharing (e.g. phone display: camera vs gps vs email).

The discussion was fascinating, but I was too busy participating to take notes. Hope someone else did!

In our brainstorm session yesterday, someone (Faraz?) suggested I could kick off the ICSE session with a short video. The closest thing I can think of is this:

Wake Up, Freak Out – then Get a Grip

It’s not too long, it covers the recent science very well, and it is exactly the message I want to give – climate change is serious, urgent, demands massive systemic change, but is not something we should despair over. It also comes with a full transcript with detailed references into the primary scientific literature, which is well worth a browse.

Except that it scares the heck out of me every time I watch it. Could I really show this to an ICSE audience?

First a couple of local ones, in May:

Then, this one looks interesting: The World Climate Conference, in Geneva at the end of August. It looks like most of the program will be invited, but they will be accepting abstracts for a poster session. Given that the theme is to do with how climate information is generated and used, it sounds very appropriate.

Followed almost immediately by EnviroInfo2009, in Berlin, in September. I guess the field I want to name “Climate Informatics” would be a subfield of environmental informatics. Paper deadline is April 6.

Finally, there’s the biggy in Copenhagen in December, where, hopefully, the successor to the Kyoto agreement will be negotiated.

Here’s an updated description of the ICSE session I kicked off this blog with. Looks like we’re scheduled for the second afternoon of the conference (Thurs May 21, 2pm), straight after the keynote.

Update: Slides and notes from the session now available.

Software Engineering for the Planet

This session is a call to action. What can we, as software engineers, do to help tackle the challenge of climate change (besides reducing our personal carbon footprints)? The session will review recent results from climate science, showing how big the challenge is. We will then identify ways in which software engineering tools and techniques can help. The goal is to build a research agenda and a community of software engineering researchers willing to pursue it.

The ICSE organisers have worked hard this year to make the conference “greener” – to reduce our impact on the environment. This is partly in response to the growing worldwide awareness that we need to take more care of the natural environment. But it is also driven by a deeper and more urgent concern.

During this century, we will have to face up to a crisis that will make the current economic turmoil look like a walk in the park. Climate change is accelerating, confirming the more pessimistic scenarios identified by climate scientists [1-4]. Its effects will touch everything, including the flooding of low-lying lands and coastal cities, the disruption of fresh water supplies for much of the world, the loss of agricultural lands, more frequent and severe extreme weather events, mass extinctions, and the destruction of entire ecosystems [5].

And there are no easy solutions. We need concerted systematic change in how we live, to reduce emissions so as to stabilize the concentration of greenhouse gases that drive climate change. Not to give up the conveniences of modern life, but to re-engineer them so that we no longer depend on fossil fuels to power our lives. The challenge is massive and urgent – a planetary emergency. The type of emergency that requires all hands on deck. Scientists, engineers, policymakers, professionals, no matter what their discipline, need to ask how their skills and experience can contribute.

We, as software engineering researchers and software practitioners, have many important roles to play. Our information systems help provide the data we need to support intelligent decision making, from individuals trying to reduce their energy consumption, to policymakers trying to design effective governmental policies. Our control systems allow us to make smarter use of the available power, and provide the adaptability and reliability to power our technological infrastructure in the face of a more diverse set of renewable energy sources.

The ICSE community in particular has many other contributions to make. We have developed practices and tools to analyze, build and evolve some of the most complex socio-technical systems ever created, and to coordinate the efforts of large teams of engineers. We have developed abstractions that help us to understand complex systems, to describe their structure and behaviour, and to understand the effects of change on those systems. These tools and practices are likely to be useful in our struggle to address the climate crisis, often in strange and surprising ways. For example, can we apply the principles of information hiding and modularity to our attempts to develop coordinated solutions to climate change? What is the appropriate architectural pattern for an integrated set of climate policies? How can we model the problem requirements so that the stakeholders can understand them? How do we debug the models on which policy decisions are based?

This conference session is intended to kick start a discussion about the contributions that software engineering research can make to tackling the climate crisis. Our aim is to build a community of concerned professionals, and find new ways to apply our skills and experience to the problem. We will attempt to map out a set of ideas for action, and identify potential roadblocks. We will start to build a broad research agenda, to capture the potential contributions of software engineering research, and discuss strategies for researchers to refocus their research towards this agenda. The session will begin with a short summary of the latest lessons from climate science, and a concrete set of examples of existing software engineering research efforts applied to climate change. We will include an open discussion session, to map out an agenda for action. We invite everyone to come to the session, and take up this challenge.

References:

[1] http://www.csmonitor.com/2006/0324/p01s03-sten.html

[2] http://www.newscientist.com/article/dn11083

[3] http://news.bbc.co.uk/2/hi/uk_news/7053903.stm

[4] http://www.pnas.org/content/104/24/10288.abstract

[5] http://www.ipcc.ch/ipccreports/ar4-wg2.htm

Next month, I’ll be attending the European Geosciences Union’s General Assembly, in Austria. It will be my first trip to a major geosciences conference, and I’m looking forward to rubbing shoulders with thousands of geoscientists.

My colleague, Tim, will be presenting a poster in the Climate Prediction: Models, Diagnostics, and Uncertainty Analysis session on the Thursday, and I’ll be presenting a talk on the last day in the session on Earth System Modeling: Strategies and Software. My talk is entitled Are Earth System model software engineering practices fit for purpose? A case study.

While I’m there, I’ll also be taking in the Ensembles workshop that Tim is organising, and attending some parts of the Seamless Assessment session, to catch up with more colleagues from the Hadley Centre. Sometime soon I’ll write a blog post on what ensembles and seamless assessment are all about (for now, it will just have to sound mysterious…)

The rest of the time, I plan to talk to as many climate modellers as I can from other centres, as part of my quest for comparison studies for the one we did at the Hadley Centre.

I’ve been pondering starting a blog for way too long. Time for action. To explain what I think I’ll be blogging about, I put together the following blurb, for a conference session at the International Conference on Software Engineering. I’ll probably end up revising it for the conference, but it will do for a kickoff to the blog:

This year, the ICSE organisers have worked hard to make the conference “greener” – to reduce our impact on the environment. Partly this is in response to the growing worldwide awareness that we need to take more care of the natural environment. But partly it is driven by a deeper and more urgent concern. During this century, we will have to face up to a crisis that will make the current economic turmoil look like a walk in the park. Climate change is accelerating, outpacing the most pessimistic predictions of climate scientists. Its effects will touch everything, including the flooding of low-lying lands and coastal cities, the disruption of fresh water supplies for most of the world, the loss of agricultural lands, more frequent and severe extreme weather events, mass extinctions, and the destruction of entire ecosystems. And there are no easy solutions. We need concerted systematic change in how we live, to stabilize the concentration of greenhouse gases that drive climate change. Not to give up the conveniences of modern life, but to re-engineer them so that we no longer depend on fossil fuels to power our lives. The challenge is massive and urgent – a planetary emergency. The type of emergency that requires all hands on deck. Scientists, engineers, policymakers, professionals, no matter what their discipline, need to ask how their skills and experience can contribute.

We, as software engineering researchers and software practitioners, have many important roles to play. Software is part of the problem, as every new killer application drives up our demand for more energy. But it is also a major part of the solution. Our information systems help provide the data we need to support intelligent decision making, from individuals trying to reduce their energy consumption, to policymakers trying to design effective governmental policies. Our control systems allow us to make smarter use of the available power, and provide the adaptability and reliability to power our technological infrastructure in the face of a more diverse set of renewable energy sources. Less obviously, the software engineering community has many other contributions to make. We have developed practices and tools to analyze, build and evolve some of the most complex socio-technical systems ever created, and to coordinate the efforts of large teams of engineers. We have developed abstractions that help us to understand complex systems, to describe their structure and behaviour, and to understand the effects of change on those systems. These tools and practices are likely to be useful in our struggle to address the climate crisis, often in strange and surprising ways. For example, can we apply the principles of information hiding and modularity to our attempts to develop coordinated solutions to climate change? What is the appropriate architectural pattern for an integrated set of climate policies? How can we model the problem requirements so that the stakeholders can understand them? How do we debug strategies for emissions reduction when they don’t work out as intended?

This conference session is intended to kick start a discussion about the contributions that software engineering can make to tackling the climate crisis. Our aim is to build a community of concerned professionals, and find new ways to apply our skills and experience to the problem. We will attempt to map out a set of ideas for action, and identify potential roadblocks. We will start to build a broad research agenda, to capture the potential contributions of software engineering research. The session will begin with a short summary of the latest lessons from climate science, and a concrete set of examples of existing software engineering research efforts applied to climate change. We will include an open discussion, and structured brainstorming sessions to map out an agenda for action. We invite everyone to come to the session, and take up this challenge.

Okay, so how does that sound as a call to arms?