Congratulations to Jorge, who passed the first part of his PhD thesis defense yesterday with flying colours. Jorge’s thesis is based on a whole series of qualitative case studies of different software development teams (links go to ones he’s already published):
- 7 successful small companies (under 50 employees) in the Toronto region;
- 9 scientific software development groups, in an academic environment;
- 2 studies of large companies (IBM and Microsoft);
- 1 detailed comparative study of a company using Extreme Programming (XP) versus a similarly sized company that uses a more traditional development process (both building similar types of software for similar customers).
We don’t have anywhere near enough detailed case studies in software engineering – most claims for the effectiveness of various approaches to software development are based on little more than marketing claims and anecdotal evidence. There has been a push in the last decade or so for laboratory experiments, which are usually conducted along the lines of experiments in psychology: recruit a set of subjects, assign them a programming task, and measure the difference in variables like productivity or software quality when half of them are given some new tool or technique. While these experiments are sometimes useful for insights into how individual programmers work on small tasks, they really don’t tell us much about software development in the wild, where, as Parnas puts it, the interesting challenges are in multi-person development of multi-version software over long time scales. Jorge cites a particular example in his thesis of a controlled study of pair programming, which purports to show that pair programming lowers productivity. Except that it shows no such thing – any claimed benefits of pair programming are unlikely to emerge with subjects who are put together for a single day, but who otherwise have no connection with one another, and no shared context (like, for example, a project they are both committed to).
Each of Jorge’s case studies is interesting, but to me, the theory he uses them to develop is even more interesting. He starts by identifying three different traditions in the study of software development:
- The process view, in which software construction is treated like a production line, and the details of the individuals and teams who do the construction are abstracted away, allowing researchers to talk about processes and process models, which, it is assumed, can be applied in any organizational context to achieve a predictable result. This view is predominant in the SE literature. The problem, of course, is that the experience and skills of individuals and teams do matter, and the focus on processes is a poor way to understand how software development works.
- The information flow view, in which much of software development is seen as a problem in sharing information across software teams. This view has become popular recently, as it enables the study of electronic repositories of team communications as evidence of interaction patterns across the team, and leads to a set of theories about how well patterns of communication acts match the technical dependencies in the software. The view is appealing because it connects well with what we know about interdependencies within the software, where clean interfaces and information hiding are important. Jorge argues that the problem with this view is that it fails to distinguish between successful and unsuccessful acts of communication. It assumes that communication is all about transmitting and receiving information, and it ignores problems in reconstructing the meaning of a message, which is particularly hard when the recipient is in a remote location, or is reading it months or years later.
- The third view is that software development is largely about the development of a shared understanding within teams. This view is attractive because it takes seriously the intensive cognitive effort of software construction, and emphasizes the role of coordination, and the way that different forms of communication can impact coordination. It should be no surprise that Jorge and I both prefer this view.
Then comes the most interesting part. Jorge points out that software teams need to develop a shared understanding of goals, plans, status and context, and that four factors will strongly impact their success in this: proximity (how close the team members are to each other – being in the same room is much more useful than being in different cities); synchrony (talking to each other in (near) real time is much more useful than writing documents to be read at some later time); symmetry (coordination and information sharing are done best by the people whom they most concern, rather than imposed by, say, managers); and maturity (it really helps if a team has an established set of working relationships and a shared culture).
This theory leads to a reconceptualization of many aspects of software development, such as the role of tools, the layout of physical space, the value of documentation, and the impact of growth on software teams. But you’ll have to read the thesis to get the scoop on all these…
I’d love a copy of the thesis – is it available online, or will it be, at some point?
[soon – I’ll update with a link when it’s available – Steve]
Congrats to both Jorge and you!