Summer projects: I posted yesterday on social network tools for computational scientists. Greg has posted a whole list of additional suggestions.
Here, I will elaborate on another of these ideas: the electronic lab notebook. For computational scientists, wiki pages are an obvious substitute for traditional lab notebooks, because each description of an experiment can then be linked directly with the corresponding datasets, configuration files, visualizations of results, scientific papers, related experiments, etc. (In the most radical version, Open Notebook Science, the lab notebook is completely open for anyone to see. But the toolset would be the same whether it was open to anyone, or just shared with select colleagues.)
In my study of the software practices at the UK Met Office last summer, I noticed that some of the scientists carefully document each experiment via a new wiki page, but the process is laborious in a standard wiki, involving a lot of cut-and-paste to create a suitable page structure. For this reason, many scientists don’t keep good records of their experiments. An obvious improvement would be to generate a basic wiki page automatically each time a model run is configured, and populate it with information about the run, and links to the relevant data files. The scientists could then add further commentary via a standard wiki editor.
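To make the auto-generated page idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `render_run_page`, the MediaWiki-style markup, the run identifier, the configuration keys and the file path are all made up, and a real version would read these from whatever configuration tool the lab actually uses.

```python
from datetime import datetime, timezone


def render_run_page(run_id, config, data_files, created=None):
    """Return wiki markup for a stub notebook page describing one model run.

    Hypothetical helper: the page layout and MediaWiki-style syntax are
    assumptions, not the format used by any particular lab.
    """
    created = created or datetime.now(timezone.utc)
    lines = [
        f"= Experiment {run_id} =",
        f"Configured: {created:%Y-%m-%d %H:%M} UTC",
        "",
        "== Configuration ==",
    ]
    # One bullet per configuration setting, sorted so pages diff cleanly.
    for key in sorted(config):
        lines.append(f"* {key}: {config[key]}")
    lines += ["", "== Data files =="]
    # Link each data file so the page points straight at the run's outputs.
    for path in data_files:
        lines.append(f"* [[{path}]]")
    # Empty section for the scientist to fill in with a normal wiki editor.
    lines += ["", "== Commentary ==", ""]
    return "\n".join(lines)


# Example with made-up values.
page = render_run_page(
    "run-0001",
    {"resolution": "N96", "years": 10},
    ["output/run-0001/temperature.nc"],
    created=datetime(2009, 6, 1, 12, 0, tzinfo=timezone.utc),
)
print(page)
```

The point of the sketch is only that the tedious cut-and-paste part (page structure, settings, links to data) can be produced by the tool that configures the run, leaving the commentary section as the only thing the scientist has to write by hand.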
Of course, an even better solution is to capture all information about a particular run of the model (including subsequent commentary on the results) as meta-data in the configuration file, so that no wiki pages are needed: lab notebook pages are just user-friendly views of the configuration file. I think that’s probably a longer term project, and links in with the observation that existing climate model configuration tools are hard to use anyway and need to be re-invented. Let’s leave that one aside for the moment…
A related problem is better support for navigating and linking existing lab book pages. For example, in the process of writing up a scientific paper, a scientist might need to search for the descriptions of a number of individual experiments, select some of the data, create new visualizations for use in the paper, and so on. Recording this trail would improve reproducibility, by capturing the necessary links to source data in case the visualizations used in the paper need to be altered or recreated. Some of this requires a detailed analysis of the specific workflows used in a particular lab (which reminds me I need to write up what I know of the Met Office’s workflows), but I think some of it can be achieved by simple generic tools (e.g. browser plugins) that help capture the trail as it happens, and perhaps edit and annotate it afterwards.
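As a rough illustration of what a recorded trail might look like, here is a small Python sketch. The `Trail` class, its method names and the file names are all hypothetical; a browser plugin or workflow tool would presumably capture the steps automatically rather than through explicit calls like these.

```python
class Trail:
    """An append-only record of steps, each linking an output to its inputs.

    Hypothetical structure for illustration only.
    """

    def __init__(self):
        self.steps = []

    def record(self, action, output, inputs=(), note=""):
        """Log one step: what was done, what it produced, and from what."""
        self.steps.append(
            {"action": action, "output": output, "inputs": list(inputs), "note": note}
        )

    def sources_for(self, item):
        """Trace an item back to the items that no recorded step derived."""
        producers = {step["output"]: step for step in self.steps}
        sources, pending, seen = [], [item], set()
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            step = producers.get(current)
            if step is None or not step["inputs"]:
                # Nothing upstream was recorded, so treat this as source data.
                sources.append(current)
            else:
                pending.extend(step["inputs"])
        return sorted(sources)


# Example with made-up file names.
trail = Trail()
trail.record("select_data", "subset.csv", inputs=["output/run-0001/temperature.nc"])
trail.record("create_figure", "fig2.png", inputs=["subset.csv"], note="for the paper")
print(trail.sources_for("fig2.png"))
```

With a record like this, recreating a figure for a paper means asking the trail where the figure came from, rather than relying on memory or file naming conventions.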
I’m sure some of these tools must exist already, but I don’t know of them. Feel free to send me pointers…
Related to this, it’s interesting how many scientists use Excel for almost everything: one of the medical imaging groups at UHN, for example, stores every project in a (very large) spreadsheet containing notes, snippets of code, tables of experimental results, graphs from those results, etc. How much of this could/should be moved online using something like Google Docs (to make sharing easier)?
For an electronic notebook to be practical it has to be easy to add information and find what you are looking for quickly. We have found that templates don’t work very well because our experiments vary so much in terms of protocols.
For an Open Notebook it also becomes important for others to find information. In that case the table of contents is not as useful. It is especially important for the notebook to be indexed quickly in major search engines like Google. In addition, we link to field-specific portals. In the case of our solubility measurements, the chemical info boxes on Wikipedia are an important entry point. Other databases, such as ChemSpider, are logical entry points too.
In addition, to make it easy to browse, we have set up web query interfaces (this one by Rajarshi Guha) that link to the relevant lab notebook pages.
I think that there will be tremendous variability between fields and projects that use electronic notebooks, and Open Notebooks in particular. It is hard to predict what will work best until systems are put into place and used.
I’ve always thought that the endgame for scientific blogging is a form of notebook … when I started engineering my own blog software that’s where I thought I was going with it … oh how naive 🙁