I’m attending a workshop this week at which some of the initial results from the Fifth Coupled Model Intercomparison Project (CMIP5) will be presented. CMIP5 will form a key part of the next IPCC assessment report – it’s a coordinated set of experiments on the global climate models built by labs around the world. The experiments include hindcasts to compare model skill on preindustrial and 20th-century climate, projections into the future for 100 and 300 years, shorter-term decadal projections, paleoclimate studies, plus lots of other experiments that probe specific processes in the models. (For more explanation, see the post I wrote on the design of the experiments for CMIP5 back in September.)
I’ve been looking at some of the data for the past CMIP exercises. CMIP1 originally consisted of one experiment – a control run with fixed forcings. The idea was to compare how each of the models simulates a stable climate. CMIP2 included two experiments: a control run like the one in CMIP1, and a climate change scenario in which CO2 levels were increased by 1% per year. CMIP3 then built on these projects with a much broader set of experiments, and formed a key input to the IPCC Fourth Assessment Report.
There was no CMIP4: the numbering was resynchronised to match the IPCC report numbers, so CMIP5 will feed into the Fifth Assessment Report. (There was also a project called the Coupled Carbon Cycle Climate Model Intercomparison Project, nicknamed C4MIP, so it’s probably just as well!)
So here’s what I have found so far on the vital statistics of each project. Feel free to correct my numbers and help me to fill in the gaps!
Statistic  CMIP (1996 onwards)  CMIP2 (1997 onwards)  CMIP3 (2005-2006)  CMIP5 (2010-2011)
Number of Experiments  1  2  12  110 
Centres Participating  16  18  15  24 
# of Distinct Models  19  24  21  45 
# of Runs (Models X Expts)  19  48  211  841 
Total Dataset Size  1 gigabyte  500 gigabytes  36 terabytes  3.3 petabytes 
Total Downloads from archive  ??  ??  1.2 petabytes  
Number of Papers Published  47  595  
Users  ??  ??  6700 
[Update:] I’ve added a row for the number of runs, i.e. the sum of the number of experiments run on each model. In CMIP3 and CMIP5, centres were able to pick a subset of the experiments to run, so you can’t just multiply models by experiments to get the number of runs. I also ought to calculate the total number of simulated years this represents; if a centre did all the CMIP5 experiments, I figure it would amount to at least 12,000 simulated years.
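To make the counting concrete, here is a rough Python sketch of the difference between "sum of experiments per model" and "models times experiments". The per-project figures are copied from the table above; the three models and their experiment names in `example` are made up purely for illustration and don't correspond to any real CMIP submission.

```python
# Figures copied from the table above:
# (number of experiments, distinct models, reported runs).
projects = {
    "CMIP1": (1, 19, 19),
    "CMIP2": (2, 24, 48),
    "CMIP3": (12, 21, 211),
    "CMIP5": (110, 45, 841),
}


def count_runs(experiments_by_model):
    """Sum the number of experiments each model actually ran."""
    return sum(len(expts) for expts in experiments_by_model.values())


# Hypothetical example: three invented models, each running a subset
# of four invented experiments.
example = {
    "model_a": {"control", "1pct_co2", "historical", "decadal"},
    "model_b": {"control", "historical"},
    "model_c": {"control"},
}
example_runs = count_runs(example)  # 4 + 2 + 1 runs

# Upper bound: what the run count would be if every model ran every experiment.
full_matrix = {name: e * m for name, (e, m, _) in projects.items()}

# Fraction of that full matrix covered by the reported runs.
coverage = {
    name: runs / full_matrix[name] for name, (_, _, runs) in projects.items()
}

for name, (_, _, runs) in projects.items():
    print(f"{name}: {runs} runs of {full_matrix[name]} possible "
          f"({coverage[name]:.0%})")
```

With the numbers in the table, CMIP1 and CMIP2 come out as full matrices (19 of 19 and 48 of 48), while CMIP3 covers 211 of 252 possible runs (about 84%) and CMIP5 covers 841 of 4,950 (about 17%).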
Oh, one more datapoint from this week: we came up with an estimate that by 2020, each individual experiment will generate an exabyte of data. I’ll explain how we got this number once we’ve checked the calculations more thoroughly.