Trees for the Forest

November 19, 2009

I’m Back

Filed under: Uncategorized — Chad @ 2:58 am

The title says it all. It’s good to be back with the old computer (since Tuesday afternoon). Time to get back to work.

November 5, 2009

Problems and more problems

Filed under: Uncategorized — Chad @ 11:01 pm

I was sitting at my computer, minding my own business watching The Office, when I suddenly heard a relatively loud pop. I thought my external drive had tipped over. It hadn’t. I was testing a Fortran program and had just closed a Notepad document. The next thing I knew, the computer shut down. I tried turning it on multiple times, but it wouldn’t make it more than 15 seconds or so into booting up. I came back to it about 10 minutes later. This time it got far enough to ask whether it should restart in safe mode. I started Windows XP normally. Before it got to the login screen, the computer turned off again. I called someone at BestBuy and they speculated it could be a failed heatsink or power source. I think the fact that the boot process got farther along after I left it alone for 10 minutes may support the idea that the heatsink failed somehow (because the CPU had time to cool off).

If anyone has any ideas what may be causing this from the little information I have, your help would be greatly appreciated. Until I get the old system up and running, there will be no more climate blogging for me.

P.S. I thought my computer was slow and outdated. I’m now using a much older computer with half the memory (256 MB) and three-fourths the computing power (750 MHz).

UPDATE – That thing that made the pop sound was one of the brackets on the heat sink breaking off of the base. Since the heat sink lost contact with the CPU, the computer shut down to prevent the CPU from melting. Now all I need to do is get a new base for the heat sink.

UPDATE – I ordered a new bracket to hold the heat sink in place, though it’s unlikely to arrive any time soon.

November 1, 2009

Comparing Spatial Averages

Filed under: Climate models, Data comparisons, Surface temperature record — Chad @ 10:39 pm

As I’ve shown, the different surface data sets are not necessarily comparable because they are anomalies relative to different climatologies and have different spatial coverage. Here, I’m going to take the AR4 model output and process it in a way that makes for an apples-to-apples comparison, to see what difference spatial coverage makes. I took data from January 1979 – December 2008, calculated the climatology for this period and subtracted it to form anomalies. I then interpolated the data onto the same grid maps used for GISS, HadCRUT and NCDC. GISS anomalies are available in 250 km and 1200 km interpolated versions, which gives four observational coverages in all. The interpolated model data was then masked to match the coverage of each of the four surface temperature data sets. Global averages were calculated for all four plus an average for 100% coverage. Here’s BCCR BCM 2.0 for starters.

BCCR_BCM_2.0_run_1_coverage_comparison

The red line is the area-weighted average for the particular data set coverage. The black line is for 100% coverage. Not much of a difference. The relevant statistics for the model data relative to each observational data set are given below. The trends are given in °C/decade.

MODEL               100%     GISS.250  GISS.1200  HADCRUT  NCDC
BCCR BCM 2.0        0.141    0.141     0.142      0.140    0.140
CCCMA CGCM 3.1 T47  0.246    0.234     0.243      0.233    0.237
CCCMA CGCM 3.1 T63  0.299    0.275     0.297      0.268    0.271
CSIRO MK 3.0        0.106    0.108     0.106      0.109    0.107
CSIRO MK 3.5        0.204    0.192     0.203      0.191    0.200
GFDL CM 2.0         0.279    0.260     0.276      0.265    0.272
GFDL CM 2.1         0.298    0.279     0.295      0.281    0.287
GISS AOM            0.129    0.124     0.129      0.124    0.127
GISS EH             0.164    0.160     0.163      0.161    0.165
GISS ER             0.183    0.179     0.182      0.181    0.185
IAP FGOALS 1.0g     0.150    0.122     0.135      0.117    0.120
INGV ECHAM 4        0.175    0.164     0.173      0.166    0.168
INM CM 3.0          0.218    0.193     0.219      0.190    0.196
IPSL CM 4           0.296    0.283     0.294      0.280    0.284
MIROC 3.2 HIRES     0.330    0.306     0.329      0.304    0.310
MIROC 3.2 MEDRES    0.181    0.182     0.182      0.184    0.186
MRI CGCM 2.3.2a     0.137    0.132     0.137      0.131    0.132
NCAR CCSM 3         0.270    0.253     0.269      0.249    0.255
NCAR PCM 1          0.181    0.170     0.179      0.164    0.170
UKMO HADGEM 1       0.271    0.255     0.270      0.247    0.252

MEAN                0.213    0.201     0.211      0.199    0.203
STD DEV             0.068    0.062     0.068      0.062    0.063
PERCENT (p < 0.05)  100.000  97.917    97.917     95.833   97.917
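
The masked, area-weighted averages behind each column can be sketched as follows. This is a minimal illustration, assuming a regular latitude-longitude grid in which every box is weighted by the cosine of its center latitude and missing boxes are marked with None. The function name, the mask layout and the toy grid are hypothetical; this is not the code used to produce the numbers above.

```python
import math

def area_weighted_mean(field, lats, mask=None):
    """Cosine-latitude weighted mean of a 2-D field (rows = latitudes).

    field: list of rows, one per latitude; None marks a missing box.
    lats:  latitude in degrees of each row's center.
    mask:  optional same-shape grid of booleans; False drops the box,
           mimicking the coverage of an observational data set.
    """
    total = 0.0
    weight_sum = 0.0
    for i, row in enumerate(field):
        w = math.cos(math.radians(lats[i]))
        for j, value in enumerate(row):
            if value is None:
                continue
            if mask is not None and not mask[i][j]:
                continue
            total += w * value
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else None

# Toy 3 x 2 grid: the northern row is warmest.
lats = [60.0, 0.0, -60.0]
field = [[2.0, 2.0], [0.0, 0.0], [1.0, 1.0]]
full = area_weighted_mean(field, lats)

# Drop the northern row, as a data set with no high-latitude coverage would.
mask = [[False, False], [True, True], [True, True]]
partial = area_weighted_mean(field, lats, mask)
```

In this toy case the masked average comes out lower than the full-coverage average because the warmest boxes are the ones removed, which is the same kind of effect the coverage columns are meant to expose.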

The first thing some may notice is that for the 1979 – 2008 period, essentially every model run showed a statistically significant trend. This is because a) looking at a longer period affords more degrees of freedom, which translates into smaller standard errors, and b) the noise is greatly reduced by looking over such a long time period. Notice that on average, the model runs using the GISS 1200 km coverage are the closest to the 100% coverage. This isn’t just due to the fact that its spatial coverage is virtually 100%. GISS 1200 km includes the Arctic region where the warming trend is amplified. HadCRUT, the data set with the worst spatial coverage, is on average lower than the 100% and GISS 1200 km trends, probably primarily due to its missing Arctic data. Trends corresponding to the GISS 250 km and NCDC coverage are roughly comparable. Here are the results for the Jan 2001 – Dec 2008 period.

 

MODEL               100%     GISS.250  GISS.1200  HADCRUT  NCDC
BCCR BCM 2.0        0.282    0.317     0.291      0.314    0.336
CCCMA CGCM 3.1 T47  0.316    0.263     0.308      0.247    0.245
CCCMA CGCM 3.1 T63  0.355    0.309     0.347      0.278    0.265
CSIRO MK 3.0        0.312    0.277     0.305      0.246    0.282
CSIRO MK 3.5        0.345    0.319     0.338      0.321    0.347
GFDL CM 2.0         0.138    0.108     0.124      0.071    0.123
GFDL CM 2.1         0.590    0.593     0.581      0.631    0.651
GISS AOM            0.137    0.137     0.143      0.150    0.170
GISS EH             0.094    0.094     0.090      0.078    0.102
GISS ER             0.170    0.156     0.169      0.156    0.171
IAP FGOALS 1.0g     0.224    0.179     0.201      0.173    0.184
INGV ECHAM 4        0.066    0.062     0.068      0.061    0.053
INM CM 3.0          0.358    0.292     0.366      0.257    0.293
IPSL CM 4           0.365    0.403     0.365      0.435    0.450
MIROC 3.2 HIRES     0.582    0.595     0.574      0.579    0.622
MIROC 3.2 MEDRES    0.253    0.237     0.248      0.224    0.233
MRI CGCM 2.3.2a     0.164    0.162     0.164      0.159    0.162
NCAR CCSM 3         0.298    0.279     0.301      0.281    0.294
NCAR PCM 1          0.237    0.194     0.238      0.186    0.186
UKMO HADGEM 1       0.514    0.489     0.511      0.470    0.484

MEAN                0.290    0.273     0.287      0.266    0.283
STD DEV             0.148    0.152     0.147      0.159    0.162
PERCENT (p < 0.05)  56.250   60.417    60.417     62.500   58.333

The shorter Jan 2001 – Dec 2008 period shows similar trends, but only a little more than half of the runs (56% to 63%, depending on coverage) yield statistically significant warming trends. Looking at such a short period results in fewer degrees of freedom, which are reduced even further by autocorrelation effects.
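
To make the degrees-of-freedom point concrete, here is a minimal sketch of a least-squares trend whose standard error is adjusted for lag-1 autocorrelation in the residuals. It assumes the common effective-sample-size adjustment n_eff = n(1 - r1)/(1 + r1); whether this matches the exact calculation behind the tables is an assumption, and the function name and test series are hypothetical.

```python
import math

def trend_with_ar1_adjustment(y):
    """OLS trend per time step with a lag-1 adjusted standard error.

    Returns (slope, adjusted_se, r1, n_eff).
    """
    n = len(y)
    t = list(range(n))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sxx
    intercept = y_mean - slope * t_mean
    resid = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    sse = sum(e * e for e in resid)

    # Lag-1 autocorrelation of the residuals (OLS residuals have zero mean).
    r1 = sum(resid[i] * resid[i + 1] for i in range(n - 1)) / sse

    # The effective sample size shrinks as the residuals get more persistent.
    n_eff = n * (1.0 - r1) / (1.0 + r1)

    # Residual variance uses n_eff - 2 degrees of freedom instead of n - 2.
    se = math.sqrt(sse / (n_eff - 2.0) / sxx)
    return slope, se, r1, n_eff

# Toy series: 96 "months" (8 years) of a small trend plus a persistent wiggle.
y = [0.01 * i + 0.1 * math.sin(i) for i in range(96)]
slope, se, r1, n_eff = trend_with_ar1_adjustment(y)
```

With positively autocorrelated residuals, n_eff comes out well below the 96 raw data points, so the standard error widens and fewer short-period trends clear the p < 0.05 bar.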

You may notice the model list to be a little short. This is because when I reprocessed the AR4 model data, I placed a few constraints on what gets processed. I wanted to join the skin surface, 2-m, and LT, MT, LS brightness temperature data into one file so I can very easily access multiple variables. Not all the data was processed. Some of you may remember the CNRM CM 3 data that SteveM posted on because it showed a weird step change at about the first and second third of the 20th century. Well, the model uses one set of pressure levels during 20c3m and another set during a1b. This means two different weighting functions would have to be used, which would undoubtedly cause a step change. I was only going to use models that represented the atmospheric temperature at the IPCC-mandated pressure levels. Any models not playing by the rules were not processed. HadCM 3 is one of them. MIUB ECHO G didn’t even have any atmospheric temperature data!

October 24, 2009

Comparing Surface Temperature Records

Filed under: Surface temperature record — Chad @ 12:19 am
Gridded data sources used in this post:
HadCRUT: http://www.cru.uea.ac.uk/cru/data/temperature/
NCDC: http://www.esrl.noaa.gov/psd/data/gridded/data.noaamergedtemp.html
GISS: http://www.esrl.noaa.gov/psd/data/gridded/data.gisstemp.html

Trends in the GISS minus HadCRUT difference series discussed below: The long term trend from 1950 to the present using the equal area data is -0.01723986 ± 0.002362541 °C/decade, lag-1 value = 0.4227776. Using the unequal area data, I get a trend of -0.008044086 ± 0.004118713 °C/decade, lag-1 value = 0.2871348. Looking at data over the satellite period, the trend is -0.02270083 ± 0.006253924 °C/decade with a lag-1 value of 0.4114007. Using the unequal area data, the trend is -0.0009793928 ± 0.01237656 °C/decade with a lag-1 value of 0.3332930, resulting in an insignificant trend.

The main thrust of my research now is centered on studying urban biases in the surface temperature records. We’re all aware that many temperature stations suffer from poor siting issues. The question in my mind is what effect do urban biases have on regional to global scale anomaly trends. How can we best uncover them? Over what timescales are they detectable? I have quite a few ideas which I’ll elaborate on in the next post. This post however is just a dump of nice looking graphs for some calculations I’m working on related to this issue.

Not all surface temperature records are created equal and they are not necessarily representing the same thing. Comparing them may involve pitfalls that the casual or not-so-casual user may fall into if not careful. All three well-known data sets are anomalies calculated relative to different climatologies using different methodologies. Simply re-normalizing all three to the same base period doesn’t put them on the same footing. The only way one can make apples-to-apples comparisons is to get their hands dirty with gridded data. Here are the steps I took in this analysis:

1. Obtain GISS, HadCRUT and NCDC gridded data. The GISS and NCDC data use the same grid map. HadCRUT has the lowest resolution (5° x 5°). The 1,200 km version was used for GISS.

2. Find the period of time where all three datasets overlap in time. The NCDC data wasn’t completely up-to-date as it stopped in April 2009. This was the end point for the calculations. The starting point was January 1880.

3. Adjust GISS and NCDC so they are expressed as anomalies relative to the mean climatology of 1961-1990. When computing the climatology of each grid point, I allowed missing values to force the mean calculation to become a missing value. This necessarily results in permanent loss of coverage, but it is very minor for GISS and NCDC because as you’ll see below, over this new base period, the data sets have almost total coverage. My motivation for allowing these grid points to become missing is that I did not want to deal with having grid points with a potentially biased climatology. Each grid point should ideally have 30 values for January, February, etc. But if months are missing, the monthly mean value excluding the missing values will certainly make some months too warm or too cold. One less complication to worry about.

4. Subtract the calculated climatologies from GISS and NCDC.

5. Interpolate the adjusted GISS data onto the grid map used by HadCRUT.

6. Subtract all three data sets from each other (delta = giss – hadcrut – ncdc). Because a missing value in any one data set makes the result missing, this flags every grid point that is missing in at least one of the three.

7. Mask all the missing values in all three data sets. Now they all have the same spatial resolution, the same spatial coverage and the same base period.

8. Calculate global averages for all three data sets. All three averages are calculated the same way. HadCRUT’s official global averages are an average of Northern and Southern hemispheric averages. This method is not used in my calculations for consistency with GISS and NCDC.
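
Steps 3, 4, 6 and 7 can be sketched for a single grid point as follows. This is a minimal illustration, assuming missing values are stored as NaN so that they propagate through arithmetic on their own. The function names and the two-year toy base period are hypothetical and stand in for the real 1961-1990 calculation.

```python
import math

NAN = float("nan")

def monthly_climatology(series, first_year, base_start, base_end):
    """Mean of each calendar month over the base period for one grid point.

    series: monthly values starting in January of first_year.
    A single missing (NaN) month makes that calendar month's mean NaN,
    because NaN propagates through the sum.
    """
    clim = []
    for month in range(12):
        values = [series[(year - first_year) * 12 + month]
                  for year in range(base_start, base_end + 1)]
        clim.append(sum(values) / len(values))
    return clim

def to_anomalies(series, clim):
    """Subtract the matching calendar-month climatology from every value."""
    return [value - clim[i % 12] for i, value in enumerate(series)]

def common_mask(a, b, c):
    """True where all three series have data; NaN in any one poisons delta."""
    return [not math.isnan(x - y - z) for x, y, z in zip(a, b, c)]

# Toy example: two years of data, base period 1961-1962, one missing January.
series = [float(m) for _ in range(2) for m in range(12)]
series[12] = NAN                      # January 1962 is missing
clim = monthly_climatology(series, 1961, 1961, 1962)
anoms = to_anomalies(series, clim)    # every January anomaly becomes NaN

# Step 6 on three short series: only the last point survives in all three.
mask = common_mask([1.0, NAN, 3.0], [1.0, 2.0, 3.0], [NAN, 2.0, 3.0])
```

Letting the missing value poison the climatology costs coverage, but it avoids a base-period mean built from fewer than the full set of years.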

We now have three data sets that we can make apples-to-apples comparisons with. Now let’s look at the spatial coverage of all three data sets plus the unadjusted version of NCDC.

surface.data.coverage

GISS starts out strong with about 80% coverage and since about the late 1950s, the 1,200 km version has pretty much maxed out. The effects of two world wars causing a loss of reporting stations are clearly visible in the HadCRUT coverage, but not as visible in NCDC and not even visible in either version of GISS. Looking at the two versions of GISS and their difference, there are two noticeable features.

giss.250.1200.versions

I calculated a running 5-year average for both the 250 km and 1200 km versions, subtracted them from the original data and computed the standard deviation of the residuals. The 1200 km version shows 14% more variability in the residuals. This is probably due to the noise in the data that is used in the interpolation propagating from higher to lower latitudes where the noise is given greater weight. The differenced series shows the second noticeable feature.

giss.1200.minus.250

There is an upward trend over the satellite period of 0.0232 ± 0.0068 °C/decade with the lag-1 coefficient of 0.228. This trend is probably due to the interpolation into the Arctic region which is apparently warming much faster than the rest of the globe. Next, let’s look at all three series together with and without equal area coverage and see what differences the area factor makes. I’ve calculated a running 12-month average to suppress noise and make the three series easier to discriminate from one another. Since we now have a time series that is free from area coverage effects, the effects of the different processing schemes may become more visible.

giss.had.ncdc.complete

It is apparent that the differences in spatial coverage account for a significant portion of the differences between the three series. Overall, the equal coverage versions are in better agreement than their unequal coverage counterparts.
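
The running averages used in this post (12-month for these plots, 5-year for the comparison of the two GISS versions) can be sketched as below, together with the residual standard deviation used to compare variability. This is a minimal illustration, assuming a simple unweighted window placed roughly at its center. The exact windowing used for the figures is not stated, so that choice, the function names and the toy series are assumptions.

```python
import math

def running_mean(y, window):
    """Unweighted running mean; positions where the window does not fit are None."""
    half = window // 2
    out = [None] * len(y)
    for start in range(0, len(y) - window + 1):
        out[start + half] = sum(y[start:start + window]) / window
    return out

def residual_sd(y, window):
    """Standard deviation of the series minus its running mean."""
    smooth = running_mean(y, window)
    resid = [yi - si for yi, si in zip(y, smooth) if si is not None]
    mean = sum(resid) / len(resid)
    return math.sqrt(sum((r - mean) ** 2 for r in resid) / (len(resid) - 1))

# Two toy monthly series with the same trend; the second has 14% larger wiggles.
quiet = [0.01 * i + 0.100 * (-1) ** i for i in range(200)]
noisy = [0.01 * i + 0.114 * (-1) ** i for i in range(200)]

# 60 months = 5 years of monthly data.
ratio = residual_sd(noisy, 60) / residual_sd(quiet, 60)
```

Here the ratio comes out at about 1.14, which is how a statement like "14% more variability in the residuals" would be read off.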

The difference in the three series is most apparent at the beginning and end of the series. Here’s how it looks over the satellite period.

giss.had.ncdc.1979.present

For about twenty years beginning with 1979, GISS and NCDC are more tightly bound in the equal coverage than the unequal coverage version. All three series show more consistency with each other for about the last 15 years or so. The apparent divergence of GISS/NCDC relative to HadCRUT in the last decade or so is illusory as the next figure makes clear.

giss.had.ncdc.2001.present

This is fairly important to those who compare anomalies each month as they are published. It’s really an apples-to-oranges comparison.

Looking at the difference between two series at a time reveals significant (in a statistical sense) differences in long term trends.

giss.had.complete

First, take note that the equal area series shows less variance than the unequal area series. The reason is that with the equal area series, we are subtracting the noise from the same set of grid boxes. The magnitude of the noise is similar, but not the same, because different processing schemes are used for GISS and HadCRUT and the interpolation undoubtedly mixes noise from surrounding grid points in GISS. The unequal area series shows more variance because we aren’t exactly subtracting like with like as I described above. As far as trend analysis is concerned, this increases the lag-1 autocorrelation but simultaneously decreases the standard error and we end up with tighter confidence intervals.

GISS anomalies are almost consistently above HadCRUT. The difference shows a long term declining trend. What is causing this? A few possibilities: a) the mean climatology in GISS is colder than the climatology in HadCRUT, b) spurious urban warming or c) a combination of the two as well as other unknown factors. As far as I know, HadCRUT makes no attempt to deal with urban warming. We get different results when comparing GISS to NCDC.

giss.ncdc.complete

The adjusted data shows no significant trend over the period of maximum overlap. Both data sets have similar anomalies because the difference, for the most part, is centered on zero. The next combination is HadCRUT and NCDC.

had.ncdc.complete

Compared to GISS – HadCRUT, the trend is of equal magnitude, but opposite sign. This shouldn’t come as a surprise given that GISS and NCDC are very similar. Now that we’ve got the big picture out of the way, we can look at smaller time periods. The first will be 1880-1949.

giss.had.1880.1949

There’s a barely significant trend in the first 70 years of data.

giss.had.1950.1978

The next period is from 1950-1978. Again, we find a significant downward trend in both the statistical and absolute sense of the word. Here are the next two periods 1979-present and 2001-present, without comment.

giss.had.1979.present

giss.had.2001.present

I really wish I could have gotten the strucchange package to work because it would have been a real help. I might give it another try. BTW, what do you think of the new theme? My graphs look better I must say! Back to work.

October 16, 2009

RSS vs. UAH

Filed under: Data comparisons, Satellite temperature record — Chad @ 9:59 pm

I had begun calculations for a post on urban warming, but as usual, I got sidetracked on something else. This is that sidetrack. There’s much interest in the differences in the RSS and UAH temperature data sets. We’ve seen plenty of graphs showing the difference in the spatial averages of both data sets. But I think we can learn more by looking at the differences between the gridded data. I took the difference between UAH and RSS from January 1979 – Sept 2009 and calculated the trend at each grid point.

uah.minus.rss

As you can see, there is strong differential warming of RSS relative to UAH in the tropics. This probably explains why Santer’s analysis of tropical tropospheric amplification went well with RSS and not so much with UAH. Outside of the tropics, the differential warming is in the opposite direction but dependent on hemisphere with UAH in the lead in the North. Approaching the North Pole, the differential warming becomes unmistakably larger and larger. However, approaching the South Pole, the warming rises, then at about 50° S, it starts to drop. I remember reading that RSS’s correction of diurnal drift is latitude dependent, but I couldn’t find the paper or website where I found it. If anyone knows if this is true, I’d appreciate a reference. Here’s the zonal trend.

uah.minus.rss.lat.trend

Now let’s look at how each satellite record compares to the three surface temperature records. To make the comparison, I had to calculate the monthly mean anomaly and subtract it from the entire gridded data set. This is where I ran into some problems. GISS has the best consistent coverage of the three data sets. As you know, GISS produces two anomaly series: one that interpolates anomalies out as far as 250 km and another up to 1200 km. I used the 1200 km series for this analysis. HadCRUT and NCDC, however, don’t interpolate to fill in missing data. This presents something of a problem when trying to rebaseline the data. If I have a series of January anomalies corresponding to my new base period, the easiest situation is where all the anomalies are real-valued. What does one do when not all of the anomalies are real-valued? Computing a mean while ignoring missing values could bias the results in unpredictable ways. That wasn’t an issue I was interested in dealing with. So I resorted to only computing the mean when all values are defined. This resulted in a serious loss of coverage for HadCRUT and NCDC.

The period of analysis is Jan 1979 – Dec 2008. I chose this period because a) it makes the programming easier and b) the NCDC gridded data only goes out to April 2009. Here’s RSS and UAH minus GISS.

rss.minus.giss

uah.minus.giss

Both data sets show relatively similar patterns of warming but RSS shows stronger more widespread warming in the tropics. Here’s the trend by latitude.

Oct 26, CORRECTION: John N-G discovered an error with the zonal trend calculation. See his comment below. I’ve updated the zonal trend graphs. You can see the three original versions here, here and here.

rss.and.uah.minus.giss.lat.trend.corrected

Next up is HadCRUT.

rss.minus.had

uah.minus.had

It’s not apparent from these two plots, but there is a significant difference in warming in the tropics as the latitude trend shows.

rss.and.uah.minus.had.lat.trend.corrected

Oct 26, UPDATE: I’ve truncated the x-limits of the graph because of extreme values at high latitudes in the South. These values shouldn’t be taken seriously because there are few grid boxes with real-valued trends in this region, so it’s not really a valid spatial average.

Moving on to NCDC,

rss.minus.ncdc

uah.minus.ncdc

The pattern of warming for both data sets is almost identical. So identical I thought I may have mistakenly processed the same data set twice. I checked and that wasn’t the case. The latitude trends confirm this.

rss.and.uah.minus.ncdc.lat.trend.corrected

This plot raises some questions. There is a strong difference between RSS and UAH anomalies. So then why does this difference almost completely disappear relative to NCDC? It’s almost like RSS and UAH suddenly became similar. Strange.

UPDATE! Oct 21, 2009

JeffID suggested that I repeat the trend calculation for the period before and after RSS/UAH were using the same satellites. I recalculated the trend map for the periods Jan 1979 – Dec 1991 and Jan 1992 – Sept 2009. Here are the results.

rss.minus.giss.end.1991

rss.minus.giss.start.1992

The differences are like night and day. Before 1992, the trends were similar to what I found previously. But after 1992, there is strong differential warming over South America, Africa (with some cold spots) and Australia. I’ll leave it to you to discuss the specifics.

October 5, 2009

In Praise of Fortran!

Filed under: Uncategorized — Chad @ 2:48 pm

I was processing the model data and calculating global and tropical averages. I chose only two regions because it took so long to do the calculations. I stopped my program yesterday and decided I should try recoding in Fortran. I wrote the most computationally intensive portions of my code in Fortran, compiled them into a DLL and called it from R. I could not believe how fast my programs were now running. It is beyond amazing! Now I’m going to start fresh as far as processing data is concerned. I’m going to reprocess all the 20c3m/sresa1b runs for MSU channels 2 and 4, as well as the SST, 2-m air temperature and atmospheric temperature data.

October 2, 2009

AR4 Model Data Problems

Filed under: Climate models, Synthetic MSU — Chad @ 10:17 pm

Since the last post I’ve been looking for more efficient ways of processing model data. My original code for computing brightness temperature was slow, mostly because I’m running it on an old computer. In my previous post, I only looked at 20C3M model runs that had a corresponding A1B run. This left a significant number of runs out of the analysis. I want to process those unused runs. I also want to redo Santer’s analysis of the lapse rate. This requires me to compute spatial averages of each atmospheric layer. I had written some code to do this and realized that I could just multiply the averages at each layer by the MSU weighting function to get the brightness temperature. Doing it this way is MUCH faster than multiplying two 3D matrices and summing their product in the third dimension. For comparison, computing by layer took a little more than 5 minutes. Doing it the other way could take about 30-45 minutes. Big difference. However, I ran into some problems. The lowest atmospheric level is 1000 hPa. A lot of models have missing data at that and higher atmospheric levels. Let’s look at NCAR PCM 1. Here’s what the first time step looks like (run 1, 20C3m).

pcm_1000_hpa

Now here’s the fractional area coverage at the first nine pressure levels for the first 5 years of the 20C3m experiment. (Click to get a clearer view.)

pcm_area_fraction

It made sense to me at first that there would be missing data in the lowest level because there are mountainous regions of high altitude (Andes, Greenland, Tibetan Plateau, the Rockies, etc.), but the missing data, as you can see, stretches pretty high into the atmosphere, though its magnitude drops off to almost nothing after the first layer. What didn’t make sense is why the missing data coverage is changing with time. I sent an inquiry to the nice people at NCAR about this issue and received a prompt, informative response. The reason is that the IPCC defined 17 pressure levels and the original model output wasn’t on the same number of pressure levels. The data had to be interpolated to the IPCC defined levels and the missing data is simply an artifact. It was recommended that I obtain the original data from the Earth System Grid. I went there and found a lot of data to sift through. I found this page to help guide me through the confusing indexing of the files contained here. The run apparently corresponding to 20C3M for NCAR PCM 1 is B05.03 but it starts Feb 1871 (the webpage says it starts Oct 1870) and the file on PCMDI starts on Oct 1870. I found the index in the B05.03 run corresponding to the start date in the PCMDI data and took the difference between the lowest pressure levels.

delta_pcm_model

Unfortunately, we don’t see all zeros. This is definitely due to one reason and maybe two others. The definite one is that the data are defined on different pressure levels. One uses regular pressure coordinates, the other uses normalized pressure coordinates. This usually wouldn’t be a problem; however, the PCMDI data is on 17 levels and the original is on 18, so the lowest level may not precisely correspond in both data sets. The other possible reason is that I’m not looking at the same model run. Also, the udunits R package may not be correctly translating the time variable into year/month pairs.

Nonetheless, I’m not going to download any more data. I’m going to reprocess everything I have using the layer-by-layer method that I described earlier. I compared one run to some older data I processed and it does introduce a difference of about 1 °C or so, but it’s a trendless error.
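
The layer-by-layer shortcut works because both the spatial average and the vertical weighting are linear, so their order can be swapped. A minimal sketch, assuming a flattened grid with no missing data and ignoring any surface contribution to the weighting function; the function names, weights and temperatures are hypothetical.

```python
def spatial_mean(layer, area_weights):
    """Area-weighted mean of one pressure level on a flattened grid."""
    return sum(w * v for w, v in zip(area_weights, layer)) / sum(area_weights)

def tb_grid_then_average(temps, msu_weights, area_weights):
    """Weight every grid point vertically, then average the resulting map."""
    n_points = len(area_weights)
    tb_map = [sum(msu_weights[k] * temps[k][p] for k in range(len(msu_weights)))
              for p in range(n_points)]
    return spatial_mean(tb_map, area_weights)

def tb_average_then_weight(temps, msu_weights, area_weights):
    """Average each level first, then apply the vertical weights once."""
    level_means = [spatial_mean(level, area_weights) for level in temps]
    return sum(w * m for w, m in zip(msu_weights, level_means))

# Toy data: 3 pressure levels x 4 grid points, temperatures in K.
temps = [[288.0, 290.0, 275.0, 270.0],
         [260.0, 262.0, 250.0, 248.0],
         [220.0, 221.0, 215.0, 214.0]]
msu_weights = [0.2, 0.5, 0.3]          # made-up vertical weights summing to 1
area_weights = [1.0, 1.0, 0.5, 0.5]    # e.g. cosine of latitude

slow = tb_grid_then_average(temps, msu_weights, area_weights)
fast = tb_average_then_weight(temps, msu_weights, area_weights)
```

The two orders agree only when every level is averaged over the same set of grid points. Once the lowest levels have their own missing-data masks, each level mean is taken over a different area and the two results no longer have to match.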

On a brighter note, I finally was able to port C code into R! My objective was to take the most computationally intensive portion of my synthetic MSU program and recode it in C and call it from R to speed things up. It didn’t work. Rgui.exe kept crashing. Oh well, back to the drawing board.

These results leave open the question: Should these models be used? Maybe. The land areas are completely undefined. This may not be so bad because the weights at 1000 hPa in MSU channels 2 and 6 are relatively small compared to the weights higher up, where the loss of coverage isn’t nearly as bad.

September 26, 2009

AR4 Model Hypothesis Tests

This is the culmination of many weeks of hard work. To recap the previous posts on this blog, I found the weighting function used in Santer et al. 2008 to produce synthetic MSU brightness temperatures. I applied the function to AR4 climate model data and found it to match the data released by Santer. I’ve processed all the data into the same choice of spatial averages used by Santer (global, NH, SH, tropics, etc.). The 23 models used in this analysis are

BCCR BCM 2.0, CCCMA CGCM 3.2 T47/T63, CNRM CM 3, CSIRO MK 3.0/3.5, GFDL CM 2.0/2.1, GISS AOM/EH/ER, IAP FGOALS 1.0g, INGV ECHAM 4, INM CM 3, IPSL CM 4, MIROC 3.2 hires/medres, MPI ECHAM 5, MRI CGCM 2.3.2a, NCAR CCSM 3/PCM 1, UKMO HADCM 3/HADGEM 1

There are a total of 53 runs. Santer didn’t use BCCR BCM 2.0, INGV ECHAM 4 or GISS AOM, nor all the runs for all the other models. The statistical basis of this analysis is Santer et al. 2008. Two hypotheses will be tested: H1) the trend in a particular model run is consistent with the trend in the observational data, and H2) the multi-model mean trend is consistent with the trend in the observational data. For more details on the statistical test, you can read the paper here. I also used the same method to account for autocorrelation in the residuals.
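
For readers who want the gist without opening the paper, here is a rough Python sketch of the two ingredients as I understand them: a least-squares trend whose standard error is inflated using the lag-1 autocorrelation of the residuals (through an effective sample size), and the d* statistic for H2. The function names and the sample numbers are made up for illustration, and the formulas are my reading of the method, so check the paper before relying on them.

```python
import math
import random

def trend_with_adjusted_se(y):
    """Least-squares trend per time step and its AR(1)-adjusted standard error."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y[t] - y_mean) for t in range(n)) / sxx
    resid = [y[t] - y_mean - slope * (t - t_mean) for t in range(n)]
    sum_sq = sum(e * e for e in resid)
    # Lag-1 autocorrelation of the regression residuals
    r1 = sum(resid[t] * resid[t + 1] for t in range(n - 1)) / sum_sq
    # Effective sample size: fewer independent points when r1 > 0
    n_eff = n * (1 - r1) / (1 + r1)
    if n_eff <= 2:
        raise ValueError("series too short or too autocorrelated for this adjustment")
    return slope, math.sqrt(sum_sq / (n_eff - 2) / sxx), r1

def d_star(model_trends, obs_trend, obs_se):
    """Multi-model mean trend minus observed trend, in standard-error units."""
    n_m = len(model_trends)
    mean_trend = sum(model_trends) / n_m
    inter_model_var = sum((b - mean_trend) ** 2 for b in model_trends) / (n_m - 1)
    return (mean_trend - obs_trend) / math.sqrt(inter_model_var / n_m + obs_se ** 2)

# Made-up monthly anomalies: a small trend plus AR(1) noise (not real MSU data)
random.seed(0)
noise, series = 0.0, []
for month in range(240):
    noise = 0.5 * noise + random.gauss(0.0, 1.0)
    series.append(0.01 * month + noise)

obs_trend, obs_se, obs_r1 = trend_with_adjusted_se(series)
print(obs_trend, obs_se, obs_r1)

# Made-up trends, one per model, against a made-up observed trend and error
print(d_star([0.2, 0.3, 0.4], 0.1, 0.1))
```

In this sketch `d_star` takes one trend per model, so with 23 models the test would have 22 degrees of freedom.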

The statistical test will be applied to MSU channels 2, 4 and 6, which correspond to the lower/mid troposphere and the lower stratosphere, respectively. First, let’s look at the 1979-1999 period covered in the Santer paper. There are some differences to note.

TLT_tropics_Jan-1979_Dec-1999_RSS

If you compare this graph to the one in the Santer paper on page 1711, you’ll see they largely match up. You’ll also notice that the number of runs for each model doesn’t match, because when I was processing the data I only chose runs from 20C3M that have a corresponding run in A1B. I calculated the multi-model d* statistic to be 1.27 and 0.58, with 22 degrees of freedom for both and p-values of 0.23 and 0.57, for UAH and RSS, respectively. Hereafter, every time I quote two values of the same statistic, like d* or p-values, they refer to UAH first and RSS second. These figures are fairly close to what is reported by Santer (d* = 1.11 and 0.37). Note that d*1 in Santer et al. is simply d* here.

What happens if we redo the calculations out to the present (data up to August 2009)? Below we see that the model means sit much further above the observational trend than they did in the 1979-1999 period.

TLT_tropics_Jan-1979_Aug-2009_RSS

The d* statistics are 3.49 and 2.01 with, again, 22 degrees of freedom and p-values of 0.002 and 0.056, respectively. At 5% significance, H2 is rejected relative to UAH but barely slips by with RSS. For H1, the two CCCMA CGCM models, GISS ER runs 4/5 and MIROC 3.2 hires run 1 are rejected relative to RSS (17% of the model runs). The situation is worse with UAH: the CCCMA models are rejected, as well as GISS AOM, GISS EH 1/2, GISS ER, the two MIROC models, MRI, NCAR CCSM 3 except run 3 and UKMO HADGEM 1 (55% of the model runs).
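
For anyone who wants to check the arithmetic, here is a small Python sketch that turns a d* value and its degrees of freedom into a p-value. It assumes the p-values quoted here are two-sided Student's t probabilities; the function names are mine, and the integration uses Simpson's rule so that nothing outside the standard library is needed.

```python
import math

def t_pdf(x, dof):
    """Density of Student's t distribution (dof may be fractional)."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2 * math.log(1 + x * x / dof))

def two_sided_p(d, dof, steps=2000):
    """Two-sided p-value for a t statistic, integrating the density on [0, |d|]."""
    upper = abs(d)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    central_area = total * h / 3  # P(0 < T < |d|)
    return 1.0 - 2.0 * central_area

for d in (3.49, 2.01):
    print(d, two_sided_p(d, 22))
```

Under that assumption the results should land close to the values quoted above, with small differences possible because d* is rounded to two decimals.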

If we consider the globe as a whole, the results are fairly similar.

TLT_global_Jan-1979_Aug-2009_RSS

H2 is now rejected in both data sets, with p-values of 0.000 and 0.002. The proportions of H1 rejections are 54.7% and 45.3%. Let’s go back to the tropics and look at the middle troposphere.

TMT_tropics_Jan-1979_Aug-2009_RSS

H2 is again rejected, with p-values of 0.000 and 0.020. The proportions of H1 rejections are 60.4% and 26.4%. The models thus far have taken quite a beating. But what about the stratosphere?

TLS_tropics_Jan-1979_Aug-2009_RSS

We find very different results. H2 is not rejected (p = 0.063 and 0.218). However, H1 still has a high rejection rate of 47.2% and 34.0%.

How different are the test results when we look from the end of the 20C3M simulation up to the present? A world of difference. Starting with the lower troposphere,

TLT_tropics_Jan-2000_Aug-2009_RSS

H2 is not rejected (p = 0.159 and 0.208). The proportion of runs rejected under H1 is 5.7% for both UAH and RSS. For the middle troposphere, we get p = 0.111 and 0.085 with rejected proportions of 9.4% and 13.2%.

TMT_tropics_Jan-2000_Aug-2009_RSS

For the stratosphere, p = 0.969 and 0.934 and none of the individual models were rejected.

TLS_tropics_Jan-2000_Aug-2009_RSS

So what explains these results? Are the models truly consistent with the observed “trend” over the last 9 years? I initially suspected that it was because of the autocorrelation in the regression residuals. I took a look at the AR(1) coefficients for the periods 1979-2009 and 2000-2009. Here’s the frequency distribution for both periods.

ar1_frequency_distribution

The AR(1) coefficients in the second period are less concentrated on the high side, so the inflation of the standard errors from autocorrelation is, if anything, smaller there. Yet the standard errors in the observational data for Jan 2000 – Aug 2009 are roughly seven times their values for Jan 1979 – Aug 2009, while their AR(1) coefficients are essentially the same. This failure to reject may simply be weather noise decreasing the power of the test.

UPDATE (9/26): Chip Knappenberger asked in the comments about the end date of 20C3M. He said he believed the runs ended in 2000. I thought they had ended in 1999. He was right. Most of the models end in Dec 1999. I re-ran my script for Jan 2001 – Aug 2009 to see how the results are affected. Here’s the rub: For the lower troposphere, H2 is rejected with p-values of 0.011 and 0.012 for UAH and RSS, respectively. The proportion of H1 rejections is 24.5% for both UAH and RSS. Again, the standard error in the observational estimates is still many times larger than it was in the 1979-1999 period. As I said previously, I think the weather noise in the data over this (relatively) small period of time is seriously degrading the test’s ability to perform adequately. Here’s a graph of the 2001-2009 period.

            UAH        RSS
d*  -0.03984649 -0.0827173
dof 22.39936015 22.6161493
p    0.96856854  0.9348041

September 23, 2009

Validation Complete

Filed under: Santer, Synthetic MSU — Chad @ 2:50 pm

Like the title says, the validation step is done with only a few minor discrepancies. Issues were encountered with GISS ER and ECHAM 5. The three plots below show both my T2 MSU tropical temperature anomalies and Santer’s. Everything goes fine until the end of the 20C3M period. After that, Santer appears to have concatenated each run using an emission scenario different from A1B. I think it may be the COMMIT scenario. Just a guess.

echam_5_run_1 echam_5_run_2 echam_5_run_3

The only replication failure without any explanation was with GISS ER run 1.

giss_er_run_1

I’m at a loss to explain the difference. Now that the data has been validated, it’s time to start doing some interesting calculations!

P.S. Sorry for the washed out image quality. Should have saved as png!

It’s done, finally.

Filed under: Santer, Synthetic MSU — Chad @ 3:26 am

All the MSU temps have been calculated. The netCDF files have been processed into ASCII files containing spatial averages. I’ve just begun the validation process. I’ve compared three model runs from my collation with Santer’s and they match. I’ll go through the rest of them tomorrow. This computer deserves a good rest and that’s exactly what it’s going to get!
