However One Slices It ... A Warming World

Reconstructing Surface Temperatures Leads to ‘Remarkably Similar Results’

In the aftermath of the recent hacked e-mails affair, much opprobrium has been cast in the direction of the Climate Research Unit (CRU) at the University of East Anglia.

Partly because of confusion over the meaning of the now notorious “hide the decline” remark, some assume that the global temperature record produced jointly by CRU and the Hadley Centre is of dubious quality.

Fox News, for instance, recently ran an article with the alarming headline, “NASA Data Worse Than Climate-Gate Data, Space Agency Admits”, referring to an e-mail obtained by the Competitive Enterprise Institute under the Freedom of Information Act. In that e-mail, one of the lead scientists behind NASA’s GISSTemp temperature series suggested that the Hadley and CRU approach might be superior.

Overseas, Der Spiegel has made similar associations between the e-mails and the perceived validity of the surface temperature record.

All of which makes it worth stepping back and examining exactly how these temperature records are created. At least for land temperatures, it turns out not to be a very complicated process, and recently a slew of climate science bloggers on both sides of the issue have set about replicating the work that CRU and NASA have done, with remarkably similar results.

Global temperature records are assembled using raw temperature data from a number of different sources. Land temperatures come primarily from roughly 7,000 temperature stations in the Global Historical Climatology Network (GHCN). These stations are located all over the world, each with records of varying length and completeness. This data is supplemented with data from additional stations in Antarctica and the United States. A map of the locations of all GHCN stations is shown below.

[Figure: map of GHCN station locations worldwide]

Figure from Peterson and Vose 1997.

The land temperature data is combined with ocean surface temperature data from ship-borne thermometers (for the historical record) and satellites (for the record after 1979) to produce a global temperature record.

When NASA and Hadley/CRU create a global temperature record, they aren’t estimating the average temperature of the globe per se; rather, they are calculating the global temperature anomaly, the change in temperature from a base period. This is an important distinction, as using anomalies helps avoid a number of location- and site-specific biases that will affect the absolute temperature but not the change in temperature over time (for a more technical discussion of the benefits of anomalies vis-à-vis absolute temperatures, see these three posts).
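To make that distinction concrete, here is a minimal sketch in Python; all of the numbers and station values below are made up for illustration. Two hypothetical stations differ by 10°C in absolute temperature, but once each record is expressed relative to its own 1961-1990 mean, their anomaly series are identical, so the site-specific offset drops out.

```python
# Minimal illustration: anomalies remove a station's constant offset.
import numpy as np

years = np.arange(1950, 2010)
warming = 0.02 * (years - 1950)            # assumed 0.02 degC/yr warming

station_a = 15.0 + warming                 # warm, low-elevation site
station_b = 5.0 + warming                  # cool, high-elevation site

base = (years >= 1961) & (years <= 1990)   # common baseline period
anom_a = station_a - station_a[base].mean()
anom_b = station_b - station_b[base].mean()

print(np.allclose(anom_a, anom_b))         # True: the 10 degC offset cancels
```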

But calculating anomalies can be complicated when dealing with station records that are often incomplete or relatively short. To compare anomalies from one station with those from another, a common baseline period is needed. Perhaps the simplest way of calculating anomalies is to identify a specific baseline period (say, 1961-1990), and simply toss out all station records that don't cover that period. This approach, called the Common Anomaly Method (CAM), is the one taken by NOAA's National Climatic Data Center (NCDC) in its temperature reconstructions.
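For readers who want to see the mechanics, a rough sketch of the Common Anomaly Method is below. The layout of the `stations` DataFrame, the 25-year coverage threshold, and the function name are assumptions made for illustration; this is not NCDC's actual code.

```python
# Sketch of the Common Anomaly Method (CAM). Assumes `stations` is a
# pandas DataFrame of annual mean temperatures with years as the index,
# one column per station, and NaN for missing years.
import pandas as pd

def common_anomaly(stations: pd.DataFrame,
                   base_start: int = 1961, base_end: int = 1990,
                   min_years: int = 25) -> pd.DataFrame:
    base = stations.loc[base_start:base_end]
    keep = base.count() >= min_years        # enough baseline coverage?
    kept = stations.loc[:, keep]            # drop stations without it
    return kept - base.loc[:, keep].mean()  # anomaly vs. each station's mean
```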

CAM is less than ideal in some cases, however, as it often requires discarding a lot of useful temperature data. Other approaches pair stations with sporadic records with nearby stations that have more complete records, which allows a common anomaly to be established for those stations. Variants of this approach include the Reference Station Method (RSM) and the First Differences Method (FDM).
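By way of illustration, a heavily simplified version of the First Differences idea is sketched below, reusing the hypothetical `stations` DataFrame from the previous sketch. The published method treats data gaps, record segments, and the re-integration step far more carefully than this.

```python
# Simplified First Differences Method (FDM) sketch: each station contributes
# only its year-to-year changes, so no common baseline period is required.
import pandas as pd

def first_differences(stations: pd.DataFrame) -> pd.Series:
    diffs = stations.diff()              # year-over-year change per station
    mean_diff = diffs.mean(axis=1)       # average the changes across stations
    return mean_diff.fillna(0).cumsum()  # accumulate back into one series
```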

In addition to calculating anomalies, stations need to be spatially weighted. This step involves dividing the world into grid cells and assigning each station to a particular grid, creating an average anomaly for each grid and, depending on the gridding method used, weighting each grid based on its respective geographic area. Spatial weighting is essential to deal with the fact that stations are not equally distributed over Earth’s surface.

For example, if more than 20 percent of the world’s weather stations are in the United States, the U.S. temperatures should not be reflected as 20 percent of the global land temperature, since the country covers only 5 percent of the global land mass.
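A simplified sketch of gridding and area weighting follows. The 5-degree cell size, the `meta` table of station coordinates, and the cosine-of-latitude weighting are illustrative assumptions rather than the exact scheme any of the groups discussed here uses.

```python
# Illustrative gridding and area weighting. Assumes `anoms` is a DataFrame
# of station anomalies (years as rows, station IDs as columns) and `meta`
# is a DataFrame indexed by station ID with "lat" and "lon" columns.
import numpy as np
import pandas as pd

def gridded_mean(anoms: pd.DataFrame, meta: pd.DataFrame,
                 cell: float = 5.0) -> pd.Series:
    lats = meta.loc[anoms.columns, "lat"].to_numpy()
    lons = meta.loc[anoms.columns, "lon"].to_numpy()
    cell_lat = np.floor(lats / cell) * cell + cell / 2   # cell-center latitude
    cell_lon = np.floor(lons / cell) * cell + cell / 2   # cell-center longitude

    # average all stations that fall into the same grid cell
    cell_means = anoms.T.groupby([cell_lat, cell_lon]).mean().T

    # weight each occupied cell by the cosine of its central latitude,
    # a simple proxy for the cell's surface area
    w = np.cos(np.deg2rad(np.asarray(cell_means.columns.get_level_values(0),
                                     dtype=float)))
    weighted_sum = (cell_means * w).sum(axis=1)
    weight_total = (cell_means.notna() * w).sum(axis=1)
    return weighted_sum / weight_total                   # one value per year
```

With this kind of weighting, the densely sampled United States contributes in rough proportion to its share of the gridded land area rather than its share of the stations.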

Armed with spatial weighting and anomaly calculation, scientists can generate a land temperature reconstruction. There are various adjustments made to individual stations or selection criteria to reduce Urban Heat Island effects, station moves, time of observation changes, and other biases, but the net effect of these adjustments vis-à-vis the raw data on a global level is relatively small.

Over the past few months, at least six different groups (including one involving this author) have undertaken independent efforts to reconstruct land temperatures. Each of these series relies on the same basic set of raw station data, but their methods often differ substantially. The reconstructions come from Jeff Id and Roman M, Nick Stokes, this author (Hausfather), Tamino, the Residual Analysis blog, and the Clear Climate Code project.

Additionally, three groups have also released both land and global land/ocean temperature reconstructions using different methods: NOAA's NCDC, Hadley/CRU (HadCRUT), and NASA (GISSTemp).

One can compare the outputs of these series side by side (save for Tamino's, for which the numeric results are not published, but which looks similar to the others in graphs he has posted). The comparison also omits the Clear Climate Code reconstruction, as it is an essentially perfect replication of GISSTemp and is visually indistinguishable from it. Here are land temperatures from 1900 to 2009 for all series:

[Figure: land temperature anomalies, 1900-2009, for all reconstructions]

And the same data from 1960-2009:

[Figure: the same comparison of land temperature anomalies, 1960-2009]

The Jeff Id/Roman M, Nick Stokes, Hausfather, Tamino, and Residual Analysis approaches all use raw GHCN data only. NCDC applies adjustments to the raw data, and HadCRUT and GISSTemp both apply their own adjustments and include additional stations from Antarctica (and, in the case of HadCRUT, from some private station data not included in GHCN).

Note that the two series that have come under the most criticism (GISSTemp and HadCRUT) actually show the lowest land temperature trend of any of the reconstructions, perhaps the result of their Antarctic coverage.

The comparison does strongly suggest that the validity of these temperature reconstructions has in no way been diminished by anything released in the hacked e-mails.

However one slices it, the world has still warmed significantly over the past century.

Zeke Hausfather

Zeke Hausfather, a data scientist with extensive experience with clean technology interests in Silicon Valley, is currently a Senior Researcher with Berkeley Earth. He is a regular contributor to The Yale Forum (E-mail: zeke@yaleclimatemediaforum.org, Twitter: @hausfath).

8 Responses to Reconstructing Surface Temperatures Leads to ‘Remarkably Similar Results’

  1. Jeff Id says:

    The comparison does strongly suggest that the validity of these temperature reconstructions has in no way been diminished by anything released in the hacked e-mails.

    Actually, what it shows to me is that the differences in slope are not created by the secret code that Phil Jones used. Expanding that to the validity of temperature requires the belief that the GHCN data is a good source. I’ve personally found it to be dirty, undocumented and highly variable from station to station.

    The mess may average out but all the curves above show is that the data as presented is what is producing the curves.

    With respect to the emails, they show intent to exaggerate AGW, to eliminate any skeptic point through any means necessary. People with some similar characteristics to the climategators have control over the global data. Until real confirmation and complete documented data are provided, I’ve got no comfort level that the GHCN data is a good source yet. Too many stations dropped out, too many discontinuities between series, and too much odd looking variation from station to station for me.

    I guess I’m saying, it’s important to separate the quality of the gridded average process from the quality of the data.

  2. Joseph says:

    Expanding that to the validity of temperature requires the belief that the GHCN data is a good source.

    If the data were generally bogus, I’d find it hard to explain why you get essentially the same results when you look only at stations in forests, marshes and deserts vs. towns with population less than 0.5 million vs. towns with population more than 0.5 million.

    http://residualanalysis.blogspot.com/2010/04/urban-heat-island-effect-probably.html

    It would also be hard to explain why you find essentially the same results if you look only at towns between 10K and 20K people and towns between 0.5 and 1 million people. But then, if you look at towns with population over 1 million, you do find more warming, which is not surprising.

    http://residualanalysis.blogspot.com/2010/04/urban-heat-island-effect-model.html

    The consistency of the data is actually quite good.

  3. Jeff Id,

    I agree that the quality of GHCN data is questionable at times. For the U.S., where we have much better station metadata, we’ve found numerous factors that bias the data and require corrections (e.g. TOBs, MMTS sensor transitions, etc.). We don’t even know if the time of observation has changed for GHCN stations…

    However, the quality of GHCN data is a separate issue from the allegation that HadCRUT/Phil Jones conspired to exaggerate the surface record. If anything, their adjustments decrease the land trend a bit vis-a-vis the raw data.

  4. Jeff id says:

    I agree with your point to the extent that Jones did not conspire to exaggerate the surface station record through code. It also appears likely that he didn’t look for every avenue of increase due to data either.

    In addition, he wasn’t even looking to knit the offsets like Roman did, or, in a simpler form, as I did before that. Jones didn’t work to his limit looking for any possible way to increase trends. Even though Roman’s method should provide an increased trend and is definitely better.

    However, Phil Jones did try and claim UHI was a non-factor. Something I disagree with. He also is oddly silent when most of the GHCN stations which comprise CRUtem vanished. I don’t subscribe to the belief that these were eliminated to change trends on purpose either, and I’m sorry that Anthony got into that one.

    What I believe is that we need, and I mean absolutely need, a full accounting of all temperature stations in GHCN and around the world. Starting with presentation of raw data and a complete detail of corrections. IMO it would be less than a 100 million dollar project, which is nothing in the AGW industry.

    If they want to get rid of skeptics’ valid concerns, that’s a really good point to start. Pretending like over 70 percent of the data disappearing from the record is ok is nothing but crap.

  5. Joseph says:

    However, Phil Jones did try and claim UHI was a non-factor.

    Well, the unadjusted GHCN v2 data shows that it is a non-factor (I’m not sure if others have confirmed this) except for cities with over 1 million people. Since the number of cities declines exponentially with size, it is practically a non-factor overall.

    Peterson et al. (2003) found UHI undetectable, but he only looked at 289 stations.

  6. But why are conservatives, in particular, compelled to rationalize-away the data and/or eye-witness reports? The following is but one of several closely related and interlocking theories.

    G.Williams
    ~~~~~~~~~~~~~~~~~~~~
    System Justification Theory

    We have shown above that most traditional personality theories about the functions of conservative ideology, especially theories of authoritarianism, dogmatism, and anxiety reduction, stress ego-defensive or ego-justifying aspects of conservatism, that is, the satisfaction of individual needs for security, obedience, and projection (e.g., Adorno et al., 1950; Altemeyer, 1981, 1988; Rokeach, 1960; Wilson, 1973c). Although ego-justifying motives constitute an important part of the appeal of conservatism, there are also group-justifying and system-justifying motives that are satisfied in a particularly efficient manner by right-wing ideologies (Jost & Banaji, 1994; Jost & Thompson, 2000). Social dominance theory, for example, stresses the emergence of conservative legitimizing myths as group-justifying attempts to rationalize the interests of dominant or high-status group members (Sidanius & Pratto, 1999). System justification theory focuses on the motivated tendency for people to do cognitive and ideological work on behalf of the social system, thereby perpetuating the status quo and preserving inequality (e.g., Jost, 1995; Jost & Banaji, 1994). cont….

  7. caerbannog says:


    I guess I’m saying, it’s important to separate the quality of the gridded average process from the quality of the data.

    It’s a good thing that we have two independent satellite-based global temperature records (RSS and UAH) to use as sanity checks against the surface temperature data.

  8. pough says:

    He also is oddly silent when most of the GHCN stations which comprise CRUtem vanished.

    Some stations vanished? How? Did anyone else remain silent when it happened? That’s really weird.

    You aren’t referring to the retroactive stations, are you? That would be a sudden appearance rather than a disappearance. Saying they vanished is like saying you can’t recall 2011.