GHCN v1 vs v3 Special Alignments

Over on WUWT Barry had wanted a couple of “special” alignments of the v1 vs v3 comparison data.

His original statement was a 1900 start time, but in further comments he said the reason was to align the comparison with GIStemp and with Hadley products. As GIStemp “starts time” in 1880 and Hadley in 1850 (as I remember it) I’ve made two special comparison graphs of v1 vs v3 in aggregate aligned on those two start times.

The trend in any curve segment is fairly strongly influenced by the start and end points chosen for the fitted trend line, so these trends will differ both from each other and from the “all time” graph. As this First Differences based method is different from that used in both GIStemp and HadCRUT / CRUTEMP, the trend that the difference found here would induce in those products is unknown.
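
To illustrate how much the start point alone can move a fitted trend, here is a minimal sketch using a purely synthetic series (flat until 1900, then rising at 0.01 per year). It is not the GHCN data and not the First Differences code behind the graphs; the `ols_slope` and `trend_from` helpers and all the numbers are assumptions made up for the example.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic anomaly series (assumption, for illustration only):
# flat at 0.0 before 1900, then rising 0.01 per year until 1990.
years = list(range(1850, 1991))
anoms = [0.0 if y < 1900 else 0.01 * (y - 1900) for y in years]

def trend_from(start_year):
    """Slope per year of a line fit from start_year through 1990."""
    pairs = [(y, a) for y, a in zip(years, anoms) if y >= start_year]
    xs, ys = zip(*pairs)
    return ols_slope(xs, ys)

# Same series, same end point, three different start points.
slopes = {start: trend_from(start) for start in (1850, 1880, 1900)}
for start, slope in slopes.items():
    print(f"{start}-1990: {slope * 100:.2f} per century")
```

The same underlying series yields three different slopes, with the 1850 start giving the shallowest line because it includes the most of the flat stretch.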

So here are the two graphs:

GHCN v1 vs v3 GIStemp 1880 1990 Alignment


GHCN v1 vs v3 Hadley 1850 1990 Alignment


For comparison, the “all data” graph again, but in a slightly different presentation with a trend line:

GHCN v1 vs v3 1990 All Data Alignment


Don’t know if it will be of interest to anyone but Barry, but FWIW, I’ve made the custom cut graphs.

Can’t say I’m not cooperative ;-)


About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in NCDC - GHCN Issues.

17 Responses to GHCN v1 vs v3 Special Alignments

  1. A C Osborn says:

    Chiefio, Zeke Hausfather and Steve Mosher have a rebuttal post of your original analysis over at WUWT.
    They say they have used Raw data from both versions of GHCN.
    What do you think of their post?
    I think that they may be hiding the period of the data changes in their Figure 3; the reductions can be in the past and the increases in the current period, which would produce a bias.
    I am not sure how the Raw data compares to the “Published” data.

  2. Paul in Sweden says:

    Chiefio, you have done well.

  3. Petrossa says:

    Read the post they made and they completely missed the point. They are unable to recognize that you can’t extrapolate (dubious) regional/local temperature to cover vast stretches of unmeasured world. And do that over a prolonged period. Just local climate completely skews the result already.

    Climates come in all sizes. Personally I live in a micro climate of about 100 km². Here temperatures vary between day and night by a few degrees and it never freezes. Citrus grows plentifully. 25 km to either side it freezes up 10 C, and day/night temperatures vary 10 degrees easily.
    Suppose my thermometer was used to smear out over 1200 km……and then replicated.

  4. E.M.Smith says:

    @A C Osborn:

    Thanks for the heads up. Sucked down most of yesterday mostly resaying things I’ve said here in postings.

    I thought of being a bit cheeky and asking where THEY downloaded THEIR v1 data? (In response to the Red Herring about my download of v3) as it has been removed from the NOAA site…. but decided to just let that one lay…

    There isn’t really any “raw”. There is “unadjusted” (that still has a lot of estimates, infilling, ‘corrections’, etc.) and “adjusted” that gets some added specific adjustments (TOBS for example) and more “quality control” changes. Both are published. (For v3, the ending is either ‘qca’ or ‘qcu’ for Q.C. Adjusted vs Unadjusted.)

    What do I think of their post? “Nice try, swing and a miss.”

    They didn’t understand, or ignored, the point about the assemblage of data being different from each data item (i.e. the collection of thermometers is more important than matching per instrument) and attacked ‘by instrument’ comparison – except I’m not doing a ‘by instrument’ compare…..
    “Swishh…… thunk. Steeeerike!”

    They do the usual of showing that all the things that are the same are the same. “Yawn…”

    Then show that on average the adjustments or differences are nicely distributed about the mean. As that is highly unlikely unless ALL the errors were also random, and many classes of error are non-random (systematic); to me it is more suspicious than comforting.
    “Swishh…. thunk. Steeeerike 2”

    Then they show that if you ignore WHEN an adjustment is made, they don’t make a difference in warming trend over time. “Er, what again?”
    “Swishhh… THUNK! Steeeerike 3 !”

    And so it goes…

    Oh, and they didn’t like the way I pitched the ball (FD with a fix to gap handling). Their complaint is still off with the Ref. who’s looking to see what the “spitball” rules cover ;-)

    @Paul In Sweden:

    Thank you! Sometimes I wonder… ;-)

    @Petrossa:

    Yeah. I’ve sometimes wondered how much of that “missing the point” is a mental defect, how much is a “dogged dogma” effect, and how much is “artifice”. Not possible to know, but I wonder anyway… ;-)

    Overall, I got the impression that Mosher didn’t read through what I said nor what I did; just did a quick glance, figured out it would take some time, and leapt to “the usual arguments”. Probably saw that I had “STATN” as a variable in some of the set-up / selection scripts and figured I was matching on stations, didn’t realize it takes a standard Unix Regular Expression and does a selection based on that. (i.e. it lets me select specific ranges of stations as a group for ensemble trend plotting.)
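
    As a rough illustration of selecting a group of stations by pattern rather than matching them one by one, here is a minimal sketch. It uses Python's `re` syntax rather than the Unix tools in the actual scripts, and the `select_stations` helper and the station IDs are hypothetical, made up for the example.

```python
import re

def select_stations(station_ids, pattern):
    """Return the IDs that match the regular expression `pattern`."""
    rx = re.compile(pattern)
    return [sid for sid in station_ids if rx.search(sid)]

# Hypothetical 11-character station IDs (not real GHCN records).
station_ids = ["42500011111", "42500022222", "50100033333", "20500044444"]

# One pattern picks out a whole group for an ensemble plot.
by_prefix = select_stations(station_ids, r"^425")   # IDs starting with 425
by_class = select_stations(station_ids, r"^[45]")   # IDs starting with 4 or 5

print(by_prefix)
print(by_class)
```

    The point is that the pattern defines an ensemble; nothing in it pairs one instrument in one data set with the same instrument in another.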

    Clearly was not taking much time to read what I wrote. Even got thrown by the description of “skipping the gap” on my variation on FD? How hard is it to get: Don’t do anything, don’t reset, just go on to the next valid data item? Personally, I think it is patently obvious that a series of: ( 10.4, missing, 10.1, 9.9 ) would be better represented as anomalies via: ( 0, 0, -0.3, -0.2 ) for an overall change of -0.5 when going from 10.4 to 9.9; especially when compared to what standard FD gives you: ( 0, 0, 0, -0.2 ) for a change of -0.2 when going from 10.4 to 9.9; but they wanted to try and paint that as some kind of Great Risk. And despite my being pretty careful about saying First Differences is peer reviewed and I use a variation on it; wanted to toss a smear bomb asserting I was saying the variation was peer reviewed? Just silly. (Not that peer review means much anymore in the era of Climategate Pal Review… I’d rather have 1000 folks at WUWT inspect it – any errors or problems would be much more quickly and thoroughly discovered.)
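
    The two gap treatments described above can be sketched as follows. This is a minimal reconstruction from the worked numbers in this comment, not the actual code; the function names are hypothetical, and `None` stands in for a missing value.

```python
def fd_reset_on_gap(series):
    """Differences where a gap restarts the series.

    The first valid value after a gap gets a difference of 0,
    as in the "standard FD" numbers quoted in the comment.
    """
    diffs, prev = [], None
    for value in series:
        if value is None:
            diffs.append(0.0)
            prev = None          # reset: forget the last valid value
        elif prev is None:
            diffs.append(0.0)    # start of a new segment
            prev = value
        else:
            diffs.append(round(value - prev, 2))
            prev = value
    return diffs

def fd_skip_gap(series):
    """Differences where a gap is skipped without resetting.

    The next valid value is differenced against the last valid one.
    """
    diffs, prev = [], None
    for value in series:
        if value is None:
            diffs.append(0.0)    # do nothing, keep the last valid value
        elif prev is None:
            diffs.append(0.0)    # very first valid value
            prev = value
        else:
            diffs.append(round(value - prev, 2))
            prev = value
    return diffs

series = [10.4, None, 10.1, 9.9]
reset_diffs = fd_reset_on_gap(series)
skip_diffs = fd_skip_gap(series)
print(reset_diffs, round(sum(reset_diffs), 2))  # [0.0, 0.0, 0.0, -0.2] -0.2
print(skip_diffs, round(sum(skip_diffs), 2))    # [0.0, 0.0, -0.3, -0.2] -0.5
```

    Only the skip-the-gap version recovers the full -0.5 change between 10.4 and 9.9; the reset version drops the -0.3 that happened across the gap.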

    Per Local Climate:

    That was what got me saying “They do WHAT?” about the Reference Station Method. San Francisco is sometimes in the same direction of change as the Central Valley and sometimes exactly opposite. You could make a map showing valid “in fill” creation for winter, only to discover that the “3 to 4 day huff and puff” summer cycle of valley hot air pulling fog over SF was “exactly backwards”. As the amount of each effect varies seasonally, and over the PDO cycle, any extension of coastal thermometers to inland areas will be wrong about 1/2 the time. (or more). That GHCN only has coastal thermometers left in California was kind of an issue for me on that count…

    I’ve also got a Sunset Garden Book that I live by. Reading the descriptions of microclimates, you find many places where the weather causes cold air shedding off the hills into pools in the valley. Sometimes this makes the hills warmer and sunnier. Sometimes the downslope winds can cause compression heating (depends on place, slope, velocity, etc.) So putting your thermometer at an airport in the nice flat valley “has issues” in using it to re-create the hillside temperatures. “Drainage” making the hills warmer in winter, while “downslope winds” making the valley warmer when the wind blows… But “It’s what they do…”

    Then that whole “do it repetitively at least 3 times” smear of 1200 km! Yikes! That lets a thermometer in L.A. have influence over one in Buffalo New York in January. (Yes, it is highly unlikely that any given data item gets smeared that far… then I look at the Arctic Red Blob created from ONE thermometer in “The garden spot of the arctic” … ) So maybe there are places where that “L.A.” temp fills in a missing data item in Utah, that is used to UHI adjust over in Nebraska, that has that Grid/Box value used to interpolate what a missing Grid/Box value set ought to be in Buffalo…. It all depends on what records are “in” vs “out” and what dropouts they have in any given time period. (Realistically, not likely in LA to NY; but almost certain in places like The Pacific where Indonesia has huge drop outs on various wars and where many small islands have truncated data about 1990; so must be filled in from islands far far away.)
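
    The arithmetic behind the L.A. to Buffalo remark can be checked with a short sketch. The city coordinates are approximate values assumed for the example, and "three steps of 1200 km in a straight line" is only an upper bound on chained reach, not a claim about what any particular code does with any particular record.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
REACH_KM = 1200.0          # single-step reach quoted in the comment
STEPS = 3                  # "at least 3 times"

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Approximate city-centre coordinates in degrees (assumed values).
la = (34.05, -118.24)
buffalo = (42.89, -78.88)

distance = great_circle_km(*la, *buffalo)
max_chained_reach = STEPS * REACH_KM

print(f"L.A. to Buffalo: about {distance:.0f} km")
print(f"Upper bound on chained reach: {max_chained_reach:.0f} km")
```

    Under those assumptions the distance comes in just under the 3600 km bound, which is why the chain is possible in principle even if unlikely in practice.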

    But if only I would do my quality checking inside the nice small fixed grid boxes and ignore that, everything is just fine…. it’s only when I color outside the lines that I find an issue…. (Never mind that the other climate codes also “color outside the lines”, just in a more subtle way via RSM, Homogenizing, and UHI “correction”…)

    Oh Well…

  5. JT says:

    Thanks for your responses to my posts at WUWT. This may be a bit off topic but … a while back Steig tried to use satellite data together with sparse Antarctic surface temperatures to, as I understand it, extrapolate a temperature history for the whole Antarctic back in time beyond the satellite period. As we know, his methods were flawed and were markedly improved by O’Donnell and McIntyre and others. So it occurred to me, why not apply the improved methodology to the data from a much reduced station set of rural, stable, no- or low-UHI stations, either by regions across the rest of the world, or globally? This would address the data quality issue by fusing the high quality satellite data with high quality surface data. By limiting the data set to high quality stations it would reduce the sheer size of the database. If the result is consonant with the existing reconstructions it should lay data quality concerns to rest, and if the result is markedly different that would highlight the data quality issue.

  6. Petrossa says:

    Oh Well…Indeed

    The worst thing is this kind of ‘science’ is invading other disciplines. For ages now I have been trying to convince neuroscientists that fMRI can’t be used for anything other than the most basic studies. It also uses a similar kind of statistical smear to get signal from noise. Still they make the most farfetched claims based on them. Look at this: http://www.ncbi.nlm.nih.gov/pubmed/22711879
    Climate Science applied to Neuroscience. Extrapolating from a hardly convincing base to an extreme conclusion.

    Oh Well….It keeps them off the street

  7. j ferguson says:

    J.T.
    your question raises an issue I’ve been pondering: has smearing as a technique been thoroughly tested by application over regions with “known” anomalies and if so how good has been the agreement? It is hard for me to believe this hasn’t been done (likely has), but my knowledge of this art is pretty limited.

  8. E.M.Smith says:

    @J Ferguson:

    The “testing” near as I can tell has been done on the method over a small domain of time and space and only for one technique in isolation; then is applied in bundles with other techniques over much larger domains in time and space.

    For example, the Reference Station Method has a published paper that looked at a small number of stations limited in time and space and found that one set could be used to predict the trend in the other set, and up to 1200 km of separation would still allow an ‘acceptable’ agreement.

    That gets peer reviewed and becomes official dogma…

    Now RSM is applied to all geographies (and near as I can tell no tests are done for differential behaviour in divergent geography types) and over any desired time domain ( so things with long period changes of behaviour like PDO / AMO etc. influence data relationships for sites) even though the original paper, IIRC, covered too short a time window to prove usability with a 60 year window of changing relationships. (Though it’s a fuzzy memory of the paper and their exact time window ought to be verified… but I’m just awake after all of 4 hours sleep and not had morning coffee or tea yet so not high on my ‘priority list’ at the moment ;-) RSM is also applied at least 3 times in a row, a behaviour for which I see no justification in the paper published and no testing at all. It is applied in the context of other “approved” methods like TOBS adjustment and the “QA” of the data that tosses out outlier data and replaces it with an “average of nearby ASOS stations” (which data, near as I can tell, can itself have been ‘repaired’ data… and the comparison history that decides when to toss ‘outliers’ is full of ‘repaired’ comparison data…) and again, I’ve seen nothing that shows that the assemblage of tools taken in concert has ever been tested.

    They just assume that “valid in one case” is valid in all cases, applied as often as you like, and applied in any constellation you like with other adjustments. ( I could be wrong. There could be some paper that tests the assemblage and I’ve just not found it…)

    So my major concern is just the potential for unintended feedback loops in this whole process to induce accumulating error. Things like the “difference from the recent mean” deciding to toss outlier data and replace it with an average of ASOS. Over time, the replaced set will have a narrower spread about the mean, so the test gets tighter, resulting in more tossing and substitution, that ought to end up in a feedback loop slowly squeezing down the range of volatility. When I look at the recent data volatility compared to the past, I see exactly that compression of range and loss of volatility, and that causes me some concern…. (But I’ve not taken the time to figure out which of several errors might be causing it. The QA one, or the electric thermometer swap one with different failure modes and short wires to buildings, or the loss of high altitude volatile stations one, or all of them ensemble…)
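
    The kind of feedback loop described above can be shown with a toy model. This is not the actual QA procedure: the readings are made up, the 1.5 standard deviation cutoff is an assumption, and replacing a tossed value with the mean of the values that passed is only a stand-in for “an average of nearby stations”.

```python
from statistics import mean, pstdev

def qa_pass(values, k=1.5):
    """One toy QA pass.

    Values more than k standard deviations from the mean are replaced
    with the mean of the values that passed the test.
    """
    centre = mean(values)
    limit = k * pstdev(values)
    kept = [v for v in values if abs(v - centre) <= limit]
    if not kept or len(kept) == len(values):
        return list(values)      # nothing to replace
    fill = mean(kept)
    return [v if abs(v - centre) <= limit else fill for v in values]

# Made-up readings with two obvious outliers (14.0 and 6.0).
readings = [9.0, 10.0, 10.5, 11.0, 9.5, 10.2, 14.0, 6.0, 10.1, 9.9]

spreads = [pstdev(readings)]
current = readings
for _ in range(5):
    current = qa_pass(current)   # each pass is judged against the repaired data
    spreads.append(pstdev(current))

print([round(s, 3) for s in spreads])
```

    In this toy case the first pass removes 14.0 and 6.0, and the tighter test that results then rejects 9.0 and 11.0 on the second pass, values that passed comfortably the first time.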

    http://chiefio.wordpress.com/2010/04/11/qa-or-tossing-data-you-decide/

    @Petrossa:

    Interesting link. Don’t know much about how MRI processing is done. Maybe I’ll look at it some day…. One other thing to think about is that each of us is built different. Some folks (and a not insignificant percentage) even have the internal organs mirror flipped (so liver on the other side). A large number (at least 10%) have the brain hemisphere wired the opposite so are left handed. (It is actually more complicated than that with 4 types of handedness… for some folks Rt brain controls Rt hand, for most it is Rt to Lt, for some Rt hemisphere is dominant so they are Lt handed, for others Lt hemisphere is dominant, but it is a Rt hemisphere type, so they are Rt handed, but have a Lt hemisphere type brain running things, etc.) In the end, about 1/3 (or more) of folks have their brains wired differently than “the norm” for handedness and brain lateralization…. And that is only the beginning….

    (Don’t even get me started on how different metabolisms react to various drugs. ANY time you take a drug you are just hoping you are like the “norm” used to accept it. For example, blacks react differently to common heart medications, so do better with different drugs. They have a different nitric oxide metabolism IIRC).

    @JT:

    You are most welcome.

    Sounds like an interesting way to test things. I see two concerns, though.

    1) ANY time there is a splice in data, it is a potential catastrophe and a potential for error and fiddling. (Look at what Mann got into with his hokey hockey stick…) So even if your splice is expertly done and strongly overlapped, blending Sat data with land data will have its own set of concerns.

    2) How / where will you find and identify this set of long lived rural stable stations? For example, ALL of Japan has about a 1/2 degree ( I think it was F but could have been C – fuzzy no coffee moment ;-) shift post W.W.II when the US Occupation changed their instruments and methods (and maybe even their scale…) The whole of the Pacific has a load of disruptions and dropouts during the war years and Indonesia had a coup or two later with multi year dropouts and poorly tended stations. Europe can be almost as bad. China had The Cultural Revolution and the Russian Revolution / W.W.I / W.W.II / USSR / USSR Collapse etc. also introduce discontinuities. Latin America has had more revolutions and wars than I can remember (and tending the thermometer is low on the list of priorities when being shot at) while Africa is a perpetual basket case. And that’s just the political influence issue… The world has dramatically shifted instruments from LIG (mercury OR alcohol or…) to electronics (of a couple of generations, one with a known ‘suck its own exhaust’ problem). Most of the temperatures we do have come from Airports. So pick a nice rural airport? Grass strips in 1920. Thin tarmac in 1930 or 40. Large concrete pads with parking aprons in 1950-60… I’ve got a picture somewhere from a “pristine” tropical island. 1950 the “terminal” was a grass shack. Now it is a nice modern airconditioned building with paved parking areas… Still a “rural” area by any measure you could apply.

    The basic problem is that in most truly rural areas, we have no data other than from airports. In those places where we have data, even from rural areas, land use changes are ongoing. ALL Airports grow over time from grass to pavement to… Even places like isolated Mount Hamilton Observatory have regularly added new buildings, housing, paving, people… so it is only a matter of how much growth of heat island, not ‘no change’.

    TonyB (hope I got it right this time ;-) did a study of long lived stable stations IIRC. He found they don’t show any “global warming”. Do we really need risking splice artifacts trying to extend that over other areas? Or is it enough just to say “When the instruments don’t change, there is no warming. Warming is an artifact of instrument changes and location change.”

    Examples:

    http://chiefio.wordpress.com/2009/08/26/agw-gistemp-measure-jet-age-airport-growth/

    http://chiefio.wordpress.com/2009/09/08/gistemp-islands-in-the-sun/

    That island in the second link is about as “rural” as you can get and with nearly no people. Yet just look at it now…

  9. j ferguson says:

    E.M.
    Thanks much for your thoughts. I continue to be nagged by suspicion that the massage has produced the message. I’m also bothered by Hausfather, Mosher, et al.’s continuing insistence that the data integrity is somehow validated by their analyses showing great similarity while using the various flavored compilations of what must surely be reports from the same thermometers. This would only suggest that the compilations and their “improvements” are similar, not that the underlying data (reported temperatures) have much to do with what actually happened.

    But then I probably haven’t really grasped what they are showing.

  10. j ferguson says:

    what made that comment enter moderation?

  11. E.M.Smith says:

    @J Ferguson:

    Using the name of a person who is “in moderation” so triggered the filter on them… The way that a commenter can be put into moderation is a bit limited in how smart it can be. You can put their name, or their IP, on the “cut list” and it always takes an “OR” not an “AND”…

  12. E.M.Smith says:

    Yes, the patent “group think” in saying “using the same data and methods you get the same results, so it must be right!” is just amazing. That they can’t see that, and inevitably go to attack mode if you do anything different from The One True Way, is a symptom, IMHO…

  13. JT says:

    “How / where will you find and identify this set of long lived rural stable stations?” Hmm… I would guess that a set might be found first in the USA and perhaps in Canada sufficient to apply the method to North America. If that were done and the behaviour of the statistical correlations developed were studied it might result in the discovery of some consistent principles which could be relied upon in regions where the LLRSS (longlivedruralstablestations) are fewer in number. There weren’t many in the Steig or the O’Donnell reconstructions over Antarctica as I recall. I should think that there should be enough stations in Europe as well. As for the rest of the world … I am sensitive to your observations concerning the degree of turmoil they have endured and the consequent reduction in likelihood of finding an adequately long record of LLRSS. But if the method were successfully applied to the European and North American regions it might make it worthwhile for someone to look for LLRSS in the rest of the world.

  14. Chuckles says:

    J. Ferg,
    Keep thinking those thoughts. The dog and pony show or a slight variation gets trotted out whenever anyone suggests what you’ve just noted, that there is no validation or increased integrity, just smoke and mirrors.
    It always appears whenever anyone suggests that the data being used might not be entirely fit for purpose, or when someone notes as you have done that there is no independence between all their ‘independent’ efforts, since they all use different cuts of the same iffy data.
    It matters not what has been said by their target, since they always trot out the same response, much use of anomalies and much hair-splitting use of language and nuance.
    There is also always a great deal of arm waving and snark, as the intent seems to be to prevent and divert discussion rather than provide information or enlightenment.
    The rather fundamental question of whether the input data is fit for purpose must not be discussed. When you consider that a lot of the temperature ‘numbers’, the data, have been truncated or rounded to the nearest whole degree, or worse, a min and max temp, both rounded/truncated, have been averaged, then I say ‘go no further.’
    Nothing can be deduced from that data, or extracted by averaging, smearing or manipulating. Any subtle signal ‘hidden’ in the data is gone.

  15. E.M.Smith says:

    @J Ferguson:

    THE biggest argument for “fudge” instead of “stew” is just that all the adjustments and “errors” and station change and whatever always go in the direction of warmer trend. Stations end up nestled next to large bodies of water (low volatility) over time. They end up near the tarmac at airports ( about 60% IIRC) at Airport Heat Islands. You don’t see a bunch of urban core stations dropped and a bunch of new instruments set up on Smoke Towers at mountain tops… Very suspicious pattern. ( Could still be accidental, but it’s an ‘odds thing’.)

    https://chiefio.wordpress.com/2009/10/09/how-long-is-a-long-temperature-history/

  16. Brian H says:

    EMS;
    yes, there are many one-way valves in the dropout and averaging and updating process. Virtually all militate against a more isolated/cooler siting. Add to that the preference for climbing measurements, and it’s surprising there are any negative data steps at all!

    Oh, wait … there aren’t!

  17. Brian H says:

    Elsewhere you make note of the farcality of GAT. Here’s another kindred soul:

    In the Comments on a SciAm article subtly dissing Judith Curry, “Iconoclast” posts the following:

    14. Iconoclast 05:06 PM 10/23/10

    The proposition that the average temperature of the earth’s surface is warming because of increased emissions of human-produced greenhouse gases cannot be tested by any known scientific procedure

    It is impossible to position temperature sensors randomly over the earth’s surface (including the 71% of ocean, and all the deserts, forests, and icecaps) and maintain it in constant condition long enough to tell if any average is increasing. Even if this were done the difference between the temperature during day and night is so great that no rational average can be derived.

    Measurements at weather stations are quite unsuitable since they are not positioned representatively and they only measure maximum and minimum once a day, from which no average can be derived. They also constantly change in number, location and surroundings. Recent studies show that most of the current stations are unable to measure temperature to better than a degree or two

    The assumptions of climate models are absurd. They assume the earth is flat, that the sun shines with equal intensity day and night, and the earth is in equilibrium, with the energy received equal to that emitted.

    Half of the time there is no sun, where the temperature regime is quite different from the day.

    No part of the earth ever is in energy equilibrium, neither is there any evidence of an overall “balance”.

    It is unsurprising that such models are incapable of predicting any future climate behaviour, even if this could be measured satisfactorily.

    There are no representative measurements of the concentration of atmospheric carbon dioxide over any land surface, where “greenhouse warming” is supposed to happen.

    After twenty years of study, and as expert reviewer to the IPCC from the very beginning , I can only conclude that the whole affair is a gigantic fraud.
