Met Office UEA CRU Data Release Polite Deception

Watch The Pea, Lose the Purse

It’s just a little bit of “magic” from an Illusionist… You know, the “old shell game”.

So, They Released the Data, Right?

Well, sort of…

When you read the press reports, it sounds like there was a release of data from the 1,500 or so stations that were well sited and not subject to legal restrictions on data sharing. It also sounds like they are releasing the basic data for all of us to look at.

The web page that gives you the data:

http://www.metoffice.gov.uk/climatechange/science/monitoring/subsets.html

says:


The data subset consists of a network of individual land stations that has been designated by the World Meteorological Organization for use in climate monitoring. The data show monthly average temperature values for over 1,500 land stations.

“The data”, “individual land stations”, “monthly average temperature values”. It all sounds like they are releasing the temperature data…

But…

There is a link near the top of that page that mentions this is a subset of the HadCRUT3 data set… “But I thought HadCRUT3 was a product, not the ‘raw’ data?”

And right you are… When you click through that link, you find that the “data” they have released is a partial subset of the output of the CRU Code, not the input. From:

http://www.metoffice.gov.uk/climatechange/science/monitoring/hadcrut3.html

I’ve bolded the key words:


HadCRUT3: Global surface temperatures

HadCRUT3 is a globally gridded product of near-surface temperatures, consisting of annual differences from 1961-90 normals. It covers the period 1850 to present and is updated monthly.

The data set is based on regular measurements of air temperature at a global network of long-term land stations and on sea-surface temperatures measured from ships and buoys. Global near-surface temperatures may also be reported as the differences from the average values at the beginning of the 20th century.

So this is the product and not the data. It has the HadCRUT 1850 cutoff in it. It is based on measurements and is not itself a measurement of anything. This is not the temperature data; this is the homogenized pasteurized processed data food product.
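
To make the distinction concrete, here is a minimal Python sketch of "data" versus "product". The numbers are made up for illustration; the point is only that once you subtract a 1961-90 "normal", you are looking at the output of a computation, not at anything a thermometer ever read:

```python
# Illustration only: hypothetical numbers, not real station records.
# "Data" is what the thermometer read; the anomaly "product" is what is
# left after subtracting a 1961-90 baseline ("normal") for each month.

# Hypothetical January monthly mean temperatures (deg C) for one station
january_means = {1995: 4.2, 1996: 3.1, 1997: 5.0}

# Hypothetical 1961-90 January "normal" for the same station
january_normal_1961_90 = 3.8

# The anomaly "product": a set of differences, not measurements
january_anomalies = {
    year: round(temp - january_normal_1961_90, 2)
    for year, temp in january_means.items()
}

print(january_anomalies)  # {1995: 0.4, 1996: -0.7, 1997: 1.2}
```

That difference series is HadCRUT3-style material; the thermometer readings themselves are one step further back.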

Nice deflection. Nice packaging that looks like it is releasing the data while not quite lying. Bad form. Very Bad Form.

Try again, please…

One is left to wonder if they subscribe to the broken behaviours of “Post Normal Science”. It looks like it… A good read on that subject (that does seem to explain the broken moral compass at CRU and Met Office) is here:

http://i-squared.blogspot.com/2009/12/green-snake-in-grass.html

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...

9 Responses to Met Office UEA CRU Data Release Polite Deception

  1. Bishop Hill says:

    Yes, that makes sense. It’s the Met Office doing the releasing. They produce HADCRUT from the output of CRUTEM and HADSST. So it’s input data for the Met Office.

    As you say, watch the pea under the thimble.

  2. VJones says:

    So more Climate Fast Food then. Yuk!

    Thanks for the Hadley links. I took a peek last night. It is so frustrating.

    That link to i-squared blog has made me feel physically ill, although it is creepily familiar. More people need to understand that.

    REPLY: [ Yes. I'm coming to see the pattern of an organized, structured, formal behaviour... Tea helps, though. -E.M.Smith ]

  3. Douglas Hoyt says:

    I don’t think the Met Office is qualified to re-analyze the raw data. They have already stated that they will conclude everything is all right with HadCRUT.

    An independent group of metrologists and statisticians is needed.

    REPLY: [ All they need do is publish the raw data. Lots of us will analyze it for free. ;-) -E.M.Smith ]

  4. Pouncer says:

    It’s also missing data. Being from Kansas, I sought station data in that data set from my area. The data included such a station: Concordia. But the data for a decade in the mid-70s is missing.

    I assure all that Concordia is not Brigadoon, or Oz. It doesn’t come and go with either the times or the winds.

    And this is the PRODUCT? Couldn’t they have interpolated or substituted something if they had holes in the data?

  5. Rod Smith says:

    In my day, admittedly some years ago, information from observations was designated ‘data’ and things such as this called ‘product.’ Pouncer has the terminology exactly right in my book.

    One wonders at this stage if the ‘press’ is a willing accomplice or falls into the ‘duped’ class.

    REPLY: [ I will always use 'data' for the stuff you record from the instrument and 'product' for the result of a computer run or conversion process. I've had to add the metaphor of "raw" data. I use it for what is the input (for example, the USHCN or GHCN "raw" data that are fed into GIStemp); but when you look into it, they say it has had some ill-defined 'adjustments' and 'QA' processing... so it isn't raw data, but some bastard halfway point... I've not yet been able to find really raw data as a collated data set. Images of paper records are about it so far. Per the 'press', I think it is some of each. As this thing falls apart, expect some of the 'accomplices' to feign 'duped' as they run for cover. -E.M.Smith ]

  6. Rod Smith says:

    Hmmm. I don’t really have the slightest problem with your use of the term “raw data,” but especially in light of the phrase “‘adjustments’ and ‘QA’ processing…” I will comment on it.

    The data from surface observations in the USAF, forwarded indirectly to NCDC (during the early 60s at any rate), contained supporting information beyond the WBAN form on which observations were made. These things included thermograph data and continuous recordings of such things as wind speed and direction. We may have (?) also sent hard copies of obs transmissions. I’m sure there were other things too, because it all became a pretty big wad of paper, but the old memory just has too many rust plates these days; sorry.

    In truth much of this stuff IS raw data, but I still have no quibble with your usage; it is just that my viewpoint is still on the nuts and bolts. Plus, unless you are involved in really LOW level QA (was that gust really 9 knots or over, lasting less than 20 seconds?), it is just so much baggage.

    I suspect you are 100% correct in the “run for cover” comment.

    And finally, I AM NOT trying to be picky, but making the point that we generated a ton of paper on top of our basic records.

    REPLY: [ No gripe with anything you had to say. I suspect that NCDC is sitting on, or shredding madly, :-( a whole wad of Really Raw data. They then take that wad and turn it into something via some method and publish THAT as "unadjusted" data (except it has had "QA" screens, some homogenization, etc.) Next it has the daily MIN and MAX averaged, and those daily averages for the month are averaged to give the monthly MEAN. At this point I have a really hard time calling it "raw data". Yet THAT is the input to GIStemp. And when you go to the GISS site and pull up the chart labeled "Raw" you get this GHCN data product (but only AFTER averaging the USA data from USHCN into it)... Hardly raw, but the label on their web page says "Raw" so to point folks at it, I have to say "Raw", but add the caveat that it is anything but... I just wish they would call it "NCDC GHCN processed monthly averages"... and I wish they had "real raw daily" and "QA enhanced daily" on the NCDC web site so we could QA their QA... This kind of subtle deception runs through the whole "Global Warming" mess. -E.M.Smith ]
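
    For what it's worth, that two-step averaging is easy to show in a few lines of Python (hypothetical numbers only): the daily MEAN is (MAX + MIN) / 2, and the monthly MEAN is the average of those daily means. Once that is done, the original MIN/MAX readings can no longer be recovered from the "data".

```python
# Hypothetical daily (max, min) readings in deg F for part of a month.
daily_max_min = [(68, 41), (70, 44), (65, 39), (72, 45)]

# Step 1: daily MEAN = (MAX + MIN) / 2
daily_means = [(tmax + tmin) / 2 for tmax, tmin in daily_max_min]

# Step 2: monthly MEAN = average of the daily means
monthly_mean = sum(daily_means) / len(daily_means)

print(daily_means)   # [54.5, 57.0, 52.0, 58.5]
print(monthly_mean)  # 55.5
```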

  7. Mesa says:

    Hi:

    On RC they are talking about an analysis of the adjustments that shows a zero net decadal adjustment overall. However, it doesn’t show how that net adjustment might look over time. Have you done some study along these lines?

    http://www.gilestro.tk/2009/lots-of-smoke-hardly-any-gun-do-climatologists-falsify-data/

    Cheers,

    Mesa

    REPLY: [ I have not, but there is one floating around. There are links to it posted in comments on at least two threads here. I also think it is just a dodge, since it ignores the larger issue of record truncation. I think it was this one:
    http://statpad.wordpress.com/2009/12/12/ghcn-and-adjustment-trends/

    -E.M.Smith ]

  8. hpx83 says:

    Hey Chief,

    Thanks for your comment on my recent Antarctica findings. Wasn’t sure if your email address was a modern version of the old “lamer@aol.com” address people used to enter when they didn’t feel like entering their own, so I thought it better to reply via your blog.

    I just read the above post, and if I get you correctly, the recently released Met Office data is nothing but junk (or at least, original data thrown in a mixer together with homogenisation junk)? This is very disappointing, and it also accentuates a point I’ve been wondering about since I saw some of their Fortran code: for some reason, when they “homogenise” data for grids, they do not generate the grid output directly (i.e. average temperature per grid area) but instead create a “homogenised” version of the station data, which they then use for the grid data.

    I suspect that what we are seeing is one of these homogenised pieces of work. However, it may still prove useful for some purposes, namely to compare with the “raw” data found in the GHCN dataset (although I doubt even that is “raw” anymore). Unfortunately I don’t know much about data analysis, but I would love for someone more skilled than me to run assorted entropy analyses and distribution studies of the datasets these guys use. I wonder if they’ve been smart enough to add random “noise” after they manipulated stuff, to avoid people finding repetitive patterns etc. Might be worth looking into, since much of the “homogenisation” has been done by running different algorithms over and over until the result was pleasing…

    Also, I will pick up on your South America thread soon, and try to do a complete study of exclusions / adjustments. I can tell you already that the Nicaragua data (which is only a few series) has some serious issues with the trend difference between adjusted and unadjusted. Will try to have something ready this week.

    Cheers

    //hpx83

    REPLY: [ The email address is real (though I only read it once a week or so... I need to keep "deconstructing GIStemp and GHCN" most of the time ;-) ). I was just lucky enough to get a name that says what it does. It's a "spam catcher, share with anyone, public ID". I chose AOL just because they have a pretty good spam filter and I don't have to fool around with it.

    As I read the Met Office release, they are saying this is the "data" that makes the anomaly map, filtered to take out sites that have asked to be kept private; but you click on the definition of "data" and you find out it is the product of CRU, not the input to CRUT. That is, it is a "homogenized data food product". So they are claiming you can show CRUT is valid by looking only at a subset of the CRUT temperature product... but not the input.

    As near as I can tell, the randomization in the "cooking" comes via self-reference. So GIStemp has a station look at a varying set of "nearby" stations for "in-fill" adjustments and for "UHI" adjustments. Then you bias the input by making ever more of the "nearby" stations (up to 1200 km away...) be abnormally warm places. I.e., when you prune out the cold mountain thermometers in California, you must look to the one near the beach in Los Angeles to 'fill in' and 'adjust' its neighbors... And yes, I don't know how "raw" GHCN is (the monthlies are calculated, so they are not "raw"; in any case, I've settled on calling it "base" data, since that is about as accurate as I can get...), but this "base" data will show up any post-processing issues in CRU / GIStemp / (whatever the Japanese series is called) / NCDC "adjusted", and the issue of how "raw" the GHCN base might be can be pushed out just a little bit (and maybe someone else can figure it out... ;-)

    And yes, the 'order' of processing seems particularly designed to make auditing unclear and difficult, and to yield an unneeded cooked "pasteurized processed homogenized data food product" temperature series that is then often called "raw data" or "data" but isn't. -E.M.Smith ]
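
    To give a feel for the "nearby station" weighting being described, here is a rough Python sketch. The linear fall-off of weight with distance out to 1200 km is my reading of the published GIStemp description; the station positions and anomalies are hypothetical, so treat this as an illustration rather than the actual code:

```python
import math

# Rough sketch of distance-weighted "nearby station" in-fill out to 1200 km.
# The linear taper of weight with distance is an assumption for illustration;
# the station positions and anomalies below are hypothetical.

EARTH_RADIUS_KM = 6371.0
CUTOFF_KM = 1200.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def neighbour_weight(distance_km):
    """Weight tapering linearly from 1 at zero distance to 0 at the cutoff."""
    return max(0.0, 1.0 - distance_km / CUTOFF_KM)

def reference_value(target_lat, target_lon, neighbours):
    """Distance-weighted average of neighbour anomalies used to 'in-fill' or
    'adjust' the target station.  A warm station hundreds of km away still
    gets a substantial say, as long as it sits inside the 1200 km cutoff."""
    num = den = 0.0
    for lat, lon, anomaly in neighbours:
        w = neighbour_weight(great_circle_km(target_lat, target_lon, lat, lon))
        num += w * anomaly
        den += w
    return num / den if den else float("nan")

# Hypothetical pruned mountain site "in-filled" from two warmer neighbours
neighbours = [
    (34.0, -118.2, 1.1),  # beach station, a few hundred km away, warm anomaly
    (36.5, -117.0, 0.9),  # desert station, closer, also warm
]
print(round(reference_value(37.0, -119.0, neighbours), 2))
```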

  9. 3x2 says:

    Interesting, in this press release:

    They seem to be suggesting that 5,000 of the 6,000 stations are covered by some kind of “confidentiality” agreement. Not sure how that works if HadCRUT was based on GHCN. Perhaps they are not.

    Having looked at the flush of recent station-related posts over at WUWT, I decided to have a look at various stations from v2.mean and v2.mean_adj, graphing the difference.

    Most are interesting and some belong in a zoo. Some are clean square waves (CHRISTCHURCH 93780) and some just plain odd (HELSINKI 2974). Well worth a look.
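
    If anyone wants to repeat the exercise, something like the following Python sketch will do. The fixed-width layout (a 12-character station/duplicate ID, a 4-digit year, then twelve 5-character values in tenths of a degree C, with -9999 for missing) is my reading of the GHCN v2 format, so check it against the v2 README before trusting any plot:

```python
import matplotlib.pyplot as plt

MISSING = -9999

def read_v2(path, station_prefix):
    """Read monthly values (deg C) for one station from a GHCN v2-style file.

    Assumed layout per line: a 12-character station/duplicate ID, a 4-digit
    year, then twelve 5-character monthly values in tenths of deg C, with
    -9999 meaning missing.  Returns {(year, month): temperature_C}.
    """
    values = {}
    with open(path) as f:
        for line in f:
            if not line.startswith(station_prefix):
                continue
            year = int(line[12:16])
            for m in range(12):
                raw = int(line[16 + 5 * m : 21 + 5 * m])
                if raw != MISSING:
                    values[(year, m + 1)] = raw / 10.0
    return values

# Substitute the GHCN id you care about (e.g. from the v2 station inventory)
station = "REPLACE_WITH_STATION_ID"

unadjusted = read_v2("v2.mean", station)
adjusted = read_v2("v2.mean_adj", station)

# Adjusted minus unadjusted, where both have a value
common = sorted(set(unadjusted) & set(adjusted))
dates = [year + (month - 1) / 12.0 for year, month in common]
diffs = [adjusted[k] - unadjusted[k] for k in common]

plt.plot(dates, diffs)
plt.xlabel("Year")
plt.ylabel("v2.mean_adj minus v2.mean (deg C)")
plt.title("GHCN adjustment for station " + station)
plt.show()
```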

    I tried to figure out how _adj had been adj’d but couldn’t find anything like a set of adjustment templates on their ftp site. I went looking elsewhere with no success, but I did find this from the DMI (pdf). Thought you might appreciate this from page 14:

    Studies of methodological or single station histories are quite difficult and time-consuming to conduct. In addition, the necessary information sources for a given country are available only in that country. The language in which the information is described may also hinder this kind of study by outsiders. In practice, data records are never fully homogenised. New information on metadata or new test methods may reveal some need for further adjustments. However, if any adjustments are made, the original data should never be destroyed and replaced with the homogenised data, because future studies may still require the original data.

    REPLY: [ Love it! Somebody needs to tell the CRU crew... BTW, if you find anything interesting, please let us know! - E.M.Smith ]

Comments are closed.