GHCN Version ONE vs TWO (prepping for THREE)

So, how hard is it to compare them?

I have been waiting for the GHCN Version THREE set (that was supposed to be released in about February) and, well, I’ve gotten tired of waiting. You can only ‘sharpen your knife’ so long before you want to start chopping on something…

So I’ve started taking a few practice swings at GHCN Version ONE. (The present set is GHCN V2).

What I’ve learned so far is, while not surprising, very disappointing. How can folks be so sloppy?

What kind of sloppy?

Well, for starters, the “key fields” are not stable. Anyone with any decent data processing experience would have a distinct key field that is kept unique to the exact instrument/location set and would not change the keys from version to version. These keys would point to the unique meta data (such as LAT / LONG and NAME) and when the data are updated (say for name) a decision would be made to keep the name constant (if a gratuitous change – i.e. an ‘error’ like turning N.W.T. into NWT) or issue a new record and key if a significant change (like if the LAT / LONG / ALT show a real station relocation event).

Look, I can understand having the need to create new country codes as countries come into being, and even the need to split an old country into new ones. But they changed them all. I can see no sane reason to do that. There is no reason whatsoever for the USA, France, UK, South Africa, Brazil, etc. to have had the country code number change. But it does.

The major station ID number (5 digits) does stay stable, but the ‘minor number’ field changes from 2 digits in V1 to 3 digits in V2 (and no, it’s not just a matter of taking 01 and 02 and making them 001 and 002; the actual order of assignment is different as well). So most of the time the number is a good indication, but sometimes it isn’t. xxxxx001 and xxxxx002 may have been xxxxx02 and xxxxx03 in the past (I’ve seen examples of that).
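Going by the sample records further down, a V1 ID looks like 10 characters (3 digit country, 5 digit major number, 2 digit minor) and a V2 ID like 11 characters (same thing with a 3 digit minor). Here is a minimal Python sketch of splitting them. The layout is an assumption read off the samples, and `StationId` and `split_station_id` are made-up names for illustration.

```python
from typing import NamedTuple


class StationId(NamedTuple):
    country: str   # 3 digit country code (402 in V1, 403 in V2 for Canada)
    major: str     # 5 digit major station number, the mostly stable part
    minor: str     # 2 digits in V1, 3 digits in V2


def split_station_id(raw: str) -> StationId:
    """Split a V1 (10 char) or V2 (11 char) ID into its parts.

    Assumed layout, inferred from the sample inventory lines only.
    """
    raw = raw.strip()
    if len(raw) not in (10, 11) or not raw.isdigit():
        raise ValueError(f"unexpected station id: {raw!r}")
    return StationId(raw[:3], raw[3:8], raw[8:])


v1 = split_station_id("4027193000")    # WHITECOURT,ALTA. in V1
v2 = split_station_id("40371930000")   # WHITECOURT,AL in V2
print(v1, v2, v1.major == v2.major)
```

Only the middle five digits can be compared directly; the country code and the minor number need a lookup or a human.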

Given that many stations have name changes, there are a fair number of ambiguous cases where it’s just not clear which station is what.

Even the LAT and LONG often drift by a few 1/100 or even 1/10 of degree (some more than 1/2 degree!). Is that ‘error band’ or is that ‘station move’? Who knows….
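To put those drifts in ground terms, here is a rough conversion from degrees to kilometers. It is only a sketch: it assumes a spherical Earth with about 111.32 km per degree of latitude and a small-offset approximation, and `offset_km` is a name invented for the example.

```python
import math

KM_PER_DEG_LAT = 111.32  # rough length of one degree of latitude (assumed constant)


def offset_km(dlat_deg: float, dlon_deg: float, at_lat_deg: float) -> float:
    """Approximate ground distance for a small lat/lon offset."""
    north = dlat_deg * KM_PER_DEG_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    east = dlon_deg * KM_PER_DEG_LAT * math.cos(math.radians(at_lat_deg))
    return math.hypot(north, east)


for step in (0.01, 0.1, 0.5):
    km = offset_km(step, 0.0, 54.0)
    print(f"{step:4.2f} degree of latitude is about {km:.1f} km")
```

So 1/100 of a degree is on the order of a kilometer, 1/10 is on the order of 11 km, and 1/2 degree is over 50 km. The smallest of those could plausibly be rounding; the largest is a different place.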

Sample Data for Inventory File

So here is a sample of my combined inventory file. I’m merging the V1 and V2 inventory files to make a map of “old index number” to “new index number”.


+40371928000 ROCKY MTN HOU                   52.43 -114.92  988  975R   -9HIxxno-9A-9COOL CONIFER    B
+40371928001 PRAIRIE CREEK RS,AL             52.25 -115.30 1173 1253R   -9HIxxno-9x-9COOL CONIFER    A
?40371928002 NORDEGG,AL                      52.47 -116.08 1402 1480R   -9MVxxno-9x-9COOL CONIFER    A
?40371928003 NORDEGG RS,AL                   52.50 -116.05 1320 1518R   -9MVxxno-9x-9COOL CONIFER    A
+40371928004 SPRINGDALE,AL                   52.80 -114.30  914  954R   -9HIxxno-9x-9COOL CONIFER    A
?4027193000  WHITECOURT,ALTA.                54.15 -115.78  783 1981 1990  1.7 0
?40371930000 WHITECOURT,AL                   54.15 -115.78  782  753R   -9HIxxno-9A-9COOL CONIFER    B
?40371930003 WHITECOURT,AL                   54.13 -115.67  741  732R   -9HIxxno-9x-9COOL CONIFER    C
+40371930001 HELDAR,AL                       54.02 -115.00  701  690R   -9HIxxno-9x-9COOL CONIFER    A
+40371930002 KAYBOB 3,AL                     54.12 -116.63 1003  897R   -9HIxxno-9A-9COOL CONIFER    C
+40371930004 CAMPSIE,AL                      54.13 -114.68  671  668R   -9FLxxno-9x-9COOL CONIFER    A
+40371931001 ANDREW,AL                       54.02 -112.23  610  606R   -9HIxxno-9x-9BOGS, BOG WOODS A
+40371931002 ST LINA,AL                      54.30 -111.45  632  627R   -9HIxxno-9x-9BOGS, BOG WOODS A
+40371931003 MEANOOK,AL                      54.62 -113.35  684  632R   -9HIxxno-9x-9COOL CONIFER    A
+40371931004 ATHABASCA LANDING,AL            54.72 -113.28  503  569R   -9HIxxno-9x-9COOL CONIFER    C
+40371931005 CALLING LAKE RS,AL              55.25 -113.18  598  622R   -9HIxxLA-9x-9COOL CONIFER    B
=4027193200  FORT MCMURRAY,ALTA.             56.65 -111.22  369 1931 1990  0.6 0
=40371932000 FORT MCMURRAY                   56.65 -111.22  369  360R   -9HIxxno-9A-9SOUTH. TAIGA    B
+40371932001 TAR ISLAND,AL                   56.98 -111.45  240  287R   -9HIxxno-9x-9SOUTH. TAIGA    B
=4027193300  FORT CHIPEWYAN,ALTA.            58.77 -111.12  232 1883 1988 34.0 0
=40371933000 FORT CHIPEWYA                   58.77 -111.12  232  245R   -9FLxxLA-9A-9SOUTH. TAIGA    A
=4027193400  FORT SMITH,N.W.T.               60.02 -111.95  203 1914 1990  2.1 0
=40371934000 FORT SMITH                      60.00 -112.00  203  192R   -9FLxxno-9A-9SOUTH. TAIGA    A
=40371934002 FORT SMITH,N.                   60.02 -111.95  205  196R   -9FLxxno-9A-9SOUTH. TAIGA    B
?4027193500  HAY RIVER,N.W.T.                60.83 -115.78  166 1893 1990  4.5 0
?40371935000 HAY RIVER,N.W                   60.83 -115.78  166  173R   -9FLMAno-9x-9SOUTH. TAIGA    C

This is for a chunk of Canada. I’ve added a leading symbol (edited in by hand) where “-” means the old V1 station is dropped in V2. “+” means the V2 station is a new one. A “?” means that there is some doubt and I need to look at the actual data in more depth to sort things out, while an “=” means I think this is the same station.
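Once the symbols are in place, counting them is the easy part. A small sketch, assuming the symbol is always the first character of the line; `FLAG_MEANING` and `tally_flags` are hypothetical names, and the sample lines are trimmed copies of records from the listing above.

```python
from collections import Counter

FLAG_MEANING = {
    "-": "dropped in V2",
    "+": "new in V2",
    "?": "questionable match",
    "=": "same station",
}


def tally_flags(lines):
    """Count merged-inventory lines by their hand-edited leading symbol."""
    counts = Counter()
    for line in lines:
        if line and line[0] in FLAG_MEANING:
            counts[line[0]] += 1
    return counts


sample = [
    "+40371930004 CAMPSIE,AL                      54.13 -114.68  671  668R",
    "=4027193200  FORT MCMURRAY,ALTA.             56.65 -111.22  369 1931 1990  0.6 0",
    "=40371932000 FORT MCMURRAY                   56.65 -111.22  369  360R",
    "?4027193500  HAY RIVER,N.W.T.                60.83 -115.78  166 1893 1990  4.5 0",
]
for flag, n in sorted(tally_flags(sample).items()):
    print(flag, FLAG_MEANING[flag], n)
```

Note that this counts lines, not sites: a matched station shows up as one V1 line plus one or more V2 lines, so turning line counts into site counts takes one more step.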

Notice that Canada has changed from country code 402 in V1 to country code 403 in V2. For that last pair of lines, you can see that HAY RIVER ends with either N.W.T. or N.W (so an automated name match will have issues, and some are even worse), while FORT SMITH,N.W.T. is at one LAT / LONG in V1 and one of the new records is somewhat different (so automation via LAT / LONG will have issues too).
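For a feel of what an automated name match is up against, here is one naive rule: drop everything after the first comma and call two names compatible when one is a prefix of the other. This is a hypothetical rule for illustration, not what I actually use (I’m matching by eye), and `base_name` and `names_compatible` are invented names.

```python
def base_name(name: str) -> str:
    """Upper-case the name and drop anything after the first comma."""
    return name.split(",", 1)[0].strip().upper()


def names_compatible(a: str, b: str) -> bool:
    """True when one base name is a prefix of the other (naive rule)."""
    x, y = base_name(a), base_name(b)
    return bool(x) and bool(y) and (x.startswith(y) or y.startswith(x))


print(names_compatible("HAY RIVER,N.W.T.", "HAY RIVER,N.W"))
print(names_compatible("FORT CHIPEWYAN,ALTA.", "FORT CHIPEWYA"))
print(names_compatible("NORDEGG,AL", "NORDEGG RS,AL"))
print(names_compatible("FORT SMITH,N.W.T.", "FORT MCMURRAY"))
```

The rule copes with the chopped-off V2 names, but it also pairs NORDEGG with NORDEGG RS, which may or may not be the same site. That is exactly the kind of case that ends up with a “?”.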

Then for WHITECOURT, we have 2 potential replacements, but with different minor numbers and in slightly different places. Station move or?… So automated number matches will have issues.

So I’m slogging through the (approximately 13,000) records by hand doing an intelligent match. But it’s slow. Thus the lack of postings the last couple of days.
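A first pass can at least be mechanical: group both inventories by the 5 digit major number and pre-flag the groups that exist on only one side. The sketch below assumes the major number sits in characters 4 through 8 of each line (as in the samples), and `major_key`, `group_by_major` and `first_pass_flag` are names made up for the example. Anything present on both sides still gets a “?” and a human look.

```python
from collections import defaultdict


def major_key(line: str) -> str:
    """Characters 4-8 of the ID: the 5 digit major number (assumed layout)."""
    return line[3:8]


def group_by_major(v1_lines, v2_lines):
    """Bucket V1 and V2 inventory lines by major number, sorted by key."""
    groups = defaultdict(lambda: {"v1": [], "v2": []})
    for line in v1_lines:
        groups[major_key(line)]["v1"].append(line)
    for line in v2_lines:
        groups[major_key(line)]["v2"].append(line)
    return dict(sorted(groups.items()))


def first_pass_flag(group) -> str:
    """'+' if only V2 has the number, '-' if only V1 has it, else '?'."""
    if not group["v1"]:
        return "+"
    if not group["v2"]:
        return "-"
    return "?"


v1 = ["4027193000  WHITECOURT,ALTA.", "4027193200  FORT MCMURRAY,ALTA."]
v2 = ["40371928000 ROCKY MTN HOU", "40371930000 WHITECOURT,AL",
      "40371930003 WHITECOURT,AL", "40371932000 FORT MCMURRAY"]
for key, group in group_by_major(v1, v2).items():
    print(key, first_pass_flag(group), len(group["v1"]), len(group["v2"]))
```

This only flags whole groups. Within a group like 71930, deciding which of the V2 minor numbers are new and which one continues the old record is still the slow part.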

My Stats So Far

This is a rough count of where I’ve gotten so far:

Total Records                13320
Done                          7786
Not Yet Done                  5534

 58%    4592    Equal Sites
 19%    1532    Added Sites
 15%    1183    Dropped Sites
  6%     474    Questionable Matches
 34%            Changed Percent (Added plus Dropped)
 40%            Changed Plus Questionable
 65%            Equal Sites Plus Questionable
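The percentages can be reproduced from the counts if they are taken as a share of the 7786 records done so far and truncated to whole numbers. That reading is an assumption on the part of this sketch, and `pct_of_done` is a made-up helper name.

```python
DONE = 7786  # records flagged so far

counts = {
    "Equal Sites": 4592,
    "Added Sites": 1532,
    "Dropped Sites": 1183,
    "Questionable Matches": 474,
}


def pct_of_done(n: int) -> int:
    """Whole-number percent of the records done so far, truncated."""
    return n * 100 // DONE


changed = counts["Added Sites"] + counts["Dropped Sites"]
questionable = counts["Questionable Matches"]

for label, n in counts.items():
    print(f"{pct_of_done(n):3d}%  {label}")
print(f"{pct_of_done(changed):3d}%  Changed Percent")
print(f"{pct_of_done(changed + questionable):3d}%  Changed Plus Questionable")
print(f"{pct_of_done(counts['Equal Sites'] + questionable):3d}%  Equal Sites Plus Questionable")
```

The four category counts add up to 7781, five short of the 7786 done, which is about what you would expect from a rough count.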

The most immediate thought that comes to mind is how anyone can think they can do global calorimetry to 2 decimal places with 1/3 of the instruments changed every decade.

The second thought is based on observing the data as I’m doing the edits. Canada and Russia have massive change. And that’s where GISS “finds red”. Places with little change of thermometers have little red (Africa, some of south Asia). There are many countries where the Percent Change is near zero, then places like Canada where it’s well over 50% (exact stats when I’m done with flagging North America… So far I’ve done Europe, Asia, Africa, and about 1/2 of North America).

You can learn a lot about National Character doing this kind of thing. The Arab countries and the Chinese have very little station change. LAT / LONG are remarkably consistent. There is the odd change of an airport name from a geography name to a politician’s name, but not much else. Then there is Yugoslavia, where all sorts of things change.

Then there is Canada, which is about as stable as a schizophrenic on meth, with changes happening darned near everywhere in huge percentages. Even for the “stable” places, the LAT and LONG change by small amounts. Did they change instruments? Go from a Stevenson Screen out in the snow to an MMTS on an electric leash to a nice warm building? Most likely. Many stations show a small nudge to the LAT or LONG in the middle of the record (as seen in the above examples). But I’m certain I’d not put much trust in the Canadian record without a whole lot of detailed Station Audit done.

Then there is Russia. Along with the dropouts, there is an odd set of LAT / LONG changes, mostly in the 1/10 or 2/100 range. Cold war artifact of not wanting to give targeting coordinates for Airports? Well known places don’t change (why fib when the fib would be obvious?) while more obscure ones do. Or maybe it’s just sloppy in the remote places. It was bad enough I decided to accept as equal stations with up to 1/10 degree of LAT / LONG variance as long as the name and number matched.

And so it goes. You can learn a lot about politics and character from the inventory file. You can even see wars come and go (France drops out for most of the two world wars). But while it’s great for seeing human induced instrument and political changes, it’s pretty poor for seeing long duration climate and calorimetry processes.
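The “up to 1/10 degree, as long as the name and number match” rule is simple enough to write down. What follows is a hypothetical rendering of it, not the procedure I’m running (that is still eyeballs): it takes “number” to mean the 5 digit major number, compares names up to the first comma, and works in whole hundredths of a degree so floating point noise can’t tip a borderline case. All function names are invented for the sketch.

```python
def hundredths(deg: float) -> int:
    """Degrees as whole hundredths, to avoid float noise at the tolerance edge."""
    return round(deg * 100)


def within_tolerance(a: dict, b: dict, tol_hundredths: int = 10) -> bool:
    """True if lat and lon each differ by at most 1/10 degree."""
    return (abs(hundredths(a["lat"]) - hundredths(b["lat"])) <= tol_hundredths
            and abs(hundredths(a["lon"]) - hundredths(b["lon"])) <= tol_hundredths)


def accept_as_equal(a: dict, b: dict) -> bool:
    """Same major number, same name before the comma, coordinates close enough."""
    same_number = a["major"] == b["major"]
    same_name = a["name"].split(",")[0] == b["name"].split(",")[0]
    return same_number and same_name and within_tolerance(a, b)


smith_v1 = {"major": "71934", "name": "FORT SMITH,N.W.T.", "lat": 60.02, "lon": -111.95}
smith_v2 = {"major": "71934", "name": "FORT SMITH", "lat": 60.00, "lon": -112.00}
white_v1 = {"major": "71930", "name": "WHITECOURT,ALTA.", "lat": 54.15, "lon": -115.78}
white_v2 = {"major": "71930", "name": "WHITECOURT,AL", "lat": 54.13, "lon": -115.67}

print(accept_as_equal(smith_v1, smith_v2))
print(accept_as_equal(white_v1, white_v2))
```

With the sample coordinates, FORT SMITH passes (off by 0.02 and 0.05) while the second WHITECOURT record does not (its longitude is off by 0.11), which lines up with the “=” and “?” flags in the sample.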

One interesting change from v1 to v2 is that in v1, years with missing data have records with missing data flags for the whole year in the monthly mean temperatures file. For v2 the year is simply (and silently) dropped from the record in the v2.mean file. Personally, I’d rather have the record show the missing data as missing, that way you know it wasn’t accidentally dropped or just ignored.
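With the years silently dropped, the only way to see a hole is to go looking for it. A minimal sketch, assuming you have already pulled the list of years present for one station out of the file (the file parsing is left out on purpose); `silent_gaps` is a made-up name and the years in the example are invented.

```python
def silent_gaps(years_present):
    """Return the years missing between the first and last year on record."""
    years = sorted(set(years_present))
    if not years:
        return []
    have = set(years)
    return [y for y in range(years[0], years[-1] + 1) if y not in have]


# Invented example: a record that skips 1933-34 and 1937-39.
print(silent_gaps([1931, 1932, 1935, 1936, 1940]))
```

This can only find gaps inside the record. Years dropped off the start or the end leave no trace at all, which is the other reason I’d rather have the missing data flags.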

Just doing this exercise makes it pretty clear to me that what GHCN / GISS are measuring is thermometer change artifacts, not “global warming”.

Also, you can look at the above sample and see that the “meta data” are much different between the two data sets. The v1 version includes a ‘years of coverage’ start and end years along with a ‘percent missing data’ flag. Both very useful things that are now dropped. Clearly the early version cared about coverage over time, while the present one has strongly de-emphasized it.

A random inspection of a few sites has shown some changes to the temperatures with a bit of recent warming. This is only anecdotal at this point. Given the crappy state of the Inventory files, it will take me a while to get a decent Old / New comparison set built and do some decent A/B comparison reports. One thing is certain, though: When Version Three comes out, if the data have changed, I’ll be spotting it and reporting on it. And if the inventory file changes as much as has happened between V1 and V2, I’ll be pointing that out as well.

It’s really pretty simple: Station Change Matters.

Take a look at how one does calorimetry. Go hit up a college chem teacher (they would love the attention of someone actually being interested in how to do calorimetry right ;-) and ask them. You MUST know the mass, specific heat, phase change, heats of fusion and vaporization, etc. Then, and only then, can you make statements about the change of HEAT via using temperature. Then ask them about the impact of changing out the thermometers periodically, moving them around in the experimental apparatus, and changing what technology is used mid-stream (swapping mercury in glass for electronic, for example). Be prepared for a long lecture… THEN ask what would happen if, over a 100 time period experiment, you changed 1/3 of the thermometers every 10 periods.

Then ask if that would be better, or worse, calorimetry technique than was done by Pons and Fleischmann when they thought they had found Cold Fusion… (IMHO, the cold fusion folks’ work looks stellar in comparison to what the climate guys are doing.) Frankly, I think that the “Global Warming” folks are going to go down in history as far more bogus than Cold Fusion ever was.

And in the end, I think this explains why looking at single well tended instruments that have had no instrument change shows no ‘climate change’. It is only in the aggregate and with highly questionable ‘homogenization’ and ‘adjustments’ that the false signal of ‘global warming’ can be created. As an artifact of instrument and process change. Exactly those things that are considered horrid technique in the chem lab calorimetry experiments. And for good reason.

I’ve got a few other tools I’m developing to do a forensic comparison of the data sets, but those will need to wait a while to be revealed. Until after version 3 is out and they’ve ‘done the deed’. Mostly those tools will be similar to the tools used to find the ‘fingerprints’ left behind in forensic audits of financial records. There are patterns to natural data that do not show up the same way in ‘adjusted’ data. Minor disturbances in the expected and probable patterns. (Folks like to pick ‘7’ as a ‘random’ number, for example. So if you have lots of ‘7’ and few ‘0’ or ‘9’, there is a clue to look for more fudging.) In the end, any attempt to hide ‘data diddling’ (yes, that is the jargon in the field, believe it or not) usually just creates a different set of clues.
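The tools themselves stay under wraps for now, but the general idea of a digit tally is easy to show. The sketch below is NOT one of those tools, just the textbook version: count how often each final digit turns up in a list of integer readings. `last_digit_counts` is a made-up name and the readings are invented.

```python
from collections import Counter


def last_digit_counts(values):
    """Count how often each final digit 0-9 appears in a list of integers."""
    counts = Counter(abs(v) % 10 for v in values)
    return [counts.get(d, 0) for d in range(10)]


# Invented readings (think tenths of a degree), heavy on sevens on purpose.
readings = [127, 97, -13, 207, 150, 88, 57]
for digit, n in enumerate(last_digit_counts(readings)):
    print(digit, "#" * n)
```

A lopsided tally is a reason to look closer, not proof of anything. Rounding and unit conversions can skew final digits all on their own.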

But before I can do that part, I need to get the inventory map worked out, and that, as you can see from the above, is A Piece Of Work… But at least we’re getting some interesting statistics about just how much GHCN is a ‘random box of thermometers’ from decade to decade and just how unstable some places can be. And if the V3 inventory is screwed up by similar changes, you can be assured I’ll be pointing it out.

(NOTE to CANADA: Get your hands off your instruments and stop playing with them. It screws up the readings! 8-)

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...

24 Responses to GHCN Version ONE vs TWO (prepping for THREE)

  1. Scott Finegan says:

    So you really expected a clean, quick way to compare…

    In big business, when some division isn’t producing as expected, they have a major reorganization of people and responsibilities. It becomes impossible to have an apples to apples comparison between the old org. and the new org. No accountability… because a lot of the lost money, or productivity, is charged to the reorganization.

    Based on anecdotal evidence…
    In climate science, there isn’t much to reorganize, and it shouldn’t affect the results, but the people doing the work really don’t want anyone looking over their shoulder, and don’t want them finding past mistakes. Making versions easy to compare exposes them to criticism sooner. Making them hard to compare means fewer will bother to try.

  2. Gary says:

    Having worked a long time with datasets that are accumulated piecemeal from different sources and at different times, it seems perfectly natural that the GHCN data is a mess. I suspect various people have been involved and, without a complete knowledge of the details, have hesitated to do anything other than add on as best they know how. At some points there may have been someone more energetic than usual who made major changes. Doubtful his/her successors kept up with them, though. Also the source data probably changed format a few times, meaning it had to be stuffed into the dataset somehow. Then there are the inevitable transcription errors (early data probably came on hardcopy). It all comes from not having and enforcing standards from the beginning. However, when the project started I doubt they gave much thought to standards and instead were eager to get analyzing.

  3. Lynn Erickson says:

    I may be overlooking something obvious here, but can’t you use the “stable” major station ID as a key?

    Then you could compare data fields for differences, and spit out reports by selected differences, rather than examining records by hand.

    At least that’s how I would do it in C. I’m pretty ignorant about R, or any R equivalent.

    Thanks for all your work.

    Lynn

  4. Ken McMurtrie says:

    E.M.
    Great to see you back on site, publishing for the sake of TRUTH instead of lies, facts instead of spin, science instead of prejudice, substance instead of rhetoric, selflessness instead of self interest.
    My gratitude and best wishes.
    Ken

  5. vjones says:

    Glad to see you posting on this. Something like this is long overdue. At least the 5 digit WMO code number doesn’t change, but I know from hearing Kevin’s complaints over what is covered by the same WMO number and different series numbers that that can be challenge enough.

    What are the timings of the two versions? I mean when did V2 supersede V1?

  6. Schrodinger's Cat says:

    I’ve been following your work for a long time and I believe it is hugely important. If the main dataset upon which climate science is based is showing mainly process effects and little or no evidence of global warming, then the whole gigantic issue is a mess.

    Have you got plans to draw conclusions backed by evidence and would you seek to publish them in a forum recognised by the “scientific community”? Evidence that stands the scrutiny of independent expert auditors is what is needed. Maybe your contribution is pivotal in this respect.

    I thought the excellent work you did with Anthony and Joe was going to have huge impact, but as far as I could tell, the MSM and the establishment ignored it, much to my frustration, hence my question above.

  7. Tom Bakewell says:

    Sir, I love your posts.

    The small lat-long changes may be due to a change in geodetic datum from North American Datum 1927 to NAD83 in the mid 80’s. These are not huge shifts: only a few meters in some directions and maybe up to 200 meters in another direction. Just another inconvenient detail when run thru the CRU data shredders…

    Both God and the Devil dwell in the details. Thank you so much for diving in to get this all straightened out.

    Tom Bakewell

  8. RuhRoh says:

    A while back, I remember Mike McMillan (sp?) posting blinkers of a bunch of stations in Illinois.
    I can’t seem to find them to check if they are V1 vs V2 .
    Also, whether that is all of the stations in that state or somehow selected.

    Maybe he will notice this note in this thread…

    Wow, another Augean stable at risk of getting cleaned out…

    RR

  9. Ken McMurtrie says:

    Ref Gary and Scott.
    Your points made about accuracy and details in statistical analyses, fallibility and authenticity perhaps of results and conclusions, are probably well founded and relevant to general cases.
    However, here we have a situation where government policies and billions of dollars of taxpayer money are reliant on the accuracy and truthfulness of the IPCC statistics that are being critically reviewed by E.M. Smith.
    This is not just a matter of principle, there are lives and livelihoods at stake. Whole country economies are being threatened by the likelihood of the IPCC submissions and recommendations being, at very least, misleading, at worst fraudulent.
    There is no room for the smallest of errors in this situation.
    The IPCC have been shown up to be arguably suspect. So far no criticism has been made of E.M. Smith’s own data selection and processing, (as far as I know). His conclusions have yet to be accepted officially, but are seen to be soundly based by many of the website viewers. I wonder how many?

  10. Keith Hill says:

    Sorry this is just a “cut and paste” E.M, but after reading your very welcome latest post, I couldn’t resist this reprise of the HARRY_READ_ME.txt file from Climategate. I’m sure that by the time you finish your current surgical analysis of these matters, you’ll have a deal of sympathy for Harry!
    Good luck!

    Quote:-

    ‘Botch after botch after botch’
    Leaked ‘climategate’ documents show huge flaws in the backbone of climate change science
    By LORRIE GOLDSTEIN
    Toronto Sun
    29th November 2009

    …The file — 274 pages long — describes the efforts of a climatologist/programmer at the Climatic Research Unit (CRU) of the University of East Anglia to update a huge statistical database (11,000 files) of important climate data between 2006 and 2009.

    The computer coding, along with the programmer’s apparently unsuccessful efforts to complete the project, involve data that are the foundation of the study of climate change — recordings from hundreds of weather stations around the world of temperature and precipitation measurements from 1901 to 2006, sun/cloud computer simulations, and the like.

    The CRU at East Anglia University is considered by many as the world’s leading climate research agency. Here’s how CBSNews.com…’s Declan McCullagh describes its enormous impact on policymakers:

    “In global warming circles, the CRU wields outsize influence: It claims the world’s largest temperature data set, and its work and mathematical models were incorporated into the United Nations Intergovernmental Panel on Climate Change’s 2007 report. The report … is what the Environmental Protection Agency acknowledged it ‘relies on most heavily’ when concluding carbon dioxide emissions endanger public health and should be regulated.”

    As you read the programmer’s comments below, remember, this is only a fraction of what he says.

    – “But what are all those monthly files? DON’T KNOW, UNDOCUMENTED. Wherever I look, there are data files, no info about what they are other than their names. And that’s useless …” (Page 17)

    – “It’s botch after botch after botch.” (18)

    – “The biggest immediate problem was the loss of an hour’s edits to the program, when the network died … no explanation from anyone, I hope it’s not a return to last year’s troubles … This surely is the worst project I’ve ever attempted. Eeeek.” (31)

    – “Oh, GOD, if I could start this project again and actually argue the case for junking the inherited program suite.” (37)

    – “… this should all have been rewritten from scratch a year ago!” (45)

    – “Am I the first person to attempt to get the CRU databases in working order?!!” (47)

    – “As far as I can see, this renders the (weather) station counts totally meaningless.” (57)

    – “COBAR AIRPORT AWS (data from an Australian weather station) cannot start in 1962, it didn’t open until 1993!” (71)

    – “What the hell is supposed to happen here? Oh yeah — there is no ’supposed,’ I can make it up. So I have : – )” (98)

    – “You can’t imagine what this has cost me — to actually allow the operator to assign false WMO (World Meteorological Organization) codes!! But what else is there in such situations? Especially when dealing with a ‘Master’ database of dubious provenance …” (98)

    – “So with a somewhat cynical shrug, I added the nuclear option — to match every WMO possible, and turn the rest into new stations … In other words what CRU usually do. It will allow bad databases to pass unnoticed, and good databases to become bad …” (98-9)

    – “OH F— THIS. It’s Sunday evening, I’ve worked all weekend, and just when I thought it was done, I’m hitting yet another problem that’s based on the hopeless state of our databases.” (241).

    – “This whole project is SUCH A MESS …” (266)

    Unquote.

  11. MichaelM says:

    Ruh Roh,

    I believe you’re looking for these:

    http://www.rockyhigh66.org/stuff/USHCN_revisions.htm

    I don’t remember if this was in reference to v1/v2 or the infamous “Hansen’s Y2K problem” as documented by McIntyre.

    _Michael

  12. MichaelM says:

    Upon reflection, I don’t think the Y2K problem documented here http://climateaudit.org/2010/01/23/nasa-hide-this-after-jim-checks-it/ and other places, is related to the blink charts. But the referenced article does show the propensity of NASA GISS to change the PAST data without telling anyone.

    Unfortunately, I haven’t heard much from folks such as Lucia’s ‘modelling brigade’ about this step change downward in USHCN data. Wouldn’t it have a significant effect on the overall US trend?

    _Michael

  13. WillR says:

    Maybe you should talk to Richard…
    http://cdnsurfacetemps.blogspot.com/

    I know that he is having an adventuresome time with the Canadian Data…

  14. RuhRoh says:

    Hey Chief,
    I’d be willing to take a shot at comparing the small lat/lon changes to see if they are indeed the result of the change in projection from NAD27 to NAD83.

    Send me a short txt file with lat lon in the first columns, and a few examples of the dubious shifted coordinates.
    RR

  15. E.M.Smith says:

    Lynn Erickson

    I may be overlooking something obvious here, but can’t you use the “stable” major station ID as a key?

    Then you could compare data fields for differences, and spit out reports by selected differences, rather than examining records by hand.

    That’s basically what I’ve done, in that I make a merged set sorted by 5 digit key. Then I just scroll through looking at the records. Given the number of “interesting things” it’s about as fast as kicking out the identical ones then working through the odd cases and reintegrating.

  16. E.M.Smith says:

    vjones

    Glad to see you posting on this. Something like this is long overdue. At least the 5 digit WMO code number doesn’t change, but I know from hearing Kevin’s complaints over what is covered by the same WMO number and different series numbers that that can be challenge enough.

    What are the timings of the two versions? I mean when did V2 supersede V1?

    Well, it MOSTLY didn’t change…. I’ve actually run into a dozen or so where the WMO code DOES change (and everything else stayed the same)!

    Looks like they are on the ‘every decade’ plan. V1 was 1990, v2 was 2000, V3 ought to be 2010…

    Ken McMurtrie

    There is no room for the smallest of errors in this situation.
    The IPCC have been shown up to be arguably suspect. So far no criticism has been made of E.M. Smith’s own data selection and processing, (as far as I know). His conclusions have yet to be accepted officially, but are seen to be soundly based by many of the website viewers. I wonder how many?

    Well, I have my fair share of critics. They’ve not found anything wrong with the dT/dt method, but generally don’t like it (as it doesn’t do what they did…). The major criticism is that I don’t do the areal adjustment and grid/box thing so the Global Average Temperature can’t be right… which completely misses the whole point of the approach. It’s not to find a GAT, it’s to inspect the data quality via looking at a very narrow type of anomaly (only a single station to itself with no adjustments, infill, whatever) and see if that shows patterns that would argue for ‘issues’ in the data (and it does, big time…)

    I also don’t really do ‘data selection’. I use the whole GHCN set. I do look at it with different perspectives (cuts by altitude, by latitude, by country, etc.) but that’s to illuminate patterns in the whole set, not select parts of it for an agenda driven conclusion. That also throws some folks. It’s a “forensics thing”, and He Who Shall Not Be Named goes bat shit over it. Saying “Ravings of a lunatic” I think it was. Just doesn’t ‘get it’ that a forensics approach looks at ALL KINDS OF ANGLES, not just the one or two a researcher would say are the ‘right ones’. Like looking at a potentially forged document under white light, UV, IR, and various other color bands. AND with different fluorescent dyes applied. Yeah, it’s “not right” for reading a nice check as it’s supposed to look; but it is Very Very Right for finding ‘odd things’ that ought not to be there…

    In forensics, if your ‘opponent’ says “here, sit in the front row for a good view of the show”, you immediately move to the wings, and have an assistant go to the basement and look for trap doors while the ‘show’ is being performed. You deliberately ‘slightly break things’ and see if the result is as expected. You NEVER do it the same way the target of the investigation does things… And that causes some complaints that I’m not ‘doing it the right way’. ( The peer reviewed way? ;-)

    There was also a brief spate of folks wanting to decorate the place with links to articles of the form “you are an idiot because”. They got pissed and went away when I told them that 1) They had to be polite. and 2) I didn’t take kindly to posting links to articles of the form “you are an idiot because”. … so they’ve gone away and not returned.

    (Almost all the articles basically said “up to 1990 matched for kept and tossed stations so you are an idiot because they must have been the same afterwards too.” This fails for the simple reason that you cannot compare what is not there. So, for example, stations that match in a cold PDO phase might not match in a warm one. Also, if the drop of stations in 1990 was biased to, for example, keep stations that change to ASOS but drop those that don’t (i.e. airports, that we know happened…) and the ASOS had issues (which we know it had) then you also don’t pick up the change by comparing the two sets pre-ASOS. There are more examples, but that gets a bit long for a comment.)

    At any rate, those folks have largely gone elsewhere to talk among themselves and say “He’s an idiot because up to 1990 things match, so after 1990 must too.” (and I’m glad they’ve gone. It gets tedious pointing out that when things change, they are different ;-) Oh, one other minor comment. We lose about 90% of stations in The Great Dying so it’s pretty easy to have a bias in a 10% “kept” that was not in the 90% tossed and that bias would be lost in the comparison due to excessive averaging. (Averages are great at hiding things, not so good at revealing them…) and there is a time synchronous change of QA process and some equipment. I’ve often stated that it could simply be those things in the ‘kept’ stations and not just a physical site artifact (i.e. only altitude). Finally, there is an aspect that I’ve alluded to before that I’m not going to share just yet (as it’s fodder for a publication potentially) that DOES vary directly with station latitude, altitude, etc. and will impact the kept vs tossed, but will be lost in the type of analysis that the Idiot Mongers like to toss around (heavily averaged sets of data, in two groups). But I can’t show them where they are wrong without putting in jeopardy the potential to publish, so that just has to sit and fester in their craw for now. Oh Well.

    But the bottom line is that yes, I do have critics, but since I set a couple of simple rules, they all nattered to each other that I was not a nice person and must censor or something and all ran off together to go somewhere else and complain. (No, really… There were about a half dozen, then a day or two later, none. Very odd. Strongly suggestive of coordinative activity of some sort – though perhaps no more sinister than hanging out on the same warmer blog sites together.)

    And Lucia decided I was “not interesting” after I said I didn’t want to play “Lucia holds the coats while you two boys fight” over at her place. I’m really not interested in games of ‘who debates best’ so much as I’m interested in ‘who finds the truth best’, and that causes a lot of folks to pack up and go looking for some “action”… Which is also fine with me.

    I started this blog mostly as a place to put my own notes as I found things and as I was working on things. Not as a place for school yard brawls, bully boy taunts, or name calling. Got over that about 3rd grade. Some folks didn’t. So if folks want to watch, ask questions, discuss things among themselves, make suggestions, heck, even take bits of the puzzle and work on it too, all that is just fine with me. If folks want to mess up the place with graffiti and get into distracting name calling, well, that doesn’t fit the theme as described under the “rules” and “about” tabs (a polite lawn party with friends) and those folks don’t get much satisfaction here. (I’m also not driven by a desire for “hit count” or any of the other ‘size’ issues; so I’m not driven to keep posting controversial things just to stir the pot.)

    So with all of that, when you post fact oriented things like, oh “The keys were changed for no good reason”, there just isn’t a lot of ‘come back’ that can be tossed at you. And if it has to be done politely, there is even less emotional gratification for the mud tossers.

    At least, that’s my take on things.

  17. E.M.Smith says:

    @Keith Hill:

    Yeah, that about sums it up! If I ever meet Harry R.M. I’m going to buy him a few beers…

    @RuhRoh

    I’d be willing to take a shot at comparing the small lat/lon changes to see if they are indeed the result of the change in projection from NAD27 to NAD83.

    Send me a short txt file with lat lon in the first columns, and a few examples of the dubious shifted coordinates.

    I took a couple of days off so I’m still not done. Probably a couple of more days to finish. (Well, really, I’ve been dealing with my street being repaved, walking a few blocks to the car for a week, getting new tires – for unrelated reasons, an invite to a very nice Solstice Party down toward Monterey, and solving a minor crisis for my mechanic… but that’s what I do on ‘days off’ from the blog ;-) Once I’m at a convenient place in the data, I’ll send a sample or make a posting.

    What I intend to do is sort out the “-” from the “=” from the … and then do some simple analysis on them. It also ought to be possible to run the “?” set through a simple filter for “delta LAT LONG” and just have that set, if so I’ll likely post it “just for grins”. We’ll see.
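
    A minimal sketch of what such a “delta LAT LONG” filter could look like, in Python. The record layout, the station IDs, the coordinates and the 0.05 degree threshold are all invented for illustration; none of it is the actual GHCN file format or a tested cutoff, and longitude wrap-around at 180 degrees is ignored.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """One station as it appears in both versions (hypothetical layout)."""
    station_id: str
    v1_lat: float
    v1_lon: float
    v2_lat: float
    v2_lon: float


def coordinate_delta(p):
    """Return (delta_lat, delta_lon) in degrees, V2 minus V1, rounded to 4 places."""
    return (round(p.v2_lat - p.v1_lat, 4), round(p.v2_lon - p.v1_lon, 4))


def shifted(pairs, threshold=0.05):
    """Keep only stations whose LAT or LONG moved by more than the threshold."""
    out = []
    for p in pairs:
        dlat, dlon = coordinate_delta(p)
        if abs(dlat) > threshold or abs(dlon) > threshold:
            out.append((p.station_id, dlat, dlon))
    return out


# Made-up sample records, NOT real GHCN stations.
SAMPLE = [
    Pair("AAAAA001", 50.00, -100.00, 50.00, -100.00),  # unchanged
    Pair("BBBBB001", 60.10, 30.20, 60.20, 30.20),      # latitude moved 0.1 degree
    Pair("CCCCC001", 45.50, 120.00, 45.52, 120.01),    # below the threshold
]

for station_id, dlat, dlon in shifted(SAMPLE):
    print(station_id, dlat, dlon)
```

    The threshold is the judgment call here: set it too low and every rounding difference shows up, too high and small real moves vanish.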

    My take on it is that it does not look like a datum change. There are stations near each other where some will be matches while others will be ‘a bit off’. It mostly looks like either a minor station move (Canada), or a minor deception / error fix (Russia), or maybe improved precision as they got a GPS in 2000 instead of an old sextant in 1820 ;-) (Russia, Canada, China…).
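
    The reasoning there can be put as a rough test: a datum change ought to move stations that sit near each other by nearly the same amount, so a cluster of neighbors where some match exactly and others are ‘a bit off’ does not fit. A sketch in Python, assuming the shifts for one cluster have already been worked out as (delta_lat, delta_lon) pairs in degrees; the function name, the 0.01 degree tolerance and the numbers are hypothetical.

```python
def consistent_shift(deltas, tolerance=0.01):
    """True if every station in a nearby cluster moved by about the same vector.

    deltas: non-empty list of (delta_lat, delta_lon) tuples in degrees.
    tolerance: allowed spread between the largest and smallest shift (assumed value).
    """
    if not deltas:
        raise ValueError("need at least one station in the cluster")
    lats = [d[0] for d in deltas]
    lons = [d[1] for d in deltas]
    lat_spread = max(lats) - min(lats)
    lon_spread = max(lons) - min(lons)
    return lat_spread <= tolerance and lon_spread <= tolerance


# Invented clusters for illustration only.
uniform_cluster = [(0.0, 0.03), (0.0, 0.03), (0.0, 0.03)]  # what a datum change might look like
mixed_cluster = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.0)]       # one neighbor moved, the others did not

print(consistent_shift(uniform_cluster))
print(consistent_shift(mixed_cluster))
```

    Coordinates stored to only a couple of decimal places would blur this, so a ‘consistent’ result is a hint, not proof of a datum change.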

    But more eyes can likely make more sense of it.

    (FWIW, the ‘minor crisis’ involved coming up with a novel cleaner that just might be a money maker. I have an idea how to improve it that may be patentable, so the rest of that story, too, will need to be left fallow… but it removes grease without removing paper gasket material or paint ;-)

    So, at any rate, after a few more bits of ‘excitement’ I’m back at the keyboard…

    And hoping they will be done with my street tomorrow… backhoes, road graders, and rollers at 7 AM are not my idea of morning wakeup music…

  18. E.M.Smith says:

    @Schrodinger’s Cat:

    I’ve been following your work for a long time and I believe it is hugely important. If the main dataset upon which climate science is based is showing mainly process effects and little or no evidence of global warming, then the whole gigantic issue is a mess.

    Well, that about sums it all up, IMHO! (And thanks for the vote of support.) It’s got so many artifacts in it that teasing a signal of a couple of tenths C out of it and thinking it means anything is, IMHO, a fool’s errand. Heck, I documented 0.6 C of warming bias just in the first couple of steps of the data processing of GIStemp. An overall 1/100 C warming of the data set from a single line of code in STEP0 (!). Then there are the GHCN shenanigans and the splice artifacts from having a lot of temperature segments merged (via homogenizing and The Reference Station Method) and a dozen other things. At the end of it all, I think folks would be lucky to get a 1.0 C precision out of it. (And an unknown accuracy…)


    Have you got plans to draw conclusions backed by evidence and would you seek to publish them in a forum recognised by the “scientific community”? Evidence that stands the scrutiny of independent expert auditors is what is needed. Maybe your contribution is pivotal in this respect.

    Yes, I do. But they are longer range. First up is to make sure any build up to Mexico this summer is about as effective as the build up to Copenhagen was ;-) Then there was that lawsuit someone filed against the EPA to try to block them. Things are moving fast enough that I’m probably going to continue tossing things out for public review rather than holding it tight for potential publishing in a year or three… (Except, maybe, one or two things…)

    In particular, I think that the dT/dt method has a lot of potential as a data audit tool. It does strongly point up some very odd change done about 1987-1993 to the stations or the data processing. And the method is pretty darned bullet proof. So I think it has the potential to be a paper. With some work.


    I thought the excellent work you did with Anthony and Joe was going to have huge impact, but as far as I could tell, the MSM and the establishment ignored it, much to my frustration, hence my question above.

    Well, I think it had more impact than you might have noticed ;-)

    Yeah, not on the cover of Newsweek, but… There was a fair amount of ‘buzz’ in places that mattered…

    While the Climategate CRU crew were center stage, their attempts to toss a lifeline to NCDC / NASA and the attempt to play Three Card Monte with each one saying “We’re just like them” pointing to their left… suddenly fell apart. So CRU were left adrift as they could not say NCDC was pristine and “We’re just like them!”… And when CRU went down, and NOAA / NCDC were on defense and GIStemp was being looked upon as just a bit wacky (and I must say, Hansen helped there with his public performances painting him as a radical wacko); well, that’s when the tide turned.

    Yeah, it’s a ‘negative space’ thing. What you OUGHT to have seen but did not… But that’s OK with me. We would have expected to see the three in a circle each supporting the other. What we got was CRU sinking, NOAA / NCDC being defensive, and NASA / GISS looking embarrassed about things… At least in part because they were unable to show that their data were sound. And at least part of THAT was from the work a lot of us did in digging at it…

    Then when Obama put “Global Warming” in his SOTU speech and got chuckled at, well, that was just golden ;-)

    @WillR: I’ll take a look, thanks!

  19. DR says:

    Chiefio,
    There is a post at WUWT concerning Australian temperature data. When countries send their data to GHCN, is the data “raw” or as in this case “high quality”?

    How can one find out?

  20. E.M.Smith says:

    Well, while the warmers like to dismiss it as non-relevant, the GHCN simply can NOT be raw. It is a constructed item. The daily data are averaged to make a monthly mean.

    Since there are several ways you could do this, each BOM may be giving you a different method. The comeback on that is that the difference between “Peak Min vs Peak Max” and 24 samples vs 2 samples and… is not material. But the point is it is still Not Raw. So don’t call it that (and they don’t, they use “unadjusted”).
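
    To make the “several ways” concrete, here is a small Python sketch of two of them: the midpoint of the daily min and max versus the mean of 24 hourly readings. The hourly temperatures are invented and deliberately lopsided (a long cool night, a short warm afternoon), so it shows only that the two methods are different calculations; it says nothing about how big the difference is in real station data.

```python
def mean_of_extremes(hourly):
    """Daily mean as the midpoint of the day's min and max (two samples)."""
    return (min(hourly) + max(hourly)) / 2


def mean_of_samples(hourly):
    """Daily mean as the plain average of every reading (24 samples here)."""
    return sum(hourly) / len(hourly)


def monthly_mean(days, daily_method):
    """Monthly mean as the average of the daily means from the chosen method."""
    daily = [daily_method(day) for day in days]
    return sum(daily) / len(daily)


# Invented day, degrees C: 18 hours at 10, 6 hours at 20. Not real data.
HOURLY = [10.0] * 18 + [20.0] * 6
MONTH = [HOURLY] * 30  # thirty identical made-up days

print(monthly_mean(MONTH, mean_of_extremes))  # 15.0
print(monthly_mean(MONTH, mean_of_samples))   # 12.5
```

    Same thermometer, same readings, two different “monthly means”. Either way it is a computed product, not the raw readings.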

    Beyond that, the GHCN data are “QA filtered”. I’ve not yet found exactly what is done, but if it is at all like what is done to the USHCN, that is a very long long way from ‘raw’. (There is a posting on it, something like “QA or tossing data, you decide” under the AGW/ Gistemp or maybe the GHCN category on the right hand side list).

    So my take on it is that since it is clearly wrong to call it ‘raw’, it must be called “Quality Improved”… (I can’t bring myself to call it “High Quality”, even in jest…)

    You find out by endlessly digging through the papers and references at the NOAA/ NCDC site and the NASA / GISS site and puzzling out what it all means, then finding the misdirections and figuring them out. Like the fact that some station data is “estimated” (there are a couple of flags for it in the metadata). If it’s estimated, how can it be raw?

  21. Ken McMurtrie says:

    E.M., you continue to provide us with a wealth of information and logical assessment. Once again, thanks.
    Without wanting to distract you from your current themes – do you have anything to say about the atmospheric aspects of global temperature variations? E.g., what part does CO2 really play in influencing/modifying the GAT, whatever the GAT may actually be? It seems reasonably beyond doubt that GHGs in general create a greenhouse scenario, quite important to the surface temperatures we experience, by trapping and re-reflecting energy emitted/reflected by our land, ice and water surfaces.
    Yet the proportion of this energy due to the atmospheric content of CO2 (0.038% in the lower atmosphere, and a ?minor GHG) is being strongly debated in the internet world. I am prepared to tackle this by researching web information but thought you might have some worthwhile insights in the area. You have shown very convincingly, that seemingly accurate assessments of global temperatures, (yours) certainly have little or questionable statistical relationship with CO2 levels, but I am wondering if there is some scientific basis for “them” picking on CO2 or is it just a convenient substance selected for its suitability for government and big money control purposes? I am not going to be influenced by anything in the IPCC reports and need other scientific sources.
    Regards, Ken.

  22. E.M.Smith says:

    @Ken McMurtrie

    My understanding of it is that CO2 gets the blame as a residual. Several groups of folks, in order, do things to the data, then at the end of the day they find a warming trend.

    Looking around, nothing explains it to them, so it must be CO2, since there is this very old theory about CO2 trapping heat. The theory has never been proven… but it must be responsible because nothing else has been found.

    It was the weakness and fundamentally flawed logic of that reasoning that gave me one of my earliest “You’ve got to be kidding, they think THAT?” moments with respect to AGW.

    IMHO, one of the weakest points is the poor handling of vertical air flow. We have very poor data on it (some microburst detectors at airports) and it is largely ignored. Further, during this solar downturn, it has been discovered that the atmospheric thickness became much less than expected. (NASA posting)

    So we have this much thinner air blanket (more compressed) between us and space. And I’ve noticed (and commented many times) on how the air is now more “blustery”, like it was sometimes when I was a kid, but it seems to be that way far more often. My wind chimes are chiming more than ever before, even when the nominal wind speeds are not out of line. It’s the blustery gusty nature of it these days that is different.

    Now my synthesis of this is that we have much more vertical mixing going on, even in ‘clear sky’ than we had prior. When I was taking ground school in the ’70s there was a lot of discussion of Clear Air Turbulence and of micro-bursts along with news reports of planes crashing on takeoff or landing at airports in such micro-bursts (up to 2000 feet per minute of vertical velocity! Normal climb rate is 500 fpm or so, with 2000 being very high performance fighters or stunt planes. You just can’t out climb a micro-burst in commercial aircraft…)

    Then it got quiet. In the ’80s, and even more so in the ’90s, we had ever decreasing air fatalities. No more pictures of major crashes on takeoff and landing due to turbulence. Lots of fairly smooth flights. (Partly due to the added microburst detectors and better understanding of when not to fly). We even had a few years there with NO commercial airplane fatalities.

    Then the sun went quiet.

    In one year we had 3 planes go down for what may have been weather related issues. (An Airbus near an African island was one, IIRC. Mauritius? Something like that. There were 2 Airbus and 1 something else crashed in a single year. Caused me to wonder if the Airbus was less able to deal with turbulence.) There also have been more news reports of passenger injuries due to in flight turbulence. Basically, it’s a lot more bumpy up there.

    So my synthesis of that is simple: In times of high solar activity, we get more total solar input, but also much more UV (the UV drops more than the visible in a major minimum event). The added UV makes more ozone, and closes the part of the IR window not well covered by water and CO2. The air warms and swells (documented) and we get less vertical mixing (speculative). Then when the sun goes quiet, the UV drops so the ozone does too. That 10 micron or so (IIRC) IR window opens, and heat leaves, largely from the poles where there is very little UV, especially in winter…

    As the heat leaves, the air contracts and the atmosphere becomes thinner (documented). This cold dense air heads toward the equator (documented) and the residual heat in the oceans warms the air that heads to the poles (documented) so we get this Lava Lamp World with hot blobs running up the east coast and cold blobs running down the west coast (documented) along with the “loopy jet stream” more like it was in the ’70s. (It had become more ‘flat’ and straighter, more parallel to the latitude lines, during the major warming phase). The thinner more vertically mixed air can dump more heat to space from the upper altitudes than thicker non-mixing air. The CO2 ‘blanket’ doesn’t matter when you have physical transport vertically.

    All of this enhanced air flow (from the greater heat differential between equatorial ocean and poles, due to the change of solar output) causes much more turbulence and more vertical mixing. As the cold blob heads south, it’s going over hotter land (after a couple of decades of heating) and ocean so gets more variation between cold on top and hot on the bottom, thus more turbulence. Also more shear action as it passes the warm blobs headed pole-ward. (speculative) All in a thinner air mass, so there is less distance to diffuse that vertical movement. Also, the now colder cold blobs will hold less water (be dryer) at the poles, so even less IR retention.

    This ought to continue until the accumulated heat in the oceans re-balances with the now colder poles or until the sun wakes up again, which ever comes first. And it takes many years to re-balance…

    So that’s my model of what’s going on.

    The “other guys” basically model constant atmospheric thickness and don’t seem to allow much (or anything) for vertical mixing variations and they pay no attention to the ozone variation. I do think ozone is much underrated as an IR blocker. If you look at the charts of IR transmission, most of the gasses have a lot of overlap everywhere. Except this one window that is covered only by ozone… So you can raise or lower CO2, and it’s already swamped by water and ozone just about everywhere, but for ozone, nothing else is covering that one slot. And that is ignored.

    Then they sit back, look at the model results, and say “We’ve accounted for everything important, it must be CO2.” And IMHO they have very much NOT accounted for everything else. (Clouds, in particular, are very poorly handled. The increase in cloud cover under the cold blobs on the west coast has been significant…)

    So you can find a lot of literature on CO2, mostly praising the magical power it has to modulate IR even in the face of water vapor et al. and coming up with fanciful feedback mechanisms. But you find very few actual measured processes. Lots of models, little reality. All hat and no cattle.

    And it all SOUNDS good. It’s just that when you dig into it “there’s no there, there”. Like my speculative scenario above, it explains everything, but there is no measurement and no recorded events to prove it. (There is a record of CO2 levels. And there is a record of temperatures. And they do have a coincidental partial correlation recently, but are way out of whack in geologic time scales… and there is nothing showing a causality.)

    Thus, at the end of the day, anyone can make up a good sounding story and be on as sound a footing as the CO2 thesis.

    Personally, I like my Lava Lamp World and Ozone, with a bit of Svensmark cosmic ray driven clouds tossed in, far far better. It requires nothing other than what we know happens. Then again, nobody gets a new Ph.D. for saying “It’s all fine and just doing what it’s always done”. So I doubt there will be much revolutionary zeal put behind it. (I do wonder, sometimes, how many degrees are being minted on the back of the ‘new science’ of CO2 and AGW… It can be hard to come up with a ‘new contribution’ in an established field. If you can just computer model your way to a new thesis, hey, you get the brass paper ring at the end…)

    FWIW, my simple thought experiment about CO2 is this:

    In the desert, you can make ice by exposing a well insulated hole in the ground to the night sky. The IR radiates away and eventually water freezes. It can get darned cold at night in the desert… And CO2 is powerless to prevent that.

    So if CO2 is keeping all the heat in, how come the heat all leaves?

    Yet under a cloudy sky, it doesn’t work. And in high humidity it doesn’t work.

    So I think there is a simple existence proof that clouds and water matter much more than CO2, to the point where CO2 can be ignored.

    Here is a way to try it yourself:

    > A slashdot contributor named Adam (the original contributor ?)
    > sez:
    > “In September 1999, we placed two funnels out in the evening, with
    > double-bagged jars inside. One jar was on a block of wood and the
    > other was suspended in the funnel using fishing line. The
    > temperature that evening (in Provo, Utah) was 78 F. Using a Radio
    > Shack indoor/outdoor thermometer, a BYU student (Colter Paulson)
    > measured the temperature inside the funnel and outside in the open
    > air. He found that the temperature of the air inside the funnel
    > dropped quickly by about 15 degrees, as its heat was radiated
    > upwards in the clear sky. That night, the minimum outdoor air
    > temperature measured was 47.5 degrees – but the water in both jars
    > had ICE. I invite others to try this, and please let me know if
    > you get ice at 55 or even 60 degrees outside air temperature
    > (minimum at night). A black PVC container may work even better
    > than a black-painted jar, since PVC is a good infrared radiator –
    > these matters are still being studied.”

    That’s my take on it anyway…

  23. Ken McMurtrie says:

    Many thanks for your thoughts on this. I think they help me keep my feet on the ground and my brain open to logical input and discerning about brainwashing.
    Good wishes!

  24. E.M.Smith says:

    You are most welcome.

    It really is just a matter of how my brain works. It won’t let me put new things in if they are not harmonized with the old things. (Well, I CAN put them in, but all sorts of flags go up saying “this doesn’t quite fit here” and I have to go figure out what I’m missing to polish the edges…)

    So I’ll know about the desert ice thing, and then someone says CO2 keeps the heat in… And a flag goes up saying “If it holds the heat in, how come the heat can leave in a matter of hours and freeze water to ice?” And why does it work under a clear sky but not under clouds? Doesn’t that make the clouds more effective?

    Then I go looking for IR opacity maps of the sky and find that water and CO2 overlap all over the place, and there is much more water than CO2 in the air. Then notice that little part only covered by Ozone. Then… And so it goes.
