A Beautiful Tomato with Flowers - Smarter than GIStemp

GIStemp- Dumber than a Tomato!


I’m adopting this “tag line” about tomatoes due to the simple fact that my tomato garden is a more accurate reporter of the temperature than is GIStemp. Normal tomatoes will not set fruit below 50F at night. Cold varieties, like Siberia, can set fruit down to 40F (and some claim lower to 35F). My tomato plants reliably report the temperature. GIStemp, not so much… GIStemp was reporting a 115 year record heat in California (because NCDC in GHCN moved all the thermometers to Southern California near the beach, except one at the S.F. Airport…) while my tomatoes were accurately reporting “too cold to set fruit”…

(NCDC is the National Climatic Data Center. GHCN is the Global Historical Climate Network – the thermometer data. GIStemp is the NASA product that turns the GHCN history into that global ‘anomaly map’ and claims to show where it’s hot and how hot we are.)

GIStemp, A “Start Here” page

UPDATE 6: As of November, 2009:

I’ve added a GIStemp high level overview for regular folks (i.e. you don’t need to be a computer geek or weather guy to ‘get it’):

This is a nice high level view, but has links down into all the detail and tech talk that support it, if desired. Most folks ought to read it first.

NOAA / NCDC have Fudged and Corrupted the Input Data Series

A sidebar on data corruption from thermometer deletions:

The GHCN input data to GIStemp “has issues” (they -NOAA/NCDC- deleted 90% or so of the thermometers between about 1990 and 2009…) with those deletions focused on cold places. This is the second set of reports most folks ought to read. We explore that here:

And find what is most likely the key coordinating factor behind the ‘agreement’ between HadCRUT (UEA / CRU i.e. the “Climategate” folks), NCDC (the GHCN adjusted series and GHCN data fabricators), and GIStemp. They all use GHCN and the GHCN set has been “buggered” with the deletion of cold thermometers from the recent data (but they are left in the baseline periods). Even the Japanese temperature series depends on GHCN. This, IMHO, is the biggest problem and is the most important ‘issue’ in the apparent fraud of AGW. When you are ‘cooking the books’, literally, I have trouble finding any more polite word than fraud…

Some Prior Updates

UPDATE 5: As of September, 2009:

I ran into this excellent page:

A very “information dense” but “must read” page.

UPDATE 4: As of August 16, 2009:

I’ve put up a simple intro to the “issues” with the AGW thesis:

A bit dated (it needs to be updated with Climategate, the UEA subornation of the peer review process, and the NOAA / NCDC GHCN data set molestation, for example) but still a good starter list.

Not exactly GIStemp, but in “characterizing the data” for the GHCN input, I’ve found that all the “warming signal” is carried in the winter months. The summer months do not warm. That can not be caused by CO2. How can you have a ‘runaway greenhouse effect’ with a ‘tipping point at a higher temperature’ when it reacts LESS with higher temperatures with an apparent dampening to a very nice maximum?

Further, the “warming signal” arrives coincident with the arrival of large numbers of new thermometers. When you look at the longest lived cohort, those over about 100 years lifetime, there is no warming signal present in the data to speak of. When you look at the much shorter lived cohorts, you find a very strong warming signal, especially in the winter months. On further inspection of the data it looks like a lot of thermometers “arrived” at places with low latitudes AND at airports (newly built as the “jet age” arrived).
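The cohort split described above can be sketched in a few lines of Python. This is only an illustration, not the tool used for the reports: the `(station_id, year)` record shape, the function names, and the toy station IDs are all assumptions, and the 100 year threshold comes from the paragraph above.

```python
def station_lifetimes(records):
    """Return {station_id: number of years spanned} from (station_id, year) pairs."""
    first, last = {}, {}
    for station_id, year in records:
        first[station_id] = min(year, first.get(station_id, year))
        last[station_id] = max(year, last.get(station_id, year))
    return {sid: last[sid] - first[sid] + 1 for sid in first}


def split_cohorts(lifetimes, threshold=100):
    """Split stations into (long_lived, short_lived) sets by lifetime in years."""
    long_lived = {sid for sid, years in lifetimes.items() if years > threshold}
    short_lived = set(lifetimes) - long_lived
    return long_lived, short_lived


# Toy data (hypothetical stations, not real GHCN records).
toy_records = [("A", 1880), ("A", 2008), ("B", 1960), ("B", 1990)]
lifetimes = station_lifetimes(toy_records)
long_lived, short_lived = split_cohorts(lifetimes)
print(lifetimes, long_lived, short_lived)
```

With the stations split this way, the same trend calculation can be run on each cohort separately and the results compared.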

Those above three links are still a nice read, but they were early in my discovery process. The end result was that “GHCN Global Analysis”. Still, they are worth reading both for what they say and as an interesting insight into how a discovery process proceeds from a broad ‘something is not quite right here’ discovery and down into the exact focused detail (who, what, how, when – specifically).

My “working thesis” at this point is that GIStemp is a “filter” that tries to remove the impact of this bolus of thermometers arriving in a spike of time and space (using zones, boxes, grids, et al.), but is just not adequate to the task. GIStemp is just not a perfect filter. Looking at the impact of the temperature steps (up to zones), they act as a mild amplifier, so some of the “filter” effect in the later steps will only be removing what was added in the early steps.

But only a little impact comes from STEP0 (not surprising, since it is adding in the Antarctic data and some fiddly bits).

Though the way they do the F to C conversion is sloppy, not very efficient, and has sensitivity to exactly which compiler you use:

UPDATE3: As of July 27, 2009:

[ NOTE: The “UPDATE” here is a bit dated at this point. I’ve now worked out in some detail the thermometer deletions that were only hinted at in this stage (and are now nicely documented in the reports above). It has all been duplicated by independent parties, and I’ve caught up on sleep. I’m mostly leaving this part up as a reminder of what it was like back then. A bit of nostalgia, of a sort. Further down a “Geek Corner” continues with computer code and how to download and install GIStemp.]

I’ve identified the “issue” with STEP4_5 codes that will not run without errors. They were produced on a “Bigendian” box like a Sun, and I’m on a “Littleendian” box like an x86 PC. The g95 docs say they support the “convert={swap|bigendian|littleendian} ” flag to the file “open” statement. I ought to be able to add that directive to the “open” file foo statements and be done. But the compiler barfs on it. I’m left to suspect that the g95 support for convert=swap is not actually working in the release I have running.

So the bottom line is that the code I have probably works, but it can’t read a couple of the downloaded files (like SBBX.HadR2 ) due to the bytes being in the wrong order for this hardware. A relatively obscure problem usually avoided by not using byte order sensitive data structures for data interchange…

OK, so I’ll either port the code to a Macintosh (which has a bigendian processor), dig my ancient Sparc 2 out of the garage, or make a purpose built byte swapping file conversion utility. (Since I downloaded the latest production release of the g95 compiler, it’s not likely to be the “fix” to go fishing for a version with a working convert=swap flag; but I do need to verify this assumption…). I’m on stable.91 and it looks like 92 has the working convert flag.
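For the third option, a byte swapping utility might look roughly like the Python sketch below. It assumes the file is nothing but a flat run of 4-byte words (record markers, integers, and single precision reals); the actual layout of SBBX.HadR2 has not been checked here, and any character strings or 8-byte values in it would need different handling. The names `swap_words` and `convert_file` are made up for the example.

```python
import struct


def swap_words(data: bytes, word_size: int = 4) -> bytes:
    """Reverse the byte order of every fixed-size word in data."""
    if len(data) % word_size != 0:
        raise ValueError("data length is not a multiple of the word size")
    out = bytearray(len(data))
    for i in range(0, len(data), word_size):
        out[i:i + word_size] = data[i:i + word_size][::-1]
    return bytes(out)


def convert_file(src_path, dst_path, word_size=4):
    """Write a byte-swapped copy of src_path to dst_path (whole file in memory)."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        dst.write(swap_words(src.read(), word_size))


# Round trip check: pack as bigendian, swap, read back as littleendian.
big = struct.pack(">if", 1880, 14.5)
little = swap_words(big)
assert struct.unpack("<if", little) == (1880, 14.5)
```

Swapping word by word is the crudest approach that could work; a more careful tool would walk the record structure and swap each field according to its type.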

I’ve also built a couple of tools that do some crude data analysis on the temperature data as it moves from step to step. The most interesting bit so far is that the “warming” all happens in the winter.

Clever stuff, this CO2. It can cause selective warming in winter, with nearly no warming in summer…

I’m going to paste in here a comment I made over on Watts Up With That in a thread about GIStemp STEP1 about the first rough characterization of what GIStemp does to the temperature data:

Well, at long last I have a contribution based on the work porting GIStemp. I can now run it up to the “add sea surface anomaly maps” stage, and this means I can inspect the intermediate data for interesting trends. (The STEP4_5 part will take a bit longer. I’ve figured out that SBBX.HadR2 is in “bigendian” format and PCs are littleendian, so I have a data conversion to work out…).

Ok, what have I found in steps 0, 1, 2, …? Plenty. First off, though, I needed a “benchmark” to measure against. I decided to just use the canonical GHCN data set. This is what all the other bits get glued onto, so I wondered, what happens, step by step, as bits get blended into the sausage? I also wondered about the odd “seasonal” anomaly design, and wanted a simple year by year measure. [and also month by month -ems]

So my benchmark is just the GHCN monthly averages, summed for each month of the year, cross footed to an annual “Global Average Temperature”, and then a final GAT for ALL TIME is calculated by averaging those yearly GATs.
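One reading of that benchmark, sketched in Python: average each calendar month across all station records for a year, average those monthly figures into a yearly GAT, then average the yearly GATs. This is a guess at the arithmetic from the description above, not the actual tool; the `(year, [12 monthly values])` record shape, the use of `None` for a missing month, and the function names are assumptions.

```python
from collections import defaultdict
from statistics import mean


def yearly_gat(records):
    """records: iterable of (year, list of 12 monthly temps in C, None if missing).

    Returns {year: (monthly_means, annual_gat)}; years with no data are skipped.
    """
    by_year = defaultdict(lambda: [[] for _ in range(12)])
    for year, months in records:
        for m, value in enumerate(months):
            if value is not None:
                by_year[year][m].append(value)
    result = {}
    for year, buckets in sorted(by_year.items()):
        monthly = [mean(b) if b else None for b in buckets]
        present = [v for v in monthly if v is not None]
        if present:
            result[year] = (monthly, mean(present))
    return result


def all_time_gat(per_year):
    """Average the yearly GATs into a single all-time figure."""
    return mean(gat for _, gat in per_year.values())


# Toy data: two station records in 1900, one in 1901 (made-up values).
toy = [(1900, [10.0] * 12), (1900, [20.0] * 12), (1901, [16.0] * 12)]
per_year = yearly_gat(toy)
print(per_year[1900][1], per_year[1901][1], all_time_gat(per_year))
```

Note that this simple average gives every record equal weight regardless of where the thermometer sits, which is the point of using it as a raw benchmark rather than as a claim about the real global temperature.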

Now, there are a couple of caveats, not the least of which is that this is Beta code. I’ve cobbled together these tools on 5 hours sleep a night for the last few days (It’s called a “coding frenzy” in the biz… programmers know what I’m talking about… you don’t dare stop till it’s done…) So I’ve done nearly NO Quality Control and have not had a Code Review yet (though I’ve lined up a friend with 30+ years of high end work, currently doing robotics, to review my stuff. He started tonight.) I’m fairly certain that some of these numbers will change a bit as I find little edge cases where some record was left out of the addition…

Second is that I don’t try to answer the question “Is this change to the data valid?” I’m just asking “What is the degree of change?” These may be valid changes.

And third, I have not fully vetted the input data sets. Some of them came with the source code, some from the GIS web site, etc. There is a small possibility that I might not have the newest or best input data. I think this is valid data, but final results may be a smidgeon different if a newer data set shows up.

Ok enough tush cover: What did I find already?!

First up, the “GLOBAL” temperature shows a pronounced seasonal trend. This is a record from after STEP1, just before the zonalizing:

GAT in year : 1971 3.60 6.20 8.20 12.90 16.50 19.30 20.90 20.70 17.90 13.90 9.50 5.60 14.10

The first number is the year, then 12 monthly averages, then the final number is the global average. The fact that the 100ths place is always a 0 is a direct result of their using C in tenths at this stage. It is “False Precision” in my print format.

It seems a bit “odd” to me that the “Globe” would be 17C colder in January than it is in July. Does it not have hemispheres that balance each other out? In fairness, the sea temps are added in in STEP4_5 and the SH is mostly sea. But it’s pretty clear that the “Global” record is not very global at the half way point in GIStemp.

Next is from GHCN, to GHCN with added (Antarctic, Hohenp…., etc.) and with the pre 1880’s tossed out and the first round of the Reference Station Method. The third record is as the data leaves STEP1 with its magic sauce. These are the total of all years in the data set. (The individual year trends are still being analyzed – i.e. I need to get some sleep ;-)

2.6 3.9 7.3 11.8 15.8 18.9 20.7 20.3 17.4 13.1 7.9 3.9 11.97
2.6 3.8 7.3 11.7 15.6 18.7 20.5 20.0 17.2 13.0 7.9 3.9 11.85
3.2 4.5 7.9 12.1 15.9 19.0 20.9 20.5 17.7 13.5 8.5 4.5 12.35

It is pretty clear from inspection of these three that the temperature is raised by GIStemp. It’s also pretty clear that STEP0 does not do much of it (in fact, some data points go down – Adding the Antarctic can do that!). The “cooking” only really starts with STEP1.

The big surprise for me was not the 0.38 C rise in the Total GAT (far right) but the way that winters get warmed up! July and August hardly change (0.2 each) yet January has a full 0.6 C rise as do November, December, February, and March.
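The month by month change can be checked mechanically from the three rows above. A small Python sketch, using exactly the numbers printed there; the row labels (`ghcn`, `step0`, `step1`) are my reading of the description, and `deltas` is just a helper name for the example.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "GAT"]

# The three all-years rows as printed above (12 months, then the total GAT).
ghcn = [2.6, 3.9, 7.3, 11.8, 15.8, 18.9, 20.7, 20.3, 17.4, 13.1, 7.9, 3.9, 11.97]
step0 = [2.6, 3.8, 7.3, 11.7, 15.6, 18.7, 20.5, 20.0, 17.2, 13.0, 7.9, 3.9, 11.85]
step1 = [3.2, 4.5, 7.9, 12.1, 15.9, 19.0, 20.9, 20.5, 17.7, 13.5, 8.5, 4.5, 12.35]


def deltas(before, after):
    """Column by column change, rounded to hundredths to hide float noise."""
    return [round(a - b, 2) for b, a in zip(before, after)]


for label, change in zip(MONTHS, deltas(ghcn, step1)):
    print(f"{label}: {change:+.2f}")
```

Running `deltas(ghcn, step0)` the same way shows the small downward shifts from STEP0 alone.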

So GIStemp thinks it’s getting warmer, but only in the winter! I can live with that! At this point I think it’s mostly in the data, but further dredging around is needed to confirm that. The code as written seems to have a small bias spread over all months, at least as I read it, so I’m at a loss for the asymmetry of winter. Perhaps it’s buried in the Python of Step1 that I’m still learning to read…

Finally, a brief word on trends over the years. The GIStemp numbers are, er, odd. I have to do more work on them, but there are some trends that I just do not find credible. For example, the 1776 record (that is very representative of that block of time) in GHCN is:

GAT/year: 1776 -1.40 2.30 4.20 7.20 12.10 18.20 19.70 19.30 15.60 9.50 3.00 -0.40 9.89

The 2008 record is:

GAT/year: 2008 8.30 8.30 11.10 14.60 17.60 19.90 20.90 20.90 18.80 15.50 11.00 8.80 15.90

Notice that last, whole year global number? We’re already 6 C warmer!

Now look at the post step1 record for 1881 compared to 1971:

GAT in year : 1881 3.50 4.10 6.40 10.90 15.30 18.20 20.20 19.80 17.20 11.80 6.40 3.40 11.43

GAT in year : 1971 3.60 6.20 8.20 12.90 16.50 19.30 20.90 20.70 17.90 13.90 9.50 5.60 14.10

According to this, we’ve warmed up 4.5C since 1881 and the 1971 record above was a full 2.7C warmer than 1881. But I thought we were freezing in 1971 and a new ice age was forecast?!

Now take a look at January. No change from 1881 to 1971 (well, 0.1c) but February was up 2.1C, March 1.8C, December 2.2C. And the delta to 2008 is a whopping 4.8C in January and 5.4C in December, but July is almost identical. By definition, picking one year to compare to another is a bit of a cherry pick, even though these were modestly randomly picked. (There are “better” and “worse”: 1894 was MINUS 2.4c in January). But even with that, the “globe” seems to have gotten much much warmer during the Northern Hemisphere winters. Yet not in summer.

Somehow I suspect we’re seeing a mix of: Exit from LIA in the record that is mostly focused on N. America and Europe; any AGW being substantially in winter in the N.H. and not really doing much for summer heat (if anything), and potentially some kind of bias in the code or temperature recording system that has been warming winter thermometers (heated buildings nearby, car exhausts, huge UHI from massive winter fuel today vs a few wood fires 100+ years ago). (update: see the cold winter location thermometer deletions under the “GHCN global analysis” link at the top.)

I’ve seen nothing in the AGW thesis that would explain these patterns in the data. Certainly not any “runaway greenhouse” effect. The summers are just fine…

So I’m going to dredge through the buckets of “stuff” my new toy is spitting out, and spend a while thinking about what would be a good article to make from this… and do a bit of a code review to make sure I’ve got it right. In the mean time, enjoy your balmy winters ;-)

(And if Anthony would like a copy of the ported GIStemp to play with, well, “Will Program for Beer!” ;-)

E.M.Smith (03:59:22) :
Hmmm…. A bit further pondering….
Does anyone have a graph of S.H. thermometer growth over time? It would be a bit of a “hoot” if the “Global Warming” all came down to more thermometers being put in The Empire in Africa, Australia, et al., then to the Soviet Union dropping Siberia out in large part…
Could GW all just be where in the world is Carmen Sandiego’s Thermometer?


(I’ve put an update or two in the quote, so it’s not exactly a quote anymore, but 99+% the same)

Geek Corner starts here. Technical stuff and programmer topics

So at this point I have a working GIStemp port, a clear idea how to get it to “play well with littleendian hardware” and some early clues as to where the warming is coming from (and it isn’t CO2, given the summers are not getting hotter…)

I’ll probably take a break from GIStemp for the next week (or at least ramp down the effort a little, I could use some more sleep 8-) and get back to making some weekly stock market postings. Time to make some more money (my kids need tuition this month ;-)

UPDATE2: As of July 22, 2009:

I’ve gotten all the code up to STEP4_5 to run to completion. The input data files for STEP4 do not have a location called out in what passes for documentation, so I’m on a bit of a treasure hunt to get them. I’ve found SBBX.HadR2, but not the monthly update files yet. So I still can’t run that step to see what happens.

A note on hardware:

I decided to haul an old box out of the garage for this project. I didn’t care how long it took to run. Mostly I just wanted a dedicated LINUX box that I could “blow away” at any time if I needed a different release, port, or whatever. (I have several LINUX releases on my bookshelf along with a couple of BSD copies, Wasabi, and more…). My goal was mostly just to get through the compile step. Given the small size of the code, I knew that would be “doable” on any size box. I can now also say that the code runs just fine on my box as well. What box?

It started life as a “white box” 486 machine some decades ago. About a decade (plus?) ago it got a motherboard upgrade. As of now, it runs an AMD 400 MHz processor. The memory on that board is a mix. It was from a transition period when both SDRAM and old SIMM memory sticks were supported on one board. Right now I have some of each: 64 MB of 100 MHz SDRAM and 48 MB of slow SIMMS. The Operating System is RedHat 7.2 Linux (though most any Linux or Unix ought to be fine.) This was just what was on it when pulled from the garage… The disk is a 10 GB IDE disk, of which GIStemp sucks up about 1 GB.

Most of that is for data. GIStemp makes a copy or two of the data at each step, with minor modifications, and leaves some of them laying about. The code itself is only about 7000 lines after allowing for duplicate copies and “hand tools” that are not part of the normal flow of execution.

A fast disk is much more important than a fast processor. Much of the time the machine is I/O limited. Especially in the Python steps. The only time I saw a significant “hardware limitation” was during the make of the “gcc” C compiler chain that I had to do to get the 4.x release level of libraries needed for the g95 FORTRAN compiler port. (If I were running a newer Linux, they would already be there… I just figured a gcc build was less work than a new Linux install.) Watching “top” and its report of swap file usage showed that 256 MB of memory would be the biggest performance improver. The actual execution of GIStemp runs just fine on the box “as is” taking a few minutes per step, mostly shoveling the 100 MB or so of data into Yet Another Copy…

So if you decide to run this puppy, any old PC will do. A new one with fast SATA disk drives and at least 256 MB of memory matters more than the CPU speed.

I’ll be putting together a posting on what was done for the port, so anyone who would like to know exactly what was done to make it go can either wait a bit, or ask questions below. Also, if anyone wants a copy of the code, I’m happy to make it available. It will get a bit “cleaner” over the next week or two, but if you want it Right Now, just tell me where to leave a “tarball”… ;-)

UPDATE: As of July 21, 2009 I have managed to get GIStemp to compile on a Linux box! I’ve added “Makefiles” for all the steps and moved much of the code into a src/bin structure (it had been compiling code in-line in scripts and then running it, then deleting the executables).

I’ve also started to map the “data flow” as it moves from file to file to file to file… A silly way to do a “data structure”, rather reminiscent of how it was done in the 1960’s with tape drives holding batch data. At any rate, the first section is mapped (with more to follow) here:

The Data Go Round

The bottom line is that the “data structure” is still primitive, but the program and data file location and overall operational flow is now a much cleaner structure with source code (things like sorts.f ) in separate directories from where it writes its scratch files / work files; and with a clean way to “make” the executables (things like sorts.exe) one time and not have to do it again every time you run the system. Also, where there was the potential for ambiguity between a script name and an executable from a FORTRAN program, I’ve changed it so that the script ends in .sh as a qualifier. For example, zonav.f is the FORTRAN source that produces zonav.exe, which has a “wrapper script” that runs it, named with the .sh suffix. The *.f files live in a directory named “src” while the *.exe files live in a directory named “bin” and the *.sh files come in two flavors: the “driver scripts” for a single “STEP” that are left in the STEPx directory, and shared scripts that I’ve put in the “src” directory.

You can now look at the code and have a decent clue what’s going on! For now, each FORTRAN and python program is basically exactly as written by GISS. I’ll be making some minor changes as I get steps working and test them (things like having the “work” files created in the work_files directory, rather than ‘wherever’ and then moved to the work_files directory when done). I will NOT be making any changes to speak of to the logic or processing of the code. Just cleaning up where it puts things, not changing what it does to them.

Once it’s running, I intend to follow a USHCN site as its data flows through the code and see what happens to it, Step by Step…

This all sounds more complicated than it is. For one thing, moving to a source directory means that the duplicate copies of programs shared by STEP3, STEP4 and STEP5 now only have a single copy to deal with. It also means that all the “compile foo.f, run foo.exe, delete foo.exe” text that was scattered around all the scripts is replaced with a couple of “Makefile”s. One for the initial step of STEP0 and one for STEP2, STEP3, STEP4_5 combined. (Sometime after I’ve gotten STEP0 tested, I’ll combine the two Makefiles…). The STEP1 python step has a complicated make system built into it already. I may do something with it, but it seems to work OK for now.

The only real “bugs” I’ve run into so far are that to get things to compile I had to remove a “feature” they were using in about 1/2 of them. They would declare an array and assign data values to it in one step. That is a non-standard extension on some FORTRAN compilers, but not g95. So I had to replace those lines with ones a bit more like:

INTEGER FOO(2)
DATA FOO /1, 2/

(Where they had: INTEGER FOO(2)/1,2/ )

There were also a couple of programs that complained about data type mismatches in passed parameters. That can be a “bug” or it can be a “neat trick” depending on what the programmer intended (though it is always poor style…) As I run those programs, I’ll evaluate if those were bugs or features ;-) IIRC, there were three of them. Two were involving print steps, so I’m not too worried. One was a REAL passed to an INTEGER data type in a subroutine. That’s more worrisome and could be a significant bug. We’ll see.
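Why the REAL passed to an INTEGER is the worrisome one: with old style FORTRAN calls the subroutine typically gets the address of the argument and no conversion is done, so the callee reads the raw bits of the REAL as if they were an INTEGER. The Python sketch below imitates that, assuming 4-byte single precision REAL and 4-byte INTEGER (the usual defaults, but an assumption here); `real_bits_as_integer` is a name invented for the example, and nothing in it is GIStemp code.

```python
import struct


def real_bits_as_integer(value: float) -> int:
    """Reinterpret the 4 bytes of a single precision REAL as a 4-byte INTEGER."""
    return struct.unpack("<i", struct.pack("<f", value))[0]


print(real_bits_as_integer(1.0))  # 1065353216, i.e. the bit pattern 0x3F800000
print(real_bits_as_integer(0.0))  # 0
```

So a REAL of 1.0 would show up in the subroutine as an integer a bit over a billion, not as 1. If the value is only ever zero, or is never actually used, the mismatch can go unnoticed for years.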

Oh, and in most cases the scripts had the ksh directive at the top removed (most folks don’t have ksh). A couple of scripts don’t run on bash or sh, so I’m going to be fixing them as I do my “debug and test”.

So, at any rate, it’s time to “pop a cork”! Since I now have a significantly cleaned up AND COMPILED version of GIStemp ready to run and test!

As parts are shown to work correctly, I’ll just use them as is. If a part is broken (due to the compiler differences, or ksh) I’ll re-write it as needed, perhaps even into C. Just about everything on the planet has a C compiler on it… (It was modestly annoying to get and build all the things needed to put a FORTRAN 90 or newer compiler up on Linux…)

At any rate, if you are thinking of trying to make GIStemp go, holler at me first. I’m happy to give folks what I’ve got to help them along. When I have something that works more or less end to end and can be shown to be essentially what GISS intended, I’ll publish it somewhere (need to find a place to put a tarball…). Until then, it will have to be “upon request”.

So now you know where I’ve been the last couple of weeks ;-)


Every journey begins with but a single step. This page is it for GIStemp deconstruction.

At present this is a work in progress. As I get through a chunk, I will update this persistent top level page with the appropriate links. It is likely to take a few months.

GIStemp has 6 formal steps (named 0 to 5) each run by a top level script. In the sections on each STEP, I go through that top level script in an overview page. That overview page will have links to the source code and commentary for each program inside that step. As these overviews get done, and as the code pages fill out, there will be links added to fill in this structure.

Right now the biggest hole is STEP2, but I hope to have something there soon. STEP1 source code is up, but analysis is lacking for now.

At present, I’m nearly done with STEP0, and I’ve started on STEP1 (which sounds much more impressive when you realize that step 4 and step 5 are in one directory, STEP4_5, and they (it?) are really just a couple of small new programs and some runs of the same code as STEP3, so there’s roughly STEP2 as a “big chunk” to go). With Minus One, Zero, and One done that makes it about half way already…

STEP3 and STEP4_5 have had the scripts documented and the source code is up, but I have not yet done the FORTRAN deconstruction.

So please expect that this shell will fill in over time. There is only one of me doing it part time for free, so the speed will not be what you might like.

You are welcome to help make it go faster by contributing time, coding skill, commentary, or even just beer. After all, my present motto is “Will Program for Beer!” (Hey, everybody needs a motivator…good management just chooses what they know will work; and since I’m management as well as grunt on this project…)

General Overview Steps

So how big is this puppy and where can I download a copy? What is the general impression of it?

For a bit more detail on the source download and a peek at the “README” file that comes with it, gistemp.txt, we have a starter peek.

And if you would like, you can look in a bit more depth at the ghcn data formats.

General Issues

A list of things I’ve found that make me wonder.

First up, the issue of GIStemp cutting off data at 1880 and using an odd “baseline” period of 1950 to 1980. Is this a Cherry Pick? Were these dates picked deliberately to make warming trends look bigger than they really were (or to fabricate the trend entirely?)

Then there is the issue of false precision. How can you calculate 1/10th of a degree C from data in whole degrees F? Mr. McGuire would NOT approve!

(Please note: To all the folks who wish to run off to the central limit theorem and how an average of a large number of values can have a greater precision: All such discussion belongs under the “Mr. McGuire” thread. It is there already. Further, GIStemp does not do that. NCDC in GHCN averages exactly 2 things: daily MIN and MAX. Then it takes those (at most 31) for the month and averages them together. The average of up to 31 daily averages of the daily MIN and MAX is not the monthly mean (though everyone treats it as such). The “problem” is in assigning 1/100 F precision to that as the monthly mean when it: A. Isn’t the monthly mean and we have no error estimates. and B. Is calculated in 2 steps with no more than 31 values at any one point. No law of large numbers need apply here. In GIStemp STEP0 this is converted to C one value at a time, so again, no ‘law of large numbers’, just a “monthly mean temperature” measured in 1/10 C that isn’t an accurate monthly mean. This is carried through all the other homogenizing, splicing, UHI adjusting etc. steps. The only time a large number theorem approach might apply is the final pre-anomaly step, but even there it looks like each box is calculated, then the global mean is calculated from them. So please, leave the central limit theorem stuff out of GIStemp… look at how the data are actually processed, not at theoreticals.)
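The arithmetic being described can be walked through in Python: average the daily MIN and MAX, average those daily figures for the month, then convert the single result to C in tenths. This is a sketch of the sequence as described above, not the NCDC or GIStemp code. The function names, the toy readings, and the round-half-up rule are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def monthly_mean_f(daily_min_max):
    """Average of the daily (MIN + MAX) / 2 values, in degrees F."""
    daily = [(lo + hi) / 2 for lo, hi in daily_min_max]
    return sum(daily) / len(daily)


def f_to_tenths_c(temp_f):
    """Convert degrees F to degrees C rounded to tenths (half rounds up, an assumption)."""
    c = (Decimal(str(temp_f)) - 32) * 5 / 9
    return float(c.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


# Whole degree F inputs land on C values 0.5 or 0.6 apart.
for f in (50, 51, 52, 53):
    print(f, "F ->", f_to_tenths_c(f), "C")

# Three made-up days of whole degree MIN/MAX readings.
toy_month = [(40, 61), (41, 60), (39, 62)]
print(f_to_tenths_c(monthly_mean_f(toy_month)), "C")
```

Each conversion is done on one value at a time, so the tenths digit in C is produced by the rounding rule, not by averaging a large number of independent readings.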

And then there is the issue of trying to dig a global trend out of a data series that only goes back a few years for most of the planet. Maybe you can get a 50 year trend for America, but not for most of the world. We just don’t have the data for the time period needed.

The Code – Step by Step, Inch by Inch, Slowly He Turns…

Step Minus One

OK, you need to get the data and there is a pre-process to sort the Antarctic data if you get a fresh copy.

Step Zero

So how about those input files scattered about?

Once you have all the data and files, STEP0 processing does what again?

Step One

What does STEP1 look like in an overview sense?

This step uses Python, but the Python programs call a library of C functions. These are in two pieces. Monthlydata and Stationstrings.

Step Two

What does STEP2 look like in an overview sense?

This step has several subscripts and FORTRAN programs

Step Three

What does STEP3 look like? An overview, mostly just source code listings right now.

Step Four_Five

Some folks have asserted that since GIStemp uses an anomaly map for sea surface temperatures to adjust its internal anomaly map already computed, this means that GIStemp “uses satellite data”. After chasing this down for a while, I came to the conclusion that this stretches the truth quite a bit. Yes, it uses a partially satellite derived SST anomaly map as input to STEP4, but that isn’t quite the same as using direct satellite data.

What does STEP4 look like? An overview, mostly just source code listings right now.

What does STEP5 look like? An overview, mostly just source code listings right now.

Related Websites

To Be Done: Add entries.



129 Responses to GIStemp

  1. pyromancer76 says:

    Thanks again for your willingness to make your economic/financial wisdom public. It takes quite an effort to create these posts. I keep mentally digesting them at the end of each week. I used to keep up on some aspects of markets; perhaps one of these days I will have enough under my belt to engage in a discussion.

    [Relocated here to be more on topic- emsmith]
    I have been reading Frank Lansner’s post “Making Holocene Spaghetti Sauce by Proxy” on WUWT. I believe two articles should be published together, or, one in one month’s issue of a journal and the second in the following month. The first, should be Lansner’s (with clean-up for English and greater readability). At the beginning of it, “Part Two” should be prominently mentioned.

    What is the title of the second article and who is the author? It is titled something like “The Outlier’s Lies: GISStemp and the Attempt to Destroy Our Minds?” (Oh, well, I guess that title is a bit melodramatic.) It is written by E.M. Smith — in a detective style answering the Qs: How does GISStemp become the outlier over and over and over again? Second, what are “their” most dastardly deeds in relationship to a programmer’s commitment to truth in working with data?

    I would love to see this article. You posted something like I am imagining in a lengthy comment on WUWT.

    Believe it or not, I have further advice! In addition to the article you “should” write, I think you should continue with these GISStruth moments on Anthony’s blog. Make them into a detective serial. I believe abbreviated versions of your work belong on WUWT for a large international audience. ClimateAudit is absolutely essential, but it is for the experts.

    In any event, I am grateful for your efforts and your communications.

  2. E.M.Smith says:

    I’m doing GIStemp in two threads. One is source code with comments in a technical style (and I’m hoping to get some other folks to join in on ‘deconstructing GIStemp’). The stuff under “The Code – Step by Step, Inch by Inch, Slowly He Turns…”

    The other is a series of more pointed articles each detailing some flaw or factor in a form suited to a broader audience (as things are discovered and as they are written up). These I link to from WUWT. (Illudium, Mr. McGuire Would Not Approve, etc. The stuff under ‘general issues’)

    Eventually, there will be a ‘top level critique’ clustering those articles into a single whole; but that’s months away. I could probably put it up as a skeleton and add the parts over time.

    And yes, I’ll keep putting up summary replies with pointers to here on:

    In fact, that was why I started this operation. So I could stop retyping the same comments on WUWT and just post links to here! So my intent is that as I build up a library of these critiques, they can be linked to by anyone from anywhere including WUWT. (But I try to moderate my linking from WUWT since otherwise it gets a bit ‘stale’ to the regulars, so a couple of times a month or as the topic comes up, I link to one of the threads here.)

  3. fred says:

    BTW, GISTEMP and RSS (satellite record) show about the same amount of warming over the past 30 years. It’s not the case that the satellites show far less warming as is sometimes, I guess, assumed.

  4. E.M.Smith says:


    Two problems. The first one is that NCDC Adjusted data and GIStemp both “rewrite the past” so there is a hinge point prior to the satellite data. That the satellite and GIStemp agree recently does NOT validate the whole slope. And the satellite data is from too short a period of time to demonstrate climate, only short term weather trends.

    The second problem is in the “about the same”. Since we are supposed to get all excited about a change measured in tenths of a degree, and since the tenths of a degree position is entirely fictional in GIStemp, that some other data set is within tenths of a degree of it is not very comforting and just continues to say the same thing:

    ALL of the AGW nonsense is just dancing in the error bands of the calculations.

    See “Mr. McGuire Would Not Approve” under general issues above for clarification and amplification.

  5. fred says:

    [ Comment Moved to:

    I’ve gotten tired of the endless nattering about precision (a highly technical point) on this “general start here” page. So I’m moving all comments related to it over to the proper thread, the Mr. McGuire thread. (I’d asked folks to take it there… now it’s moving there.)

    If you want to dissect that nit any more, do it there, not here. -E.M.Smith ]

  6. We did a lot of work on GISTEMP in 2007 and 2008; see the threads at Climate Audit.

    I’ve transliterated most of the first 2.5 steps into understandable scripts in R.

  7. E.M.Smith says:

    @Stephen McIntyre

    I’ll take a look when I get a chance. Right now I’m up to my eyeballs in a bull market run and I’m making money… need to do that every so often to feed the hobbies 8-)

    Frankly, the traffic on the technical parts of this GIStemp thread has not been very high. In a way not surprising… But it does leave me wondering if it’s the best place to put my efforts.

    So for now, my focus (at least the GIStemp part of it) is moving back to the “explain it to Joe and Jane Sixpack” and less to the “provide an easy place for programmers to look and comment”…

  8. Fluffy Clouds (Tim L) says:
  9. steven mosher says:

    see this error in step 0

    ob001909: GISTEMP STEP0 discards the final digit of every USHCN datum

    The GISTEMP STEP0 code which reads the USHCN temperature records and
    converts them to the GISTEMP v2.mean format fails to read the full width
    of each datum. Each datum occupies 6 columns in the data file (-99.99
    to 999.99), but only the first five columns are read and converted to a
    float. This was detected by the Clear Climate Code project in
    diagnosing a difference between the output of the FORTRAN STEP0 and

    from the guys at
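
    A minimal sketch of the kind of truncation the bug report describes: reading only the first five columns of a six-column datum drops the hundredths digit. The record fragment and the helper name `parse_field` are made up for illustration; this is not the GISTEMP or Clear Climate Code source.

```python
def parse_field(record: str, start: int, width: int) -> float:
    """Parse a fixed-width numeric field; `start` is a 0-based column."""
    return float(record[start:start + width])

# Hypothetical record fragment: a single six-column datum (degrees F).
record = " 45.67"

full = parse_field(record, 0, 6)    # all six columns are read
short = parse_field(record, 0, 5)   # only the first five columns are read

print("six columns :", full)
print("five columns:", short)
```

    With the full width the value keeps its hundredths digit; with five columns the same datum comes back a digit short.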

  10. E.M.Smith says:

    @Steven Mosher

    I’ve looked in USHCN2V2.f and the section in question looks like this one to me:

    do m=1,12
      indd=15+(m-1)*10 ! start of data
      indf=indd+9 ! position of fillin flag
      if(mfil==0.and.line(indf:indf)=='M') cycle
      read(line(indd:indd+5),'(f6.2)') temp
      if( itemp(m)=nint( 50.*(temp-32.)/9 ) ! F->.1C
    end do
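
    For readers who do not speak FORTRAN, here is a rough Python rendering of what that loop appears to do: step through twelve 10-column month slots, skip values flagged 'M' when filled-in data are not wanted, and convert Fahrenheit to integer tenths of a degree Celsius. The function names, the `use_filled` parameter (standing in for `mfil`), and the sample record are assumptions for illustration, and the condition on the garbled `if(` line is not reproduced.

```python
import math

def nint(x: float) -> int:
    """Round half away from zero, like Fortran's NINT."""
    return int(math.floor(x + 0.5)) if x >= 0 else -int(math.floor(-x + 0.5))

def f_to_tenths_c(temp_f: float) -> int:
    """Fahrenheit to integer tenths of a degree Celsius: nint(50*(F-32)/9)."""
    return nint(50.0 * (temp_f - 32.0) / 9.0)

def read_months(line: str, use_filled: bool = False) -> dict:
    """Return {month: tenths of deg C} for one record laid out as in the snippet."""
    out = {}
    for m in range(1, 13):
        indd = 15 + (m - 1) * 10   # 1-based start column of the datum
        indf = indd + 9            # 1-based column of the fill-in flag
        if not use_filled and line[indf - 1] == "M":
            continue               # skip values flagged 'M'
        temp_f = float(line[indd - 1:indd + 5])   # six columns, as with f6.2
        out[m] = f_to_tenths_c(temp_f)
    return out

# Hypothetical record: 14 header columns, then twelve 10-column month slots
# (6 columns of value, 3 blanks, 1 flag column). March is flagged 'M'.
values = [32.0, 50.0, 68.0, 212.0, -40.0, 33.0, 59.9, 60.0, 70.7, 80.1, 90.5, 41.0]
flags = [" "] * 12
flags[2] = "M"
line = " " * 14 + "".join(f"{v:6.2f}   {fl}" for v, fl in zip(values, flags))

result = read_months(line)
print(result)
```

    Note that the slice covers `indd` through `indd+5`, which is six columns, matching the width in the `f6.2` format.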

    I don’t see any particular bug here (but then again, things are bugs because they are hard to spot…) The GIStemp site says they fixed the bug, so maybe it’s not in this code release.

    At any rate, it’s late (after midnight) and I need some sleep. I’ll look up the bug more tomorrow and see if it’s still in the code.


  11. steven mosher says:

    see here.

    read(line(indd:indd+5),'(f6.2)') temp

    is the line in question

  12. E.M.Smith says:

    Thanks, Steven, I’ll take a look at it.

    The line of text matches the one in the “snippet” I quoted, so it looks like it is the same bit of code AND like the bug still exists…

  13. E.M.Smith says:

    [ Comment Duplicated in: but left here for now. There are some non-precision discussions in this comment that ought to stay here and I’ll sort out how to do that later. Just don’t pick up the precision issues here, OK? Do them on “Mr. McGuire”. -E.M.Smith ]

    I disagree.

    As well you ought. There is a small “edge case” that I’ve avoided discussing because it sucks people down that road long before they “get it” that the main point is valid, dominates, and that the “short road” they are on is a dead end of little value. (Some, but very little).

    That “edge case” is what you are exploring.

    It does not apply to world average temperature precisely for the reasons you cover with your premises of your thought experiment.

    1) We don’t have enough real thermometers for it to work.

    2) We don’t measure them only once in time, then make the average for the globe. GIStemp STARTS with monthly averages of min-max. MIN and MAX come at different times for every site. We’ve already averaged 2 data points per day, then averaged those for each month. And each of those temperature readings was taken at a disjoint time.

    3) We violate Nyquist requirements in both time and space to such an extent that even the whole digits are suspect. Given that; a Theoretical Land improvement in at most one decimal point of precision is a pointless exercise in distraction. An exercise that causes most folks’ eyes to glaze. And does nothing for the non-Nyquist history of temperatures that we have to work with.
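
    The "average of averages" described in point 2 can be sketched in a few lines. This is only an illustration of the order of operations as described above (daily MIN and MAX averaged, then those daily values averaged for the month); the readings and the function name are invented, and it is not the GHCN or GIStemp code.

```python
from statistics import mean

def monthly_mean_from_min_max(daily_min_max):
    """Average each day's (min, max) pair, then average those daily values."""
    daily_means = [(lo + hi) / 2 for lo, hi in daily_min_max]
    return mean(daily_means)

# Hypothetical whole-degree F readings for a five-day "month".
readings = [(40, 60), (42, 61), (39, 58), (45, 66), (41, 63)]

monthly = monthly_mean_from_min_max(readings)
print(monthly)
```

    Ten whole-degree readings taken at ten different times go in; one number with a fractional part comes out.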

    OK, but you wish “to go there”.

    Where you will end up is that whole degree F raw data (IFF you had access to it) can support about 1/2 to one decimal point of added precision. Maybe. Sometimes. If you did everything perfectly. Which we don’t. (And IFF we had a lot more thermometers in the past, which we don’t and IFF the collection criteria were more stringent, which they were not). Due to not meeting those criteria each reading in our actual history is a disjoint data point for ONE place at ONE time in a non-Nyquist set and can not be used for a statistical approach via over sampling methods. Oh, and the resultant number does not mean much.

    But then we throw that theoretical possibility away by immediately making daily MIN / MAX averages for each site that are then turned into monthly averages (further diluting the potential for that theoretical to surface) and THEN we apply a bunch of “corrections” that even further pollute the precision. Only then does GIStemp get a shot at it…

    So yes, you have a “theoretical” that is an interesting mathematical game to play; but no, it is not of use in the world today. The data are not suited to your theoretical from the very moment they are collected (fails Nyquist).

    OK, the next “issue” is the question of “WHAT average” to use? See:

    For a reasonable introduction.

    Are we using the arithmetic mean? Median value? Harmonic Mean? Geometric Mean? Mode? Geometric Median? Winsorized mean? Truncated Mean? Weighted Mean?

    Which choice is right?
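
    To make the question concrete, here is a small sketch that runs several of those averages over one made-up list of Fahrenheit readings using Python's standard `statistics` module. The data are invented, and nothing here is a claim about which average any temperature product uses.

```python
from statistics import mean, median, mode, harmonic_mean, geometric_mean

# Hypothetical readings in degrees F (all positive, so every average is defined).
temps_f = [50, 60, 70, 70, 100]

averages = {
    "arithmetic mean": mean(temps_f),
    "median": median(temps_f),
    "mode": mode(temps_f),
    "harmonic mean": harmonic_mean(temps_f),
    "geometric mean": geometric_mean(temps_f),
}

for name, value in averages.items():
    print(f"{name:16s} {value:.2f}")
```

    Same five numbers, several different "averages". The harmonic and geometric means also change relative to the others if the same readings are expressed in another scale, such as Kelvin.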

    And what choice was it after GIStemp has added some data points, interpolated some, fabricated a few, and deleted some others? What does that do to the precise requirements for a statistical approach to adding a partial decimal point of precision? At that point it fits neatly into neither the truncated mean nor the non-truncated category. It is a new beast of its own construction, undefined in standard statistics and with unknown properties. And unknown, but limited, precision.

    Now, there is still one other major issue before getting to your theoretical: It is very important to keep it clear in your mind that temperatures are an intensive variable. If you glaze at that in the smallest degree and skip over it without an in depth grasp of it, you will continue to waste time and space on a pointless pursuit of the impossible. Most folks, it seems, do exactly that (given how much bandwidth is wasted on the issue to no avail…)

    These folks have a nice short description a couple of paragraphs down:

    What that means, “intensive”, is that one instance of the property for one entity means nothing to another instance of another entity. It is not dependent on the mass of the object.

    The taste of MY meal means nothing to the taste of YOUR meal, and averaging them together can be done BUT MEANS NOTHING. Were you averaging in one drop of Tabasco sauce from my meal, or one ounce? It matters to the average, but is not known…

    Another example might be taking two different pots of water and averaging their temperature. You get two numbers, but know nothing about the THERMAL ENERGY in the two pots. The temperatures become an average, but the average means nothing. It certainly is not representative of the average thermal energy.

    Take the two pots of water and mix them, the resultant temperature is NOT the same as the average of the two temperatures. You must know the mass of water in each pot to get that result. And we did not measure the mass.
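
    The two-pots point can be put in numbers. A minimal sketch, assuming both pots hold the same liquid (same specific heat), no heat is lost, and no phase change occurs; the masses and temperatures are made up.

```python
def mixed_temperature(mass_a, temp_a, mass_b, temp_b):
    """Final temperature when two masses of the same liquid are mixed, no losses."""
    return (mass_a * temp_a + mass_b * temp_b) / (mass_a + mass_b)

temp_a, temp_b = 20.0, 80.0                  # degrees C in pot A and pot B

simple_average = (temp_a + temp_b) / 2       # ignores how much water is in each pot
equal_pots = mixed_temperature(1.0, temp_a, 1.0, temp_b)     # 1 kg and 1 kg
unequal_pots = mixed_temperature(9.0, temp_a, 1.0, temp_b)   # 9 kg and 1 kg

print(simple_average, equal_pots, unequal_pots)
```

    The simple average only matches the mixed result when the masses happen to be equal. Without the masses, the average of the two thermometer readings does not tell you what the mixture will do.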

    The same thing happens on a colossal scale globally. We measure the temperature over a snow field, and ignore the massive heat needed to melt the snow with no change of temperature and ignore the mass of the snow. We measure the temperature of the surface of the ocean and ignore the shallow and great depths. We measure the surface temperature of a forest, and ignore the TONS of water per acre being evaporated by transpiration.

    Then we average those temperature readings together and expect them to tell us something about the heat balance of the planet.

    That is lunacy in terms of physics and mathematics.

    So take all the above, and firmly fix in your mind the truth:

    An average of a bunch of thermometer readings MEANS NOTHING.

    Got it?

    OK, IFF you can get that preamble into your mind and hold onto it, then I’ll “go there” into your theoretical…

    (ANY attempt to use the result of the answer to the “theoretical” to revisit GIStemp, AGW, or GHCN issues will be referred back to this preamble…)

    I am instead claiming that when we average lots of different temperature readings to produce an average which I call the “global temperature”, the error in the global temperature is lower than +-0.50.

    You can get somewhat more precision if you have ever more stringent limitations on how you sample. (NOTE: This is NOT what is done in the real world and has nothing to do with global temperatures as actually measured).

    I am not measuring the same thing repeatedly. I am taking temperature readings at different points on the globe ONCE and taking their average.

    That is too bad. Repeated sampling is one of the ways to get greater precision. Take 10 thermometers at one place and time, now you have a 10 x larger sample. You CAN average those readings to extend your accuracy and precision (this is done with “oversampling” D to A converters in music applications). You can get to about X.y from thermometers that read in X whole units IFF you oversample “enough” (where enough is more than 10…)


    My earlier response was presuming you were trying to tie this to the world temperature record (and GIStemp starts with a set of monthly averages of averages of MIN/Max; so your theoretical diverges dramatically from what is done “in the real world”.)

    Now realize, that even if you measure ONLY ONCE, you still have the Nyquist space problem (enough thermometers evenly enough distributed in space) AND you still have the “intensive property” problem. You can say nothing about heat (yet that is what folks inevitably try to do).

    My contention is that the accuracy of the global average is much higher than the accuracy of any one measurement as the number of data points in the average becomes large.

    The key point here is “becomes large”. Large must be very very large, and we are not even near the Nyquist limit in the real world. You need more than Nyquist to start getting more precision (and remember that the precision tells you nothing of meaning.) So in Theoretical Land, with some millions of thermometers, you can get to X.y from a measured X. (If all are taken at the exact same instant in the time domain.)

    I don’t see where this point is germane to what is done with AGW arguments, GIStemp, GHCN, etc. The “theoretical” is so far from reality it is like arguing about angels fitting on pin heads. An interesting mathematical game, but not of use. We don’t even get near Nyquist in either time or space. We measure random places (especially changing over time). We measure them at near random and disjoint times. We then use a randomly selected averaging technique to average some of them together and THEN we start to use GIStemp to fudge the data even more.

    The difference between Theoretical Land and Reality makes believing in Angels the easier choice 8-)

    Here is a thought-experiment which explains why the distributional properties of the error change.

    Suppose I have 1000 measurements of temperature from 1000 locations and in these locations I have super accurate thermometers which can give me the results to 10dp. I calculate the Exact Average – let’s call it EA. I then round each temperature to the nearest integer and calculate the Rounded Average, let’s call it RA.

    Yes, this will give you a very precise number. One Small Problem:

    You are working from the rounded data from a very precise starting data set. The real world data are not precise and have accuracy limitations in the instruments as well. The properties of your RA will be far different from the properties of the averages from the real data. Your data have a theoretical basis of high precision, the real data have a basis of low precision. The statistical distribution of their error bands will not be the same. This is the major flaw in your argument, IMHO.
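
    The thought experiment itself is easy to run, strictly on its own idealised premises: 1000 "true" temperatures known to 10 decimal places, each rounded to a whole number. The uniform spread of temperatures and the fixed seed are assumptions made here. The sketch shows only how EA and RA compare when the starting data are exact; it does not model instrument accuracy, siting, or sampling in time and space, which are the objections raised in this reply.

```python
import random
from statistics import mean

rng = random.Random(42)   # fixed seed so the run is repeatable

# Hypothetical "super accurate" readings, kept to 10 decimal places.
exact = [round(rng.uniform(-30.0, 40.0), 10) for _ in range(1000)]
rounded = [round(t) for t in exact]          # nearest whole degree

ea = mean(exact)      # Exact Average
ra = mean(rounded)    # Rounded Average

print(f"EA = {ea:.4f}  RA = {ra:.4f}  RA - EA = {ra - ea:+.4f}")
```

    In this idealised setup RA lands close to EA, which is the "very precise number" conceded above. Whether real whole-degree readings behave like rounded copies of exact values is the point in dispute.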

    Another Small Problem: 1000 thermometers will give you very precise readings that DO NOT represent the reality around them. You will still have ACCURACY issues. And your scale is still low by several thousands (or millions?) of thermometers…

    Take a meadow with a stream through it. Just around the corner in the shade, snow is melting into the stream. A dark rock on the edge of the stream in the sun is warmed by the sun. The snow is 32F, the creek is 33F, the stone is 115F on the surface (and 60F in the deep interior), the air over the meadow is 85F and the grass blades are about 75F (they are transpiring and evaporating water). Where do you place your thermometer to get an ACCURATE reading of “the temperature” of that place?

    The answer is that you can not.

    Each piece has its own intensive property of temperature. We hope that putting a Stevenson screen with a thermometer somewhere in the meadow at about 5 feet up will give us some kind of “accidental average”, via the air, of the various surfaces in the meadow. Good enough for us to know if we need to wear a coat (since we are in that body of air) but quite useless for saying anything much about heat balances. I learned this quite dramatically in that meadow when, sweating in the sun, I dove into the stream and shot right back out a nice light blue color!

    So yes, you can repeatedly average your average of averages of accidental averages and get ever greater precision, but no, you can’t get more accuracy, since at its core, the notion of an “accurate temperature” for any given size cell is fundamentally broken. You can only get a truly accurate temperature for a single surface of a single thing. It is an intensive property of that thing.

    Further, you can not take a temperature and use it to say anything about heat or energy balance (the important issues for “climate change”), though people try. Temperature without mass and specific heat is useless for heat questions.

    So as soon as you step away from that thing, you get to answer such questions as ‘how was the heat flowing?’, ‘what are the relative specific heats?’, ‘what phase changes happened’? And since we ignore those, we have no idea what “the temperature” means.

    I believe that you would argue that the error in the rounded average is +-0.50 so that the error is a uniformly distributed random number in the range -0.50 to 0.50, i.e. the error could be -0.10 or it could be 0.50, or it could be -0.50, or it could be 0.267272 all with the same probability.

    IF this is your view then I disagree. If you study the statistical properties of the error, you will find that the size of the error will actually be much smaller than that.

    For your theoretical case of a gigantic number of thermometers read at exactly the same time.

    Not for the real world. For the real world, we measure each high and each low ONCE at each place. That’s all we have.

    For the real world, there is no possibility of a Gaussian distribution of error, since we have only one temperature of one thing at one point in time. From that point on, we can make no statement about improved accuracy from improved precision other than to say that your accuracy is limited to whole degrees F so any precision beyond that is False Precision.

    The disjoint measurements in time, the sample size of one for any place, the non-Nyquist distribution of places, they all say: You can only truncate at whole degrees F any calculations you make.

    There was no “one thing” repeatedly sampled (in time or in space) to support an “oversampling” argument. And that is what your argument is. That an over sample can extend accuracy. It can, by a VERY limited amount. But the data we have do not conform to the requirements for a statistical over sample approach. So your theoretical is, and must remain, a theoretical that has no bearing on the real world and the issue of “Global Warming” (and how GIStemp works).

    Here is another way to look at it. The only way for the error to be equal to +0.4999.. is if EVERY temperature reading is in the form XX.4999. The only way for the error to be minus 0.4999 is if every temperature reading is XX.5001. What are the chances of that? Pretty much zero, especially as the number of readings becomes large.

    Except that in the real data, being that the error is not a statistical artifact of rounding from true and accurate data, there is no basis for any particular expectation about the nature of the data that do not exist. The fractional part that you are discussing does not exist.

    So “Pretty much zero”, is quite possible. And that is the whole point behind knowing where your accuracy ends. You DON’T KNOW what the “real” average is in that range. You get to GUESS based on probabilities IFF you have the basis for it.

    IFF you have a tightly constrained set of initial conditions, you can use a statistical approach to tease out a bit more “significance” via that probability analysis. If you do not have those tightly constrained initial conditions, you can not make such a statistical GUESS.

    We don’t have the initial conditions to support that approach in the thermometer record of the planet. Even your theoretical falls short (1000 is nowhere near enough).

    You would find that the probability distribution of the error is no longer uniform, but becomes Gaussian with a variance which becomes more tightly distributed around 0 as the number of measurements in the average becomes large.

    And again, we are back at the point that “large” must be very very large. My guess is on the order of sampling every square meter of surface (but no one knows for sure). The surface properties of the planet are rather fractal and that introduces some “issues”. Is a black beetle on a white marble paver in a green garden with brown dirt patches accurately being “averaged” in with a 1 M scale? No. Are the billions of beetles on the planet significant? At what point do you put a number on “large”?

    And just a reminder: This still ignores the problem of averaging a bunch of intensive property measurements being a meaningless result.

    For this reason, I claim that there is statistical “self-averaging” (the averaging over many numbers cancels out the rounding error to a large extent) that makes the level of accuracy in the result significant to maybe 1 or 2 dp.

    You can’t get to 2 with your theoretical (and it would be challenged to get to part of 1 dp. You need a lot more thermometers.) The math is beyond the scope of this article (maybe I need to add a specific article on this point. It seems to be a quicksand trap that everyone loves to go to for a picnic…) In the real world data, they are challenged to support whole degrees of F and the way they are handled prior to GIStemp makes the 1/10 F completely fantasy.

  14. H.R. says:

    E.M., you wrote, “We measure random places (especially changing over time).”

    I’m not convinced the places where temperatures are measured were selected at ‘random’, as the term is used for statistical purposes.

    The locations for placing thermometers are selected for at least one, and more often, a combination of the following factors (warning! not exhaustive, but a fun and true list):
    1. local interest in the location
    2. accessibility of the location (think South Pole vs South Philly, e.g.)
    3. willingness of someone to faithfully read and record temperatures at the location
    4. convenience of the location
    5. willingness of someone to allow a thermometer to be placed at a location
    6. easy first choice; the location seems as good as any other nearby place to locate a thermometer
    7. and???? (fun to think about, eh?)

    To be truly randomly selected, each possible temperature measuring location must have an equal probability of being selected. I was taught that I would be stepping on thin ice if I “assume a random sample” and did not take steps to assure that I was taking a truly random sample. (Professor Neuhardt would ding you a few points every time for that.)

    It seems there is a tendency to forget (self included) that we’ve been using whatever means we’ve had lying about to examine the possibility of global temperature change and scientists only relatively recently are starting to develop (gawd I’m gonna hate myself for using this word!) `~robust~ data gathering systems.

    Myself, I’d choose the terms ‘haphazard’ and ‘convenient’ before I’d use the term ‘random’ when referring to thermometer placement.

  15. E.M.Smith says:


    You are right. I was sloppy and used a “term of art” in a generic way. Mea culpa.

    I ought to have said “irregular” or “poorly selected”, though your “haphazard” has a charm to it ;-)

    You left out a Big One:

    8) At airports. They use thermometers to determine the Density Altitude to figure if an airplane can take off (or not). Surely it doesn’t matter that they want it located 4 feet over the tarmac where the airplane wing will be for optimal value for their use… Hey, they have a thermometer already, let’s use it!

    9) Military Bases. Lots of grunts to read the thing and they need the temperature as part of their weather system to know what their capabilities will be and plan. Hey, they have a thermometer already, let’s use it!

    Those two seem to account for the bulk of the thermometer record at this time… Never mind that both sites are highly biased toward warming.

  16. j ferguson says:

    If you do dig the Sparc 2 out of the garage and it works, let me know. I’ve got an IPX in storage, last used 6 years ago and I fear that the battery bearing chip with the hostid will have died.

    I have, somewhere, the things you need to do to get at the leads and add a battery to this chip and then recode the hostid, but I last saw them 6 years ago.

    maybe the code conversion is a better idea. besides, you’ll never have to do it again.

  17. E.M.Smith says:

    In theory, all I need to do is update my g95 compiler to the next newest release and use the ‘bigendian’ flag. I may dig the SPARC out just to either use it or pitch it…
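
    Why byte order matters here, in miniature: the same four bytes mean different integers depending on whether they are read big-endian (as on SPARC) or little-endian (as on Intel). A small sketch using Python's standard `struct` module; it illustrates the general issue only and says nothing about how GIStemp's binary files are actually laid out.

```python
import struct

value = 1

big = struct.pack(">i", value)       # 4-byte integer, big-endian byte order
little = struct.pack("<i", value)    # same integer, little-endian byte order

# Read the big-endian bytes as if they were little-endian.
misread = struct.unpack("<i", big)[0]

print(big.hex(), little.hex(), misread)
```

    A program that reads binary data with the wrong byte order does not crash; it just quietly gets wrong numbers, which is why a compiler option or a conversion step is needed when moving between the two kinds of machine.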

  18. Tim Clark says:

    Have you ever tried to post your findings on RC? They could sure use your help. ;~P

    [Response: I think we just did. But there is a bigger issue here. Take the GISTEMP product for instance. This takes public domain data provided by the Met Services, homogenises it and makes a correction for urban warming based on nearby rural stations. The method was amply described in a number of publications and lots of intermediate data was provided through the web interface. Good right? But the descriptions of the algorithms were not enough, and a number of people complained that the full code wasn’t available and how that meant GISTEMP was somehow hiding some secret manipulations. Now the code isn’t particularly pretty but it worked and so in response to that pressure, they put the whole thing online. Finally the secrets were going to be exposed! Except that….. people looked over it briefly, there was one formatting error found, there were some half-hearted attempts to look at it…. and nothing. McIntyre et al got bored and went off to find another windmill to tilt at. And people still complain that the data and the code aren’t available. This happens because people (in general) are much keener on the political point scoring than they are in doing anything with the data. The reality is not the point. Given that I share your desire for open science and transparency, why do these antics bother me? Because it sends completely the wrong signal. These politically driven demands for more code, more data, more residuals, more notes, more background are basically insatiable and when the people that provide the most, end up being those who are attacked most viciously, it doesn’t help the cause people claim to espouse. So when you hear this demands for more openness look at what those people have done with what is already there and judge for yourself whether it is genuine or merely grandstanding. – gavin]

  19. E.M.Smith says:

    @Tim Clark

    Well, from time to time I’ve tried to post things on RC and they all go to the “bit bucket”. I first started saving copies of postings there, so that I could repost them when they ‘mysteriously’ evaporated…

    I now have a wonderful log of everything I’ve posted since then, thanks entirely to them…

    The quote you posted is a marvelous example of folks “sucking their own exhaust”. EITHER they are incredibly inept at understanding that code examination takes time (I’m into GIStemp about 1 year now and I’m still not done.) or they just don’t ‘get it’ that crap takes a while to ferment ;-) or they have a political agenda. None of those is a good choice…

    In any case, I think I’ve shown that GIStemp has some very real issues that anyone can test, refute, or verify. So far no one single person has shown where any GIStemp bogosity was not correctly identified by me.

    Not One.


  20. Tim Clark says:

    So far no one single person has shown where any GIStemp bogosity was not correctly identified by me.

    Not One.

    I knew that E.M., and I figured you wouldn’t make it past the sniff test at R.C. I was just reiterating the level R.C. is at. Keep up the good work. Lots of us skeptics are behind you! As for the catfood, I immediately thought the problem may be mercury, the symptoms are similar:–new-tuna-warning.aspx

    Tuna is a common ingredient in pet food, especially in cat food; and you can bet that the tuna used is the cheapest grade available—which is potentially the highest in toxins. (Salmon, another popular ingredient, is nearly all factory-farmed, and therefore high in a variety of toxins.)

  21. E.M.Smith says:

    @Tim Clark.

    Yeah, you are very right about RC.

    Per the cat, that really belongs under the cat thread:


    It isn’t mercury. Her “sneezy drippy” modulates with the food within 24 hours. Heavy metal poisons don’t do that. They come in and stay for a chronic problem. You need chelation therapy to pull the metals out. It just can’t modulate that fast.

    It was partly that “near instant” response that got me headed down the allergy path, then it was that the same tuna and salmon did NOT cause a problem if “people food” that pointed at not an allergy but Urea Formaldehyde fish pellet coating in the ‘fish byproducts’ ….

  22. steven mosher says:

    I take issue with gavin’s comment

    “These politically driven demands for more code, more data, more residuals, more notes, more background are basically insatiable and when the people that provide the most, end up being those who are attacked most viciously, it doesn’t help the cause people claim to espouse. So when you hear this demands for more openness look at what those people have done with what is already there and judge for yourself whether it is genuine or merely grandstanding. – gavin]”

    My demands for code are not politically driven. gavin knows this. he knows that I demand this in my business as well.

  23. E.M.Smith says:

    @Steven: I agree with you completely. It is pretty clear that those with the most fear of opening the kimono have these tiny little things they wish to hide in embarrassment …

  24. Not Sure says:

    Just FYI modern (Intel) Macs are little-endian as well.

  25. Steven mosher says:

    crap.. my mac is new.

  26. Peter says:

    Thanks EM. My first and best chemistry teacher drilled significant digits into my brain to such a degree, I could not forget it if I wanted to. Global average temperature reported to two decimal places, or three for Hadley, for goodness sake, was what four years ago started me on this journey to skeptic land. This blog is a fascinating stop on the journey.

  27. E.M.Smith says:

    @Peter: You are most welcome. If you haven’t read it, read the “Mr. McGuire Would Not Approve” posting. It is all about that same “chem teacher drilling significance” experience…

    UPDATE: [ I’ve left this comment and the ones just before and after it here, since it is more about personal experience than the nuanced math of statistics and precision. Any discussion of precision mathematics goes under the “Mr. McGuire” thread, though. -E.M.Smith ]

  28. Peter says:


    An interesting post. I cannot tell you the number of times I have questioned significant digits in various climate papers only to receive the obligatory armwave about the law of large numbers. I have never been able to have it explained to me how new information can be created in this way, and the short answer which you provide above is it can’t. May I copy and paste relevant sections (with attribution) if needed?

    REPLY: “You have my permission to copy and use the material here in any attempt to disabuse folks of the ‘global warming’ notion and / or to correct their understanding of precision. -ems”

  29. BillM says:

    I think that the issue here is not one of stats but one of assumptions.
    You see the monthly average as an average of different things. Individual measurements, one for each day, of the different temperature for each day.
    The climate folks do not see it this way. Climate folks are not interested in either the day by day variation at a site or the site’s absolute temperature on each of those days. They assume that for one thermometer, all temperature variation in a month can be regarded as irrelevant weather noise.
    So they regard the 30 readings as 30 readings of the same thing. They see only one actual value for the month, not 30 different values. Hence, they claim 30 readings of the same thing, hence the law of large numbers in the average.
    And the climate folks only work in anomalies compared to a baseline. They assume that the variations in 30 years of base period are noise, irrelevant to any later trend signal that we seek to identify. Hence there is only one, 30-year anomaly baseline actual value for a month, which is claimed to benefit from 30 x 30 readings of any one thermometer. 900 readings to which the law of large numbers is now applied. So they say on realclimate.
    I am NOT defending this approach. I am simply suggesting that the climate folks would see the significant digit issue being of less relevance when viewed with these assumptions. (Don’t shoot the messenger, I’m with EMS, and you, on this one).

    REPLY: “I think you summed it up rather nicely. And that, at its core, is the nature of their precision problem. They forget that a month may have a significant trend, day over day, that averaging as a single thing ignores. They forget that a century may have a significant 60 year cyclicality in it and that a 30 year baseline is just hiding that ‘ripple’ which they later ‘discover’ as their anomaly. They create their own self confirming fantasies because they do not understand the tools they are using. Not computers. Not thermometers. And not even math. -ems ”

    [ Comment Duplicated in: so as to preserve continuity in that thread, yet it has much to say that is not “precision math” per se, so I’ve left the original here. -E.M.Smith ]

  30. E.M.Smith says:

    Take a look at the last few “by country” or “by continent” postings.

    Look at the bottom total averages of the temperature series. They are different from each other.

    I put them there deliberately to illustrate the degree of change of the “average” depending on the order chosen to do the average. (Average all daily data / divide by days; vs: Average daily data into monthly averages, then average the monthly averages). In some cases, like Antarctica, you can get 7 C of variation in the answer. (Note: there is no decimal point in that number, it is 7.0 whole degrees of C based on order of averaging…)

    That is part of the problem with “serial averaging”. Not only is your precision limited to your original data precision; but you can cause your results to wildly swing based on which of 2 reasonable choices you make …

    So just WHICH Global Mean Surface Temp does one choose…
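A minimal sketch of the order-of-averaging effect described above, using made-up daily values (not GHCN data) and hypothetical function names. It assumes the two orderings disagree mainly when months contribute unequal numbers of readings, which is only one way the difference can arise:

```python
from statistics import mean

# Synthetic daily temperatures (deg C), invented purely for illustration.
# One month is fully reported, the other has only four surviving readings.
daily_by_month = {
    "jan": [-30.0] * 31,
    "jul": [-5.0] * 4,
}

def mean_of_all_days(months):
    """Pool every daily reading, then divide by the number of days."""
    pooled = [t for days in months.values() for t in days]
    return mean(pooled)

def mean_of_monthly_means(months):
    """Average each month first, then average the monthly averages."""
    return mean(mean(days) for days in months.values())

pooled_avg = mean_of_all_days(daily_by_month)
serial_avg = mean_of_monthly_means(daily_by_month)

print(f"all days pooled:       {pooled_avg:.1f} C")
print(f"mean of monthly means: {serial_avg:.1f} C")
print(f"difference:            {serial_avg - pooled_avg:.1f} C")
```

Both answers are a defensible "average" of the same readings; the gap between them comes entirely from the choice of ordering, not from the thermometers.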

  31. Peter says:


    That Antarctic fact alone should put to rest the validity of any mean temperature digit to the right of the decimal. It just doesn’t mean anything.

  32. dougie says:

    Hi EM
    not sure if you have seen this, though it might interest you!!

    what do you think?
    and keep up the good work

  33. E.M.Smith says:


    You have brought tears of joy to my eyes. Literally.

    “See how they distance themselves, Kohai?” Rising Sun…

    That is the first sign. Those who surrounded and protected “one of their own” edge away. Rather like Hansen’s boss did a while ago:

    Then even the “peers” and eventually the underlings find somewhere else to be…

  34. dougie says:


    “See how they distance themselves, Kohai?” Rising Sun…

    sounds like a quote from the 7 samurai, if so apt (fall on sword/s), as is sun pun, i’m crap at this, but try (need steve/bender/mosher (CA) on this blog for the wit).

    maybe some people are finally looking at your stuff & are taking it from the data side/panic stations!!

    o/t & to show the tear is not misplaced, you have worked long & hard on this, on your own time for zero money.
    i would thank you & everyone helping.

    ps. i now understand the jack effect, but thats for the bunny thread.

    cheers dougie

    REPLY: “Thanks! The movie is ‘Rising Sun’ and the speaker is Sean Connery to Wesley Snipes (as his kohai, or understudy). Written by Michael Crichton. The line comes as the ‘bad guy’ is first being found out and as all the other men in the Japanese board room melt away from the one who is losing face (and the head man is starting to glower at him…)

    Per the rest: One can only hope…”

  35. Peter O'Neill says:

    A STEP6 has now appeared, with other updates to other steps, at

    These are dated November 14th, and seem to still be a work in progress – , linked from the Sources page, seems to have become a broken link for now, and while has been updated to indicate that USHCN version 2 will be used from November 13th, there is as yet no mention of STEP6 there, and it is not mentioned in the revised gistemp.txt file either.

    A quick look at the STEP6 files indicates that it produces “line plots” from ANNZON.Ts.GHCN.CL.PA.1200, ANNZON.Tsho2.GHCN.CL.PA.1200, ZON.Ts.GHCN.CL.PA.1200 and ZON.Tsho2.GHCN.CL.PA.1200.

  36. E.M.Smith says:

    Golly! Wonder if someone got needled into fixing an obviously broken bit of software ;-)

    Yes, I expect “GIStemp” to be a “work in progress” for quite a while…

  37. waymad says:

    Guess you are being Instalanched, but let me offer sincere congratulations for doing, as we say, the ‘hard yards’, wading through the code, and showing us the trends and biases inherent in the temperature record. Your efforts have borne fruit. Bit like that opening image, perhaps…Again, thanks.

    REPLY: [ Thanks! While traffic has picked up, it’s still a couple of orders of magnitude behind WUWT, so no risk of ego inflation ;-) Yes, a year ago it was a long hard bleak slog ahead. Now, at last, I’m getting some of the ‘fun stuff’… Pushing NASA into putting the US thermometers back was pleasant! -ems ]

  38. Pingback: Gavin Schmidt – Code and data for GIStemp is available « CO2 Realist

  39. twawki says:

    Wow, great post. Thank you. You’re starting to get linked at WUWT

  40. Michael Lenaghan says:

    This sounds interesting:

    “The problem with averaging all these stations is that any tendency in stations to change with latitude introduces bias. That is, if stations are introduced in warmer climates late in the century, the average will be biased to warmer temperatures.

    To get around this problem, I have simply normalized each of the station records (i.e. subtracted a station’s mean value) before averaging each year. This puts all of the stations on an even playing field, so to speak, no matter whether it is normally warm or cold.”

    Would it be worth trying that with GISS raw data vs. adjusted data?

    REPLY: [ Probably, though GIStemp tries something like this already. All of the “adjustment” methods come with the possibility of either failing to work (but leaving you to believe they did) or introduction of collateral biases. It is worth trying, but requires rather a lot of testing to validate. BTW, application of a mean like this will fail if there is a cycle longer than the mean… and we know that there are cycles longer than the entire length of the data series… -ems ]
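A minimal sketch of the normalization quoted in this comment, on invented data with hypothetical station names. It assumes each station's mean is taken over whatever years that station reported, which is exactly the assumption the reply above warns can fail when a cycle is longer than the record:

```python
from statistics import mean

# Synthetic annual means (deg C), invented for illustration only.
# Neither station warms; the warm station simply starts reporting later.
stations = {
    "mountain": {1990: 2.0, 1991: 2.0, 1992: 2.0, 1993: 2.0},
    "beach": {1992: 18.0, 1993: 18.0},
}
years = range(1990, 1994)

def raw_average(year):
    """Average whatever absolute temperatures were reported that year."""
    return mean(rec[year] for rec in stations.values() if year in rec)

def normalized_average(year):
    """Subtract each station's own mean first, then average the anomalies."""
    anomalies = [
        rec[year] - mean(rec.values())
        for rec in stations.values()
        if year in rec
    ]
    return mean(anomalies)

raw = [raw_average(y) for y in years]
norm = [normalized_average(y) for y in years]

for y, r, n in zip(years, raw, norm):
    print(y, f"raw={r:5.1f}", f"normalized={n:5.1f}")
```

In this toy case the raw average jumps when the beach station appears while the normalized series stays flat. Real records are far messier, so this shows the idea only and validates nothing.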

  41. Michael Lenaghan says:

    Thanks for your reply. I’m curious if we’re talking about the same thing:

    “BTW, application of a mean like this will fail if there is a cycle longer than the mean.”

    What cycle do you mean? It seems to me that the intent of the whole process is to generate per-station anomalies, and then to combine the anomalies to see whether the entire record is moving up or down–and by how much.

    I thought this might yield a quick take on your “moving from the mountains to the beaches” finding–assuming that the mountains and beaches are treated as separate stations!

  42. Michael Lenaghan says:

    Btw, he revised his procedure slightly:

    “Paz pointed out that the normalization used previously might not remove geographic biases introduced by fugitive weather stations, so here is another approach. I have differenced each of the records, averaged the differences and then cumulative summed the result.

    This way we are dealing only in annual increments, up until the final summation. The result differs from the previous, as the pop-up in 1914 is lowered, and the temperatures post 2000 are raised. However, it still seems to depart somewhat from the official version.”

    We’ll see if the approach continues to evolve, but as I said: it caught my attention because of the mountains-to-beaches issue.
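The revised procedure quoted above (difference each record, average the differences, then cumulative sum) might be sketched as follows. The data and the function name are invented for illustration, and treating a year with no overlapping stations as zero change is an assumption of this sketch, not something stated in the quote:

```python
from itertools import accumulate
from statistics import mean

# Synthetic annual means (deg C), invented for illustration only.
stations = {
    "mountain": {1990: 2.0, 1991: 2.5, 1992: 2.5, 1993: 3.0},
    "beach": {1992: 18.0, 1993: 18.5},
}

def first_difference_series(stations, years):
    """Average year-over-year differences across stations, then cumulative-sum."""
    avg_diffs = []
    for prev, cur in zip(years, years[1:]):
        # Only stations reporting in both years contribute a difference.
        diffs = [
            rec[cur] - rec[prev]
            for rec in stations.values()
            if prev in rec and cur in rec
        ]
        # Assumption: no overlapping stations means "no change" for that step.
        avg_diffs.append(mean(diffs) if diffs else 0.0)
    # The series is relative to the first year, which is set to 0.0.
    return [0.0] + list(accumulate(avg_diffs))

years = list(range(1990, 1994))
series = first_difference_series(stations, years)
print(dict(zip(years, series)))
```

Because a station only contributes in years where it has a prior value, the arrival of the much warmer beach station in 1992 adds no step to the result.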

  43. Having composited data to make a long term USA forecast, using a large raw data set of 22,000 stations, about 880 of which are “Primary NWS stations”, of varying length, elevation, seasonal coverage on some upper elevation fire stations, and changes in density of station coverage from state to state (increases in number for a while, then a reduction of the total, due to consolidation to conserve spending budgets), I am aware of some of the problems they at CRU allude to in the raw data sets they must have had to wade through.

    Whereas I am doing contour plotting of data that were recorded the same date for each map, times 3 cycles gridded separately, then combined, I am not too picky as to how many data points I use per grid resolution of 0.1 degrees. I chose to use all the valid ones: non-9999 values, and any that were less than 150 degrees F and more than -60 degrees F. I had to eliminate all negative precipitation values (there were a lot in the raw data), and chose not to use Trace amounts less than 0.1″ as agriculturally insignificant (so as not to have to deal with blending “T”s into the numerical amounts).
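The validity screen described in this paragraph could look roughly like the sketch below. The thresholds are the ones stated in the comment; the constant names, function names, the sample values, and the choice to represent trace amounts as the string "T" are assumptions made for illustration:

```python
MISSING = 9999                         # sentinel the comment treats as invalid
TEMP_MIN_F, TEMP_MAX_F = -60.0, 150.0  # temperature bounds stated in the comment
MIN_PRECIP_IN = 0.1                    # amounts below this are dropped

def valid_temperature(value):
    """Keep temperatures that are not the sentinel and fall inside the bounds."""
    return value != MISSING and TEMP_MIN_F < value < TEMP_MAX_F

def valid_precipitation(value):
    """Drop sentinels, trace markers, negative amounts and tiny amounts."""
    if value == "T" or value == MISSING:
        return False
    # Negative values fail this test too, since they are below the minimum.
    return value >= MIN_PRECIP_IN

# Invented sample readings, not real station data.
temps_f = [72.0, 9999, -75.0, 151.0, -10.0]
precip_in = [0.25, -0.3, "T", 0.05, 9999, 1.2]

kept_temps = [t for t in temps_f if valid_temperature(t)]
kept_precip = [p for p in precip_in if valid_precipitation(p)]
print(kept_temps, kept_precip)
```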

    The result of using the varying station density, and intermittent station data, whenever it was valid, did not cause me any problems. It only changed the resolution of the contours, better in more dense areas of record, and better in the summer when it was inhabited. Since I was not setting up a uniform grid coverage to form a reference average, it does not matter to my application, if the coverage is better or worse from area to area.
    results are viewable at;

    From reading the Harry_read_me file comments it seems they were having a problem selecting long term records of consistent length, and weeding out errors in the raw data set, while at the same time selecting the stations that fit their “EXPECTED” temp response and still keep an evenly spaced number of stations per grid. They talk of “INFILLING” where there are large distances between data stations, this they did at their “objective discretion”, where I just left the coverage slim as it was, by not infilling, and left the area be weak in its ability to forecast subtle changes from area to area, and day to day.

    CRU and GISS had to have some plan to regulate the density of coverage and infilling to represent a true average per unit of surface area; whether they used this as an opportunity to “adjust” the final result will not affect the original complete set of raw data. The problem we have is that the first set of adjusted data, trimmed of all “Problematic data and irregularly spaced stations, and infilled data points”, when converted to a CSV file or other format and saved, no longer has any station identification numbers attached to any of the data, just the Long and Lat and temperature values.

    I bypassed that programming problem by tabling all usable valid station data, one file for each date, for each parameter, and retaining a copy for use in forming the CSV files for each forecast date. So that it would be totally repeatable by anyone trying to do so, or in case I want to look at the individual cycles, or look for phase shifts between them as in the case of outer planet synod interferences, that shift the timing of the influx of moisture out of the tropics into the mid-latitudes.

    It is a “shame / crime”? that their interim data set adjusted from the raw data set was lost, so that others cannot repeat and confirm / void? their results. This is the cost, to the truth “real science” seeks with every breath, of the shoddy work they were doing, as no matter how the data is compiled by others it will not repeat exactly again. Can you say plausible deniability? They could have without the release of the leaked files.

    The derailing of the Peer review process shifts the balance from deniability to fraud of the worst kind. The application of the unlawfully “peer reviewed” screened data to the decision making process of the IPCC agenda is a crime against all mankind, and the participants should be tried as such.

    To Mike et al.:
    to Quote: “E.M. Smith does not actually analyse the GISS temp series. He analyses the raw data that is used to calculate the GISS temperature data, and when he does he uses different calculations to what GISS uses. As GISS temperature shows basically the same trend in all four seasons, it is obvious that the difference between Smith’s calculations and the actual GISS calculations is the cause of the supposed lack of warming in summer.”

    My comment starts here again: The difference E.M. Smith sees in the raw data from the GISS is probably due to using the whole data set affected by commercial, agricultural, and residential irrigation of crops and auto watering of lawns, usually done at night. Increased use of landscaping over the years, heating and air conditioner use close to some of the raw data stations. My local measurable UHI effects are more pronounced in winter, even here in a small community of 3,000 I see a rise of 3 to 5 degrees F (more when calm) on my auto’s digital thermometer as I go the 6 miles, into the town from my farm, and it drops again as I leave out the other side.

    As an aside, the most effective way to abate the rapid shift from day to night temps would be to add moisture to the top soil, increasing its specific heat capacity, making it less desert like. Also increasing sodded area, mowing lawns at an increased height, 4″ to 6″ instead of the 2 1/2″ manicured carpet look most strive for.

    REPLY: [ The lack of summer warming, and the presence in winter, is directly tracked to thermometer counts moving to warmer places. If you pick subsets of the thermometers, as I have done in other postings, there is still no summer warming (and often no winter warming either). Winter warming only shows up in the newer thermometers added in warmer places with flatter seasonal trends (i.e. closer to the equator and beach) as they preferentially add to the winter averages. It isn’t a UHI artifact, it is a thermometer location and count-where-hot vs count-where-cold artifact. Basically, they cooked the thermometer locations. But they did not allow for the 4th power radiation effect putting a very hard lid on maximum temperatures and that shows through in the max temps in summer. Basically, they could raise the annual and even the winter seasonal averages with thermometer change; but that 4th power increase in IR with temperature prevents raising the summers much at all and seems to hit hard at about 20 C (in the averages). The potential that it was UHI was, BTW, part of why I did the “subset” thermometer studies. I wanted to see if a long lived set subject to 100 years of urbanization was the “source”, and it was not. Surprised me, but that is what real science is about. Finding the surprise… -ems ]

  44. David Gillies says:

    For endianness conversion under GCC, take a look at the routines in /usr/include/byteswap.h. These provide bswap_16(), bswap_32() and bswap_64() (which do exactly what they say on the tin). As long as you have __GNUC__ >= 2 you will be OK. There’s also ntohs()/ntohl() and htons()/htonl() to convert between network and host byte order (in netinet/in.h), which compile away to a no-op on big-endian systems. The bswap_xx are implemented as assembler primitives on 486 and up so are fairly efficient.

    In addition we have /usr/include/endian.h which can be used for runtime endianness-testing.
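For readers who want to see what a byte swap actually does without writing C, here is a small Python sketch using the standard `struct` module. The helpers `bswap16` and `bswap32` are hypothetical Python stand-ins named after the glibc routines mentioned above; they are not those routines, and the sample value is arbitrary:

```python
import struct
import sys

def bswap16(value):
    """Reverse the byte order of an unsigned 16-bit integer."""
    return struct.unpack("<H", struct.pack(">H", value))[0]

def bswap32(value):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]

# Four bytes holding a float as a big-endian machine would have written them.
raw = struct.pack(">f", 12.5)

# Reading with an explicit byte order gives the same value on any host.
as_big_endian = struct.unpack(">f", raw)[0]
# Reading the same bytes with the wrong order gives a different number.
as_little_endian = struct.unpack("<f", raw)[0]

print("host byte order:", sys.byteorder)
print(hex(bswap32(0x12345678)), hex(bswap16(0x1234)))
print(as_big_endian, as_little_endian)
```

Stating the byte order explicitly in the format string avoids depending on the host machine at all, which is the same goal the C routines serve.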

  45. If we start with the studies of what works in climate forecasting, the Milankovitch cycles, and expand on what has turned out to be true about solar cycles according to Theodor Landscheidt (the only one to correctly forecast the long solar minimum we are passing through), the evidence points to the natural variability factors as being the effects of the rotation of the galaxy and the swirl imparted to the local area of the spiral arm we seem to reside in (Milankovitch), and the inertial dampening of the planets’ effects on the barycenter of the solar system, which moves the sun’s center of mass around as it tries to stay magnetically and gravitationally centered in the swirling magnetic fields, plasma, and dust clouds, and other stars joining us in this dance to the celestial music as it were.

    Landscheidt found the driving forces of the inertial dampening of the system and defined it to the point of predictability. It only seems that the next steps would be to analyze the effects of the interactions of the Inner planets, which have a rhythmic pattern to their orbital relationships, and their relations to the weather patterns they share. Most good discoveries come from the individuals who seek the truth without consideration for the limited vision of the thundering herd mentality.

    With climategate we have seen the latest stampede, of hurried angst ridden, fear mongering, driving of the ignorant sheep of the world away from the truth and into the pens. By the politically minded “think they know what the rest of us need crowd,” that are controlling the funds, research orientation, and imposing their goals upon the process, to achieve profits as they see fit, to stay in power.

    I have quietly undertaken the study of the relationships between the interactions of the Sun’s magnetic fields borne on the solar wind, and its interactions with the Earth’s weather patterns, to the point I have found the cyclic patterns of the shorter decadal durations, that show up as the natural background variances in the climate RAW data sets. Starting with the history of research into planetary motions and the Lunar declination (the Earth / Moon system’s response to the rotation of the magnetic poles of the sun), in order to find a natural analog to the patterns in the weather there were several things I had to consider.

    The results of the analog cyclic pattern I discovered repeat within a complex pattern of Inner planet harmonics, and outer planet longer term interferences that come round to the 172 year pattern Landscheidt discovered, so this is just the shorter period set of variables, that further define the limits, of the natural variables needed to be considered, along side the CO2 hypothesis, as the longer term/period parents (Milankovitch and Landscheidt cycles) of these driving forces are valid. It would be in error if they were not considered and calculated into the filtering of the swings in the climate data, for forecasting longer terms into the future.

    A sample of the cyclic pattern found in the meteorological database is presented as a composite of the past three cycles composited together and plotted onto maps for a 5 year period starting in 2008, and running to January of 2014, on a rough draft website I use to further define the shifts in the pattern from the past three to the current cycle, to continue learning about the details of the interactions.

    The building of Stonehenge at the end of the last ice age, was done as the weather in the area was changing from tundra, to grasses and shrubs, in waves from the El Niño effects at the time.

    They began a study of the relationship between the Solar and Lunar declinational movement timing, found the lunar 18.6 year Mn minimum/maximum declinational cycle, the 19 year Metonic cycle where the moon is at the same phase and maximum declination on the same date every 19 years, and the 6585 day Saros cycle of eclipses.

    From combining the annual seasonal effects of apparent solar declination, and the short term effects of the Lunar declinational movement. The Incas and Mayans understood repeating weather patterns well enough to build a thriving culture, that supported a much larger population, than the area currently barely supports in poverty.

    Then along came the Conquistadors, that were assumed to be the gods foretold in prophecy, who took over and killed off the high priests and the learned class (because the Catholic priests with them were convinced, they were idolaters and heretics.) so all were lost that understood how the “Pagan religion” was able to grow that much food with little problems, by the timing of celebrations and festivals that the people partook of, in a joyous and productive mood.

    The Mayan stone masons who were busy carving out the next stone block to carve another 300 years of calendar upon, were put to work mining gold to export back to Spain. So with the next stone block unfinished, and in the rough, still in the quarry the Mayan calendar comes to an end in 2012.

    Most of the population of the area was either killed in battles, or worked to death, while on cocaine to minimize food consumption, and mined gold for export by the false gods.

    At home in Europe the Spanish inquisition sought to wipe out the fund of knowledge, (that went underground) about the interactions of the Solar and Lunar declinational movements and other sidereal stellar influences on people, and things in the natural world. As the result of mass killings, and book burnings much knowledge, and data history was lost.

    Nicolaus Copernicus (19 February 1473 – 24 May 1543) and Nostradamus (21 December 1503 – 2 July 1566) were around at about the same time and may have collaborated in person, or through a network of underground friends, to give Nostradamus the idea to convert the data sets of past history sorted by geocentric astrology locations and positions, to a Heliocentric data base from which he drew his famous quatrains. There are many references to late night calculations, aside observations that may have given him his accuracy. Then along came Galileo Galilei (15 February 1564 – 8 January 1642) with proof that round moons circled round planets.

    With the advent of good fast cheap computers, I was able to look at data sets ( although with considerably less coverage due to centuries of suppression,) and sort for Planetary and Lunar influences, and found that the Lunar declinational component, of the orbital movements, of the Moon, was responsible for the driving, of the Rossby Wave patterns, in sync with the lunar declinational tidal forces at work in the atmosphere.

    How does this all work you ask? Well there is a magnetic field that surrounds the sun, and magnetic fields, that are invested in the body of the Galaxy. These large scale standing fields, interact to produce fluctuations in the strength of the fields felt upon the Earth as it moves in its orbit.

    The poles of the Earth are tilted to the axis of the solar system ~23 ½ degrees, giving us the changing seasons. The sun on the other hand is different: its axis of rotation is vertical, but the magnetic poles are tilted ~12 degrees, so as it rotates on an average of 27.325 day period, the polarity of the magnetic fields felt via the solar wind, shifts from the result of the orientation determined by the position of the rotating magnetic poles of the sun.

    The inner core of the moon has frozen, the outer core of the Earth is still molten, and a concentration of the magnetically permeable materials that make up the earth. These pulses of alternating North then South magnetic field shifts have been going on since before the Earth condensed into a planet and then was later struck by a Mars sized object (so the current theory goes), that splashed off most of the crust.

    Most returned to the Earth, some was lost into interplanetary space, and some condensed into the moon. Somewhere in the process the center of mass of the moon gravitated toward the surface that faces the Earth, before it froze, causing that denser side to always face the Earth.

    It is not the center of mass of the Earth that scribes the orbital path of the Earth about the sun but the center of mass of the composite Earth / moon barycenter that lies about 4,700 kilometers off of the center of mass of the Earth, always positioned between the center of the earth and the center of the Moon. So as the Moon rotates around the earth to create the lunar light phases, the center of mass of the earth goes from inside to outside, around the common barycenter. As the Moon moves North / South in its declination, the center of mass of the earth goes the opposite direction to counter balance, around their common barycenter that scribes the smooth ellipse of the orbit around the sun. So really the Earth makes 13 loops like a strung out spring every year.

    The magnetic impulses in the solar wind have driven the Moon / Earth into the declinational dance that creates the tides in phase in the atmosphere. Because of the pendulum type movement, the Moon hangs at the extremes of declination almost three days within a couple of degrees, then makes a fast sweep across the equator at up to 7 to 9 degrees per day. At these culminations of declination movement the polarity of the solar wind peaks and reverses, causing a surge in the reversal of the ion flux generated as a result. Because the combination of both the peak of Meridian flow surge in the atmosphere and the reversal of ion charge gradient globally occurs at the same time, like clockwork, most severe weather occurs at these times.

    Because of the semi boundary conditions caused by mountain ranges, the Rockies, Andes, Urals, Alps, Himalayas, that resulted in topographical forcing into a four fold pattern of types of Jet stream patterns, I had to use not a 27.325 day period but a 109.3 day period to synchronize the lunar declinational patterns into the data to get clearer repeatability than the same data set filtered by Lunar phase alone.

    There is a pattern of 6554 days where in the inner planets, Mars, Earth, Venus, and Mercury, make an even number of orbital revolutions, and return to almost the same relative position to the star field.

    By adding 4 days to this period I get 6558 days the time it takes the Moon to have 240 declinational cycles of 27.325 days, so that by using 6558 days as a synchronization period I get the lunar Declination angle, lunar phase, perigee / apogee cycle, and the relative positions of the inner planets to align from the past three (6558 day) cycles well enough that the average of the temperatures, and the totals of the precipitations give a picture of the repeating pattern, from the last three to forecast the next almost 18 year long string of weather related events, with a better accuracy than the forecast available for three to five days from NOW from conventional NWS / NOAA sources.
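The period arithmetic quoted in the last few paragraphs can be checked in a few lines. This sketch only confirms that the stated numbers are consistent with each other; it says nothing about whether the periods have the physical meaning claimed. The variable names are invented here, and `Decimal` is used so that 27.325 is handled exactly rather than as a binary float:

```python
from decimal import Decimal

DECLINATION_CYCLE_DAYS = Decimal("27.325")  # period quoted in the comment
INNER_PLANET_PERIOD_DAYS = 6554             # period quoted in the comment

# Four declination cycles: the "109.3 day" period.
four_fold_period = 4 * DECLINATION_CYCLE_DAYS
# 240 declination cycles: the "6558 day" synchronization period.
sync_period = 240 * DECLINATION_CYCLE_DAYS
# Gap between the two long periods: the "4 days" that were added.
offset = sync_period - INNER_PLANET_PERIOD_DAYS
# Length of the synchronization period in years of 365.25 days.
sync_years = sync_period / Decimal("365.25")

print(four_fold_period, sync_period, offset, round(sync_years, 2))
```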

    So by looking at the periods of declinational movement and the four fold pattern of Rossby wave propagation, while maintaining the inner planet synchronization. I get all of these influences in sync to look almost the same, as the current conditions, even to periods of hail, and tornado production.

    When the outer planets are added into the mix, they are out of phase in regard to the inner planet / Lunar patterns, and their influences are not in Sync with these background patterns. There are lines of magnetic force that connect each planet to the sun, and these revolve around with the planets naturally.

    As the Earth’s orbit takes it between these outer planets and the sun (at Synodic conjunctions), the increase in magnetic fields carried via the solar wind, (to effect this outer planet coupling) is felt upon the Earth’s magnetosphere, and results in a temporary increase in the pole to equator charge gradient then a discharge back to ambient levels (about a two week long up then down cycle time), how this interferes or combines with the “usual lunar / inner planet patterns” is determined by whether it is in, or out of phase with the background patterns.

    During normal charge cycles more moisture is driven into the atmosphere carrying positive Ions, along the ITCZ, and in discharge cycle phases waves of free electrons, and negative ions are sent down from the poles into the mid-latitudes. Charge cycles inhibit precipitation amounts and discharge cycles produce increased precipitation amounts along existing frontal boundaries, due to changes in residual ion charge differences between the air masses.

    There is a seasonal increase in magnetic fields coupled from the center of our galaxy to the sun that peaks in mid June (summer solstice), and then decreases till winter solstice. As the magnetic charging cycle associated with this build up in Northern hemisphere Spring, it brings on a bias for surges of positive ionized air masses, that produces surges of tornadoes in phase with the lunar declinational culminations, and other severe weather, will also be enhanced by Synod conjunctions with outer planets, by the same increases of positively charged ions. The closer the timing of the conjunction to a peak lunar culmination the sharper the spike of production, like cracking a whip.

    During discharge phases from summer solstice through fall in general, tropical storms manifest as large scale discharge patterns to wring the moisture, heat, and excess ions out of the tropical air masses. Outer planets conjunctions at these times help to build moisture reserves in the atmosphere, during their ion charge contribution, and enhance storms to category 4 and 5 levels when in phase with their discharge phase influences. So to say that the planets have no real influence on the world in general, is the same as to totally disregard how much the weather affects how people live and survive.

    I think that the influences felt at the surface, are just changes in the background stimuli, and not strongly controlling enough to lose free will, when we choose to interact with the total spectrum of stimuli that surrounds us at any particular moment. The 18+ year long repeating pattern is long enough that the other conditions surrounding a person have changed, via plant growth, soil changes, age, or location on the surface of the earth, since the last cycle.

    On the ground, all plants that have roots in the soil share soil ions and nutrients, via microbial sharing, fungal predation, and companion plants that support each other. The organic matter from past growth that gives up valuable nutrients as it decays, and adds texture to the soil for better aeration and moisture penetration, form a mat of interacting processes, that breathe life into the environment.

    REPLY: [ An interesting, if a bit long, comment. FWIW, the discussion of barycentric stuff was going on in this thread:

    and comments about the earth / planet / sun / wobble etc. are better placed there. This thread is about GIStemp and how it measures temperatures (not particularly what might cause them). If I get enough ambition and time I may figure out how to move this comment to that thread without losing bits… though most likely I’ll just go make a cup of tea and read on ;-) Also, I’ve got one small article up on Stonehenge (see the topic box to the right). On my “someday when AGW is dead” list is to finish writing up how to make your own stonehenge from “scratch”. It is a nice little astronomical observatory along with being both a time and length standard (from which you can get an area standard and volume standards…). See the article on “making an English Foot” for some speculation:

    Basically, you start with star motion and from that make the “rod” and the “megalithic yard” (and other time and length standards). Then you can lay out the henge and make more precise calendars, clocks, lunar and stellar observatory, etc. And like all circular slide rules, the bigger it is, the more precision you get. Thus the large cleared and aligned areas miles from the henge proper. To get even more precise timing of star rise, and set; and moon rise, and set. But details on that must await the actual writing of the “making a henge” article… being Off Topic here. But if you are ever lost and alone and need a precise clock and distance standard, all it takes is a bit of rope, 2 sticks, a stone, and the night sky… Clever people, these ancients… -ems ]

  46. dougie says:

    Hi -EMS
    nice to see you’re getting the hits you deserve on all this.
    about time.

    on your comment re – ‘Clever people, these ancients…’
    how true, most modern !! people seem to think we know it all/take somebody’s (expert) word & believe, we’ve got experts on everything, no need to think for ourselves then!!

    these ‘ancients’ put many of our present scientists to shame. sorry if i diverge.
    may be a new thread needed

  47. E.M.Smith says:


    Not really a divergence… frankly I enjoy the topic far more than the whole AGW mess. You may note that the “English Foot” and “Greek Foot” time stamps are early…

    The fact that the Greek Foot and the English Foot are inside the error band of each other even though a few thousand years separated in history fascinates me to no end. The fact that we are in “Four 9s” land on the precision of measuring the circumference of the earth in feet is astounding. And the notion that I can make a “rod” that is very precise AND a time standard with 2 sticks, some rope and a rock at any place on the planet and any time the night sky is clear speaks to me more than any of this other stuff.

    I find myself admiring the work of some Druid 4000+ years ago and wishing that the modern crop of “scientists” had 1/10th of their skill.

    To me, the issues are inextricably linked. But for all the other folks who “glaze” over such things and just want the daily dose of “AGW is broken”, I do think it is better to focus such ponderings in a thread devoted to them.

    But if you give me even 1/4 of a reason to go into it, I’ll be telling you how to make a ‘rod’ unit of measure with good precision and not much beyond stone age tools and admiring the scientists of 4000 BC… and loving it…

  48. Pingback: GHCN temps re-hashed – step 0.001 « Warmingslide's Blog

  49. Douglas Hoyt says:

    E. M. Smith
    I did a reconstruction of the global surface temperatures back in 1999 using the CRU data through 1997. Basically I found that the temperature declined much more rapidly between 1940 and the mid-70s than CRU reported.

    If interested, email me and I will send you my Fortran code and the data.

  50. Comment Moved to:

    [ The endless nattering about precision really belongs on the precision thread, so I’m moving all that discussion there. Any further comments on the folly of 1/100 precision from whole degree measurements will be moved there or deleted depending on my time and whim at the moment. If you wish to flog that particular topic more, go to the correct thread for it. -E.M.Smith ]

  51. Pingback: Don’t Hold Your Breath. If the AGW Evidence is So Conclusive, AGW Proponents Explain This: « Thoughtful Analysis

  52. Neil Fisher says:

    Just been over at Lucia’s blog, where she’s talking about false precision. She appears to be saying you are wrong about this. Care to comment?

    REPLY: [ Not really. The issue has been thoroughly flogged to death. See the Mr. McGuire thread. Please. Pretty please. Pretty please Or Else. -E.M.Smith ]

  53. Arthur Dent says:

    Yup, I’m afraid that I started this argument (?) by pointing out that Lucia’s view and your view appeared to differ

  54. Adam Gallon says:

    I’m sure you did a post looking at the thermometers in New Zealand, can’t find it, but this may be of interest to you!
    “Now, a study published in the NZ Journal of Science back in 1980 reveals weather stations at the heart of NIWA’s claims of massive warming were shown to be unreliable and untrustworthy by a senior Met Office climate scientist 30 years ago, long before global warming became a politically charged issue.”

  55. E.M.Smith says:


    Follow the “global analysis” link at the top of this page and there are 4 postings about the Pacific basin. I think you are probably thinking about this one:

    Oh, and yes, interesting article…

  56. docmartyn says:

    Would it be possible to find a rosette of, say, five temperature stations, within ‘communication’ distance of one another, and remove each one at a time and run the temperature profile? Counting the run with all five stations, you would get 6 data-sets, giving six different ‘averages’ for the region. It would give us some idea of how robust the modeling is.
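
    A minimal sketch of that leave-one-out test, in Python. The five station names and temperatures are made up purely for illustration, and a plain mean stands in for whatever regional averaging one actually wants to test; it is not GIStemp's method.

```python
from statistics import mean

# Hypothetical annual mean temperatures (deg C) for a rosette of five stations.
stations = {"A": 10.2, "B": 9.8, "C": 11.1, "D": 7.4, "E": 10.5}

def leave_one_out_means(temps):
    """Return the all-station mean plus the mean with each station dropped in turn."""
    results = {"all": mean(temps.values())}
    for dropped in temps:
        kept = [t for name, t in temps.items() if name != dropped]
        results["without " + dropped] = mean(kept)
    return results

results = leave_one_out_means(stations)
# The spread between the largest and smallest average is a crude robustness measure.
spread = max(results.values()) - min(results.values())
for label, value in results.items():
    print(f"{label:>10}: {value:.2f}")
print(f"spread across the six averages: {spread:.2f} C")
```

    If the six averages sit close together, the regional figure does not depend much on any one station; a wide spread would say the opposite.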

  57. leighp says:

    I can barely follow 50% of this stuff but I do hope you or someone of your calibre is writing The Book because it is gripping reading even for someone of my very limited understanding. I think I now know who the big baddy is in this thriller. I only wish I could contribute to the investigation.

  58. E.M.Smith says:


    Don’t worry about some of it being a bit deep. It just takes time. So pick a bit, and start studying it. About 3 years ago I had no idea what the heck any of the ocean currents were (PDO, AMO, ENSO, etc.) nor did I have any idea what a GIStemp was or even that it existed.

    I was completely over my head over at wattsupwiththat (a weather and climate site with decent info).

    After about a year reading stuff there (and a few books), I was pretty well oriented to how a lot of the climate and weather stuff works.

    Then someone posted that GIStemp code was published (after I’d complained of it being a black box). The rest, as they say, is history. I downloaded a copy, discovered it was old FORTRAN (that I’d written once, long ago, so could remember pretty quickly). A few months later, I had it running. Now I’m (more or less) an expert on it.

    So if you catch 50% of it, you are doing better than I did on my first few dips into the whole climate thing. Just don’t give up, you can ‘get it all’. It just takes some time (and learning that the AGW folks web sites are full of time sinks that try to convince you that you can’t get it). So don’t let them wear you down.

    The part that matters is somewhat complicated, but not beyond anyone to ‘get’. Start by just “looking out the window”. See all that snow in North America and Europe? That means when The AGW Team shout “Warmest Winter in DECADES!!!!” they are wrong. Be grounded in yourself, and the rest follows pretty directly.

    It was the “Warmest 115 Year RECORD HEAT!!!! in California !!!” about a year ago (at least, it seems like a year, probably only 1/2 year ;-) announcement that made me call “BS” on it. My tomatoes were not setting fruit. In warm years they do, in cold years they don’t. It was a cold year. Not a record warm one.

    So I went digging. What thermometers said it was warm in California? That led directly to all the postings about GHCN (Global Historical Climatology Network) having “cooked the books” by deleting thermometers in cold places since 1990 or so. Not some great wisdom. Not some deep understanding. Just a tomato plant that did not tell lies.

    So if you know what your garden tells you,
    or if you know what your heating bill tells you,
    or if you work outdoors and just know you are cold,
    you know all you need to know to get started.

    From there, you just start looking at why reality does not match the thermometer record? Who’s cooking what books?

    It’s really that simple, and anyone can do it.

    (And after a couple of years, you too can be fondly remembering the days when GHCN and USHCN were meaningless jumbles of letters… just like they were for me a couple of years back. 8-)

  59. Posted this on

    Latest thoughts, with some rehashing.

    Richard Holle (22:01:29) :

    There are a few large factors “they” don’t consider in the models, (that were supposedly proven useless in the 50’s, before computers and peer review came around.)

    The Moon drives the tides in the oceans (that they know). The Moon, in its declinational (North to South) movement, also moves the atmosphere around and is the strongest driver of global weather patterns. (This is the problem with the models in handling time scales past 3 days to 20 years.)

    The answers can be found to both enhance short term, (3 days to monthly) forecasts and climate models out to about 15 years or more, by incorporating the periods of the Lunar declinational atmospheric tides and their resultant effects on the Rossby waves and Jet stream patterns, into the models.

    I am offering here for your use, the process to fix these problems. If you look at the “forecast maps” generated by the aforementioned process, you will see they did much better than the NOAA forecasts in the article above.

    Notice also that the large spring outbreak of tornadoes coming around the 22nd through 25th of March 2010 is mentioned, hidden in the middle of this rather lengthy read. (Maps of the days mentioned can be found posted on the site; updates under the national maps will be made 2 months in advance to reflect the areas of the states expected to be affected by this outbreak of severe weather.)

    Richard Holle

    The following text was Originally Posted on another blog: December 13, 2009 11:35 pm;

    One of the problems with the current models is that the reference time frame for initial conditions is very narrow, and changes within the past three days will, a lot of the time, introduce persistence of inertia to the medial flows for several days, consistent with the actual flows, as the Lunar declinational atmospheric tides make their runs across the equator from one poleward culmination to another.

    Then as the tide turns and we have the severe weather bursts at declinational culmination, they get confused, or surprised, as the initial inertial effects reverse for about four days before the sweep to the other pole, that brings back the smooth flows, the models understand.

    So that when the Lunar declination went to Maximum North on December 3rd, turbulence and shear introduced into the atmosphere from the turning tide (which the models do not know about) surprised them with the usual couple of tornadoes. Now (12-13-09) that we are ~20 degrees South Lunar declination, the models have a full buffer of five days of linear inertial movement from the Moon’s trip South across the equator (12-09-09), and the Moon is slowing its movement.

    Coming up on the Southern extent culmination, producing a secondary tidal bulge in the Northern Hemisphere, bringing us to the mid point of a 27.32 day declinational cycle (one of the four routine patterns that cycle on a 109.3 day period). This particular one (#1) that started back on Dec 3rd, has incursions of polar air masses that come down from Western Canada, through Montana and the Dakotas, to make up the Northern part of the atmospheric tidal bulge.

    So I would expect to see a large invasion of cold dry air sweep almost all the way to the Gulf coast again, then the produced frontal boundary with the interesting weather, that includes change state intense precipitation. Freezing rain, where the warm over runs cold, and snow where the cold undercuts the more sluggish warm air, still moving North East by inertia alone, severe weather to form in that trailing edge of the warm moist mass, that gets over taken from behind by the polar air mass that tries to follow the tidal bulge back to the equator, which for the next 4 or 5 days powers up the cyclonic patterns generated by Coriolis forces, and finishes out as the Moon approaches the equator again.

    Expect the same type of interaction again for a primary bulge production by the passage back North, culminating on 12-30-09, pumping in a solid polar air mass very consistent with the pattern we had on 12-03-09, (the North “lunar declination culmination”)[LDC], then (#2) the next Rossby wave / jet stream regime pattern, comes back into play with much more zonal flow, and air masses invading from the Pacific, (of the two sub types of) phase with lesser amounts of Gulf moisture entrainment in this one, more in the other #4.

    The (#3) third 27.32 day pattern with polar air masses invading in from the Minnesota / Great Lakes area and sweeping out through the Eastern sea board, and mostly zonal flow out west, from 01-27-10 till 02-23-10, comes next.

    The fourth 27.32 day cycle, that looks very similar to #2 but with much more moisture from the Gulf of Mexico, usually has more hail and tornadoes associated with it than Pattern #2, and typically flows up the Eastern side of tornado alley. It will be in effect from 02-23-10 through 03-22-10, and should produce the first big surge of severe tornado production, from about March 20th 2010 until about March 26 or later, as the next polar air mass cycle is coming out of western Canada, and should make for steep temperature gradients, and ion content differences.

    Richard Holle

    From a viewpoint of how the assemblage of parts seamlessly fits together, the only thing you have to do, is to watch the (short but seemingly) endless stream of (every 15 minute) infrared and/or vapor satellite photos animated, (after fixing the jumping around of the originals, due to lack of foresight, that they might be useful some day), and synchronized by 27.32 days periods, to see the repeating cycles.

    To set up five tiled windows, in the first show day #1 through #27 sequentially, then as they continue on in the same stream, the cycle of the first 27 days continues anew in window #2, synchronized by Lunar declination to #1. Till they spill over into window #3 stepping in phase with the other two, #4 the same idea gives you the four basic patterns of the Rossby wave 109.3 day cycle, of global circulation, that then repeat but seasonally shifted.

    In window #5 then would be the first repeat of window #1 in the same phase of the same pattern, and should look a lot like window #1. As the progression through the total series, proceeds, when you get 6558 days into the five stacks, a 6th window opens and the original day #1 in window #1 opens as #1 in window #6. As the series progresses on, real data can be viewed, in the real interactions going on.

    This would give you a look into the cyclic pattern that develops from the repetitive interaction of the inner planets, and tidal effects, caused by the Lunar declination, phase, perigee/ apogee cycles.

    By adding a sliding ball, vertically moving up and down a +-30 degree scale bar (referenced from the Equator), on the side of each tile space, that shows the plot of the current Lunar declination for the time of each frame. Which will allow you to see the shifts in the Lunar declinational angle’s effects, as the 18.6 Mn signal progresses.

    By adding another slide bar of +-30 degrees (with the heliocentric synod conjunction with Earth, as the zero reference), at the top, of each tile you could view each outer planet as we pass them, as color coded discs labeled, J, S, U, N, shifting from left to right. From viewing this progression of the outer planets, the merit of their influences, can then be seen in the additional surges in ion flux as they go by. You can watch the changes in the normal background, of the global circulation driven by the moon and inner planets, affected by the outer planets.

    By adding in the surface maps for the past historic temperatures, dew points, precipitation, types, and amounts, as overlays onto the IR/VAPOR photos, the patterns will be abundantly clear to 10 year old school kids. At the same time, generating a good long term forecast, set of analogs to base the models upon.

    Once the amount of additional angular momentum, and the process of its comings and goings can be clearly seen, it can then be measured, its effects calculated, and incorporated into the climate models, as a real quantized feedback, thereby giving us a much better picture, of the interactions, of all of the parts of the puzzle.

    All of the necessary data is in the archives, and free to use, to those that have the wherewithal, to assemble the real truth, be it inconvenient or not. I will probably spend the rest of my life, trying to do it alone, out of my own funds, as I have done so far.

    Richard Holle (22:31:45) 28 12 2009:

    If you bothered to read the lengthy entry above, you will understand that there are four patterns of global circulation that alternate, (as stated above) from ones of high zonal flow to ones of High medial flow.

    This is why the weather patterns run warm during the high zonal flow patterns (September and November 2009) then cold during the alternate months of October and December 2009.

    The patterns induced by the Lunar declination run for 27.325 days at a cycle, as this is just short of a month the pattern slews into and out of phase with the “Monthly periods” so data stored “by Month” has problems with this slewed cyclic corruption. It would help if “they” used sets 27.325 days long to plot trends as they would see sets of clean alternating trends in resultant data sets.
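
    To show how a 27.325 day period slides against calendar months, here is a small Python sketch. It only does the date arithmetic: the period and the December 3rd, 2009 Maximum North date are taken from the text above, while the function name and the choice of that date as day zero are assumptions made for illustration. It says nothing about whether the weather actually follows the cycle.

```python
from datetime import date

CYCLE_DAYS = 27.325            # period quoted in the comment above
EPOCH = date(2009, 12, 3)      # Maximum North declination date given in the comment

def cycle_bin(day):
    """Return (cycle number, days into that cycle), counted from EPOCH."""
    elapsed = (day - EPOCH).days
    number = int(elapsed // CYCLE_DAYS)
    return number, elapsed - number * CYCLE_DAYS

# The first of each calendar month lands at a different point in the cycle,
# which is the "slewing" against monthly bins described above.
for month in (1, 2, 3, 4):
    first = date(2010, month, 1)
    number, phase = cycle_bin(first)
    print(f"{first}: cycle {number}, {phase:.1f} days into the cycle")
```

    Grouping daily data by the cycle number instead of by calendar month would give the 27.325 day sets suggested above.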

    “They” could filter for the long term cyclic patterns, to reduce the noise in the composite signal, to the point that low frequency patterns caused by solar cycle shifts in activity, could also be filtered out leaving the residual surges in solar wind flux caused by the outer planets influence. Then when that is found, and defined well enough to filter out.

    What would be left should be the CO2 long term forcing, that will probably be very small, but conform well to the CO2ppm increases, in the atmosphere. THEN we would be able to decide rationally what if anything, needs to be done, about carbon foot prints, and suggestions for controls.

    Richard Holle

  60. ruhroh says:

    Hey Chief;

    Here’s the dumb idea du jour;
    (might as well get on the leaderboard early…)

    Would this thing help anything?
    These instructions deviate from generic at step 11, but otherwise may have use?
    Just a random idea…

    “What is f90tohtml?

    * f90tohtml is a PERL script that converts FORTRAN source code into HTML. All the subprogram calls are linked, both forward and backwards. A clickable calling tree is constructed. A subject index can be made from a user-supplied hash. A search engine, based on regular expressions, searches the code.
    * f90tohtml was developed for the purpose of browsing large numerical weather prediction codes, the University of Oklahoma’s ARPS model, the PSU/UCAR MM5, the NCEP Regional Spectral Model, the Navy’s COAMPS model, and the new community WRF model.
    * f90tohtml is most effective when used on your code; browsing from your own disk is much quicker than over the net. But you may view an online WRF Browser. The WRF model is v1.3, downloaded from on March 30, 2003.
    * The files and scripts that will help you apply f90tohtml to your source of WRF, ARPS5.0 and COAMPS2.0 are bundled with the distribution.

    How to install f90tohtml

    1. Download f90tohtml.tar.gz (meaning the latest version). Then gunzip f90tohtml.tar.gz, then tar xvf f90tohtml.tar
    2. cd f90tohtml. Then vi f90tohtml and edit the line #!/usr/bin/perl to your path to perl, if need be. The path is usually either /usr/local/bin/perl or /usr/bin/perl. Also change the path to your f90tohtml directory in the statement $path_f90tohtml="/home/bfiedler/f90tohtml/";
    3. Then chmod u+x f90tohtml
    4. Add f90tohtml to your path in your .cshrc or .bashrc file.
    5. Then cd examples
    6. Do the appropriate editing within the first few lines of and In d2ps.f2h change $dir_html in $dir_html="~/d2psbrowser/"; so that it contains a valid path to where f90tohtml will create a directory d2psbrowser. Do not create the directory d2psbrowser yourself. Such browser directories can always be later moved to your public html directory, or the browser directory can be made there directly. The search feature will not work until the browser directory is in a directory where cgi works.
    7. Type A directory d2ps_ls and a file within that directory will be made. Actually, a d2ps_ls directory comes with the bundle. But within the .ls files, the filepath will be to a certain bfiedler, which, of course, will not work on your code.
    8. Now type f90tohtml d2ps.f2h
    9. If successful, f90tohtml will tell you where to open your html-ized code with firefox, or a similar browser.
    10. If you want to use the search engine, move the browser to your public directory. Move grepper.cgi to your personal cgi-bin. The browser is coded to find the cgi-bin at the same level as the browser directory. Make sure you have permissions and the path to perl set correctly.
    11. Try making a browser for the NCAR Column Radiation Model; first run, make the appropriate changes within crm.f2h, then f90tohtml crm.f2h. “

  61. Pingback: John Coleman’s hourlong news special “Global Warming – The Other Side” now online, all five parts here « Watts Up With That?

  62. M. Simon says:

    I think it is important when you talk about thermometer deletions to note whether the data is still being recorded but not used.

    Otherwise it can be confusing.

    It confused me until I learned about the Russian “deletions”.

  63. M. Simon says:

    I can’t think of a good synonym for “available but not used”. I’m open to suggestion. Maybe just “unused” data.

    Here are four more pages of ideas:

    Anyway, I think it needs a clearer term than “deletions”.

  64. M. Simon says:

    It just takes some time (and learning that the AGW folks web sites are full of time sinks that try to convince you that you can’t get it).

    I had a warmist try to convince me that I could not get homogenization because it was very complicated (actually it is one of the least complicated things in this whole mess). I came back to him that I was relatively up to speed with it as I had been involved in the discussions at Climate Audit and would he care to discuss it further. Let me add that the warmist in question was a regular poster at the board I frequent. Actually I’m also a moderator at that board. A rather meaningless title (more akin to janitor in the instance) as I only clean up unworking urls or overlong urls etc. I delete nothing (I could) and I don’t even insert comments in posts (other than to admonish about style defects such as overlong urls or pictures that mess up the formatting). For the most part I behave just like a regular commenter.

    It has been three days since I made that offer and not a peep out of him. Heh.

    I have been a code monkey in the past (assy lang for the most part), I know the issues with digital representations of numbers (I have designed routines in assy lang), and my calculus is very weak from decades of unuse but I can follow the arguments. Where I do shine is in instrumentation issues and Naval issues (old sea water temp data) since I was a Naval Nuke and know my way around a Naval steam plant. And the most important thing I know is how to “gun deck” the data. Heh.

    Well, anyway, there is a lot of covering smoke (or squid ink if you prefer) from the warmists.

  65. M. Simon says:

    I’m going a little OT here but the comments on temp precision are closed.

    Let me just say those folks who argue against you either

    1. Don’t get the concept
    2. Are thick as a brick
    3. Are intentionally blowing smoke.

    I’d say more but I have already said too much.

  66. Pingback: John Coleman’s hourlong news special “Global Warming – The Other Side” now online, all five parts here « ~ THE GUNNY "G" BLOGS ONLINE ~ NEWS-VIEWS-BS, AND WORSE!

  67. MK says:

    Thank you for your efforts. Please make sure that you get this information to Jesse Ventura (truTV, the Conspiracy Theory show) and James Delingpole at the UK Telegraph. Both have really worked to publicize what a fraud the entire cap-and-trade etc. is and how the climate information was being falsified.

    REPLY: [ Frankly, I’m up to my eyeballs in stuff that urgently needs doing. If you want somebody to know about this, it’s up to you to let them know. There just isn’t enough of me to do what I’m doing and do marketing / PR too… This is a communal barn raising. Y’all come. -E.M.Smith ]

  68. David Segesta says:

    I’ve often wondered if the “deleted” stations were cherry picked by removing those that showed a declining or neutral trend and keeping those that showed an upward trend. But as I read this it sounds like they have deleted stations in colder areas and then calculated the new average based on the remaining stations. That’s guaranteed to produce an apparent warming trend! Am I understanding that correctly?

    REPLY: [ Basically, yes. There are a couple of ‘finesse’ points that you’ve skipped. The AGW folks believe that “the reference station method” will let you make up data for a missing station with high accuracy by looking at a “nearby rural” station and using historic “offsets” to adjust. Nearby can be 1000 km and rural can be a major international airport measured near the runway… So if a grass field next to a dirt strip at an airport with piston planes landing one a day in 1950 had a +5 C offset from your mountain, then today you would measure the temperature at your 10000 foot concrete runway with acres of tarmac International Tourism Jet Port with 400 flights a day and tons of kerosene being burned for takeoffs and say: “no problem, just adjust that by 5 C and stick it up in the mountains”. I think this “has issues”. They do not. Further, if you have your baseline in one phase of the PDO (that makes one offset between the two regions) and the present in a different PDO phase (with a different relationship) you will get unrepresentative values too. Now, let’s say your “past” had some cold and warm stations in the baseline, but your present only has warm. You take your past offset (cold vs mix of warm/cold) and create a new made up “filled in cold mountain” by taking your present (warm only, so higher value) and adjusting by the offset from before (so a 5 C offset between old mix/cold now gets applied to a HOT only (that might, for example, have a 7 C offset), and will make a warmer “cold” area by 2 C). There is more, but you get the idea. The notion that you can “make it up” with enough precision and accuracy to support a 1/100 C anomaly map is, IMHO, nuts. -E.M.Smith ]
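
    The arithmetic in that reply can be written out as a few lines of Python. The 5 C and 7 C offsets are the illustrative figures from the reply; the 20 C reading at the reference station is an assumed number added only so there is something to subtract from. It is a sketch of the stale-offset problem, not of the actual GIStemp code.

```python
# All numbers are illustrative, not measured data.
baseline_offset = 5.0   # deg C: reference station minus mountain site in the baseline era
true_offset_now = 7.0   # deg C: what the offset would be today (hypothetical, from the reply)
reference_now = 20.0    # deg C: assumed reading at the reference station today

# What a thermometer on the mountain would read today.
mountain_true = reference_now - true_offset_now
# What you get by applying the old baseline offset to today's reference reading.
mountain_infilled = reference_now - baseline_offset

bias = mountain_infilled - mountain_true
print(f"in-filled {mountain_infilled:.1f} C vs actual {mountain_true:.1f} C: bias {bias:+.1f} C")
```

    Whatever reading is assumed at the reference station, the bias is just the difference between the true present offset and the stale baseline offset, here 2 C warm.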

  69. davidc says:

    Yes, thanks for your efforts.

    Listening to the discussion about removal of cold temperature sites from the “active list”, something dawned on me that I had missed before. In your discussion regarding the stage in the program where the anomaly is calculated, you said (if I remember correctly) that GISS says that they calculate the anomalies at individual sites, then calculate the global anomaly as the average of the individual anomalies. But (you said) the program actually calculates a global average temperature and then subtracts the global reference temperature to get the global anomaly.

    With a stable data set these two procedures should give the same result (apart from rounding issues). So I couldn’t see why you were fussing about it. But with a changing data set, with cooler active sites being removed, the two methods are very different. Starting with the anomalies at individual sites, an increase in global anomaly on removing a site would occur if the site removed had a smaller than average anomaly – which could be a hot site or a cool site. But by the second method (which it seems they used) removal of a cool site would tend to produce an increase in global anomaly, just because the site was cool – regardless of whether the site was warming or cooling.
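
    A toy example in Python makes the difference between the two procedures concrete. The three sites and their temperatures are invented, none of them warms or cools, and the two functions are only a sketch of the two procedures as described in this comment; whether GIStemp really uses the second one is the claim under discussion, not something this code shows.

```python
from statistics import mean

# Hypothetical sites: (baseline temperature, present temperature) in deg C.
# No site has warmed or cooled at all.
sites = {"warm_1": (20.0, 20.0), "warm_2": (18.0, 18.0), "cold_1": (2.0, 2.0)}

def mean_of_site_anomalies(active):
    """Method 1: anomaly at each active site first, then average the anomalies."""
    return mean(sites[s][1] - sites[s][0] for s in active)

def anomaly_of_global_mean(active):
    """Method 2: average the present temperatures of the active sites, then
    subtract a reference built from the baseline of ALL sites."""
    reference = mean(base for base, _ in sites.values())
    return mean(sites[s][1] for s in active) - reference

everything = list(sites)
cold_dropped = ["warm_1", "warm_2"]   # the cold site leaves the active list

for label, active in (("all sites", everything), ("cold site dropped", cold_dropped)):
    m1 = mean_of_site_anomalies(active)
    m2 = anomaly_of_global_mean(active)
    print(f"{label:>18}: method 1 = {m1:+.2f} C, method 2 = {m2:+.2f} C")
```

    With all sites active both methods give zero. Drop the cold site and method 1 still gives zero, while method 2 shows several degrees of "warming" that no thermometer recorded.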

    It might be worth repeating some of that as there might be others who missed the point, as I did, not realising how much the datasets were changing.

  70. Jan says:

    Hi from the Czech Republic. Nice Job. I was interested in the problem, so I asked the Czech meteorologic institute leading climatologist L. Metelka I dispute sometimes in internet discussions – how is it with the Czech stations in the global temperature network. He replied to me that recently the CRU asked them to re-provide them with the Czech data, the CRU allegedly have lost.
    So I asked him what stations they provide data for CRU from.
    And it looks like it follows exactly the pattern you describe.
    Czechs provide data for CRU from 5 locations:
    1. Prague-Clementinum (altitude:197m – in very very center of the Czech capital Prague);
    2. Prague Ruzyne (alt:350m – at the largest Czech airport, station somehow “technically coupled”? with Clementinum – Mr. Metelka mentioned that, but I don’t understand what does it mean);
    3. Brno Turany (alt:237 – at the second largest Czech airport);
    4. Ostrava Mosnov (alt:257m – at the third largest Czech airport);
    5. Cheb (alt: 483m – at the eastern outskirt of the town in immediate neighborhood of an industrial zone, namely of large asphalted areas of the truck park lot, approx 2 miles from large water reservoir located more eastwards).
    The mean altitude of the Czech republic is 430m.
    To sum up: 3 stations at the 3 largest international airports in the country, 1 in the very center of the largest Czech city, 1 in between the town industrial zone and a water reservoir. No rural station whatsoever.
    To me it looks like a completely unrepresentative pick for the Czech republic, where the large mountain areas in the north and south of the country are not covered at all.
    Next time I’ll ask Mr. Metelka, if there were some choice changes of the stations for the CRU network in the past and also if they provide data for GHCN and from where.
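
    The altitude point can be checked with a few lines of Python, using only the station altitudes and the 430 m national mean quoted in this comment (the variable names are just for illustration):

```python
from statistics import mean

# Station altitudes in metres, as listed in the comment above.
altitudes = {
    "Prague-Clementinum": 197,
    "Prague Ruzyne": 350,
    "Brno Turany": 237,
    "Ostrava Mosnov": 257,
    "Cheb": 483,
}
country_mean_altitude = 430  # metres, as stated in the comment

station_mean = mean(altitudes.values())
above_mean = [name for name, alt in altitudes.items() if alt > country_mean_altitude]

print(f"mean station altitude: {station_mean:.1f} m")
print(f"below the country mean by: {country_mean_altitude - station_mean:.1f} m")
print(f"stations above the country mean: {above_mean}")
```

    On these figures the five stations average roughly 125 m below the mean altitude of the country, and only Cheb sits above it.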

  71. Pingback: Militant Libertarian » ClimateGate arrives in the US – NASA Cheated Climate Data

  72. Jan says:

    Just another piece of information I got about the three airports: all of them have been substantially enlarged in the years since 2005, as air traffic became much denser after many of the low budget airlines were founded in Europe.


  74. Pingback: Primary US Data Climate Centers Now Caught in Data Manipulation - XDTalk Forums - Your XD/XD(m) Information Source!

  75. Jan says:

    Some more information: the coupling of the two stations Prague-Klementinum and Prague-Ruzyne means that until 1951 they use data from Klementinum, then from Prague-Ruzyne, in one measurement line. The temperatures in the CRU line from Klementinum before 1951 are adjusted to the altitude of Prague-Ruzyne. What is quite weird to me is that Mr. Metelka states that the available data 1970-1990 from Klementinum are now also adjusted to Ruzyne by CRU. To me it makes no sense.
    It is also positively stated from Czech Meteorological institute that there is no correction of UHI made in their data, nor any adjustments except the adjustments of the Klementinum temperatures before 1951 to the level of Ruzyne. The Institute also has at the time no information about what stations are at the time used by GHCN.
    Interesting is the answer to the question why they use the data from airports, which I allow myself to translate here: “The airports are [chosen], because WMO as well as ICAO constitute and control very strict rules for the station positioning at the airports for the measurement quality. Data from the airports also are going into the international exchange and are very quickly (in minutes) available throughout the world.”
    Which brings up the question of whether all the non-linear regression analyses that allegedly prove CO2 (not the sun or PDO) is the dominant factor in the warming are merely the result of an artifact, one that maybe stems less from fraud than from the laziness of GHCN, CRU etc. in obtaining unbiased data from national Met offices. They rather rely on the easily available data used by international air-traffic meteorologists [it is exactly the weather data from airports that matters most for air traffic and should be available instantaneously], and it is quite questionable whether that data can be used for climate predictions, especially for proving that CO2 is a major cause of the observed warming in recent decades.
    I would think the warming due to air traffic will be only loosely significant even around the airports, but even if not very significant by itself, it could bring an artificial trend into the temperature lines. If somebody then tried to find what correlates best with the temperature (sun, PDO, or CO2), he would always find a strong correlation with CO2 (which rises steadily with the same trend as the traffic at airports), and could then falsely conclude that CO2 is the cause of the warming.

  76. Tony says:

    Dear Mr Smith,

    Please check out the post on WUWT on the UK Parliament’s investigation into Climategate, and their call for testimony.

    Nationality is no bar to submissions, and it would be a great service if you could send them a summary of your GISTEMP findings.



  77. Ibrahim says:

    I noticed that Alaska was warming on the NASA chart.
    I don’t know, but judging by the information on the sites given below, this is not true.
    I believe that there are more places (like Australia) where the same “climate change” is (not) happening.

    Could you have a look at this?

  78. Pingback: » US Government Agencies Manipulated Climate Data To Intentionally Skew Data. The Liberty Tree Lantern

  79. ibrahim says:

    OT but I don’t know where to ask this

    Maybe you could have a chemist look at the acidification of the oceans by underwater volcanoes compared to CO2 acidification.

  80. Pingback: Readers Edition » Jetzt haben auch die USA ihren Klimagate-Skandal

  81. Pingback: Klimalüge: Die USA lügen was das Zeug hält » Europnews

  82. Gunnar Saunes says:

    Marc Sheppard is also writing about CRU, GISS and Climategate in an article on American Thinker.
    There he names NOAA and GISS as the “bad boys”.
    In your article it is NCDC and GISS.

    What is right?

    REPLY: [ Both. NCDC is a part of NOAA. Just as GISS is a part of NASA. I often state them as NOAA/NCDC and NASA/GISS just to make it clear that all 4 names talk about 2 entities. So it’s the National Climatic Data Center subdivision of National Oceanic and Atmospheric Administration… Basically, how high up the org chart do you want to include? You could also go lower and hit, eventually, individuals like Peterson who is the NOAA / NCDC person and / or the GHCN “data set manager” who is a NASA employee (one presumes “on loan” to NOAA/NCDC). Even their web site URL shows the relationship: though in a small to large way… -E.M.Smith ]

  83. Pingback: A few more AGW shenanigans links « No Cynics Allowed

  84. Pingback: Climagate (continued) « polis

  85. anom says:

    I worked in Asheville with NCDC into 2006. They started moving very ‘left’ with their interpretation of data in the years before I left. They became involved with IPCC and this was a big deal for the organization. I felt Tom Karl was playing politics. This seems to have worked out well for him, based on his AMS appointment and considering the politics of the new NOAA head, Dr. Jane. I went back to forecasting. Of course, in nearly 30 years of meteorology and climatology, I’ve never met an operational meteorologist who believed a climate model or long term observational record trends. The only folks who believe a climate model usually have their hand out for funding or a promotion.

  86. Hi – really good website you have established. I enjoyed reading this posting. I did want to issue a remark to tell you that the design of this site is very aesthetically delightful. I used to be a graphic designer, now I am a copy editor for a merchandising firm. I have always enjoyed playing with information processing systems and am attempting to learn computer code in my free time (which there is never enough of lol).

  87. Russ Smith says:

    Very interesting site, you may not remember me but [snip! – I remember, but everyone else doesn’t need to know! -E.M.Smith]

    Funny, all these years later I stumble across this site. You and my dad would get along famously; he’s quite active in this area. I sent him a link to some of your stuff. He’s quite a bit more technical than I am, so he’d probably be better able to discuss this with you. Very interesting stuff. I guess when people say Hansen is the father of Global Warming it’s more literal than they realize, as in he’s helped literally create it with the reduction in measurement centers.

    Russ Smith

  88. Alf Hänle says:

    “Wer misst, misst Mist.”
    “Who measures, measures (a) mess”
    Thank You ems, for unveiling all this about gistemp!

    As it is rather difficult for anybody who is not a so-called involved scientist to follow the changes in temperature measurement methods, I would even suggest criminal intent in the doings of the climate-change knights of NASA et al. worldwide, as unveiled in this forum.
    In politics and big business it is a well known strategy to cut the public off from the raw data source, or to make it extremely difficult for everybody to read. Whoops! Now we will show You what we want You to see!
    One of the well known tricks to achieve this is to present the (temperature) trends of index values instead of the absolute values,
    as they do at
    for instance,
    because, in an index chart, the changes of only a very small number of measuring points show the same behaviour as an aggregation of all points together would if they changed in the same manner. So, selecting the preferred stations and showing their index values is throwing sand in our eyes!
    (One example of this practice is the pretense that gas prices in Europe have by nature followed oil prices since the early 70s, to prove that heating oil and natural gas are comparable goods whose prices ought to rise and fall together. In truth, in the 70s and 80s, contrary to nowadays, only a vanishingly small number of gas purchase contracts were really influenced by the oil price; but the index chart of gas price changes shows a 1-to-1 consistency. But that’s nothing but a statistical trick!)

    Altering the position of the measuring stations under consideration and their area of influence is likewise just a trick to disguise what’s going on.

    One other info I want to mention here is the article of Ernst-Georg Beck, Dipl.Biol.
    “50 Jahre kontinuierliche CO2- Messung auf Mauna Loa- Kurzfassung”

    “Who measures …. ” !

    Constant wrong measuring, or interpretation of the measurements ????
    What a mess-uring!


  89. Pingback: John Coleman’s hourlong news special “Global Warming – The Other Side” now online, all five parts here « Dark Politricks Retweeted

  90. Pingback: John Coleman’s hourlong news special “Global Warming – The Other Side” now online, all five parts here « Dark Politics

  91. tj says:

    Great website, thanks for your hard work.

    What do you think about the satellite based data on global climate change? After a quick Google search, I discovered that the original satellite data showed global cooling since 1979. More interesting is that the ‘scientists’ went to work on the data, and voilà, after correcting for ‘errors in the data’, the satellite data is consistent with global warming.
    “Previously reported discrepancies between the amount of warming near the surface and higher in the atmosphere have been used to challenge the reliability of climate models and the reality of human induced global warming. Specifically, surface data showed substantial global-average warming, while early versions of satellite and radiosonde data showed little or no warming above the surface. This significant discrepancy no longer exists because errors in the satellite and radiosonde data have been identified and corrected. New data sets have also been developed that do not show such discrepancies.”
    (from Wikipedia, but it has links to original sources –

    Here is the nasa website link with the original story about satellite based global cooling.

    I hope you get a chance to take a look at the ‘error’ correction method used to reverse the satellite based results. Any time we point out that the evidence of global warming is doctored, the global warming folks point out that the satellite based data is totally independent from the ground based observations and that it also shows global warming.


    REPLY: [ Everybody needs to “pick a row to hoe” and I started by picking GIStemp. Got through most of it (up to STEP2 in detail, STEP3 in rough form, not yet STEP4_5) and discovered that it had “issues” but that the data coming IN to GIStemp had bigger issues. So I swapped rows and started digging at GHCN. Now I’ve got 2 half finished rows…

    So my “intent” is to finish up GHCN ( I have Africa yet to post for dT/dt – and I want to make a ‘presentation quality’ version of the dT/dt code; then I want to make a version of dT/dt that can better suppress “splice artifacts”. I wanted this present version to “detect them”, but now I’m curious what’s REALLY going on if you can cleanly splice the data… don’t know if that’s an impossible hill to climb or not… so far it looks like everyone who’s chewed on it has broken their teeth trying…) then I’d really like to finish up running GIStemp through STEP4_5 and doing a set of A/B compares of its product to the “clean” product. At that point (probably a year away? I can only work on this part time / spare time…) I’ll be looking to “pick a new row”.

    In comments elsewhere I’d noted that the 50 GHz band the sats use is now getting a load of ‘unlicensed usage’. Panasonic and some others have a “wireless TV” connection for HDTV, plus a variety of radar uses. So we’re rapidly adding a load of “noise” that is unregulated ‘in spectrum’. Further, there is every reason to suspect that as heat is dumped from the oceans through the air, you could get local higher temps in the air. (That pesky old “temperature is NOT heat” problem) So any particular PART of the planet getting warmer may only be telling you about HEAT flow out, not in. There are other issues too… and attractive ones… but I need to keep some amount of focus or I’ll be scattered all over the place. (But just Google “50 GHz” with either “Radar” or “TV” and see what pops up. 60 GHz too. IIRC there was even ground based weather radar in there… so a satellite looking down from space is “seeing” exactly what again? And an increase in it means what?… )

    Like you, I hit the “We adjusted it to fix it” wall and just said “OK, so you are buggering the data too. Never mind…” Maybe they are doing the right thing, but given the history of “climate science” to date, I strongly doubt it. The product of GIStemp is horridly and demonstrably broken. If I have one known broken clock and another one agrees with it, I do not expect that the mangled one suddenly healed itself; I suspect that I’ve got two broken clocks…

    But for now I’m hoeing two rows, and adding a third would be a bit too much. -E.M.Smith ]

  92. tckev says:

    Like many others here I do not follow all the math and statistical manipulations shown here but a friend of mine brought the AGW arguments into perspective by comparing it to manipulating digital photography.
    Her 10 steps are:

    1. Take a face picture of someone you know well at high resolution, at the largest size you can fit on your screen. Keep this as the reference image file.
    2. Reduce this image’s resolution by at least 2/3 and add noise until the image is recognizable but visibly grainy. Save this as our sample image.
    3. Divide the sample image into a roughly rectangular 20×20 grid of randomly sized, slightly overlapping image subsamples.
    4. Save these subsamples as numbered files 1 to 400.
    5. Of the subsample files, delete:
    a) those that contain the top of the nose,
    b) the one(s) that contain one eye,
    c) and 10 others picked at random.
    6. Each remaining subsample file is processed by randomly reducing the resolution, the contrast, and changing hue by a random amount. Then apply Gaussian blur filter until the subsample image is a uniform color. Finally resize each subsample image by a random amount but no more than + or -10%, and save the subsample file.
    7. Reassemble all the subsample files to make a new sample image. Any gaps in the image are filled in by copying the subsample image next to it until all the gaps are filled.
    8. Apply a gaussian blur filter to the whole image until subsample discontinuity artifacts are no longer obvious.
    9. Resize and increase resolution of the new sample image to the same values as the original reference image.
    10. Compare new sample image with the reference image.
    I’m sure if you look closely some definition has been lost between the two images. /sarkoff

    Hope this makes sense. Is it a fair comparison?
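
    The ten steps above can be tried on a toy “image” without any photo software. This is only a rough sketch under stated assumptions: the 20×20 grid of numbers, the function names `fill_gaps`, `box_blur` and `mean_abs_error`, and the use of a 3×3 box blur as a stand-in for the Gaussian blur are all made up for illustration, and the random resizing and hue changes are left out.

```python
import random


def fill_gaps(grid):
    """Fill missing cells (None) by copying the nearest known cell in the row."""
    filled = [row[:] for row in grid]
    for row in filled:
        known = [v for v in row if v is not None]
        if not known:
            raise ValueError("row has no data to copy from")
        last = known[0]  # leading gaps copy the first known value
        for c, v in enumerate(row):
            if v is None:
                row[c] = last
            else:
                last = v
    return filled


def box_blur(grid):
    """3x3 box blur with clipped edges (a simple stand-in for Gaussian blur)."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def mean_abs_error(a, b):
    """Average absolute difference between two grids of the same shape."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


rng = random.Random(1)
size = 20
# Reference "image": dark background with one bright square feature.
reference = [[1.0 if 8 <= r < 12 and 8 <= c < 12 else 0.0
              for c in range(size)] for r in range(size)]

# Step 2: add noise.  Step 5: delete some cells at random.
sample = [[v + rng.gauss(0.0, 0.1) for v in row] for row in reference]
cells = [(r, c) for r in range(size) for c in range(size)]
for r, c in rng.sample(cells, 12):
    sample[r][c] = None

# Steps 7-8: fill the gaps from neighbours, then smooth.  Step 10: compare.
result = box_blur(fill_gaps(sample))
error = mean_abs_error(reference, result)
print(f"mean absolute difference from reference: {error:.4f}")
```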

  93. E.M.Smith says:

    It’s close!

    A bit over the top, but a usable example could be made from it.

    The reality is more like having a picture of your girlfriend as the first one, then getting a second picture of your dad. Your dad’s picture has some of the grids missing and some of them with wrong contrast and with other shifts.

    These two images are then averaged to make your ‘baseline’.

    Now you get 2 new pictures, with missing blocks (non-Nyquist) and with some contrast, brightness and color shifts (siting issues and errors, instrument change and station drops). You take these two pictures and use one to fill in missing bits from the other, sometimes filling in from nearby blocks, sometimes from the other picture, sometimes making it up from an average. Then the result gets run through your smoothing filter.

    Now you compare the result to your baseline average and decide if the pictures you got were more like your sister or your dad.

    In reality it was a picture of your dog and your mom.

  94. jim says:

    If you don’t have enough to argue about already, there is a guy on JC’s new blog disputing your findings on GISSTEMP, Tom Curtis I believe it was.

  95. E.M.Smith says:

    I finally got time to take a look. The commenter is missing a couple of things. They point to the inventory file to say “look there are so many” in essence. Far more than my about a hundred. Yet the inventory file shows all thermometers ever used, while my number of 136 was only in the year 2009. Further, they seem to think I was talking about globally, while that number is specific to the USA (the global number was about 1200 IIRC). So I’m underwhelmed at their inability to read and notice little words like “USA” and “2009”.
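
    The difference between the two counts can be sketched in a few lines. The records below are made-up placeholders (hypothetical station IDs, countries and years), not GHCN data, and `stations_reporting` is a name invented for this sketch. It only illustrates why “every thermometer in the inventory” and “thermometers reporting in one country in one year” are different numbers:

```python
# Each record: (station_id, country, year with data). All values hypothetical.
records = [
    ("US001", "USA", 1950), ("US001", "USA", 2009),
    ("US002", "USA", 1950), ("US002", "USA", 1980),
    ("US003", "USA", 1950),
    ("CA001", "Canada", 1950), ("CA001", "Canada", 2009),
    ("CA002", "Canada", 1980),
]


def stations_reporting(records, year=None, country=None):
    """Return the set of station IDs matching the optional year and country."""
    return {
        station_id
        for station_id, station_country, data_year in records
        if (year is None or data_year == year)
        and (country is None or station_country == country)
    }


inventory = stations_reporting(records)                  # ever used, anywhere
in_2009 = stations_reporting(records, year=2009)         # reporting in 2009
usa_2009 = stations_reporting(records, year=2009, country="USA")

print(len(inventory), len(in_2009), len(usa_2009))
```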

    Also, since that date, GIStemp has “put back in” the USHCN thermometers (that were DOA from about 2007 until November 2009, again IIRC). So the used TODAY number for the USA will be much higher, as they fixed the broken USHCN handling. So the commenter has no sense of the passage of time and events.

    I suppose I could get all wound up in re-writing history every time something changes, but then I’d start to feel like I worked at NASA or NCDC ;-) Can’t have that, now can we? 8-)

    At any rate, as I’ve said a few dozen times before, and will say a few dozen more, I guess: I’m not so much interested in admiring the thousands of ways that folks get things wrong as I’m interested in finding out what is really happening. So I avoid getting pulled into ‘cat fights’ and ‘he said / she said’ postings. I put here what truths I’ve found. Folks can take them or leave them. If they want to toss rocks about them, they can go do that elsewhere where I don’t have to watch the mudslinging.

    I know, not the “social norm” in the blogosphere where folks are looking for hit counts and controversy to generate traffic. But I treat this more like my personal lab book than a newspaper gossip column. There is limited time in life, and I’d rather spend it learning something interesting than to spend it cleaning other folks’ mental diapers. I don’t need to fight them, and I don’t need to tend them. I just need to be me.

  96. Pingback: A year of blogging… | Digging in the Clay

  97. Pingback: That horrible Global Warming | Sullivan's Travelers

  98. For my result of 40 years of working science, and writing about it for the last 4 years,

    Please go to

    Thank you

    Bruce A. Kershaw

    $25,000.00 Reward for the proof mankind is responsible for global warming and/or climate change.

  99. gallopingcamel says:


    I just found this amazing post. Awesome! Lacking your programming skills I have been processing GHCN v2 data set using “gedit” to reduce the files to sizes that Open Office 2.4 can handle.

    As you can imagine this is a laborious process, especially as I handle data gaps manually and that makes my output suspect. I am going to try your approach even though it is way above my pay grade!

    Thanks for using “tarballs”; that improves my chances of success! Could you send a link to the latest version of the code, please?

    Last year I visited the GHCN folks in Asheville to discuss the “Great Dying of Thermometers”. I failed to make a convincing case for the idea that the folks at GHCN had killed the thermometers to create a warm bias, so I gave them the benefit of the doubt; perhaps I was trying too hard not to “burn bridges”:

  100. E.M.Smith says:


    I understand that MS Excel can now handle all of GHCN in one go. Verity Jones at Digging in the Clay has, I think, done so.

    Also there is a ‘Sister Site’ that has the data in an online database with some graphing software.


    The last version I ported is the one documented here. The “latest” is on the NASA site as in the referenced files here.

    I’ve not done a “port” of the latest, but the last few times I’ve looked they just glued on some new code and never did much to change the old stuff…

    Per “killing the thermometers”: That implies intention. I’ve called it “The Great Dying” as that is intention neutral. Basically, I don’t know if “It was murder or negligence”.

    It could easily be that they were sucking their own exhaust so much that they actually believed it didn’t matter how many thermometers you change in a calorimetry experiment and were quite happy to have a large number for the historical baseline and only use 1/10 th that number now (believing their magic sauce would fix things, when it can’t).

    For now, I’d suggest that you look through the code on line here, perhaps download the code as it stands today at NASA, compare the two, and see if you really want to proceed from there.

    I came to the conclusion that GIStemp “screws the pooch” and at that point continuing to actually USE it seems a bit silly. So mostly I’ve looked at how it got things wrong.

    To see what the data actually look like, I made my own code ( dT/dt ) which has the source code up in the relevant articles here:

    though a bit spread through the articles. It is MUCH less code, as I leave out a lot of the “crap” in GIStemp that just obfuscates.

    has all the GIStemp stuff.

    Has some interesting bits on NCDC (i.e. GHCN) issues too.

    If you really want a tarball of what I’m running, let me know and I’ll find somewhere to put it… But it’s now a few “revs” out of date…

    I have been making maps from a composite of the total station input into the TD 3200 COOP summary of the day, separated into cycles of 6558 days, with each date in a cycle matched to the date at the same position in the other three cycles of data. I end up with ~4800 stations whose 75th percentile separation distance from the nearest neighbor is ~0.1 degree.

    When I gridded the data using the program’s defaults, then cleaned it up some by selecting better choices for resolution and amount of smoothing, I got the maps presented on the web site

    I wanted to add Canada to the forecast base, so I have downloaded, extracted, and converted the data to F & ” from metric, and the programmer is tabling the data by date at this time.

    In the process of looking at the BEST settings to use to make the grids from which to generate the maps, I found that I could close the search ellipse from the default [extent of data range] down to 5 degrees and still have on average a hundred data points in which to find the 8 nearest neighbors. By changing the grid sample area from 0.5 degrees down to 0.05 (~3 mile squares), I got much better resolution of the contours of the precipitation patterns, to the point where the type of weather front can be determined from the composite of the last four cycles that match today, and used as a forecast.
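
    The neighbor search described above can be sketched as follows. This is not the mapping program’s actual method, which the comment does not specify. The sketch assumes a circular search radius in degrees instead of an ellipse, a flat latitude/longitude distance, and inverse-distance weighting as the way neighbors are combined. The names `k_nearest` and `grid_value` and the three sample stations are hypothetical.

```python
import math


def k_nearest(node, stations, k=8, radius=5.0):
    """Return up to k (distance, value) pairs within `radius` degrees of node."""
    lat0, lon0 = node
    candidates = []
    for lat, lon, value in stations:
        # Flat-map distance in degrees; ignores longitude convergence.
        distance = math.hypot(lat - lat0, lon - lon0)
        if distance <= radius:
            candidates.append((distance, value))
    candidates.sort(key=lambda pair: pair[0])
    return candidates[:k]


def grid_value(node, stations, k=8, radius=5.0, power=2.0):
    """Inverse-distance weighted mean of the nearest stations, or None."""
    neighbors = k_nearest(node, stations, k, radius)
    if not neighbors:
        return None  # nothing inside the search radius
    weighted_sum = 0.0
    weight_total = 0.0
    for distance, value in neighbors:
        if distance == 0.0:
            return value  # node sits exactly on a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (latitude, longitude, temperature in F).
stations = [(40.0, -100.0, 50.0), (40.0, -99.0, 60.0), (60.0, -100.0, 0.0)]

print(grid_value((40.0, -99.5), stations))   # midway between two stations
print(grid_value((0.0, 0.0), stations))      # no station within 5 degrees
```

    Shrinking `radius` trades coverage for locality: distant stations stop influencing a grid node, but nodes with no station in range get no value at all.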

    When I looked at temperatures with the increased resolution, I changed the contour bands from 10 degrees F to single-degree contour lines, and at a raw 1600 dpi I can zoom into a 20 mile square. At that view, patterns of warm and cooler areas can be seen that, when compared to Google satellite maps, fairly well define the urban centers as well as other areas with similar surface contours that haven’t been developed yet. There naturally appear spots where the hills and water runoff combine to be more habitable due to being sheltered from cold blasts. The first settlers chose those spots on river systems to start from, so the oldest stations will still be in the most sheltered areas now.

    The temperature contour maps can show detail about small towns, reservoirs, wooded areas, national parks… plots as small as a couple of square miles if there is contrast. The outlines of the intermountain valleys, where agriculture is ongoing because it is possible, show nice trends that shift with the seasons.

    What I am trying to say is that if you use all of the original daily data from the total database, dropping stations only on the individual days when the station data is bad or missing (which leaves some loss of resolution for that spot for that day) or at end-of-record strings, the resolution matches the most detailed real time maps available, as if all stations were reporting live.

    It will be about three weeks, or maybe more, to bring this online with all of Canada and the USA, Alaska included, with the higher res maps. I have ordered the Australian BOM data set for all stations for the full length of record, in sets of raw daily collections of temperatures, dew point, precipitation, daily snowfall totals, and barometric pressures where taken daily. I am going to get records ending April 1st 2011, so to test the accuracy of the method I am going to start the Australian forecast of composite maps back in June 2010 and post them side by side with the actuals for each day, to compare the timing of the shift from dry to wet they had this cycle with the shift in precipitation patterns in past cycles.

    Would you be interested in looking over the final code we end up with to check for obvious errors?

  102. E.M.Smith says:

    @Richard Holle:

    Fine with me to “take a look”.

Comments are closed.