So Long HP; Seeking Computes…

So I’m posting this from my Samsung Galaxy Note.  I bought a bluetooth keyboard for it.  Still needs a mouse.  I can ‘sort of type’ on the keyboard.  It is a bit slow and bouncy, and the key spacing is a bit off, so more ‘fixes’ needed.   But a world better than the faux keyboard on the screen with hunt and peck.

Still sucky at ‘mark text cut and paste’.  Hoping a mouse fixes that.   For now it is what it is.

And it is what I have.

The HP Laptop is dying.  

First the battery said it would not charge.  That was about 2 years ago.  So it has lived on life support to the wall.

Then the keyboard lost its key markings.   No big.  I touch type.  That was about a year ago.

Now it is saying, at boot, that it has detected a fan not working and it will shut down in 15 seconds.  (Or I can choose to continue and it might end horribly….)   So that’s where I’m at.  

Need to suck off about 1/2 TB of data (mostly on backups already… in California…) and move onto a new laptop.   Until then, this poor excuse for a typing station is what I’ve got.  Barely workable for straight text posting.

OK, I took a look at Chromebooks.  Found one that would let me install Linux, with a 350 GB hard disk.  No longer made… Sigh.  The others are all SSD (Solid State Disk) now.  OK.  Except FLASH has limited write / erase cycles (about 10,000 to 100,000) and forgets things in a few years (so those archived SD cards of photos ‘forget’ in 5 to 10 years…)   Put Linux swap on one of those, you have a brick in months… or weeks…   

All the “Windows Afflicted” are Win8, with UEFI, so buggered and not secure no matter what you do.  Can’t even slide TrueCrypt under the OS.

Dell has a Linux laptop sold by Amazon, so I can wait a long time and hope the BIOS is not UEFI buggered.  Sigh.


So postings will be thin until I work this out.  It would be easier if I was happy with Micro$oft crap and lack of security, but I’m not.  I’ll likely get an old Win-7 laptop used and just scrub it.


I did try using my Raspberry Pi.  Got it to work with the HDMI TV and external keyboard.  The TV has flicker and with X windows running it is way slow in Firefox.   Midori (sp?) was fast, but neither had any video player.  Sigh.  Again.   So I’m thinking if I want a ‘mix your own’ workstation I need a different card / computer for it.   


So I’m “seeking computes” and largely limited to small comment sized blocks of typing until I sort this out.  Oh, and I second Anthony’s distaste at the “beep beep boop” editor at WordPress.  Stupid and sucky and slow.

Posted in Human Interest | Tagged | 21 Comments

What! A coincidence?

It’s all just a coincidence, I’m sure…


Correlation of USHCN adjustments with CO2

This post is not a joke, but is stunning.

The graph below shows the relationship between atmospheric CO2 and the magnitude of USHCN data tampering. There is almost perfect correlation between the amount of CO2 in the atmosphere and how much cheating our friends at NCDC are doing with the US temperature record.

That text is from above the image in the original posting, so that “below” in the text is “above” in this re-post. Still, it’s a bit hard to accept that it’s all just a coincidence. Note the “R” value. 0.988 is about as close as you can get in any real world test.
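
For anyone who wants to check an “R” value like that for themselves, the Pearson correlation is easy enough to compute by hand. This is only a minimal sketch; the numbers in it are made-up placeholders (hypothetical), not the USHCN adjustment or CO2 series behind the graph.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Placeholder values only (hypothetical), just to show the call.
co2_ppm = [310.0, 320.0, 335.0, 350.0, 370.0, 390.0]
adjustment = [0.05, 0.12, 0.30, 0.42, 0.61, 0.80]
r = pearson_r(co2_ppm, adjustment)
print(round(r, 3))
```

Worth keeping in mind when reading any such number: two series that both rise steadily over the same period will score a high R against each other.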

In comments several folks make remarks about it being linear while the CO2 effect is supposed to be non-linear… but maybe the CO2 effect on adjustments is different ;-)

Posted in AGW and GIStemp Issues, NCDC - GHCN Issues | Tagged , | 54 Comments

GIStemp – who needs Antarctic data or temps near ice.

I finally got GIStemp source code loaded into the Qemu SPARC emulator, unpacked it, and did a preliminary assessment.

First off, it is way out of date. It looks like it is a 2011 release from just prior to a major revision. So no source code for 1/3 of a decade or so AND a major revision. Not exactly caring about that whole ‘currency’ thing…

Next up, in looking at the things that did change, I noticed some things. First off, it points to a link for the Antarctic data that no longer works. Did it work in 2011? Who knows. But now it has a nice glitzy interactive web site that lets you download bits and pieces and look at bits and pieces… but not grab the data sets that are used as input to GIStemp. This means that either they have a very different way of getting the Antarctic data, or they just don’t bother to use it any more.

In this download, the same source is used as in the older version described here:

Except that site now has a user friendly interface, not somewhere that you can download the data as a file.

So look around at the interface on those pages. You can ‘hunt and peck’ for a station detail, on a point in time basis. Download the data set? Not so much… Then again, notice on the ‘temperatures’ page linked above; the last update date:

“Last modified : Friday 6 February 2009”

So… just what Antarctic data ARE used by GIStemp? Data, we don’t need no steenking data! (A reference to a movie with the line “Badges? Badges! We don’t need no steenking badges!” given by some Mexican banditos pretending to be Federal agents.) Now this isn’t as bad as it looks (or maybe it is worse…) since most of those stations don’t have any modern data anyway. Click on a few. Couple of decades for many. Often from disjoint moments in time. So again: Just what Antarctic data are used in GIStemp? Say, from this millennium?

A screen capture of the listing of the input_files directory where the Antarctic data is bundled in with GIStemp sources is somewhat revealing.

GIStemp input_files datestamps

Notice that the Antarctic data included in the download are from ‘about’ the same vintage as the source code. 2011 is when I think they last updated. Some of the antarc data files are 2011, some 2010, some 2009… though the 2009 is marked ‘old’. So when did the Antarctic site break the download? Has there been any update since? Does GIStemp even use any Antarctic data anymore? Who knows.

But wait, there’s more…

Since GIStemp smears data via ‘homogenizing’ to places where it has no data, from places where it does have data, and since those places can be up to 1200 km away (though it does this smear in three different steps, so really might be ‘smearing a smear’ from up to 3600 km away) it can just fill in the temp data using ‘nearby’ stations. In the final steps GIStemp adds in the Hadley Sea Surface Temperature. So what is wrong with using cold temps from near the ice to fill in over the ice? Well… How about if you decide to just not use temperatures from places where the water is near the ice?

Remember that you can click on the images to make them larger and easier to read. This is an interesting bit of code. It is in STEP4 and the whole listing (minus the added bit described in a panel below) is in this posting:

In looking at the date stamps in STEP4_5, I noticed that it had changed. Why? I wondered. It isn’t a USHCN related step, and doesn’t depend on USHCN.V1 vs V2 changes (like the rest of the changed files).

GIStemp 3vs4_5 datestamps

Notice that this program is the only one with a 2011 date stamp. Everything else is quite old. So what changed? First off, to get oriented, let’s look at this part from the top of the code. (I downloaded this copy of the sources today, so this is as ‘fresh’ as it gets). In particular, we are looking for the meaning of three variables that are used in a very small fragment of code that was inserted into this program in 2011. They are I, J, and M. Look at the comments about lines 9, 10. They state that I is longitude and J is latitude. (You can also see where they are dimensioned and I gets 1-360 while J gets 1-180 – so 360 degrees of longitude and from -90 to +90 mapped onto 1 to 180). M gets the temperature for that location for any given month.

convert1.HadR2_mod4.f  top

Why focus on those variables? Because this code was added. It says to mark any cell, at any longitude (from 1 to 360 degrees around the globe), that falls in the latitude range near the pole as ‘bad’, meaning to toss out the data. Just don’t use it. Notice the comment.

GISTemp throwing out data from water near the ice at the pole

Yes, you read that right. The comment says:

“Skip some regions where SST is impacted by nearby ice floats”

That 166 to 180 loop says that anything from 76 N to 90 N is to have the data tossed out.



I found a description of which way the arrays run. I’d assumed N to S, but they run S to N, so the above line has been changed from 76 S to 76 N. So it is the N. Pole (where there is more water data anyway) that has the SST data tossed.

The output GRID is rectangular (i:West to East, j:South to North),
where the pole boxes might be of a different latitudinal size than
the other boxes (but North and South pole boxes having the same size).
if offI=0, Western edges of i=1 boxes lie on the international date line;
if offI=-.5, centers of i=1 boxes lie on the international date line;
if dlat=180./JM, all boxes have the same latitudinal extent;
if dlat=180./(JM-1), pole boxes are half boxes.

Though notice that there are some selections possible.
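
To make the index arithmetic concrete, here is a rough sketch of how a row index J maps to latitude under that grid description, and what a “mark rows 166 to 180 as bad” loop does. This is not the GIStemp Fortran. The function names, the BAD marker value, and the choice of JM = 180 with equal-sized boxes are my assumptions for illustration.

```python
BAD = -9999.0      # hypothetical missing-data marker
IM, JM = 360, 180  # assumed 1-degree grid: i = longitude, j = latitude

def lat_bounds(j, jm=JM, pole_half_boxes=False):
    """Return (south_edge, north_edge) in degrees for row j (1-based, South to North)."""
    if not 1 <= j <= jm:
        raise ValueError("j out of range")
    if not pole_half_boxes:
        dlat = 180.0 / jm                  # all boxes the same size
        south = -90.0 + (j - 1) * dlat
        return south, south + dlat
    dlat = 180.0 / (jm - 1)                # pole boxes are half boxes
    south = max(-90.0, -90.0 + (j - 1.5) * dlat)
    north = min(90.0, -90.0 + (j - 0.5) * dlat)
    return south, north

def mask_rows(grid, j_first, j_last, bad=BAD):
    """Mark every longitude in rows j_first..j_last (1-based, inclusive) as bad."""
    for j in range(j_first, j_last + 1):
        for i in range(len(grid[j - 1])):
            grid[j - 1][i] = bad
    return grid

sst = [[0.0] * IM for _ in range(JM)]  # dummy field, indexed sst[j-1][i-1]
mask_rows(sst, 166, 180)
print(lat_bounds(166), lat_bounds(180))
```

Under those assumptions row 166 spans 75 N to 76 N and row 180 spans 89 N to 90 N, so the loop blanks everything from roughly 75 to 76 N up to the pole, at every longitude.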

So any Arctic water temperatures can only come from places closer to the equator, and the Antarctic itself has very few currently reporting stations; and that data may or may not actually be making it into GIStemp.

Perhaps that is why, when we have an all time record ice area in the Antarctic, with record cold temperatures being recorded (by others…), GIStemp thinks it looks like this:

GIStemp polar view, May to October, 2013

This is the May to October period, so when it is cold at the South Pole. BTW, I have a series of sea surface graphs from that period, all showing cold water anomalies. Nowhere to be seen in GIStemp. But that is for another day to sort out. IMHO, this image is showing GISTemp having a Jump The Shark moment.

In Conclusion

I’ve got some more on GIStemp, but that will wait for another posting. The big lumps are just that the source code is out of date, the data looks like it is missing a chunk from ‘down south’, and in 2011 the code was modified to avoid water temps from near ice. One can only wonder what possible rationalization could exist for that change. IMHO, it is totally unwarranted by any means. Might as well just start dropping any thermometers in cold places…

Posted in AGW and GIStemp Issues, GISStemp Technical and Source Code | Tagged , , , , | 33 Comments

Data Sources – A List

Over the years I’ve had several postings with ‘source of the data’ link for one thing or another. After a while, you get tired of digging them up again and trying to remember what was in each one. So what I’m going to do here is pretty simple: Put up a set of site links (as sometimes the link to the specific data goes stale when they delete it and replace with a ‘new historical’ set of data-food-product…) along with links to the current detail data links, and a statement or two for some of them saying “what is there”. (Adjusted, un-adjusted, adjusted silently via “QA”, etc.)

I don’t expect that I have everything, nor that I’ve got a description for all of it, so I’ll be putting this posting up “medium rare” and adding to it for a few days. If you have other ideas or sources, please post a comment. I’ll collect the sources into the head posting over time.

FWIW, I’ve been a bit of a packrat over the years. I have a GHCN v1 copy (or two or three…) along with a few GHCN v2 copies. Sometimes I’d download just the ‘adjusted monthly average’, sometimes I’d grab more. It depended on what I was doing, how much disk was free, and what I was thinking at the time (such as “gee, NOAA would never delete the old data, I don’t need a copy of V1 Daily”…) So my personal collection is a bit eclectic. It is also scattered over 1/2 dozen computers on both sides of the country and a few dozen CD / DVD backup disk sets…. But, someday, I hope to collect it all into some kind of a valid “history of the changing history”. Once I have a decent format and layout, I’ll be looking for an archive site where I can put up a few gigabytes for anyone else to use.

Why keep old data copies? For postings / discovery like these comparing version 1 with version 3 data:

As it stands now, I’m pretty sure I’ve got the GHCN semi-raw daily data (if the description can be believed – it looks like QA flagged, but datum still in place) for some large set of stations along with several sets of USHCN. I don’t have much at all from Hadley, but will likely packrat that too. With SD cards at $1 / GB and DVDs for backups at about 20 ¢ / 4 GB or so, it just isn’t all that expensive to “make a set”. (The hard part for me has been keeping them organized and ‘near me’ ;-) So pointers to other datasets in need of protection / archiving would be appreciated, along with any old archival copies an individual might have that they would like to see a broader audience.

With all that said, here’s the “Draft Alpha Posting” on temperature data sets. Do remember, I’m actively updating this list over the next few days in dribs and drabs, so don’t expect it to be done; expect it to be a construction project.


The National Climatic Data Center (NCDC) – supposed to be the great guardians of the data, and they do have a nice archive, but it is a bit slim on “original raw only” and on version control. More than some others though (like GISTemp that is ‘never the same way twice and no version history kept’). In software development there are dozens of “version control” bits of software that let you roll forward and back to any particular revision while storing things efficiently. (From CVS to RCS to Git to…) It would seem that folks in “Climate Science” are unfamiliar with these tools, not even having a source code archive to display changes. Oh Well. It is what it is.

Top Page:

Lists several products and has a nice interface to the data.

Climate Data Online (CDO) provides free access to NCDC’s archive of historical weather and climate data in addition to station history information. These data include quality controlled daily, monthly, seasonal, and yearly measurements of temperature, precipitation, wind, and degree days as well as radar data and 30-year Climate Normals. Customers can also order most of these data as certified hard copies for legal use.

I note in passing the lack of a statement that the original source data ‘unadorned’ (i.e. raw) is also included. It looks as though the QA status is a flag on the daily items and that the reading is still there; but I’ve not proven that via actually looking at the data archive. (It’s a bit large ;-) though you can download individual years if desired and they are smaller). IFF that is true, this is a very good starting point. It would show the actual reading, the QA assessment, and you can decide. Then since it is Daily Min / Max you can calculate your own trends through either, or various daily, weekly, monthly, ‘whatever’, averages. It is what I think needs more exploration sooner. I’ve downloaded a copy of daily data (many GB and many hours) but not yet unpacked it as I’d already filled up my disk with ‘other stuff’… So it will be a while before I can give it a look and assure it’s what I think it is (and described just above – yes ‘trust but verify’ applies to my own work too, and especially to my speculations).
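
As a sketch of the ‘calculate your own averages’ idea: given daily Min / Max readings, a monthly mean is just a grouped average. The record layout here, (date string, tmin, tmax), and the sample values are hypothetical; the real archive needs parsing first.

```python
from collections import defaultdict

def monthly_means(records):
    """records: iterable of (date 'YYYYMMDD', tmin, tmax) tuples.
    Returns {'YYYYMM': mean of the daily (tmin + tmax) / 2 values}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, tmin, tmax in records:
        if tmin is None or tmax is None:
            continue  # skip days missing either reading
        month = day[:6]
        sums[month] += (tmin + tmax) / 2.0
        counts[month] += 1
    return {m: sums[m] / counts[m] for m in sorted(sums)}

# Hypothetical readings in degrees C, just to show the shape of the input.
sample = [
    ("20140101", -2.0, 6.0),
    ("20140102", -1.0, 5.0),
    ("20140103", None, 4.0),   # incomplete day, skipped
    ("20140201", 0.0, 10.0),
]
result = monthly_means(sample)
print(result)
```

The same grouping works for Min only or Max only; swap the `(tmin + tmax) / 2.0` term for whichever reading you want to trend.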

One clicks on ‘datasets’ to get to the data, and that takes you here:

Climate Data Online

Annual Summaries
Daily Summaries
Monthly Summaries
Nexrad Level II
Nexrad Level III
Normals Annual/Seasonal
Normals Daily
Normals Hourly
Normals Monthly
Precipitation 15 Minute
Precipitation Hourly

Legacy Applications

COOP Daily / Summary of Day
Climate Indices
Extremes Monthly
Global Climate Station Summaries
Global Hourly Data
Global Marine Data
Global Summary of the Day
National Solar Radiation Database
Quality Controlled Local Climatological Data
Regional Snowfall Index
Snow Monitoring Daily
Snow Monitoring Monthly

I found the “Daily Summaries” most useful:

File:COOPDaily_announcement_042011.doc 	34 KB 	4/20/2011 	12:00:00 AM
File:COOPDaily_announcement_042011.pdf 	123 KB 	4/20/2011 	12:00:00 AM
File:COOPDaily_announcement_042011.rtf 	67 KB 	4/20/2011 	12:00:00 AM
all 		7/20/2014 	7:28:00 AM
by_year 		7/19/2014 	7:05:00 PM
figures 		2/6/2013 	12:00:00 AM
File:ghcnd-countries.txt 	3 KB 	9/20/2013 	12:00:00 AM
File:ghcnd-inventory.txt 	23798 KB 	7/15/2014 	8:40:00 AM
File:ghcnd-states.txt 	2 KB 	5/16/2011 	12:00:00 AM
File:ghcnd-stations.txt 	7709 KB 	7/15/2014 	8:40:00 AM
File:ghcnd-version.txt 	1 KB 	7/20/2014 	8:27:00 AM
File:ghcnd_all.tar.gz 	2521828 KB 	7/20/2014 	8:27:00 AM
File:ghcnd_gsn.tar.gz 	100441 KB 	7/20/2014 	8:27:00 AM
File:ghcnd_hcn.tar.gz 	278461 KB 	7/20/2014 	8:27:00 AM
grid 		7/19/2014 	8:34:00 PM
gsn 		7/20/2014 	5:34:00 AM
hcn 		7/20/2014 	5:35:00 AM
papers 		10/2/2012 	12:00:00 AM
File:readme.txt 	23 KB 	3/18/2014 	5:02:00 PM
File:status.txt 	28 KB 	1/10/2014 	12:00:00 AM

The ghcnd_all.tar.gz is a ‘tar’ tape-archive gzip format file. Helps to know / use Linux or Unix to unpack and untar it. As you can see, it is large at 2.5 GB. It is larger once uncompressed. (That is why I’ve not unpacked and inspected it yet…) Under the by_year directory you can get any individual year as a smaller and easier to swallow chunk. Just looking at a listing of it lets you see just how little data there is in the early years compared to recent years. Long term trends are really strongly biased by the very few early readings. We can’t really get quality long term trends out of the recent 50 years data, since there are 60 ish year cycles in weather (climate-change).
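
If disk space is the problem, you can at least size up a big tar.gz without unpacking it. A minimal sketch using Python’s standard tarfile module; the function name is mine, and I have not run it against the actual NCDC archive.

```python
import tarfile

def summarize_tar(path, limit=None):
    """Walk a .tar.gz without extracting it; return (file_count, total_bytes).
    total_bytes is the uncompressed size of the files counted."""
    count = 0
    total = 0
    with tarfile.open(path, mode="r:gz") as archive:
        for member in archive:        # streams through the archive in order
            if member.isfile():
                count += 1
                total += member.size  # uncompressed size of this member
            if limit is not None and count >= limit:
                break
    return count, total
```

Calling `summarize_tar("ghcnd_all.tar.gz")` would read through the whole 2.5 GB once, which takes a while, but needs no scratch space. Pass `limit=100` to peek at just the first hundred files.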

File:1763.csv.gz 	4 KB 	7/19/2014 	7:02:00 PM
File:1764.csv.gz 	4 KB 	7/19/2014 	7:02:00 PM
File:1765.csv.gz 	4 KB 	7/19/2014 	7:04:00 PM
File:1766.csv.gz 	4 KB 	7/19/2014 	7:03:00 PM
File:1767.csv.gz 	4 KB 	7/19/2014 	7:00:00 PM
File:1768.csv.gz 	4 KB 	7/19/2014 	7:02:00 PM
File:1769.csv.gz 	4 KB 	7/19/2014 	7:03:00 PM
File:1770.csv.gz 	4 KB 	7/19/2014 	7:02:00 PM
File:1771.csv.gz 	4 KB 	7/19/2014 	7:00:00 PM
File:1910.csv.gz 	40216 KB 	7/19/2014 	7:05:00 PM
File:1911.csv.gz 	41844 KB 	7/19/2014 	7:02:00 PM
File:1912.csv.gz 	43604 KB 	7/19/2014 	7:02:00 PM
File:1913.csv.gz 	44876 KB 	7/19/2014 	7:02:00 PM
File:1914.csv.gz 	46453 KB 	7/19/2014 	7:00:00 PM
File:1915.csv.gz 	47889 KB 	7/19/2014 	7:00:00 PM
File:1916.csv.gz 	49533 KB 	7/19/2014 	7:05:00 PM
File:1917.csv.gz 	49795 KB 	7/19/2014 	7:01:00 PM
File:1942.csv.gz 	77718 KB 	7/19/2014 	7:01:00 PM
File:1943.csv.gz 	78514 KB 	7/19/2014 	7:03:00 PM
File:1944.csv.gz 	80328 KB 	7/19/2014 	7:03:00 PM
File:1945.csv.gz 	82367 KB 	7/19/2014 	7:03:00 PM
File:1946.csv.gz 	82934 KB 	7/19/2014 	7:02:00 PM
File:1947.csv.gz 	84043 KB 	7/19/2014 	7:05:00 PM
File:1948.csv.gz 	100697 KB 	7/19/2014 	7:03:00 PM
File:1949.csv.gz 	114757 KB 	7/19/2014 	7:01:00 PM
File:1950.csv.gz 	117901 KB 	7/19/2014 	7:01:00 PM
File:2008.csv.gz 	179740 KB 	7/19/2014 	7:03:00 PM
File:2009.csv.gz 	184554 KB 	7/19/2014 	7:03:00 PM
File:2010.csv.gz 	186549 KB 	7/19/2014 	7:02:00 PM
File:2011.csv.gz 	173750 KB 	7/19/2014 	7:04:00 PM
File:2012.csv.gz 	169154 KB 	7/19/2014 	7:00:00 PM
File:2013.csv.gz 	166776 KB 	7/19/2014 	7:03:00 PM
File:2014.csv.gz 	83161 KB 	7/19/2014 	7:04:00 PM

The degree of ‘instrument change’ over time is just incredible. Trying to do global calorimetry with that is really rather silly. But that is what ‘climate scientists’ do…

The ReadMe file has the interesting notes:

GHCN-D is a dataset that contains daily observations over global land areas. 
Like its monthly counterpart, GHCN-Daily is a composite of climate records from 
numerous sources that were merged together and subjected to a common suite of quality 
assurance reviews.

It is unclear without more digging just what ‘merging’ and ‘quality assurance’ has done to the readings, but a top text reading implies flagging, not replacement or deletion. A “Dig Here!” is to dig into the referenced links and papers to assure / deny that assessment.

This by_year directory contains an alternate form of the GHCN Daily dataset.  In this
directory, the period of record station files are parsed into  
yearly files that contain all available GHCN Daily station data for that year 
plus a time of observation field (where available--primarily for U.S. Cooperative 
Observers).  The obsertation times for U.S. Cooperative Observer data 
come from the station histories archived in NCDC's Multinetwork Metadata System (MMS).  
The by_year files are updated daily to be in sync with updates to the GHCN Daily dataset. 

Just why 1770 needs daily updating is unclear… but the above listing quote does show daily update changes…
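
Here is a sketch of how one might read a by_year file and check whether QA-flagged readings are still present rather than deleted. The column order (station, date, element, value, three flag fields, observation time) is my assumption about the layout and should be checked against the readme and format documents before trusting any result.

```python
import csv
import gzip

# Assumed column order for the by_year CSV files (verify against the readme).
FIELDS = ["station", "date", "element", "value",
          "mflag", "qflag", "sflag", "obs_time"]

def read_by_year(path, elements=("TMAX", "TMIN")):
    """Yield dict rows for the chosen elements, keeping QA-flagged readings."""
    with gzip.open(path, mode="rt", newline="") as handle:
        for row in csv.reader(handle):
            row = row + [""] * (len(FIELDS) - len(row))  # pad short rows
            rec = dict(zip(FIELDS, row))
            if rec["element"] not in elements:
                continue
            rec["value"] = int(rec["value"])
            rec["flagged"] = rec["qflag"].strip() != ""
            yield rec

def count_flagged(path):
    """Return (total_readings, flagged_readings) for one yearly file."""
    total = flagged = 0
    for rec in read_by_year(path):
        total += 1
        flagged += rec["flagged"]
    return total, flagged
```

Something like `count_flagged("1770.csv.gz")` on one of the tiny early years would be a cheap first test of the ‘flagged, not replaced’ reading of the documentation.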

There are also pointers to other links and other data archives that need some kind of cross check to figure out if they are different or the same, change over time or not, and just what IS the real historical data…

Further documentation details are provided in the text file ghcn-daily_format.rtf in this directory.

Users may find data files located on our ftp server at 
There is no observation time contained in period of record station files. 

GHCN Daily data are currently available to ALL users at no charge. 
All users will continue to have access to directories for ftp/ghcn/ and ftp3/3200 & 3210/ data at no charge.

For detailed information on this dataset visit the GHCN Daily web page at

You would think there would be ONE historical set of original RAW data, but no… not found it yet… there looks to be many copies, all with slightly different processing, mixes, histories, updates, “quality assurance”, etc…

Down in the footer of the top page is a comment that they have a ‘legacy’ site with data sets not yet migrated to the new site. Probably worth some archival time…

Data Here:

There is a ‘data set’ option in a drop down on the right. Has some satellite as well as ground data. Could likely spend a month on that just sorting it out, and archiving the bits that are valuable. I’ve not explored it yet, but the implication is that this ‘older’ site / version is going to go away once brought up to date. No idea if that means “copy over intact” to the new site; or “convert and expunge the original past”.

They also have a USHCN set of data, but they describe it as a subset of the GHCN. More details here:

It comes in versions too… they have Version 2.5 up now.

Has the various versions listed (v1, v2, v2.5) as directories. Also has a ‘daily’ directory.

File:ushcn_01.tar 	130275 KB 	11/7/2002 	12:00:00 AM
File:ushcn_98.tar 	122824 KB 	3/23/2000 	12:00:00 AM

At a few hundred MB, not that large to snag a copy. Though one wonders what happened since 2002 as the newest date stamp.

V1 has only a ‘metadata’ directory and v2 has only ‘monthly’. No idea where to get the old v1 and v2 data now.

Hadley CRU Climate Research Unit

The English archival site. Infamously having said that they “lost the original data”, but that post processing “improved” versions were available, and besides, it was mostly like GHCN that was at NCDC. (And would that be GHCN version 1, 2, or 3? Hmmm??? As the data Langoliers are busy rewriting it…)

At any rate, they may have something of value; but I’d likely use the NCDC daily data instead. (Oh, NCDC also has a long list of ‘data food products’ that they have post processed in various ways. As, IMHO, those are NOT data but are processing products, I’ve not listed them here. From the same top link you can get to their processed, adjusted, homogenized, etc. stuff; I’ve left that out of this posting.) For Hadley, since they lost the actual data, all you get is their post-processing data-food-products.

Top Link:

It claims that data are available.

Data Here:


Interesting graph of it with a Very Nice plunge at the end in the last decade or so…

Hadley graph of Central England Temperature

Gridded Monthly combined land / sea data-food-product:

Has a data download link on the page, but mostly just looks like you can get the post-processed data-food-product as numbers instead of as scary pictures. Don’t see the point, really.

They may have something else interesting there, but I’ve not put the time in to find it. There is a link to “CRUTEM4” that claims to be actual data (for some degree of ‘data’):

This page describes updates in CRUTEM4 version CRUTEM. Previous versions of CRUTEM4 can be found here. Data for CRUTEM. can be found here.
Additions to the CRUTEM4 archive in version CRUTEM.

The changes listed below refer mainly to additions of mostly national collections of digitized and/or homogenized monthly station series. Several national meteorological agencies now produce/maintain significant subsets of climate series that are homogenized for the purposes of climate studies. In addition, data-rescue types of activities continue and this frequently involves the digitization of paper records which then become publicly available.

The principal subsets of station series processed and merged with CRUTEM (chronological order) are:

Norwegian – homogenized series
Australian (ACORN) – homogenized subset
Brazilian – non-homogenized
Australian remote islands – homogenized
Antarctic (greater) – some QC and infilling
St. Helena – some homogenization adjustment
Bolivian subset – non-homogenized
Southeast Asian Climate Assessment (SACA) – infilling /some new additions
German/Polish – a number of German and a few Polish series – non-homogenized
Ugandan – non-homogenized
USA (USHCNv2.5) – homogenized
Canada – homogenized

In addition, there have been some corrections of errors. These are mostly of a random nature and the corrections have generally been done by manual edits. For a listing of new source codes in use, see below (end).

Largely homogenized and fermented to make data-food-product… something vaguely cheesy, but not real. (In the USA various artificial cheese like products must be labeled ‘cheese food product’ so you will not confuse them with real cheese. I’ve adopted the phrase ‘data food product’ to similarly identify things that are vaguely data like, but not really source data…)

At any rate, it is unclear to me just why I want such a data food product from Hadley.

They claim older versions are here:

They are all version 4. It is unclear to me what has happened to versions 1, 2, and 3 and where they might be found, though a quick search turned up this link for 3:

Similar searches on Crutem2 give:

So Steve McIntyre has likely got a set saved somewhere. The article has links in it that currently give 404 not found errors, so the official Version 2 data sets look to have hit the bit bucket. (Maybe I can talk Steve into sending me a set of V2 to archive… or a link to an archive. It would be amusing to do 2 vs 3 vs 4 compares someday…)

Interesting to note that CRU have the v2 links still up. One hopes it is the actual data, but it is what it is. (I’ve downloaded the data, whatever it might be)

Data for Downloading
ERRATUM: before 21st May 2003, the NetCDF versions 
erroneously used a time dimension with units "months 
since (startyear)-1-1" that started from 1. It should
 (and now does) start from 0.

gzipped ASCII	
zipped ASCII	
gzipped NetCDF	Last updated

crutem2.dat.gz    3.2mb       3.2mb       19.2mb     1.9mb 	2006-01-18

crutem2v.dat.gz   2.5mb      2.5mb       9.6mb    1.8mb 	2006-01-18

hadcrut2.dat.gz   4.5mb      4.5mb       9.3mb    3.5mb 	2006-01-18

hadcrut2v.dat.gz  4.2mb     4.2mb      8.4mb   3.3mb 	2006-01-18

absolute.dat.gz    47kb       47kb        63kb     40kb 	1999-07-13
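
The erratum above matters if you ever line up an old NetCDF download against a newer one: the time axis is off by one month. A small sketch of the conversion, where the function name and the `first_index` switch are mine and the start year has to come from the file itself:

```python
def month_offset_to_date(offset, start_year, first_index=0):
    """Convert 'months since (start_year)-1-1' to (year, month).
    first_index=0 is the corrected convention; first_index=1 mimics
    files made before the 21st May 2003 fix."""
    months = offset - first_index
    if months < 0:
        raise ValueError("offset is before the start of the series")
    return start_year + months // 12, months % 12 + 1

# 1900 is a placeholder start year, not taken from any CRU file.
print(month_offset_to_date(0, 1900))                  # corrected files
print(month_offset_to_date(1, 1900, first_index=1))   # pre-fix files
```

Both calls land on January of the start year, which is the whole point: the same month carries a different index depending on when the file was made.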

The CRU crew might have other stuff of interest, but frankly, I'm not very comfortable that it is accurate:

The one link for surface data I did follow did a circular run to Hadley / MetOffice / and on…

Someone else can explore all that further, if needed.


IMHO, not really a source of “data”. GISS (Goddard Institute for Space Studies) just takes in the NCDC data, munges it around a little, and calls it data. It isn’t. It takes in an already ‘adjusted’ data-food-product and further manipulates it according to a fixed algorithmic process that is a bit dodgy. They fill in missing bits from other stations up to 1200 km away (doing this three times in successive sections) so any given ‘data item’ might be a complete fabrication partially based on data up to 3600 km away. They also do a somewhat backward Urban Heat Island “correction” that doesn’t correct for urban heat. In the end, they are the data outlier in most results; but for some reason many folks like to look at their stuff.
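
To show what ‘fill in from stations up to 1200 km away’ means mechanically, here is a toy sketch of distance-weighted infilling. It is NOT the GIStemp code. The linear taper (weight 1 at the station, 0 at the radius), the function names, and the single-pass structure are assumptions for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def linear_weight(distance_km, radius_km=1200.0):
    """Weight 1 at the station, falling linearly to 0 at radius_km (assumed taper)."""
    return max(0.0, 1.0 - distance_km / radius_km)

def infill(target, stations, radius_km=1200.0):
    """target: (lat, lon); stations: list of (lat, lon, value).
    Returns the distance-weighted mean, or None if nothing is in range."""
    num = den = 0.0
    for lat, lon, value in stations:
        dist = great_circle_km(target[0], target[1], lat, lon)
        w = linear_weight(dist, radius_km)
        num += w * value
        den += w
    return num / den if den > 0 else None
```

Run a step like this three times in a row, each time treating the previous output as input, and a value can end up leaning on a reading as far as 3 x 1200 = 3600 km away.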

Data Input From: GHCN, USHCN, and Antarctica. (Links to be added a bit later after I unpack the latest GIStemp code to see if it has changed). It merges and homogenizes this batch and then makes up missing data and prints the results. Not really useful for anything as far as I can see.

Top link:

Includes links to the data-food-products that it produces.

Data Output Here:

Has a link to GHCN version 2 data on that page, and lets you see individual station data after GIStemp is done with it.


The Carbon Dioxide Information Analysis Center. Guess where their bias lies…

Little referenced, they have their own set of data archives. I’ll be wandering through them to see what I can find. Sometimes it has interesting stuff.

Data Here:

USHCN intro page:

Home page:

I squirreled away a copy of the daily-by-state data some time ago, but who knows where it is now.

USHCN data: (by State) (in one wad for all).

There are likely other sources for USHCN daily, but I’ve not spent the time to track them down. If you know of any, put a link in comments and I’ll add it.

B.E.S.T. Berkeley

Berkeley Earth Surface Temperature.

Claims to be the best, but isn’t. Not particularly the worst either. One of the developers claims to have used the method that skeptics wanted; but it isn’t the method I’ve seen asked for much. It has a slice and dice data splicer at the core of it. Since data splicing is considered a sin in many technical disciplines, how doing more of it, more finely, is a feature; well, that’s beyond me. They also cooked up their own way to store data with their own date format (reasonable since they needed to bring divergent data together) but in a way that is sort of painful to use and a bit of work just to understand. (For example, a date, instead of being 30 July 2012 or 300712 or any other, is a floating point number: X.YYY, where the granularity of the part after the decimal will resolve it to a particular day. I’ll get an accurate description and put it here. Just realize that you can’t take the B.E.S.T. copy of GHCN data and do a straight difference against the other GHCN copy as it has been, um, ‘converted’ in format.)
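
Until I have the accurate description, here is a sketch of one plausible reading of that floating point date: the whole part is the year and the fraction is the share of that year elapsed. That interpretation is my assumption, not the documented B.E.S.T. format, so check their documentation before relying on it.

```python
from datetime import date, timedelta

def decimal_year_to_date(value):
    """Convert e.g. 2012.5 to a calendar date, assuming the fraction is the
    elapsed share of that year (leap years handled)."""
    year = int(value)
    start = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - start).days
    day_offset = int((value - year) * days_in_year)
    return start + timedelta(days=min(day_offset, days_in_year - 1))

def date_to_decimal_year(d):
    """Inverse mapping, using the middle of the day to avoid edge rounding."""
    start = date(d.year, 1, 1)
    days_in_year = (date(d.year + 1, 1, 1) - start).days
    return d.year + ((d - start).days + 0.5) / days_in_year

stamp = date_to_decimal_year(date(2012, 7, 30))
print(stamp, decimal_year_to_date(stamp))
```

Either way, some conversion like this has to happen on one side before a B.E.S.T. file and a plain GHCN file can be differenced record for record.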

So B.E.S.T. takes in much of the same data as the others, chops, dices, and splices it up a lot. Does more homogenizing and infilling things, then claims it is “data”. Yet another data-food-product, IMHO.

They do have an online archive of their sources (that are largely the same as the above: GHCN / USHCN /… )

The Berkeley Earth Surface Temperature Study has created a preliminary merged data set by combining 1.6 billion temperature reports from 16 preexisting data archives. Whenever possible, we have used raw data rather than previously homogenized or edited data. After eliminating duplicate records, the current archive contains over 39,000 unique stations. This is roughly five times the 7,280 stations found in the Global Historical Climatology Network Monthly data set (GHCN-M) that has served as the focus of many climate studies. The GHCN-M is limited by strict requirements for record length, completeness, and the need for nearly complete reference intervals used to define baselines. We have developed new algorithms that reduce the need to impose these requirements (see methodology), and as such we have intentionally created a more expansive data set.

We performed a series of tests to identify dubious data and merge identical data coming from multiple archives. In general, our process was to flag dubious data rather than simply eliminating it. Flagged values were generally excluded from further analysis, but their content is preserved for future consideration.

So far so good. Start from raw (though some is stated as slightly cooked…) and then combine and clean. It’s then the splice and dice homogenize that gets them, IMHO. “Methodology”.

Data Here:

Breakpoint Adjusted Monthly Station data

During the Berkeley Earth averaging process we compare each station to other stations in its local neighborhood, which allows us to identify discontinuities and other heterogeneities in the time series from individual weather stations. The averaging process is then designed to automatically compensate for various biases that appear to be present. After the average field is constructed, it is possible to create a set of estimated bias corrections that suggest what the weather station might have reported had apparent biasing events not occurred. This breakpoint-adjusted data set provides a collection of adjusted, homogeneous station data that is recommended for users who want to avoid heterogeneities in station temperature data.

What part of “thou shalt not splice data and have no error” is unclear to them? Sigh.

So it’s just a much more homogenized and much more sliced, diced, and spliced data-food-product.

But the good thing is that they put their source data on line, so you can back up ahead of their processing and start over:

There are many individual links there that I’ve not fully explored. Some of them are already above. Some are not (like the Colonial Era data and the Coop stations). I’d like to archive the lot of them, but time does not allow that at the moment. Perhaps folks could split the job up and each grab a chunk? Assemble again later?

Wood For Trees

Need to put in a good description of these folks. They have nice graphing facilities, and have the data sets behind it. Not yet found out if you can download the whole set of data direct from them.

Top Link:

Lists their data sources on the side bar. Does include the UAH satellite data. (At some point I’ll add links to the satellite data, but since they don’t seem to have ‘revisions’ to their history quite so much I’ve not seen it as urgent).

Interactive graph here:

Wolfram Alpha

More a calculation site than an archive, yet they clearly have some kind of temperature data archive to be able to compute graphs for folks. Such as this example:

Looking at trends through it a couple of years back, it was clear they did no ‘clean up’ for things like large gaps and odd outlier data; so likely it is (or was) raw not QA checked data. (The data does need some kind of cleaning to be usable, but IMHO Hadley and NCDC go way too far).

Misc and Smaller Sites

There are a lot of ‘bits’ all over. I’ll be expanding this section “for a while”. From individual nations, to specific archives at schools and others. I don’t know much about them. It would be interesting to audit a few of these and compare them with the data-food-products above. If the little guys say “we recorded this data” and the above say “this is it!” and they are different, well… Some examples:


This site contains files of daily average temperatures for 157 U.S. and 167 international cities. The files are updated on a regular basis and contain data from January 1, 1995 to present.

Source data for this site are from the National Climatic Data Center. The data is available for research and non-commercial purposes only.

So it might be possible for some such sites to find older copies held by some other ‘packrat’.

Environment Canada

Has a nice top page that says you can select various kinds of data:

If folks post enough links to various national sites, I’ll make a “by nation” section to collect them.

For now, I’m slowly working down the list of sites found by web searches like this one:

which also found:

DOWNLOAD a .csv spreadsheet file compatible with programs such as Excel, Access or the Free Open Office Calc, or view in your browser
Exclusive Station Finder Tool helps you find the best data for your needs
Complete National Weather Service archive – over 10,000 stations (far more than most resources on the web)
Unmatched data accuracy
Archived stations some daily data as far back as 1902, most hourly data back to 1972
Meteorologists on hand to assist
Instant access – Get the information you need immediately via your web browser with backup links sent via email

It looks to want to charge you for larger amounts, but might be free for individual station ‘samples’. They claim to have a lot of sources:

Weather Source meteorologists have created perhaps the world’s most comprehensive weather database by unifying multiple governmental and other weather databases together and applying advanced data quality control and correction methods. The resulting “super database” of weather information contains over 4 Billion rows of high quality weather observations. The Weather Warehouse provides users with direct and immediate access to this database. On the Weather Warehouse users have access to the following weather information:

I’ve not dug into it to figure out “what ultimate data source and what processing”, but it looks like all the Excel users out there can get a nice comma separated value (CSV) spreadsheet for easier processing for some stations.

In Conclusion

OK, that’s it for the moment. I think the GHCN daily is likely the most unmolested of the lot. Coop data and CET are likely pretty useable too. USHCN daily is also likely clean of distortions. Any of the monthly average ‘data’ is not really data. It has been QA checked, filtered, selected, processed; potentially homogenized, adjusted and more. I’d rather start with daily data and work up from there.
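As a rough sketch of what “start with daily data and work up” could look like, the following builds monthly means straight from daily readings and simply refuses to report a month with too few days, rather than infilling it. The 20 day threshold and the function name are my own placeholders, not anything the agencies use.

```python
from collections import defaultdict
from datetime import date

# Hypothetical completeness rule: skip months with fewer valid days than this.
MIN_DAYS_PER_MONTH = 20


def monthly_means(daily):
    """daily: iterable of (date, temperature or None). Returns {(year, month): mean}."""
    buckets = defaultdict(list)
    for day, temp in daily:
        if temp is not None:  # missing readings are dropped, never infilled
            buckets[(day.year, day.month)].append(temp)
    return {
        key: sum(values) / len(values)
        for key, values in sorted(buckets.items())
        if len(values) >= MIN_DAYS_PER_MONTH
    }


if __name__ == "__main__":
    # Made-up readings: a nearly complete January and a mostly missing February.
    readings = [(date(2012, 1, d), 10.0) for d in range(1, 31)]
    readings.append((date(2012, 1, 31), None))
    readings += [(date(2012, 2, d), 12.0) for d in range(1, 6)]
    print(monthly_means(readings))
```

The point of doing it this way is that every choice (how many days are enough, what to do with gaps) is visible and yours, instead of being baked in upstream.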

If you know of any good archives of old musty data, please add a link!

Posted in AGW and GIStemp Issues, CRUt, NCDC - GHCN Issues | 13 Comments

A rather useful archive of Individual Station Modification Graphs

It would seem that NCDC have made a nice set of graphs that show the adjustments done on each and every station in the GHCN (Global Historical Climatology Network) temperature history.

A brief look seems to indicate more cooling of the past and warming of the present, via adjustment, than from any asserted “Global Warming” in the actual data. It would take a lot more work, though, to demonstrate that via looking at every station and the net impact on the final ‘warming’.

But for now, there’s quite a set of useful images here:

For each station. So you can just wander around and find things of interest.

I’m going to upload a couple here, for purposes of illustration. But really, this is one giant “Dig Here!” that would benefit from many hands (and eyes) looking at many graphs.

The graphs are in folders with a single digit number. That number is the first digit of the station ID (so also the continent / cluster).
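If many hands are going to split up the work, a tiny helper for sorting station IDs into their folders may be handy. This is only a sketch: the station IDs below are made up, and I am not assuming anything about the plot file names, only the “first digit is the folder” rule.

```python
from collections import defaultdict


def station_folder(station_id: str) -> str:
    """Return the folder name for a station: the first digit of its ID."""
    station_id = station_id.strip()
    if not station_id or not station_id[0].isdigit():
        raise ValueError(f"station ID must start with a digit: {station_id!r}")
    return station_id[0]


def group_by_folder(station_ids):
    """Group station IDs by folder so each volunteer can take one folder."""
    groups = defaultdict(list)
    for station_id in station_ids:
        groups[station_folder(station_id)].append(station_id)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    print(group_by_folder(["10100000001", "40300000002", "70000000003", "40300000004"]))
```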

Here’s some info from other locations at that site:

The “ReadMe” file:

Last Updated: 09/29/2010

The following directory:

is comprised of sub-directories (that are named by the first digit of a station
ID) that contain individual station plot files (in “gif” format).

The plot files contain 9 individual graphs, arranged in a 3×3 matrix. The first
column of graphs, contain 2-D colored symbol graphs of the actual monthly data
for the entire period of record for A) the (Q)uality (C)ontrolled (U)nadjusted
(QCU) data, B) the (Q)uality (C)ontrolled (A)djusted (QCA) data, and C) the
differences between QCA and QCU monthly data. The second column of graphs
contain histograms of the monthly data for QCU, QCA, and (QCA-QCU) respectively.
Finally, the third column of graphs depict annual anomalies and their associated
trend line for QCU and QCA, and the differences in the annual anomalies for QCA
and QCU. Detailed axis titles and units are displayed in the title of each

So you can see that there’s lots of good info here on unadjusted vs adjusted. I find the trend line and the difference graphs the most interesting.
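The difference panels are just QCA minus QCU, month by month. For anyone who wants to reproduce that from the station data rather than eyeball the images, a minimal sketch follows. It assumes you have already parsed each data set into a dictionary keyed by (year, month); the parsing itself and the sample values are not from NCDC, they are placeholders.

```python
def adjustment_series(qcu, qca):
    """Return {(year, month): QCA - QCU} for months present in both series."""
    common_months = sorted(set(qcu) & set(qca))
    return {key: qca[key] - qcu[key] for key in common_months}


def mean_adjustment(differences):
    """Average adjustment over the months supplied (None if there are none)."""
    if not differences:
        return None
    return sum(differences.values()) / len(differences)


if __name__ == "__main__":
    # Made-up values in degrees C, for illustration only.
    qcu = {(1950, 1): -5.0, (2000, 1): -4.0}
    qca = {(1950, 1): -5.5, (2000, 1): -4.0, (2001, 1): 1.0}
    diffs = adjustment_series(qcu, qca)
    print(diffs, mean_adjustment(diffs))
```

A negative difference in the early years and a zero or positive one in the later years is the “cool the past, warm the present” shape described here.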

Here’s an example from Tatlayoko Lake, BC:

Adjustment Graph for Tatlayoko BC

On the right, notice that the original dropping trend line has been turned into a generally flat one. The graph at the bottom right shows that the past was cooled, and the present warmed. Clearly and obviously.

Now, to me, it isn’t so much the warming present and cooling past, as that pretty much every graph has more change from adjustments than it does from actual trend. Those that are not changed generally are so short of data that there isn’t much point. (Though there are graphs that are unchanged).

What’s the net-net of it? Hard to say, but I’d say mostly a “Global Warming” signal that comes out of the adjustments, not out of the data.

They have a paper describing their latest changes here:

It has some interesting bits buried in it, like their new method finding more step change points to prune out and inducing even more change than the prior version. The “homogenizing” looks to be the magic sauce. It looks similar to the B.E.S.T. splice and dice method of taking slow changes (like aging paint) and keeping that warming in, while taking out the step function when it is repainted to the proper white. Version 3.2.0 finding 1.07 C / Century while Version 3.1.0 has 0.94 C / Century. So we get 0.13 C of added warming from this one update to the code. Now, do that 5 times, you have all of Global Warming. How many updates have there been? Well, since this was from 3.1 to 3.2, I’d wonder about 1.x to 2.x to 3.x… Looks like about a dozen or three to me…
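The aging paint point can be shown with a toy model. This is not the NCDC or B.E.S.T. code, just a sketch with made-up numbers: the true temperature is flat, the sensor drifts warm as the paint ages, and a repaint every ten years snaps it back. Removing only the step changes turns a sawtooth into a steady climb.

```python
def ols_slope(values):
    """Ordinary least squares slope of values against their index (per year)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# All of these numbers are invented for the illustration.
DRIFT_PER_YEAR = 0.05   # spurious warming per year as the paint ages
REPAINT_EVERY = 10      # years between repaints
YEARS = 50
STEP_THRESHOLD = 0.2    # a drop bigger than this is treated as a "breakpoint"

# Raw record: flat true temperature plus a drift that resets at each repaint.
raw = [DRIFT_PER_YEAR * (year % REPAINT_EVERY) for year in range(YEARS)]

# Naive step removal: whenever a sudden drop appears, shift everything after it up.
adjusted = []
offset = 0.0
for i, value in enumerate(raw):
    if i > 0 and raw[i - 1] - raw[i] > STEP_THRESHOLD:
        offset += raw[i - 1] - raw[i]
    adjusted.append(value + offset)

if __name__ == "__main__":
    print("raw trend per year:     ", ols_slope(raw))
    print("adjusted trend per year:", ols_slope(adjusted))
```

In this toy the raw sawtooth has almost no long term trend, while the “corrected” series inherits nearly all of the paint drift as trend. Whether the real algorithms behave this way on real stations is exactly the thing that needs auditing.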

Yes, just a first approximation. But I’d like to know just how many salami slices of warming have been added just this way.
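The arithmetic behind that first approximation, spelled out. The two trend figures are the ones quoted above; stacking five equal slices is just my back of the envelope assumption, not a claim that every update added the same amount.

```python
TREND_V3_1_0 = 0.94  # C per century, as quoted above
TREND_V3_2_0 = 1.07  # C per century, as quoted above

# Warming added by this one software update.
added_per_update = TREND_V3_2_0 - TREND_V3_1_0

# Assumption for illustration: five updates of the same size.
ASSUMED_UPDATES = 5
stacked = ASSUMED_UPDATES * added_per_update

if __name__ == "__main__":
    print(f"added by one update: {added_per_update:.2f} C / century")
    print(f"{ASSUMED_UPDATES} such updates:      {stacked:.2f} C / century")
```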

Here is an example station that gets no change:

Syowa GHCN Adjustments

So why is it left alone, while others are changed? Who knows…

Again, if it is so important to change the data, dramatically, for other stations; then why is it not just as important for THIS station? Which is the error? Changing the other one, or not changing this one? They BOTH can not be error free decisions…

While Faraday gets the rather high trend there cooled down:

Faraday GHCN Adjustments

Mawson station in the same major number cluster gets a bit of warming:

Mawson GHCN Adjustments

In a general ‘look over’ it looks to me like the added warming makes up all of the “AGW” signal. It needs a full on analysis / proof to show that. But what gets me more is that there is no rhyme nor reason. Some stations up, some down, some flat. Is the whole thing just an artifact of an algorithmic adjustment gone mad? The average warming signal being the leftovers in the error band of all those seemingly senseless adjustments?

Looking at the raw data for many locations does not show much “warming” at all. This one for example:

Fosston GHCN Adjustments

So why do they end up getting a warming trend? And why is the trend from those adjustments so much more than any trend in the actual data?

IMHO, the folks doing the adjusting are in love with their intellectual creations and have not bothered to actually look at what they do to the data. (The alternative requiring malice… and “never attribute to malice that which is adequately explained by stupidity”… )

In Conclusion

This takes a whole lot more eyes looking at a whole lot more of these graphs. Sorting them by type of adjustment. Assessing each one for sanity. Calling “BS” on the ones that are just not justified by the known facts. Calling “BS” on the ones with no known facts to justify them. Calling “BS” on the ones where natural cycles and processes have been ironed out in the name of ‘homogeneity’.

But at least the graphs are now produced, and sitting there for everyone to have a look.

Station data and other info is available too. They have a FAQ “Frequently Asked Questions” file here:

and it claims to link to other documents that:

global temperature trends?
NCDC Technical Report No. GHCNM‐12‐02 provides a detailed summary of each software modification
and the resulting impacts to global temperatures. This report is available at Report NCDC No12‐02‐

With software available for inspection:

Is it possible to obtain the computer software code that NCDC uses for making homogeneity
Yes. The Pairwise Homogeneity Adjustment algorithm software is available online at .

So plenty to keep a lot of folks busy, if they have the time to dig in and help.

What is very clear is that there is an awful lot of room for fudge in those adjustments, and a lot of room for error that does not show up as error bars, and ought to.

If it is at all like what they do to the USHCN (about 1/2 F of adjustment), it accounts for roughly all of the “Global Warming” signal with nothing left over for nature:

The cumulative effect of all adjustments is approximately a one-half degree Fahrenheit warming in the annual time series over a 50-year period from the 1940’s until the last decade of the century.

USHCN Adjustments
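Since the GHCN trends above are quoted in C per century and the USHCN adjustment is quoted in F, a quick unit conversion helps when comparing them. Note that a temperature difference converts with the 5/9 factor only (no 32 degree offset). The half degree and the 50 years are from the quote; spreading it evenly per decade is my simplification.

```python
ADJUSTMENT_F = 0.5   # cumulative adjustment in degrees F, from the quote above
PERIOD_YEARS = 50    # period over which it accumulates, from the quote above


def delta_f_to_c(delta_f: float) -> float:
    """Convert a temperature difference (not an absolute reading) from F to C."""
    return delta_f * 5.0 / 9.0


adjustment_c = delta_f_to_c(ADJUSTMENT_F)
per_decade_c = adjustment_c / (PERIOD_YEARS / 10)  # assumes an even spread

if __name__ == "__main__":
    print(f"total adjustment: {adjustment_c:.2f} C")
    print(f"per decade:       {per_decade_c:.3f} C")
```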

Posted in AGW Science and Background, NCDC - GHCN Issues | 27 Comments