NOAA/NCDC: GHCN – The Global Analysis

The thermometer count by year crashes at about 1990

The Day The Thermometer Music Died: the thermometer count by year crashes.

And they were singin’: “Bye Bye Miss American Pie, Drove my GIStemp to the Levee But the Levee was Dry; and them Good Ol’ Boys was Drinking Whiskey and Rye, and Singin’ ‘This Will Be The Day That I Die!, This Will Be The Day, That I Die’…”

Introduction to GHCN – The Global Historic Climate Network

This is an ‘aggregator posting’. I’m putting here the links to the various individual analysis steps of GHCN on a global basis. (This is so that, in the future, I only need to use this one link to get to any of them).

UPDATE: I’ve added the “by altitude” links below in the regional section.

So what are the postings?

I looked at GHCN input data from various places around the world. By continent. By major country on some continents. As time permits, I’ll add more fine-grained looks at some other countries. (Under the Asia thread, I found that Japan now has no thermometer above 300 m. Who knew Japan was as flat as Kansas… So I’m going to “do Japan” at some point and see what else turns up… When that happens, a link will be added here.) With that, here is the list of links to “What the GHCN (Global Historic Climate Network) data look like, by continent and with selected countries”.

GHCN is the global record of land thermometers (that is the “historic” part – clearly their bias is that satellites are the future and those actual instruments on the ground are so ‘historic’ as to be positively ‘old school’). Frankly, a well-tended mercury thermometer is hard to beat, but I’m not in charge (and they are not well tended; see http://www.surfacestations.org ).

You can get a bit more detail, along with some file format information at:

https://chiefio.wordpress.com/2009/02/24/ghcn-global-historical-climate-network/

You can download your own copies of the GHCN data from the ftp site at:

ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2

That directory also includes more information in “readme” files.

I’ve also worked out that some folks refer to this specific NOAA-produced data product as “NCDC Temperature Data” or “NCDC Data”. Even though NCDC provides many different types of data, it looks like the GHCN-specific data set is sometimes given the name of the parent organization. Given the recent meltdown of HadCRUt in Climategate, and the mutual mantra that GIStemp is like HadCRUt is like NCDC so we all must be right, when all are variations on GHCN, I suspect NCDC via GHCN is at the heart of Climategate. So, when you hear NOAA or NCDC or GHCN, they are just organizational levels of the same thing. NOAA is the parent of NCDC and NCDC is the producer of GHCN.

The Links to my Analysis

These are presented in an order different from their original writing. You may well find that one refers to “what we saw before” in a posting you have not yet read. “No worries”. There is not a lot of time dependence, and you ought to be able to take them in any order without too much of an ‘issue’.

The World, and The Hemispheres

Early on, I noticed that the history of the thermometer record had “The Thermometers March South”. Initially I assumed this was just an artifact of the spread of technology, and time, spreading from the north to the south. And perhaps some spread of wealth and thermometers in the Jet Age as airports spread to tropical vacation lands. Yet there was an odd discontinuity at the end. In the 1990’s, the thermometer count plunged overall, and the percentage in the Northern Cold band was cut dramatically. This was the early investigation that led to all the other links here:

https://chiefio.wordpress.com/2009/08/17/thermometer-years-by-latitude-warm-globe/

Early on, before I had worked out the details and polished the code enough to take detailed slices through the GHCN data, I made some postings that looked at large swathes of the data. The selections were often done by hand (using Linux / Unix tools like ‘grep’), and the processes were later turned into the scripts that are listed at the bottom of this posting.
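For example, that kind of hand selection is just a grep on the three-digit country-code prefix at the front of each v2.mean record. A rough sketch (the country code here is only a placeholder; look the real ones up in the country code list that comes with GHCN):

CC=501                               # placeholder three-digit GHCN country code
grep "^${CC}" v2.mean > v2.mean.$CC  # keep only records whose station ID starts with that code
wc -l v2.mean.$CC                    # rough count of station/year records selected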

One early cut looked at the Northern Hemisphere in total. In retrospect, you can see the “Day the Thermometer Music Died” in the charts of winter months. Thermometers at the beach in Los Angeles don’t get very cold in winter; those at Squaw Valley Ski Area do. In California, all our thermometers have left the mountains and are now on the beach, with 3/4 of them near L.A. and San Diego. I didn’t know that when this posting was made, but you can clearly see the effect of these changes in the Northern Hemisphere graphs:

https://chiefio.wordpress.com/2009/10/18/the-northern-hemisphere-what-warming/

I initially thought that the stability in the record seen in long-lived thermometers must have been an artifact of modification flag changes with the changes of equipment (as many places moved to automated temperature recording). Since then, I’ve determined that there really were a bunch of thermometers deleted: 90+% in many major countries around the world. There is still a discontinuity at the end, and that led to more detailed investigations.

Sidebar on Data Sources: In some of the links, you will see the file v2.mean as the data source. That is the straight GHCN data. In others, you will see v2.mean_comb as the data source. This is the result of “STEP0” of GIStemp. It has had the Antarctic data enhanced with some extra data from the Antarctic projects directly; it has had an extended copy of one site in Germany used to replace the original; and it has had the USHCN copy of the US data merged with the GHCN copy (and that will only affect the US data).

Because of this, for substantially everywhere in the world other than the USA and Antarctica, these two data sets are effectively indistinguishable.

For Antarctica it is valuable to use the v2.mean_comb file to see most of the place (though a look with only the v2.mean might be instructive about GHCN…); while for the USA, since GIStemp did not make the transition to USHCN.v2 (version 2), the impact of the USHCN data ‘cuts off’ in 2007. There is no difference between the GHCN data for the USA and the v2.mean_comb contents after that date.
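If you want to check that for yourself, a quick sketch (again, the country code is just a placeholder; pick any non-US, non-Antarctic one) is to pull the same country out of both files and diff them:

CC=501                                   # placeholder: any non-US, non-Antarctic country code
grep "^${CC}" v2.mean      | sort > /tmp/ghcn.$CC
grep "^${CC}" v2.mean_comb | sort > /tmp/comb.$CC
diff /tmp/ghcn.$CC /tmp/comb.$CC         # expect little or no difference outside the USA / Antarctica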

The Detail Study Links

We will step through the whole globe, region by region.

North America:

This first one has California in the title, and there is an important issue about California in particular in the posting; but the posting is about all of the U.S.A. and uses data for the whole country. (The “issue” is that as of 2009, California has 4 active thermometers in GHCN: 3 are on the beach in Southern California and one is at the airport in San Francisco, we presume waiting for its ride to L.A.…)

https://chiefio.wordpress.com/2009/10/24/ghcn-california-on-the-beach-who-needs-snow/

This posting looks at changes in Mexico (which is found to have a strong thermometer change bias) but also looks at the “little bits” left over in North America when you leave out Canada, the USA, and Mexico. Thermometers can’t move very far in Belize or The Bahamas… and we find that the temperature record there shows no warming (and even small hints of cooling).

https://chiefio.wordpress.com/2009/11/01/ghcn-mexico-a-megathermal-vacation-band/

For Canada, the thermometers have been leaving the Rockies and running to the shore where it is much warmer:

https://chiefio.wordpress.com/2009/11/13/ghcn-oh-canada-rockies-we-dont-need-no-rockies/

In the U.S.A. the mountains are similarly deleted. Snow? Who needs snow? It’s too cold and it is at an “unrepresentative altitude” (at least, that is the reason given for deleting the high altitude thermometer in Hawaii):

https://chiefio.wordpress.com/2009/11/17/ghcn-the-under-mountain-western-usa/

The Arctic:

In this posting we look at the Canadian and Russian arc that surrounds most of the Arctic. Russia is split into a “European” and “Asian” chunk in GHCN anyway, so I hope this geographic discontinuity is not too jarring… But I think it does make sense to cover Canada under an “arctic” listing.

https://chiefio.wordpress.com/2009/10/27/ghcn-up-north-blame-canada-comrade/

Similarly, we put Nordic Europe in one bucket to see what it looks like in isolation.

https://chiefio.wordpress.com/2009/10/29/nordic-north-nothing-much-to-see/

Once again we see that when the thermometer record in a country is grossly distorted by deletions, there is an artifact (though not always in the direction expected) and that when a geography is instrumented with a more stable thermometer set, there is no warming present.

Basically, if we’re setting up a global calorimeter to measure heat gain / loss, we need to stop changing the instrument by moving around thermometers and adding / deleting them. Pulling 90% of your thermometers out of the calorimeter makes calibration impossible. (And that renders the results more fantasy than useful. Heck, it makes “Cold Fusion” calorimetry look positively stellar in comparison…)

The Pacific Islands and Australia / New Zealand:

A look at the change of thermometers “by altitude” in the pacific basin:

https://chiefio.wordpress.com/2009/11/13/ghcn-pacific-islands-sinking-from-the-top-down/

And by latitude:

https://chiefio.wordpress.com/2009/10/29/ghcn-pacific-basin-lies-statistics-and-australia/

https://chiefio.wordpress.com/2009/10/23/gistemp-aussy-fair-go-and-far-gone/

And one of my favorites where we see how one island can shift the whole region:

https://chiefio.wordpress.com/2009/11/01/new-zealand-polynesian-polarphobia/

The end of it all is that the entire Pacific Basin is substantially flat on temperatures. Hard to have “Global Warming” if the Pacific is not participating. Australia and New Zealand show warming, but only due to thermometer change artifacts. For New Zealand, it is one single cold thermometer: when that one is deleted from the whole record, not just the last few years, New Zealand has no “Global Warming” either.

Hard to have “Global Warming” when the 1/2 of the planet that is the Pacific Basin is dead flat with only a small “ripple” as the PDO flips state every 30 or so years.

The Antarctic (we covered one pole, let’s do the other):

One of the more interesting bits is in an update way down at the bottom. I broke out bits of Antarctica by geography so folks can compare east to west and peninsula to center. One site shows 4 years of dirty data, but the NASA GIStemp map somehow turns this into one single data point; though in the wrong year! Another site has a dead flat 21.x C entry and exit to the data series, yet the NASA GIStemp chart has the entry and exit ends of the graph flip-flopping like a fish on the dock by about 2 C. (Yes, 2 whole degrees, forget the tenths place…) This, IMHO, is clear proof that the GIStemp process and NASA charts are as much fantasy as anything else.

https://chiefio.wordpress.com/2009/11/02/ghcn-antarctica-ice-on-the-rocks/

South America:

Hard to have “Global Warming” when most of South America is not participating…

https://chiefio.wordpress.com/2009/11/02/ghcn-south-america-siesta/

https://chiefio.wordpress.com/2009/10/24/ghcn-brazil-sambas-north/

https://chiefio.wordpress.com/2009/10/24/argentina-cool-on-the-pampas/

And this is despite NOAA / NCDC deleting the cold Andes from the recent records:

https://chiefio.wordpress.com/2009/11/16/ghcn-south-america-andes-what-andes/

Africa:

Where we find that the continent is not warming; and though the thermometer coverage moves around quite a bit, we can still see everything we need to see:

https://chiefio.wordpress.com/2009/10/29/africa-halle-barry-hot-steady-with-variable-coverage/

Africa is a hot place, but it is not getting hotter. Hard to have “Global Warming” when Africa is not participating…

And this stability is despite clear, if complicated, attempts to redact thermometers from cool areas, like the Moroccan coast, and move them into hot areas, like toward the Sahara:

https://chiefio.wordpress.com/2009/12/01/ncdc-ghcn-africa-by-altitude/

Asia:

The bulk of the countries in Asia have no “Global Warming”. They are a fairly smooth temperature set. It is only the Siberian thermometer changes and the Chinese thermometer deletions that show much change. There are hints in the “Without Siberia and China” charts of other things to dig into, but the “Global Warming” signal is definitely squashed by taking out the thermometer “issues” in the two big countries. This “hint”, though, led to an early look at altitude changes in Japan, where we find it no longer has any thermometers over 300 meters elevation. It seems that, like California, Japanese thermometers like it on the beach…

https://chiefio.wordpress.com/2009/11/02/ghcn-asia-chinese-footprints-in-siberian-snow/

China, too, has had a thermometer count crash:

https://chiefio.wordpress.com/2009/10/28/ghcn-china-the-dragon-ate-my-thermometers/

Europe:

https://chiefio.wordpress.com/2009/11/02/ghcn-europe-goes-mediterranean/

Europe is an interesting study in that it is one of the earlier places where the thermometer migration south shows up. It has a smoother character to the change. It is harder to see the time and spatial onset, and harder to pick a “smoking gun” moment; but it is a strong example of The March Of The Thermometers southward.

And Thermometers are Selective – They Survive at Airports

https://chiefio.wordpress.com/2009/12/08/ncdc-ghcn-airports-by-year-by-latitude/

GHCN Adjusted is Hosed More

This link does a wonderful job of looking at the GHCN “Adjusted” data set and finds it to be horridly hosed. A very good read:

http://wattsupwiththat.com/2009/12/08/the-smoking-gun-at-darwin-zero/

So in addition to all the cooking of the books by thermometer selection, we also have buggering of the adjustments.

GEEK Corner: The Computer Code

I will also be putting the listings of the code I used to process the GHCN data here. This is so that anyone who wishes to duplicate any of this work can see what I did (and hopefully both replicate it and improve on it). I’ve not put up the “by longitude” or “by altitude” code. It is a fairly trivial variation on the “by latitude” program. (Use the longitude field a few spaces further over…) If anyone really wants them, put a comment here and I’ll post them. But it’s 99.9% the same program.

This code is written in “bash” for the scripts (that ought to run in ‘sh’ and ‘ksh’ environments too, if I did things right) and in FORTRAN for the main code. Why FORTRAN? Simply because I’m ‘deconstructing GIStemp’ and it is written in FORTRAN. I find it easier to only have one language loaded into my brain at a time (well, 2 if you count bash… 3 if you include English… but I use Spanish at the fast food places… and French is currently on line too… and… well, let’s just say it’s crowded in here and the less added stuff the better). Besides, C is somewhat antithetical to FORTRAN file formats and these data come from FORTRAN programs. So it’s easier to just stick with “the horse what brung you to the party”…

If you don’t like it, perhaps a plea to the Gods Of Source Code will result in someone posting a C translation. Other than the file format issues, it is a trivial bit of code to produce.

One Disclaimer: All this code was written as a fast “hand tool” cobbled together from other code as a base. It isn’t the best solution, only the more expedient one. There is plenty of room for improvement…

Finally, the scripts may extend off the right side of the page. WordPress, in this theme, gives me the Hobson’s Choice of doing a “preformatted” listing and truncating visibility on the right, or not, and letting it steal all the white space and wipe out the formatting. For those programmers who really want to see the “stuff off the right edge”, you can just choose “view page source” in your browser and all the text is there. Programmer types ought not to have much of a problem with that, and non-programmers won’t care that they don’t see the rightmost text.

Merge Temperature History with Station Information

First, we make a merged file from the v2.mean and v2.inv files. Horridly inefficient, but I was more interested in getting done quickly than in elegance. It ought to be a straight database load, but sometimes you do what is fastest to complete.

[chiefio@tubularbells vetted]$ cat ccaddlat.f
C     Program:  ccaddlat.f
C     Written:  Nov 3, 2009
C     Author:   E. M. Smith
C     Function: This program matches v2.inv and v2.mean on Station IDs
C     sorted in order by cc stationID(8) and produces a merged v2.mean
C     format file.
C     So as to match station location info with  thermometer records.
C
C     Copyright (c) 2009
C     This program is free software; you can redistribute it and/or modify
C     it under the terms of the GNU General Public License as published by
C     the Free Software Foundation; either version 2, or (at your option)
C     any later version.
C
C     This program is distributed in the hope that it will be useful,
C     but WITHOUT ANY WARRANTY; without even the implied warranty of
C     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
C     GNU General Public License for more details.

C     General Housekeeping.  Declare and initialize variables.
C
C     There are some left in there from a prior version of the code.
C     I've not "preened" this version for consistency of definitions,
C     declarations and use. -ems
C
C     itmp    - an array of 12 monthly average temperatures for any
C               given record.  The GHCN average temp for that station/year.
C     icc     - the "country code"
C     iyr     - The year for a given set of monthly data (temp averages).
C     idcount - the count of valid data items falling in a given month
C               for all years.  An array of months-counts with valid data.
C     sid     - Station ID for the idcount.
C     line    - A text buffer to hold the file name of the input file
C               to be processed, passed in as an argument at run time.
C     line2   - A text buffer for the file name of Station IDs to process.
C     oline   - A text buffer for the output file name
C     cid     - Country Code - Station ID 3+8 char
C     csid    - Country Code - Station ID 3+8 char
C     id      - Station ID in the input data.
C     buff    - A text buffer for holding input / output record data.
C     buff2   - A text buffer for holding input / output record data.

C2345*7891         2         3         4         5         6         712sssssss8

      integer itmp(12), icc, iyr, idcount, lc
      character*128 line, line2, oline
      character*8 id, sid
      character*11 cid, csid
      character  mod
      character*95 buff
      character*64 buff2

      data itmp    /0,0,0,0,0,0,0,0,0,0,0,0/
      data buff /"                                                  "/
      mod=""
      icc=0
      id=""
      sid=""
      cid=""
      csid=""
      idcount=1
      iyr=0
      lc=0

C     Get the name of the input file, in GHCN format.  The file must be
C     sorted by ID (since we count them in order.)
C     The name of the output file will be inputfile.withmean
C     line2 will hold the v2.mean type file sorted on ID (9 char)
C     line will hold the v2.inv info sorted on ID (8 char)

      call getarg(1,line)
      call getarg(2,line2)
      oline="v2.mean.inv"

C2345*7891         2         3         4         5         6         712sssssss8
C     lines of the form "write(*,*)" are for diagnostic purposes, if desired.
C      write(*,*) oline

      open(1,file=line,form='formatted')
      open(2,file=line2,form='formatted')
      open(10,file=oline,form='formatted')              ! output 

C      write(*,*) line2

C     read in a v2.mean record, then a v2.inv description record

      read(2,'(a11,a1,a64)',end=500) csid, mod, buff2
C      write(*,*) "Before 20: ", icc, sid, mod, buff2
   20 read(1,'(a11,a95)',end=400) cid,buff
C      write(*,*) "After 20: ", icc," ", id, " ", buff
C      write(*,*) "at 20: ", sid," ",mod," ",buff2 

   40 continue
      if(cid.eq.csid) goto 70
      if(cid.gt.csid) goto 80
      if(cid.lt.csid) goto 90

C      write(*,*) "Compiler error!.  You can't get here!"

   70 DO WHILE (cid .eq. csid)
         write(10,'(a11,a1,a64,1x,a95)') csid,mod,buff2,buff
         read(2,'(a11,a1,a64)',end=300) csid, mod, buff2
C      write(*,*) "From 70: ", icc," ", sid," ", mod," ", buff2
      END DO
      goto 40

   80 DO WHILE (cid .gt. csid)
         read(2,'(a11,a1,a64)',end=300) csid, mod, buff2
C      write(*,*) "From 80: ",  csid," ", mod," ", buff2
         lc=lc+1
         if (lc.gt.1) then
                write(*,*) "in 80: ", csid," ",mod," ", buff2
                write(*,*) "in 80: ", cid," ", buff
                write(*,*) "Too many times in loop 80", lc
         end if
      END DO
         lc=0
      goto 40

   90 DO WHILE (cid .lt. csid)
         read(1,'(a11,a95)',end=200) cid,buff
C      write(*,*) "From 90: ", icc, id, buff
C      write(*,*) "id vs sid", id, sid
      END DO
      goto 40

  200 continue
      STOP "Out of v2.inv Station Records - ought to be rare!"

  300 continue
      STOP "Out of v2.mean records.  Most likely case."

  400 STOP "Input file 1 blank on first record!"
  500 STOP "Input file 2 blank on first record!"
      END
[chiefio@tubularbells vetted]$
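As an aside, for anyone who would rather not compile the FORTRAN, roughly the same merge can be sketched in a couple of lines of awk, keyed on the 11-character station ID. This is only a rough equivalent of the program above, not a drop-in replacement (for one thing, v2.mean records with no matching v2.inv entry are silently dropped):

# First pass loads v2.inv into an array keyed on the 11-char station ID;
# second pass appends the matching inventory text to each v2.mean record.
awk 'NR==FNR { inv[substr($0,1,11)] = substr($0,12); next }
     { id = substr($0,1,11); if (id in inv) print $0, inv[id] }' v2.inv v2.mean > v2.mean.inv.approx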

I used g95 as the FORTRAN compiler. The “wrapper script” that does the environmental set up and manages the program is somewhat overly complex. It lets you choose various input files other than the GHCN v2.mean and v2.inv files so that you can use the same tools on other sources of data. You don’t really need any of that (but it is helpful in GIStemp analysis, so I can choose to use the v2.mean.z file after Antarctica is added, or not…) Mostly it just calls the one program and hands it the v2.inv and v2.mean files as input.

[chiefio@tubularbells vetted]$ cat mkinvmean
echo " "
echo "Optional:  sort the v2.inv type file by Country Code / Station ID"
echo " "

ls -l ${1-/gnuit/GIStemp/STEP0/input_files/v2.inv} v2.inv.ccid

echo " "
echo -n "Do the sort of v2.inv data into v2.inv.ccid (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     INVFILE=v2.inv.ccid
     echo INVFILE= $INVFILE

     sort -n -k1.1,1.11 ${1-"/gnuit/GIStemp/STEP0/input_files/v2.inv"} > $INVFILE
else
     INVFILE=${1-"/gnuit/GIStemp/STEP0/input_files/v2.inv"}
     echo INVFILE= $INVFILE
fi

echo " "
echo "v2.inv data from:"
echo " "
ls -l $INVFILE
echo " "

ls -l ${2-"/gnuit/GIStemp/STEP0/input_files/v2.mean"} v2.sort.ccid

echo " "
echo "Optional:  Sort the v2.mean type file by Country code / station ID"
echo " "
echo -n "Do the sort of v2.mean into v2.sort.ccid (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     MEANFILE=v2.sort.ccid
     sort -n -k1.1,1.11 ${2-"/gnuit/GIStemp/STEP0/input_files/v2.mean"} > $MEANFILE
else
     MEANFILE=${2-"/gnuit/GIStemp/STEP0/input_files/v2.mean"}
fi

echo " "
echo "v2.mean data from: "
echo " "
ls -l $MEANFILE
echo " "

echo "Then feed the v2.inv data, sorted by cc/station ID, and "
echo "v2.mean by ccid to the program that matches ID to description"
echo "records in the data set."
echo " "
echo -n "Do the matching of $INVFILE with $MEANFILE into: ./v2.mean.inv (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
      bin/ccaddlat $INVFILE $MEANFILE
fi

echo " "
echo "Produced the list of v2.mean.inv records "
echo "(temps, with station data, sorted by CCStationID"
echo " "

ls -l v2.mean.inv

echo " "
     ls -l v2.sort.ccid v2.inv.ccid
echo " "

echo -n "Remove the work files v2.sort.ccid and v2.inv.ccid (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     echo rm v2.sort.ccid v2.inv.ccid
     rm v2.sort.ccid v2.inv.ccid
fi

echo " "
echo -n "Does v2.mean.inv need sorting by CC, Station ID, and Year (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
#       sort -n -k1.1,1.16  v2.mean.inv
        echo bin/meansortidyr v2.mean.inv giving v2.mean.inv.ccsidyr
        bin/meansortidyr v2.mean.inv v2.mean.inv.ccsidyr
        echo
        ls -l v2.mean.inv.ccsidyr
fi
[chiefio@tubularbells vetted]$
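For reference, both arguments are optional and default to the GIStemp STEP0 input_files paths shown in the script, so a typical run is just one of:

./mkinvmean                        # use the default v2.inv / v2.mean paths, answer Y/N at each prompt
./mkinvmean my/v2.inv my/v2.mean   # or point it at your own copies (placeholder paths)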

The script “bin/meansortidyr” is a one line sort that could easily be put ‘in line’ in the above script. I broke it out as a convenient ‘hand tool’:

[chiefio@tubularbells vetted]$ cat bin/meansortidyr
sort -n -k1.1,1.16 ${1-"../../STEP0/input_files/v2.mean"} > ${2-"v2.sort.ccidyr"}
[chiefio@tubularbells vetted]$

Temperature Averages By Years

This program creates a history of temperature over years. You could run it against the above combined v2.mean / v2.inv file or against the v2.mean format files of GIStemp and GHCN (it only uses the first v2.mean format part of the file, so it will work with any of them).

[chiefio@tubularbells analysis]$ cat src/lmyears.f
C2345*7891         2         3         4         5         6         712sssssss8
C     Program:  lmyears.f
C     Written:  October 30, 2009
C     Author:   E. M. Smith
C     Function: To produce a list of Global Average Temperatures for
C     each year of data in a GHCN format file, with one GAT for each
C     month and a total GAT for that year.  Summary GAT records are
C     produced for the whole data set as a "crossfoot" cross check of
C     sorts.  While you might think it silly to make a "global average
C     temperature" for a 130 year (1880 to date) or 308 year (1701 the
C     first data in GHCN, to date) interval, once you accept the idea
C     of adding together 30 days, or 365 days of records, or
C     thermometers from all over the planet "means something":
C     Where does it end?

C     Personally, I think the whole idea of a GAT is bogus,
C     but if you accept it as a concept (and GIStemp and the AGW
C     movement do) then you must ask:
C     "in for a penny, in for a pound":
C     When does the GAT cease to have some value, and exactly why?...
C
C     So I produce GAT in several ranges and you can inspect it
C     and ponder.

C     Copyright (c) 2009
C
C     This program is free software; you can redistribute it and/or
C     modify it under the terms of the GNU General Public License as
C     published by the Free Software Foundation; either version 2,
C     or (at your option) any later version.
C
C     This program is distributed in the hope that it will be useful,
C     but WITHOUT ANY WARRANTY; without even the implied warranty of
C     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
C     GNU General Public License for more details.
C
C     You will notice this "ruler" periodically in the code.
C     FORTRAN is position sensitive, so I use this to help me keep
C     track of where the first 5 "label numbers" can go, the 6th
C     position "continuation card" character, the code positions up
C     to number 72 on your "punched card" and the "card serial number"
C     positions that let you sort your punched cards back into a proper
C     running program if you dropped the deck.  (And believe it or
C     not, I used that "feature" more than once in "The Days Before
C     Time And The Internet Began"... 

C2345*7891         2         3         4         5         6         712sssssss8

C     Oddly, within the 7-72 positions, FORTRAN is not position
C     sensitive.  This was so if you put in an accidental space
C     character, you didn't need to repunch a whole new card...
C     Oh, and if you type a line past the 72 marker, you can cut
C     a variable name short, creating a new variable, that FORTRAN
C     will use as an implied valid variable.  So having "yearc"
C     run past the end can turn it into "year" "yea" "ye" "y"
C     which will then be the actual variable you are using in that
C     line, not the one that prints out on your card.  The
C     source of endless bugs and mirth 8-}

C     General Housekeeping.  Declare and initialize variables.
C
C     itmp    - an array of 12 monthly average temperatures for any
C               given record.  The GHCN average temp for that
C               station/year.
C     incount - the count of valid data items falling in a given month
C               for a year.  An array of months-counts with valid data.
C     nncount - the count of valid data items falling in a given month
C               for all years.  An array of months-counts w/ valid data.
C     itmptot - Array of the running total of temperatures, by month.
C     ymc     - count of months for the total of a year  with some
C               valid data.
C     ntmptot - running total of all temperatures, by month column.
C     icc id iyr nyr iyrmax m iyc - countrycode, Stn ID, on year
C               of data as monthly averages of MIN/MAX temps, max year
C               so far, month, iyc In Year Counter: # recs in year.
C     tmpavg  - Array of average temperatures, by month. The data
C               arrive as an INTEGER with an implied decimal point
C               in itmp.  This is carried through to the point where
C               we divide by 10 and make it a "REAL" or floating point
C               number in this variable.
C     ttmpavg - Total of temperature data by month for all years.
C     tymc    - count of months for the total of all data with some
C               valid data.
C     eqwt    - Total of monthly averages of temperature data, by month.
C     eqwtc   - Counter of months with valid data.
C     eqwtg   - Grand Total of calculated monthly averages of MIN/MAX
C               averages.  Divided by eqwtc for Grand Avg.
C     gavg    - Global Average Temperature.  GAT is calculated by
C               summing tmpavg monthly averages that have valid data,
C               then dividing by the count of them with valid data.
C    ggavg    - The Grand Grand Average Temperature,
C               whatever it means...
C     line    - A text buffer to hold the file name of the input file
C               to be processed, passed in as an argument at run time.
C     oline   - A text buffer for the output file name,
C               set to the input_file_name.GAT

C2345*7891         2         3         4         5         6         712sssssss8

      integer incount(12), nncount(12), itmptot(12), ntmptot(12)
      integer itmp(12)
      integer icc, id, iyr, nyr, iyrmax, m, iyc

      real tmpavg(12), ttmpavg(12), eqwt(12), eqwtc(12)
      real gavg, ggavg, ymc, tymc, eqwtg

      character*128 line, oline

      data incount /0,0,0,0,0,0,0,0,0,0,0,0/
      data nncount /0,0,0,0,0,0,0,0,0,0,0,0/
      data itmptot /0,0,0,0,0,0,0,0,0,0,0,0/
      data ntmptot /0,0,0,0,0,0,0,0,0,0,0,0/
      data itmp    /0,0,0,0,0,0,0,0,0,0,0,0/

C      data tmpavg  /0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0./
C2345*7891         2         3         4         5         6         712sssssss8

      data tmpavg  /-99.,-99.,-99.,-99.,-99.,-99.,-99.,-99.
     *,-99.,-99.,-99.,-99./

      data ttmpavg /0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0./
      data eqwt    /0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0./
      data eqwtc   /0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0.,0./

      icc   =0
      id    =0
      iyr   =0
      nyr   =0
      iyrmax=0
      m     =0
      iyc   =0

      gavg  =0.
      ggavg =0.
      eqwtg =0.
      ymc   =0.
      tymc  =0.

      line =" "
      oline=" "

C     Get the name of the input file, in GHCN format.  The file
C     must be sorted by year (since we sum all data by month
C     within a year.) The name of the output file will be that of
C     the input_file.yrs.GAT where GAT stands for Global Average
C     Temperature.

C2345*7891         2         3         4         5         6         712sssssss8

      call getarg(1,line)
      oline=trim(line)//".yrs.GAT"
      open(1,file=line,form='formatted')
      open(10,file=oline,form='formatted')              ! output

C     Read in a line of data (Country Code, ID, year, temperatures)
C     Set the max year so far to this first year, set the "LASTID"
C     to zero so it will fail the equality test later.

      read(1,'(i3,i8,1x,i4,12i5)',end=200) icc,id,iyr,itmp
      iyrmax = iyr
      LASTID = 0
      rewind 1

   20 CONTINUE

      read(1,'(i3,i8,1x,i4,12i5)',end=200) icc,id,iyr,itmp

      if(iyr .gt. iyrmax) then

C      if you have a new year value, you come into this loop,
C      calculate the Monthly Global Average Temperatures, the
C      Yearly GAT for iyrmax.
C      Print it all out, and move on.

        do m=1,12

          if (incount(m) .ne. 0) then

C      We keep a running total of tenths of degree C in itmptot,
C      by month. Then we divide this by the integer count of
C      valid records that went into each month.  This truncates
C      the result (I think this is valid, since we want to know
C      conservatively how much GIStemp warmed the data
C      not how much my math in this diagnostic warms the data ;-)  

C      So we have a "loss" of any precision beyond the "INTEGER"
C      values being divided, but since they are in 1/10C, we are
C      tossing 1/100C of False Precision, and nothing more.
C      THEN we divide by 10. (REAL) and yield a temperature
C      average for that month for that year (REAL).
C      I could do a 'nint' instead:  nint(itmptot(m)/incount(m))
C      and get a rounded result rather than truncated, but I
C      doubt if it's really worth if for a "hand tool" that I'd
C      like to be a conservative one.  If I truncate, then any
C      "warming" of the data is from GIStemp, not this tool.
C      (Or GHCN, now that I'm using to analyse the input data
C      as well as the code itself.)

C2345*7891         2         3         4         5         6         712sssssss8

C       Diagnostic write to check missing data flag handling.
C       write(*,*) "tmpavg: ", tmpavg

            tmpavg(m) = (itmptot(m)/incount(m))/10.

C       Diagnostic write to check missing data flag handling.
C       write(*,*) "TMPavg: ", tmpavg

            gavg      = gavg+tmpavg(m)
            ymc       = ymc+1.

C       We put a running total of yearly averages together,
C       along with a count for tmpavg, it is the total of
C       monthly temperature averages divided by the count of
C       months with data in them (converted to C from 1/10 C).
C       For eqwt it is a running total of those averages that are
C       used at the end to calculate a "monthly average of
C       monthly averages ".
C       Basically, the first form, gavg, weights each record
C       equally, while the second form gives equal weight to
C       each month, regardless of number of records in that month.
C       Which one is right?  You get to choose...  (And THAT
C       is just one of the issues with whether an "average of
C       averages of averages" means something...)

C       I just put them here so you can see that they are, in fact,
C       different...

            eqwt(m)   = eqwt(m)+tmpavg(m)
            eqwtc(m)  = eqwtc(m)+1

          end if
        end do

        gavg=gavg/ymc

C2345*7891         2         3         4         5         6         712sssssss8

C Write out the Year, the averages, the grand avg, and the
C number of thermometers in the year

        write(10,'(i4,12f5.1,f5.1,i4)') iyrmax,tmpavg,gavg,iyc

C     Diagnostic "writes", should you wish to use them.
C       write(*,*) "iyc: ", iyc
C2345*7891         2         3         4         5         6         712sssssss8
C       write(*,'("GAT/year: "i4,12f7.2,f7.2,i6,f7.2)') iyrmax,
C    *tmpavg,gavg,iyc,ymc

C      probably paranoia, but we re-zero the monthly arrays of data.
C      and pack tmpavg with missing data flags of -99
C
        do m=1,12
          incount(m) =0
          itmptot(m) =0
          tmpavg(m)  =-99.
          ymc        =0.
        end do

        gavg   =0.
        iyc    =0
        LASTID =0
        iyrmax =iyr

C     hang on to the present year value and ...

      end if
C     End of "new year" record handling.

C     So we have a new record (for either a new year or for the
C     same year.) If it is valid data (not a missing data flag)
C     add it to the running totals and increase the valid
C     data count by one.

C2345*7891         2         3         4         5         6         712sssssss8

C     Increment the running total for stations in this year.
C     In Year Counter

      if (id .NE. LASTID) then
          iyc  = iyc+1
          LASTID = id
      end if

C     For each month, skipping missing data flags, increment
C     the valid data counter for that month incount, add that
C     temperature data (in 1/10 C as an integer) into the
C     yearly running total itmptot.
C     Also do the same for the total records count nncount
C     and running total of all temperatures (by month) ntmptot.

      do m = 1,12

        if (itmp(m) .gt. -9000) then
          incount(m) = incount(m)+1
          itmptot(m) = itmptot(m)+itmp(m)
          nncount(m) = nncount(m)+1
          ntmptot(m) = ntmptot(m)+itmp(m)
        end if
      end do

C     and go get another record
      goto 20

C     UNTIL we are at the end of the file.
  200 continue

C2345*7891         2         3         4         5         6         712sssssss8

C     Here we use the method vetted in the earlier program
C     totghcn.f where we hold the temps as integers in 1/10 C
C     until the very end, then we do a convert to real
C     (via divide by 10.) and cast into a real (ttmpavg)
C     that is the total average temperature for that month for
C     the total data.  

C     ggavg is the grand total GAT, but after it is stuffed with
C     valid data we must divide it by the number of months with
C     valid data.  It is the "Average of yearly averages of
C     monthly averages of daily MIN/MAX averages".
C     Why?  Heck, GIStemp is a "serial averager", so I thought it
C     might be fun to see what you get.

C     We also show the average of all the individual monthly
C     data.  That gives a different value.
C     Will the real GAT please stand up? ... 

C     I would choose to use the average of all data in a month
C     since it is less sensitive to the variation of number
C     of thermometers in any given year, but you might choose a
C     different GAT.  Averaging the data directly gives weight
C     to the years with more data.  Averaging the monthly
C     averages gives each month equal weight.  Choose one...
C     Rational? No.
C     But it is the reality on the ground..
C     Basically, if you do "serial averaging", the order of the
C     averaging will change your results.  As near as I can tell,
C     GIStemp (and the whole AGW movement) pays no attention to this
C     Inconvenient Fact.

      do m=1,12
          if (nncount(m).ne.0) then
            ttmpavg(m)=(ntmptot(m)/nncount(m))/10.
            ggavg=ggavg+ttmpavg(m)
            tymc=tymc+1
          end if
          eqwt(m)=eqwt(m)/eqwtc(m)
          eqwtg=eqwtg+eqwt(m)
      end do

C     ggavg is the grand average of monthly averages for a month.
C     eqwt is the sum of all months averages.  eqwtc is the
C     count of all months with valid data.  So this is the
C     place where the total gets divided by the count to give the
C     average of all averages in a month.

      ggavg=ggavg/tymc
      eqwtg=eqwtg/tymc

      write(10,'(4x,12f5.1,f5.1)') ttmpavg,ggavg
      write(10,'(4x,12f5.1,f5.1)') eqwt,eqwtg

      stop
      end
[chiefio@tubularbells analysis]$
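A minimal compile-and-run sketch, assuming g95 (the compiler mentioned earlier) and an input file already sorted by year as the program's comments require (the v2.mean.byyear name is just an example):

g95 -o bin/lmyears src/lmyears.f                          # build the GAT-by-year program
sort -n -k1.13,1.16 -k1.1,1.12 v2.mean > v2.mean.byyear   # year is in columns 13-16 of a v2.mean record
bin/lmyears v2.mean.byyear                                # writes v2.mean.byyear.yrs.GAT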

The wrapper script for it is:

[chiefio@tubularbells analysis]$ cat dotemps
#       First off, sort v2.mean into a version for reporting by year.

DIR=${2-./Temps}

echo " "
echo -n "Do the extract / process for v2.mean_comb for ${1-501} (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     PAT=^${1-501}
     echo $PAT
     grep $PAT /gnuit/GIStemp/STEP0/to_next_step/v2.mean_comb > $DIR/v2.meanC.${1-501}

     ls -l $DIR/v2.meanC.${1-501}

     echo Now Sort
     echo

     sort -n -k1.13,1.16 -k1.1,1.12  $DIR/v2.meanC.${1-501} > $DIR/Temps.${1-501}

     echo
     echo After the Sort
     echo
fi

ls -l $DIR/Temps.${1-501}

echo " "
echo "Doing GAT Yearlies w/ Missing Flag: lmyears"
echo " "

echo " "
echo -n "Do the Reporting process for $DIR/Temps.${1-501} (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
#     bin/yearsghcn v2.meanC.sorted
#     bin/locyearsghcn Temps.${1-501}

     bin/lmyears $DIR/Temps.${1-501}

     echo " " >> $DIR/Temps.${1-501}.yrs.GAT
     echo For Country Code ${1-501} >> $DIR/Temps.${1-501}.yrs.GAT
     echo " "
     echo "Produced:"
     echo " "
     ls -l $DIR/Temps.${1-501}.yrs.GAT
fi

echo " "
echo -n "Look at $DIR/Temps.${1-501}.yrs.GAT (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     cat $DIR/Temps.${1-501}.yrs.GAT
fi

echo " "
     ls -l $DIR/v2.meanC.${1-501} $DIR/Temps.${1-501}
echo " "
echo -n "Clean up / Delete intermediate files (Y/N)? "
read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     rm $DIR/v2.meanC.${1-501} $DIR/Temps.${1-501}
fi
[chiefio@tubularbells analysis]$
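Usage is just the three-digit country code as the first argument and an optional output directory as the second (defaults 501 and ./Temps, as coded above). For example:

./dotemps 501             # extract, sort, and report country code 501 into ./Temps
./dotemps 403 ./MyTemps   # or any other country code and output directory (placeholder names)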

Thermometer Percentages By Latitude

Once you have this combined file, you have temperature data with descriptions attached. At that time you can do “by latitude” and “by altitude” studies on the stations, countries, etc. As this “by latitude” program demonstrates:

[chiefio@tubularbells analysis]$ cat src/latcust.f
C2345*fff1         2         3         4         5         6         712sssssss8
C
C    this program sorts records into latitude bands
C    Input must already be sorted by year and filtered to selected latitude
C    A special v2.mean+v2.inv concatenated file is the source
C
C    There is an input file named "BANDS" that holds 9 latitude integers. S to N
C
C     Copyright (c) 2009
C
C     This program is free software; you can redistribute it and/or
C     modify it under the terms of the GNU General Public License as
C     published by the Free Software Foundation; either version 2,
C     or (at your option) any later version.
C
C     This program is distributed in the hope that it will be useful,
C     but WITHOUT ANY WARRANTY; without even the implied warranty of
C     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
C     GNU General Public License for more details.
C
C     You will notice this "ruler" periodically in the code.
C     FORTRAN is position sensitive, so I use this to help me keep
C     track of where the first 5 "label numbers" can go, the 6th
C     position "continuation card" character, the code positions up
C     to number 72 on your "punched card" and the "card serial number"
C     positions that let you sort your punched cards back into a proper
C     running program if you dropped the deck.  (And believe it or
C     not, I used that "feature" more than once in "The Days Before
C     Time And The Internet Began"... 

C2345*7891         2         3         4         5         6         712sssssss8

      integer itmp(12), icc, id, iyr, iyrmax, m, iyc, kyr, ky, band(9)
      real latcnt(11), lattot(11)

      real  latitude, kount
      character*128 line, oline, pline

      data latcnt  /0,0,0,0,0,0,0,0,0,0,0/
      data lattot  /0,0,0,0,0,0,0,0,0,0,0/

      icc=0
      id=0
      iyr=0
      iyc=0
      iyrmax=0
      kyr=0
      latitude=0
      kount=1.

C     Believe it or not, the program may make bogus values,
C     unless you do this initialization.

      do m=1,11
         latcnt(m)=0
         lattot(m)=0
      end do

C     Get the name of the input file, in modified GHCN format.  The file must be
C     sorted by year (since we sum all data by month within a year.)
C     The name of the output file will be that of the inputfile.yrs.LAT
C     The input file ought to be a GHCN format file with V2.inv data
C     concatenated per line.  

      call getarg(1,line)
      oline=trim(line)//".per.LAT"
      pline=trim(line)//".Dec.LAT"
      open(1,file=line,form='formatted')
      open(10,file=oline,form='formatted')              ! output
      open(12,file=pline,form='formatted')              ! output

      open(2,file="BANDS",form='formatted')              ! output
      read(2,'(9i4)',end=300) band

C2345*fff1         2         3         4         5         6         712sssssss8

C     Read in a line of data (Country Code, ID, year, temperatures, latitude)

C     For each year, total "thermometer counts" increment decade counts
C         increment thermometer counts by latitude.

      write(10,'(" ")')
      write(10,'("        Year SP",i4,2x,i4,2x,i4,2x,i4,2x,i4,2x,i4,2x,
     *i4,2x,i4,2x,i4,"NP")') band
      write(12,'("        Year SP",i4,2x,i4,2x,i4,2x,i4,2x,i4,2x,i4,2x,
     *i4,2x,i4,2x,i4,"NP")') band
      write(10,'("         Year SP-45    50    55    60    65    70
     * 75    80    85    -NP ")')
      write(12,'("           Year SP-45    50    55    60    65    70
     * 75    80    85    -NP ")')

      read(1,'(12x,i4,12i5,33x,f6.2)',end=200) iyr,itmp,latitude
      iyrmax=iyr
      rewind 1

   20 read(1,'(12x,i4,12i5,33x,f6.2)',end=200) iyr,itmp,latitude

      if(iyr.gt.iyrmax) then

C      if you have a new year value, you come into this loop, print
C      the total thermometer count per latitude for that year,
C      calculate the decade total thermometer count, and every decade
C      print it all out, and move on.

C2345*fff1         2         3         4         5         6         712sssssss8

C       increment the latitude totals for the decade

        do m=1,11
           lattot(m)=lattot(m)+latcnt(m)
        end do

        do m=1,11
           latcnt(m)=(latcnt(m)/latcnt(11))*100.
        end do

C        write(10,'("LAT year: "i4,9i6,1x,i5)') iyrmax,latcnt, iyc
        write(10,'("LAT pct: "i4,11f6.1,1x)') iyrmax,latcnt

        if (mod(iyr,10).eq.0) then

C ok, at this point we want to print out the decade average of thermometer
C counts by latitude band. In 5 degree increments.
C we would do that by printing out 

             kyr=iyrmax

             do m=1,11
                lattot(m)=((lattot(m))/lattot(11))*100.
             end do

             kyc=nint(kyc/kount)

       write(10,'(" ")')
C      write(10,'(" ",f6.1)') kount
C      write(10,'("DecadeLat: "i4,9i6,1x,i5)') kyr,lattot, kyc

      write(10,'("DecLatPct: "i4,11f6.1)') kyr,lattot
      write(10,'(" ")')
      write(12,'("DecLatPct: "i4,11f6.1)') kyr,lattot

C  Then we set the decade counter to zero and reset the decade array.
             do m=1,11
                lattot(m)=0
             end do
             kount=0.
             kyc=0
        end if

C      we re-zero the array of latitude counts for the year.
C
        do m=1,11
          latcnt(m)=0
        end do

        kount=kount+1.
        iyrmax=iyr
        iyc=0

      end if

C2345*fff1         2         3         4         5         6         712sssssss8

C     So we have a new record for a new year or for the same year.
C we count the thermometer regardless of the data flag (not many all zero)
C and we add a count to that thermometers latitude for that year.

      iyc=iyc+1
      kyc=kyc+1

      if     (latitude .lt. band(1) ) then
         latcnt(1)=latcnt(1)+1
      else if(latitude .lt. band(2) .and. latitude .ge. band(1) ) then
         latcnt(2)=latcnt(2)+1
      else if(latitude .lt. band(3) .and. latitude .ge. band(2) ) then
         latcnt(3)=latcnt(3)+1
      else if(latitude .lt. band(4) .and. latitude .ge. band(3) ) then
         latcnt(4)=latcnt(4)+1
      else if(latitude .lt. band(5) .and. latitude .ge. band(4) ) then
         latcnt(5)=latcnt(5)+1
      else if(latitude .lt. band(6) .and. latitude .ge. band(5) ) then
         latcnt(6)=latcnt(6)+1
      else if(latitude .lt. band(7) .and. latitude .ge. band(6) ) then
         latcnt(7)=latcnt(7)+1
      else if(latitude .lt. band(8) .and. latitude .ge. band(7) ) then
         latcnt(8)=latcnt(8)+1
      else if(latitude .lt. band(9) .and. latitude .ge. band(8) ) then
         latcnt(9)=latcnt(9)+1
      else if(latitude                             .ge. band(9) ) then
         latcnt(10)=latcnt(10)+1
      else
       write(*,*) "You can't get here, compiler error Or dirty Data! "
      end if

      latcnt(11)=latcnt(11)+1

C     and go get another record
      goto 20

C     UNTIL we are at the end of the file where we print the last average
  200 continue

      do m=1,11
         lattot(m)=lattot(m)+latcnt(m)
      end do

        do m=1,11
           latcnt(m)=(latcnt(m)/latcnt(11))*100.
        end do

C      write(10,'("LAT year: "i4,9i6,1x,i5)') iyrmax,latcnt, iyc
      write(10,'("LAT pct: "i4,11f6.1)') iyrmax,latcnt

      do m=1,11
         lattot(m)=((lattot(m))/lattot(11))*100.
      end do

      kyc=nint(kyc/kount)

      kyr=iyrmax

C      write(10,'(" ",f6.1)') kount
C      write(10,'("DecadeLat: "i4,9i6,1x,1i4)') kyr,lattot, kyc

      write(10,'(" ")')
      write(10,'("DecLatPct:"i4,11f6.1)') kyr,lattot
      write(12,'("DecLatPct: "i4,11f6.1)') kyr,lattot

C2345*fff1         2         3         4         5         6         712sssssss8
  300 continue

      stop
      end
[chiefio@tubularbells analysis]$
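One note on the control file: the program reads a file named BANDS holding the nine band boundaries as four-column integers (format 9i4), south to north. A sketch that writes one to match the 45 through 85 degree headers printed above:

cat > BANDS <<'EOF'
  45  50  55  60  65  70  75  80  85
EOF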

And the wrapper script that runs it and tends the environment:

[chiefio@tubularbells analysis]$  cat dolats

DIR=${3-./Lats}
echo " "
echo "Remember to update BANDS with 9 LAT bands prior to use"
echo " "
echo "Need to make a joined GHCN with v2.inv data for the ${1-403} records "
echo " "
      ls -l $DIR/v2.${1-403}.withlat
echo " "
echo -n "Make the Extract of v2.inv.id.withlat (Y/N)?  "

read ANS
if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     ls -l ${2-./vetted/v2.inv.id.withlat}
     echo " "
     grep "^${1-403}" ${2-./vetted/v2.inv.id.withlat} > $DIR/v2.${1-403}.withlat
     echo " "
     ls -l $DIR/v2.${1-403}.withlat
fi

echo " "
echo "Then sort the Special GHCN with v2.inv by year"
echo "into a version for reporting."
echo " "
echo from $DIR/v2.${1-403}.withlat into $DIR/Therm.by.lat${1-403}
echo " "
     ls -l $DIR/Therm.by.lat${1-403}
echo " "
echo -n "Re-sort the selected records back into year order (Y/N)? "

read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     sort -n -k1.13,1.16 $DIR/v2.${1-403}.withlat > $DIR/Therm.by.lat${1-403}
     ls -l $DIR/Therm.by.lat${1-403}
fi

echo " "
echo -n "Do the Count of therm/yrs by latatitude (Y/N)? "

read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     echo bin/latcust $DIR/Therm.by.lat${1-403}
     bin/latcust $DIR/Therm.by.lat${1-403}
     echo " " >> $DIR/Therm.by.lat${1-403}.Dec.LAT
     echo " " >> $DIR/Therm.by.lat${1-403}.per.LAT
     echo For COUNTRY CODE:  ${1-403} >> $DIR/Therm.by.lat${1-403}.Dec.LAT
     echo For COUNTRY CODE:  ${1-403} >> $DIR/Therm.by.lat${1-403}.per.LAT
fi

ls -l $DIR/Therm.by.lat${1-403}.Dec.LAT $DIR/Therm.by.lat${1-403}.per.LAT

echo " "
echo -n "Look at $DIR/Therm.by.lat${1-403}.Dec.LAT (Y/N)? "

read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     cat $DIR/Therm.by.lat${1-403}.Dec.LAT
fi

echo " "
echo -n "Look at $DIR/Therm.by.lat${1-403}.per.LAT (Y/N)? "

read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     cat $DIR/Therm.by.lat${1-403}.per.LAT
fi

echo " "
ls -l $DIR/Therm.by.lat${1-403}.Dec.LAT $DIR/Therm.by.lat${1-403}.per.LAT
echo " "

echo -n "Clean Up / Remove REPORT files   (Y/N)? "

read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     echo rm  $DIR/Therm.by.lat${1-403}.Dec.LAT $DIR/Therm.by.lat${1-403}.per.LAT
     rm  $DIR/Therm.by.lat${1-403}.Dec.LAT $DIR/Therm.by.lat${1-403}.per.LAT
fi

echo " "
ls -l $DIR/v2.${1-403}.withlat $DIR/Therm.by.lat${1-403}
echo " "

echo -n "Clean Up / Remove intermediate WORK files   (Y/N)? "

read ANS
echo " "

if [ "$ANS" = "Y" -o "$ANS" = "y" ]
then
     echo rm  $DIR/v2.${1-403}.withlat $DIR/Therm.by.lat${1-403}
     rm  $DIR/v2.${1-403}.withlat $DIR/Therm.by.lat${1-403}
fi

exit 

echo " "
[chiefio@tubularbells analysis]$
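As with dotemps, all three arguments are optional: the country code, the merged temperature-plus-inventory file, and an output directory (defaults 403, ./vetted/v2.inv.id.withlat and ./Lats, as coded above). For example:

./dolats 403                                      # take the defaults, answering the prompts as you go
./dolats 501 ./vetted/v2.inv.id.withlat ./MyLats  # or name the inputs explicitly (placeholder directory)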

Some of the postings about particular places and groups of countries depend on variations on these themes that select out individual countries or do a specific list. They are fairly easily created from this code base, so I’m not including them here at this time. If there is interest, I can post them too. I may do it anyway when I get time. For example, the ‘by altitude’ program is mostly a change of any LAT or lat to ALT or alt and one change of the format field to pick up data from the altitude field instead of the temperature field. Oh, and I made it an integer instead of a float. All in all, just a few minutes’ work. I intend to merge the ALT and LAT versions into one program with a flag rather than keep two almost identical programs to maintain… so I have not posted the “by altitude” variant here, yet.

Some of the listings extend a bit past the right margin. Viewing the page source ought to let you see those bits if you need them. As time permits, I’ll come back and “pretty print” the listings so you can see the bits off the edge…


69 Responses to NOAA/NCDC: GHCN – The Global Analysis

  1. Roger Sowell says:

    Great job, Ed. Mega – interesting!

  2. Earle Williams says:

    I dabbled with some GISS US temp data a while back to try and get a handle on the “lapse rate” adjustment that is performed in GISTEMP when a station changes altitude.

    My understanding of the code is that it applies the atmospheric lapse rate to adjust for a station elevation change. So if you move your station from the post office to the fire station 100 meters higher in elevation, the code adjusts for the now colder readings as if you had attached your sensor to a balloon tethered 100 meters above the old location.

    The thermometer is not in free air though, it is 2m above a ground surface of varying elevation. I hypothesized that the temperature change at the intersection of the two half spaces does not match the temperature change within the air half space.

    So I looked at annual, summer, and winter temperatures for a latitude band across the US and plotted up average temperatures against elevation. As I recall my best linear fit yielded a lapse rate that was roughly half that of the atmospheric lapse rate. The rate was more extreme in winter than summer.

    The point of this ramble is that there is an over-correction for changes in station elevation. If stations are migrating to higher or lower altitudes this will bias the elevation adjustments.

  3. Mike says:

    Ed,

    great work, by the way.

    I was wondering if you had managed to get the GISTEMP code in a good enough state that you can start plugging un-trended dummy data with increasing “realism” into it to test its properties?

    I was thinking of starting along the lines of:

    1. Every station reports 15.0°C exactly from 1880 — current day.
    2. Every station reports its long-term average exactly from 1880-current day.
    3. As for 1 and 2, except each station only reports when the real station does.
    4. As for 1-3, each with added, trendless red noise.
    5. As for 1-4, with known step changes (both up and down) to simulate station moves/instrumentation changes, either randomly or when the actual stations are subject to them.

    If the ultimate “world” temperature output trends up or down for any of the 11 permutations above, you will have absolute proof that the trend is an artefact of the method, and you will know for certain exactly what is causing it.

    Just a thought…

    Cheers,
    Mike

  4. E.M.Smith says:

    @Mike: Yes, sort of…

    I have it running through STEP3 (the last LAND step). STEP4_5 just blends in the Hadley Sea Surface Anomaly Map so measuring it is a bit low on my list of “must do’s” given that Hadley can’t explain what they do…

    (And STEP4_5 compiles and generally runs, but it needs a ‘big endian’ data set read and I’m on a ‘little endian’ box. Either I get a different box or install a newer compiler with a ‘big endian flag’. But the code is, I believe, done. Old SPARC stations are about $30 on Craigslist, so it’s mostly just a matter of finding time to do it and a place to put Yet Another Computer… there are 5 within 6 feet of me as I type this already.)

    GIStemp code depends on exact file matching line by line in several places. To the extent that a “what if” changes the number of lines in a file, it may break the run. I learned this by trying an early benchmark where I just deleted some stations v2.mean temperature data. Toy Broke… To the extent you keep the same number of stations, the run ought to work.

    I started down the GHCN investigation as a result of “building the benchmark” for testing GIStemp. I was just going to “characterize the data” and use the “as given” as a benchmark, then change entries (like, oh, delete the thermometers above 500 meters…) and discovered that the GHCN Data Set Managers had already done “The Forbidden Experiment” by deleting 93% of the land thermometers in the U.S.A. in 2007 … that then resulted in these postings.

    So I had this “OMG” moment…

    But now that I know GHCN is hosed after 2007, I’m back to the “how to do a benchmark” question.

    Unfortunately, there are several parts of GIStemp that depend on differences between records to be activated. Fed all 15.0 C I would expect them to “behave oddly”; so I suspect that would not be a very good benchmark… But at the same time, the “base case” real data is clearly hosed, so can not be the benchmark either.

    Which leaves me pondering.

    As an ‘interesting test but not a benchmark’ I’m going to fix the code to merge in the USHCN.v2 data and see what happens in the USA if you actually use the USA data… (I know, actually use the real thermometer data? What a concept…) It would let me ‘run the forbidden experiment’ but in reverse, since GHCN has already done the deletion and GIStemp has already dropped the USHCN updates after 2007. The only “issue” I see is that if the v2.inv file is not in sync with the USHCN.v2 file, the run will break and I’ll need to update the station inventory data.

    But what about synthetic data benchmarks?

    Right now I’m thinking maybe a set of:

    Those stations with any data in any year get ‘filled in’ for all years with one of 4 profiles:
    1) Coastal: made from a model based on the profile of somewhere like SFO, but standardized data.
    2) Central plain. Modeled on? Kansas?
    3) Mountain (if I can find any left in GHCN to model…)
    4) Urban (so the UHI code gets run). Modeled on? Omaha? Atlanta?

    A set of seasonal trended, but multi year flat, data.

    0.0 2.0 4.0 6.0 8.0 10.0 12.0 10.0 8.0 6.0 4.0 2.0

    Then (per 3 below) try it with one year -1 C and one +1 C on each side so a 3 year ripple. And one could do longer ripples too, and even match the counter cyclical patterns seen in the real data where the USA was “hot” in 1998, but Antarctica and the Pacific were “cool” and look to be counter cyclical to each other.
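
    As a rough sketch of that flat seasonal case, something like the following (free-form Fortran) would write one station’s worth of v2.mean-style records, every year identical, using the profile above. The station ID is made up and the column layout is from memory, so check it against the GHCN v2 readme before feeding it to anything.

    ! Write a synthetic, multi-year-flat v2.mean-style record: one
    ! made-up station, years 1880-2009, the seasonal profile above in
    ! every year.  GHCN stores monthly means in tenths of a degree C;
    ! the exact field widths here should be checked against the readme.
    program synth_station
      implicit none
      real, parameter :: profile(12) = (/ 0.0, 2.0, 4.0, 6.0, 8.0, 10.0, &
                                          12.0, 10.0, 8.0, 6.0, 4.0, 2.0 /)
      integer :: tenths(12), year

      tenths = nint(profile * 10.0)

      open (unit=10, file='v2.mean.synthetic')
      do year = 1880, 2009
        write (10, '(a12, i4, 12i5)') '425000000001', year, tenths
      end do
      close (10)
    end program synth_station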

    Then, one by one, do tests:

    1) Prune some years for some places. What does the “fill in” do?
    2) Re-mark a rural place urban (or an urban, rural) to see what change the UHI code causes nearby.
    3) Put in a multi year ripple (but still untrended) with a gradual 1 C up / down / up / down. Does it amplify or dampen?
    4) Drop out data for, oh, 1/2 the rural areas. Does the “anomaly grid box” do a decent job of proper re-creation of the trends? Do things shift?
    5) Ditto for Urban.
    6) Put in a 2 C UHI drift on all those tropical island airports. What happens to the surrounding ocean boxes that are fabricated based on those island airport readings?
    7) Put in a “by continent” bias (as identified in these postings, where the thermometer record starts in one place, but by the time we get sufficient coverage it is clear the early data are strongly location biased…) where we have the full number of thermometers, but “ramp up” the average per year in keeping with the location bias identified here.
    8) Same as 7, but using only the actual stations as they arrive in history. (i.e. start with ‘the one’ and, for each station as added, put in a data set that matches that station’s raw data average). Between them, 7 and 8 let me see the impact of the original location bias in aggregate, and also the differential behaviour with station count.

    and much much more…

    I could be at this for years…

    But since my original “use the data as is” benchmark idea looks like a no-go due to the GHCN being pruned and the USHCN.v2 being dropped entirely, I need a bit of a re-think.

    I may yet use your ideas just to get started with something even if it doesn’t tickle all the code… If nothing else, it would be a ‘robustness’ test and tell us if it causes breakage due to the code not handling ‘edge cases’ well…

    At any rate, it is better to put excess time into thinking through the benchmark design; rather than running a bunch of “benchmarks” that don’t measure anything useful but may mislead. So the “ponder time” is a good thing. I think. Maybe.

  5. Pouncer says:

    I was taught FORTRAN in 1974. I’ve forgotten most of what I even partially learned then.

    But of what I still know, I think your code is beautiful…

  6. E.M.Smith says:

    @Pouncer

    Thank you!

    Any code can be made better with a little white space, some organization, a few well placed comments, and a sense of tidiness.

    Sometimes I do the “pretty printing” while pondering how to fix a bug. The added organization and re-reading the lines can help focus you on what is not quite right yet.

    Sometimes just to make it fit to share with someone else.

    And every so often, just because it, well, it “looks right”…

    I also have to admit that after wading through the uncommented, dense and obtuse parts of GIStemp, I do it just to clear my head and remember what it’s like to keep a tidy mind… Something of a ‘Kata for the programmer’s soul’ …

    Glad someone noticed 8-)

  7. Mike says:

    Ed,

    Thanks for the response.

    Best of luck!

  8. Al says:

    “Unfortunately, there are several parts of GIStemp that depend on differences between records to be activated. Fed all 15.0 C I would expect them to “behave oddly”; so I suspect that would not be a very good benchmark…”

    Just pick a year (say: 1950) and propagate that year’s data from the station-start-date to the station-end-date. Any station that was non-existent in that year could just have its own data from “the year closest to 1950” used as fill.

    The phenomenon you’ve exposed is easy to see conceptually on a small dataset – the trick is determining whether it is actually compensated for somewhere. I really don’t think it is; I’ve never heard of a “station closing adjustment” when the (rather lengthy) litany of various adjustments is discussed.

    Fundamentally, all you’re saying is:
    You need to perform a bloody cross-calibration when you change instruments. And station closures do indeed count as an instrument change.

    REPLY: [ BINGO! You are clearly in possession of “The Clue Stick”. Feel free to swing it in the appropriate directions… And it would be interesting to make a de-trended synthetic data set as you described and ‘see what happens’. -ems ]

  9. Plato Says says:

    Hat off to you, Sir – what a superb analysis and data mining exercise.

    And how much did Phil Jones et al eat up in research grants?

    Oh yes £13.7m in just 10 yrs.

    What were they spending it on?

    REPLY: [ As near as I can tell, lots of junkets to interesting places around the world (Bali, Carolinas, Copenhagen), a nice new Super Duper Super Computer (don’t know why… my work has been done on a box that started life as an x486 machine and has been upgraded all the way to an AMD chip at 400 MHz and with all of 132 MB of memory… I’m running GIStemp just fine…). So, looks to me like they spent it on: Toys, Party Getaway Meetings, and Salaries. But some of it might have gone to nice cushy office facilities and “support staff” to make sure their tea is made right… /sarcoff> -ems ]

  10. Plato Says says:

    Just to let you know that I posted your data on James Delingpole’s blog at the Daily Telegraph and he’s trying to run with it.

    “@platosays – thanks for that fascinating link. Will post on it if I have time. Confirms what Anthony Watts – of Watts Up With That – said in a very interesting talk he gave in Brussels at Roger Helmer’s conference last week. He showed dozens of photos of official weather stations which had effectively been rendered useless by their repositioning (eg near heating vents, in the middle of car parks, at airports).”

    http://blogs.telegraph.co.uk/news/jamesdelingpole/100018003/climategate-five-aussie-mps-lead-the-way-by-resigning-in-disgust-over-carbon-tax/

  11. Chris Polis says:

    Just wondering if the ‘deleted’ station data is still available but not in GISSTEMP? We use station data (Australia) for air conditioning calcs and I haven’t seen anything to indicate that the culled stations have actually stopped recording… might be possible to get the ‘missing’ data?

    REPLY: [ As near as I can tell, most of the stations are still recording. The “deletion” looks to occur at the entry to GHCN. For GIStemp, they have recently put back in the USA data (that was still in USHCN.v2) after some prodding… I expect to find a similar pattern for the rest of the world. The data set manager is the person who coordinates this activity and ought to have managed the meetings where these decisions were taken. A quote from an earlier posting:


    The “magic sauce” is GHCN. As is admitted in the emails, the CRUt series depends heavily on GHCN. GIStemp depends heavily on GHCN. NOAA (with a NASA data set “manager”) produces GHCN.

    All the thermometer location “cooking” that was done to GHCN (moving from the mountains to the sea, moving from the poles to the equator) is reflected in both Hadley CRUt and GIStemp. Same Garbage In, Same Garbage Out.

    From:

    Hadley Hack and CRU Crud


    Comment by Prof. Phil Jones
    http://www.cru.uea.ac.uk/cru/people/pjones/ , Director, Climatic
    Research Unit (CRU), and Professor, School of Environmental Sciences,
    University of East Anglia, Norwich, UK:
    […]
    Almost all the data we have in the CRU archive is exactly the same
    as in the Global Historical Climatology Network (GHCN) archive used
    by the NOAA National Climatic Data Center

    And just who owns that NOAA dataset? Who is “The Data Set Manager”? What I could find looks like a guy at NASA. From:

    GHCN – California on the beach, who needs snow

    down in the comments:


    e.m.smith
    It took a while to find, but I think I found “who owns GHCN” and “who manages it”.

    From: http://gcmd.nasa.gov/records/GCMD_GA_CLIM_GHCN.html

    We find that:

    GHCN data is produced jointly by the National Climatic
    Data Center, Arizona State University, and the Carbon Dioxide
    Information Analysis Center at Oak Ridge National Laboratory.

    The NCDC is a part of NOAA. So I’m not seeing NASA on this list. But…

    It goes on to say:

    Personnel
    SCOTT A. RITZ
    Role: DIF AUTHOR
    Phone: 301-614-5126
    Fax: 301-614-5268
    Email: Scott.A.Ritz at nasa.gov
    Contact Address:
    NASA Goddard Space Flight Center
    Global Change Master Directory
    City: Greenbelt
    Province or State: Maryland
    Postal Code: 20771
    Country: USA

    So it looks to me like it has NASA staff assigned, part of Goddard (though it isn’t clear to me if G. Space Flight Center and G.I.S.S. are siblings or if one is a parent of the other; I suspect GSFC is an underling to GISS. That would have Scott Ritz reporting to Hansen IFF I have this figured out…) (And all that personal data is at the other end of the link anyway so I’m not publishing any private data NASA has not already published.)

    It’s looking to me like GISS has their fingerprints all over the GHCN deletions, with NOAA either as patsy or passive cooperator.


    -ems ]

  12. Bob Bentley says:

    From http://www.giss.nasa.gov/: “The NASA Goddard Institute for Space Studies (GISS), at Columbia University in New York City, is a laboratory of the Earth Sciences Division of NASA’s Goddard Space Flight Center and a unit of the Columbia University Earth Institute.”

    So GISS is organizationally a part of GSFC.

    REPLY: [ It would seem so… -ems ]

  13. Pingback: GHCN Database Adjustments « Save Capitalism

  14. fred ohr says:

    Are the US taxpayer supported domestic climate data gatherers as corrupt as those at the CRU? If so, please send your work to Sen. Inhofe’s staff. He is the most vocal about calling for a Congressional investigation.

    REPLY: [ Of the three major temperature series – NCDC (who produce GHCN), GIStemp, and CRUt – two are American (National Climate Data Center and GISS). All three are highly intellectually inbred and share a great deal of motivation, methods, and work product. All three use GHCN as their basic input data. All three point to the other two as “validation”. All of them are taxpayer supported at the tax trough. I may send something to Inhofe, but I’m up to my eyeballs in just making the stuff. If folks really think it is worthy, they can print it out and mail it or send an email link to him. -emsmith ]

  15. Barry Brill says:

    I was fascinated by your NZ findings on the effect of removing Campbell Island!

    NZ local science advisers, NIWA, arrive at the same outcome as GHCN but using quite different stations and methods. As their adjustments find that NZ warmed by 1 deg C over 1931-2008, I assume GHCN has a similar result.

    It seems highly counter-intuitive that temperate zone islands should warm at twice the global average. My question is how wide is the band, from which the mean is taken? Is NZ an outlier at twice the average, or do many countries enjoy this status?

    REPLY: [ I presumed you meant GHCN (Global Historical Climate Network). The zones, grids and boxes in GIStemp can reach 1000 km to 1200 km away in at least 4 steps. Any given box, grid, or zone may well be “warming” because of something happening far far away… Typically, though, the “reach” is about 1200 km for a UHI or similar adjustment. (There is some variation in how far it reaches for each station, based on how close ‘enough’ stations are to be found.) So take a roughly 1200 km radius around N.Z. and that’s roughly the distance, plus or minus 1000 km … So deleting that one cold island also means that the 1200 km south of it must be filled in from somewhere else (probably further south) too.

    I don’t know how many places are in the status of warming 2 x the average. I suppose I could find it by analysis of the files. As a guess I’d say it’s likely a roughly normal distribution about the mean. For N.Z. there is the southern ocean complication. There is not much nearby to ‘reach’ to. So pruning a single very cold island, and adding a couple of neighbors in the tropical seas up north, can add a great deal of “warming”… -E.M.Smith ]

  16. D. Robinson says:

    E.M. Smith,

    Another excellent analysis, thank you.

    To emphasize your point of the apparent late 20th century temperature climb correlating better with thermometer dropout than with the smooth rise in CO2, it might make a very nice graphic to show temperature data with thermometer count and CO2 on the same graph.

    The visual correlation of temperature to thermometer count and lack of a visual correlation to CO2 ppm would be interesting to see.

    Thanks for the great post.

  17. DABbio says:

    Last line of chorus ought to be changed to “This will be the day that I lie.”

    REPLY: [ Giggle! -ems ]

  18. Pingback: Still I look to find a reason to believe « jdwill07 blog

  19. Dominic says:

    Hi again EM

    We had a difference of opinion a while back on the accuracy of the temperature data (you remember all those exchanges about the law of large numbers and suchlike). Nonetheless, I am impressed at what you have achieved in what is a relatively short period. A good job done quickly!

    A thought just occurred to me that it might be cool to do some sort of interactive web app so that people can pick and choose regions of the world and a choice of locations to be used for which to calculate the temperature changes. Locations could be filtered by altitude, latitude, longitude, population etc…

    Basically it would be a way to quickly do a bottom up calculation of the average for that region. It would allow people to quickly and easily replicate your studies which I think would enhance openness and credibility.

    Out of curiosity, how much time does it take to pull in the data and do the calculations ? Do you see any merit in this?

    Regards
    Dominic

    REPLY: [ Yes, I remember. The precision issue seems to be one of those “shiny things” that folks get wound around. I’ve moved that whole discussion onto the Mr. McGuire thread where it belonged in the first place. Ought to have directed it there from the beginning… From here forward, whenever it breaks out again, it will be redirected there. At any rate, as you can see, despite my belief that the precision is bogus and the average of a set of intensive temperature variables is meaningless, I’m willing to indulge in “willful suspension of disbelief” and go ahead and make reports via that method. (Though I do periodically put in the disclaimer… Rather like the old Irish Monks who copied various manuscripts from old Rome or Greece, and would put in the margin that they thought it was blasphemous, but would faithfully copy anyway ;-) or perhaps the Jewish deli which will sell you a “ham and Swiss” because that’s what folks want… )

    Per replicating the individual studies as a web interfaced tool and time to do a run:

    I think it would be a great idea, but would be a ways out on my “todo list”. The code is published, so anyone can do this. The “match v2.inv data to v2.mean data” takes a couple of minutes, max. If I’d known I was going to do this many variations, I’d likely have dumped it into a relational database. It is a trivial “match on stationID key” of two flat files. The only oddity is that v2.inv has an 8 char key while v2.mean has a 9 char key (the last digit is the ‘modification flag’). In an RDBMS this would be a single command to match the two files on that key.
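
    For anyone who wants to see the shape of that match, here is a crude sketch (free-form Fortran, nothing like production code): read the inventory keys into memory, then keep only the v2.mean records whose leading station key appears in the inventory. The key width is taken from the description above; check it against the GHCN readme, and note that a real RDBMS index avoids re-scanning the key list for every record.

    ! Keep only the v2.mean records whose station key also appears in
    ! v2.inv.  KEYLEN follows the description above (the v2.mean key has
    ! one extra trailing digit); adjust it to whatever the readme says.
    program v2_match
      implicit none
      integer, parameter :: keylen = 8, nmax = 10000
      character(len=keylen) :: inv_id(nmax)
      character(len=200) :: line
      integer :: ninv, i, ios

      ninv = 0
      open (unit=10, file='v2.inv')
      do
        read (10, '(a)', iostat=ios) line
        if (ios /= 0) exit
        ninv = ninv + 1
        inv_id(ninv) = line(1:keylen)
      end do
      close (10)

      open (unit=11, file='v2.mean')
      open (unit=12, file='v2.mean.matched')
      do
        read (11, '(a)', iostat=ios) line
        if (ios /= 0) exit
        do i = 1, ninv
          if (line(1:keylen) == inv_id(i)) then
            write (12, '(a)') trim(line)
            exit
          end if
        end do
      end do
      close (11)
      close (12)
    end program v2_match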

    After that, it’s a very straight forward “select and report” process. It has only minor variations (what field you read) for LATITUDE vs LONGITUDE vs ALTITUDE vs {any other v2.inv field you want to do something with}. The “by altitude” reports also have the finesse of using the altitude if you have it, or the altitude from terrain map if you don’t (there are 2 altitude fields in v2.inv). I put the sort band limits into an external parameter file so they can be quickly changed. I have a set of such files with names like “Australia”, “SouthernHemisphere”, SpoleToEquator, etc. So it’s just a link command to swap parameters.

    Each report takes about a minute to produce. Basically, you read 100 MB of data, so whatever speed your disk reads at, that is how long it will take. It’s a read/select record front end pass that is most of the time. Were I doing this “production” I would again use an RDBMS so that “search on key field” would not require reading the whole 100 MB. You could likely get report times down to the “few seconds” range. It is selecting what parameters to use for a given “cut” and evaluating the results that takes the time. For example, it took me a while to catch on that Africa had a run FROM the beach to the elevated, but hotter, interior; while S. America ran from the snows to the warmer beaches.

    My next muse is likely to be a set of “by population” reports based on that field. (Just in case anyone wants to run out ahead of me ;-) – not that I have better things to do, mind you 8-0 )

    I see my job as to “plough the field” with a rough fast pass of the sodbuster plough. I hope others will find things of interest to ‘redo’ in a better package once they know where the “dig here” places are located. If I run out of ‘fresh sod’ then someday I’ll come back and pretty things up. So if anyone wants to do that flashy web version now ….

    -E.M.Smith ]

  20. Al says:

    I’m having questions about how well even a purely rural thermometer can measure a pure average gridcell temperature. I think we have the information to test this directly.

    Is there a simple way to go from lat/long to knowing the lat/long of the four corners of the gridcell containing the original point?

    That is, given a location, can one find the extent of the bounding gridcell as used in the official analysis?

    Step two would be “And a list of stations inside that box.” But I think there’s code for that already.

    Step three involves weather.com and contour maps.

    IOW: The stated error on the thermometer’s box is (shockingly) for measuring the temperature next to the thermometer. These guys do not seem to ever back off of the “0.1 C” error (or whatever) to the realization “Hey, it can’t measure the temperature fifty miles away to that level.”

    But with the contour maps, it should be possible to derive both the most reasonable current offset for any site and the “proper” error bars. (Which will be just a teensy bit larger than 0.1C) Send -those- errors through the propagation of errors cascade….

  21. E.M.Smith says:

    @Al

    I’ve actually admired that problem a few times here. Forget exactly which link, though.

    The basic problem is that temperatures are fractal. We depend on the air to do a quasi-integration, but… So is the “grid cell” the temperature of the snow in the shade? The sunny rock? The steam of snowmelt? The tree leaves, or the black beetle crawling on them? I was in that place and the surfaces ranged from about 30 F to 80+F. What was “the temperature”?

    And all that stuff was within 100 feet. Imagine 10 miles away…

    Yes, the grid cell layout is defined (it’s ‘in the code’ in STEP2 and / or STEP3 under the GIStemp tab up top; either toannon.f or toSBBX – I’d look it up but I desperately need sleep now). IIRC they are 1/2 degree on a side. So start at the top of the globe and lay out a 1/2 degree grid.
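
    As a sketch of Al’s “step one”, assuming the regular half-degree layout recalled above (the real grid is defined in the STEP3 code, so treat this as illustration rather than gospel):

    ! Given a station latitude / longitude, print the corners of the
    ! half-degree cell containing it, assuming cells laid out from the
    ! pole and the date line.  The example point is made up.
    program grid_cell
      implicit none
      real, parameter :: cellsz = 0.5
      real :: lat, lon, lat0, lon0

      lat = 37.77
      lon = -122.42

      ! Snap down to the cell edge at or below the point.  Shifting by
      ! 90 / 180 keeps the argument non-negative, so int() acts as floor.
      lat0 = cellsz * int((lat + 90.0) / cellsz) - 90.0
      lon0 = cellsz * int((lon + 180.0) / cellsz) - 180.0

      write (*,*) 'Cell latitude  ', lat0, ' to ', lat0 + cellsz
      write (*,*) 'Cell longitude ', lon0, ' to ', lon0 + cellsz
    end program grid_cell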

  22. pyromancer76 says:

    E.M. Smith, I hope you get a very restful sleep, not one with menacing dreams of fiery thermometers coming at you, or dreams so filled with the code remaining to be processed that you crumple the covers in a toss-and-turn. Your accomplishments are deserving of the sweet sleep of babes after a satisfying feed — even though you have been feeding us!

    I am trying to sort something out that is probably more clear to those who know what they are doing. My most urgent issue is: “Where can we trust the raw data”? Not the adjusted data, not that which we now know to be “horridly hosed”. The surfacestations project might apply. If we can trust “some” raw data, where is that data located and, especially, where can it be located where no nefarious minds/fingers can touch it and so that reliable people/scientists can work with it and advertise/publish their findings to the world. Willis Eschenbach has accomplished this and I read about a number of individuals who are checking their local thermometers and finding “no warming”. I still think we need an agreed-upon, verifiable base from which to work.

    One of your conclusions: “So in addition to all the cooking of the books by thermometer selection, we also have buggering of the adjustments as well….This, IMHO, is clear proof that the GIStemp process and NASA charts are as much fantasy as anything else.”

    Unfortunately, I probably want “us” (you? and your knowledgeable colleagues? and their willing associates with computers?) to collect/produce something trustworthy. Maybe develop an internet “academy” as it were — you almost have one as you rough-plow the ground.

    My apologies for the next material — my comments on Basil Copeland’s post this a.m., WUWT — but I can’t sleep over these unknowns….and you are probably one of those most knowledgeable at this point.

    From Pyromancer76, under mod.
    Basil, I think we need to be more certain about the raw data. You suggest that you do not think using raw data is reasonable and that it is not necessary, if I read you correctly.

    Basil 6:28:57 “For all the talk about going back to the “raw” data, I don’t think that is where the problem begins. From my work with US data (I do some consulting work where I have occasion to look at the truly “raw” data occasionally), NOAA does some “quality” control right off the bat in reading from the hand written daily records. I doubt that any systematic “warming bias” is introduced at that point.”

    I just can’t agree. NOAA’s “‘quality’ control right off the bat” must be checked by truthful citizens/scientists — perhaps at a sample of stations to verify that the raw data is not already cooked. Why would you “doubt” that any systematic “warming bias” is/has been introduced when this is the most significant scientific scandal of our time plus one that already has cost us billions, if not trillions — perhaps even quadrillions — of dollars during the last 10 years? Do you think these people are going to let this go easily?

    REPLY: [ Slept wonderfully, thanks! When I’ve “hit the wall” of short slept, I enter that deep “completely zonkered” state where not even dreams intrude. Probably part of why I “push it” so much ;-)

    As of now we know:

    1) The original form images are on-line for the USA. They can be trusted as actual raw data.

    2) NOAA has significantly processed all three of GHCN and USHCN and USHCN.v2 in “strange and wondrous ways” that can not all be valid. USHCN and USHCN.v2 are at times up to 1/2 C different from each other. Both can not be “accurate” as “raw” data, so both must be treated as “wrong”. AND both are different from GHCN (though in different ways…).

    1 + 2 together says that the only “provably correct” data lie between the paper images and prior to NOAA.

    For the rest of the world, the same analysis holds, except we don’t have access to the paper images…

    Oh, and “surfacetemps.org” (I think I have that right) is being started (see WUWT) to do exactly the job of making a real, audited, raw temperature series. The only major problem I see is those foreign countries that will not release the raw data as they see a money spigot… But we don’t need a whole world coverage to show that “global” warming does not exist. If, say, the USA, Australia, and England are not warming: It isn’t “global”…

    And, IMHO, Basil is wrong. On two counts. First: “Trust Me” is not an acceptable auditing technique; and when there has been demonstrable reasonable suspicion of fraud, only a real forensic audit is acceptable. Not “trust me” from anyone: even me.

    Second: We have the existence proof of GHCN USHCN USHCN.v2 divergence. They have all been held up as “right” (as has the “adjusted” versions of the same data). They all have 1/2 C variations demonstrated in a single small sample, with perhaps whole degree C variations. They can not all be “right”. So, almost by definition, the NOAA data are wrong as given. The answer of “we were wrong before, but now we have it right” is just another form of “trust me”, so you can’t use it to dump USHCN and embrace the New and Improved USHCN.v2 data set.

    The only way out of this is a full forensic audit, with “before data”, the “transformation methods and code”, and the “after data” published; or a not very good alternative, a “select panel” including “skeptics” (several) that views the evidence in private IFF it must be done that way for selected foreign countries only. I.e. US, UK, Australian, etc. data sets and transforms public, but if Nepal insists on proprietary rights, then the “panel” audits Nepal data and transforms.

    Until something like that happens, we have no “global temperature data”, all we have is “Global NOAA / NCDC Computer Fantasies” as “Raw adjusted QA screened preened homogenized pasteurized data food product” to work with: and as we’ve already seen, they cook the books; sometimes in several different ways.

    -E.M.Smith ]

  23. VJones says:

    @pyromancer76

    The quality of the ‘raw data’ is certainly the issue ‘du jour’, and even the question “which raw data?” Basil’s statement holds up for me – especially if he has seen the data.

    I deal with raw data of a different kind in a lab and people put decimal points in the wrong place and reverse figures all the time. There is also a need to adjust for Time of Observation (TOBS). It would be hard to introduce systematic warming at that stage, but if an agency ‘wants’ to do it, much easier when you can see the bigger picture at the collated data stage.

    REPLY: [ I understand the basic need, but doubt the morality and capacity of the “doers of the deed”. A lot of folks have wondered about the need for a TOBS adjustment and what it’s for (“If you are doing a min / max thermometer, how does when you read it change the min or the max?”). You might consider a posting explaining why you think that is an issue. I’d certainly read it… -E.M.Smith ]

  24. Al says:

    I’m pretty content starting with the satellite data as “trusted.”

    The reason being we’re actually focused on trends, not the actual raw temperature, and the satellites have the coverage and uniformity that are desperately lacking elsewhere.

    If we later discovered “Hey, we need to perform strict calibrations of the satellite data” it would be a matter of saturating some gridcells with surface stations. Really, quite straightforward – but it should not influence the trends, only the offset and the error bars.

    So one should be able to at least come up with a number for “True ground level gridcell temperature based on the satellites best efforts.” For most every gridcell.

    Then perform as many calibrations as possible over the period of overlap between ground stations and satellite data. Yes, a ground station might be four degrees offset just because gridcells are large. And another 8 because it is more shaded, or whatever. But if you’re thinking of the surface station as more of a ‘proxy’ for the gridcell temperature than a direct reading, you should arrive at a reasonable 1) reading, 2) error. And it won’t be the insane “0.1C” error of a perfectly placed thermometer.
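
    A minimal sketch of that calibration idea (made-up numbers, and treating the station purely as a proxy for the cell): over the overlap period, take the mean station-minus-satellite difference as the offset and the scatter about it as the error bar.

    ! Mean offset and 1-sigma scatter of a ground station against the
    ! satellite estimate for its grid cell over an overlap period.
    ! The six annual pairs below are invented for illustration.
    program cell_calibration
      implicit none
      integer, parameter :: n = 6
      real :: station(n)   = (/ 14.1, 15.0, 15.8, 14.6, 13.9, 15.3 /)
      real :: satellite(n) = (/ 12.0, 12.0, 14.6, 12.0, 12.4, 12.5 /)
      real :: diff(n), offset, sigma

      diff   = station - satellite
      offset = sum(diff) / n

      ! Sample standard deviation of the differences: the error bar on
      ! using this station to stand in for the whole cell, rather larger
      ! than the 0.1 C quoted for the thermometer itself.
      sigma = sqrt(sum((diff - offset)**2) / (n - 1))

      write (*,*) 'Mean offset (C): ', offset
      write (*,*) 'Scatter, 1 sigma (C): ', sigma
    end program cell_calibration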

  25. kuhnkat says:

    E. M. Smith,

    thank you so much. Sorry to use your work for petty reasons, but, I have run off several alarmists by suggesting they come by your site and AUDIT your work!!!

    Never hear back from them on the subject!!!!

    Here is another interesting analysis of the GHCN adjustments similar to the above link from SAVECAPITALISM in the comments.

    GHCN and Adjustment Trends

    REPLY: [ The link was also put in a comment on another thread. I suspect it is ‘going viral’. BTW, the whole idea behind “Public Review” is just what you did. Put it in their face. If it has a flaw, find it. If not, get over it. “Truth needs no gatekeeper. -E.M.Smith” and I’d add, it needs no peer review either… just the bright light of day and a citizen scientist army that is ‘without peer’. -E.M.Smith ]

  26. cthulhu says:

    REPLY: [ For some unknown reason, WordPress pended this post for a while. OK, I looked it over and didn’t see any particular key word that ought to have set it off, but who knows. I’m going to interleave my reply inline in the text, since the tone is a bit combative and it has no valid email address. But, for future reference “cthulhu”, you need a valid email address. -E.M.Smith ]

    you can’t trust the raw data – we know it’s wrong.

    REPLY: [ Yes, we do, and while I’m sure it would be better to just admit we have no data suited for “climate change” study, there seems to be an entire industry built around this bit of “junk data” that we have, so that is what we are stuck with. Unfortunately, this crappy base data is not improved as thermometers are deleted from cold locations (but only in the recent record, not in the baseline periods…) nor as the data themselves are “adjusted” to be colder in the past, yet warm in the present. The base data may be really crappy, but they are vastly better than the “cooked” adjustment products.
    -E.M.Smith ]

    The adjusted data is an attempt to correct for problems in the raw data.

    REPLY: [ And “cook the books” to show warming where there is none, IMHO. The pattern is too consistent and too clear to be an accidental artifact. There is a deliberate molestation of the base data that looks to me to be consistent with deliberate fraud. You do not end up with the only “valid” records in California being from SFO and “3 on the Beach Near L.A.” by a statistical artifact nor by “adjusting” for old thermometers. BTW, I find the quality of the older scientists’ work and the care with which data was recorded in the past to be at least as good as today, and often far far better. Oh, and they were honest too. Here is a good starting point for you to understand what GIStemp is supposed to try and do:

    GIStemp – A Human View

    -E.M.Smith ]

    Recent station drop out is probably a case of only some of the GHCN station data being updated in realtime. I guess some of the data is picked up a few years later and a lot will be picked up in GHCN v3 or its equivalent.

    REPLY: [ Not a chance. GHCN has a cutoff date each month. If you miss it by A DAY they do not go back and put the data in. It is also not the case that the, roughly, 90% of thermometers that were “left out” of GHCN in The Great Dying of Thermometers were “only some” or an artifact of “updated in realtime”. You are completely mischaracterizing what actually happens and has happened. -E.M.Smith ]

    Will that change the record significantly? I have to say the satellite records are in close agreement wrt the 2002/2003 step jump in temperature, so I doubt it.

    REPLY: [ The old “satellites agree” canard again? So sorry, that is a bad joke. The base data are “cooked” by making the past colder, prior to the satellite record. They can agree all they want today. It’s the past that is being re-written. Also, the location of the thermometers is changed so that cold locations are deleted in the present. This biases the anomaly maps relative to the baseline when the cold locations are left in. This has nothing to do with satellites. Basically, the satellite record is way too short to be used for any climate related studies. Please don’t waste time by bringing it up again. -E.M.Smith ]

    So the issue is one of sampling. Is a smaller sample of stations able to represent what the whole population shows? This would seem easy to test. Artifically remove a subset of the total station record and see how that affects the result.

    REPLY: [ No, adjustment is not a sampling issue at all. Nor are the deletions. This is not a random act, an accidental random sample or subset; it is focused in time and space. The issue is one of selective manipulation of which thermometer locations are kept in the record, along with “adjusting” the past data to be artificially cold, erasing the hot 1930s, lifting the cold 1970s, and putting a hockey stick on the end. Per your sample question: I did substantially that. I kept in the longest lived, best tended thermometers and found that the warming goes away. (Later, more detailed studies also show no warming in places as diverse as Africa and the entire Pacific Basin. Like all the studies that this posting links to, which you seem to have not bothered to look at.) The warming is all carried in a batch of biased, short duration, recent thermometers strongly biased to warm and equatorward locations. There are lots of studies here showing that. Please read:

    The Northern Hemisphere – What warming?

    NOAA/NCDC: GHCN – The Global Analysis

    (Yes, a link to this very page. It would seem you didn’t read any of the reports linked to from this page the first time, so try again.)

    GIStemp Quartiles of Age Bolus of Heat

    And the few dozen others you will find on the right side under “AGW and GIStemp issues” and “NCDC / GHCN issues” -E.M.Smith ]

    Of course remember to delete your code afterwards rather than comment it out, in case it’s leaked and someone misinterprets your artificial adjustment to remove stations as something nefarious…

    REPLY: [ No comparison between the clear collusion to deception at CRU as evidenced in their emails and a simple “what if” fully documented and published. If you can’t see that, well, … “Subset” analysis is far far different from back room deals at CRU and silent deletion of data in GHCN. -E.M.Smith ]

    Also, with comments like this it would be a good idea to demonstrate that the method changes the results in any significant way rather than just assuming it does. If it doesn’t change the results significantly whichever method you choose (i.e. it isn’t the make or break of recent warming) then your complaint is vacuous.

    REPLY: [ I take it you did not notice that both methods are computed and printed out at the bottom of the reports. The comments just explain what they are. The difference is HIGHLY significant (for Antarctica it comes out to 7 C. Yes, 7 whole degrees C.) There is no ‘assuming it does’; there is proof it does, and it dramatically and significantly changes the results (as evidenced in the various reports showing the bouncing around in the 1/10 C place and sometimes the whole C place). The only thing “vacuous” here is your reading of the postings. The actual runs and program are right in front of you, and you bring up this kind of “IF” dodge? There is no “if”; there is a demonstrated result.

    GHCN – Antarctica, Ice On The Rocks

    The key point is that it doesn’t matter what you would choose at all. All that matters is that an effectively random choice that the programmer MUST make changes the 1/10 C place and sometimes the whole C place. That programmer choice swamps the “global warming” signal. In one single averaging example. What it does in the dozen or so “serial averagings” of GIStemp is anyone’s guess. GIStemp has not been properly benchmarked nor QA vetted. AGW could easily be nothing but a programming bug from a choice exactly like this one. -E.M.Smith ]

    “C I would chose to use the average of all data in a month
    C since it is less sensitive to the variation of number
    C of thermometers in any given year, but you might chose a
    C different GAT. Averaging the data directly gives weight
    C to the years with more data. Averaging the monthly
    C averages gives each month equal weight. Choose one…
    C Rational? No.
    C But it is the reality on the ground..
    C Basically, if you do “serial averaging”, the order of the
    C averaging will change your results. As near as I can tell,
    C GIStemp (and the whole AGW movment) pay no attention to this
    C Inconvenient Fact.”
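
    A tiny worked example of the order dependence that comment describes (made-up readings, nothing from GHCN): one month with three reports of 10 C and one month with a single report of 20 C give 12.5 C if you average every reading in one pass, but 15.0 C if you average each month first and then average the months.

    ! Average the same readings two ways: all data at once (months with
    ! more thermometers get more weight) versus monthly means first
    ! (every month weighted equally).  Readings are invented.
    program averaging_order
      implicit none
      real :: month1(3) = (/ 10.0, 10.0, 10.0 /)
      real :: month2(1) = (/ 20.0 /)
      real :: gat_all, gat_monthly

      gat_all     = (sum(month1) + sum(month2)) / 4.0
      gat_monthly = (sum(month1) / 3.0 + sum(month2) / 1.0) / 2.0

      write (*,*) 'Average of all readings:     ', gat_all
      write (*,*) 'Average of monthly averages: ', gat_monthly
    end program averaging_order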

  27. rob r says:

    Similar to Earle Williams I have looked at the environmental (altitudinal) lapse rate across surface stations but in the South Island of NZ. Similar to Earle I find that the lapse rate is much greater in winter than during summer (more than double). The mean annual environmental lapse rate across the surface is less than that in the free atmosphere here as well.

    Questions begin to arise from this type of observation. Is the seasonal lapse rate constant through time, or does it change from year to year as the regional temperature fluctuates? I don’t know. The summer-winter difference may point to a greater lapse rate in colder summers than in warmer summers. Perhaps a greater lapse rate during glaciations than during interglacials?

    How do HadCRUT and GIStemp handle such issues?

    REPLY: [ As near as I can tell, they don’t handle it. FWIW, you got me thinking. The actual rock surface of Antarctica is very low, but the built-up ice cap makes the elevation very high… So now I’m wondering if ice ages are partly the result of a self-fulfilling altitude build-up. Once the ice gets high enough, all it does is add more… -E.M.Smith ]

  28. nolan says:

    Mr. Smith, Go to Washington! This must be presented to each and every blow-hard trying to destroy our country.
    With your permission, I’d like to link to this as much as possible, but with the expectation of getting the same response kuhnkat seems to be getting!
    Of the many sites and info I’ve seen dissecting and debunking the AGW party-line, this info is the most far-reaching in its scope and I’m duly impressed.
    Thank you for doing what must be a mind-numbing task.
    over

    REPLY: [ I’d love to “go to Washington”, but they would not have me. I am, for better or worse, exactly the kind of “Mr. Smith” that was in the movie (though a bit more worldly and cynical than he was… having seen more of the world.) I would not survive an election process, what with the way politics are today. Being a Sarah Palin sort of “Joe Sixpack” person, I’d just be vilified and slimed. There is no place for a normal human being who holds honesty and the truth above all else in Washington. My God, man, I’m an agnostic married to a hyper-religious person! What room is there in DC for someone with that kind of tolerance for others views and who thinks the constitution is readable by anyone and ought to be followed to the letter! And if you don’t like it, there are specific directions on how to change it, and no other change is valid. I’d either be slandered to death or outright killed. (But what a way to go! … )

    On using stuff you find here:

    Link Away! I hereby grant blanket permission for anyone attempting to stop the AGW fantasy to use any information presented on this site that relates to debunking AGW in that effort. Print it. Fax it. Link it. Publish it. Make “peer reviewed” research based on it. Get a Masters Thesis or even a Ph.D. thesis out of it. Whatever. Go For It! That’s what it is here for. A footnote of attribution would be appreciated, but is not necessary if it would interfere in the effort. (i.e. if it would prevent being published or getting that M.S., then just forget you picked up the idea here… I will not mind at all.)

    And yes, it is at times incredibly mind numbing. During the first 6 months of bashing my brains against GIStemp I “gave up” a dozen times. I would work on it until I could not stand it any more, then take a break. I even went so far as to visit pro-AGW sites just to be abused by them so as to get the motivation back! Were it not for being “borderline high function Aspergers” (and so subject to a need for “compulsive completion”…) I probably would have given up (as have so many before me…). FWIW, I can now feel completion is at hand. There is still a lot to do to “flesh out the picture”, but the pattern is mostly complete now. So as of now, it is much less of a “PITA” and more of a “finishing touches” and that feels really good 8-)

    -E.M.Smith ]

  29. Pingback: Sensible solutions that benefit the public are being ignored « TWAWKI

  30. Rob R says:

    Of altitude and ice sheets

    There is quite a bit of published literature on theories of ice sheet growth and decay and the various feedbacks involved (notably involving but not limited to ice albedo and astronomical cycles).

    My particular interest relates to the periodic regional glaciation of the Southern Alps in the South Island of New Zealand (largely non-glaciated at present), hence part of my reason for examining lapse rates.

    One thing that appears to be an influence on ice volume in Antarctica is sea level. If sea level is lowered due to ice volume increasing in the Northern Hemisphere then this exposes significant areas of continental shelf around Antarctica. So the ice-grounding line migrates outward. This means a significantly larger area for ice accumulation and it changes the “profile” of the entire ice-sheet. So lowering of sea level can cause an increase in Antarctic ice volume even if there is no change in Antarctic climate. Throw in air and sea surface temperature, wind and precipitation patterns that are out-of-phase with climate change from the Northern Hemisphere and you have a very complex system.

    If the ice area increases in response to falling sea level then the regional albedo also changes. This again is not caused entirely by local temperature change. It is a passive response to sealevel.

    So change in the size and shape of the ice sheet has a mix of causes. Then there are the effects. The change in shape changes the internal precipitation pattern. Some areas get “drier” and little additional ice growth occurs there even if the climate gets colder. Some areas can gain kilometres of ice thickness, e.g. the Ross Sea, which is mainly continental shelf. The extent of winter sea ice is also influenced by sea level and ice sheet size. These changes have climatic impacts as far away as NZ, Australia and most of the South Pacific.

    Ice sheet dynamics is a complex topic and is generally not well understood even by the experts (and an expert I most certainly am not).

    Nothing like a bit more useless information to confuse matters

    Cheers

    Rob R

  31. aclay1 says:

    Wonderful analysis. Can you imagine the legacy media even attempting to do this sort of work, let alone writing about it?

  32. Pingback: Lamont County Environment » Blog Archive » Climate Scandal: Corrupted UN-IPCC World Temperature Data

  33. globaltemps says:

    I’m probably a bit late to chime in (just found this blog), but I independently came to the same conclusion while trying out a different representation of the data in the NOAA files. Check here for another look at the trend to lower latitudes.

    REPLY: [ Never too late until AGW is a historical fantasy… It’s always a good thing to have truly independent verification. -E.M.Smith ]

  34. Bill S says:

    Love the work you are doing–and Steve McIntyre, Jeff Id, Anthony Watts, and all the rest. I have been a skeptic for years but now I am getting a lot more interested in the specifics of how temperature data are adjusted for the various effects (UHIE especially). However, I cannot find actual rules posted anywhere that delineate how they actually do it. The Hansen 2001 paper on the Gistemp site has 2 graphs on page 15 which, I sincerely hope, are not the way they make modifications for UHIE. Do I need to download both the adjusted and unadjusted data and do my own comparison to figure out what changes have been made/what the rules are? Have you or someone already done that? If there is a better place to start I’d appreciate any suggestions. Thanks, and keep up the good work!

    REPLY: [ Thanks! Unfortunately there is no “one way” to do UHI adjustment. The way GIStemp does it is, IMHO, bizarre. For each station, they look for up to 20 “nearby” stations that are up to 1000 km away and are marked “rural” (which includes many cities with enough population to have a UHI, along with many many airports and some nice warm sewage plants), then they “adjust the past” of the “urban” station… in about 1/4 to 1/3 of the cases, they adjust the wrong way from UHI (meaning the reference is wrong). If you want to see the ‘gory details’ the program in question is PApars.f in STEP2. You can find it under the “GIStemp Technical and Source Code” category on the right margin. There is also a write-up of the impacts in the “GIStemp, a slice of Pisa” posting under “AGW and GIStemp issues” Category. The general mode of adjustment is to “sort of average” the up to 20 “Rural Reference Stations” then find an “offset” in one period of time (the baseline) and compare this offset to the present to decide how much to change the past. If that sounds a bit like hokum to you, well, join the club… -E.M.Smith ]
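
    A schematic of that mode of adjustment, as described in the reply above and nothing more (it is not the PApars.f logic, and every number is made up): compute the urban-minus-rural offset in the baseline and in the present, and treat the growth in that offset as the urban drift to be removed by adjusting the urban station’s past.

    ! Baseline-offset versus present-offset schematic of the UHI
    ! adjustment described above.  All temperatures are invented.
    program uhi_schematic
      implicit none
      real :: rural_base, rural_now, urban_base, urban_now
      real :: offset_base, offset_now, drift

      rural_base = 14.0   ! mean of the "rural" reference stations, baseline
      rural_now  = 14.2   ! same reference stations, present
      urban_base = 15.0   ! the urban station, baseline
      urban_now  = 16.0   ! the urban station, present

      offset_base = urban_base - rural_base
      offset_now  = urban_now  - rural_now

      ! The growth in the offset is taken as urban drift and removed
      ! by re-writing the urban station's past record.
      drift = offset_now - offset_base

      write (*,*) 'Baseline offset (C):     ', offset_base
      write (*,*) 'Present offset (C):      ', offset_now
      write (*,*) 'UHI drift to remove (C): ', drift
    end program uhi_schematic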

  35. Pingback: PUNK ROCK REPUBLICAN » Blog Archive » This Blog Might Be Even More Chaotic Than Mine But …

  36. Pingback: US ClimateGate: GHCN (Global Historic Climate Network) and NOAA; John Coleman on C02 (Video) « The Catastrophist blog

  37. Pingback: “This Is A Scandal To Rank With Climategate”: Temperature Readings Have Been Manipulated At The Two Key Climate Centers In The U.S. – Pat Dollard « The Daley Gator

  38. Jim Sward says:

    I haven’t read all the articles yet, but I have a question about the melting of glaciers and the open sea in the Arctic. Can you point me to the data (article, site, etc.) that reconciles the melting glaciers and the lack of global warming? I believe everything the climate warming skeptics say, but the glaciers are really melting and rivers disappearing on real time scales.

    Thanks for your work and if I can do anything as a resident of upstate SC let me know.

    Jim Sward

    REPLY: [ There has been a lot of discussion of the Arctic over on http://wattsupwiththat.com with the general observation that the pattern is cyclical on at least two time scales that are longer than the typical person’s awareness (a 60 year cycle and a 176 year cycle, and a half-that 88 year or so cycle; and I’d add there are others, including a 1500 year Bond Event cycle…) based on ocean currents and potentially solar changes. THE major driver of Arctic melting is ice breakup and flushing to warmer waters driven by wind and ocean currents. Glaciers often grow when it is WARMER as warm water makes more snow at high elevations and shrink when it is colder as you get less snow. Glacial advance / retreat is not just a cold driven growth thing… So I’d start with some key word searches there. For example, a Google of “wattsupwiththat glacier” gave 41,000 hits including:

    Study: Glaciers defied hotter temperatures 9000 years ago

    Swiss ETH: Glaciers melted in the 1940's faster than today

    While a Google of “wattsupwiththat arctic” gives 118,000 hits (I’m sure most of them are links or folks talking about it…) including:

    The Arctic Oscillation Index goes strongly negative


    http://www.hoodathunkblog.com/2009/05/wattsupwiththat-arctic-non-warming-since-1958/
    http://digg.com/environment/1922_Arctic_Ocean_Getting_Warm_Seals_Vanish_and_Bergs_Melt

    Notice that last one… 1922. We’re not seeing anything new today. We just don’t live very long, that’s all.

    What can you do for me? Go visit the Smoky Mountains and drive a bit of the road down the top of them… very fond memories of it… I’d love to think of someone visiting there: http://www.blueridgeparkway.org/ or whatever road it turns into in SC. Either that, or go catch some big ass fish ;-) Hey, if *I* can’t do it, at least I can enjoy the idea that someone else is doing it for me ;-) Beer optional (but would be a nice touch ;-) And if you get a chance, find some 70+ year old and ask them if the weather now reminds them of any prior times, and see if they don’t pop a story or two along the lines of: “Well, it was hott’rn this back in the ’30s, but we did have that cold spell in the 40’s, right now it’s kind’a remindin’ me of the middle of the ’70s when we had this really hard snow … ” and post the story in a comment here somewhere… It would be nice to capture some of that oral tradition before it literally dies out. (I have a 2nd hand oral history from when I was about 8 and talked to the 70-somethings in my home town. I remember them talking about how Godawful hot the ’30s were and some of them mentioned a long slow wandering from hot to cold back to hot etc.. We don’t honor that wisdom enough.)

    -E.M.Smith ]

  39. Martin Judge says:

    Bolivia La Paz/Alto

    La Paz/Alto is one of the stations claimed by GISS to be actually used in its global temperature records. As has been pointed out, La Paz has no data post 1990. (Its data record is poor prior to 1990!)

    Does this mean that the 1500+ stations claimed to be used by GISS includes stations which do not actually have any recent data records?

    How many of GISS’s nominated stations produce data? Most important: how many of its station set have a reliable data record going back at least 50 years?

    Does anyone know the answer to these questions?

    REPLY: [ The answers are knowable, but with some work. That is what many of the reports posted here under the GIStemp tab up top (and links it points to) are all about. There are roughly 6000 total stations in GIStemp and NCDC / GHCN; of them, approximately 1500 of the GHCN stations have data for 2009 (and of those, about 1000 make it through the GIStemp steps that toss out short records). GIStemp, as of November 2009, moved to a newer version of USHCN (called Version 2) that fills in the missing USA stations from May 2007 (when version 1 cut off the data) to now. I do not yet have a count of those stations, but it ought to be in the “1000 or so” range.

    One thing to watch out for: You will often see “coverage maps” showing thermometer dots all over the world. These most often are just showing that a thermometer was at that point SOMETIME in the past. It does not mean that a thermometer is there, reporting, and USED in GHCN at this time. So a lot of places show a GIStemp used thermometer, but have no data since 1990. (On the order of 4000 stations…)

    Hope that helps. -E.M.Smith ]

  40. Barry says:

    Is it true that Bolivia gets a double whammy?

    First, it is deemed to have the same high temperature as a sea-level thermometer in nearby Peru.

    Secondly, it is automatically subject to an “elevation offset adjustment” by the topographical data built into the program, to raise the deemed alpine temp to its sea level equivalent.

    If all the high areas in the world are notionally increased to sea level equivalent temps, how can this be regarded as the average temperature of our mountainous planet?

    REPLY: [ I don’t think Bolivia gets a ‘double whammy’ due to the attempts at correction in the GIStemp code. My opinion is that GIStemp tries to make corrections, but fails to do it enough (I have a preliminary benchmark that shows this, but have not completed the checking enough to publish it). So, for example, that “beach in Peru” will be, oh, 5 C hotter on average. Now GIStemp tries to compute an offset and maybe comes up with 4 C (due to using an airport with little UHI in 1950) then uses that to ‘adjust’ today’s temp at that airport (which may presently have a 6 C hotter ‘offset’ due to that airport being much bigger). And you get a 2 C temperature “bump” in the ‘guess’ for the mountain due to comparing the past airport ‘offset’ with the present airport temperature at the beach. 2 C is not as big as 5 C… but it is not a 0 offset, which is what the claim of Global Warming requires if we are to get excited about a 1/10 C change in the “anomaly”… OK, one complication. Say the baseline is from the 30 year cool phase of the PDO (it IS cherry picked to be in the bottom of a cool dip) and the present is in the hot phase. Now all it takes is for that beach near the ocean to have a larger response to warm water near it while the mountains have a lower response being so far away. At that point you could have an added 1 C or 2 C of ‘offset’ today that is not in the baseline. Your “2 C” warmer becomes “4 C warmer”. When a real thermometer in the mountains might show they were not warmer at all… As you start “double dipping” those effects, then you might be able to get a “double whammy” where a mountain site with no growth, compared to a tourist airport at the beach, could be unchanged on a real thermometer, yet get +2 C from the “corrected” offset AND get +2 C from the offset being in a different regime of the 30 year cycle. Now you have 4 C of fiction and no reality… repeat for any other similar effects… And that is why I think that if you want to know the temperature in Bolivia, it is better if you actually use a thermometer and measure it… Especially since the data ARE being collected and ARE available. Frankly, to choose NOT to use real data when you have it smacks of “agenda driven design” just way too much. There is simply no valid reason I can see to drop 3/4 of the thermometer data on the floor and say “Oh, no worries, we can make it up more accurately than we can measure it.” -E.M.Smith ]

  41. Martin Judge says:

    Martin Judge

    Thanks for the speedy reply.

    I understand what you are saying, but the necessity for any measuring reference – be it economic, climatic, or medical – to have temporal consistency is so fundamental to statistical analysis, that I still find it hard to believe that what you describe has happened, has actually happened.

    In business it equates to the New York Stock Exchange removing 75% of the companies from the Dow Jones Index and yet claiming that there has been no change to the Dow and that the Dow is still an accurate reflection of US business value.

    I don’t think the Wall Street Journal would agree.

    Thanks again for your reply : your excellent work is appreciated.

    REPLY: [ You are most welcome. The Dow 30 Industrials is, in fact, a stellar example. Why? Simple. Ask yourself: Is Bank of America an “industrial”? Classical “Dow Theory” works if you compare the “Dow Transports” with actual industrial companies (it depends on the fact that stuff has to be transported to and from a factory at a different time from the factory operation; so you will find that you must have a rise in shipments of coal, iron ore, fabricated parts, etc. before you can ship a car, and that car must be shipped before the sale reflects in the company report.) OK, loading up the “Dow 30 Industrial” with a couple of banks and some drug companies does NOT reflect the original needs of the Dow Theory. This rankles Dow Theory Purists, as you might imagine. Yet it remains a reasonably valid indicator of general business growth or progress. Why? Because there is a very complex set of “factors” applied to each company in the Dow 30 Industrials today to maintain the continuity of the series. So what would happen if we dropped out the banks and the drug companies? First off, we would need to take the weighting factors applied to the other companies and bump them up by a lot to keep the quoted Dow from yesterday the same value. Then we would need to accept that when The Fed makes announcements about interest rates or when a mortgage crisis hits, the “Dow” will move much less (no bank volatility with interest rates and collapse of mortgages – that happens every decade or three… remember the “S and L Collapse”… and the “Banking Panic of 33 A.D.”) We would likely need to apply some volatility adjustment to try to keep the movements the same. The list goes on… So, OK, you can do that “well enough”. Now the problem: How do you compare that “New Dow” without the banks and drugs to the “Middle Dow” that had them? Does an “anomaly” comparing one to the other still “work”? Is it valid to compare that “New Dow” to a Dow in mid crash in the banking crisis we just went through? Patently not. BUT it WOULD work well again in classical Dow Theory… FWIW, the mirror image of this “issue” was raised when the Dow 30 Industrials was expanded in number of companies and had the banks and drugs put in in the first place. It damaged the utility of classical Dow Theory and there was a great deal of complaint about lack of “continuity” with the past making comparison (i.e. anomaly detection) between the “Really Old Dow” and that “Middle Dow” an exercise in frustration. So, in fact, the Dow 30 Industrial has been through a very similar upheaval and had a very similar “You want to do WHAT?” response from the trader community (and discussions of it do surface from time to time). It was just so long ago few folks remember it and in so narrow a community of folks (traders who study Dow Theory) that it gets little coverage or awareness out of that community.

    For a good laugh, read:

    Business Panic of 33AD – things never change

    And you will see that things never change. It would read like last year’s news were it not for the “A Funny Thing Happened on the Way to the Forum” character names and setting.

    Finally, this effect of selection bias can be a good thing. It is why it is so darned hard to beat the “S&P 500” as an investment vehicle. The S&P is a market-capitalization-weighted index: the 500 biggest companies in America. THE reason it beats 60-75% of all professional money managers is that the index keeps changing what is in it. Basically, size matters. And when a “buggy whip” company drops in market size, it exits the S&P prior to complete collapse and bankruptcy (cutting your losses); and when PDAs and iPods grow large enough to “be interesting”, those companies enter the index. Any “stock fund” that follows the changes buys into new trends at a “good time” and sells at a “good time”. Maybe not the optimum time, but better than most fund managers (following their fantasy theories about what is happening or why… gee, sound familiar?…)
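
    A crude sketch of that turnover effect – my own toy, not S&P’s actual methodology (real index committees use float adjustments, buffers, and judgment), and all names and capitalizations are hypothetical:

        # Toy "top N by market cap" index: membership changes as sizes change.

        def top_n_members(market_caps, n):
            """Return the n largest companies by market capitalization."""
            return sorted(market_caps, key=market_caps.get, reverse=True)[:n]

        caps = {"BuggyWhips Inc": 50.0, "Steel Co": 80.0,
                "Rail Co": 70.0, "PDA Corp": 10.0}     # billions, made up
        print("members now:  ", top_n_members(caps, 3))

        # Buggy whips shrink, PDAs grow; the index quietly swaps the loser
        # for the winner long before the loser hits bankruptcy.
        caps["BuggyWhips Inc"] = 15.0
        caps["PDA Corp"] = 60.0
        print("members later:", top_n_members(caps, 3))

    Any fund that simply tracks the membership list gets that “cut losers, buy growing trends” behavior for free.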

    It was those two bits of history / understanding that made me say “They do WHAT? Compare a baseline with one set to ‘today’ with another set? And make adjustments based on ‘knowing their theory must be right’ instead of via benchmarking and testing performance?” Then toss in the ‘almost always wrong Nintendo modeling’ that has driven many a ‘program trading firm’ out of business and completely missed things like the housing collapse… So now you can see why my “Skeptic” buzzer started going overtime. It is the same set of human failings and behaviours that fails miserably in the financial field, yet they claim perfection to 1/100 C?? “I don’t think so, Tim”… ’cause I’ve seen this movie before a few times and it doesn’t end well. So I started testing, measuring, and checking claims. The rest, as they say, is history. -E.M.Smith ]

  42. Martin Judge says:

    E M Smith

    Thanks again for the info: if the S&P 500 is a better guide than the Dow 30, wouldn’t the GISS 6000 be a better guide than the GISS 1000?

    Regards

    MJ

    REPLY: [ BINGO! Give that man a Kewpie Doll!! ;-) -E.M.Smith ]

  43. Pingback: Climategate goes American: NASA Goddard Institute for Space Studies involved - TeakDoor.com - The Thailand Forum

  44. Dan Smollen says:

    The climate warmists only see science as a means of hoodwinking the public to extort enormous sums of money and as a source of personal power.
    Since the non-controlled data indicates a cooling trend since 2000, and even some climate warmists are saying that we are in a cooling trend for perhaps another 20-30 years, it makes sense that they are reporting a selected few stations and choosing new stations to report data from. This will hide/dampen the cooling trend. They feel that they must hide/dampen the cooling to keep their models alive and keep the alarmist siren of global warming alive. The stations influenced by the oceans will show less cooling due to the moderating effects of the ocean, and the urban stations will show less cooling due to the urban heat island effect. I have no doubt that when the cooling cycle reverses and a warming cycle begins they will again choose to report different stations that amplify the new warming cycle. Liars to the end! Sunlight is the best disinfectant! Thank you…

    They will resist calibrating the new stations because they are behind the curve, since we are now in a cooling trend, and to calibrate in a cooling trend will impair their dampening effect. I believe if we could indeed calibrate or compare the new stations over, say, the years 2004 through 2010, we would definitely see that the new stations are not an adequate substitute for the old stations… Also, the few stations will not adequately represent the many stations.

  45. Pingback: Climategate goes American: NOAA, GISS and the mystery of the vanishing weather stations : Federal Jack

  46. boballab says:

    What’s funny is that you can go to GISS’s website and their map generator, where they let you change the settings (including the radius of the infill) and see how the thermometers disappeared over time. Here is an example:

    This compares a 1970 anomaly map with 250 km infill to a 2009 anomaly map with the same 250 km infill. (Gray areas mean no data for that area.)
    http://tinypic.com/usermedia.php?uo=f%2BydX%2FAT8aUwjGvSLxEJ94h4l5k2TGxc

    Notice the huge hole in the middle of South America, the entire Canadian Arctic coast is gone, and Africa looks like a thermometer cancer patient, the way the gray areas have metastasized in the 2009 map.
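
    Roughly, the radius setting means a grid cell only gets colored if some station lies within that distance of it; otherwise it stays gray. Here is a toy sketch of that rule – my own illustration with invented station locations, not the GISS map code:

        # Toy version of a "color only if a station is within R km" rule.
        from math import radians, sin, cos, asin, sqrt

        def km_between(lat1, lon1, lat2, lon2):
            """Great-circle distance in km (haversine formula)."""
            la1, lo1, la2, lo2 = map(radians, (lat1, lon1, lat2, lon2))
            h = (sin((la2 - la1) / 2) ** 2
                 + cos(la1) * cos(la2) * sin((lo2 - lo1) / 2) ** 2)
            return 2 * 6371.0 * asin(sqrt(h))

        def covered(cell, stations, radius_km=250.0):
            """True if any station is within radius_km of the cell center."""
            return any(km_between(cell[0], cell[1], s[0], s[1]) <= radius_km
                       for s in stations)

        # Pretend station lists: several in 1970, only one left in 2009.
        stations_1970 = [(-15.0, -55.0), (-10.0, -60.0), (-20.0, -65.0)]
        stations_2009 = [(-10.0, -60.0)]
        cell = (-16.0, -56.0)      # a grid cell in central South America

        for year, st in (("1970", stations_1970), ("2009", stations_2009)):
            print(year, "colored" if covered(cell, st) else "gray")

    Fewer stations means more cells with nothing inside the radius, which is exactly the spreading gray in the 2009 map.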

  47. Pingback: Climategate goes American: NOAA, GISS and the mystery of the vanishing weather stations | We Are Change Utah

  48. Pingback: Climategate goes American: NOAA, GISS and the mystery of the vanishing weather stations « The Invisible Opportunity: Hidden Truths Revealed

  49. Pingback: US ClimateGate Report: NOAA and the Global Historical Climate Data (GHCN) « The Catastrophist blog

  50. Pingback: ClimateGate: 30 years in the making « The Ninth Law

  51. Pingback: When Station Data Goes Missing « Things I Find Interesting

  52. Tony Ryan says:

    There is a report that old data in Darwin (North Australia) has been falsified. Someone should check this out. This is pure fraud.

    Moreover (Narooma News, NSW, late January), a farmer asked readers if they have climate readings to cover two missing years. But, disregarding the missing data, the hottest year recorded was 1934.

    I thought that amusing inasmuch as a NASA corrective review found the same.

    Note: Many Australian farming families kept scrupulous records, which are dismissed by scientists as anecdotal. My attitude is that the farmers were never motivated to falsify records, which makes them a whole lot more reliable to me.

  53. Pingback: A Hypothetical Cow “Gored” to Death « Things I Find Interesting

  54. E.M.Smith says:

    @Tony Ryan:

    Every farmer I’ve known was very interested in knowing exactly when it was going to frost or be too hot. They wanted to know when a crop was likely to need more water and how many days to harvest (dependent on ‘degree days’, so directly tied to accumulated temperature over time).
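
    The degree-day bookkeeping itself is nothing fancy. In rough Python terms it is just this, keeping in mind that the base temperature (and any upper cutoff) is crop-specific and the week of readings below is made up:

        # Growing degree-days: each day contributes the amount its mean
        # temperature sits above a crop-specific base. Base 10 C is a
        # common choice, but the right base (and any cap) varies by crop.

        def growing_degree_days(daily_highs_lows, base_c=10.0):
            """Sum of max(0, daily mean - base) over the period, in C-days."""
            return sum(max(0.0, (t_max + t_min) / 2.0 - base_c)
                       for t_max, t_min in daily_highs_lows)

        week = [(28, 14), (30, 16), (25, 12), (22, 9), (27, 13), (31, 17), (29, 15)]
        print(f"{growing_degree_days(week):.1f} degree-days this week")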

    If they got it wrong, they lost money. Sometimes 100% of the crop. (Don’t set out smudge pots for frost? Lost peaches. Smudge when not needed? Lost a few hundred bucks to a few thousand bucks, depending on the size of the plot.) So especially around specific times and dates, the ONLY thing my little farm town talked about was the weather. (Especially frost potentials at the lead-in / lead-out of a season.)

    Each farmer knew their plot of land and how it was different from a neighbor’s. Even just a degree (or maybe less!) different. So one guy was a bit more downslope (colder on still nights) or closer to the river (cooler most days, warmer on sudden cold snaps from the thermal inertia of the water) and knew whether he could wait to spend that money on smudging, or could not.

    Most of the town knew the pattern. (Kids from town often were hired to do the smudging… so you needed to know when you might be working at midnight or 2 am … and the teachers needed to know when 1/4 of the class was going to be way sleepy the next day ;-) So when “The Andersons” would flood their peach field (being nearer to the river and with fairly cheap water) you could be pretty darned sure that the next night (if the cold persisted or worsened) the folks just a degree warmer the first night would be looking to have a smudge team on site. (Teams would be hired for standby, but not actually ‘light up’ until the temperature reading said it WAS going to frost. You would sometimes set out the fueled up smudge pots, but not light them. Then come back a day or two later and run through the orchard lighting them when the frost point was reached.)

    Occasionally you would hear of a farmer who got the call wrong and either lost crop, or had to spend all night smudging his field without a crew, as everyone was booked up already. I remember hearing about someone who had it wrong but at about 2 am ‘got a clue’, and a panic call to neighbors got a loaner crew sent over.

    We ran a restaurant in town, and we had to be aware of the temperatures too. We would often have a late after-dinner rush of folks staying up late waiting for work calls to flood fields or do smudging; or, on major cold days when the crews would work well into the night, a breakfast rush of folks who’d been up all night filling, setting, and lighting smudge pots.

    Given how much was on the line (and how thin the line was between making it and literally ‘losing the farm’ for many of the folks), I would take farmers’ records as accurate any day. Especially around frost dates.

    FWIW, we also had an odd local weather pattern. This was in a place where, from about May to October, you could pretty much bank on “no rain” as the forecast (California Central Valley with two seasons, wet and dry… ;-) In August, there are a couple of weeks where you can sometimes get rain; for no good reason, it goes to a “midwest thunderstorm” pattern. (I suspect this is related to the hurricane season elsewhere, but don’t really know why.) Every peach farmer knew about those weeks and watched the weather very closely.

    Why?

    Because of ‘brown rot’.

    Peaches, when under-ripe, can take some dampness and be fine. If harvested already, you care not at all about rain. But if just ripe, and damp, you risk brown rot and crop loss.

    Those two weeks of rain shower risk tended to land right on the “late harvest” window. (Different varieties ripen at different times, and different degree-days can move harvest a week or two one way or the other. Scheduling pickers and trucks and cannery delivery dates ALL depended on that… so folks with ‘late varieties’ often got a premium for filling in that part of the season, but took the risk of loss.) The topic of conversation then was always “Think it will rain this season? What days?” Because if it rained (and it was hot enough to mold but not so hot as to dry off the crop before brown rot could grow…) you needed to pay a bucket of money to the crop dusters to dump sulphur on the crop. (Or use a ground-level sprayer for smaller fields.)

    So in the hot part of August, folks were very aware of rain, humidity, and exactly what the temperatures were (and if they were in the window where brown rot could grow).

    And this scenario is repeated around the world for different crops. Grapes are harvested (and winery crews hired) based on degree-days. Bug sprays are applied based on temperatures. (Some sprays, if applied too hot, damage crops. Others, if too cold, don’t work as well. Less so now than then as the new sprays have wider temperature bands. But you still don’t want to spray if the temps are low enough that the bugs are not growing.) And all of those processes can either cost you money, or crop, or both.

    Every farmer I knew started the day by looking out the window and checking the sky and the temperatures. And every farm town radio show has an early morning weather report for farmers. And Lord help the weather man that calls a frost wrong…

    With all that said: some farmers were better than others. There were some folks whose records I would not trust as much as others’. However, there was an “odd thing”. The ones who KEPT records tended to be the best. The ones who were ‘more wrong’ didn’t keep records. They were more of a “seat of the pants – what did Bob do yesterday? guess I’ll try that” sort. The folks who took the time to book their readings were the better ones. The ones other folks asked for guidance over coffee at the restaurant counter.

    And frankly, at the end of the day, I’d take a set of farmers’ readings over those re-imagined books of NCDC any day.

  55. Tony Ryan says:

    EM Smith

    Your perspective encourages me to call for all farmers’ records to be copied and submitted for analysis by an independent volunteer panel of climatologists. I believe the more competent versions will be easy to identify, and the mean data can be averaged to provide the most accurate historical assessment yet.

    In the meantime, 80% of Australians do not believe the AGW theory holds enough water to support government action, so government will have to impose carbon taxes against a hostile electorate. I see this as an opportunity for a political watershed.

    If this is being read by Australian farmers, check out Agmates.com and http://www.oziz4oziz.com/

    Another thought… it would be politically powerful if farmers in all nations got together on-line, especially as the globalist agenda particularly aims at eliminating about 80% of the world’s food-producing capacity.

    Waiting for your local politicians or industry representatives to back you would be an exercise in absolute futility.

    And to scientists… peer review is the biggest load of hogwash and is designed to facilitate cronyism and corruption. As many Aussie farmers have discovered, if you talk directly to civvies you will find a ready ear, and reliable allies in the hard times to come.

  56. Pingback: Climategate goes American: NOAA, GISS and the mystery of the vanishing weather stations « Dark Politricks Retweeted

  57. Pingback: May 2010, globally warmest May on record - Page 3

  58. William says:

    Wow pretty interesting stuff, thanks E.M.

  59. Tony Ryan says:

    Further to our conversation (EM/TR), it may be interesting to note that Australia’s Prime Minister Kevin Rudd, who wanted to adopt an ETS to trade away our ‘carbon footprint’, was unceremoniously shoved aside by his deputy, now Prime Minister Julia Gillard, and the ETS is permanently on hold.

    Why? Because Gillard and her backers discovered that 80% of Aussies reject AGW.

    She also discovered the vast majority of voters want no more migrants or refugees, so they have been put on hold as well.

    The truth is, Australians have had a gutsful of politicians who ignore the voice of the majority of citizens. A few pollies now realise people are angry enough to beat their pollies to death; especially in Queensland.

    It only takes one country to change direction and others will follow. Pretty soon, a handful becomes an avalanche, and the NWO elite are out in the cold.

    I’ll drink to that.

    Meanwhile, how about the world’s farmers start getting together.

  60. E.M.Smith says:

    Well, I’ll lift a glass or two to the folks taking back ownership of their government. (Scotch or peaty Irish preferred, Canadian pretty good too. I have an issue with Bourbon as I have a corn allergy and I’m not sure what makes it through the process; I like the flavor but am not so sure about the risk to me. Do the Aussies make a whiskey? If so, is it imported to the USA? If not… I offer to be the distributor… Will distribute for a percentage in kind ;-)

    In the case of the USA, we have the fascinating case of the Tea Party, which has stepped outside the normal control structures. They have set up a party based on Policies and not on Politicians. Very novel and very interesting. They support folks from any party who have the right goals. And they are having great success, much to the dismay of the power bosses in the parties ;-)

    I heft a cup of tea to them daily. Maybe some day I’ll get ambitious enough to actually attend some event and see what it’s like.

    Per getting farmers together globally: a very good idea. Don’t know how to do it, though. The ‘hard bit’ would be structuring the organization so that it could not be co-opted by the Powers That Be. (They like to co-opt organizations that rise against them.) Initially a ‘benign dictator’ would probably work (the person who sets it up holds control), but eventually a strong constitution, and a board selected via hard-to-corrupt methods, would be needed. Coordination would be pretty simple in the internet age.

    farmers.org returns an IP, but doing ‘host farmer.org’ does not. It would probably be held to be too close a name if they have a trademark on farmers.org. ‘farmer.org’ does not have an IP but does have a mail handler, so it is in use. ‘farm.org’ has both.

    Doing a ‘host globalfarmers.org’ returns nothing. Might be worth registering it and then contacting the various national farmers’ orgs to set up a coordinating agency for global issues / laws. (Not hard to set up a domain name at all; if you need help, holler. Most ISPs will do it as part of setting up an internet connection, or you can do it yourself – which I prefer – at a variety of sites.)
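
    If you’d rather poke at this from a script than from the ‘host’ command, here is a quick sketch. It only checks whether a name resolves to an address (the A-record part of what ‘host’ shows), not mail handlers or whether the name is registered:

        # Quick check of whether a name resolves to an IP address.
        import socket

        def resolves(name):
            """True if the name looks up to at least one address."""
            try:
                socket.gethostbyname(name)
                return True
            except socket.gaierror:
                return False

        for name in ("farmers.org", "farmer.org", "farm.org", "globalfarmers.org"):
            print(name, "resolves" if resolves(name) else "no address found")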

  61. Tony Ryan says:

    Gidday EM

    I wondered what the tea party was about.

    A party of policy, eh?

    That is coming perilously close to democracy, and will not be tolerated by the elite.

    Looks like someone finally figured out that the US has representationalism, not democracy. That is, Americans elect a corrupt politician to do their thinking for them. Not a good move.

    Abe Lincoln, in magnificent and lyrical overstatement, got it right… very literally… government of the people, by the people and for the people.

    Funnily enough, Thucydides of ancient Greece accepted that definition, as did the Irish Monks, the democratic states of 4-8th century Finland, Thomas Paine, and Lord Acton.

    Maybe you auta check out them Tea Party varmints, EM.

    Now on the subject of a farmers org, I will move in this direction. I have one nagging worry, and that is that farmers tend to look out for number one; especially in Australia. For example, the cattle farmers switched to live export, made more money, but left a quarter of a million fellow Aussies unemployed in the process. Now they are getting into GM crops, in spite of majority opposition.

    Which reminds me, EM, GM is possibly the reason why you are allergic to corn. You have a smart metabolism.

    Cheers

    tony

  62. Pingback: Global Warming? Really? Part 3 | 420trader

  63. Pingback: Climate Gate 2.0 : les reconstitutions de températures de la NASA falsifiées ! | WeAreChangeRennes

  64. Rob says:

    How does the low number of thermometer stations in your chart, and statements such as ‘3 in California’, match up with:

    “Surfacestations project reaches 82% of the network surveyed. 1003 of 1221 stations have been examined in the USHCN network.”

    which includes more than three in California, as detailed at http://www.surfacestations.org/ ?

  65. Pingback: 2010: "Hottest Year on Record"? | Americas Society

  66. Pingback: We are warmists, we are righteous, we fake data « Clear Thinking

  67. Pingback: NOAA Climategate Ground Zero | simonjmeath

  68. Pingback: Tečka za Climategate: Aneb nejoblíbenější výmluvy

Comments are closed.