
Don't worry, it's only a temperature sausage from GIStemp
The Un-Discovered Country
In an earlier posting, California On The Beach, we saw that there were significant thermometer deletions, in the USA in particular.
Many of these could be traced to the conversion of the USHCN input data file to a new format, USHCN.v2, while GIStemp never had the maintenance programming done to use that new format. USA data from the old-format USHCN file cuts off in 2007. At about the same time, GHCN had a large reduction of thermometers as well. In the USA, this reduced the active thermometer count in the present to 136. Hardly representative.
So there was a dramatic “crash” of thermometer count.
I asserted “this matters”. And I can now put a number on it.
I’ve run a program to convert the USHCN.v2 file into a USHCN format that GIStemp can process. That program is listed here:
https://chiefio.wordpress.com/2009/11/06/ushcn-v2-gistemp-ghcn-what-will-it-take-to-fix-it/
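The shape of that conversion can be sketched in a few lines of Python. To be clear, the column positions and field widths below are illustrative assumptions of mine, not the documented USHCN layouts (the real details are in the linked posting and the NOAA readme files):

```python
# Illustrative sketch of parsing one USHCN.v2 monthly record.
# ASSUMPTION: 6-char station id, 4-char year, then 12 fixed-width
# monthly fields of 7 chars each (5-char value in tenths + 2 flag chars).
# These widths are for illustration only, not the official spec.
ASSUMED_V2 = dict(id=(0, 6), year=(6, 10))
FIELD_W = 7

def convert_v2_line(line):
    """Parse one assumed-layout USHCN.v2 record into (id, year, [values])."""
    sid = line[slice(*ASSUMED_V2["id"])]
    year = int(line[slice(*ASSUMED_V2["year"])])
    vals = []
    for m in range(12):
        start = 10 + m * FIELD_W
        raw = line[start:start + FIELD_W - 2].strip()  # drop the flag chars
        # Missing data marker becomes None; otherwise tenths of a degree.
        vals.append(None if raw in ("", "-9999") else int(raw) / 10.0)
    return sid, year, vals
```

The real conversion program also has to re-emit each record in the old USHCN layout, but the parse step above is where the format mismatch that broke GIStemp lives.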
The full temperature histories, both with the old thermometers deleted and after putting them back in, are listed below. I’ve also pasted in the console log from the run of STEP0 so you can see that it ran to completion normally (and produces terrible console logs…). There is also a brief wrap-up after the data.
All that is just after the findings in the next section.
The Re-Discovered Country
After running that program I found that 59 new stations had been added (beyond the 1000+ older ones that were simply being ignored now):
[chiefio@tubularbells tmp]$ wc -l USHCNv2.Adds
59 USHCNv2.Adds
[chiefio@tubularbells tmp]$
Longer term, it will take a bit of work to go through those added stations and put updated entries for them into the needed tables for GIStemp (it dies if the entries don’t match). But knowing that these stations are brand new stations, and that GIStemp is going to toss them out for being under 20 years long in STEP2, there is another route to a benchmark.
I just removed those station records from the converted USHCN.v2 input file. Now the remaining data match the “station inventory” and the program runs to completion.
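That removal step is just a filter on station id. A minimal sketch, assuming the data records begin with the same 6-character station id that appears in USHCNv2.Adds:

```python
# Sketch: make the converted data file match the existing station
# inventory by dropping records for the newly added station ids.
# ASSUMPTION: each record starts with a 6-character station id.

def drop_added_stations(data_lines, added_ids, id_width=6):
    """Keep only records whose leading station id is NOT in added_ids."""
    added = set(added_ids)
    return [ln for ln in data_lines if ln[:id_width] not in added]

# Example with made-up records; 021514 is one of the 59 added stations.
records = ["0215142008 ...", "0110842008 ..."]
kept = drop_added_stations(records, ["021514"])
```

In practice the `added_ids` list would be read from the USHCNv2.Adds file produced in the step above.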
This ought to have nearly no effect on the benchmark after STEP2 (to be done a bit later) and only a small effect on this benchmark. Basically, it’s better to have put 1000 stations back in and be short a few than to be short all of them.
UPDATE: I’ve added the USHCN.v2 inventory format entries for those “added” stations down at the bottom.
And what do we find? We find that the record for 2008 cools dramatically when you use all the thermometers.
There is a 0.6 C “Selection Bias” in the U.S.A. temperature record from deleting the USHCN thermometers in GIStemp
This selection bias measurement is for the U.S.A. data only (that is where USHCN covers). When averaged in with the rest of the world, that number will shrink. (Though there are also deletions in the rest-of-the-world data. If all the deleted thermometers were put back in, one might well find a similar effect for the ROW – Rest Of the World.) To the extent that the ROW deletions follow a similar pattern, this result would be representative.
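The arithmetic behind that 0.6 C number is nothing fancy: it is the difference between the two 2008 yearly averages in the runs shown below, one computed over the 136 surviving thermometer records, the other over the 1170 restored ones:

```python
# The 0.6 C selection bias is the difference of the two 2008 yearly
# averages: 12.0 C over 136 records in the "before" (deleted) run,
# versus 11.4 C over 1170 records in the "after" (restored) run.

before_2008 = {"yr_avg": 12.0, "count": 136}    # from the "before" table
after_2008  = {"yr_avg": 11.4, "count": 1170}   # from the "after" table

selection_bias = round(before_2008["yr_avg"] - after_2008["yr_avg"], 1)
print(selection_bias)  # 0.6
```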
Take a look at the 2008 “yearly average” and “thermometer count” numbers in these two excerpts from the runs on the “old” and “new” USA data. Those are the two fields on the far right.
This is the bottom part of the “before”. Run on my standard benchmark copy of the USHCN data:
Thermometer Records, Average of Monthly Data and Yearly Average by Year Across Month, with a count of thermometer records in that year
--------------------------------------------------------------------------
YEAR JAN FEB MAR APR MAY JUN JULY AUG SEPT OCT NOV DEC YR COUNT
--------------------------------------------------------------------------
2002 2.1 2.8 4.8 12.4 15.6 21.8 24.5 23.1 20.0 11.7 6.0 2.1 12.2 1421
2003 -0.1 0.5 6.6 11.6 16.4 20.4 24.0 23.8 18.5 13.5 6.7 1.9 12.0 1411
2004 -1.4 0.9 8.2 11.8 17.4 20.5 22.9 21.5 19.3 13.4 7.3 1.7 12.0 1381
2005 0.3 3.2 5.6 11.8 15.6 21.4 24.2 23.4 20.2 13.4 7.5 0.2 12.2 1213
2006 4.1 1.4 6.1 13.3 17.0 21.7 24.8 23.3 17.7 11.8 7.1 3.0 12.6 1200
2007 0.0 -0.3 8.4 10.4 17.3 21.6 23.6 24.2 20.2 15.0 7.5 2.4 12.5 1164
2008 0.3 2.1 6.2 11.5 16.2 21.6 23.5 22.6 19.3 12.7 6.8 1.7 12.0 136
AA -0.7 0.9 5.3 10.8 15.9 20.4 23.0 22.2 18.4 12.5 5.9 0.8 11.3
Ad -0.7 0.9 5.3 10.9 16.0 20.5 23.1 22.3 18.5 12.6 6.0 0.9 11.4
For Country Code 425
[chiefio@tubularbells Temps]$
And this is the “After”. Run on the converted USHCN.v2 data:
Thermometer Records, Average of Monthly Data and Yearly Average by Year Across Month, with a count of thermometer records in that year
--------------------------------------------------------------------------
YEAR JAN FEB MAR APR MAY JUN JULY AUG SEPT OCT NOV DEC YR COUNT
--------------------------------------------------------------------------
2002 2.1 2.8 4.8 12.4 15.5 21.8 24.5 23.1 19.9 11.7 6.0 2.1 12.2 1421
2003 -0.1 0.5 6.6 11.6 16.4 20.3 24.0 23.8 18.6 13.6 6.7 2.0 12.0 1412
2004 -1.3 1.0 8.2 11.9 17.4 20.5 22.9 21.5 19.3 13.5 7.4 1.8 12.0 1381
2005 0.4 3.2 5.7 11.9 15.7 21.4 24.2 23.4 20.2 13.4 7.6 0.3 12.3 1220
2006 4.2 1.5 6.2 13.4 17.1 21.7 24.8 23.3 17.8 11.8 7.2 3.1 12.7 1205
2007 0.1 -0.2 8.6 10.5 17.3 21.3 23.6 24.0 19.6 14.4 6.7 1.0 12.2 1166
2008 -0.6 1.3 5.5 10.8 15.6 21.2 23.4 22.3 18.7 12.2 6.5 0.2 11.4 1170
AA -0.7 0.9 5.3 10.8 15.9 20.4 23.0 22.2 18.4 12.5 5.9 0.8 11.3
Ad -0.7 1.0 5.4 10.9 16.0 20.5 23.1 22.3 18.5 12.6 6.0 1.0 11.4
For Country Code 425
Since the “cut off” of USHCN only happens mid-year in 2007, the full impact does not show up until the 2008 number, but we see hints of it in the 2007 numbers, where we have a 0.3 C warming bias in the “Hansen Way” for the totals. We can also see that while the Jan, Feb, Mar numbers are almost identical, the Oct, Nov, Dec numbers have significant warming biases of 0.6 C, 0.8 C, and 1.4 C. That mid-year cutoff thing showing through…
The bottom two lines are two ways of doing “averages of the above averages”, to show how much impact a programmer’s choice can have on “averaging”. The AA line is the average of the monthly averages printed in the chart above. The Ad line is an average of the daily values for the total history of that month, without going through the monthly average first. You can see that the 1/10 C place wanders back and forth depending on which way you choose to do that particular average. This is part of why I say that the “1/10 C place” is not something on which to bet the fate of the planet, or the economy…
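The effect is easy to demonstrate: whenever the months being averaged carry different record counts, the simple mean-of-means and the pooled, count-weighted mean disagree. A toy Python sketch with made-up numbers:

```python
# Two months with different record counts (made-up illustrative numbers):
# (monthly mean, count of underlying records)
months = [
    (10.0, 100),
    (20.0, 300),
]

# "AA" style: average the monthly averages, every month weighted equally.
aa = sum(m for m, _ in months) / len(months)

# "Ad" style: pool all the underlying records, so bigger months weigh more.
ad = sum(m * n for m, n in months) / sum(n for _, n in months)

print(aa, ad)  # 15.0 17.5
```

With real data the counts are much closer, so the two answers differ only in the tenths place, which is exactly where the AA and Ad lines wander apart.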
Now, with this benchmark, we may need to move that to more than a single 1/10 C that is in doubt…

Comparison of Before and After USHCN.v2 - Version 2
With thanks to ‘Ripper’ who supplied the graph in comments.
That’s the “meat of it”. Eventually I’m going to put a “STEP1” and “STEP2” benchmark A/B together. But that will have to wait until after morning coffee and maybe a spot of sausage and eggs. I love the smell of sausage being cooked in the morning ;-)
The Original USHCN Old Format Mid-2007 cut off Temperature History
[chiefio@tubularbells Temps]$ cat Nov2U.425.yrs.GAT
Thermometer Records, Average of Monthly Data and Yearly Average by Year Across Month, with a count of thermometer records in that year
--------------------------------------------------------------------------
YEAR JAN FEB MAR APR MAY JUN JULY AUG SEPT OCT NOV DEC YR COUNT
--------------------------------------------------------------------------
1880 5.1 3.3 5.4 11.7 18.5 21.7 23.4 22.7 18.7 12.6 3.4 0.1 12.2 135
1881 -1.8 1.4 5.3 10.9 18.4 20.9 23.9 23.6 20.5 13.8 6.5 4.8 12.3 148
1882 0.9 4.0 6.5 11.1 14.8 21.0 22.7 22.8 19.1 14.4 6.2 1.2 12.1 179
1883 -2.1 0.5 4.5 11.1 15.4 21.7 23.5 22.0 18.4 12.6 7.0 2.3 11.4 197
1884 -2.0 1.4 5.1 10.4 16.5 21.1 22.8 22.3 20.1 14.7 6.7 0.7 11.7 227
1885 -2.1 -1.9 3.4 11.0 16.4 21.0 24.1 22.4 18.8 12.4 7.0 2.2 11.2 257
1886 -3.1 0.9 4.5 11.8 17.7 21.0 23.8 23.3 19.7 13.7 5.5 -0.2 11.6 272
1887 -1.3 1.2 5.2 11.1 18.8 21.8 24.6 22.3 18.9 12.3 6.3 0.6 11.8 314
1888 -3.5 1.2 3.0 12.3 16.0 21.7 24.0 22.8 18.5 12.0 6.9 2.9 11.5 367
1889 1.0 -0.4 7.0 12.3 16.9 20.7 23.4 22.4 18.5 11.9 6.3 6.0 12.2 437
1890 1.2 2.8 3.8 11.7 16.3 22.3 23.9 21.9 18.3 12.7 7.8 1.9 12.0 463
1891 1.0 1.0 3.0 11.9 15.8 21.1 21.9 22.3 20.2 12.5 5.6 3.6 11.7 523
1892 -1.9 2.4 3.9 10.4 15.4 21.4 23.1 22.7 19.0 13.2 5.7 -0.2 11.3 595
1893 -3.7 -0.7 4.0 10.6 15.6 21.5 23.8 22.2 18.8 12.7 5.4 1.4 11.0 661
1894 0.1 -0.9 7.2 11.9 16.8 21.5 23.7 22.8 19.6 13.3 5.6 2.1 12.0 696
1895 -2.4 -2.9 4.7 12.2 16.7 21.4 22.4 22.8 20.2 10.9 5.5 1.3 11.1 751
1896 0.1 1.9 3.3 12.7 18.6 21.4 23.6 23.1 17.9 11.7 5.4 2.4 11.8 785
1897 -1.6 1.4 4.9 11.2 16.2 20.7 23.9 22.0 20.2 14.2 5.9 0.2 11.6 818
1898 0.5 1.6 6.5 10.4 16.3 21.6 23.5 23.0 19.7 11.9 4.6 -0.6 11.6 842
1899 -0.7 -3.5 3.2 11.1 16.5 21.2 23.0 22.8 18.3 13.6 8.0 0.4 11.2 871
1900 1.0 -1.2 4.1 11.4 16.9 21.2 23.1 23.7 19.5 14.9 6.0 1.7 11.9 905
1901 0.1 -1.6 5.0 10.0 16.2 21.1 25.1 23.1 18.2 13.5 5.5 0.0 11.3 928
1902 -0.6 -0.8 6.3 10.8 17.3 20.2 22.8 21.9 17.5 13.2 8.0 -0.3 11.4 946
1903 -0.4 -0.8 6.8 10.5 16.2 18.9 22.5 21.7 17.8 12.9 4.9 -0.7 10.9 986
1904 -2.6 -1.2 5.3 9.4 16.1 20.1 22.0 21.5 18.7 12.7 6.6 0.3 10.7 1025
1905 -2.9 -3.0 7.6 10.8 15.9 20.6 22.4 22.5 19.2 11.7 6.3 0.7 11.0 1041
1906 1.4 0.8 2.4 12.0 15.9 20.3 22.5 22.7 19.7 12.1 5.6 1.7 11.4 1074
1907 -0.2 0.8 7.7 8.2 13.6 19.1 22.8 21.8 18.3 12.0 5.6 2.0 11.0 1101
1908 0.4 0.3 6.7 11.6 15.6 19.9 23.0 21.7 19.1 11.8 6.6 1.3 11.5 1128
1909 0.0 1.8 4.6 9.6 14.8 20.7 22.3 22.7 18.1 11.6 7.8 -2.8 10.9 1158
1910 -1.0 -1.4 9.7 11.9 14.9 19.9 23.1 21.6 18.7 13.5 5.0 -0.2 11.3 1171
1911 0.3 1.1 6.2 10.0 16.8 21.6 22.7 21.7 19.0 11.9 3.5 1.1 11.3 1206
1912 -4.4 -0.8 2.3 10.8 16.3 19.3 22.4 21.2 17.6 12.4 6.2 1.4 10.4 1217
1913 0.0 -1.1 4.2 11.0 15.7 20.5 22.9 23.0 17.8 11.4 7.7 1.6 11.2 1237
1914 1.4 -1.5 4.7 10.6 16.5 21.0 23.2 22.1 18.0 13.4 6.7 -2.2 11.2 1249
1915 -1.5 2.3 3.0 12.9 14.7 19.1 21.7 20.9 18.4 13.2 6.7 0.7 11.0 1262
1916 -1.5 0.0 4.9 10.1 15.4 18.9 23.5 22.3 17.6 11.7 5.2 -1.2 10.6 1283
1917 -1.6 -1.3 4.0 9.5 12.9 19.3 23.1 21.4 17.6 10.0 6.2 -2.0 9.9 1298
1918 -4.8 0.6 7.3 9.4 16.4 21.2 22.1 22.6 16.4 13.5 5.2 2.0 11.0 1312
1919 0.3 0.3 5.1 10.5 15.3 20.7 23.3 21.9 18.7 12.0 4.5 -1.8 10.9 1320
1920 -2.0 0.4 4.6 8.1 14.9 19.7 22.2 21.3 18.5 13.0 4.7 0.9 10.5 1328
1921 1.1 2.4 8.1 10.9 15.6 21.3 23.6 21.9 19.3 12.7 5.8 1.6 12.0 1336
1922 -2.5 0.0 4.8 10.3 16.3 21.1 22.3 22.2 19.4 13.0 5.9 0.7 11.1 1339
1923 1.0 -1.5 3.4 9.9 15.0 20.2 22.9 21.6 18.3 11.0 6.2 2.8 10.9 1346
1924 -2.9 1.1 3.1 9.9 14.0 20.0 21.7 22.0 16.7 12.9 5.9 -2.2 10.2 1346
1925 -1.8 3.0 6.5 12.3 15.0 21.0 22.9 21.8 19.5 9.3 4.9 0.2 11.2 1353
1926 -0.6 2.5 4.0 9.7 16.1 19.8 22.8 22.3 17.8 12.6 4.9 -0.3 11.0 1356
1927 -0.4 3.0 5.7 10.5 15.1 19.2 22.3 20.3 18.3 13.3 6.8 -1.5 11.1 1361
1928 0.0 1.2 5.6 8.8 16.0 18.6 22.6 22.0 17.0 12.8 5.7 1.0 10.9 1369
1929 -3.0 -2.6 6.3 10.6 14.8 19.6 22.7 22.0 17.5 12.0 4.1 0.9 10.4 1372
1930 -4.0 3.8 4.4 11.8 15.2 20.1 23.6 22.5 18.8 11.1 5.5 0.0 11.1 1377
1931 0.7 2.9 3.9 10.5 15.1 21.3 23.9 22.0 20.0 13.6 7.1 2.4 11.9 1384
1932 0.3 2.2 2.8 10.6 15.7 20.6 23.0 22.3 18.0 11.6 4.8 -0.6 10.9 1390
1933 1.6 -0.9 5.0 10.0 15.6 21.8 23.4 21.8 19.6 12.3 5.6 1.8 11.5 1395
1934 1.4 0.2 5.2 11.6 17.8 21.6 24.3 22.6 17.8 13.4 7.5 0.4 12.0 1393
1935 -0.5 1.9 6.5 9.7 14.0 19.6 23.8 22.5 18.0 12.1 4.7 -0.8 11.0 1394
1936 -2.7 -4.1 6.1 9.7 17.3 21.1 24.6 23.5 19.1 12.2 4.6 1.6 11.1 1400
1937 -2.4 -0.2 3.6 9.8 16.3 20.4 23.2 23.4 18.3 12.0 5.1 0.3 10.8 1403
1938 -0.2 2.0 7.2 11.0 15.4 20.2 22.9 23.1 18.9 13.7 5.2 1.3 11.7 1404
1939 1.0 -0.6 5.4 10.4 16.8 20.4 23.3 22.3 19.5 12.7 5.7 3.0 11.7 1405
1940 -4.6 0.8 4.7 10.0 15.7 20.6 23.1 21.9 18.3 13.3 4.4 2.4 10.9 1404
1941 0.0 0.5 3.8 11.6 16.8 20.1 23.1 22.1 18.3 13.0 6.2 2.4 11.5 1412
1942 -0.8 -0.5 5.2 11.7 15.3 19.9 22.9 21.8 17.6 12.5 5.8 -0.3 10.9 1419
1943 -1.9 1.7 3.3 10.6 15.2 20.7 23.0 22.6 17.4 11.9 5.0 0.3 10.8 1416
1944 0.2 1.2 3.3 9.1 16.7 20.2 22.2 21.9 18.1 12.6 5.4 -0.8 10.8 1424
1945 -1.2 1.4 7.6 10.2 14.1 18.7 22.2 21.8 18.2 12.0 5.4 -1.5 10.7 1471
1946 -0.1 1.2 8.0 11.8 14.6 19.9 22.6 21.0 17.8 12.2 5.7 1.8 11.4 1476
1947 -0.2 -0.7 3.4 10.3 15.1 19.1 22.0 23.1 18.7 14.7 3.9 0.7 10.8 1498
1948 -2.4 -0.3 3.9 11.6 15.3 20.1 22.5 21.7 18.5 11.5 5.9 0.5 10.7 1616
1949 -1.7 0.4 5.2 10.7 16.3 20.6 23.1 22.0 17.4 12.9 7.4 1.2 11.3 1747
1950 0.1 1.4 3.9 9.1 15.3 19.9 21.5 21.0 17.5 14.0 5.0 0.4 10.8 1757
1951 -0.8 1.3 3.6 9.9 16.0 19.3 22.7 21.8 17.8 12.4 4.0 0.2 10.7 1786
1952 0.2 2.1 3.7 10.8 15.5 21.4 23.2 22.2 18.6 11.6 5.5 1.3 11.3 1800
1953 2.0 2.4 6.2 9.6 15.6 21.2 23.1 22.1 18.8 13.5 6.9 1.7 11.9 1815
1954 -0.5 4.4 4.3 12.1 14.6 20.6 23.6 22.2 19.1 12.9 7.0 1.1 11.8 1825
1955 -0.7 0.0 4.8 11.6 16.3 19.0 23.4 23.0 18.7 12.7 3.9 -0.1 11.0 1752
1956 -0.5 0.5 4.6 9.7 16.2 20.9 22.4 21.9 18.1 13.3 5.1 2.5 11.2 1754
1957 -2.0 3.2 5.4 10.9 15.6 20.6 23.1 21.9 18.1 11.3 5.7 2.8 11.4 1763
1958 0.0 -0.1 3.8 10.6 16.5 19.8 22.4 22.5 18.5 12.6 6.6 0.0 11.1 1767
1959 -1.3 0.8 4.9 10.9 16.3 20.9 22.7 22.8 18.4 12.1 4.4 2.4 11.3 1766
1960 -0.6 0.0 1.8 11.4 15.3 20.3 22.7 22.1 18.9 12.7 6.3 -0.2 10.9 1762
1961 -0.8 2.8 6.1 9.2 14.8 20.4 22.5 22.2 18.0 12.3 5.4 -0.2 11.1 1760
1962 -2.0 1.6 3.6 10.8 16.9 19.9 21.9 21.9 17.5 13.4 6.3 0.7 11.0 1800
1963 -3.2 0.6 6.5 11.1 15.9 20.4 22.7 21.8 18.8 15.1 6.8 -1.7 11.2 1849
1964 0.3 0.2 4.1 10.8 16.3 20.1 23.2 21.2 17.8 12.1 6.5 0.3 11.1 1841
1965 -0.2 0.2 2.9 11.0 16.3 19.4 22.2 21.6 17.2 12.5 7.2 2.4 11.1 1835
1966 -2.6 0.1 5.9 10.0 15.2 20.1 23.5 21.4 17.9 11.7 6.6 0.8 10.9 1830
1967 0.9 0.3 6.3 10.9 14.3 20.1 22.1 21.5 17.7 12.3 5.6 1.1 11.1 1823
1968 -1.4 0.2 6.6 10.8 14.7 20.3 22.6 21.8 18.0 12.7 5.7 -0.5 11.0 1821
1969 -1.4 0.7 2.9 11.6 16.3 19.7 23.0 22.4 18.6 11.4 5.7 1.2 11.0 1813
1970 -2.7 1.9 4.2 10.3 16.4 20.4 23.0 22.6 18.5 12.0 5.9 1.2 11.1 1797
1971 -1.9 0.7 3.9 10.0 14.6 20.9 22.0 21.9 18.4 13.5 5.6 1.9 11.0 1693
1972 -0.9 0.4 5.7 10.0 15.9 19.7 22.2 21.9 18.1 11.3 4.5 -0.2 10.7 1689
1973 -1.0 0.9 7.1 9.9 15.0 20.6 22.7 22.3 18.2 13.4 6.3 1.1 11.4 1685
1974 -0.2 1.1 6.5 11.1 15.6 19.8 23.0 21.3 17.0 12.0 6.1 1.2 11.2 1679
1975 0.2 0.5 3.8 8.5 16.3 19.8 22.8 22.0 17.1 12.8 6.3 1.1 10.9 1670
1976 -1.2 3.7 6.0 11.1 14.9 20.0 22.4 21.4 17.8 10.3 4.0 -0.5 10.8 1669
1977 -4.4 2.1 6.5 12.2 16.8 20.8 23.4 22.0 18.8 12.1 6.2 0.5 11.4 1660
1978 -2.9 -2.2 4.8 10.9 15.6 20.4 22.8 22.1 19.0 12.3 5.9 -0.1 10.7 1660
1979 -4.5 -2.7 5.9 10.3 15.5 19.9 22.6 21.7 18.8 12.9 5.6 2.2 10.7 1657
1980 -0.4 0.2 4.3 10.9 16.0 20.1 23.9 22.6 19.0 11.5 6.0 1.0 11.3 1650
1981 0.0 2.5 6.1 12.8 15.2 21.0 23.0 21.9 18.0 11.3 7.0 0.7 11.6 1623
1982 -3.4 0.2 5.5 9.3 16.4 19.1 22.8 21.8 17.9 12.1 5.8 2.8 10.9 1605
1983 0.4 2.5 6.0 8.9 14.7 19.8 23.4 23.7 18.6 12.8 6.6 -2.9 11.2 1594
1984 -1.6 3.0 4.5 10.1 15.5 20.7 22.6 22.8 17.5 13.0 5.7 2.4 11.3 1592
1985 -2.6 -0.3 6.6 12.3 16.8 19.9 23.1 21.6 17.7 12.8 5.4 -1.2 11.0 1594
1986 1.2 2.0 7.5 11.7 16.5 21.2 23.2 21.7 18.3 12.6 5.7 2.0 12.0 1590
1987 0.0 2.8 6.3 11.9 17.5 21.4 23.2 22.2 18.6 11.5 7.2 2.0 12.0 1589
1988 -1.9 0.9 6.0 11.2 16.6 21.4 23.8 23.3 18.3 11.6 6.8 1.3 11.6 1598
1989 1.5 -0.9 5.9 11.6 16.0 20.4 23.3 22.1 18.1 12.8 6.2 -2.1 11.2 1597
1990 3.0 2.8 7.4 11.7 15.5 21.4 23.1 22.7 19.6 12.8 7.7 0.7 12.4 1572
1991 -0.6 4.3 7.1 12.4 17.8 21.4 23.7 23.1 18.8 13.3 5.1 2.8 12.4 1549
1992 1.5 4.4 7.1 11.7 16.5 20.1 22.4 21.3 18.5 12.7 5.7 1.0 11.9 1536
1993 0.1 0.0 5.6 10.7 16.8 20.5 23.3 22.9 18.0 12.3 5.3 1.9 11.4 1529
1994 -1.5 0.3 7.1 12.3 16.4 22.1 23.4 22.4 19.0 13.0 7.2 3.0 12.1 1519
1995 1.0 2.6 6.9 10.6 15.8 20.7 23.7 24.0 18.6 13.3 5.7 1.1 12.0 1495
1996 -0.9 1.9 4.2 10.9 16.6 21.4 23.1 22.6 18.1 12.7 4.6 1.8 11.4 1464
1997 -0.6 3.0 7.3 9.7 15.5 20.7 23.3 22.3 19.4 12.8 5.5 1.7 11.7 1431
1998 2.1 4.4 5.8 11.4 17.9 20.8 24.2 23.5 20.9 13.5 7.7 2.8 12.9 1428
1999 0.8 4.1 5.9 11.7 16.4 20.8 24.1 23.0 18.4 12.8 9.2 2.6 12.5 1447
2000 0.5 4.3 8.2 11.5 17.7 21.0 23.2 23.3 18.9 13.4 4.2 -2.1 12.0 1429
2001 -0.2 1.3 5.1 12.3 17.4 21.0 23.6 23.6 18.7 12.7 9.2 3.1 12.3 1434
2002 2.1 2.8 4.8 12.4 15.6 21.8 24.5 23.1 20.0 11.7 6.0 2.1 12.2 1421
2003 -0.1 0.5 6.6 11.6 16.4 20.4 24.0 23.8 18.5 13.5 6.7 1.9 12.0 1411
2004 -1.4 0.9 8.2 11.8 17.4 20.5 22.9 21.5 19.3 13.4 7.3 1.7 12.0 1381
2005 0.3 3.2 5.6 11.8 15.6 21.4 24.2 23.4 20.2 13.4 7.5 0.2 12.2 1213
2006 4.1 1.4 6.1 13.3 17.0 21.7 24.8 23.3 17.7 11.8 7.1 3.0 12.6 1200
2007 0.0 -0.3 8.4 10.4 17.3 21.6 23.6 24.2 20.2 15.0 7.5 2.4 12.5 1164
2008 0.3 2.1 6.2 11.5 16.2 21.6 23.5 22.6 19.3 12.7 6.8 1.7 12.0 136
AA -0.7 0.9 5.3 10.8 15.9 20.4 23.0 22.2 18.4 12.5 5.9 0.8 11.3
Ad -0.7 0.9 5.3 10.9 16.0 20.5 23.1 22.3 18.5 12.6 6.0 0.9 11.4
For Country Code 425
[chiefio@tubularbells Temps]$
This is the same chart we have seen before, my standard benchmark of archived USHCN and GHCN input. Also notice that this is for Country Code 425. The U.S.A.
The New USHCN.v2 data file Temperature History
I may have translated some of the “Estimated Value” flags a bit more sternly than warranted. If someone familiar with them can look at the “How to fix USHCN” link and comment there on the choices I made, I can rerun with better choices. For this run, I just said “all estimates are made up values” and those get tossed. I think you see that in the early part of this series where the thermometer counts are lower due to more old data being estimated. It is also possible that the input USHCN.v2 file is just more paranoid about marking estimated values. In either case, it has little to no impact on the benchmark and has none at all on the merit of the 2007 and 2008 comparisons.
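The “all estimates are made up values” rule reduces to a one-line filter. A minimal sketch, where the 'E' flag character is my assumption for illustration (the actual flag choices used in this run are discussed at the “How to fix USHCN” link):

```python
# Sketch of tossing flagged-as-estimated values before GIStemp sees them.
# ASSUMPTION: 'E' marks an estimated value; the real USHCN.v2 flag set
# is documented in the NOAA readme and in the linked posting.
MISSING = -9999  # conventional missing-data marker

def toss_estimates(values, flags, estimate_flags=("E",)):
    """Replace any value whose flag marks it as estimated with MISSING."""
    return [MISSING if f in estimate_flags else v
            for v, f in zip(values, flags)]
```

A stricter or looser `estimate_flags` tuple is exactly the knob that changes the early-years thermometer counts mentioned above.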
Look at ./Temps/Temps.425.yrs.GAT (Y/N)? y
Thermometer Records, Average of Monthly Data and Yearly Average by Year Across Month, with a count of thermometer records in that year
--------------------------------------------------------------------------
YEAR JAN FEB MAR APR MAY JUN JULY AUG SEPT OCT NOV DEC YR COUNT
--------------------------------------------------------------------------
1880 5.6 3.7 6.0 12.1 18.8 21.9 23.5 22.9 18.7 12.9 3.5 0.5 12.5 71
1881 -1.4 1.9 5.7 11.1 18.7 21.2 23.9 23.8 20.3 13.9 6.7 5.2 12.6 78
1882 1.6 4.9 7.1 11.5 15.0 20.7 22.3 22.6 19.1 14.5 6.8 1.9 12.3 81
1883 -1.5 1.1 5.4 11.5 15.3 21.2 23.1 21.8 18.4 12.7 7.4 3.1 11.6 83
1884 -1.1 1.9 5.9 10.9 16.5 20.9 22.4 22.0 19.5 14.4 7.4 1.0 11.8 86
1885 -1.2 -0.5 4.9 11.8 16.2 20.5 23.5 22.0 18.8 12.6 7.7 3.1 11.6 86
1886 -2.4 2.1 5.3 12.0 17.4 20.7 23.7 23.1 19.2 13.8 5.8 0.5 11.8 93
1887 -1.0 1.2 6.1 11.7 18.3 21.3 24.1 21.9 18.7 12.4 6.9 1.3 11.9 104
1888 -2.9 2.1 3.7 12.6 15.5 21.1 23.6 22.4 18.6 12.3 7.3 3.5 11.7 108
1889 1.3 0.1 7.8 12.6 16.5 20.3 23.1 22.2 18.4 12.4 6.4 6.1 12.3 119
1890 0.9 2.8 4.3 11.9 16.0 21.8 23.7 21.7 18.2 12.9 8.3 3.0 12.1 126
1891 2.0 1.5 3.8 12.1 15.6 20.4 21.8 22.3 19.9 13.0 6.2 4.1 11.9 134
1892 -0.9 3.2 5.0 10.6 15.3 20.9 22.9 22.6 19.1 13.5 6.4 0.4 11.6 145
1893 -2.4 -0.2 4.3 10.3 15.3 20.9 23.4 22.0 18.5 12.7 5.7 2.0 11.0 154
1894 0.3 -0.4 7.3 12.0 16.4 20.9 23.4 22.7 19.3 13.4 6.2 2.8 12.0 155
1895 -2.5 -2.9 4.6 12.2 16.5 21.0 22.2 22.5 19.8 10.8 5.3 1.0 10.9 745
1896 -0.1 1.7 3.2 12.4 18.2 21.1 23.5 22.8 17.6 11.5 5.0 2.3 11.6 792
1897 -1.7 1.3 4.8 11.2 16.3 20.6 23.7 22.1 20.0 14.0 5.7 0.1 11.5 848
1898 0.4 1.5 6.3 10.4 16.3 21.5 23.4 23.0 19.5 11.8 4.5 -0.8 11.5 877
1899 -0.9 -3.6 3.1 11.1 16.5 21.2 23.1 22.8 18.3 13.5 8.0 0.4 11.1 903
1900 1.0 -1.2 4.3 11.5 17.0 21.3 23.2 23.8 19.5 14.8 6.0 1.7 11.9 927
1901 0.0 -1.7 5.0 10.1 16.3 21.2 25.1 23.1 18.2 13.5 5.5 0.0 11.4 947
1902 -0.7 -0.9 6.3 10.9 17.4 20.3 22.8 22.0 17.6 13.3 8.0 -0.2 11.4 971
1903 -0.4 -0.7 6.9 10.7 16.3 19.0 22.6 21.7 17.8 12.9 5.0 -0.7 10.9 1012
1904 -2.6 -1.1 5.4 9.6 16.2 20.1 22.1 21.6 18.8 12.7 6.7 0.3 10.8 1050
1905 -2.8 -2.9 7.6 10.9 16.0 20.6 22.5 22.6 19.2 11.8 6.3 0.7 11.0 1069
1906 1.3 0.8 2.4 12.1 16.0 20.3 22.5 22.7 19.7 12.1 5.6 1.7 11.4 1096
1907 -0.2 0.8 7.7 8.4 13.7 19.2 22.9 21.8 18.4 12.1 5.6 2.0 11.0 1132
1908 0.4 0.3 6.8 11.7 15.7 20.0 23.1 21.7 19.2 11.8 6.6 1.4 11.6 1149
1909 0.0 1.8 4.7 9.8 14.9 20.7 22.4 22.8 18.2 11.7 7.9 -2.8 11.0 1178
1910 -1.0 -1.4 9.8 12.1 15.0 20.0 23.2 21.7 18.8 13.6 5.1 -0.2 11.4 1185
1911 0.4 1.2 6.3 10.1 17.0 21.7 22.8 21.8 19.2 12.0 3.6 1.2 11.4 1215
1912 -4.4 -0.8 2.3 10.9 16.4 19.4 22.5 21.2 17.8 12.4 6.3 1.4 10.5 1221
1913 0.0 -1.1 4.3 11.1 15.9 20.5 23.0 23.1 17.9 11.5 7.8 1.7 11.3 1246
1914 1.5 -1.5 4.8 10.7 16.5 21.1 23.3 22.1 18.1 13.5 6.7 -2.2 11.2 1260
1915 -1.5 2.3 3.0 13.0 14.9 19.2 21.8 21.0 18.5 13.3 6.8 0.8 11.1 1272
1916 -1.4 0.0 4.9 10.2 15.6 18.9 23.6 22.3 17.7 11.7 5.3 -1.2 10.6 1292
1917 -1.5 -1.2 4.1 9.6 13.1 19.4 23.2 21.5 17.7 10.0 6.2 -2.0 10.0 1309
1918 -4.8 0.7 7.4 9.5 16.5 21.3 22.2 22.7 16.4 13.6 5.3 2.1 11.1 1324
1919 0.3 0.3 5.2 10.6 15.4 20.8 23.4 22.0 18.8 12.2 4.6 -1.7 11.0 1331
1920 -1.9 0.5 4.7 8.3 15.0 19.8 22.2 21.4 18.6 13.1 4.7 1.0 10.6 1341
1921 1.2 2.4 8.2 11.0 15.7 21.4 23.7 22.0 19.5 12.8 5.9 1.6 12.1 1349
1922 -2.4 0.1 4.9 10.5 16.4 21.1 22.4 22.3 19.5 13.1 6.0 0.8 11.2 1351
1923 1.1 -1.3 3.6 10.0 15.2 20.3 23.0 21.7 18.4 11.1 6.3 2.8 11.0 1356
1924 -2.8 1.2 3.2 10.1 14.2 20.1 21.8 22.1 16.8 12.9 6.1 -2.0 10.3 1358
1925 -1.7 3.0 6.6 12.5 15.2 21.1 23.0 21.8 19.6 9.4 5.0 0.3 11.3 1366
1926 -0.5 2.6 4.0 9.8 16.2 19.8 22.8 22.4 17.9 12.7 4.9 -0.2 11.0 1367
1927 -0.3 3.1 5.7 10.6 15.3 19.3 22.3 20.4 18.5 13.4 6.8 -1.4 11.1 1372
1928 0.0 1.2 5.7 8.9 16.1 18.7 22.7 22.1 17.1 12.9 5.8 1.0 11.0 1383
1929 -2.9 -2.5 6.4 10.8 14.9 19.6 22.7 22.0 17.6 12.1 4.1 0.9 10.5 1386
1930 -4.0 3.8 4.5 11.9 15.4 20.2 23.6 22.5 18.8 11.1 5.5 0.1 11.1 1390
1931 0.7 3.0 4.0 10.6 15.2 21.3 24.0 22.0 20.1 13.6 7.1 2.4 12.0 1398
1932 0.4 2.3 2.9 10.7 15.8 20.6 23.1 22.3 18.1 11.7 4.8 -0.6 11.0 1403
1933 1.6 -0.9 5.0 10.1 15.7 21.9 23.4 21.9 19.6 12.4 5.7 1.8 11.5 1406
1934 1.4 0.2 5.3 11.7 17.9 21.6 24.4 22.7 17.8 13.4 7.5 0.5 12.0 1404
1935 -0.4 1.9 6.6 9.8 14.1 19.6 23.8 22.5 18.1 12.1 4.7 -0.8 11.0 1405
1936 -2.6 -4.1 6.2 9.8 17.4 21.2 24.6 23.6 19.1 12.2 4.7 1.6 11.1 1411
1937 -2.4 -0.1 3.6 9.9 16.4 20.4 23.3 23.4 18.4 12.0 5.1 0.3 10.9 1413
1938 -0.2 2.1 7.3 11.1 15.5 20.2 23.0 23.1 19.0 13.7 5.3 1.3 11.8 1414
1939 1.0 -0.5 5.5 10.5 16.9 20.5 23.3 22.3 19.6 12.7 5.7 3.0 11.7 1415
1940 -4.6 0.8 4.7 10.1 15.7 20.7 23.1 21.9 18.3 13.3 4.4 2.4 10.9 1414
1941 0.0 0.5 3.8 11.7 16.9 20.1 23.1 22.1 18.3 13.1 6.2 2.4 11.5 1422
1942 -0.9 -0.5 5.2 11.7 15.3 20.0 23.0 21.8 17.6 12.5 5.9 -0.3 10.9 1428
1943 -1.8 1.8 3.3 10.7 15.3 20.7 23.0 22.6 17.4 11.9 5.0 0.3 10.8 1428
1944 0.2 1.2 3.3 9.1 16.7 20.3 22.2 21.9 18.2 12.6 5.5 -0.8 10.9 1435
1945 -1.2 1.4 7.7 10.3 14.2 18.7 22.2 21.8 18.2 12.0 5.4 -1.6 10.8 1481
1946 -0.1 1.2 8.0 11.8 14.7 19.9 22.6 21.0 17.9 12.2 5.8 1.8 11.4 1486
1947 -0.2 -0.8 3.4 10.3 15.2 19.1 22.1 23.1 18.8 14.7 3.9 0.6 10.8 1507
1948 -2.4 -0.4 3.9 11.6 15.4 20.2 22.5 21.7 18.5 11.5 5.9 0.5 10.7 1623
1949 -1.7 0.4 5.2 10.7 16.3 20.6 23.1 22.0 17.4 12.9 7.4 1.2 11.3 1755
1950 0.1 1.4 3.9 9.1 15.3 19.9 21.5 21.0 17.5 14.0 5.0 0.4 10.8 1762
1951 -0.8 1.2 3.7 10.0 16.0 19.3 22.7 21.9 17.8 12.5 4.0 0.2 10.7 1789
1952 0.2 2.1 3.7 10.8 15.6 21.4 23.2 22.2 18.6 11.6 5.5 1.3 11.3 1802
1953 2.0 2.4 6.2 9.6 15.7 21.2 23.1 22.1 18.8 13.5 6.9 1.6 11.9 1816
1954 -0.5 4.4 4.3 12.1 14.7 20.6 23.6 22.2 19.1 12.9 7.0 1.1 11.8 1826
1955 -0.7 0.0 4.8 11.6 16.3 19.0 23.4 23.0 18.8 12.7 3.8 -0.1 11.0 1752
1956 -0.5 0.4 4.6 9.7 16.2 20.9 22.4 21.9 18.0 13.3 5.1 2.5 11.2 1755
1957 -2.0 3.1 5.3 10.9 15.6 20.6 23.1 21.9 18.1 11.3 5.7 2.7 11.4 1764
1958 0.0 -0.1 3.8 10.7 16.6 19.8 22.4 22.5 18.5 12.5 6.6 0.0 11.1 1768
1959 -1.3 0.7 4.9 10.9 16.4 20.9 22.7 22.8 18.4 12.1 4.4 2.4 11.3 1766
1960 -0.6 0.0 1.8 11.4 15.3 20.3 22.6 22.0 18.8 12.7 6.3 -0.3 10.9 1762
1961 -0.9 2.7 6.1 9.2 14.8 20.4 22.5 22.2 18.0 12.3 5.4 -0.2 11.0 1760
1962 -2.0 1.6 3.6 10.8 16.9 19.9 21.9 21.9 17.5 13.3 6.2 0.6 11.0 1800
1963 -3.2 0.6 6.5 11.1 16.0 20.3 22.7 21.8 18.8 15.0 6.8 -1.7 11.2 1849
1964 0.2 0.2 4.1 10.8 16.3 20.1 23.2 21.2 17.8 12.0 6.4 0.3 11.0 1840
1965 -0.3 0.2 2.8 11.0 16.4 19.4 22.2 21.6 17.2 12.5 7.2 2.4 11.0 1834
1966 -2.7 0.1 5.9 10.0 15.2 20.1 23.5 21.4 17.9 11.7 6.6 0.7 10.9 1829
1967 0.9 0.2 6.2 10.9 14.3 20.0 22.1 21.5 17.7 12.2 5.5 1.1 11.1 1821
1968 -1.4 0.1 6.5 10.8 14.7 20.2 22.6 21.7 18.0 12.7 5.6 -0.6 10.9 1820
1969 -1.4 0.6 2.9 11.6 16.3 19.7 23.0 22.4 18.6 11.4 5.6 1.1 11.0 1811
1970 -2.8 1.8 4.1 10.3 16.4 20.4 23.0 22.6 18.4 11.9 5.8 1.2 11.1 1796
1971 -2.0 0.6 3.8 10.0 14.6 20.8 22.0 21.9 18.3 13.4 5.6 1.8 10.9 1692
1972 -1.0 0.3 5.6 10.0 15.9 19.7 22.2 21.9 18.0 11.3 4.5 -0.3 10.7 1688
1973 -1.0 0.8 7.1 9.9 15.0 20.6 22.6 22.3 18.2 13.4 6.3 1.1 11.4 1685
1974 -0.2 1.1 6.5 11.0 15.6 19.8 23.0 21.3 16.9 12.0 6.0 1.2 11.2 1677
1975 0.2 0.5 3.7 8.5 16.2 19.8 22.8 22.0 17.1 12.8 6.3 1.1 10.9 1669
1976 -1.3 3.6 6.0 11.1 14.9 20.0 22.4 21.4 17.8 10.3 3.9 -0.5 10.8 1668
1977 -4.5 2.1 6.5 12.2 16.8 20.8 23.4 22.0 18.8 12.1 6.1 0.4 11.4 1658
1978 -3.0 -2.2 4.8 10.9 15.6 20.4 22.8 22.1 19.0 12.3 5.9 -0.1 10.7 1659
1979 -4.5 -2.8 5.8 10.3 15.4 19.9 22.6 21.7 18.8 12.8 5.6 2.2 10.6 1656
1980 -0.4 0.2 4.3 10.9 16.0 20.1 23.9 22.6 18.9 11.5 6.0 1.0 11.2 1648
1981 -0.1 2.5 6.1 12.7 15.2 21.0 23.0 21.9 18.0 11.3 6.9 0.6 11.6 1622
1982 -3.5 0.2 5.5 9.2 16.4 19.1 22.8 21.8 17.8 12.0 5.7 2.8 10.8 1605
1983 0.3 2.5 5.9 8.9 14.7 19.8 23.3 23.7 18.6 12.7 6.5 -2.9 11.2 1594
1984 -1.6 2.9 4.5 10.1 15.5 20.7 22.6 22.7 17.5 13.0 5.6 2.4 11.3 1593
1985 -2.6 -0.4 6.6 12.3 16.8 19.9 23.1 21.6 17.7 12.8 5.3 -1.2 11.0 1595
1986 1.1 1.9 7.4 11.6 16.5 21.2 23.2 21.7 18.2 12.6 5.6 1.9 11.9 1590
1987 0.0 2.8 6.3 11.8 17.5 21.4 23.2 22.2 18.6 11.4 7.1 2.0 12.0 1589
1988 -1.9 0.8 5.9 11.1 16.6 21.4 23.8 23.3 18.3 11.6 6.7 1.3 11.6 1598
1989 1.4 -0.9 5.9 11.5 16.0 20.4 23.3 22.0 18.1 12.8 6.1 -2.2 11.2 1597
1990 2.9 2.7 7.3 11.7 15.5 21.4 23.1 22.6 19.6 12.7 7.6 0.6 12.3 1572
1991 -0.7 4.3 7.0 12.4 17.8 21.4 23.6 23.1 18.8 13.2 5.1 2.7 12.4 1549
1992 1.4 4.3 7.0 11.6 16.5 20.1 22.3 21.3 18.5 12.7 5.7 0.9 11.9 1536
1993 0.0 0.0 5.5 10.7 16.8 20.4 23.3 22.9 18.0 12.3 5.2 1.8 11.4 1529
1994 -1.6 0.2 7.0 12.2 16.3 22.0 23.4 22.4 18.9 13.0 7.1 3.0 12.0 1519
1995 0.9 2.6 6.9 10.6 15.7 20.6 23.7 24.0 18.6 13.3 5.6 1.0 12.0 1494
1996 -1.0 1.9 4.1 10.8 16.6 21.4 23.1 22.6 18.1 12.6 4.6 1.7 11.4 1464
1997 -0.6 2.9 7.2 9.6 15.4 20.7 23.2 22.2 19.4 12.7 5.5 1.7 11.7 1432
1998 2.1 4.4 5.7 11.4 17.9 20.8 24.2 23.5 20.9 13.4 7.7 2.7 12.9 1429
1999 0.8 4.1 5.9 11.6 16.4 20.8 24.1 23.0 18.4 12.8 9.2 2.6 12.5 1448
2000 0.5 4.3 8.2 11.6 17.7 21.0 23.2 23.3 18.9 13.4 4.2 -2.1 12.0 1431
2001 -0.2 1.3 5.1 12.3 17.4 21.0 23.6 23.6 18.7 12.7 9.2 3.1 12.3 1437
2002 2.1 2.8 4.8 12.4 15.5 21.8 24.5 23.1 19.9 11.7 6.0 2.1 12.2 1421
2003 -0.1 0.5 6.6 11.6 16.4 20.3 24.0 23.8 18.6 13.6 6.7 2.0 12.0 1412
2004 -1.3 1.0 8.2 11.9 17.4 20.5 22.9 21.5 19.3 13.5 7.4 1.8 12.0 1381
2005 0.4 3.2 5.7 11.9 15.7 21.4 24.2 23.4 20.2 13.4 7.6 0.3 12.3 1220
2006 4.2 1.5 6.2 13.4 17.1 21.7 24.8 23.3 17.8 11.8 7.2 3.1 12.7 1205
2007 0.1 -0.2 8.6 10.5 17.3 21.3 23.6 24.0 19.6 14.4 6.7 1.0 12.2 1166
2008 -0.6 1.3 5.5 10.8 15.6 21.2 23.4 22.3 18.7 12.2 6.5 0.2 11.4 1170
AA -0.7 0.9 5.3 10.8 15.9 20.4 23.0 22.2 18.4 12.5 5.9 0.8 11.3
Ad -0.7 1.0 5.4 10.9 16.0 20.5 23.1 22.3 18.5 12.6 6.0 1.0 11.4
For Country Code 425
I note here, again, that this report is for Country Code 425: The U.S.A.
Run Log of GIStemp STEP0 run to completion
[chiefio@tubularbells STEP0]$ do_comb_step0.sh v2.mean
Clear work_files directory? (Y/N) y
Bringing Antarctic tables closer to input_files/v2.mean format
collecting surface station data
... and autom. weather stn data
... and australian data
replacing '-' by -999.9, blanks are left alone at this stage
adding extra Antarctica station data to input_files/v2.mean
created v2.meanx from v2_antarct.dat and input_files/v2.mean
GHCN data: removing data before year 1880.
created v2.meany from v2.meanx
replacing USHCN station data in v2.mean by USHCN_noFIL data (Tobs+maxmin adj+SHAPadj+noFIL)
reformat USHCN to v2.mean format
extracting FILIN data
getting inventory data for v2-IDs
USHCN data end in 2009
finding offset caused by adjustments
extracting US data from GHCN set
removing data before year 1980.
getting USHCN data:
-rw-rw-r-- 1 chiefio chiefio 10255476 Nov 7 10:21 USHCN.v2.mean_noFIL
-rw-rw-r-- 1 chiefio chiefio 9594277 Nov 7 10:21 xxx
doing dump_old.exe
removing data before year 1880.
-rw-rw-r-- 1 chiefio chiefio 9594277 Nov 7 10:21 yyy
Sorting into USHCN.v2.mean_noFIL
-rw-rw-r-- 1 chiefio chiefio 9594277 Nov 7 10:21 USHCN.v2.mean_noFIL
done with ushcn
created ushcn-ghcn_offset_noFIL
Doing cmb2.ushcn.v2.exe
created v2.meanz
replacing Hohenspeissenberg data in v2.mean by more complete data (priv.comm.)
disregard pre-1880 data:
At Cleanup
created v2.mean_comb
move this file from to_next_step/. to ../STEP1/to_next_step/.
Copy the file to_next_step/v2.mean_comb to ../STEP1/to_next_step/v2.mean_comb? (Y/N) n
and execute in the STEP1 directory the command: do_comb_step1.sh v2.mean_comb
[chiefio@tubularbells STEP0]$
The New Stations Since USHCN Cut Off in 2007
[chiefio@tubularbells Uv2study]$ more New.Station.inv
021514 33.2058 -111.6819 434.3 AZ CHANDLER HEIGHTS 025467 ------ ------ +7
026353 31.9356 -109.8378 1325.9 AZ PEARCE SUNSITES 022669 022659 ------ +7
035512 35.5125 -93.8683 253.0 AR OZARK 2 035508 ------ ------ +6
091500 31.1903 -84.2036 53.3 GA CAMILLA 3SE 093516 090979 ------ +5
100803 42.3353 -111.3850 1817.8 ID BERN 480915 ------ ------ +7
101956 47.6789 -116.8017 650.1 ID COEUR D'ALENE 100667 ------ ------ +8
105685 44.5664 -113.8953 1539.2 ID MAY 2SSE 101663 ------ ------ +7
106305 43.6039 -116.5753 752.9 ID NAMPA SUGAR FACTORY 101380 ------ ------ +7
116738 39.8058 -90.8236 198.1 IL PERRY 6 NW 113717 ------ ------ +6
141867 38.6758 -96.5097 402.3 KS COUNCIL GROVE LAKE 142602 ------ ------ +6
147542 39.7772 -98.7783 542.5 KS SMITH CTR 146374 ------ ------ +6
150381 36.8825 -83.8819 301.8 KY BARBOURVILLE 155389 ------ ------ +5
153762 37.7558 -87.6456 136.9 KY HENDERSON 8 SSW 156091 ------ ------ +6
170814 45.6603 -69.8120 323.1 ME BRASSUA DAM 177174 ------ ------ +5
171628 44.9197 -69.2417 90.5 ME CORINNA 176430 ------ ------ +5
180700 39.0303 -76.9314 44.2 MD BELTSVILLE 181995 ------ ------ +5
185718 39.2811 -76.6100 6.1 MD MD SCI CTR BALTIMORE 180470 ------ ------ +5
196783 42.5242 -71.1264 27.4 MA READING 191447 ------ ------ +5
198757 42.1608 -71.2458 50.3 MA WALPOLE 2 192975 ------ ------ +5
199316 42.1333 -71.4333 64.0 MA WEST MEDWAY 191561 ------ ------ +5
210252 48.3311 -96.8253 258.2 MN ARGYLE 213455 ------ ------ +6
213303 47.2436 -93.4975 399.3 MN GRAND RPDS FOREST LAB 216612 ------ ------ +6
215175 47.6308 -93.6522 422.8 MN MARCELL 5NE 219059 ------ ------ +6
244364 45.9353 -107.1375 944.9 MT HYSHAM 25 SSE 242112 ------ ------ +7
247318 47.3033 -115.0908 810.8 MT SAINT REGIS 1 NE 243984 ------ ------ +7
248569 47.8800 -105.3686 696.2 MT VIDA 6 NE 246660 ------ ------ +7
258133 41.4581 -100.5986 911.4 NE STAPLETON 5W 253540 ------ ------ +6
270706 44.3061 -71.6575 359.7 NH BETHLEHEM 2 270703 ------ ------ +5
288816 39.9500 -74.2167 30.5 NJ TOMS RIVER 288899 ------ ------ +5
292608 36.9358 -107.0000 2070.5 NM DULCE 052432 ------ ------ +7
300023 42.1014 -77.2344 304.5 NY ADDISON ------ ------ ------ +5
302060 42.0628 -75.4264 304.8 NY DEPOSIT 300360 ------ ------ +5
302129 41.0072 -73.8344 61.0 NY DOBBS FERRY ARDSLEY 307497 ------ ------ +5
304996 44.8419 -74.3081 268.2 NY MALONE 301387 ------ ------ +5
308737 43.1450 -75.3839 216.7 NY UTICA FAA AP 308733 308739 ------ +5
318694 36.3919 -81.3039 876.3 NC TRANSOU 310506 ------ ------ +5
321408 46.8769 -97.2328 285.0 ND CASSELTON AGRONOMY FM 325660 ------ ------ +6
332098 41.2778 -84.3853 213.4 OH DEFIANCE 335664 335669 ------ +5
364896 40.3333 -76.4667 137.2 PA LEBANON 2 W 363699 ------ ------ +5
367029 41.7394 -75.4464 548.6 PA PLEASANT MT 1 W 363056 ------ ------ +5
416794 33.6744 -95.5586 165.2 TX PARIS 344384 ------ ------ +6
422726 41.0222 -111.9353 1335.0 UT FARMINGTON 3 NW 427318 ------ ------ +7
425477 38.4500 -112.2292 1801.4 UT MARYSVALE 420519 ------ ------ +7
426135 39.7122 -111.8319 1563.0 UT NEPHI 422418 ------ ------ +7
427559 38.9139 -111.4161 2304.3 UT SALINA 24 E 425148 ------ ------ +7
427729 39.6847 -111.2047 2655.4 UT SCOFIELD-SKYLINE MINE 423896 ------ ------ +7
437607 44.6264 -73.3031 33.5 VT SOUTH HERO 306659 ------ ------ +5
437612 44.0725 -72.9736 408.7 VT SOUTH LINCOLN 435733 435740 ------ +5
451630 48.5472 -117.9019 474.6 WA COLVILLE 451630 451650 ------ +8
451939 47.3706 -123.1600 6.4 WA CUSHMAN POWERHOUSE 2 453284 ------ ------ +8
453222 45.8081 -120.8428 499.9 WA GOLDENDALE 453226 ------ ------ +8
455224 47.1358 -122.2558 176.5 WA MC MILLIN RSVR 456803 ------ ------ +8
457267 47.0894 -117.5931 592.8 WA SAINT JOHN 451586 ------ ------ +8
466989 38.6667 -80.2000 877.8 WV PICKENS 2 N 466991 ------ ------ +5
467029 37.5744 -81.5356 390.1 WV PINEVILLE 463353 ------ ------ +5
469610 37.6731 -82.2761 231.6 WV WILLIAMSON 469605 ------ ------ +5
475808 44.5297 -90.6383 310.9 WI NEILLSVILLE 3 SW 473471 ------ ------ +6
480552 42.6339 -106.3775 1831.8 WY BATES CREEK #2 487105 ------ ------ +7
481840 44.5219 -109.0633 1549.0 WY CODY 481175 ------ ------ +7
[chiefio@tubularbells Uv2study]$
So this is a list of the stations in the USHCN.v2 inventory format for which I need to create GHCN v2.inv file entries that look like:
42572786007 COLVILLE 5NE 48.58 -117.80 914 885R -9MVxxno-9x-9COOL CONIFER A1 0
Or as “ragged right”:
42572786007 COLVILLE 5NE 48.58 -117.80 914 885R -9MVxxno-9x-9COOL CONIFER A1 0
There are a few, like this one for COLVILLE, that already have an entry, but most of them do not.
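For reference, a record in the New.Station.inv listing above can be unpacked mechanically. This is a sketch only; the field meanings (cross-reference station ids, UTC offset) are inferred from the layout, not taken from the NOAA documentation, so check them against the USHCN station file readme before relying on this:

```python
# Sketch: unpack one record from the New.Station.inv listing above.
# Field meanings are inferred from the layout (coop id, lat, lon,
# elevation in meters, state, name, up to three cross-reference ids,
# UTC offset) -- verify against the NOAA station documentation.
def parse_station(record):
    tokens = record.split()
    return {
        "id": tokens[0],
        "lat": float(tokens[1]),
        "lon": float(tokens[2]),
        "elev_m": float(tokens[3]),
        "state": tokens[4],
        # name is everything between the state code and the trailing
        # three cross-reference fields plus the UTC offset
        "name": " ".join(tokens[5:-4]),
        "xrefs": [t for t in tokens[-4:-1] if t != "------"],
        "utc_offset": int(tokens[-1]),
    }

rec = "021514 33.2058 -111.6819 434.3 AZ CHANDLER HEIGHTS 025467 ------ ------ +7"
print(parse_station(rec))
```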
Conclusion
I think there is pretty clear evidence for significant warming of the temperature record from this “Selection Bias” or perhaps “Survivor Bias” in the US data. It is not just a “California Thing”. There is a similar deletion process in the thermometers for other major countries of the world. At present, I do not have alternative data sources for those temperature series.
What is very clear, however, is that this deletion of thermometers from the present reporting base introduces significant errors into the 1/10 C place, and perhaps even up into the whole degrees of C. For this reason, the GIStemp product is no longer usable for statements about the temperature of the planet, the direction of any trends, and certainly not for any policy decisions. For those, you would be better served to look out the window…
Can you provide a key for your data tables so that each column is clearly defined? It is missing from this essay.
– thanks
Anthony
REPLY: “Yeah, it needs to be there. I’d described it in the last dozen postings and was feeling a bit repetitive… but there will always be new folks who have not read the last dozen… -ems”
also, check your email and reply please
Amazing stuff E.M.
Here is a [url=http://members.westnet.com.au/rippersc/gisstemp.jpg]graph[/url]
REPLY: “Don’t know why, but for some reason WordPress tossed this version into the spam queue. Maybe it doesn’t like the ‘URL’ bit? Who knows… At any rate, thanks for the graph, I’m putting it into the report ‘soon’ -ems”
Amazing stuff E.M.
Here is a graph for you.
I love the smell of GIStemp GHCN goose(sausage) being cooked on E.M. Smith’s blog. Way to go, Chiefio!
REPLY: “Now you know why WSW is a bit delayed this weekend… -ems”
If your analysis holds up, it is breathtaking and makes more understandable why Gavin Schmidt would have asked that any suggestion he is “connected to” GIStemp be publicly corrected. If he is aware of what appears to be willful distortion in this dataset, he should run the other way.
Next question, how can this analysis be “peer reviewed” and vetted in a way that results in GIStemp being taken out of the science of climatology and Hansen being charged with fraud? There have to be consequences for government scientists being mendacious on this scale. And don’t give me the “he was doing his best with limited budget” crap. Doesn’t wash.
REPLY: ‘Well, I’ve put it up for the world to see. The Peers I care about most are able to look at it right now. Those peers are you’all. The ones who edit magazines for a living can catch up later ;-) IMHO this is nothing more than a return to how Science was done 100 years ago. You kicked it around with interested parties, sometimes published yourself, and later it might end up in some journal for the record. I see no real need for “gatekeepers on the truth”; though good editing and decent feedback ought to improve the product that reaches the record. Yeah, it’s a bit rougher on the ego, and yeah, it’s more “risky” in that if you FUBAR, you do it live on stage in front of the world instead of in a back room with a “peer”; welcome to the Wild West – I’ve got a few hundred years of ancestors who did not shrink from far greater adversity and risk: I’m not about to stop the tradition now.
BTW, my “budget” has been about $40 for coffee and $10 for tea, a recycled 20 year old “white box” PC, and my time. It took me about 6 hours programming time (interrupted by kids, cats, spousal units, door to door solicitors,…) over 2 elapsed days to do this. “Budget” is not the issue. The desire to do it is. -emsmith’
@Harold Vance
Perhaps I’ll be able to beg someone to do a validation test using completely different software and hardware 8-}
@all
Download USHCN.v2 data and description from:
http://www1.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/
The document that describes the files is:
http://www1.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/readme.txt
Extract 2006, 2007, and 2008 records.
Sum temperatures, by months and for whole year.
Compare to values above.
Values ought to be very close to the ones in the report above (differing only by the records that were not in the station inventory file v2.inv that I deleted).
This ought to be doable with Excel or any database or any programming language.
The file is just a flat file of text. Station, type flag, year, 12 months of temperatures…
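The check described in the steps above can be sketched in a few lines. The field order follows the description in the text (station, type flag, year, 12 monthly values); the exact column widths, units, and the -9999 missing-value sentinel are assumptions to verify against the readme.txt linked above, and the record shown is invented for illustration:

```python
# Sketch of the verification recipe above: pull a year's 12 monthly
# values from a USHCN.v2 monthly record and compute the annual mean.
# Field order per the text (station, type flag, year, 12 months);
# the -9999 missing sentinel is an assumption -- check the readme.
def annual_mean(line, missing=-9999):
    fields = line.split()
    station, flag, year = fields[0], fields[1], int(fields[2])
    months = [int(v) for v in fields[3:15] if int(v) != missing]
    if not months:
        return station, year, None
    return station, year, sum(months) / len(months)

# Hypothetical record, NOT taken from the real file:
line = "011084 F 2007 452 489 612 640 701 770 801 795 741 650 540 470"
print(annual_mean(line))
```

The same one-pass sum over every record for 2006-2008 is all the "Compare to values above" step needs.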
The comparison of the USHCNv2 file with GHCN is all that is needed for “proof”. Harold has already vetted that GHCN is selected to a reduced set; and, under the “California Beach” thread, we vetted that bit of code and result and showed that USHCN cuts off in 2007.
https://chiefio.wordpress.com/2009/10/24/ghcn-california-on-the-beach-who-needs-snow/
The only really “open issues” are did I screw up the selection of records to match the v2.inv station inventory; and is the conversion program buggy in some way? (posted source in the “How to fix it” thread):
https://chiefio.wordpress.com/2009/11/06/ushcn-v2-gistemp-ghcn-what-will-it-take-to-fix-it/
I’m working on a second, completely independent code path to what ought to be (almost) the same result. (The only difference ought to be in how the ‘skipped’ records are handled and that ought to be in 1/100 or less precision impact…)
But it would, as always, be A Really Good Thing to have a completely independent attack / proof of the process and conclusion.
Isn’t Real Science ™ fun? You get to beg people to attack your work 8-}
It might be interesting to go the other way too – if you use just the stations GISS used for 2008 for the entire available record. Just as a comparison, and assuming that they have the required data that far back. Yeah, I know – so much to do, so little time. I can wish, can’t I?
REPLY: [ It’s “on the list”… right after pizza with friends, decompress I need a break, pay bills it’s more fun ;-) But seriously, probably about 2 days away. I want to do a bit more “vetting” of the above stuff before running off to new approaches. But if I’m in code that does something near to that, you can bet I’m going to dump out that benchmark too… -ems ]
Thank you, sir, for your excellent work. And thanks for your visit as well.
A set of charts of global temperatures (or anomalies) by latitude band would be interesting, as it would tend to reduce the impact of the thermometers migrating toward the equator.
You’ve got sausages to grind; I might take this up when I get past the current crunch, and a bit of eye surgery.
===|==============/ Level Head
While I’m still reeling from your reply https://chiefio.wordpress.com/2009/11/09/gistemp-a-human-view/#comment-1529
There is more in your data above than the numbers give away at first sight


Annual means:
Records:
Not only does the change of records change the annual temperature at the beginning and end of the series, but it changes the overall trend as well.
What is the basis for fewer thermometers in the early part of the record?
REPLY: [ I don’t know for sure why the drop off happens early. I suspect it is an artifact of the change in station marking for “estimated”. NOAA changed what flags mean for “estimated” data, and the early data have a lot of estimates.
We have two moving parts here.
1) I translate the “new” flag into an “old” flag and may not have made the best selection of translations. It is not a clear “one for one”. And type M records get dropped by GIStemp (as that means they were just made up… estimated from near nothing.) My mapping is visible in the posted source code and suggestions for improvement are welcomed. I toss a lot of the subtle nuances of “estimated” into the M bin together.
2) The change of flags and meanings by NOAA was accompanied by a re-mark of the data by NOAA. It is possible that the very early data had the interpretation on “estimated” values changed from “estimated from something” to “estimated from not enough”, and this NOAA action could toss more records into the “M bin”.
It would take knowing exactly what each flag meant, and looking at a representative sample of the early records, to figure out which of these two moving parts is “the issue”. But you can see the rapid convergence of ‘kept record counts’ in the middle / late series shows the effect is concentrated in the very earliest years of sparse, questionable data.
I most strongly suspect my ‘flag mapping’ but it has not been high on my list of ‘issues to sort out’… -ems ]
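The kind of flag translation being discussed can be sketched as a lookup table. The actual mapping lives in the posted conversion program; the flag letters and choices below are illustrative assumptions, not the real table:

```python
# Illustrative sketch of a new-flag -> old-flag translation of the
# kind described above. These specific letters and mappings are
# assumptions; the real table is in the posted conversion program.
NEW_TO_OLD = {
    "E": "M",  # estimated            -> old "M" bin (GIStemp drops M records)
    "I": "M",  # interpolated         -> lumped into the same M bin
    "Q": "M",  # questionable/derived -> also M
    " ": " ",  # observed value passes through unchanged
}

def translate(new_flag):
    # Unrecognized flags default to M, the conservative "drop it" bin --
    # which is exactly the choice that could thin out the early records.
    return NEW_TO_OLD.get(new_flag, "M")

print(translate("E"))
```

Note how lumping every shade of "estimated" into M is the simple, safe choice, but it is also the mechanism that could explain the early-years drop-off described in the reply.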
Being British, I can’t ask these questions of GISS. Someone needs to do an FOI request, or ask their Congressman whether GISTemp has been starved of funding. I would expect lack of funds to be their first reason for not-updating the program for USHCN V2, and the drop off in thermometer count. They have had two years to do this. Then there needs to be an FOI request or congressional committee to find out the real reason why they haven’t updated it.
On this side we will start asking whether HADCrut have done the same as GISTemp.
REPLY: [ It is clearly not a funding issue. I did this fix in about 6 hours and I was not familiar with the data format changes, so most of the time was spent learning USHCN.v2 data format. This is something a programmer who regularly works on this bit of code and data could have done before lunch on a slow day… They have not done it for the simple reason that they didn’t want to do it. The real question is why did they decide to deprecate USHCN. That was a design decision, not a programmer budget issue… -ems]
E.M.,
I just found a site with the USHCN sites mapped and the data downloadable
http://cdiac.ornl.gov/epubs/ndp/ushcn/ushcn_map_interface.html
REPLY: [ Nice! Thanks! This is a great addition to the GISS site. -ems]
I have been watching your blog with interest since this summer when someone referred to it at CA. I applaud your efforts to get the code running because others have previously stated it is a difficult exercise and I believe an independent review of GISS is very desirable. Given a successful emulation then it becomes a very powerful tool in understanding what GISS has done.
This post is one of the first where you have done a test run which has an output that allocates an order of magnitude to the issues you have been raising. I am quite prepared to believe that there are issues within GISS and that the issues are significant but I do need to be convinced about the one you are describing here.
There are 3 arguments that can be made in defence of GISS.
1. The use of zones and anomalies should make the results immune from the impact of changes to the number of sites used unless those additional sites can be demonstrated to be radically different in their properties. You appear to believe anomalies and zones are not the saviour but I don't understand the processing mechanism that causes the difference you have identified here. You have made vague references to averaging of averaged averages and rounding issues but so far provided no clarity of the actual mechanisms at play.
2. GISS comes up with similar results to Hadley. There is some commonality of input and possibly some shared processing methodology so may have common issues but I wouldn't necessarily expect similar rounding problems.
3. GISS programs were “independently” reviewed by Nick Barnes “Clear Climate Code” project and I believe they got as far as implementing a revised version of one step in python which you have commented on. This project apparently reviewed and corrected rounding issues. Not sure why you are still finding them although it looks like the CCC project is unfinished so perhaps they did not get all the way through.
With respect to your emulation, have you been able to verify that it produces the same results as GISS by matching intermediate files and checking for differences? The CCC site does have outputs from the various steps and it would be nice to be reassured you are getting the same results as they get.
clivere: I have been watching your blog with interest since this summer when someone referred to it at CA. I applaud your efforts to get the code running because others have previously stated it is a difficult exercise and I believe an independent review of GISS is very desirable.
Thank you. It was not the fixing that was difficult, though, it was the coming to understand the details of some rather painfully written and under documented code. After that, the actual lines needed to get it to run were not that many. It’s documented here:
https://chiefio.wordpress.com/2009/07/29/gistemp-a-cleaner-approach/
Given a successful emulation then it becomes a very powerful tool in understanding what GISS has done.
Um, minor “nit” to harvest. This is not an emulation. This is GIStemp. The real deal.
Ported to run on Linux with the minimum changes possible / needed for stability; and none of them very significant to the operation. This was a deliberate act so that any “benchmarks” done can be vetted as clearly GIStemp and not “my code”.
This post is one of the first where you have done a test run which has an output that allocates an order of magnitude to the issues you have been raising.
Um, perhaps you missed one of my earliest ones where I found an issue with the compiler dependent math done in USHCN2v2.f of 1/10C per record in 10% of records with an overall 1/100 C warming of the data series?
https://chiefio.wordpress.com/2009/07/30/gistemp-f-to-c-convert-issues/
or:
https://chiefio.wordpress.com/2009/08/12/gistemp-step1-data-change-profile/
which finds a measured couple of tenths C uplift in the data through step 1
or the very similar posting about the STEP0 data change profile…
My pattern is to assess the code and do a qualitative review and then do a code review and then construct and do a benchmark that does measurement. Then I do a “post benchmark evaluation” and sometimes a “fix” such as this fix to put the current USHCN.v2 thermometers back in. (Which then gets followed by a re-benchmark).
I’m not surprised you haven’t caught up on it all.
And while not strictly GIStemp, the analysis of the input data change over time and space has had a fair degree of numerical, measurable, testable content:
https://chiefio.wordpress.com/2009/11/03/ghcn-the-global-analysis/
though of a more ‘distribution’ sort rather than ‘one value’ sort.
I am quite prepared to believe that there are issues within GISS and that the issues are significant but I do need to be convinced about the one you are describing here.
No problem. You have the code and the data. Easy enough to duplicate. For that matter, you can take the data for a sample period of time and count them by hand or even using Excel.
This posting is about the “selection bias” that comes in the leaving out of the USHCN.v2 data set. Nothing more. So you can take that USHCN.v2 data for any individual month (after the USHCN old version cuts off) and add it up with whatever tool makes you comfortable.
You do not need GIStemp running for that process. You ought to get values very close to those in the above charts. (In further benchmarks I will be taking this data through the rest of GIStemp. To validate what it does with this +0.6 C selection bias, you will need GIStemp running).
There are 3 arguments that can be made in defence of GISS.
I would suggest that there are more than that. Several of the processes that it uses / advocates have some merit. They are just not sufficient to overcome the issues.
1. The use of zones and anomalies should make the results immune from the impact of changes to the number of sites used unless those additional sites can be demonstrated to be radically different in their properties.
And there’s your first leap of faith. “Immune”.
Nope.
And I have demonstrated that the inputs are “radically different in their properties”. See all the “by latitude” postings and the new series of “by altitude” postings along with the demonstration that the older longer records show no bias but the newer records do show bias. Those are “radically different” inputs.
No code is perfect and no method is perfect. I would use words more like “mitigate” or “more robust” or “resistant”. And I believe that it does mitigate the impact, but with less than 100% perfection, so the 0.6C bias can, and does, leak through. Mitigated, but not removed. See this (admittedly ‘first cut’ and rough) benchmark that shows “the anomaly changes”:
https://chiefio.wordpress.com/2009/11/12/gistemp-witness-this-fully-armed-and-operational-anomaly-station/
And since it does change the anomaly process is not perfect and the product is not “immune”. So now we have to look at exactly “how much” and measure: but it is not “zero”. (That “measuring in painful detail” is what is on the plate for the coming week).
You appear to believe anomalies and zones are not the saviour
It was a belief up until I ran the benchmark. Now it is a demonstrated fact. Next it will become a measured quantity. That is the process of vetting code with benchmarks.
No filter has perfect “Q”. GIStemp is purported to be a filter with perfect “Q”, yet through the first steps it is measured to be an “amplifier”, not a “filter”. That means the following section (STEP 3) must be one heck of a filter. Beyond perfect filtering of the data bias, it must also filter out the amplification of the early steps…
but I don't understand the processing mechanism that causes the difference you have identified here.
Pretty simple: USHCN “cuts off” in 2007. GHCN drops most of the same stations at about the same time. By using USHCN.v2 (that GISS ought to have done already) I “put them back” into the input data. Then measure a 0.6C impact from the change. There is no opinion in this, it is a measured behaviour.
The only argument that can be made about it is that the US thermometers ought to be left out and that my putting them back in is somehow wrong. I’m willing to explore that, but it is a weak argument. (Especially given that taking them out biases the STEP0 output by a 0.6C demonstrated warming. You would need to prove a -0.6C corrective behaviour in the following steps of GIStemp to make that an acceptable behaviour.)
You have made vague references to averaging of averaged averages
Nothing at all vague about it. It is an accurate statement of what GIStemp and NOAA do. The daily MIN / MAX are averaged. That average is adjusted by NOAA based on other data that are themselves often averages. That corrected average is then averaged for the days of a month to give the monthly Average of MIN/MAX mean that is fed into GIStemp.
All this is before GIStemp does its own first averaging. There are many more steps in GIStemp that do averaging and I will not be repeating them here. It is posted under the GIStemp tab up top and covered in detailed articles that look at each coding step as I finish it. PApars.f in particular has been covered as it does the UHI “adjustment” based on averaging together up to 20 records (IIRC) that are used to move the past temperatures of a given station. And in STEP1 various fractional records are merged by various in-fill and splice processes that use, yes, averages. It’s all there in the code, and has no vagueness about it. And, of course, there is much more averaging done in the creation of the ‘zonal averages’ and the grid / box averages used to produce the anomalies that are measured against the “baseline average”.
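As a toy illustration of the first links in that chain (all numbers invented), the very first monthly value is already an average of averages before GIStemp ever sees it:

```python
# Toy illustration (invented numbers) of the averaging chain described
# above: each day's MIN/MAX are averaged, then those daily means are
# averaged over the month -- an average of averages -- before GIStemp
# does any averaging of its own.
daily_min_max = [(10.0, 22.0), (11.0, 23.0), (9.0, 21.0)]  # (min, max) per day, deg C

daily_means = [(lo + hi) / 2.0 for lo, hi in daily_min_max]  # first average
monthly_mean = sum(daily_means) / len(daily_means)           # second average
print(daily_means, monthly_mean)
```

Every further step (reference-station merges, zonal averages, grid/box averages, baseline averages) stacks more layers on top of this.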
and rounding issues but so far provided no clarity of the actual mechanisms at play.
See the above
https://chiefio.wordpress.com/2009/07/30/gistemp-f-to-c-convert-issues/
as one example. Every single averaging step needs the same kind of analysis. I’ll get there eventually, but there is only one of me and discovering things like the violence done to the thermometer record by deleting 93% of the thermometers in the USA has taken priority. It is, however, very clearly the case that when you do arithmetic in a computer you will accumulate rounding errors in the low order bits.
(There are some ‘infinite precision’ math packages, but those are highly specialized and not used in GIStemp. GIStemp uses standard FORTRAN and when using regular INTEGER and REAL data types, you must “vet” every single math operation performed for: rounding, underflow, overflow, truncation, precision, accuracy, and generally be aware that error accumulates unless you have taken extraordinary pains to avoid it. Pains that are not in evidence in GIStemp.)
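A minimal demonstration of the kind of low order bit error accumulation meant here (in Python rather than FORTRAN, but the binary floating point behavior is the same):

```python
# Minimal demonstration that ordinary floating-point arithmetic
# accumulates error in the low order bits: 0.1 has no exact binary
# representation, so repeated addition drifts from the true sum.
total = 0.0
for _ in range(10):
    total += 0.1
print(total)         # not exactly 1.0
print(total == 1.0)  # False
```

One sum of ten terms already shows drift; GIStemp performs millions of such operations across its steps, which is why each one needs vetting.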
FWIW, rounding and similar issues are, IMHO, the smallest issue with GIStemp. I would guess it is more than the 1/100C place (we’ve already found that much) but less than the 1/2 C place. Somewhere in the 1/10 to 3/10 C as a reasonable guess.
2. GISS comes up with similar results to Hadley.
All the more reason to suspect the Hadley code “shares issues”. I will gladly give it the same type of code review if they ever decide to release their code. Until then, Hadley is just a ‘black box’ that lost their data. For all we know, it could be GIStemp. When you have a demonstrated broken clock, and another agrees with it, you suspect both of being broken…
These folks share papers, share journals, share beliefs. It is very common for folks indulging in that much “group think” to come up with similar “solutions” with similar failings. So Hansen published a peer reviewed paper showing ‘up to 1000 km’ can be used for ‘the reference station method’. I could easily see a researcher at Hadley saying: “Well, it’s peer reviewed, so we better use 1000 km and the reference station method”. One of the problems with the ‘must be peer reviewed’ mantra is that it enforces group think and shared error.
https://chiefio.wordpress.com/2009/09/14/gistemp-pas-dun-coup/
There is some commonality of input and possibly some shared processing methodology so may have common issues but I wouldn't necessarily expect similar rounding problems.
I, too, think that the “commonality of input” is the biggest issue and would suspect that “shared processing methodology” is most of the rest. As this article demonstrated, the “input” selection bias is a measured 0.6C from the single decision to leave out USHCN.v2 in 2007. This is greater than the 1/2 C probable upper bound I would put on math precision / rounding issues.
To put a very fine point on it: I suspect that the final answer will simply be that folks believed in a “perfect filter” and deleted or changed thermometers based on that belief. But that belief is fundamentally broken. No filter is perfect. So both Hadley and GISS take the GHCN data and assume a perfect filter will fix it and both write imperfect filters. It takes nothing more than that for them to make similar results.
But I assume no filter is perfect; so I’m doing what they ought to have done in the first place: Measure the actual changes in the data and measure the actual degree of filtering done (the filter “Q” or “Quality Factor”.) And those numbers will tell everyone exactly how good a filter it is, and how much of the 93% thermometer deletions show up in the product. But we already know the quantity is more than zero.
3. GISS programs were “independently” reviewed by Nick Barnes “Clear Climate Code” project and I believe they got as far as implementing a revised version of one step in python which you have commented on.
I believe their stated goal was to do a “translation” not a benchmark and code QA. I would expect them to find some issues as part of a port / translation, but the major focus of a port is not testing the design of the original, but rather to make the port function the same way, bugs and all.
This project apparently reviewed and corrected rounding issues.
I think you are making a bit of a leap. They found and fixed one or two issues, IIRC. One being in USHCN2v2.f where a FORMAT statement was ‘off by one’ in the data read step. That “fix” was already in the code I downloaded, so they did not find the compiler dependent 1/10 C error from bit shifting that I found above. They did not do an exhaustive math review. The report of it I heard was more of a “found this by accident while doing port” rather than “was doing math benchmarking and found…”. (But in reality, only they can speak to their process).
Not sure why you are still finding them although it looks like the CCC project is unfinished so perhaps they did not get all the way through.
As I understand it, they are not done. Also, I’m a bit of a stickler on math issues. But as noted above, I don’t think I’ll find much above the (already demonstrated) 1/10 C place.
With respect to your emulation
It is not an emulation. It is the real GIStemp code being run on the NOAA data as directed in the GIStemp documentation. It is GIStemp.
have you been able to verify that it produces the same results as GISS by matching intermediate files and checking for differences.
I don’t really need to do that since it IS GIStemp. But spot checks can be done. I am also in communication with a gentleman doing a C port and we have matched results. As part of my “end to end benchmark” next week, I’ll be downloading the current NOAA data and at that time I’ll compare the STEP3 anomaly reports from “my” GIStemp to the published reports.
Once again: This is the GIStemp code. The things it took to make it run were not material. (Almost entirely matching the f77 or g95 compiler to the steps that used f77 or f90 and taking data initializations out of variable declarations into dedicated DATA statements. Not the kind of thing that would affect processing or output. Though I did surface that their code was sensitive to compiler choice in that STEP0 F to C conversion step… and provided a “fix”.)
The CCC site does have outputs from the various steps and it would be nice to be reassured you are getting the same results as they get.
Anyone wants any output from any step, just tell me what FTP server to put it on.
This is an “all open, all visible” operation. Anyone who wants to join in is welcome to come to the party. I’ve put a fair amount of output up already. Any one can compare it. Other output available upon request.
As long as it’s just me, though, it’s going to be what I put at the head of the queue that gets done first. For now, that’s to assume that GIStemp is GIStemp when the source code is from the NASA site; and benchmark what it does to the data.
AFTER I’ve got the data and process characterized and benchmarked “end to end”, well, then I’ll worry about whether there are a few bits of precision in the low order bits that have any jitter between their hardware and mine or their f90 compiler vs the g95 compiler.
FWIW, the “rough benchmark” of the anomaly step (in the link above) looks like about a 2/10 impact from the 6/10 of uplift measured in this posting. When I have the final vetted benchmark done and run I’ll have a ‘real number’ to post. But as a ‘first look rough cut’ that’s about a 66% reduction (of this “bolus” of warming bias) from the Zone, Grid, Box anomaly step. Not a bad performance. More than I’d expected.
Unfortunately, that means 1/3 gets through. That would imply about 0.2C of warming bias in the final anomaly step comes from this “selection bias” of leaving out USHCN.v2 after the file format change.
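The arithmetic behind that rough estimate, spelled out (the 0.6 and 0.2 figures are the first-cut benchmark values from the text, with all the same caveats):

```python
# The rough "filter Q" arithmetic from the text: a 0.6 C selection-bias
# step measured at STEP0, of which roughly 0.2 C survives the
# anomaly/grid/zone step in the first-cut benchmark.
input_step = 0.6   # C uplift from putting the USHCN.v2 stations back in
output_step = 0.2  # C remaining after the anomaly step (rough first cut)

attenuation = 1.0 - output_step / input_step
print(attenuation)  # roughly 0.667: about two thirds removed, one third gets through
```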
Since some folks want us all hot and bothered about 1/10 C (and some folks even get lathered about 1/100 C place changes) that is a ‘material issue’…
I was not going to put that out for public view until I’d done a fuller and more clean benchmark (updated v2.inv file, fresh download, etc.). But since we’re “on the topic”, and with a pile of “early in the benchmark process” caveats, that’s about the figure of Q we’re looking at.
Just to make it clear: I’m fairly certain that the “Anomaly, box, grid, zone” process does, in fact, reduce the impact of thermometer change. I am just also fairly certain that it is not a perfect filter and some of the input change bias gets through. The first cut rough measurement is 1/3, but the final cut of the benchmark will be definitive.
Hope this (rather long…) reply is helpful to you.
E.M.,
I will await next week’s posts with eager anticipation (more so than usual).
…rounding issues…. somewhere in the 1/10 to 3/10 C as a reasonable guess.
1/3 gets through. That would imply about 0.2C of warming bias in the final anomaly step comes from this “selection bias” of leaving out USHCN.v2 after the file format change.
Now we’re getting to the foundations and have discovered the big layer of clay that has the potential to undermine the stability of the whole product.
I realise what you have found so far is USHCN/GHCN related. Add that to station changes and poor, even reverse, UHI adjustment and my belief is that the 0.6-0.7C of unprecedented global warming will reduce to a more normal natural variation magnitude of 0.3C.
REPLY: [ Sounds about right to me. This is slow slogging, but that is the only correct way to do it. I could get it done in about 1/3 the time if my funding were something more than zero and my time available was more than “spare time between chores”. Oh well. It will eventually all get done. -ems ]
ok – thanks for the reply which clarifies some of this for me. Please don't assume I am hostile to what you are doing. I have reason to believe there are issues with GISS, but you appear to be uncovering something I was not expecting, so I am probing to get a better understanding.
I don't want to get hung up about terminology so I will refer to your version if that is ok with you.
I am pleased you are taking steps to compare outputs as that will improve confidence that your version is ok.
I had previously read the 0.6C as being the difference between full runs of all the steps. My mistake. You are suggesting that the actual difference for a full run is likely to be a lot less than 0.6, which for me is more subtle and therefore much more plausible. Particularly as you imply rounding is a significant contributor.
You have also clarified that the use of anomalies and zones is post the step in question and I now understand why they don't matter for this issue.
From a processing perspective I still don't understand the exact nature of the issue. The change in number of records is a trigger. I recognise that rounding issues can go in different directions depending, for example, on whether individual figures are rounded, totals are rounded, whether the rounding is up or down, values truncated, etc.
Looking at the records from the early years it appears that lower numbers of records mean higher yearly averages and I am intrigued at the actual mechanisms which would have that impact particularly as the magnitude is still relatively gross.
It would also be interesting what the impact would be on the period from 1900 to 2008 if each year was somehow constrained to the same number of approx say 900 records from mainly the same stations.
clivere: ok – thanks for the reply which clarifies some of this for me. Please don't assume I am hostile to what you are doing.
You’re welcome; and I don’t… It’s just that the field is rather technical so precision in the details matters.
I don't want to get hung up about terminology so I will refer to your version if that is ok with you.
Unfortunately, the terminology is highly important. An “emulation” is a fake that looks sort of like the original. A new instantiation of the original is most likely within one part in a few thousand of identical (and will most often be identical to the limit of all bits.) As an analogy, a “soyburger” emulates hamburger; while the McDonald’s 1 mile away is expected to be indistinguishable from the one 10 miles away.
Calling it “my version of GIStemp” is completely accurate. Calling it “my port of GIStemp” is more precise but only the geeks among us will notice.
I am pleased you are taking steps to compare outputs as that will improve confidence that your version is ok.
It’s all just a matter of time. I’ve done software QA for a living before (major compiler tool chain, among other products). Had a team of a 1/2 dozen folks then, though… I’m just frustrated at how slow it’s going and how long it takes me sometimes.
I had previously read the 0.6C as being the difference between full runs of all the steps. My mistake.
Understandable one. Again, one of those “terminology” things that matters. I specifically was measuring only the “Selection Bias” (or more properly, the “Survivor Bias”) from the decision NOT to do the maintenance programming to keep the USHCN thermometers in during the transition to USHCN.v2 format.
When measuring a filter, you want a step function in the input, then measure the degree to which it gets to the output. This posting measures that step function in the input from that decision. The bigger it is, the better the filter must be to remove it. 0.6 C is a big rise, so STEP3 will have a great deal of work to do from this step function. It also has about a 0.2 C warming bias from STEP0 and STEP1 (documented in that other link) to remove as well. So we’re getting to the point where we have about a whole degree C of warming bias in the input that STEP3 is expected to remove. That will be very hard to do.
You are suggesting that the actual difference for a full run is likely to be a lot less than 0.6, which for me is more subtle and therefore much more plausible.
I’ll go stronger than that: GIStemp will reduce that 0.6C to some significant degree. The first cut benchmark is that it will end up at about 0.2C with a +/- 0.1C error bar. But since 0.1C is held out as the metric for “Doom in Our Time”, that is a significant deal…
Particularly as you imply rounding is a significant contributor.
You seem particularly focused on “rounding error”. It is only one class of error, and often one of the smaller ones. I mention it (probably more than is justified) because it is something that always exists in normal computer math (such as in FORTRAN) and most folks know what a rounding error is. If I say “bit shift error”, folks will glaze over. (But that was, IMHO, the cause of the 1/10 C error in the F to C conversion. The order in which the math was done sometimes caused a larger “bit shift” in the ( 5 * T )/9 calculation, so depending on the compiler you used you got a different enough result to warm 10% of the records by 1/10 C. I ‘fixed it’ by doing the 5/9 division only once, and thus with no variable bit shift… multiplying by 100*5 shifts everything 2 decimal places, then dividing by 9 moves it back one… and some precision fell off the end due to the bit shift…)
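The general failure mode can be sketched in a few lines of Python (a hypothetical illustration of the order-of-operations problem, not the actual FORTRAN or its compiler-dependent behaviour; the function names and the sample value are mine). Both versions convert tenths of a degree F to tenths of a degree C using integer math; the only difference is when the truncating division happens:

```python
def f_to_c_divide_late(tenths_f):
    # Multiply first, divide last: only one truncation, at the very end.
    return (5 * (tenths_f - 320)) // 9

def f_to_c_divide_early(tenths_f):
    # Divide first: low-order digits are thrown away before the multiply,
    # and the loss is then amplified by the factor of 5.
    return 5 * ((tenths_f - 320) // 9)

t = 531  # 53.1 F in tenths; true value is 11.72 C, i.e. 117 tenths
print(f_to_c_divide_late(t))   # 117
print(f_to_c_divide_early(t))  # 115 -- a 0.2 C difference from ordering alone
```

Same inputs, same arithmetic operators, different answers: exactly the kind of thing that varies with how a compiler arranges the math.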
One of the more obscure in FORTRAN is that division by an integer truncates:
10 / 3 = 3
If you want to get 3.3333… you need to do 10.0 / 3.0 or sometimes written as 10. / 3. so just a missing “.” can turn a 3.3333333 into a 3 with no fractional part.
An obscure behaviour of FORTRAN, but that’s the kind of thing that needs to be looked at for every single bit of math done in the whole thing. (BTW, there are several more kinds of ‘math error’ like these that can cause significant errors in a program and are often not thought about by non-programmers – i.e. researchers, accountants, etc.)
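The same trap is easy to demonstrate in Python (the language ccc-gistemp uses), where the `//` operator behaves like FORTRAN integer division for positive values; note one extra wrinkle for negative values, sketched in the comments:

```python
print(10 // 3)   # 3: the fractional part is discarded, like FORTRAN's 10 / 3
print(10 / 3)    # 3.333...: what FORTRAN's 10.0 / 3.0 gives you

# For negatives the two languages part ways: Python floors toward minus
# infinity, while FORTRAN truncates toward zero.
print(-10 // 3)     # -4 (Python floor division)
print(int(-10 / 3)) # -3 (truncation toward zero, the FORTRAN result)
```

So a straight `//` port of FORTRAN integer math is only faithful for positive operands; another of those details an audit has to check case by case.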
You have also clarified that the use of anomalies and zones is post the step in question and I now understand why they dont matter for this issue.
Now that I’ve made it clear, let me muddy it up just a little :-)
In STEP1 there is a comparison of fragments of some records to each other and to the averages of their neighbors. This step splices fragments together and fills in missing bits. It does use a “sort of an anomaly” in that it computes the offset between a station and other stations and treats that offset as the “anomaly” for computing the fill-ins. Technically it is a kind of anomaly, but not in the sense most folks think of (zonal / grid / box anomalies). The actual math is more a comparison of averages, but they are normalized to a mean, so technically it is an ‘anomaly’-like process. (Though I think it is best described as a comparison of the station data to the average of its neighbors.) I know, painfully detailed technical minutiae… but that is the stuff of computer program audits…
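As a hypothetical sketch of that “offset as anomaly” fill-in idea (my own function and toy data, not GIStemp's actual STEP1 code): measure the mean offset where the station and the neighbor average overlap, then fill the station's gaps with neighbor average plus that offset.

```python
def fill_missing(station, neighbor_avg, missing=None):
    # Months where both series have data give us the overlap.
    overlap = [(s, n) for s, n in zip(station, neighbor_avg)
               if s is not missing and n is not missing]
    # The mean station-minus-neighbor difference is the "offset anomaly".
    offset = sum(s - n for s, n in overlap) / len(overlap)
    # Fill gaps with neighbor average shifted by that offset.
    return [s if s is not missing else round(n + offset, 1)
            for s, n in zip(station, neighbor_avg)]

station  = [10.0, None, 12.0, None]
neighbor = [ 9.0, 10.0, 11.0, 12.0]
print(fill_missing(station, neighbor))  # [10.0, 11.0, 12.0, 13.0]
```

The real code is far more involved (record fragments, reference neighbors, weighting), but this is the shape of the "comparison of averages" being described.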
But at the end of the day, it still doesn’t matter. It is the final end to end benchmark that will measure “Q”.
From a processing perspective I still don't understand the exact nature of the issue. The change in the number of records is a trigger.
BINGO!
The claim is that you can have 100 lbs pressure in the hose, or 10 lbs pressure and the same spray will come out the far end.
I’m measuring how much the pressure on the input changes (0.6 C from this one ‘decision’ to leave out USHCN.v2) and did a ‘peek ahead’ and saw the spray changes (by what looks like 0.1 C to 0.3 C ‘by eyeball’). Now comes the part where we put on the rain coat and take the tape measure and measure exactly how much the range changed…
I recognise that rounding issues can go in different directions depending, for example, on whether individual figures are rounded, totals are rounded, whether the rounding is up or down, values are truncated, etc.
Yes, but bit shifts tend to have a bias. You shift one way and drop low order bits, then you shift back. The effect is simply to turn those low order bits into zeros. If used in a division, you get a bias up or down depending on which side of the divide it is on… The bias in that particular calculation will always be in the same direction (as long as we’re talking integers… i.e. not “divide by 0.5” which is really a multiply by 2 as far as bit shifting…)
Looking at the records from the early years it appears that lower numbers of records mean higher yearly averages, and I am intrigued by the actual mechanisms which would have that impact, particularly as the magnitude is still relatively gross.
The early years are particularly sparse in data. I suspect that it is an artifact of NOAA ‘rescoring’ some of the early data as “estimates” (GIStemp tosses out “fully estimated” records…). This is a ‘selection bias’ issue, rather than a ’rounding issue’ IMHO.
It would also be interesting to see what the impact would be on the period from 1900 to 2008 if each year were somehow constrained to the same number of records, say approximately 900, from mainly the same stations.
That was one of my earlier attempts. Unfortunately, GIStemp is written in a very “brittle” way (that is, not resilient; for example, in my “fix” for USHCN2v2.f I put out a “log file” of records “not in the v2.inv file” and keep on going, where the original just dies on you).
I’ve spent the better part of the day trying to get a consistent USHCN.v2 file, cut back to end in 2007 as the original USHCN does, to run through STEP1. It runs through STEP0 just fine (as does the full through-2009 version of USHCN.v2); but STEP1 tosses its cookies. I don’t yet know how to fix it.
So what ought to have been a 1/2 hour job (cut USHCN.v2 to 2007; run steps 1 through 3; compare to the prior full USHCN.v2 and to the prior USHCN-2007 runs) has instead been 1.5 days with nothing to show for it. So far… It would have been a great benchmark. (USHCN-2007 vs USHCN.v2-2007 showing the exact impact of the NOAA change from 1/100 F to 1/10 F; while USHCN.v2-2007 vs USHCN.v2-2009 would give an exact “selection bias” reading from the added time span only.)
Instead I’m trying to debug badly designed and poorly written brittle code with lousy error handling and no documentation to speak of. And getting nowhere fast. All you get is a hard crash and not much in the way of usable error or activity logs. I suspect somewhere there is a file with some fragment of a site in it that expects to match some input record, and it crashes instead of doing something reasonable (like “log it, skip it, and move on”). This “splicing” step has several hand-made custom tweak files that feed it little bits of mystery.
So that is why all those lovely clean comparisons, that we’d all like to see, come out so slowly. And why the ‘first benchmarks’ out are often these “mixed things”. It is because that was the only thing that would make it through the code without crashing…
It seems to be particularly sensitive to data deletions. Exactly the thing you need to do for the most interesting tests.
Oh well, I’ll figure it out. It just takes time…
Currently our ccc-gistemp uses USHCN version 1, and official GISTEMP uses USHCN version 2. So the somewhat ad hoc comparison I did last month can be used to get a rough idea of the difference between using one version or the other. Our blog post, “How close are we to GISTEMP?”, shows the differences (in the global anomaly) to be at most 0.01 or 0.02 K. I suspect that those differences are due to things other than USHCN v1/v2.
I’ll take that as an invitation, although I should point out that the project is not just me. As we have made clear from the beginning, we are not scientists, and are not interested in making a new analysis. We want to take the mishmash of grotty old Fortran, typical science code, especially for older code bases – make no mistake, GISTEMP is not particularly bad in this respect – and write some lovely clear code implementing the same algorithm which anyone can understand. That code can then form the basis for informed discussion of the algorithms (and, quite possibly, for improving those algorithms). For instance, anyone could easily tinker with the STEP1 scribal-record combination.
Before our first announcement last year we had STEP0, STEP1, and STEP3 in Python. Now we have the whole thing in Python, including STEP4. We’re working on STEP6. I’m rewriting our STEP0 USHCN import to match the new USHCNv2 import that GISTEMP now has. Once that is done, we’re going to make a release from our GoogleCode project and an announcement post over on our blog. In the meantime, anyone can download all the current code from GoogleCode.
Our results match GISTEMP, of course, quite closely (not exactly, for a small number of reasons including the difficulty of doing mixed-precision floating-point arithmetic in Python, which defaults to double). If they didn’t, that would be a bug. Our code includes programs to compare result files and generate an HTML report on the differences found.
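That mixed-precision point can be seen directly: a Python float is a 64-bit double, while a FORTRAN REAL is commonly 32 bits, and the same literal stores slightly differently in each. A small illustrative snippet (my own helper, not ccc-gistemp code):

```python
import struct

def as_real32(x):
    # Round a Python double to the nearest 32-bit value, i.e. what a
    # FORTRAN single-precision REAL would actually store.
    return struct.unpack('f', struct.pack('f', x))[0]

x = 0.1
print(x == as_real32(x))      # False: 0.1 is not exactly representable
print(abs(x - as_real32(x)))  # about 1.5e-9, the scale of the drift
```

Differences of that size per value, accumulated across millions of records, are exactly the kind of thing that keeps two faithful implementations from matching bit for bit.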
That’s the first pass.
The second pass will involve refactoring the code. Our current STEP2 code includes some of GISS’s own STEP2 (which was already in Python), and is in serious need of improvement.
All the code carefully matches the low-level Fortran behaviour. As a trivial example, STEP3 separates all the temperature records into 6 bands before calculating box means, because when the code was written it was not possible to load all the data into memory at once. This will have some (probably tiny) effect on the results, because it affects the order in which records are combined, which will affect rounding order, for instance.
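The rounding-order effect comes from the fact that floating-point addition is not associative, so splitting the records into bands changes which partial sums get rounded when. A tiny Python demonstration:

```python
vals = [0.1] * 10

all_at_once = sum(vals)                       # one left-to-right running sum
in_two_bands = sum(vals[:5]) + sum(vals[5:])  # combine two partial sums

print(all_at_once == in_two_bands)  # False
print(all_at_once - in_two_bands)   # on the order of 1e-16
```

Tiny per sum, but it means any reordering of how records are combined can legitimately perturb the last bits of the result.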
The third pass will involve simplifying a lot of that behaviour: retaining the same high-level algorithm but losing the 80s-style data pipelining. As well as being much clearer, we expect the resulting code to go a lot quicker.
For more information, come over to the blog and ask. Note that it is not, absolutely not, yet another blog in which people to-and-fro abusively on the politics of climate change, or on random climate news. There are plenty of blogs like that out there, and no such comments will be tolerated at CCC. (NikfromNYC’s comment today is right on the borderline; I passed it because it asked some important questions about the actual project, which we are happy to answer).
We welcome contributors – people who are willing to either write or review code. This absolutely includes “sceptics”, as a recent blog post made clear. Volunteers who are able to do the work, and have the time and inclination, are thin on the ground, so it can take us a long time to get much done.
@drj11:
Unfortunately, you can’t compare USHCN v1 (which truncates in 2007) with USHCN.v2 and get a “survivor bias” metric, since v2 has a different set of “adjustments” in the data. I don’t have an article ready yet, but v2 is “warmer” than v1.
It looks to me (rampant speculation) like the adjustments used are cooking v2 so that putting it in adds to the warming profile (just as leaving out v1 did). To demonstrate this takes a v2 vs v1 comparison over only the period in common, and then a benchmark of 2007 to date with v2 in vs out. I know I ought to do it, but I’ve gotten distracted by other things… like the blatant corruption of GHCN via deleting cold thermometers in particular and 90% or so in total in years after the 1980s.
@ Nick Barnes:
What you folks are doing “is a beautiful thing” and I just wish I had time to contribute to it. But I see our directions as complementary, so I’d be loath to abandon mine. (And frankly, I’m better at remembering my old FORTRAN class than doing Python. I can read Python OK, but I'm not up to writing it yet.)
Mine is to characterize what is actually being run and see where it “has issues” (of any size). That, IMHO, will surface areas that both GIStemp proper, and you, can look at to enhance your product. Basically, a “shopping list” for when your port is done and benchmarked to match GIStemp “end to end” and it’s time to go bug hunting. (Though with luck, you will stamp out some ‘issues’ as you find them.)
And yes, despite my whining about how bad GIStemp is to work on, I’ve actually seen worse… (shudder!)
BTW, at this point I think that the ‘code issues’ in GIStemp are ‘the small fish’ and that the massive changes of thermometers used in GHCN are ‘the big fish’. If you can, it would be interesting to know if your code will accept “subsets” end to end: then feed it the “surviving 1176 locations” in GHCN (i.e. remove survivor bias) with the (deleted after about 1989) 7k or so bolus in the baseline period removed. If you are uncomfortable publishing such a benchmark, just knowing that subsets of data will run end to end would be helpful to know.
As noted above, GIStemp in FORTRAN is not happy with subset data in some steps so it will take me a while to work around this. ( I found a way, but it’s unpleasant. Changing the data to “missing data flags” works, but it’s a PITA for benchmarking.)
And finally, now that you have it working through STEP4, I may just download a copy and give it a whirl. When I first looked, it was missing a step along the way. Certainly for figuring out what a given FORTRAN step is supposed to be doing it will be “a better ride” for most folks. I can also see where running benchmark ideas through it first would be faster and easier, THEN going back to try and nurse them through GIStemp proper.
(But right now my disk is full… too many copies of historical benchmark variation datasets. I need to either archive some stuff or add a new disk… And I need to do an update to the latest revision of GIStemp… And … Sigh…)
E.M.,
As you know we have all the station data in database format now here, again to help investigate ‘issues’. In theory it should be possible to produce partial input files (and the matching partial station.inv files) for you – if this would help.
Do you think this would be feasible or am I just showing my lack of knowledge of GIStemp? ;-)
@E.M.Smith: you say «for when your port is done and benchmarked to match GIStemp “end to end”». ccc-gistemp (our “port”) is now fully rewritten in Python and matches GISTEMP “end to end” to within 0.01 or 0.02 K when comparing the global anomaly. The differences are tiny, see the article I mentioned earlier.
Thank you, and yes, I think our directions may be complementary, but please do feel free to point anyone else in the GISTEMP-analysis space in our direction.
And yes, among other things, it is already easier to switch stuff around in the Python (e.g. eliminating sub-steps – such as the Hohenpeissenberg data or the St Kilda adjustment – or removing stations or sets of stations, would be a matter of a line or two of clear code). In my dreams, a future CCC-GISTEMP is the reference to which amateurs turn to answer questions like “what happens if we do the peri-urban adjustment like this”, or “what if we have a different box/sub-box grid”, or “how about if we combine ocean and land data more like the JMA”. Certainly, that sort of question is now pretty easy for me, personally, to answer (mostly the answers are “it makes negligible difference to the anomaly signal”, but maybe I am still asking the wrong questions).
Well, one benchmark I’d love to see is what happens to the anomaly map if you put thermometers from Bolivia back in ;-) or the Canadian Yukon and Northwest Territories… but maybe that’s just me ;-)
Another would be to take, oh, South America and remove all thermometer data from prior to 1990 that does not have a matching station in the 1990 to date period (stabilize the instrument) and do an A / B anomaly map. Would tend to settle the issue of what impact the GHCN deletions has…
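That "stabilize the instrument" split is simple to express. Here is a hypothetical sketch (my own helper, assuming GHCN v2.mean-style lines with an 11-character station id, a duplicate digit, then a 4-digit year; adjust the slices for the real layout):

```python
def stabilized_subset(lines):
    # Stations that report in 1990 or later: the surviving instrument set.
    modern = {ln[:11] for ln in lines if int(ln[12:16]) >= 1990}
    # Keep all modern records, plus pre-1990 records only for those stations.
    return [ln for ln in lines
            if int(ln[12:16]) >= 1990 or ln[:11] in modern]

fake = [
    "101850000001" + "1985" + " ...",  # station A, old record
    "101850000001" + "1995" + " ...",  # station A, modern record
    "101860000001" + "1985" + " ...",  # station B, old record only
]
print(len(stabilized_subset(fake)))  # 2: station B's orphan record is dropped
```

Run the anomaly maps once on the full file and once on the subset, and the difference is the deletions' contribution.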
Holy Smokes;
These guys have the maps thing down cold;
http://82.42.138.62/
They’re obviously way ahead on how to put mapped-things on the web.
I’ll keep looking for some way to make a difference, albeit modest.
Cheers!
ACakaRR
Still no ability to register for their TEKthing, whatever it may be.
Love those dots and the zoom thing. Tool tip, the works…
@almostcertainly,
Apologies for the glitch with registering. I’m sure Kevin (who is UK-based) will get on to it in the morning.
Glad you like the maps.
Just came across this post; wish I had seen it last November.
“So this is a list of the stations in the USHCN.v2 inventory format for which I need to create GHCN v2.inv file entries that look like:
42572786007 COLVILLE 5NE 48.58 -117.80 914 885R -9MVxxno-9x-9COOL CONIFER A1 0”
Interesting you should use Colville as an example. I have now made two trips up there to try to document that station history. (Pictures are in surfacestation.org gallery.) So far as I can see, COLVILLE 5NE was never in the USHCN.
MMS identifies COOP 451630 as “COLVILLE”; COLVILLE 5NE is now known as “COLVILLE BASIC”, and has, so far as I can see, always had the COOP number 451654.
But
http://cdiac.ornl.gov/ftp/ushcn_v2_monthly/ushcn-stations.txt
shows this history:
451630 48.5472 -117.9019 505.0 WA COLVILLE 451630 451650 —— +8
So the change sequence was from COLVILLE (in town) to the airport (451650) and back to town.
Unfortunately, by the time I got there, the station had been closed or moved yet again; with the help of the local mail carrier I was able to locate the site and take a picture of the post on which the MMTS had been mounted (with the wires still hanging out). MMS still shows it as current; but they seem to be very slow in updating. (Last year they hadn’t updated the location of Red Lodge, Montana, two years after the station was moved. I had to file an FOI request to get the current location.)
REPLY: [ Wow, that’s quite a story! Yeah, station moves make musical chairs look stable… -E.M.Smith ]
Pingback: American Thinker on CRU, GISS, and Climategate « Watts Up With That?
Pingback: Climategate: CRU Was But the Tip of the Iceberg « Thoughts Of A Conservative Christian
Hello, E M Smith,
I’ve never been here before, so am writing to say how impressed I am with the work you are doing. I particularly enjoyed your comprehensive reply to comments/criticisms on 13th Nov by Clivere, and I hope that he/she now fully appreciates the huge effort you have put in and indeed continue to put in. If not, I suggest that the route to follow is to repeat at least part of your work, or perhaps to think of another technology for critical examination of extensive databases. This would dispel any doubts about just what is involved.
I’m also a climate time series analyst of 16 years standing, as a retirement hobby. I use data from many and assorted sources, including the GISS source. However, I’ve never attempted to operate on the whole set, simply one station at a time. I have probably looked in detail at a few thousand such series. You may deem this to be a trivial (or perhaps useless!) exercise, but in fact it can disclose a very interesting feature of this type of series. It is that step changes of considerable magnitude are a very common, almost ubiquitous, occurrence. I’ve not read enough of this blog yet to be sure that you have not mentioned this type of analytical observation. If you have, my apologies. If not, it is perhaps something that you might find interesting.
You’ve described the multiple averaging processes that are used to derive an annual average from the raw, recorded observations of climate technicians. Averaging, though an essential process, has at least one very unfortunate consequence. It virtually eliminates, and certainly disguises or hides, what might be very valuable information contained in the data. As an ex-industrial scientist I am a strong advocate of avoiding averaging as far as possible, though I have to accept that monthly averages have necessarily suffered several averaging operations before I ever get to see them! Averaging over many sites or regions certainly tends to reduce the sharpness of step changes, since the steps that are evident for individual sites may have time offsets that reduce their impact when agglomerated.
Nevertheless, I am convinced that step changes are a somewhat neglected aspect of climate (and related series) and I’d welcome any input from others who work on the numerical side of climate information.
What a great blog you have!
REPLY: [ Thanks! I think the individual station data series are a critical part of ‘the issue’; especially given that, as you pointed out, the step changes are pandemic, and they are not compatible with CO2 as a smoothly causal agent. But this is a ‘communal barn raising’ we’re all doing. We each pick a part and ‘do what we can’. Other folks were already looking at individual temperature series when I started, but nobody was looking at the insides of GIStemp. I spent about 6 months saying “Somebody ought to analyse GIStemp” before I decided that “I was somebody”… I had the needed skill with FORTRAN (even if a bit musty for lack of recent use) and UNIX / LINUX. So I chose to work on it. There are many other things I’d rather be working on, but this was where I had my ‘highest and best use’ and where ‘something just needed doing and was not being done’. I ended up looking at GHCN data in bulk as a consequence of doing a benchmark series on GIStemp, not ‘by design’. That’s when I saw just how horrid the bulk input data was.
BTW, my “day job” (sorely neglected for about 6 months now…) is that I make some pocket change trading stocks. In stock market indicators it is widely understood that you use averages TO HIDE what you do not want to see. Weekly averages are used to hide daily excursions, for example. When I saw how much averaging was going on in GIStemp (and prior to it at NCDC) my first thought was “My God, they are hiding a lot of SOMETHING in the data with all this averaging.” What they are hiding is that the data have far too little spatial and temporal coverage; and the changes in the bulk temperature come about from a collection of individual station changes (such as you describe) that are not possibly CO2 related. I can’t say if the ‘hiding’ was deliberate or simply ‘believing their own BS’ about anomalies; but I can say that is what an average does… Hope you enjoy the rest of the site. -E.M.Smith ]
@E.M.Smith: re your comment on January 6, 2010 at 1:09 pm. You suggest running ccc-gistemp with a set of stations that survive the “1989 or so deletions” versus a run with the deleted stations.
I believe my blog post from February addresses that. Was that the sort of benchmark you had in mind?
I know this is an old thread, but.
Don’t know if you read Climate Audit, E. M., but if you do you’re familiar with the sock puppet “Thefordprefect”. Well, he’s been castigating SM for not tearing apart the GISTEMP stuff. I told him to come over here and post his questions, but I doubt he will.
If you’re interested and have the time, here are his latest frothings: http://climateaudit.org/2010/12/26/nasa-giss-adjusting-the-adjustments/#comment-250826
I think he needs to be set straight.
Hey Chief;
Someone posted this over at the SteveMc ‘jesting with adjusters’ thread;
http://climateaudit.org/2010/12/26/nasa-giss-adjusting-the-adjustments/#comment-250862
“The code for the ‘automated pairwise bias adjustment software’ (Menne and Williams 2009), used in the U.S. HCN version 2 monthly temperature dataset, is available at ftp://ftp.ncdc.noaa.gov/pub/data/ushcn/v2/monthly/software
”
At least it isn’t written in COBOL…
Party Time Now!
RR
Apparently ‘Troyca’ has taken a shot at running the Pairwise Homogenizer;
http://troyca.wordpress.com/2010/12/10/running-the-ushcnv2-software-pairwise-homogeniety-algorithm/
He posts some comments on what it took to run the thing. I don’t yet see a hint of the forensic mindset, but maybe I’m the one lacking it…
RR