OK, I was “offline for a while” and THAT was of course the moment all the fun breaks loose! So looking up the stuff about what was “released” from CRU…
Has an interesting analysis of the precision and lack of regional warming that we’ve seen in various GIStemp and GHCN postings here.
In comments, FrancisT has a link to his in-depth look at just a couple of the comments in “HARRY_READ_ME”. Well worth a read:
Oh, and I’ve added a bit of time stamp forensics in a comment near the bottom too.
We have [begin quote]:
The bit that made me laugh was this bit. Anyone into programming will burst out laughing before the table of numbers
17. Inserted debug statements into anomdtb.f90, discovered that
a sum-of-squared variable is becoming very, very negative! Key
output from the debug statements:
OpEn= 16.00, OpTotSq= 4142182.00, OpTot= 7126.00
DataA val = 93, OpTotSq= 8649.00
DataA val = 172, OpTotSq= 38233.00
DataA val = 950, OpTotSq= 940733.00
DataA val = 797, OpTotSq= 1575942.00
DataA val = 293, OpTotSq= 1661791.00
DataA val = 83, OpTotSq= 1668680.00
DataA val = 860, OpTotSq= 2408280.00
DataA val = 222, OpTotSq= 2457564.00
DataA val = 452, OpTotSq= 2661868.00
DataA val = 561, OpTotSq= 2976589.00
DataA val = 49920, OpTotSq=-1799984256.00
DataA val = 547, OpTotSq=-1799684992.00
DataA val = 672, OpTotSq=-1799233408.00
DataA val = 710, OpTotSq=-1798729344.00
DataA val = 211, OpTotSq=-1798684800.00
DataA val = 403, OpTotSq=-1798522368.00
OpEn= 16.00, OpTotSq=-1798522368.00, OpTot=56946.00
forrtl: error (75): floating point exception
IOT trap (core dumped)
..so the data value is unbfeasibly large, but why does the
sum-of-squares parameter OpTotSq go negative?!!
[end of quote and quoted quote ;-) ]
For those unfamiliar with this problem, computers use a single “bit” to indicate sign. If that is set to a “1” you get one sign (often negative, but machine and language dependent to some extent) and if it is “0” you get another (typically positive).
OK, take a zero, and start adding ones onto it. We will use a very short number (only 4 digits long, each can be a zero or a one. The first digit is the “sign bit”). I’ll translate each binary number into the decimal equivalent next to it.
0000 zero
0001 one
0010 two
0011 three
0100 four
0101 five
0110 six
0111 seven
1000 negative (may be defined as = zero, but oftentimes defined as being as large a negative number as you can have via something called a 'complement'). So in this case NEGATIVE seven
1001 NEGATIVE six
1010 NEGATIVE five (notice the 'bit pattern' is exactly the opposite of the "five" pattern... it is 'the complement').
1011 NEGATIVE four
1100 NEGATIVE three
1101 NEGATIVE two
1110 NEGATIVE one
1111 NEGATIVE zero (useful to let you have zero without needing to have a 'sign change' operation done)
0000 zero
Sometimes the 1111 pattern will be “special” in some way. And there are other ways of doing the math down at the hardware level, but this is a useful example.
You can see how adding a digit repeatedly grows to a large value (the limit) then “overflows” into a negative value. This is a common error in computer math and something I was taught in the first couple of weeks of my very first programming class ever. Yes, in FORTRAN.
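For anyone who wants to play with the counting exercise, here is a minimal sketch in Python. It assumes the same 4 bit "ones' complement" reading used in the table (sign bit first, negative values stored as the flipped bit pattern). The function names are mine and purely illustrative.

```python
def decode_ones_complement(pattern, width=4):
    """Return the decimal label for a bit pattern read as ones' complement."""
    sign_bit = 1 << (width - 1)   # 1000 for a 4 bit number
    mask = sign_bit - 1           # 0111, the non-sign bits
    if pattern & sign_bit:
        # Negative: flip the bits to recover the magnitude.
        return "-" + str(~pattern & mask)
    return str(pattern)


def count_up(width=4):
    """Add one repeatedly, starting at zero, until the pattern wraps around."""
    rows = []
    for step in range(2 ** width + 1):
        pattern = step % (2 ** width)          # the hardware only keeps 'width' bits
        bits = format(pattern, "0{}b".format(width))
        rows.append((bits, decode_ones_complement(pattern, width)))
    return rows


for bits, value in count_up():
    print(bits, value)
```

The step from 0111 to 1000 is the "overflow": one more count past seven and the label flips to negative seven. Most current hardware uses two's complement instead, where 1000 would read as negative eight, but the wrap from large positive to large negative happens the same way.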
We have here a stellar example of it in real life in the above example where a “squared” value (that theoretically can never become negative) goes negative due to poor programming practice.
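The numbers in the debug log can be reproduced. This is a sketch under assumptions, since the anomdtb.f90 source is not quoted here: I assume DataA is a 32 bit signed integer that gets squared in integer arithmetic (so the square wraps around), and that OpTotSq is a single precision REAL accumulator. The function names are mine, not from the CRU code.

```python
import struct

# The DataA values from the quoted debug output, in order.
DATA = [93, 172, 950, 797, 293, 83, 860, 222, 452, 561,
        49920, 547, 672, 710, 211, 403]


def wrap_int32(n):
    """Reduce n to what a 32 bit two's complement integer would hold."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def to_float32(x):
    """Round x to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def running_sum_of_squares(values):
    """Accumulate wrapped integer squares into a single precision total."""
    total = 0.0
    history = []
    for v in values:
        square = wrap_int32(v * v)             # ASSUMPTION: integer square that wraps
        total = to_float32(total + square)     # ASSUMPTION: REAL*4 accumulator
        history.append(total)
    return history


for value, total in zip(DATA, running_sum_of_squares(DATA)):
    print("DataA val = %6d, OpTotSq=%15.2f" % (value, total))
```

Under those assumptions the square of 49920 (2492006400, which is bigger than the 2147483647 a signed 32 bit integer can hold) wraps to -1802960896, and the running total lands on the same -1799984256.00 and the same final -1798522368.00 that the log shows. So the behaviour is at least consistent with a plain integer overflow on one bad data value.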
There are ways around this. If a simple “REAL” (often called a FLOAT) variable is too small, you can make it a “DOUBLE” and some compilers support a “DOUBLE DOUBLE” to get lots more bits. But even they can have overflow (or underflow the other way!) if the “normal” value can be very very large. So ideally, you ought to ‘instrument’ the code with “bounds checks” that catch this sort of thing and holler if you have that problem. There are sometimes compiler flags you can set to have “run time” checking for overflow and abort if it happens (there are also times that overflow is used as a ‘feature’ so you can’t just turn it off all the time. It is often used to get “random” numbers, for example.)
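A “bounds check” of the kind described can be sketched in a few lines. This is illustrative Python, not the CRU code; the function name and the choice of a signed 32 bit limit are my assumptions. Python integers do not overflow on their own, so the limit is checked explicitly.

```python
INT32_MAX = 2 ** 31 - 1   # largest value a signed 32 bit integer can hold


def checked_sum_of_squares(values, limit=INT32_MAX):
    """Sum the squares of values, hollering instead of silently wrapping."""
    total = 0
    for index, value in enumerate(values):
        square = value * value
        if square > limit or total + square > limit:
            raise OverflowError(
                "value %d at position %d would push the sum of squares past %d"
                % (value, index, limit)
            )
        total += square
    return total
```

Fed the first few values from the log it just returns the sum. Fed the 49920 value it stops with an error that names the offending data point, which is a lot more useful than a negative sum of squares and a core dump several steps later.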
But yes, from a programmer’s point of view, to watch someone frantic over this “newbie” issue is quite a “howler”…
And that is why I’ve repeatedly said that every single calculation needs to be vetted for rounding, overflow, underflow, precision range, …
Because otherwise you are just hoping that someone did not do something rather like they clearly have done before…
Also, from :
we have in comments:
Paul W (15:05:29) :
Phil Jones writes that the missing raw CRU data could be reconstructed:
(from file 1255298593.txt)
From: P.Jones@uea.ac.uk
To: “Rick Piltz” <email@example.com>
Subject: Re: Your comments on the latest CEI/Michaels gambit
Date: Sun, 11 Oct 2009 18:03:13 +0100 (BST)
Cc: "Phil Jones" <firstname.lastname@example.org>, "Ben Santer" <email@example.com>
Rick, What you've put together seems fine from a quick read. I'm in Lecce in the heal of Italy till Tuesday. I should be back in the UK by Wednesday. The original raw data are not lost either. I could reconstruct what we had from some DoE reports we published in the mid-1980s. I would start with the GHCN data. I know that the effort would be a complete wate of time though. I may get around to it some time.
So we have a tacit confirmation that they start with GHCN data. That means that ALL the issues with the GHCN data (migration to the equator, migration from the mountains to the beaches…) apply to Hadley / CRU just as they do to GIStemp.
Both are broken in the same way, so that is why they agree. They use biased input data and see the same result.
Heck, I’ve even stumbled onto another programmer type doing stock trading stuff…
The discussion is very interesting, even if a bit ‘rough language’ at times:
In comments there we get a picture of “Mr. Harry Readme”:
Somehow, I can feel his pain at the code he must deal with. Best of Luck to you Harry.
This is in reverse time order, but as presented in the link.
From: Michael Mann
To: Phil Jones
Subject: Re: Skeptics
Date: Thu, 25 Jun 2009 11:19:45 -0400
Cc: Gavin Schmidt
well put, it is a parallel universe. irony is as you note, often the contrarian arguments
are such a scientific straw man, that an effort to address them isn’t even worthy of the
So we are “contrarian” are we? And not even worthy of a peer-reviewed address? I would think that someone has been inflating his own ego a bit overmuch…
“Facts just are. -emsmith”. It isn’t about the people, and it isn’t about the peers. It is all about the truth and the facts. And the facts are that there are loose ends to the AGW fantasy that have been pointed out by us “contrarians” that very much need addressing.
On Jun 25, 2009, at 10:58 AM, Phil Jones wrote:
Just spent 5 minutes looking at Watts up. Couldn’t bear it any longer – had to
stop!. Is there really such a parallel universe out there? I could understand all of
the words some commenters wrote – but not in the context they used them.
It is a mixed blessing. I encouraged Tom Peterson to do the analysis with the
limited number of USHCN stations. Still hoping they will write it up for a full journal
Problem might be though – they get a decent reviewer who will say there is nothing
new in the paper, and they’d be right!
Bolded a couple of bits here… Well, nice to know they poked a nose in at WUWT, even if they could not come to grips with it. And nice that he “could understand all of the words”… even if they were too much for him to understand in context. Strange that I’ve never had any problem understanding the postings or comments on WUWT.
I also find it interesting that they are saying it’s fine to do an analysis on a reduced set of USHCN stations. Also the raw cynicism of the encouragement to publish an empty paper in light of the belief it would be ‘nothing new’ is particularly galling given their efforts to suppress what is really new, but against their agenda.
At 15:53 24/06/2009, Michael Mann wrote:
Phil–thanks for the update on this. I think your read on this is absolutely correct. By
the way, “Watts up” has mostly put “ClimateAudit” out of business. a mixed blessing I
talk to you later,
Unclear on the concept of “Synergy” it would seem… WUWT led me to Climate Audit, and CA has pointed to WUWT with some fair frequency. One is more technical than the other, but both are good and both are well attended.
On Jun 24, 2009, at 8:32 AM, Phil Jones wrote:
Good to see you, if briefly, at NCAR on Friday. The day went well, as did the
dinner in the evening.
It must be my week on Climate Audit! Been looking a bit and Mc said he
has no interest in developing an alternative global T series. He’d also said earlier
it would be easy to do. I’m 100% confident he knows how robust the land component
I also came across this on another thread. He obviously likes doing these
sorts of things, as opposed to real science. They are going to have a real go
at procedures when it comes to the AR5. They have lost on the science, now they
are going for the process.
Prof. Phil Jones
Climatic Research Unit Telephone +44 (0) 1603 592090
School of Environmental Sciences Fax +44 (0) 1603 507784
University of East Anglia
Norwich Email firstname.lastname@example.org
So here we have evidence for ‘inbreeding’ between NCAR and CRU. That GIStemp uses “NCAR” format data files about STEP2 – STEP3, then merges with HADLEY CRU SST in STEP4_5, continues to argue for excessive “group think” and shared design / code between the temperature series. So when folks point out that Hadley CRUt and GIStemp agree, maybe it’s because they have extensive overlap in design goals, frequent exchange of “ideas” and common input data, internal work files matching in format (and content? for easy comparison and convergence? at confabs such as in the email?), and processes…
BTW, IMHO it would be easy to make an alternative Global Temperature Series. “Mc” is quite right that it is easy. I could have one in about a day (less if I didn’t want to think about the details too much) and it would be more accurate than GIStemp. How? Simply by “un-cherry picking” some of the GIStemp parameters then running the code.
I find the dig at “real science” vs “procedures” interesting. How can you have reliable science if your procedures are broken? I learned about “lab procedures” and the importance of them very early in chem lab. Anyone who disses the merit of sound procedures is an accident waiting to happen… IMHO. And will produce errors from unsound procedures.
But the overall thing that I pick up from this is just the tone of True Believers. These folks really do think they have it all worked out. And that is a very dangerous thing. It leads to very closed minds and it leads to very strong “selection bias”. Often with no ability to self detect that broken behaviour.
You know, I think a great deal of insight will come from this “leak”…
Oh, and here is a more complete copy of the snippet quoted above:
Comment by Prof. Phil Jones
http://www.cru.uea.ac.uk/cru/people/pjones/ , Director, Climatic
Research Unit (CRU), and Professor, School of Environmental Sciences,
University of East Anglia, Norwich, UK:
No one, it seems, cares to read what we put up
http://www.cru.uea.ac.uk/cru/data/temperature/ on the CRU web
page. These people just make up motives for what we might or might
not have done.
Almost all the data we have in the CRU archive is exactly the same
as in the Global Historical Climatology Network (GHCN) archive used
by the NOAA National Climatic Data Center [see here
http://www.ncdc.noaa.gov/oa/climate/ghcn-monthly/index.php and here http://www.ncdc.noaa.gov/oa/climate/research/ghcn/ghcngrid.html ].
The original raw data are not “lost.” I could reconstruct what we
had from U.S. Department of Energy reports we published in the
mid-1980s. I would start with the GHCN data. I know that the effort
would be a complete waste of time, though. I may get around to it
some time. The documentation of what we’ve done is all in the
If we have “lost” any data it is the following:
1. Station series for sites that in the 1980s we deemed then to be
affected by either urban biases or by numerous site moves, that were
either not correctable or not worth doing as there were other series
in the region.
2. The original data for sites for which we made appropriate
adjustments in the temperature data in the 1980s. We still have our
adjusted data, of course, and these along with all other sites that
didn’t need adjusting.
3. Since the 1980s as colleagues and National Meteorological
Services http://www.wmo.int/pages/members/index_en.html (NMSs)
have produced adjusted series for regions and or countries, then we
replaced the data we had with the better series.
In the papers, I’ve always said that homogeneity adjustments are
best produced by NMSs. A good example of this is the work by Lucie
Vincent in Canada. Here we just replaced what data we had for the
200+ sites she sorted out.
The CRUTEM3 data for land look much like the GHCN and NASA Goddard
Institute for Space Studies data
http://data.giss.nasa.gov/gistemp/ for the same domains.
Apart from a figure in the IPCC Fourth Assessment Report (AR4)
showing this, there is also this paper from Geophysical Research
Letters in 2005 by Russ Vose et al.
Figure 2 is similar to the AR4 plot.
So again we have confirmation that the Hadley input is substantially the same GHCN input data as for GIStemp. And as we’ve seen, there are strong biases built into the GHCN data set and its changes over time.
For any future assertion that Hadley and GIStemp agree, so they must be ‘right’, I think it’s pretty clear they are the same because they accept the same highly biased input data.
I also find it amazing that the response to “you lost the raw data” is “It isn’t lost because we have different data that has been modified and is better”. Clearly unclear on the concept…