In the GHCN v1 vs v3 comparison I used a minor variation on First Differences that I think is slightly superior to the Classic Peer Reviewed version.
I do not take a gratuitous ‘reset’ on missing data items.
This caused some folks to get their panties in a bunch, so I’ve redone the process using Classic First Differences for the “All Data” case and made a graph of the difference between the two (Classic First Differences and my dT or dP variation on it). That graph is at the top of this page. I use the period covered by the common climate codes, such as GIStemp (starting in 1880) and Hadley (starting in 1850), running to 1990 when v1 ends (I align v3 with v1 on that date).
Frankly, I expected more difference. There is almost no difference for the latest years. Around 1960 a gap of about 0.05 C opens up, and it closes again around 1919. Another separation of about 0.03 to 0.05 C then opens and closes again. At the far right side, at the “start of time” for programs like HadCRUT and GIStemp, the gap widens a little more to about 0.12 C, which is to be expected. As the data series is a running total of differences, any discrepancies between the two methods also ought to accumulate. Also, as the purpose of the dT variation was to preserve trend through dropouts, the more dropouts happen in the data, the more the two series will diverge. The farther back in time you go, the more dropouts there are in the data.
While I would assert that my method is more accurate, as it preserves all valid data and does not take a ‘gratuitous reset’ just because one monthly value is missing, the effect is clearly limited.
The Difference In Method
In Classic First Differences, a missing data item will cause a reset of the running total of ‘difference’.
In the dT and dP process, I ‘bridge the gap’ via hanging on to the last valid temperature and just waiting for the next new valid data item.
The thesis being that January in New York is still January in New York even if it has a 2 year gap between Januaries rather than one, so there really isn’t a lot of difference between comparing January 2012 to 2011, or to 2010 (or even to 2009). If you find a difference, it is still a valid difference.
So, for the hypothetical series of temperatures:
10.4, missing, 10.1, 9.8
Classic First Differences would find 0 for the first anomaly (as the first year matches itself), then reset on ‘missing’ and record a 0, then record a 0 on 10.1 (as it is again a ‘first value’), and finally compare 10.1 to 9.8 and record a -0.3 change. By inspection, we can see that the temperature dropped from 10.4 to 9.8 for a difference of -0.6, or twice what Classic First Differences records.
The dT method would also record a 0 for the first value and a 0 for the missing value, but on reaching 10.1 it would compare it to 10.4 and find a difference of -0.3. It would then compare 10.1 to 9.8 and find another -0.3, for a total of -0.6.
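For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two bookkeeping rules as described above. It is not the code used for the GHCN runs. The function name first_differences, the reset_on_missing flag, and the use of None to mark a missing value are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def first_differences(temps: Sequence[Optional[float]],
                      reset_on_missing: bool) -> List[float]:
    """Return the per-step differences for one month of one thermometer.

    Hypothetical helper for illustration only.
    reset_on_missing=True  -> Classic First Differences: a missing value
                              forgets the reference, so the next valid
                              value is treated as a new 'first value'.
    reset_on_missing=False -> the dT variation: keep the last valid
                              value and compare the next valid one to it.
    """
    diffs: List[float] = []
    reference: Optional[float] = None  # last value we can compare against
    for value in temps:
        if value is None:
            diffs.append(0.0)          # a missing item always records 0
            if reset_on_missing:
                reference = None       # the 'gratuitous reset'
            continue
        # A value with no reference is a 'first value' and records 0.
        diffs.append(0.0 if reference is None else value - reference)
        reference = value
    return diffs


series = [10.4, None, 10.1, 9.8]       # the hypothetical series above
classic = first_differences(series, reset_on_missing=True)
bridged = first_differences(series, reset_on_missing=False)

print([round(d, 1) for d in classic])  # [0.0, 0.0, 0.0, -0.3]
print([round(d, 1) for d in bridged])  # [0.0, 0.0, -0.3, -0.3]
print(round(sum(classic), 1), round(sum(bridged), 1))  # -0.3 -0.6
```

The only difference between the two rules is the one line that clears the reference on a missing value. The running totals come out at -0.3 for the classic rule and -0.6 for the bridged rule, matching the hand calculation.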
While I would continue to assert that, for series like this where I only compare one month of one thermometer to itself, this is a more accurate method, the total impact on the result is rather low when used on the entire body of the GHCN data.
I still hold out hope that it can be demonstrated superior, especially in sparser sets of data, but that will have to wait for another day. It’s now after 3 AM and time to wrap up for the day.