WIP on First Differences

Over in the WUWT ‘critique’ of my look at differences between GHCN v1 and v3, one of the more “vociferous” complaints was that I was using a ‘discredited’ or somehow deprecated method in using First Differences.

So this posting is really just a page in my notebook where I’m collecting pointers and information about First Differences. Not very exciting, but useful.

OK, don’t know if I can find the whole First Differences article text anywhere, but here is a link to the abstract:

http://www.agu.org/pubs/crossref/1998/98JD01168.shtml

Their “citations” page lists:

http://www.agu.org/pubs/crossref/1998/98JD01168.shtml

Chen, Fahu, Jinsong Wang, Liya Jin, Qiang Zhang, Jing Li, and Jianhui Chen (2009), Rapid warming in mid-latitude central Asia for the past 100 years, Front Earth Sci China, 3(1), 42.

Christy, John R., William B. Norris, Kelly Redmond, and Kevin P. Gallo (2006), Methodology and Results of Calculating Central California Surface Temperature Trends: Evidence of Human-Induced Climate Change?, J Clim, 19(4), 548.

DeGaetano, Arthur T., and Robert J. Allen (2002), Trends in Twentieth-Century Temperature Extremes across the United States, J Clim, 15(22), 3188.

Free, Melissa, James K. Angell, Imke Durre, John Lanzante, Thomas C. Peterson, and Dian J. Seidel (2004), Using First Differences to Reduce Inhomogeneity in Radiosonde Temperature Datasets, J Clim, 17(21), 4171.

Free, Melissa, Dian J. Seidel, James K. Angell, John Lanzante, Imke Durre, and Thomas C. Peterson (2005), Radiosonde Atmospheric Temperature Products for Assessing Climate (RATPAC): A new data set of large-area anomaly time series, J Geophys Res, 110, D22101.

Hu, Qi (2005), How have soil temperatures been affected by the surface temperature and precipitation in the Eurasian continent?, Geophys Res Lett, 32, L14711.

Jin, Menglin, and Robert E Dickinson (2010), Land surface skin temperature climatology: benefitting from the strengths of satellite observations, Environ Res Lett, 5(4), 044004.

Jones, P. D., M. New, D. E. Parker, S. Martin, and I. G. Rigor (1999), Surface air temperature and its changes over the past 150 years, Rev Geophys, 37(2), 173.

Jones, P. D., and A. Moberg (2003), Hemispheric and Large-Scale Surface Air Temperature Variations: An Extensive Revision and an Update to 2001, J Clim, 16(2), 206.

Jones, P. D. (2004), Climate over past millennia, Rev Geophys, 42, RG2002.

Jones, P. D., D. H. Lister, and Q. Li (2008), Urbanization effects in large-scale temperature records, with an emphasis on China, J Geophys Res, 113, D16122.

Jones, P. D., D. H. Lister, T. J. Osborn, C. Harpham, M. Salmon, and C. P. Morice (2012), Hemispheric and large-scale land-surface air temperature variations: An extensive revision and an update to 2010, Journal of Geophysical Research: Atmospheres, 117, D05127.

Karl, Thomas R., Richard W. Knight, and Bruce Baker (2000), The record breaking global temperatures of 1997 and 1998: Evidence for an increase in the rate of global warming?, Geophys Res Lett, 27(5), 719.

Lawrimore, Jay H., Matthew J. Menne, Byron E. Gleason, Claude N. Williams, David B. Wuertz, Russell S. Vose, and Jared Rennie (2011), An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3, Journal of Geophysical Research: Atmospheres, 116, D19121.

Li, Qingxiang, Wei Li, Peng Si, Gao Xiaorong, Wenjie Dong, Phil Jones, Jiayou Huang, and Lijuan Cao (2010), Assessment of surface air warming in northeast China, with emphasis on the impacts of urbanization, Theor Appl Climatol, 99(3-4), 469.

Li, QingXiang, WenJie Dong, Wei Li, XiaoRong Gao, P. Jones, J. Kennedy, and D. Parker (2010), Assessment of the uncertainties in temperature change in China during the last century, Chinese Sci Bull, 55(19), 1974.

Menne, Matthew J., and Claude N. Williams (2005), Detection of Undocumented Changepoints Using Multiple Test Statistics and Composite Reference Series, J Clim, 18(20), 4271.

Menne, Matthew J., and Claude N. Williams (2009), Homogenization of Temperature Series via Pairwise Comparisons, J Clim, 22(7), 1700.

Montandon, Laure M., Souleymane Fall, Roger A. Pielke, and Dev Niyogi (2011), Distribution of Landscape Types in the Global Historical Climatology Network, Earth Interact, 15(6), 1.

Peterson, Thomas C., Kevin P. Gallo, Jay Lawrimore, Timothy W. Owen, Alex Huang, and David A. McKittrick (1999), Global rural temperature trends, Geophys Res Lett, 26(3), 329.

Selvam, A. M. (2011), Signatures of universal characteristics of fractal fluctuations in global mean monthly temperature anomalies, Jrl Syst Sci & Complex, 24(1), 14.

Shen, S. S. P., H. Yin, and T. M. Smith (2007), An Estimate of the Sampling Error Variance of the Gridded GHCN Monthly Surface Air Temperature Data, J Clim, 20(10), 2321.

Shreve, Cheney (2010), Working towards a community-wide understanding of satellite skin temperature observations, Environ Res Lett, 5(4), 041002.

Smith, Thomas M. (2005), New surface temperature analyses for climate monitoring, Geophys Res Lett, 32, L14712.

Thorne, Peter W., John R. Lanzante, Thomas C. Peterson, Dian J. Seidel, and Keith P. Shine (2011), Tropospheric temperature trends: history of an ongoing controversy, WIREs Clim Chang, 2(1), 66.

Trewin, Blair (2010), Exposure, instrumentation, and observing practice effects on land temperature measurements, WIREs Clim Chang, 1(4), 490.

Vose, Russell S. (2005), An intercomparison of trends in surface air temperature analyses at the global, hemispheric, and grid-box scale, Geophys Res Lett, 32, L18718.

Vuille, Mathias, and Raymond S. Bradley (2000), Mean annual temperature trends and their vertical structure in the tropical Andes, Geophys Res Lett, 27(23), 3885.

Wu, Zhuoting, Hongjun Zhang, Crystal M. Krause, and Neil S. Cobb (2010), Climate change and human activities: a case study in Xinjiang, China, Clim Change, 99(3-4), 457.

Zaiki, Masumi (2002), A statistical estimate of daily mean temperature derived from a limited number of daily observations, Geophys Res Lett, 29, 1892.

Which raises the rather amusing question of “Have all those papers been withdrawn for using First Differences? Hmmmmm?”. Perhaps those authors will wish to ‘have a conversation’ with Steven Moser about his claims and decide how best to withdraw THEIR works. Right after that, I’ll consider it… </sarcoff>

Near as I can tell, FD is still used, the papers not withdrawn, and it is just that the limitations of the method and any quirks it might have ought to be kept in mind when you use it (rather like all methods of doing things…)

One of the complaints was that I used a method that “bridged the gap” on dropouts. I just hold onto the last valid data item, then whenever the next valid data item arrives, use that to compute the “difference”. This gives a slope to that segment that is essentially an interpolation of that space. (One could interpolate each value in between, compute all those small linear segments discretely, and then compute the overall slope and you end up at the same point.) But there was some significant vitriol applied toward the idea that this was just horrible and very unacceptable… and insinuation that I must have some moral defect with regards to claiming First Differences was Peer Reviewed, then doing this ‘other thing’ that isn’t… (While I’ve regularly said First Differences is a peer reviewed method, which it is, and that I use a variation on it which I have not claimed is peer reviewed, but which I think is trivially demonstrated to be more accurate, not less, and certainly not very much different on the bulk of the GHCN data.)
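For the notebook, here is a minimal Python sketch of that “hold the last valid value” idea. It is an illustration only, not my production dT/dt code; the function name, the use of None for a missing year, and the July values are all made up for the example.

```python
from typing import List, Optional

def bridged_first_differences(values: List[Optional[float]]) -> List[Optional[float]]:
    """First differences that hold the last valid value across missing entries.

    `values` holds one reading per year for a single station and a single
    calendar month; None marks a missing year. The whole change across a
    gap is credited to the year in which valid data reappears.
    """
    diffs: List[Optional[float]] = []
    last_valid: Optional[float] = None
    for v in values:
        if v is None:
            diffs.append(None)            # nothing observed, nothing recorded
        elif last_valid is None:
            diffs.append(0.0)             # first valid reading starts the series
            last_valid = v
        else:
            diffs.append(v - last_valid)  # difference against the held value
            last_valid = v
    return diffs

# Hypothetical July means (deg C) with a two-year dropout.
july = [20.0, 20.5, None, None, 21.5, 21.0]
diffs = bridged_first_differences(july)
total = sum(d for d in diffs if d is not None)
print(diffs)   # per-year differences, None where no data arrived
print(total)   # equals last valid reading minus first valid reading
```

The point of the sketch is that the sum of the bridged differences always equals the last valid reading minus the first, whether or not there were dropouts in between.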

Of interest here might be the opinion at Climate Audit:

http://climateaudit.org/2010/08/19/the-first-difference-method/

Changes that are likely to cause a level shift in the series, such as a TOBS or equipment change or a station move, should simply be treated as the closing of the old station and the creation of a new one, thereby eliminating the need for the arcane TOBS adjustment program or a one-size-fits-all MMTS adjustment.

Missing observations may simply be interpolated for the purposes of computing first differences (thereby splitting the 2 or more year observed difference into 2 or more equal interpolated differences). When these differences are averaged into the composite differences and then cumulated, the valuable information the station has for the long-run change in temperature will be preserved.
(When computing a standard error for the index itself, however, it should be remembered in counting stations that the station in question is missing for the period in question).
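The interpolate, average, and cumulate recipe in that quote can be sketched in a few lines of Python. This is only my reading of the quoted description, not code from Climate Audit; the function names, the two stations, and the choice to treat a year with no usable differences as “no change” are assumptions for illustration.

```python
def interpolate_gaps(values):
    """Linearly fill interior runs of None; leading and trailing gaps stay None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        step = (values[b] - values[a]) / (b - a)
        for i in range(a + 1, b):
            filled[i] = values[a] + step * (i - a)
    return filled

def composite_index(stations):
    """Average the per-station first differences each year, then cumulate them."""
    filled = [interpolate_gaps(s) for s in stations]
    n_years = len(filled[0])
    index = [0.0]
    for t in range(1, n_years):
        diffs = [s[t] - s[t - 1] for s in filled
                 if s[t] is not None and s[t - 1] is not None]
        # Assumption: a year with no usable differences carries the index forward.
        mean_diff = sum(diffs) / len(diffs) if diffs else 0.0
        index.append(index[-1] + mean_diff)
    return index

# Two hypothetical stations covering different years, one with a dropout.
station_a = [10.0, 10.5, None, 11.5, 12.0]
station_b = [None, 5.0, 5.5, 6.0, None]
print(composite_index([station_a, station_b]))
```

Note that the two stations never need a common base period; each contributes differences only for the years where it has (real or interpolated) values.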

The Common Anomalies Method (CAM), used by most of the major indices, including MOSHtemp ;-) , requires restricting the data base to stations that are observed during a common base period, or at least during most of it. This requires throwing out a large portion of the data, and/or relying heavily on estimated TOBS and MMTS adjustments to artificially extend stations with these changes. In Table 1, for example, there is no period longer than 1 year for which more than 1 station has complete data, so that no trend could be computed by this method at all. If we were to settle for a base period in which each station has at least 2/3 of its data, a 3-year period such as Yr2-Yr4 could be used, as illustrated below:

So it looks like “bridging the gap” has passed muster at Climate Audit. (Though via interpolation, not just “hold and do anomaly over the gap”.) As I do FD only on a single month for a single record, one would expect that THAT month in THAT place is the same basic entity this year as last year, or 2 years ago, or even 5 years ago. Any change in a very long gap, like 10 years, is more likely the result of a valid action (such as ENSO or PDO changes) than a “random artifact”. (The only exception being, as noted, equipment changes or process changes, where a ‘reset’ on FD ought to take care of it.) The flip side of that point is that if you are looking for those splice artifacts, not taking a reset will highlight them (which is what I’ve said was my goal all along). So using my dT/dt method on unadjusted vs adjusted (that has supposedly done TOBS et al. to do ‘proper’ adjustments) ought to let us see how much that ‘splice’ was ‘repaired’. One of my eventual goals is to compare the two in just this way. (In earlier use of dT/dt I’ve looked at variously unadjusted and adjusted sets. Yet Another Work In Progress - WIP)

Part of the reason I restrict the comparison in doing dT/dt anomaly creation (or dP/dt that differs only in the sign; difference present to past vs past to present) to a SINGLE instrument CountryCode/WMO/modifier string is to prevent interaction between different stations or changed WMO numbers. An instrument is compared ONLY to itself and ONLY inside a single month series. So any change large enough to cause a WMO or substation number change will be in a different FD series / entry.
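A rough sketch of that “one instrument, one month” bookkeeping, in Python. The record layout, the station ID strings, and the function names are invented for the example (real GHCN IDs are formatted differently), so treat it as a picture of the idea rather than the actual code.

```python
from collections import defaultdict

def monthly_series(records):
    """Group (station_id, year, month, temp) rows into per-station, per-month series."""
    series = defaultdict(list)
    for station_id, year, month, temp in records:
        series[(station_id, month)].append((year, temp))
    for key in series:
        series[key].sort()                 # order each series by year
    return dict(series)

def first_differences(points):
    """Differences between consecutive valid readings, tagged with the later year."""
    return [(y1, t1 - t0) for (y0, t0), (y1, t1) in zip(points, points[1:])]

# Hypothetical records: the trailing "-0" / "-1" stands in for the modifier.
records = [
    ("CC-12345-0", 1950, 1, 2.0),
    ("CC-12345-0", 1951, 1, 2.5),
    ("CC-12345-0", 1950, 7, 20.0),
    ("CC-12345-0", 1952, 7, 21.0),   # 1951 is missing for July
    ("CC-12345-1", 1953, 1, 4.0),    # modifier changed: a separate series
    ("CC-12345-1", 1954, 1, 3.5),
]
series = monthly_series(records)
for key in sorted(series):
    print(key, first_differences(series[key]))
```

The January readings under the “-0” and “-1” modifiers never get differenced against each other, and January is never compared with July, which is the whole point of keying on the full ID string plus the month.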

As noted above, the CAM method depends on all the adjustments to remove TOBS and station change issues. So if you are looking at comparing unadjusted GHCN over two versions of released data and want to know how much “bias” might be in the data (so how much must be removed by things such as TOBS and could influence CAM based systems), taking a reset on gaps within a station will hide exactly what you are trying to find: the bias in the change of data. Not looking for what you wish to examine makes it rather hard to see…

So using a method that “spans the gap” and using First Differences on a large body of data will give closer match to the actual trend in the longer term for those stations. Just what I wanted to show (and just what I have asserted ought to happen). Though, yes, a stringent comparison of interpolation vs ‘lump sum’ needs to be done. ( I put all the change in a single ‘lump sum’ in the year where the next valid data is observed – keeping the change synchronized with the actual arrival of data; interpolation ASSUMES it can be linearly spread over the gap. As I don’t know that, I’d rather just preserve what the data actually said: ‘This change showed up here.’) It is, in some ways, part of that difference in mind set between “looking at the shape of the data” (as I’ve called it) and the quest for The One True Global Average Temperature. I want to know what THE DATA looks like and what it has to say, not what I can turn it into that tells me what I think it says about something else. I want to know if the data has large lumps in it, not hide them (even if the method of the ‘hide’ is approved.)
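To make the ‘lump sum’ versus interpolation distinction concrete, here is a small Python sketch. It is not the stringent comparison mentioned above, just a toy; the function names and the series are made up, and gap years are recorded as 0.0 change in the lump-sum version purely so the two lists line up.

```python
def lump_sum(values):
    """Credit the whole change across a gap to the year valid data returns."""
    out, last = [], None
    for v in values:
        if v is None or last is None:
            out.append(0.0)        # gap year or first reading: no change recorded
        else:
            out.append(v - last)
        if v is not None:
            last = v
    return out

def spread_evenly(values):
    """Spread the change across a gap equally over the gap years (linear interpolation)."""
    out = [0.0] * len(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for a, b in zip(known, known[1:]):
        share = (values[b] - values[a]) / (b - a)
        for i in range(a + 1, b + 1):
            out[i] = share
    return out

# Hypothetical series with a three-year dropout.
series = [15.0, None, None, None, 17.0, 17.5]
lump = lump_sum(series)
even = spread_evenly(series)
print(lump)                   # the 2.0 change lands in the year data returns
print(even)                   # the same 2.0 is spread as 0.5 per year
print(sum(lump), sum(even))   # identical totals
```

Both versions end up at the same total change; they differ only in WHEN the change is said to have happened, which is exactly the part the data does not tell us.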

Is First Differences ideal? IMHO nothing is ideal. So in looking for “the best method” one has to ask “best for what?”. Again, at Climate Audit, they find some “issues” with FD and like a different method for the ‘best’ least biased calculation of the actual warming trend in a temperature series. Realize that this doesn’t say much about comparing two data set versions where, frankly, using any method ought to be about the same validity. Any systematic error in a method will tend to give the same offset in both results so the difference between them will tend to neutralize. (Apparently the Warmers love anomalies for offset suppression when used for temperatures but suddenly don’t like them when doing bulk set compares…) It is the comparison of a method applied to ONE data set version compared to reality (that Holy Grail…) that has some comparative “better” or “worse” in the match. For the simple reason that you can not make an anomaly between those two to remove the bias in the method; as that anomaly is unknowable (as the “reality” is what you are trying to find in the first place.)
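The “same offset in both results” argument is just arithmetic, and a toy Python example shows it. The big assumption, stated loudly, is that the method error is a fixed additive bias that is identical for both data set versions; real method errors only “tend” that way. The estimator and the two series are hypothetical.

```python
def biased_trend(series, bias):
    """Hypothetical estimator: end-to-end change plus a fixed method bias."""
    return (series[-1] - series[0]) + bias

def version_difference(old, new, bias):
    """Difference between two versions when both are run through the same method."""
    return biased_trend(new, bias) - biased_trend(old, bias)

# Made-up values standing in for the same station in two data set versions.
v1 = [10.0, 10.5, 11.0]
v3 = [10.0, 10.75, 11.5]
print(version_difference(v1, v3, bias=0.0))    # unbiased method
print(version_difference(v1, v3, bias=0.25))   # biased method, same difference
```

Against “reality” there is no second run of the method to subtract, so the bias stays in the answer. That is why method choice matters more for the absolute trend than for a version-to-version compare.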

Update 8/29 Just for the record, as noted below at http://climateaudit.org/2010/08/19/the-first-difference-method/#comment-240064, Jeff Id has convinced me that while FDM solves one problem, it just creates other problems, and hence is not the way to go.

But “way to go” for what? Well, for that comparison of A data set version to the unknown “reality”…

And is the suggested ‘better’ method (“Plan B”) free of all issues?

As stated, it gives equal weights to all stations. But estimating it by GLS with an appropriate covariance matrix would be straightforward.

One small drawback is that adding new data can change earlier estimates in the combined series because the latest values will add new information on station differences.
However, these differences will generally be relatively small.

True — but in live time this just means that you have to settle on a set of stations (with at least 10 years or so of readings), compute offsets, and then go with that formula for several years. 5 or 10 years later, you come out with Version 2 of your index, with new stations added and slightly modified offsets for the old stations.

And so on.
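The quotes talk about estimating per-station offsets. One way such offsets might be fitted (an assumption on my part, not necessarily the Climate Audit formulation) is a simple alternating least-squares fit of reading = common signal + station offset. The function name, iteration count, and the two stations below are all hypothetical.

```python
def fit_offsets(stations, iterations=200):
    """Alternating least-squares fit of x[s][t] ~ mu[t] + offset[s], skipping None."""
    n_s, n_t = len(stations), len(stations[0])
    offsets = [0.0] * n_s
    mu = [0.0] * n_t
    for _ in range(iterations):
        # Update the common signal given the current offsets.
        for t in range(n_t):
            vals = [stations[s][t] - offsets[s]
                    for s in range(n_s) if stations[s][t] is not None]
            if vals:
                mu[t] = sum(vals) / len(vals)
        # Update each station offset given the current common signal.
        for s in range(n_s):
            vals = [stations[s][t] - mu[t]
                    for t in range(n_t) if stations[s][t] is not None]
            if vals:
                offsets[s] = sum(vals) / len(vals)
    return mu, offsets

# Two hypothetical stations that overlap in only two of four years.
station_a = [10.0, 11.0, 12.0, None]
station_b = [None, 1.0, 2.0, 3.0]
mu, offsets = fit_offsets([station_a, station_b])
print(mu)        # combined series; only its year-to-year changes are meaningful
print(offsets)   # station offsets; only their differences are meaningful
```

Because every reading feeds into every offset, appending a new year of data and refitting can nudge the earlier values of the combined series, which is the “small drawback” the quote admits to.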

Now that doesn’t sound so good if your goal is the absolute minimal changes in the data used and wish to use EVERY data item in the set that it is possible to use. “Measuring the data” is NOT the same as “making up a number that I think matches reality the best” especially when it comes to things like “must have 10 years of data” or it doesn’t get used as part of the comparison base set.

That, BTW, is one of my complaints about the various CAM based methods. Overweight is given to some stations that are held to have some special merit due to a particular length of coverage in a particular span of time. That presents opportunities for those data to have “special effect” and for changes in a given instrument during those years to have greater impact on the comparison.
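A bare-bones Python sketch of the base-period idea shows where the data gets thrown out. The two-thirds coverage rule comes from the Climate Audit quote above; everything else (function name, index-based base period, the two stations) is made up for illustration and is not any agency's actual code.

```python
def cam_anomalies(series, base_start, base_end):
    """Anomalies against the station's own base-period mean.

    The base period is the index range [base_start, base_end). Returns None
    (station dropped) if fewer than two thirds of the base-period values exist.
    """
    base = [v for v in series[base_start:base_end] if v is not None]
    span = base_end - base_start
    if 3 * len(base) < 2 * span:       # under 2/3 coverage: station is unusable
        return None
    mean = sum(base) / len(base)
    return [None if v is None else v - mean for v in series]

# Hypothetical stations; the base period is years 2 through 4 (indices 1..3).
station_a = [10.0, 11.0, 12.0, 13.0, None]
station_b = [5.0, None, None, 6.0, 7.0]
print(cam_anomalies(station_a, 1, 4))
print(cam_anomalies(station_b, 1, 4))   # dropped despite three valid readings
```

Every anomaly for a kept station hangs off those few base-period values, so an instrument change inside the base period shifts the whole station record, and a station with good data outside the base period contributes nothing.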

My goal in making a comparison between v1 and v3 is not to show how a particular 10 year period varies nor to give any decadal scale span overweight. It is to compare the two data sets, minimally biased, shifted, or weighted, directly to each other. Giving overweight to given decades just violates that goal.

In Conclusion

That’s what I’ve got for this now; for today. I’ll add some to it from time to time as things of interest show up.

Right now I’ve got to find P.G. who is waiting for me at a local hotel lobby. We have a very important project* that just Can Not Wait. So further R&D on this topic is on hold for the rest of the day… and likely 1/2 of tomorrow…

*(The “project” is a comparison of local Brew Pubs. The Faultline. Gordon Biersch, maybe even The Tied House… Very important research… ;-)


About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in AGW Science and Background, NCDC - GHCN Issues.

10 Responses to WIP on First Differences

  1. E.M.Smith says:

    Just to note that after Tied House and Gordon Biersch, we ended up at Tao Tao for Chinese dinner. First time I ever had pressed duck…. rather nice :-)

    We also visited the grounds of the Rosicrucians and looked at some Egyptian motif buildings and such. Oh, and earlier we stopped by Alviso and got pictures of how “sea level rise” has turned it from a boat dock into a marsh… on the way to becoming dry land…

    All in all, a very pleasant day… P.G. is “good folks” and easy to share a day with.

  2. dearieme says:

    One of the most difficult military manoeuvres is an orderly retreat: there’s no sign that those beggars are capable of it. No wonder James Lovelock has simply left the building.

  3. Pascvaks says:

    It sounds more like the old problem of what do you do when you don’t like the message? You can kill the messenger of course, but that never really solved anything. You can make fun of the messenger’s clothes, or his hat, or his boots; but that doesn’t solve the problem either. If you’re really desperate you can walk up to the messenger’s horse and check each hoof, look at its teeth, and scream that the poor beast has ticks. But, alas, that too doesn’t solve the main problem, what do you do with the damn message?

    You can sometimes make a judgemental leap about the reaction you get from a message. The person who reacts the most violently, is the person most fearful and in the most jeopardy. (Like I said, “sometimes”;-)

  4. oldtimer says:

    What I like about your method is the way it picks up what you describe as “splice artifacts” where there are station changes. This with your identification of changing station counts was, for me, a defining moment. If you make such a significant change in the number of stations as occurred c1990 (down from c6000 to c1200 with only c200 common to the before and after periods) how do you know that you are comparing apples with apples?

    In business, you would arrange a parallel run to validate the change or to measure the difference that the change made. You yourself have posted that you do not change instruments in the course of an experiment and expect to get a valid result. Yet the “climate scientists” did precisely that. As far as I am aware it is now impossible to calculate the effect of the change. Instead we are provided with mush that is the product of homogenisation.

  5. E.M.Smith says:

    @Oldtimer:

    That’s pretty much it. Basically, the folks like Hansen, Karl, Peterson, et al. “prove” some interesting little bit of technical data manipulation, and then apply it very broadly. They apply it repeatedly (even though only “proven” in one sole application) and they apply it everywhere and in all times, even though the window of “proof” is limited in time and space. Finally, they assert that taking a herd of these “data fixes” in aggregate will somehow work ensemble to give the One True Answer (an unproven assumption).

    So the First Differences Method was used for some things (see list above, which is likely not exhaustive), and eventually CAM with the use of baselines and selected thermometers. Then they underpin The Reference Station Method (‘tested’ only in a limited time and space with limited instrumental data). So various “QA processes” are done, then homogenizings are done… then TOBS and MMTS are added into the mix, then RSM is used to smear data to where there are none, then a strange UHI adjustment (that doesn’t cure UHI and sometimes warms stations) is done, then…. All the time with no validation and regression test suite.

    It is rather like having a drug shown effective for one disease, so you take 3 or 4 doses of it for different diseases at the same time and expect the result to be good.

    @Pascvaks:

    Sometimes it does feel that way…

    @Dearieme:

    I saw that Lovelock news… May wonders never cease.

    @All:

    The general thing I’ve noticed, and it IS speculative, is that the Warmers look to come at things in an incremental / linear / analytical way. Looking at each part in isolation and just asking “Does this bit work to make a Global Average Temperature?” or “Does this bit at least in part improve that error or problem?” and as long as each bit “has a good story” or has a “plausible explanation” say “OK, that’s done” and move on to the next part. Not standing back and looking at the “sanity check” overall fit and finish.

    So it’s a trivial thing to observe that a grass field is a lot cooler than a 10,000 ft long hundreds of feet wide concrete jet runway surrounded by kilometers of tarmac taxiways and parking aprons. It is just as obvious that the places that are now those airports were not airports in 1900. That the ones which were airports in 1950 have had dramatic growth in size and buildings and traffic. That they are, in short, hotter.

    Yet the Warmer Mindset grinds step by step through a linear set of The Story Steps and what looks like a variety of semi-broken UHI “examinations” and at the end pronounces that UHI and Airport Heat Island are not important. Nearly nothing. Somewhere along the way I posted a link to an article that did find Airport Heat Islands real, so it is in the literature; and it is easy to spot for yourself https://chiefio.wordpress.com/2012/06/15/an-example-airport-issue/ especially on days with modestly still air.

    But somehow that ‘disconnect’ between what is obvious and what they claim to prove by linearly crawling down the Story Line examining each little word in isolation, just doesn’t seem to sink in. It’s like one of those “definition chains” where you can “prove” that white is black in 5 or 6 steps… Usually done as a joke; but in ‘climate science’ not so much…

    Then if you point out the “issues”, it’s like you are saying their baby is ugly…

    If you say “But look, airports are just demonstrably hot places and they did not exist in 1900 but are the majority of the thermometers now.” you just get drug back through The Story again, module by module…. “Don’t you believe?”…. And, well, no I don’t. Nice Story, but doesn’t line up. Doesn’t “fit” with all the rest of life experience and observation.

    Supposedly it is hotter now than 50 years ago. Yet when I was a kid, my home town had many summer days of “110 F in the shade and there ain’t no shade.” and I personally remember one occasion of 117 F. It still tops out at about 110 F, but there are fewer of them now. The usual hot summer days are a degree F or two lower, on average. Heck, even the plant seasons have moved just a touch toward cooler.

    Where I am right now, I am marginal on heat for growing tomatoes. (Most kinds will not set seeds below 50 F at night). Watching for when that happens is a big deal with dramatic observables. Right now, we’re still having sporadic below 50 F nights. (At the warm peak in 1998 I got my best tomato production and it has been downhill ever since.) So for about 40 years here, you could put a tomato in the ground about April and have tomato set about June with the occasional harvest end June start July. Other than the nice batch in 1998, the last decade has mostly been ‘marginal’ on the tomatoes. This year I’m not even bothering to try.

    Yes, it is a ‘microclimate’, so subject to outside and individual variation. BUT… IF it were getting warmer globally, one would expect to see it in the garden. And it just isn’t there. Not in my garden. Not in other gardens elsewhere. All we find are the regular cyclical wanderings.

    Yet that disconnect gets dismissed with a wave of The Story, step by step… Or with a laundry list of complaints about method used in any OTHER demonstration.

    Then there’s the Moving Walnut Shell defense… “You are using the data from last version” or “You are using the method from 5 years ago”. Strange how all the parts in the past can be claimed wrong, yet the past FINDINGS are never claimed wrong….. But if you only look under the NEW Walnut Shell, there you will find the Pea Of Truth! Put your money on the table!!

  6. Adrian Camp says:

    Chiefio, is there any chance, any easy way, of dealing with the data by selecting out shapes? Say you got all the gradual increase sites, and all the step increases (or decreases), and all the bathtub curves and all the hogbacks and whatever else you might consider indicative and then try to see what the ones with similar curves had in common, and further whether there was any apparent selection bias in what sites go into the global average nonsense number?

    Yes, in a perfect world, I’d be able to do this myself, but my last program was in IBM3741 language.

  7. E.M.Smith says:

    @Adrian Camp:

    Interesting idea… For “simple shapes” yes. For “complex shapes”, probably but with a lot of work.

    So take the dP/dt report for a station.

    Rising stations will have negative values in the past, rising to zero or positive lately. Take a ratio of first half to last half and you get ‘general slope’. Falling increasingly positive in the past.

    A “hump” would have larger positive value in the middle.

    Looking for a large “step” in the dP value would find step function changes. (So have a ‘running total of change’ and compare it to the current value, a jump up that holds is a step function.)
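    As a first stab at the “simple shapes”, something like the Python below might do. It is a sketch under assumptions: I use the difference of half-means rather than a ratio (a ratio blows up when one half averages zero), and the threshold and window values are arbitrary knobs, not tested settings.

```python
def general_slope(anoms):
    """Mean of the later half minus mean of the earlier half of a station report."""
    mid = len(anoms) // 2
    first, last = anoms[:mid], anoms[mid:]
    return sum(last) / len(last) - sum(first) / len(first)

def find_steps(anoms, threshold, window=2):
    """Indices where the mean of the next `window` values differs from the
    mean of the previous `window` values by more than `threshold`."""
    steps = []
    for i in range(window, len(anoms) - window + 1):
        before = sum(anoms[i - window:i]) / window
        after = sum(anoms[i:i + window]) / window
        if abs(after - before) > threshold:
            steps.append(i)
    return steps

# Hypothetical station reports: a steady rise, a step that holds, a one-off spike.
rising = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25]
stepped = [-1.0, -1.0, -1.0, 0.0, 0.0, 0.0]
spike = [0.0, 0.0, 1.0, 0.0, 0.0]
print(general_slope(rising))       # positive: a rising station
print(find_steps(stepped, 0.75))   # the step is found where the level shifts
print(find_steps(spike, 0.75))     # a spike that does not hold is ignored
print(find_steps(rising, 0.75))    # a gradual rise is not flagged as a step
```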

    Yeah, I think something interesting could come from that…

    Instead of just looking at how many stations are ‘cooling’ look at what is the character of falling stations…

    Verity has some of this with the graphs of warming and cooling by station (though using a non-dP/dt method to find trend).

    One map I remember from a while ago had the warming and cooling stations with a red band of strongly warming stations just over the border in South Canada. Apparently North USA is not warming and the CO2 all runs to Canada ;-)

    Very interesting pictures here:

    http://diggingintheclay.wordpress.com/2010/10/06/google-earth-kml-files-spot-the-global-warming/

    Has a “hot Canada” but with lots of the red dots turning white if you require 50 years of data…

    There’s a lot more of that kind of analysis already done at Verity Jones’ site. Strongly worth some time with the search box there.

  8. adolfogiurfa says:

    We must not worry too much about GWrs. statistics and “tricks”, as the new paradigm will trash them all:
    >Sun makes the GMF change
    http://www.vukcevic.talktalk.net/LOD-GMF.htm
    >Current from the Sun makes the Earth spin and modifies LOD

    >Temperatures change as LOD changes, because ACI changes

    (This last image is from Professor Leonid B. Klyashtorin.)

  9. p.g.sharrow says:

    @EMSmith; The investigation of brewpubs was useful, informative and our ladies were pleased with us. The supper might have helped with that part. ;-) pg

  10. bkindseth says:

    A recent presentation, “Investigation of methods for hydroclimatic data homogenization” presented at the European Geosciences Union was discussed at WUWT. There was a lot of criticism that it was not (yet) a paper and that it had not been peer reviewed. To me, the most significant comments are on page 7 of the presentation, where they state, “Homogenization results are usually not supported by metadata or experiments….” and, “No single case of an old and a new observation station running for some time together for testing of results is available!”
    It was troubling that the authors of this presentation, Steirou and Koutsoyiannis, then did some statistical analysis of synthetic data to show that the methods used in data homogenization were unreliable and resulted in increases in the warming trend in 2/3 of the cases. It would have added a lot to the presentation if they had used real data, selecting stations with large differences between raw and adjusted data, to demonstrate the unreliability of the current process.
    After thinking this issue through, I thought that First Differences might be the answer so I came to this web posting, and it was just what I was looking for. My ego was inflated when I read, “Changes that are likely to cause a level shift in the series, such as a TOBS or equipment change or a station move, should simply be treated as the closing of the old station and the creation of a new one, thereby eliminating the need for the arcane TOBS adjustment program or a one-size-fits-all MMTS adjustment” as this was exactly my conclusion.
    Jeff Id’s illustration at http://noconsensus.wordpress.com/2010/08/21/comparison-and-critique-of-anomaly-combination-methods/ like S&K is an analysis using synthetic data. What happens with real data? What happens if both series are the same length? Couldn’t a new FD series be started whenever there is an inhomogeneity and then the series be spliced together to get the fixed length, say 100 years?
    Using synthetic data to prove a point troubles me, based on my experience as an engineer. In manufacturing, using a statistical tolerance procedure is OK with today’s machines that recalibrate after every part. But when using a manual process, tool wear only works in one direction, to make outside diameters larger and inside diameters smaller. If one were to assume a normal distribution about a mean, and tolerance the parts so that parts within 3 sigma of the tolerance range would fit together, then theoretically only 3 parts in 1000 would not fit. But that is not what actually happens, as the male parts would be biased towards the large size and the female parts would be biased toward the small size.
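    The “3 parts in 1000” figure, and how quickly it degrades when the process is biased to one side, can be checked with a few lines of Python. This is a simplified sketch: it assumes a normal distribution and models tool wear as a plain shift of the mean, and the one-sigma shift is an arbitrary illustrative value.

```python
import math

def fraction_outside(k_sigma, shift=0.0):
    """Fraction of a normal population outside +/- k_sigma limits when the
    mean has drifted by `shift` standard deviations toward the upper limit."""
    upper = 0.5 * math.erfc((k_sigma - shift) / math.sqrt(2))
    lower = 0.5 * math.erfc((k_sigma + shift) / math.sqrt(2))
    return upper + lower

print(fraction_outside(3.0))        # centered process: about 0.0027
print(fraction_outside(3.0, 1.0))   # mean drifted one sigma: several times worse
```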
    How does this apply to temperature measurement? Most of the problems with temperature stations would tend to make the reading higher. If you had a site that was not well maintained, the paint on the wooden enclosure faded, peeled and got dirty, then the temperature would gradually increase. If it was then repainted, there would be a step change in readings and the homogenization would adjust the previous series downward. It would seem that each and every inhomogeneity would have to be verified for the temperature series to be valid, and that it cannot simply be done statistically. In the meantime, keep up the good work with FD.
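    The repaint scenario can be played out with made-up numbers in Python. To be clear about the assumptions: the step removal here is deliberately naive (shift everything before the break by the size of the jump) and is not any agency's actual homogenization algorithm, and the drift rate and break position are invented.

```python
def remove_step(series, break_index):
    """Naive step removal: shift everything before break_index by the jump at the break."""
    jump = series[break_index] - series[break_index - 1]
    return [v + jump for v in series[:break_index]] + list(series[break_index:])

# Hypothetical readings for a CONSTANT true temperature: the shelter drifts warm
# by 0.125 per year as the paint ages, then a repaint at index 5 resets the drift.
raw = [0.0, 0.125, 0.25, 0.375, 0.5, 0.0, 0.125, 0.25, 0.375, 0.5]
adjusted = remove_step(raw, 5)
print(adjusted)
print(max(raw) - min(raw))             # raw sawtooth stays within 0.5
print(adjusted[-1] - adjusted[0])      # adjusted series climbs by 1.0
```

    In this toy case the repaint was the correction and the slow drift was the error, yet removing the step keeps the drift and lowers the past, turning a flat true record into a rising one.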
