When is a Splice not a Splice? When it is a Deletion
I know, I’m supposed to provide answers, not questions, but sometimes the question is all you have, and sometimes it is the most interesting part.
In another posting, the Langoliers Lunch, I was adding an update on what STEP2 did to delete a couple of hundred thermometer records. In the process, I found that Calcutta / Dum was listed in the STEP2/short.station.list file as a deleted data set, but inspection of the v2.mean input data showed there was another record for the same station, with a different modification flag, that ought to have been merged with the discarded record to make a composite record. This was a bit puzzling, so I went to the GISS web site to look at what they thought their version had done.
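For anyone who wants to repeat that kind of inspection, here is a rough sketch of how it could be done. It assumes v2.mean is fixed width, with a 12-character record ID (the last digit being what I am calling the modification flag) followed by a 4-digit year and then the monthly values. The function name, the station ID, and the sample rows are all made up for illustration; they are not the real Calcutta data.

```python
from collections import defaultdict

def year_spans(lines, station_prefix):
    """Return {record_id: (first_year, last_year)} for records of one station.

    Assumed layout: columns 1-12 hold the record ID, columns 13-16 the year.
    """
    years = defaultdict(list)
    for line in lines:
        record_id, year = line[:12], line[12:16]
        if record_id.startswith(station_prefix) and year.isdigit():
            years[record_id].append(int(year))
    return {rid: (min(ys), max(ys)) for rid, ys in years.items()}

# Made-up rows: two records for one hypothetical station, one for another.
rows = [
    ("999000010000", 1950),
    ("999000010000", 1951),
    ("999000010001", 1990),
    ("999000010001", 1991),
    ("888000020000", 1960),
]
sample = [rid + str(year) + "  250" * 12 for rid, year in rows]

# The first 11 characters identify the station; the 12th varies per record.
spans = year_spans(sample, "99900001000")
for rid, (first, last) in sorted(spans.items()):
    print(rid, first, last)
```

If a station shows up with more than one record ID here, those are the pieces you would expect to see merged into one composite record later on.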
On the GISS site, you can choose to look at individual stations and plot their data at different points in the processing. There is a “drop down menu” with three choices. The first is the combined GHCN / USHCN data set, which they call “raw” but is really half baked ;-) in that it already has some adjustments in it, along with a somewhat flawed merger of USHCN with GHCN from STEP0. The second is “after combining sources at the same location”, which is the STEP1 process (they note that they “have renamed the middle option (old name: prior to homogeneity adjustment)”). The third is “after homogeneity adjustment”, which is the output of STEP2.
Now you would expect that what was “after combining” would stay combined when it went through “homogeneity adjustment”, but it doesn’t.
Just as the log file shows, the first half of the “combined sources” gets dropped.
So “homogeneity adjustment” really does / can mean “record deletion”.
While it is nice to have the confirmation that I was reading the tea leaves right, it is still a rather odd behaviour.
You can see the two graphs here:
First, as “combined”
Then, as “homogenized” and uncombined