GHCN v4 First Peek

I’m starting the process of examining the GHCN Global Historical Climate Network Version 4 data set. This is just a “first peek” at it to see how much looks different and get an idea what kind of work is ahead. First blush it looks like a lot has changed and it is designed to make comparisons between the “versions” difficult. I hope to fix that…

Where To Get It

I downloaded this set “long hand” from:

https://www1.ncdc.noaa.gov/pub/data/ghcn/v4/

Then unpacked the “un-adjusted” version (that DOES have adjustments in it – they note that the data may come pre-adjusted from the various country BOMs) along with the descriptions and text files. There is also a directory that looks to be individual station graphs for every station. I’m pretty sure they are all “adjusted” graphs so will consistently show warming (as their “adjustments” plot exactly to the CO2 predictions… strange that, all the warming is in the adjustments and they just happen to exactly match the CO2 theory predictions…)

Eventually, after I’ve sized things up, I want to create a “GHCN v4 Analysis Station From Scratch” posting and whenever (if ever?) that happens I’ll put in an rsync command / script that does the download.

The Downloaded Stuff

Here’s what you get:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ ls -l
total 124484
-rw-r–r– 1 chiefio chiefio 3955 Mar 18 17:53 ghcnm-countries.txt
-rw-r–r– 1 chiefio chiefio 6221 Mar 18 17:53 ghcnm-flags.txt
-rw-r–r– 1 chiefio chiefio 40872107 Mar 18 17:55 ghcnm.tavg.latest.qcf.tar.gz
-rw-r–r– 1 chiefio chiefio 42708888 Mar 18 17:50 ghcnm.tavg.latest.qcu.tar.gz
-rw-r–r– 1 chiefio chiefio 43854929 Mar 18 17:55 ghcnm.tavg.latest.qfe.tar.gz
-rw-r–r– 1 chiefio chiefio 11472 Mar 18 17:51 readme.txt
-rw-r–r– 1 chiefio chiefio 4416 Mar 18 17:51 status.txt
drwxr-xr-x 3 chiefio chiefio 68 Mar 18 o17:57 unpacking

I created the “unpacking” directory where I could lookover the .qcu file (that one is the “unadjusted”).

Here’s that record:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ mkdir unpacking
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ cp ghcnm.tavg.latest.qcu.tar.gz unpacking/chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ cd unpacking/
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking$ ls
ghcnm.tavg.latest.qcu.tar.gz
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking$ gunzip ghcnm.tavg.latest.qcu.tar.gz
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking$ ls
ghcnm.tavg.latest.qcu.tar

You can see that I only work on a copy of the gziped file, and I uncompress it with the gunzip command in Linux. That gives a “tar” or tape archive format file. That gets “untarred”. First I did it with a “tv” that says “table of contents verbose” to see what I was going to get, then used “xvf” that is “eXtract, Verbosly, from the File” where the file is that .tar file named in the command:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking$ tar tvf ghcnm.tavg.latest.qcu.tar
-rw-rw-r– jared.rennie/jared.rennie 162513448 2019-03-18 08:36 ./ghcnm.v4.0.0.20190317/ghcnm.tavg.v4.0.0.20190317.qcu.dat
-rw-rw-r– jared.rennie/jared.rennie 1888285 2019-03-18 08:36 ./ghcnm.v4.0.0.20190317/ghcnm.tavg.v4.0.0.20190317.qcu.inv
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking$ tar xvf ghcnm.tavg.latest.qcu.tar
./ghcnm.v4.0.0.20190317/ghcnm.tavg.v4.0.0.20190317.qcu.dat
./ghcnm.v4.0.0.20190317/ghcnm.tavg.v4.0.0.20190317.qcu.inv

That unpacks into a directory as two files, so I’ll go into that directory (“cd” or Change Directory)

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking$ cd ghcnm.v4.0.0.20190317/
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ ls
ghcnm.tavg.v4.0.0.20190317.qcu.dat ghcnm.tavg.v4.0.0.20190317.qcu.inv
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ ls -l
total 160556
-rw-r–r– 1 chiefio chiefio 162513448 Mar 18 08:36 ghcnm.tavg.v4.0.0.20190317.qcu.dat
-rw-r–r– 1 chiefio chiefio 1888285 Mar 18 08:36 ghcnm.tavg.v4.0.0.20190317.qcu.inv

You may notice that “owner” and “group” changed to me, since I unpacked it as me, not as the superuser “root”.

The .inv file is the “inventory” and tells you about the stations. That .dat file is the temperatures.

At this point, it is an unpacked version of GHCN v4 “unadjusted”. About 162 MB of data. So what’s it look like?

The Files

Here’s a bit of the inventory file:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ head *.inv
ACW00011604  57.7667   11.8667   18.0 SAVE                          
AE000041196  25.3330   55.5170   34.0 SHARJAH_INTER_AIRP            
AEM00041184  25.6170   55.9330   31.0 RAS_AL_KHAIMAH_INTE           
AEM00041194  25.2550   55.3640   10.4 DUBAI_INTL                    
AEM00041216  24.4300   54.4700    3.0 ABU_DHABI_BATEEN_AIR          
AEM00041217  24.4330   54.6510   26.8 ABU_DHABI_INTL                
AEM00041218  24.2620   55.6090  264.9 AL_AIN_INTL                   
AF000040930  35.3170   69.0170 3366.0 NORTH_SALANG                  
AFM00040911  36.7000   67.2000  378.0 MAZAR_I_SHARIF                
AFM00040938  34.2100   62.2280  977.2 HERAT   

I notice first that it starts with 3 letters instead of the Region / Country digits. Then 8 of what looks like the old WMO# but with the 000 in front leads me to think they swapped the “modification” or “near” number to the front and put the 5 digit WMO at the end. (Hopefully they didn’t make the whole thing backwards or ‘all new’ like they did in the v1 to v2 transition).

I’d guess the next two are LAT and LONG then an altitude and station name. Gone are all those “annoying” bits of data like nearness to water, airport flag, urban /rural flag, vegetation type and other things that would show they have largely gone to thermometers at airports in urban areas near or over pavement…

So right out the gate there will be “some assembly required” to match these up with the prior data. I’m pretty sure I can work that out. One table with country flag number and country flag text to match them up (about 160 lines – probably manual to create it) and perhaps another with WMO# to Station Name by version to match all of them up. So 7280 lines in the v2 / v3 case, more for this one as they have a bunch of USHCN and Germany added (rest of world not so much…)

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ wc -l *.inv
27361 ghcnm.tavg.v4.0.0.20190317.qcu.inv

So 27,361 instrument records now? OK… It will be interesting looking at a global map of them. From other sources I’ve heard they were mostly added in the USA and Germany. I may need to move off of the Raspberry Pi with that data size. It is already 10 minutes for some long reports. Make that 4 x as long we’re talking 40 minutes? So I’ll plan on a validation of times on the Pi M3 but expect I’ll need one of the USB 3.0 boards with faster processors to make this really fast.

I figure that’s about 2 weeks of full time work to get ‘er done, so probably a few months elapsed time (unless someone can scare up a “Grant” to pay me a salary… otherwise I have to put a roof on the house this spring… yeah, me, on the roof, with a hammer…)

So, moving on to the “data” file:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ head *.dat
ACW000116041961TAVG -142  k  183  k  419  k  720  k 1075  k 1546  k 1517  k 1428  k 1360  k 1121  k  457  k  -92  k
ACW000116041962TAVG   60  k   32  k -207  k  582  k  855  k 1328  k 1457  k 1340  k 1110  k  941  k  270  k -179  k
ACW000116041963TAVG -766  k -606  k -152  k  488  k 1171  k 1574  k 1567  k 1543  k 1279  k  887  k  513  k -161  k
ACW000116041964TAVG    9  k -138  k    2  k  685  k 1166  k 1389  k 1453  k 1504  k 1168  k  735  k  493  k   59  k
ACW000116041965TAVG   -9  k -158  k  -15  k  537  k  934  k 1447  k 1434  k 1424  k 1324  k  921  k  -22  k -231  k
ACW000116041966TAVG -490  k -614  k  108  k  246  k 1082  k 1642  k 1620  k 1471  k 1195  k  803  k  329  k    2  k
ACW000116041967TAVG -270  k   36  k  397  k  481  k 1052  k 1373  k 1655  k 1598  k 1318  k  997  k  559  k  -96  k
ACW000116041968TAVG -306  k -183  k  220  k  714  k  935  k 1635  k 1572  k 1718  k 1331  k  781  k  180  k  -56  k
ACW000116041969TAVG -134  k -494  k -185  k  497  k  962  k 1634  k 1687  k 1773  k 1379  k  932  k  321  k -275  k
ACW000116041970TAVG -483  k -704  k  -75  k  261  k 1093  k 1724  k 1470  k 1609  k 1163  k  836  k  300  k   73  k

So 12 sets of “k” (or some flag for who knows what) and what looks like a temperature as 1/100 C. Cheeky little bastards to think they have clue to 1/100 C. That takes exceptional calibration and a lab; not some bit of gear stuck out by the runway at Dallas Ft. Worth and ignored for a year+. But, OK, I can use my basic data prep-loading FORTRAN program to get that loaded. Now, for the front of the record…

We’ve got 3 Letters, then what looks like 8 digits of WMO# + “near” flag (again with the minor number ‘near’ part in front instead of in the back as they did in the past), then what looks like a 4 digit year and a flag to tell you type (in this case Temperature Average).

I think I can work with this lot too. Mostly it will be making a map of the 3 letter flag to the 3 digit Region / Country of the past. A manual task, but not too many countries nor continents to deal with.

The Descriptor Files

It comes with files that describes all this stuff. I generally like to just “eyeball” things first, then check it against the documentation. I can find more errors in the documentation that way (if any) and at the same time check my quality of instinct.

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ wc -l *.txt
  238 ghcnm-countries.txt
  125 ghcnm-flags.txt
  274 readme.txt
  103 status.txt
  740 total
chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ 

None of these are so big as to be horrible. 238 Countries? Didn’t know there were that many… Maybe some are sub-sets like overseas departments of France and such? The “status” file is a list of things that changed as they made the set. Announcement of ‘beta’ availability and such. I’ll leave it for last. Flags ought to be interesting. The Readme can be helpful too. So I’ll start with them.

Flags File

Here’s the flags:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ cat *flags*
GHCN-Monthly, Version 4, Flag File 

This file describes all of the data source flags used in GHCN-Monthly
version 4. They are formatted as follows:

Column1: Alpha-Numeric Flag
Column2: Dataset Origin (Note that some sources exist in more than one dataset)
             GHCND: Global Historical Climatology Network: Daily Dataset
                    ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/
             GHCNM: Global Historical Climatology Network: Monthly Dataset
                    ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3/
             ISTI:  International Surface Temperature Initiative 
                    ftp://ftp.ncdc.noaa.gov/pub/data/globaldatabank/

Column3: Description of Flag

********************************************************************************
0    GHCND       U.S. Cooperative Summary of the Day (NCDC DSI-3200)
6    GHCND       CDMP Cooperative Summary of the Day (NCDC DSI-3206)
7    GHCND       U.S. Cooperative Summary of the Day -- Transmitted via WxCoder3 
                 (NCDC DSI-3207)
A    GHCND       U.S. Automated Surface Observing System (ASOS) real-time data 
                 (since January 1, 2006)
a    GHCND       Australian data from the Australian Bureau of Meteorology
B    GHCND       U.S. ASOS data for October 2000-December 2005 (NCDC DSI-3211)
b    GHCND       Belarus update
C    GHCND       Environment Canada
E    GHCND       European Climate Assessment and Dataset (Klein Tank et al., 2002)	   
G    GHCND       Official Global Climate Observing System (GCOS) or other 
                 government-supplied data
H    GHCND       High Plains Regional Climate Center real-time data
I    GHCND       International collection (non U.S. data received through
                 personal contacts)
K    GHCND       U.S. Cooperative Summary of the Day data digitized frompaper 
                 observer forms (from 2011 to present)
M    GHCND       Monthly METAR Extract (additional ASOS data)
N    GHCND       Community Collaborative Rain, Hail,and Snow (CoCoRaHS)
Q    GHCND       Data from several African countries that had been "quarantined", 
                 that is, withheld from public releaseuntil permission was granted 
                 from the respective meteorological services
R    GHCND       NCDC Reference Network Database (Climate Reference Network and 
                 Historical Climatology Network-Modernized)
r    GHCND       All-Russian Research Institute of Hydrometeorological Information
                 World Data Center
S    GHCND       Global Summary of the Day (NCDC DSI-9618)
s    GHCND       China Meteorological Administration/National Meteorological 
                 Information Center/Climatic Data Center (http://cdc.cma.gov.cn)
T    GHCND       SNOwpack TELemtry (SNOTEL) data obtained from the Western 
                 Regional Climate Center
U    GHCND       Remote Automatic Weather Station (RAWS) data obtainedfrom the 
                 Western Regional Climate Center	   
u    GHCND       Ukraine update	   
W    GHCND       WBAN/ASOS Summary of the Day from NCDC's Integrated Surface 
                 Data (ISD).  
X    GHCND       U.S. First-Order Summary of the Day (NCDC DSI-3210)
z    GHCND       Uzbekistan update
Z    GHCND/GHCNM Datzilla
m    ISTI        mexico                        
V    ISTI        vietnam                       
c    ISTI        channel-islands               
e    ISTI        ecuador                       
p    ISTI        pitcairnisland                
j    ISTI        giessen                       
i    ISTI        brazil-inmet                  
1    ISTI        brazil                        
2    ISTI        argentina                     
4    ISTI        india                         
5    ISTI        gsn-sweden                    
8    ISTI        canada-raw                    
9    ISTI        east-africa                   
!    ISTI        uganda                        
#    ISTI        antarctica-aws                
$    ISTI        antarctica-palmer             
%    ISTI        antarctica-southpole          
&    ISTI        ispd-swiss                    
(    ISTI        ispd-ipy                      
)    ISTI        ispd-sydney                   
*    ISTI        antarctica-scar-reader        
+    ISTI        spain                         
-    ISTI        uruguay-inia                  
.    ISTI        uruguay                       
d    ISTI        swiss-digihom                 
/    ISTI        ispd-tunisia-morocco          
:    ISTI        sacad_non-blended             
;    ISTI        japan                         
    ISTI        russsource-australia_de       
^    ISTI        russsource-australia_wwr      
_    ISTI        russsource-ghcn               
'    ISTI        russsource-climat             
`    ISTI        russsource-conus_climat       
~    ISTI        russsource-ak_hi_climat       
?    ISTI        germany                       
w    ISTI        wmssc                         
@    ISTI        central-asia                  
[    ISTI        arctic                        
h    ISTI        histalp                       
]    ISTI        hadisd 
F    ISTI/GHCND  usforts 
D    ISTI/GHCNM  mcdw-unpublished
"    ISTI/GHCNM  ghcnsource
J    ISTI/GHCNM  colonialera
|    ISTI/GHCNM  climat-uk 
}    ISTI/GHCNM  mcdw
k    ISTI/GHCNM  knmi
P    ISTI/GHCNM  climat-ncdc
{    ISTI/GHCNM  wwr
,    ISTI/GHCNM  climat-bufr
3    ISTI/GHCNM  climat-prelim

So, was that that “K”? Well, I’ll sort that out later. What is clear is that there’s a lot of sources and they used funny letters to code them. Got it. Other than that, not so useful.

README File

How about the README? I’ve bolded some bits of interest. Looks like I’m generally right on my guesses.

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4$ cat readme.txt 
GLOBAL HISTORICAL CLIMATOLOGY NETWORK - MONTHLY (GHCN-M) Version 4
(Last Updated: 10/26/2018)

********************************************************************************

The underlying data, its software and all content are provided on an 'as is' 
and 'as available' basis. We do not give any warranties, whether express 
or implied, as to the suitability or usability of the product, 
its software or any of its content.

Feedback is welcome and should be sent via email to:
ncdc.ghcnm@noaa.gov

Should you have any questions or concerns regarding the product,
please let us know immediately so we can rectify these accordingly.
Your help in this regard is greatly appreciated.

********************************************************************************

1. INTRODUCTION

    1.1 OVERVIEW

        GHCN-M version 4 currently contains monthly mean temperature for over
        25,000 stations across the globe. In large part GHCN-M version 4 uses
        the same quality control and bias correction algorithms as version 3. 
        The greatest difference from previous version is a greatly expanded 
        set of stations based on the large data holdings in GHCN-Daily as well 
        as data collected as part of the International Surface Temperatue 
        Initiative databank (ISTI; Rennie et al. 2013).  

    1.2 INTERNET ACCESS

        The GHCNM v4 product can be found here:

        FTP: ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v4

    1.3 DOWNLOADING AND INSTALLING

        WINDOWS (example):

	GHCNM files are compressed into gzip format.  For more information on gzip, 
        please see the following web site:

        www.gzip.org

        and for potential software that can compress/decompress gzip files, please see:

        www.gzip.org/#faq4


        LINUX (example):

        wget ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v4/ghcnm.latest.qcu.tar.gz
        tar -zxvf ghcnm.latest.qcu.tar.gz

        (alternatively, if "tar" does not support decompression, a user can try:

        wget ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v4/ghcnm.latest.qcu.tar.gz
        gzip -d ghcnm.latest.qcu.tar.gz
        tar -xvf ghcnm.latest.qcu.tar.gz)


        Note: the data are placed in their own separate directory, that is named
              according to the following specification:

              ghcnm.v4.x.y.YYYYMMDD where 

              x = integer to be incremented with major data additions 
              y = integer to be incremented with minor data additions
              YYYY = year specific dataset was processed and produced
              MM   = month specific dataset was processed and produced
              DD   = day specific dataset was processed and produced
 
              Two files (per element) should be present in the directory a 
              1) metadata and 2) data file. Note: there will be no 
              increments to "x" and "y" above during the phase.

2. DATA

    2.1 METADATA

       The metadata has been carried over from GHCN-Monthly v3.  This would 
       include basic geographical station information such as latitude, 
       longitude, elevation, station name, etc. The extended metadata
       information, such as surrounding vegetation, etc. was not carried 
       over, however still exists in version 3

    2.1.1 METADATA FORMAT (.inv file)

       Variable          Columns      Type
       --------          -------      ----

       ID                 1-11        Integer
       LATITUDE          13-20        Real
       LONGITUDE         22-30        Real
       STNELEV           32-37        Real
       NAME              39-68        Character

       Variable Definitions:

       ID: Station identification code. First two characters are FIPS country code

       LATITUDE: latitude of station in decimal degrees

       LONGITUDE: longitude of station in decimal degrees

       STELEV: is the station elevation in meters. -999.0 = missing.

       NAME: station name

    2.2  DATA (.dat file)

         The data within GHCNM v4 for the time being consist of monthly
         average temperature.

    2.2.1 DATA FORMAT

          Variable          Columns      Type
          --------          -------      ----

          ID                 1-11        Integer
          YEAR              12-15        Integer
          ELEMENT           16-19        Character
          VALUE1            20-24        Integer
          DMFLAG1           25-25        Character
          QCFLAG1           26-26        Character
          DSFLAG1           27-27        Character
            .                 .             .
            .                 .             .
            .                 .             .
          VALUE12          108-112       Integer
          DMFLAG12         113-113       Character
          QCFLAG12         114-114       Character
          DSFLAG12         115-115       Character

          Variable Definitions:

          ID: Station identification code. First two characters are FIPS country code

          YEAR: 4 digit year of the station record.
 
          ELEMENT: element type, monthly mean temperature="TAVG"

          VALUE: monthly value (MISSING=-9999).  Temperature values are in
                 hundredths of a degree Celsius, but are expressed as whole
                 integers (e.g. divide by 100.0 to get whole degrees Celsius).

          DMFLAG: data measurement flag, nine possible values for QCU/QCF and one for QFE:

                  Quality Controlled Unadj/Adj (QCU/QCF) Flags:

                  blank = no measurement information applicable
                  a-i = number of days missing in calculation of monthly mean
                        temperature (currently only applies to the 1218 USHCN
                        V2 stations included within GHCNM)

                  Quality Controlled Adj and estimated (QFE) Flags:
                  E = Data has been estimated from neighbors within the PHA

          QCFLAG: quality control flag, eleven possibilities within
                  quality controlled unadjusted (qcu) dataset, and three 
                  possibilities within the quality controlled adjusted (qcf) 
                  dataset.

                  Quality Controlled Unadjusted (QCU) QC Flags:
         
                  BLANK = no failure of quality control check or could not be
                          evaluated.

                  Note: the following QC checks are listed in order of execution. When
                        a value is flagged by a QC check it is not subjected to 
                        testing in subsequent checks.

                  E = Identify different stations that have the "same" 12 
                      monthly values for a given year (e.g. when all 12 months
                      are duplicated and there are at least 3 or more non-
                      missing data months, and furthermore values are considered 
                      the "same" when the absolute difference between the two values 
                      is less than or equal to 0.015 deg C)

                  D = monthly value is part of an annual series of values that
                      are exactly the same (e.g. duplicated) within another
                      calendar year in the station's record.

                  R = Flag values greater than or less than known world 
                      TAVG extremes.

                  K = Identifies and flags runs of the same value (non-missing)
                      in five or more consecutive months

                  W = monthly value is duplicated from the previous month,
                      based upon regional and spatial criteria (Note: test is only 
                      applied from the year 2000 to the present, and in general
                      only for near real time produced data sources).                   

                  I = checks for internal consistency between TMAX and TMIN. 
                      Flag is set when TMIN > TMAX for a given month. 

                  L = monthly value is isolated in time within the station
                      record and flagged, when:

                      1) a non-missing value has at least 18 missing values
                         before AND after in time., or

                      2) a non-missing value belongs to a "cluster" of 2
                         adjacent (in time) non-missing values, and the cluster of
                         values has at least 18 missing values before AND after the
                         cluster, or

                      3) a non-missing value belongs to a "cluster" of 3
                         adjacent (in time) non-missing values, and the cluster of
                         values has at least 18 missing values before AND after the
                         cluster.

                  O = monthly value that is >= 5 bi-weight standard deviations
                      from the bi-weight mean.  Bi-weight statistics are
                      calculated from a series of all non-missing values in 
                      the station's record for that particular month.

                  S = Flags value when the station z-score satisfies any of the
                      following algorithm conditions.

                      Definitions:

                      neighbor = any station within 500 km of target station.
                      zscore = (bi-weight standard deviation / bi-weight mean)
                      S(Z) = station's zscore
                      N(Z) = the set of the "5" closest non-missing neighbor zscores.
                             (Note: this set may contain less than 5 neighbors,
                             but must have at least one neighbor zscore for
                             algorithm execution)

                      Algorithm:

                      S(Z) >= 4.0 and < 5.0 and "all" N(Z) = 3.0 and < 4.0 and "all" N(Z) = 2.75 and < 3.0 and "all" N(Z) = 2.5 and < 2.75 and "all" N(Z) < 1.6
                      S(Z)  -5.0 and "all" N(Z) > -1.9
                      S(Z)  -4.0 and "all" N(Z) > -1.8
                      S(Z)  -3.0 and "all" N(Z) > -1.7
                      S(Z)  -2.75 and "all" N(Z) > -1.6
                                   
                  T = Identifies and flags when the temperature z-score compared 
                      to the inverse distance weighted z-score of all neighbors 
                      within 500 km (at least 2 or more neighbors are required)
                      is greater than or equal to 3.0. 
                 

                  M = Manually flagged as erroneous.

                  Quality Controlled Adjusted (QCF) QC Flags:

                  A = alternative method of adjustment used.
 
                  M = values with a non-blank quality control flag in the "qcu"
                      dataset are set to missing the adjusted dataset and given
                      an "M" quality control flag.

                  X = pairwise algorithm removed the value because of too many
                      inhomogeneities.


          DSFLAG: data source flag for monthly value

                  For more information on data source flags, please refer to
                  ghcnm-flags.txt 

3. CONTACT

    3.1 QUESTIONS AND FEEDBACK

        NCDC.GHCNM@noaa.gov

So looks like 3 flags in a row around that “k” and I’ll need to get a finer grain reading on just what row that is to know what flag it is. Got it.

I also note they say it is an 11 “digit” value starting in column 1, but also that the first two characters are the FIPS Country Code. Looks like a documentation error to me. They have 3 CHAR and then 8 INT so “some interpretation required”… I’ll work that out when I do a detailed read of the description and match the 3 CHAR to “whatever”…

Taking a peek at the bottom of the data file is not encouraging:

chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ tail *.dat
ZIXLT6221161961TAVG 1965  J 1945  J 2004  J 1850  J 1690  J 1310  J 1240  J 1320  J 1785  J 2065  J 1950  J 2000  J
ZIXLT6221161962TAVG 2035  J 2035  J 1900  J 1835  J 1315  J 1395  J-9999   -9999   -9999   -9999   -9999   -9999   
ZIXLT6221161963TAVG-9999   -9999   -9999   -9999   -9999    1265  J-9999   -9999   -9999   -9999   -9999   -9999   
ZIXLT6221161964TAVG-9999   -9999   -9999   -9999   -9999    1120  J-9999   -9999   -9999   -9999   -9999   -9999   
ZIXLT6221161965TAVG-9999   -9999   -9999   -9999   -9999    1150  J 1200  " 1420  " 1660  " 1840  " 2040  " 2220  "
ZIXLT6221161966TAVG 2180  " 2040  " 1840  " 1690  " 1430  " 1280  " 1210  " 1460  " 1770  " 1980  " 2090  " 2110  "
ZIXLT6221161967TAVG 2110  " 1990  " 1890  " 1920  " 1510  " 1350  " 1100  " 1380  " 1660  " 2080  " 1990  " 1910  "
ZIXLT6221161968TAVG 2180  " 2000  " 1930  " 1820  " 1560  " 1080  " 1370  " 1630  " 1760  " 2180  " 1840  " 2070  "
ZIXLT6221161969TAVG 2090  " 2150  " 1950  " 1830  " 1410  " 1310  " 1160  " 1460  " 1780  " 2100  " 2040  " 1910  "
ZIXLT6221161970TAVG 2070  " 1990  " 1930  " 1720  " 1560  " 1330  " 1330  " 1540  " 2040  " 2030  " 2130  " 2150  "

It has 5 leading CHAR and then 6 INT of what looks like WMO#. My guess would be the leading 5 CHAR are a complicated code and not just a “minor station number near a WMO” as in the past. OK, harder to work with and match up, but not impossible. IF as described, the first two are Country, and then the last 5 are WMO# (a fella can hope can’t he?) that just leaves 4 of “mystery meat” to work out inside the “ID” of 11 “Integer” that is really including characters.

But at least now you see why I don’t just blindly trust the documentation right out the gate…

So again “some assembly required”…

I’ll need to ponder a bit how to unscramble the “Region / Country” business. The basic problem being that countries change over time (is Crimea Ukraine or Russia right now?…) and really it ought not to be part of the data structure. Oh Well. Maybe just a table of WMO and the FIPS country and the 3 digit v2 / v3 country and then the v1 country? IIRC v1 WMO was a different length, so might need it in the table too to show that mapping.

IF the WMO Number can be used as the Primary Key, that would be ideal, then make all the rest of the metadata and country code(s) dependent values. That ought to straighten out the inventory data.

Having the bulk of all stations devoid of things like Airstation Flag and distance to water will make it impossible to do that kind of report, but I can still do altitude and lat / long. Thin gruel that…

From here on down I’ve included, without comment, the “Status” and “Countries” files, just as simple documentation. I think I have a handle on what I need to do to suck this into a MariaDB database and compare it with GHCN v3. So I’ll start working on that now. The rest of this can “catch up” ;-)

Status File

GHCN-Monthly, Version 4, Status File 

(users can use this file to determine overall current status, including 
information related to previous changes and errata).

DISCLAIMER: The status file does not record every possible change that may
have occurred from one version to another.  This would be impractical for a
variety of reasons. Therefore if a user is interested in discovering every 
possible difference between two different file versions, they will need to 
construct their own program to determine all possible changes between the file
versions. 

********************************************************************************

10/26/2018

GHCN-M version 4.0.0 is now operational. Issues may still arise during the 
first few days of operations. For any questions, please email
ncdc.ghcnm@noaa.gov

********************************************************************************

09/24/2018

With the early online release of the GHCN-M v4 paper, this product is being 
transitioned from a public beta (v4.b.2) to operations (v4.0.0). During this
transition, issues may arise with data output. For questions, please email
ncdc.ghcnm@noaa.gov 

********************************************************************************

02/16/2018

Beginning with the 20180215 run, a third version of v4 beta will be produced.
In addition to QCU and QCF, a QFE inventory and data file will be generated.
The QFE includes estimated data for stations for the period of 1961-1990. Data
are estimated based upon neighboring station info in the Pairwise Homogeneity
Algorithm (PHA). Data is only provided between 1961 and 1990, and will be used
to assess climate trends.

********************************************************************************

11/22/2017

An update to the public BETA of GHCN-M version 4 has been made available, known
as v4.b.2. Changes include:

    - Incorporating the update from the ISTI Databank, now at version 1.1.1.
    - Incorporating changes and updates to the quality control procedure. For
      more information about the flags, please refer to the readme file.
    - Other minor bug fixes, as pointed out from the user community.

Please remember this is still a beta version of GHCN-M version 4, which is 
undergoing final testing before its official release. The underlying
data, its software and all content are provided on an 'as is' and 'as available'
basis. We do not give any warranties, whether express or implied, as to the
suitability or usability of the product, its software or any of its content.

Feedback is welcome and should be sent via email to:
ncdc.ghcnm@noaa.gov

Should you have any questions or concerns regarding the product,
please let us know immediately so we can rectify these accordingly.
Your help in this regard is greatly appreciated.

********************************************************************************

07/20/2017

Beginning this week, v4 beta will begin ingesting new data and adjusting it 
through the Pairwise Homogeneity Algorithm. This update essentially reverts the
change that was made on March 20th, 2017.

********************************************************************************

03/20/2017

Starting with the 20170310 run, v4 Beta will stop adjusting new data ingested. 
Instead, quality controlled unadjusted (QCU) data will be appended
to both QCU and QCF files of v4 beta. During the first month of the new year,
the Pairwise Homogenity Adjustment (PHA) will be run again to adjust the datasets
entire period of record. 

********************************************************************************

10/15/2015

A public BETA of GHCN-M version 4 has been made available,
effective 10/15/2015. This is the first beta distribution, and its version
is known as v4.b.1

Please note that this is a beta version of GHCN-M version 4, which is
still undergoing final testing before its official release. The underlying
data, its software and all content are provided on an 'as is' and 'as available' 
basis. We do not give any warranties, whether express or implied, as to the 
suitability or usability of the product, its software or any of its content.

Feedback is welcome and should be sent via email to:
ncdc.ghcnm@noaa.gov

Should you have any questions or concerns regarding the product,
please let us know immediately so we can rectify these accordingly.
Your help in this regard is greatly appreciated.

Countries File

AC Antigua and Barbuda 
AE United Arab Emirates 
AF Afghanistan
AG Algeria 
AJ Azerbaijan 
AL Albania
AM Armenia 
AO Angola 
AQ American Samoa [United States] 
AR Argentina 
AS Australia 
AU Austria 
AY Antarctica 
BA Bahrain 
BB Barbados 
BC Botswana 
BD Bermuda [United Kingdom] 
BE Belgium 
BF Bahamas, The 
BG Bangladesh
BH Belize
BK Bosnia and Herzegovina 
BL Bolivia 
BM Burma
BN Benin 
BO Belarus 
BP Solomon Islands 
BR Brazil
BU Bulgaria
BX Brunei
BY Burundi
CA Canada 
CB Cambodia
CD Chad 
CE Sri Lanka 
CF Congo (Brazzaville) 
CG Congo (Kinshasa)
CH China  
CI Chile 
CJ Cayman Islands [United Kingdom] 
CK Cocos (Keeling) Islands [Australia] 
CM Cameroon 
CN Comoros
CO Colombia 
CQ Northern Mariana Islands [United States] 
CS Costa Rica 
CT Central African Republic 
CU Cuba 
CV Cape Verde
CW Cook Islands [New Zealand]
CY Cyprus 
DA Denmark 
DJ Dijibouti
DO Dominica
DR Dominican Republic 
EC Ecuador 
EG Egypt 
EI Ireland 
EK Equatorial Guinea
EN Estonia 
ER Eritrea 
ES El Salvador 
ET Ethiopia 
EU Europa Island [France]
EZ Czech Republic 
FG French Guiana [France] 
FI Finland 
FJ Fiji
FK Falkland Islands (Islas Malvinas) [United Kingdom] 
FM Federated States of Micronesia 
FP French Polynesia 
FR France 
FS French Southern and Antarctic Lands [France]
GA Gambia, The 
GB Gabon 
GG Georgia
GH Ghana
GI Gibraltar [United Kingdom]
GJ Grenada
GK Guernsey
GL Greenland [Denmark] 
GM Germany 
GP Guadeloupe [France] 
GQ Guam [United States] 
GR Greece 
GT Guatemala 
GV Guinea 
GY Guyana 
HA Haiti
HK Hong Kong
HO Honduras 
HR Croatia
HU Hungary 
IC Iceland 
ID Indonesia 
IM Isle of Man
IN India 
IO British Indian Ocean Territory [United Kingdom] 
IR Iran 
IS Israel 
IT Italy 
IV Cote D'Ivoire 
IZ Iraq 
JA Japan 
JE Jersey
JM Jamaica 
JN Jan Mayen [Norway] 
JO Jordan
JQ Johnston Atoll [United States] 
JU Juan De Nova Island [France]
KE Kenya 
KG Kyrgyzstan 
KN Korea, North 
KR Kiribati 
KS Korea, South 
KT Christmas Island [Australia] 
KU Kuwait 
KZ Kazakhstan 
LA Laos 
LE Lebanon
LG Latvia 
LH Lithuania
LI Liberia 
LO Slovakia 
LQ Palmyra Atoll [United States] 
LS Liechtenstein
LT Lesotho 
LU Luxembourg 
LY Libya 
MA Madagascar 
MB Martinique [France]
MC Macau S.A.R
MD Moldova 
MF Mayotte [France]
MG Mongolia 
MH Montserrat
MI Malawi
MJ Montenegro 
MK Macedonia 
ML Mali 
MO Morocco 
MP Mauritius 
MQ Midway Islands [United States} 
MR Mauritania 
MT Malta 
MU Oman 
MV Maldives 
MX Mexico 
MY Malaysia 
MZ Mozambique 
NC New Caledonia [France] 
NE Niue [New Zealand]
NF Norfolk Island [Australia]
NG Niger 
NH Vanuatu 
NI Nigeria
NL Netherlands 
NO Norway 
NP Nepal 
NR Nauru
NS Suriname
NT Netherlands Antilles [Netherlands]
NU Nicaragua 
NZ New Zealand 
PA Paraguay 
PC Pitcairn Islands [United Kingdom] 
PE Peru 
PK Pakistan 
PL Poland 
PM Panama 
PO Portugal 
PP Papua New Guinea 
PS Palau
PU Guinea-Bissau
QA Qatar
RE Reunion [France]
RI Serbia
RM Marshall Islands 
RO Romania 
RP Philippines 
RQ Puerto Rico [United States] 
RS Russia 
RW Rwanda
SA Saudi Arabia 
SB Saint Pierre and Miquelon [France]
SC Saint Kitts and Nevis
SE Seychelles 
SF South Africa 
SG Senegal 
SH Saint Helena [United Kingdom] 
SI Slovenia 
SL Sierra Leone 
SN Singapore
SO Somalia
SP Spain 
ST Saint Lucia 
SU Sudan 
SV Svalbard [Norway] 
SW Sweden 
SX South Georgia and the South Sandwich Islands [United Kingdom]
SY Syria 
SZ Switzerland 
TD Trinidad and Tobago 
TE Tromelin Island [France]
TH Thailand 
TI Tajikistan 
TL Tokelau [New Zealand] 
TN Tonga 
TO Togo 
TP Sao Tome and Principe
TS Tunisia 
TT Timor-Leste
TU Turkey 
TV Tuvalu 
TW Taiwan
TX Turkmenistan 
TZ Tanzania 
UG Uganda
UK United Kingdom 
UP Ukraine 
US United States 
UV Burkina Faso 
UY Uruguay 
UZ Uzbekistan 
VC Saint Vincent and the Grenadines
VE Venezuela 
VM Vietnam 
VQ Virgin Islands [United States] 
WA Namibia 
WF Wallis and Futuna [France]
WI Western Sahara 
WQ Wake Island [United States]
WS Samoa
WZ Swaziland 
XX Unknown
YM Yemen
ZA Zambia 
ZI Zimbabwe 

Subscribe to feed

Advertisements

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons' are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in NCDC - GHCN Issues and tagged , . Bookmark the permalink.

51 Responses to GHCN v4 First Peek

  1. Bill in Oz says:

    E M I am watching you do this with admiration and no small amount of envy.

    I also understand that you are DOCUMENTING your process so that others can redo it and test the process you are doing, when & if they want. Though it won’t be me as there are too many demands on my time here. :- (

    So you are running a properly conducted scientific experiment !

    Thanks for your commitment to this !

    I expect to see you sign off, at the end with a QED ! ( Quod Erat Demonstrandum ! )

  2. gallopingcamel says:

    It seems those crooks at NOAA/GHCN did not enjoy Tony Heller’s mockery. As usual with gumment conspiracies the answer is less transparency so they have tried to make it difficult to do a v2 vs. v3 vs. v4 comparison.

    Having a corrupt DoJ puts our nation at risk so we need to fix it. Having a corrupt worldwide “Science Establishment” puts global prosperity at risk. We definitely need to fix that.

    Thank you Chiefio for casting light into dark places. Let us watch the cockroaches scurrying. Too bad than none of these crooks will ever serve a day in jail.

  3. A C Osborn says:

    EM, providing that you can match up the Station number to V3 you can then use V3 supplementary data.
    I would attempt to split out the Station numbers in to their own field for as you say for Key use.
    It will be fascinating to compare the individual values in V4 with those in V3 to see what they have corrupted.
    Other people have different versions of V4 which show Adjustments every time it was run, so we know it has happened.

    I wouldn’t mind a copy of your finalised SQL tables, but I would use Access SQL and VBA to work with the data.

    Would you be interested in other versions of V4 if I could locate who has them?
    I am pretty sure Tony Heller has some, but I am not sure if he will share.

  4. E.M.Smith says:

    @A.C.Osborn:

    I am generally a data Pack Rat so I’m always interested in “versions”. That said, I think I need to get this one loaded up and make some mileage toward what it says before I run after too many more copies.

    Since it was “Beta” until about 5 months ago, most “differences” can be blown off with a claim of “in development and not production” anyway.. Interesting in what it can say about their process, but not going to have much effect beyond that.

    The references I’m using for SQL syntax frequently point out the differences between Oracle, Microsoft, and MySQL versions (and Postgress? I think…) Mostly these are things IN the other versions that are missing in MySQL. So I expect that most of what I do, and publish, will run with little or no change in other systems.

    Visual Basic? I once spent a year+ writing production code in HP’s “Business Basic” using their Image and Query database product. A Cost Accounting system. Anything I’ve done with FORTRAN can be done in BASIC. The translation would not be hard as the nature of both languages is similar (procedural, stuff things in variables, similar commands, similar control structures). I’ve not used any of the fancy math features available in FORTRAN. It would not be my first choice, but nothing makes it a bad choice, IMHO.

    FWIW, HP’s BASIC was written by a frustrated Pascal programmer IMHO. Had “BEGIN END” blocks, subroutines, functions, long variable names, DO WHILE, and more. Yeah, limited formal BASIC would work in it, but you could also write mostly Pascal and it would be “right” ;-) Both compiled and interpreted. What I wrote in it looked a lot like Pascal with a line number on every line ;-)

    I had one maintenance program that was a big recursive decent (as the language was fine with recursive and re-entrant functions). Did great on the test case. Then we ran it at the first production site and it “blew up”. A Dentist where people would visit regularly for decades… The decent of the history used all memory and swap and just ran out of space before it reached the end of the record. In one made week I got to rewrite the thing to NOT be recursive… and use a lot less memory (even if it was also a lot less “cool” ;-)

    Since then I’ve tended to avoid “interesting” and “cool” and “trick” code and write “the most direct and simple code” possible….

  5. E.M.Smith says:

    Well this is interesting… Looks like 9999 in the altitude column may have some special meaning:

    MGXLT069203  47.7000  108.4000 9999.0 MONGOLIAN_STATION_C           
    MGXLT102509  49.2000  115.4000 9999.0 MONGOLIAN_STATION_E           
    MGXLT131091  48.7000   98.3000 2134.0 TOSONTSENGEL_DZAVHA           
    MGXLT229724  49.7000   96.4000 9999.0 BAYAN_UUL_DZAVHAN             
    MGXLT252313  46.3000   93.9000 9999.0 TONHIL_GOVIALTAY              
    MGXLT258096  48.6000  110.6000  747.0 BINDER_HENTIY                 
    MGXLT261700  49.8000  106.7000 9999.0 YOROO_SELENGE                 
    MGXLT284560  47.6000   95.0000 9999.0 MONGOLIAN_STATION_G           
    MGXLT304312  47.2000  117.3000  642.0 TAMSAGBULAG_EASTERN           
    MGXLT319937  50.2000  106.2000 9999.0 SUHBAATA_SELENGE              
    MGXLT546984  48.9000  106.9000 9999.0 MONGOLIAN_STATION_S           
    MGXLT571594  46.1000   91.6000 9999.0 BULGAN_HOVD                   
    MGXLT584073  48.3000   89.5000 9999.0 ALTAY_BAYAN_OLIGY             
    MGXLT595925  43.2000  101.0000 9999.0 MONGOLIAN_STATION_S           
    MGXLT627866  44.1000  103.6000 9999.0 BILGAN_SOUTH_GOVI             
    MGXLT629616  47.6000  118.6000 9999.0 MONGOLIAN_STATION_E           
    MGXLT669395  45.4000  103.9000 9999.0 SAYAN_OVO_MIDDLE_GO  
    

    So ^MG is “Mongolia. Then who knows what XLT5 is (or XLTx) then the WMO Number (I think…) Or maybe there’s some 6 digit version of the WMO#? But that altitude ought to be known unless they are flagging something else.

    The third line has XLT but also has an altitude, so disjoint meanings in the two fields (i.e. not both coding for ‘missing’ or whatever).

    I think I need to make a table if WMO# / Name and start from there… Picking one, it does have data in it

    chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ grep MGXLT546984 *.dat
    MGXLT5469841965TAVG-2290  "-1960  " -700  "  190  " 1080  " 1720  " 1660  " 1570  "  840  "  220  "-1220  "-2190  "
    MGXLT5469841966TAVG-2530  "-1750  "-1120  "    0 S" 1070  " 1550  " 1600  " 1570  " 1060  "    0  "-1180  "-2750  "
    MGXLT5469841967TAVG-2580  "-2100  " -690  "  290  " 1220  " 1500  " 1770  " 1490  "  780  "   80  "-1600  "-2470  "
    MGXLT5469841968TAVG-2800  "-2410  " -690  "  290  " 1000  " 1690  " 1980  " 1500  "  700  " -110  "-1040  "-2020  "
    MGXLT5469841969TAVG-3100  "-2730  " -920  "  210  " 1000  " 1730  " 1920  " 1430  "  800  "   20  "-1090  "-2020  "
    MGXLT5469841970TAVG-2300  "-2090  "-1510  "  210  "  920  " 1690  " 1990  " 1650  "  710  "  -40  "-1440  "-2410  "
    MGXLT5469841971TAVG-2240  "-2400  "-1470  "  240  " 1090  " 1510  " 1710  " 1630  "  710  "   20  " -900  "-2140  "
    MGXLT5469841972TAVG-2220  "-2170  " -580  "  310  " 1110  " 1740  " 1860  " 1520  "  780  " -110  "-1150  "-2130  "
    MGXLT5469841973TAVG-2400  "-2170  " -750  "  200  "  950  " 1770  " 1630  " 1470  " 1050  " -110  " -790  "-1679  "
    MGXLT5469841974TAVG-2180  "-2040  "-1190  "  290  "  940  " 1510  " 1830  " 1860  "  800  " -370  "-1260  "-2320  "
    MGXLT5469841975TAVG-2029  "-1740  " -500  "  160  " 1090  " 1590  " 1660  " 1620  " 1030  "   90  " -910  "-2420  "
    MGXLT5469841976TAVG-2300  "-1660  " -960  "   80  "  810  " 1460  " 1820  " 1380  " 1060  "  -50  "-1430  "-2110  "
    MGXLT5469841977TAVG-2950  "-2140  " -710  "  160  "  830  " 1610  " 2950 O" 1770  " 1050  "  130  " -889  " 1900 O"
    MGXLT5469841978TAVG-2240  "-2110  " -680  "  280  " 1000  " 1760  " 1660  " 1520  "  940  "    0  " -939  "-2090  "
    chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ 
    

    but only a few years (1965 to 1978). OK, so can’t just ignore those with 9999 altitude.

  6. E.M.Smith says:

    Looks like 6 digit WMO per this interesting and useful list:
    http://cms.ashrae.biz/weatherdata/STATIONS/StnList_s.pdf

    A nice list of global sites and WMO numbers, lat, long, elev.

    So, OK, that just leaves “XLT” to explain…

    Sidebar:

    Sky News is showing pictures of snow falling in Greece right now… (I presume now, it is a news show). So spring snow in Greece is “warming”? …

  7. E.M.Smith says:

    Comparing FUKUSHIMA (that ought to exist in both)

    v3.3:

    chiefio@PiM3Devuan2:~/SQL/v3$ grep FUKU inventory.in 
    21047595001  37.7500  140.4700   67.0 FUKUSHIMA                       152U  247MVxxno-9x-9COOL MIXED      C
    21047616001  36.0500  136.2300    9.0 FUKUI                            64U  232MVxxCO15x-9COOL FOR./FIELD C
    21047767001  34.4500  133.2500    2.0 FUKUYAMA                         98U  330HIxxCO 5x-9WATER           C
    21047807000  33.5800  130.3800   14.0 FUKUOKA                          41U 1002FLxxCO 1x-9COASTAL EDGES   C
    21047843000  32.7000  128.8300   26.0 FUKUE                            65S   32HIxxCO 2A 2WATER           A
    

    Then v4:

    chiefio@PiM3Devuan2:/WDTB/xfs/GHCNv4/unpacking/ghcnm.v4.0.0.20190317$ grep FUKU *.inv
    JA000047595  37.2330  140.4330  375.0 FUKUSHIMA_ARPT                
    JA000047616  36.0500  136.2170   17.0 FUKUI                         
    JA000047767  34.4500  133.2500    3.0 FUKUYAMA                      
    JA000047807  33.5830  130.3830   15.0 FUKUOKA                       
    JA000047843  32.7000  128.8330   26.0 FUKUE                         
    JAW00043310  33.5833  130.4500   12.2 FUKUOKA                       
    JAXLT020742  37.4550  140.2820 9999.0 FUKUSHIMA                     
    JAXLT058200  32.4160  128.4960 9999.0 FUKUE                         
    JAXLT121328  36.3300  136.1330 9999.0 FUKUI                         
    JAXLT358607  37.7500  140.4700   67.0 FUKUSHIMA                     
    JAXLT878450  33.3490  130.2250 9999.0 FUKUOKA                       
    

    Hmmm… Putting just the records of interest next to each other, first line v3, the rest v4:

    21047595001  37.7500  140.4700   67.0 FUKUSHIMA
    JA000047595  37.2330  140.4330  375.0 FUKUSHIMA_ARPT
    JAXLT020742  37.4550  140.2820 9999.0 FUKUSHIMA     
    JAXLT358607  37.7500  140.4700   67.0 FUKUSHIMA
    

    So four records with 3 different LAT LONG and Altitude (one of which is the 9999 flag and has XLT in the mystery fields). The last one matches v3 on LAT, LONG & Altitude, but not WMO. The 2nd one looks to match on 5 digits of WMO, but not on LAT, LONG, or Altitude. And is a different name at the airport.

    How to unscramble that?…

    Heck, how to just detect how much of it there is in the data set… Match the core 5 digits of WMO and then compare for differences in LAT, LONG, ALT? Grumble…

  8. E.M.Smith says:

    A useful 49 page PDF document that describes all the stuff they do in creating the data set:

    ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v4/documentation/CDRP-ATBD-0859%20Rev%201%20GHCN-M%20Mean%20Temperature-v4.pdf

    Trudging through it now…

  9. E.M.Smith says:

    Page 8 of that doc has a nice map of stations color coded by age. Then on page 30 there’s this:

    In version 4, as in version 3, of the GHCNm temperature data, the apparent impacts of documented and undocumented inhomogeneities are detected, and corrected for, through automated pairwise comparisons of mean monthly temperature series as detailed in Menne and Williams (2009). The pairwise algorithm (Menne and Williams, 2009) starts by forming a large number of pairwise difference series between serial monthly temperature values from a region. Each difference series is then statistically evaluated for abrupt shifts, and the station series responsible for a particular break is identified automatically. Neighbors used in creating the difference series are those which are best correlated with the target. There is no limit to the physical distance between the target and its neighbors. In at least once case (St Helena Island), neighbors more than 1000 km away were sufficiently correlated to identify and correct for an inhomogeneity that occurred in 1976.

    After all of the shifts that are detectable by the algorithm are attributed to the appropriate station within the network, an adjustment is made for each target shift. Adjustments are determined by estimating the magnitude of change in pairwise difference series between the target series and highly correlated neighboring series that have no apparent shifts at the same time as the target. Adjustments are not applied for statistically insignificant changes.

    In GHCNm version 4 one or more bias corrections were applied to approximately 20,000 stations (Figure 6). The magnitude of corrections necessary for removing inhomogeneities from station records were applied equally to all months preceding the inhomogeneity, and corrections generally ranged from +/- 0.2°C to 2.0°C. Less than 5% of all corrections exceeded +/- 2.0°C (Figure 7)

    So 20,000 stations were “corrected”… but most of them are under 2 C…

    One wonders how much this “homogenizing” the data accounts for the loss of low excursions and the “warming” seen when the tops don’t rise but low going excursions “go away”…

  10. Larry Ledwick says:

    21047595001 37.7500 140.4700 67.0 FUKUSHIMA
    JA000047595 37.2330 140.4330 375.0 FUKUSHIMA_ARPT
    JAXLT020742 37.4550 140.2820 9999.0 FUKUSHIMA
    JAXLT358607 37.7500 140.4700 67.0 FUKUSHIMA

    On google maps 37.7500 140.4700 turns out to be in the middle of the Abukuma river in Fukushima.

    The third entry 37.4550 140.2820 pins in the middle of an agricultural field 20 miles south south west of Fukushima. and about 43 miles west north west of the power plant.

    The power plant is tagged at 37.421370, 141.028079 on google maps

    37.2330 140.4330 turns out to be just off the runway of the Fukushima airport, and is about 31 ft from a white object which might be a Stevenson screen

  11. E.M.Smith says:

    Well this is going to be fun… NOT!

    Looks like they have gone to a 6 digit WMO number but with MOST of the old station having their old 5 digit number that can duplicate a newer station for 5 of 6 digits…

    I made the FORTRAN program to take in the inventory file, break it up into Country, WMO, LAT, LONG, ALT, NAME along with assigning the unknown 4 digits to Flag Fields, compute the COS(LAT) and spit it all out with TABS ready for MariaDB to load. On a whim, looking at the tail of the file. I decided to check if there were duplicates of the 5 digit WMO. There are…

    chiefio@PiM3Devuan2:~/SQL/v4$ grep 71333 invent_tabs 
    JAXLT071333	JA	X	L	T	0	71333	 34.4400	 136.3110	9999.0	TSU                           	Gv4  	18Mar2019 	QCUA	 112.42
    ZIXLT371333	ZI	X	L	T	3	71333	-17.8300	  31.0200	1471.0	HARARE_BELVEDERE              	Gv4  	18Mar2019 	QCUA	  29.53
    chiefio@PiM3Devuan2:~/SQL/v4$
    

    So… what to do… I think I’ll need a 6 digit WMO, but then strip the leading zero off the ones that match to the old V2 / V3 type…

    I’m thinking “load both a 5 and 6 digit WMO” field so I can do both. Easy to create the data at this point and just stuff a field.

    Still no idea what the XLT flags are for / mean. Still a bit bothered by the 9999.0 elevation flag…

    In about an hour I ought to have the inventory data in a loadable format and into a database table. Then I can start trying to compare it to the v3 inventory of stations. I think I’m going to take a coffee break first though ;-)

  12. E.M.Smith says:

    @Larry L:

    Thanks for looking those up! So they have the same metadata quality they always had, eh? Stations in the middle of rivers…

    Nice to know the one at the airport is near the runway (‘density altitude’ determines if you can fly or crash and that depends on knowing the actual temperature at the runway… Great for aviation, not so great for “climate science’).

    So we have a choice of the runway, in the river, or lost? Sounds about right…

  13. E.M.Smith says:

    Yeah, that looks like something I can work with. So I can load the wmo as both 5 and 6 and then figure out which is the keeper later, or does that leading digit (Flag 4) sometimes have a 6th WMO digit and sometimes have something else?

    After a coffee & snack break, I’ll make the database table and load script, then “kick it around” a while to decide what’s best. Then a final round from FORTRAN to database once settled.

    You will need to scroll this to the right to see it, but I have both F4 / WMO and WML (L for long ;) in the tab separated input file to the database:

    chiefio@PiM3Devuan2:~/SQL/v4$ tail invent_tabs 
    ZI000067964	ZI	0	0	0	0	67964	067964	-20.1500	  28.6170	1344.0	BULAWAYO_GOETZ_OBS            	Gv4  	18Mar2019 	QCUA	  26.87
    ZI000067965	ZI	0	0	0	0	67965	067965	-20.0170	  28.6170	1326.0	BULAWAYO_AIRPORT              	Gv4  	18Mar2019 	QCUA	  26.89
    ZI000067969	ZI	0	0	0	0	67969	067969	-21.0500	  29.3670	 861.0	WEST_NICHOLSON                	Gv4  	18Mar2019 	QCUA	  27.41
    ZI000067975	ZI	0	0	0	0	67975	067975	-20.0670	  30.8670	1095.0	MASVINGO                      	Gv4  	18Mar2019 	QCUA	  28.99
    ZI000067977	ZI	0	0	0	0	67977	067977	-21.0170	  31.5830	 430.0	BUFFALO_RANGE                 	Gv4  	18Mar2019 	QCUA	  29.48
    ZI000067983	ZI	0	0	0	0	67983	067983	-20.2000	  32.6160	1132.0	CHIPINGE                      	Gv4  	18Mar2019 	QCUA	  30.61
    ZI000067991	ZI	0	0	0	0	67991	067991	-22.2170	  30.0000	 457.0	BEITBRIDGE                    	Gv4  	18Mar2019 	QCUA	  27.77
    ZIXLT371333	ZI	X	L	T	3	71333	371333	-17.8300	  31.0200	1471.0	HARARE_BELVEDERE              	Gv4  	18Mar2019 	QCUA	  29.53
    ZIXLT443557	ZI	X	L	T	4	43557	443557	-18.9800	  32.4500	1018.0	GRAND_REEF                    	Gv4  	18Mar2019 	QCUA	  30.69
    ZIXLT622116	ZI	X	L	T	6	22116	622116	-19.4300	  29.7500	1411.0	GWELO                         	Gv4  	18Mar2019 	QCUA	  28.06
    
  14. E.M.Smith says:

    And it’s loaded up…

    MariaDB [temps]> source bin/lPi4.sql
    Query OK, 27361 rows affected (2.98 sec)             
    Records: 27361  Deleted: 0  Skipped: 0  Warnings: 0
    

    So, in theory, 27,361 “stations”.

    Here’s the structure I put it in for now (to kick it around and see where this pig squeals..)

    chiefio@PiM3Devuan2:~/SQL/tables$ cat invent4
    CREATE TABLE invent4 (
        stnID      CHAR(11) NOT NULL,
        country    CHAR(3) NOT NULL,
        f1         CHAR(1),
        f2         CHAR(1),
        f3         CHAR(1),
        f4         CHAR(1),
        wmo        CHAR(5) NOT NULL,
        wnl        CHAR(6) NOT NULL,
        latitude   CHAR(8),
        longitude  CHAR(9),
        stn_elev   CHAR(6),
        name       CHAR(30) NOT NULL,
        version    CHAR(5) NOT NULL,
        ascension  CHAR(10),
        type       CHAR(4),
        coslong    CHAR(9),
        PRIMARY KEY (stnID)
        ) ;
    

    And as I printed the fields in the same order as the table, the load command is fairly trivial:

    chiefio@PiM3Devuan2:~/SQL/bin$ cat lPi4.sql 
    LOAD DATA LOCAL INFILE '/SG500/xfs/chiefio/SQL/v4/invent_tabs' INTO TABLE invent4;
    

    Once I’ve figured out the end state, I’ll make a posting with the FORTRAN and the final table layout.

    For now I’m just going to use this one to explore the data.

  15. E.M.Smith says:

    Well, looks like I’ll be busy with this unscrambling problem for a little while:

    So lets just match the old 5 digit WMO with the 5 digit part of the new WMO and then inspect the records where they match… Oh, that would be 3350 of them….

    chiefio@PiM3Devuan2:~/SQL/bin$ cat cmpi3i4.sql 
    SELECT I.stnID, N.stnID, I.wmo, N.wmo 
    FROM invent3 AS I
    INNER JOIN invent4 AS N
    WHERE I.wmo=N.wmo
    GROUP BY I.stnID
    ORDER BY I.stnID;
    [...]
    | 64917202005 | TUM00017202 | 17202 | 17202 |
    | 64917203000 | TUM00017203 | 17203 | 17203 |
    | 64917203001 | TUM00017203 | 17203 | 17203 |
    | 64917203002 | TUM00017203 | 17203 | 17203 |
    | 64917204000 | TUM00017204 | 17204 | 17204 |
    | 64917204001 | TUM00017204 | 17204 | 17204 |
    | 64917204002 | TUM00017204 | 17204 | 17204 |
    | 64917205000 | USC00417205 | 17205 | 17205 |
    | 64917205001 | USC00417205 | 17205 | 17205 |
    | 64917205002 | USC00417205 | 17205 | 17205 |
    | 64917205003 | USC00417205 | 17205 | 17205 |
    | 64917210000 | TUM00017210 | 17210 | 17210 |
    | 64917210001 | TUM00017210 | 17210 | 17210 |
    | 64917219001 | TUM00017219 | 17219 | 17219 |
    | 64917219002 | TUM00017219 | 17219 | 17219 |
    | 64917219003 | TUM00017219 | 17219 | 17219 |
    | 64917220001 | TUM00017220 | 17220 | 17220 |
    | 64917220002 | TUM00017220 | 17220 | 17220 |
    | 64917234000 | TUM00017234 | 17234 | 17234 |
    | 64917234001 | TUM00017234 | 17234 | 17234 |
    | 64917234002 | TUM00017234 | 17234 | 17234 |
    | 64917234003 | TUM00017234 | 17234 | 17234 |
    | 64917234004 | TUM00017234 | 17234 | 17234 |
    | 64917237000 | TUM00017237 | 17237 | 17237 |
    | 64917237001 | TUM00017237 | 17237 | 17237 |
    | 64917237002 | TUM00017237 | 17237 | 17237 |
    | 64917237003 | TUM00017237 | 17237 | 17237 |
    | 64917240000 | TU000017240 | 17240 | 17240 |
    | 64917240001 | TU000017240 | 17240 | 17240 |
    | 64917240002 | TU000017240 | 17240 | 17240 |
    | 64917240003 | TU000017240 | 17240 | 17240 |
    | 64917244000 | TUM00017244 | 17244 | 17244 |
    | 64917244001 | TUM00017244 | 17244 | 17244 |
    | 64917244002 | TUM00017244 | 17244 | 17244 |
    | 64917244003 | TUM00017244 | 17244 | 17244 |
    | 64917244004 | TUM00017244 | 17244 | 17244 |
    | 64917244005 | TUM00017244 | 17244 | 17244 |
    | 64917244006 | TUM00017244 | 17244 | 17244 |
    | 64917244007 | TUM00017244 | 17244 | 17244 |
    | 64917244008 | TUM00017244 | 17244 | 17244 |
    | 64917250000 | TUM00017250 | 17250 | 17250 |
    | 64917255000 | TUM00017255 | 17255 | 17255 |
    | 64917255001 | TUM00017255 | 17255 | 17255 |
    | 64917255002 | TUM00017255 | 17255 | 17255 |
    | 64917260000 | TUM00017260 | 17260 | 17260 |
    | 64917260001 | TUM00017260 | 17260 | 17260 |
    | 64917260002 | TUM00017260 | 17260 | 17260 |
    | 64917270000 | CA007017270 | 17270 | 17270 |
    | 64917270001 | CA007017270 | 17270 | 17270 |
    | 64917280000 | CA003017280 | 17280 | 17280 |
    | 64917280001 | CA003017280 | 17280 | 17280 |
    | 64917282000 | TUM00017282 | 17282 | 17282 |
    | 64917285000 | TUM00017285 | 17285 | 17285 |
    | 64917285001 | TUM00017285 | 17285 | 17285 |
    | 64917285002 | TUM00017285 | 17285 | 17285 |
    | 64917292000 | TUM00017292 | 17292 | 17292 |
    | 64917292001 | TUM00017292 | 17292 | 17292 |
    | 64917300000 | TUM00017300 | 17300 | 17300 |
    | 64917300001 | TUM00017300 | 17300 | 17300 |
    | 64917300002 | TUM00017300 | 17300 | 17300 |
    | 64917320000 | CA004017320 | 17320 | 17320 |
    | 64917320001 | CA004017320 | 17320 | 17320 |
    | 64917320002 | CA004017320 | 17320 | 17320 |
    | 64917320003 | CA004017320 | 17320 | 17320 |
    | 64917340000 | TUM00017340 | 17340 | 17340 |
    | 64917340001 | TUM00017340 | 17340 | 17340 |
    | 64917351001 | TUM00017351 | 17351 | 17351 |
    | 64917351002 | TUM00017351 | 17351 | 17351 |
    | 64917351003 | TUM00017351 | 17351 | 17351 |
    | 64917370000 | CA004017370 | 17370 | 17370 |
    | 64917370001 | CA004017370 | 17370 | 17370 |
    | 64917370002 | CA004017370 | 17370 | 17370 |
    | 64917370003 | CA004017370 | 17370 | 17370 |
    | 64917370004 | CA004017370 | 17370 | 17370 |
    | 64917375000 | TU000017375 | 17375 | 17375 |
    | 64917375001 | TU000017375 | 17375 | 17375 |
    | 65033301000 | UPM00033301 | 33301 | 33301 |
    | 65033345000 | JAXLT833345 | 33345 | 33345 |
    | 65033377000 | UPM00033377 | 33377 | 33377 |
    | 65033393000 | UPM00033393 | 33393 | 33393 |
    | 65033506000 | UPM00033506 | 33506 | 33506 |
    | 65033562000 | UPM00033562 | 33562 | 33562 |
    | 65033587000 | UPM00033587 | 33587 | 33587 |
    | 65033631000 | UPM00033631 | 33631 | 33631 |
    | 65033658000 | UPM00033658 | 33658 | 33658 |
    | 65033699001 | UPM00033699 | 33699 | 33699 |
    | 65033711000 | UPM00033711 | 33711 | 33711 |
    | 65033837000 | UPM00033837 | 33837 | 33837 |
    | 65033846000 | UPM00033846 | 33846 | 33846 |
    | 65033889000 | UPM00033889 | 33889 | 33889 |
    | 65033910000 | UPM00033910 | 33910 | 33910 |
    | 65033915000 | UPM00033915 | 33915 | 33915 |
    | 65033934000 | UPM00033934 | 33934 | 33934 |
    | 65033946000 | UPM00033946 | 33946 | 33946 |
    | 65033976000 | UPM00033976 | 33976 | 33976 |
    | 65033983000 | UPM00033983 | 33983 | 33983 |
    | 65033990000 | NOE00133990 | 33990 | 33990 |
    | 65034300000 | UPM00034300 | 34300 | 34300 |
    | 65034523001 | UPM00034523 | 34523 | 34523 |
    | 65034524001 | UPM00034524 | 34524 | 34524 |
    | 65034601000 | UPM00034601 | 34601 | 34601 |
    | 65103005000 | MX000003005 | 03005 | 03005 |
    | 65103026000 | UK000003026 | 03026 | 03026 |
    | 65103038000 | GM000003038 | 03038 | 03038 |
    | 65103038001 | GM000003038 | 03038 | 03038 |
    | 65103055001 | USW00003055 | 03055 | 03055 |
    | 65103068001 | MX000003068 | 03068 | 03068 |
    | 65103072001 | USC00503072 | 03072 | 03072 |
    | 65103091000 | UKM00003091 | 03091 | 03091 |
    | 65103100000 | CA008103100 | 03100 | 03100 |
    | 65103160000 | USC00503160 | 03160 | 03160 |
    | 65103160001 | USC00503160 | 03160 | 03160 |
    | 65103162000 | UK000003162 | 03162 | 03162 |
    | 65103257000 | UKM00003257 | 03257 | 03257 |
    | 65103292001 | UKM00003292 | 03292 | 03292 |
    | 65103302000 | UK000003302 | 03302 | 03302 |
    | 65103334000 | CA001103334 | 03334 | 03334 |
    | 65103334001 | CA001103334 | 03334 | 03334 |
    | 65103345001 | UK000003345 | 03345 | 03345 |
    | 65103355001 | UK000003355 | 03355 | 03355 |
    | 65103377000 | UK000003377 | 03377 | 03377 |
    | 65103496001 | UK000003496 | 03496 | 03496 |
    | 65103501001 | CA008403501 | 03501 | 03501 |
    | 65103590001 | UKM00003590 | 03590 | 03590 |
    | 65103670001 | USC00503670 | 03670 | 03670 |
    | 65103672001 | CA001203672 | 03672 | 03672 |
    | 65103761001 | UKM00003761 | 03761 | 03761 |
    | 65103817001 | USW00003817 | 03817 | 03817 |
    | 65103827000 | UKM00003827 | 03827 | 03827 |
    | 65103862000 | UKM00003862 | 03862 | 03862 |
    | 65103865000 | GM000003865 | 03865 | 03865 |
    | 65103894000 | USW00003894 | 03894 | 03894 |
    | 65103917000 | UKM00003917 | 03917 | 03917 |
    | 65206011000 | ASN00006011 | 06011 | 06011 |
    | 65308495000 | GIM00008495 | 08495 | 08495 |
    | 65308495001 | GIM00008495 | 08495 | 08495 |
    | 65408522000 | PO000008522 | 08522 | 08522 |
    | 65408524000 | POM00008524 | 08524 | 08524 |
    | 70088963001 | AYM00088963 | 88963 | 88963 |
    | 70089001000 | AYM00089001 | 89001 | 89001 |
    | 70089002000 | ASN00089002 | 89002 | 89002 |
    | 70089022000 | AYM00089022 | 89022 | 89022 |
    | 70089034000 | AYM00089034 | 89034 | 89034 |
    | 70089034001 | AYM00089034 | 89034 | 89034 |
    | 70089050000 | AYM00089050 | 89050 | 89050 |
    | 70089053001 | AYM00089053 | 89053 | 89053 |
    | 70089053002 | AYM00089053 | 89053 | 89053 |
    | 70089053003 | AYM00089053 | 89053 | 89053 |
    | 70089055000 | AYM00089055 | 89055 | 89055 |
    | 70089055001 | AYM00089055 | 89055 | 89055 |
    | 70089055002 | AYM00089055 | 89055 | 89055 |
    | 70089055003 | AYM00089055 | 89055 | 89055 |
    | 70089059001 | AYM00089059 | 89059 | 89059 |
    | 70089061000 | AYM00089061 | 89061 | 89061 |
    | 70089062000 | AYM00089062 | 89062 | 89062 |
    | 70089063000 | AYM00089063 | 89063 | 89063 |
    | 70089066000 | AYM00089066 | 89066 | 89066 |
    | 70089512000 | AYM00089512 | 89512 | 89512 |
    | 70089532000 | AYM00089532 | 89532 | 89532 |
    | 70089542000 | AYM00089542 | 89542 | 89542 |
    | 70089564000 | AYM00089564 | 89564 | 89564 |
    | 70089571000 | AYM00089571 | 89571 | 89571 |
    | 70089592000 | AYM00089592 | 89592 | 89592 |
    | 70089606000 | AYM00089606 | 89606 | 89606 |
    | 70089606001 | AYM00089606 | 89606 | 89606 |
    | 70089611000 | AYM00089611 | 89611 | 89611 |
    | 70089642000 | AYM00089642 | 89642 | 89642 |
    | 70089664000 | AYM00089664 | 89664 | 89664 |
    | 70188968000 | AYM00088968 | 88968 | 88968 |
    | 80099901001 | IN099999901 | 99901 | 99901 |
    +-------------+-------------+-------+-------+
    3350 rows in set (2 min 32.18 sec)
    

    OK… So there’s some interesting bits in there

    | 65103160000 | USC00503160 | 03160 | 03160 |
    | 65103160001 | USC00503160 | 03160 | 03160 |
    

    So the old “minor number” or modification flag series have now been combined into one set of data, hiding the equipment differences that happened behind some homogenizing I presume. Then it looks like a “5” is prepended while the old way was 6 for continent and 51 for country. Now US codes for country in this European record. Wonder why it is a US station and in Europe? Maybe later… ;-) One hopes that they didn’t just reuse the station number from Europe in a US station somewhere….

    That still leaves C00 flags as unexplained. I hope that the added digit (5 in this case) maps to some simple thing and isn’t a randomly arriving digit… i.e. systematic you can code, random not so much and will likely take a manual matching.

    Here’s a Canada station with the same thing going on. Matching somewhere in Europe on 5 digits. But also matching on more….

    | 65103334000 | CA001103334 | 03334 | 03334 |
    | 65103334001 | CA001103334 | 03334 | 03334 |
    

    That kind of gives the lie to the idea that they might have just assumed a 0 in the leading digit for the 6th digit of the v3 set. No such luck. Here we’ve got a 6 digit match, and the first record is in Europe while the second says Canada.

    At this point it looks like the WMO number is no longer a usable key for matching stations. To be sure, more ‘dig here’ required to show if that’s just a small error sample or a trend, or maybe the USA and Canada have a small part of Europe ;-) (Maybe embassies? But I doubt it…)

    OK, I’m going down this rabbit hole for a while. Don’t hold your breath… Needing to back-figure how someone scrambled their data and unscramble it is not my favorite puzzle… but I’ve done it before. Those years as a DBA … Data Base Administrator. Forensics can be like this too…

    I’m off to ponder with a coffee cup…

  16. E.M.Smith says:

    Looks like another example of a splice – a “3 way” in Antarctica

    | 70089053001 | AYM00089053 | 89053 | 89053 |
    | 70089053002 | AYM00089053 | 89053 | 89053 |
    | 70089053003 | AYM00089053 | 89053 | 89053 
    

    That last digit of 1, 2, 3 in the first number – the old form Station ID, compared to the new station ID that’s all the same single number…

    But at least for some stations the WMO matches…

  17. E.M.Smith says:

    Took 7 seconds shy of 3 minutes to make this report (same as above but wiht names added):

    | 63834880000 | ASTRAHAN'                      | ASTRAKAN                       | RSM00034880 | 34880 |
    | 63834929000 | KRASNODAR                      | MARIETTA_WTR_WKS               | USC00334929 | 34929 |
    | 63834949000 | STAVROPOL'                     | STAVROPOL_AMSG                 | RSM00034949 | 34949 |
    | 63837004001 | NOVOROSSISK         USSR       | GELENDZIK                      | RSM00037004 | 37004 |
    | 63837018000 | TUAPSE                         | TUAPSE                         | RSM00037018 | 37018 |
    | 63837031000 | ARMAVIR                        | ARMAVIR                        | RSM00037031 | 37031 |
    | 63837050000 | PJATIGORSK                     | PJATIGORSK                     | RSM00037050 | 37050 |
    | 63837054000 | MINERAL'NYE V                  | MINERAL_NYE_VODY               | RSM00037054 | 37054 |
    | 63837099000 | SOTCHI                         | SOTCHI                         | RSM00037099 | 37099 |
    | 63837235000 | GROZNYJ                        | GROZNYJ                        | RSM00037235 | 37235 |
    | 63837472000 | MAHACKALA                      | MAHACKALA                      | RSM00037472 | 37472 |
    | 63913160000 | SOMBOR                         | KVITFJELL                      | NON00013160 | 13160 |
    | 63913174000 | KIKINDA                        | GLENWOOD_2_WNW                 | USC00213174 | 13174 |
    | 63913183000 | VRSAC                          | FLATONIA_4SE                   | USC00413183 | 13183 |
    | 63913266000 | SREMSKA MITRO                  | FT_HANCOCK_8SSE                | USC00413266 | 13266 |
    | 63913285000 | VELIKO GRADIS                  | FT_WORTH_WSFO                  | USC00413285 | 13285 |
    | 63913289000 | CRNI VRH                       | FT_WORTH_NATURE_CTR            | USC00413289 | 13289 |
    | 63913367000 | ZLATIBOR                       | KAPOHO_93                      | USC00513367 | 13367 |
    | 63913384000 | CUPRIJA                        | GENESEO                        | USC00113384 | 13384 |
    | 64111816000 | BRATISLAVA IV                  | BRATISLAVA_IVANKA              | LOM00011816 | 11816 |
    | 64111903000 | SLIAC                          | SLIAC                          | LOM00011903 | 11903 |
    | 64111916000 | CHOPOK                         | CHOPOK                         | LOM00011916 | 11916 |
    | 64111918001 | ARVAVARALJA         CZECH      | LIESEK                         | LOM00011918 | 11918 |
    | 64111930000 | LOMNICKY STIT                  | LOMNICKY_STIT                  | LOM00011930 | 11930 |
    | 64111934000 | POPRAD/TATRY                   | POPRAD_TATRY                   | LO000011934 | 11934 |
    | 64214015000 | LJUBLJANA/BEZ                  | DARWIN_AIRPORT                 | ASN00014015 | 14015 |
    | 64308025000 | BILBAO              SPAIN      | CARNAMAH                       | ASN00008025 | 08025 |
    | 64308027000 | SAN SEBASTIAN                  | SAN_SEBASTIANIGUELDO           | SP000008027 | 08027 |
    | 64308160000 | ZARAGOZA/AEROPUERTO            | SELMER                         | USC00408160 | 08160 |
    | 64308160001 | ZARAGOZA                       | SELMER                         | USC00408160 | 08160 |
    | 64308181001 | BARCELONA           SPAIN      | BARCELONA_AEROPUERTO           | SP000008181 | 08181 |
    | 64308184001 | MONTSENY             SPAIN     | NUEVA_CASAS_GRANDES            | MX000008184 | 08184 |
    | 64308202000 | SALAMANCA/MAT                  | TODAGIN_RANCH                  | CA001208202 | 08202 |
    

    Some stay the same. Some change.

    I think I need to translate “country” to a constant thing and then select on both WMO and country…

  18. E.M.Smith says:

    I’ve made a “country” table and loaded V2 and V3 into it (v4 “soon” I was doing 2 & 3 to test on the easy one first…) There are some differences (version is on the far right V2 vs V3.3):

    | 141     | SOUTH AFRICA                                         | Gv3.3   |
    | 141     | SOUTH AFRICA                                         | Gv2     |
    | 143     | TERRITORY OF THE FRENCH SOUTHERN AND ANTARCTIC LANDS | Gv3.3   |
    | 147     | BRITISH OVERSEAS TERRITORIES                         | Gv3.3   |
    | 148     | SUDAN                                                | Gv3.3   |
    | 148     | SUDAN                                                | Gv2     |
    | 149     | TANZANIA                                             | Gv2     |
    [...]
    | 641     | SLOVAKIA                                             | Gv3.3   |
    | 641     | SLOVAKIA                                             | Gv2     |
    | 642     | SLOVENIA                                             | Gv3.3   |
    | 643     | SPAIN                                                | Gv3.3   |
    | 643     | SPAIN                                                | Gv2     |
    [...]
    +---------+------------------------------------------------------+---------+
    470 rows in set (0.01 sec)
    

    So we gained a “Territory” of France in Antarctic lands and a British overseas territory along with Slovenia in v3.3 (so wonder if Slovenia was in the data before but with the old country name?…)

    Now I’m going to try loading up v4 and seeing if I can figure out how to match these guys up…

  19. E.M.Smith says:

    Still a bit of work to do, but getting there… I didn’t put a V4 in version on that set, so it is showing NULL, but I’ve got all three versions loaded and can now get down to matching who’s a what and can I get a map of the country number to the country abreviation that works for all countries in all versions.

    I also need to settle on a caps choice. V4 has gone to lowercase…

    MariaDB [temps]> SELECT abrev,couname,country,version FROM country ORDER BY couname;
    +-------+----------------------------------------------------------+---------+---------+
    | abrev | couname                                                  | country | version |
    +-------+----------------------------------------------------------+---------+---------+
    |       | AFGHANISTAN                                              | 201     | Gv3.3   |
    | AF    | Afghanistan                                              |         | NULL    |
    |       | AFGHANISTAN                                              | 201     | Gv2     |
    |       | ALBANIA                                                  | 601     | Gv3.3   |
    |       | ALBANIA                                                  | 601     | Gv2     |
    | AL    | Albania                                                  |         | NULL    |
    | AG    | Algeria                                                  |         | NULL    |
    |       | ALGERIA                                                  | 101     | Gv3.3   |
    |       | ALGERIA                                                  | 101     | Gv2     |
    |       | AMERICAN SAMOA (U.S.A.)                                  | 521     | Gv3.3   |
    |       | AMERICAN SAMOA (U.S.A.)                                  | 521     | Gv2     |
    | AQ    | American Samoa [United States]                           |         | NULL    |
    |       | AMSTERDAM ISLAND (FRANCE)                                | 157     | Gv2     |
    |       | AMSTERDAM ISLAND (FRANCE)                                | 157     | Gv3.3   |
    |       | ANGOLA                                                   | 102     | Gv3.3   |
    [...]
    
  20. Bill in Oz says:

    Holy Fu*k E M !
    I have just read all this with awe and a good dose of perplexity. All power to your arm and I hope the coffee is good & strong !

    by the way Antartica is interesting. I think 9-10 countries made territorial claims there before 1950’s. The it was all put under an international treaty which ‘froze’ all territorial claims.. And put them to one side in the interests of scientific research. ( Seen as a problem to sort in the future ! )

    So there should be a lot meteorological records for Antarctica from Australia, NZ, Chile, Argentina, UK, France, Russia, USA, China, Japan, Netherlands, etc etc…dating a long way back in time… ( With the weather so freezing almost all the time they had lots of time to do weather station records ! )https://en.wikipedia.org/wiki/Research_stations_in_Antarctica

    A unified list of all of them might be useful in a climate sense – but extra work for you ! But I wonder whether they are listed as Antarctica or under the individual countries research bases.

  21. E.M.Smith says:

    Almost got it. Here’s the result of hand typing the country numbers into the file with the country as 2 letters. There’s a few countries without a number, so it looks like the abbreviation will be the key field, then two countries had two numbers (Russia and Kazakhstan IIRC) so that must be handled somehow.

    I need to do a QA run on it and see if some place just changed the name a little. I caught several like that already. 240 rows in 0.01 seconds to report it ;-)

    Here’s how it looks so far:

    MariaDB [temps]> source bin/country.sql
    +-------+------+---------+----------------------------------------------------------+
    | abrev | cnum | version | cname                                                    |
    +-------+------+---------+----------------------------------------------------------+
    | AF    | 201  | G4      | Afghanistan                                              |
    | AL    | 601  | G4      | Albania                                                  |
    | AG    | 101  | G4      | Algeria                                                  |
    | AQ    | 521  | G4      | American Samoa [United States]                           |
    | AO    | 102  | G4      | Angola                                                   |
    | AY    | 700  | G4      | Antarctica                                               |
    | AC    | 426  | G4      | Antigua and Barbuda                                      |
    | AR    | 301  | G4      | Argentina                                                |
    | AM    | 602  | G4      | Armenia                                                  |
    | AS    | 501  | G4      | Australia                                                |
    | AU    | 603  | G4      | Austria                                                  |
    | AJ    | 604  | G4      | Azerbaijan                                               |
    | BF    | 423  | G4      | Bahamas, The                                             |
    | BA    | 202  | G4      | Bahrain                                                  |
    | BG    | 203  | G4      | Bangladesh                                               |
    | BB    | 401  | G4      | Barbados                                                 |
    | BO    | 605  | G4      | Belarus                                                  |
    | BE    | 606  | G4      | Belgium                                                  |
    | BH    | 402  | G4      | Belize                                                   |
    | BN    | 103  | G4      | Benin                                                    |
    | BD    | 427  | G4      | Bermuda [United Kingdom]                                 |
    | BL    | 302  | G4      | Bolivia                                                  |
    | BK    | 607  | G4      | Bosnia and Herzegovina                                   |
    | BC    | 104  | G4      | Botswana                                                 |
    | BR    | 303  | G4      | Brazil                                                   |
    | IO    |      | G4      | British Indian Ocean Territory [United Kingdom]          |
    | BX    | 522  | G4      | Brunei                                                   |
    | BU    | 608  | G4      | Bulgaria                                                 |
    | UV    | 105  | G4      | Burkina Faso                                             |
    | BM    | 216  | G4      | Burma                                                    |
    | BY    | 106  | G4      | Burundi                                                  |
    | CB    | 204  | G4      | Cambodia                                                 |
    | CM    | 107  | G4      | Cameroon                                                 |
    | CA    | 403  | G4      | Canada                                                   |
    | CV    | 108  | G4      | Cape Verde                                               |
    | CJ    | 429  | G4      | Cayman Islands [United Kingdom]                          |
    | CT    | 109  | G4      | Central African Republic                                 |
    | CD    | 110  | G4      | Chad                                                     |
    | CI    | 304  | G4      | Chile                                                    |
    | CH    | 205  | G4      | China                                                    |
    | KT    | 523  | G4      | Christmas Island [Australia]                             |
    | CK    | 524  | G4      | Cocos (Keeling) Islands [Australia]                      |
    | CO    | 305  | G4      | Colombia                                                 |
    | CN    | 111  | G4      | Comoros                                                  |
    | CF    | 112  | G4      | Congo (Brazzaville)                                      |
    | CG    | 112  | G4      | Congo (Kinshasa)                                         |
    | CW    | 525  | G4      | Cook Islands [New Zealand]                               |
    | CS    | 405  | G4      | Costa Rica                                               |
    | IV    | 113  | G4      | Cote D'Ivoire                                            |
    | HR    | 609  | G4      | Croatia                                                  |
    | CU    | 406  | G4      | Cuba                                                     |
    | CY    | 610  | G4      | Cyprus                                                   |
    | EZ    | 611  | G4      | Czech Republic                                           |
    | DA    | 612  | G4      | Denmark                                                  |
    | DJ    | 114  | G4      | Dijibouti                                                |
    | DO    | 430  | G4      | Dominica                                                 |
    | DR    | 407  | G4      | Dominican Republic                                       |
    | EC    | 306  | G4      | Ecuador                                                  |
    | EG    | 115  | G4      | Egypt                                                    |
    | ES    | 408  | G4      | El Salvador                                              |
    | EK    |      | G4      | Equatorial Guinea                                        |
    | ER    | 116  | G4      | Eritrea                                                  |
    | EN    | 613  | G4      | Estonia                                                  |
    | ET    | 117  | G4      | Ethiopia                                                 |
    | EU    |      | G4      | Europa Island [France]                                   |
    | FK    | 316  | G4      | Falkland Islands (Islas Malvinas) [United Kingdom]       |
    | FM    | 527  | G4      | Federated States of Micronesia                           |
    | FJ    | 502  | G4      | Fiji                                                     |
    | FI    | 614  | G4      | Finland                                                  |
    | FR    | 615  | G4      | France                                                   |
    | FG    | 315  | G4      | French Guiana [France]                                   |
    | FP    | 528  | G4      | French Polynesia                                         |
    | FS    | 143  | G4      | French Southern and Antarctic Lands [France]             |
    | GB    | 118  | G4      | Gabon                                                    |
    | GA    | 150  | G4      | Gambia, The                                              |
    | GG    | 616  | G4      | Georgia                                                  |
    | GM    | 617  | G4      | Germany                                                  |
    | GH    | 119  | G4      | Ghana                                                    |
    | GI    | 653  | G4      | Gibraltar [United Kingdom]                               |
    | GR    | 618  | G4      | Greece                                                   |
    | GL    | 431  | G4      | Greenland [Denmark]                                      |
    | GJ    | 409  | G4      | Grenada                                                  |
    | GP    | 432  | G4      | Guadeloupe [France]                                      |
    | GQ    | 529  | G4      | Guam [United States]                                     |
    | GT    | 410  | G4      | Guatemala                                                |
    | GK    |      | G4      | Guernsey                                                 |
    | GV    | 120  | G4      | Guinea                                                   |
    | PU    | 121  | G4      | Guinea-Bissau                                            |
    | GY    | 307  | G4      | Guyana                                                   |
    | HA    | 411  | G4      | Haiti                                                    |
    | HO    | 412  | G4      | Honduras                                                 |
    | HK    |      | G4      | Hong Kong                                                |
    | HU    | 619  | G4      | Hungary                                                  |
    | IC    | 620  | G4      | Iceland                                                  |
    | IN    | 207  | G4      | India                                                    |
    | ID    | 503  | G4      | Indonesia                                                |
    | IR    | 208  | G4      | Iran                                                     |
    | IZ    | 209  | G4      | Iraq                                                     |
    | EI    | 621  | G4      | Ireland                                                  |
    | IM    |      | G4      | Isle of Man                                              |
    | IS    | 622  | G4      | Israel                                                   |
    | IT    | 623  | G4      | Italy                                                    |
    | JM    | 413  | G4      | Jamaica                                                  |
    | JN    |      | G4      | Jan Mayen [Norway]                                       |
    | JA    | 210  | G4      | Japan                                                    |
    | JE    |      | G4      | Jersey                                                   |
    | JQ    | 530  | G4      | Johnston Atoll [United States]                           |
    | JO    | 624  | G4      | Jordan                                                   |
    | JU    |      | G4      | Juan De Nova Island [France]                             |
    | KZ    | 211  | G4      | Kazakhstan                                               |
    | KZ    | 625  | G4      | Kazakhstan                                               |
    | KE    | 122  | G4      | Kenya                                                    |
    | KR    | 504  | G4      | Kiribati                                                 |
    | KN    |      | G4      | Korea, North                                             |
    | KS    | 221  | G4      | Korea, South                                             |
    | KU    | 212  | G4      | Kuwait                                                   |
    | KG    | 213  | G4      | Kyrgyzstan                                               |
    | LA    | 214  | G4      | Laos                                                     |
    | LG    | 626  | G4      | Latvia                                                   |
    | LE    | 627  | G4      | Lebanon                                                  |
    | LT    | 162  | G4      | Lesotho                                                  |
    | LI    | 123  | G4      | Liberia                                                  |
    | LY    | 124  | G4      | Libya                                                    |
    | LS    |      | G4      | Liechtenstein                                            |
    | LH    | 628  | G4      | Lithuania                                                |
    | LU    | 629  | G4      | Luxembourg                                               |
    | MC    | 234  | G4      | Macau S.A.R                                              |
    | MK    | 648  | G4      | Macedonia                                                |
    | MA    | 125  | G4      | Madagascar                                               |
    | MI    | 126  | G4      | Malawi                                                   |
    | MY    | 505  | G4      | Malaysia                                                 |
    | MV    | 235  | G4      | Maldives                                                 |
    | ML    | 127  | G4      | Mali                                                     |
    | MT    | 630  | G4      | Malta                                                    |
    | RM    | 531  | G4      | Marshall Islands                                         |
    | MB    | 433  | G4      | Martinique [France]                                      |
    | MR    | 128  | G4      | Mauritania                                               |
    | MP    | 129  | G4      | Mauritius                                                |
    | MF    | 163  | G4      | Mayotte [France]                                         |
    | MX    | 414  | G4      | Mexico                                                   |
    | MQ    |      | G4      | Midway Islands [United States}                           |
    | MD    | 631  | G4      | Moldova                                                  |
    | MG    | 215  | G4      | Mongolia                                                 |
    | MJ    | 632  | G4      | Montenegro                                               |
    | MH    |      | G4      | Montserrat                                               |
    | MO    | 130  | G4      | Morocco                                                  |
    | MZ    | 131  | G4      | Mozambique                                               |
    | WA    | 132  | G4      | Namibia                                                  |
    | NR    | 506  | G4      | Nauru                                                    |
    | NP    | 217  | G4      | Nepal                                                    |
    | NL    | 633  | G4      | Netherlands                                              |
    | NT    | 434  | G4      | Netherlands Antilles [Netherlands]                       |
    | NC    | 532  | G4      | New Caledonia [France]                                   |
    | NZ    | 507  | G4      | New Zealand                                              |
    | NU    | 415  | G4      | Nicaragua                                                |
    | NG    | 113  | G4      | Niger                                                    |
    | NI    | 134  | G4      | Nigeria                                                  |
    | NE    | 533  | G4      | Niue [New Zealand]                                       |
    | NF    | 534  | G4      | Norfolk Island [Australia]                               |
    | CQ    | 535  | G4      | Northern Mariana Islands [United States]                 |
    | NO    | 634  | G4      | Norway                                                   |
    | MU    | 218  | G4      | Oman                                                     |
    | PK    | 219  | G4      | Pakistan                                                 |
    | PS    |      | G4      | Palau                                                    |
    | LQ    |      | G4      | Palmyra Atoll [United States]                            |
    | PM    | 416  | G4      | Panama                                                   |
    | PP    | 508  | G4      | Papua New Guinea                                         |
    | PA    | 308  | G4      | Paraguay                                                 |
    | PE    | 309  | G4      | Peru                                                     |
    | RP    | 509  | G4      | Philippines                                              |
    | PC    | 537  | G4      | Pitcairn Islands [United Kingdom]                        |
    | PL    | 635  | G4      | Poland                                                   |
    | PO    | 636  | G4      | Portugal                                                 |
    | RQ    | 435  | G4      | Puerto Rico [United States]                              |
    | QA    | 220  | G4      | Qatar                                                    |
    | RE    | 165  | G4      | Reunion [France]                                         |
    | RO    | 637  | G4      | Romania                                                  |
    | RS    | 638  | G4      | Russia                                                   |
    | RS    | 222  | G4      | Russia                                                   |
    | RW    | 166  | G4      | Rwanda                                                   |
    | SH    |      | G4      | Saint Helena [United Kingdom]                            |
    | SC    | 417  | G4      | Saint Kitts and Nevis                                    |
    | ST    | 436  | G4      | Saint Lucia                                              |
    | SB    | 438  | G4      | Saint Pierre and Miquelon [France]                       |
    | VC    | 437  | G4      | Saint Vincent and the Grenadines                         |
    | WS    | 541  | G4      | Samoa                                                    |
    | TP    | 136  | G4      | Sao Tome and Principe                                    |
    | SA    | 223  | G4      | Saudi Arabia                                             |
    | SG    | 137  | G4      | Senegal                                                  |
    | RI    | 639  | G4      | Serbia                                                   |
    | SE    | 138  | G4      | Seychelles                                               |
    | SL    | 139  | G4      | Sierra Leone                                             |
    | SN    | 511  | G4      | Singapore                                                |
    | LO    | 641  | G4      | Slovakia                                                 |
    | SI    | 642  | G4      | Slovenia                                                 |
    | BP    | 512  | G4      | Solomon Islands                                          |
    | SO    | 140  | G4      | Somalia                                                  |
    | SF    | 141  | G4      | South Africa                                             |
    | SX    | 317  | G4      | South Georgia and the South Sandwich Islands [United Kin |
    | SP    | 643  | G4      | Spain                                                    |
    | CE    | 224  | G4      | Sri Lanka                                                |
    | SU    | 148  | G4      | Sudan                                                    |
    | NS    | 312  | G4      | Suriname                                                 |
    | SV    |      | G4      | Svalbard [Norway]                                        |
    | WZ    | 167  | G4      | Swaziland                                                |
    | SW    | 645  | G4      | Sweden                                                   |
    | SZ    | 646  | G4      | Switzerland                                              |
    | SY    | 647  | G4      | Syria                                                    |
    | TW    | 236  | G4      | Taiwan                                                   |
    | TI    | 227  | G4      | Tajikistan                                               |
    | TZ    | 149  | G4      | Tanzania                                                 |
    | TH    | 228  | G4      | Thailand                                                 |
    | TT    |      | G4      | Timor-Leste                                              |
    | TO    | 151  | G4      | Togo                                                     |
    | TL    | 538  | G4      | Tokelau [New Zealand]                                    |
    | TN    | 517  | G4      | Tonga                                                    |
    | TD    | 424  | G4      | Trinidad and Tobago                                      |
    | TE    | 168  | G4      | Tromelin Island [France]                                 |
    | TS    | 152  | G4      | Tunisia                                                  |
    | TU    | 649  | G4      | Turkey                                                   |
    | TX    | 229  | G4      | Turkmenistan                                             |
    | TV    | 518  | G4      | Tuvalu                                                   |
    | UG    | 153  | G4      | Uganda                                                   |
    | UP    | 650  | G4      | Ukraine                                                  |
    | AE    | 230  | G4      | United Arab Emirates                                     |
    | UK    | 651  | G4      | United Kingdom                                           |
    | US    | 425  | G4      | United States                                            |
    | XX    |      | G4      | Unknown                                                  |
    | UY    | 313  | G4      | Uruguay                                                  |
    | UZ    | 231  | G4      | Uzbekistan                                               |
    | NH    | 520  | G4      | Vanuatu                                                  |
    | VE    | 314  | G4      | Venezuela                                                |
    | VM    | 232  | G4      | Vietnam                                                  |
    | VQ    | 440  | G4      | Virgin Islands [United States]                           |
    | WQ    | 539  | G4      | Wake Island [United States]                              |
    | WF    | 540  | G4      | Wallis and Futuna [France]                               |
    | WI    | 169  | G4      | Western Sahara                                           |
    | YM    | 233  | G4      | Yemen                                                    |
    | ZA    | 155  | G4      | Zambia                                                   |
    | ZI    | 156  | G4      | Zimbabwe                                                 |
    +-------+------+---------+----------------------------------------------------------+
    240 rows in set (0.01 sec)
    
    
  22. Larry Ledwick says:

    Yes the list of Antarctic stations is long

    https://en.wikipedia.org/wiki/Research_stations_in_Antarctica

  23. E.M.Smith says:

    Antarctic data are reported under Antarctica as a 700 continent / country. See the 6th item listed in my just prior posting ;-)

  24. E.M.Smith says:

    Since V3 is a superset of V2 for countries (add 2 IIRC) I’m just going to load it for comparison. Here’s a sample of the report I’m going through now.

    MariaDB [temps]> source bin/country.sql
    +-------+------+---------+----------------------------------------------------------+
    | abrev | cnum | version | cname                                                    |
    +-------+------+---------+----------------------------------------------------------+
    | AF    | 201  | G4      | Afghanistan                                              |
    |       | 201  | Gv3.3   | AFGHANISTAN                                              |
    | AL    | 601  | G4      | Albania                                                  |
    |       | 601  | Gv3.3   | ALBANIA                                                  |
    |       | 101  | Gv3.3   | ALGERIA                                                  |
    | AG    | 101  | G4      | Algeria                                                  |
    |       | 521  | Gv3.3   | AMERICAN SAMOA (U.S.A.)                                  |
    | AQ    | 521  | G4      | American Samoa [United States]                           |
    |       | 157  | Gv3.3   | AMSTERDAM ISLAND (FRANCE)                                |
    |       | 102  | Gv3.3   | ANGOLA                                                   |
    | AO    | 102  | G4      | Angola                                                   |
    | AY    | 700  | G4      | Antarctica                                               |
    |       | 700  | Gv3.3   | ANTARCTICA                                               |
    | AC    | 426  | G4      | Antigua and Barbuda                                      |
    

    Basically, for anything with a similar name, I’m looking to see if there is a chance one with no number and one with no abreviation are really the same one… Then for any others (like Amsterdam Island) I’ll be hitting the Wiki to see if it has some other name that just happens to be listed in v4 instead…

    Basically skipping all the matched sets of 2, and just picking out the leftovers to research. Then there will be a “final load” with a key field and I’ll have my Country Mapping pretty much done between the versions. There is an XX abbreviation defined as “unknown” so at the end of this I’ll likely use it for any V3 leftovers without an assignment. If any.

  25. A C Osborn says:

    E.M.Smith says: 19 March 2019 at 5:33 pm
    “One wonders how much this “homogenizing” the data accounts for the loss of low excursions and the “warming” seen when the tops don’t rise but low going excursions “go away”… ”

    Nail Hammer Head on hit.
    They have literally removed all Weather from the Climate, all those well known weather events like Lake Cooling, Foehns etc need no longer apply to be included in this super dataset.
    And guess what they hardly ever add to the Temps, they practically always reduce the past temps.
    20,000 stations adjusted, just wow!

    They really have made it difficult to compare V1, V2, V3 and especially V4.

  26. A C Osborn says:

    Could they be using the site 9999 as a code for do not include in Global Calculations?

  27. E.M.Smith says:

    @A C Osborn:

    Most anything can be used as a “omit” flag, but it would be pretty obvious. IMHO there’s been much more “dumb stuff” than “malicious and deceptive” stuff. So I’d expect that for some source of records there was no elevation available and that’s what got stuck in. I mean, after all, using Google Maps is Soooooo…. hard 8-{ /sarc;

    Just as an FYI to others:

    I really really hate it when folks have their own “special” definition of fundamental words. Like “all”. What part of “all” was unclear? Apparently some part was…

    There’s this nice feature in SQL that lets you dump things to a file. I wanted to dump my loaded country set so it would be easy to use an editor to pick out the lines that needed some R&D for a final ccode or whatever…

    chiefio@PiM3Devuan2:~/SQL/bin$ cat dcountry.sql 
    SELECT cnum,abrev,cname,region,version,ascension
    INTO OUTFILE 'Country.list'
    FROM country
    GROUP BY cname
    ORDER BY cname
    ;
    

    But it gives:

    MariaDB [temps]> source bin/dcountry.sql
    ERROR 1045 (28000) at line 1 in file: 'bin/dcountry.sql': Access denied for user 'chiefio'@'localhost' (using password: YES)
    MariaDB [temps]> 
    

    After a too long a time digging around, knowing I’d done a ‘GRANT ALL’ on permissions, I ran into this factoid:

        GRANT ALL does not imply GRANT FILE
        GRANT FILE only works with *.*
    

    from here:
    https://stackoverflow.com/questions/6091427/mysql-into-outfile-access-denied-but-my-user-has-all-access-and-the-fold

    OK….

    So “GRANT ALL” does not grant all, it is really “GRANT UNKNOWN SUBSET THEN GO FISH”.

    Well GRANT FILE doesn’t seem to be letting me do it either, so I think I’m stuck in Security Hell and need to drop out to the security level and come back with full on root privs and oh, never mind… it will be easier to just write them down with pen and paper….

    Security Paranoia can be a good thing in a full on industrial setting, but not for the Home Gamer… on a Raspberry Pi… with only one user… inside 2 layers of firewall… with public data…

    I’ll revisit the whole issue of how to actually write output to a file some other day… though, IMHO, having a database system where BY DEFAULT you can’t write output reports to a file, is “broken by design” at the point of shipment. And certainly so when “GRANT ALL” permissions doesn’t grant the ability to write out reports…

  28. E.M.Smith says:

    Because I can paste it here (but for some reason vi doesn’t want to take the paste) I’m going to be publishing this to the world instead of putting it in a private temp file… all because MySQL/MariaDB won’t let me write a private file without pain. Gee, that “feature” sure is keeping the data from leaking out…

    I’m cutting out those records that are done, only leaving those where they need a cnum but are new so there is none; and those with a number but I’ve not matched them to a current abbreviation.

    One I can see right off the top is that I’ve left the Koreas in as they have a name change to sort out. Some others I’m not sure about ( I thought I’d caught “Bahamas, The” vs “The Bahamas” so need to check that one. Others are just WT? country is that and what was it then?

    Then the additional “countries” (many dependent islands): It looks like the “old” number system started with a continent number then just sequentially numbered around. I’m thinking I’ll start at 99 and count down so they don’t collide, but it will have a number even if no data exists for it. So, for example, some island off in the Pacific that wasn’t in the set before would be 599 and the next one 598, etc.

    That just leaves problems like Russia that has two numbers but only one abbreviation. Keep that? Squash it? Hmmm…. and places like CEUTA (SPAIN) with no abbreviation – is it a name change or what? Or maybe just assign it to Spain? So some geography and history digging needed.

    Anyone with ideas, toss ’em out there. There will be NO clean solution to this, only “good enough” and I’m open to suggestions and especially things like “Dog’s Breakfast Africa is now named King’s Paradise Democratic Socialist Republic Of Climate Change”…

    MariaDB [temps]> source bin/rcountry.sql
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    | cnum | abrev | cname                                                    | region | version | ascension |
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    | 521  |       | AMERICAN SAMOA (U.S.A.)                                  | 5      | Gv3.3   | 7Sept2015 |
    | 157  |       | AMSTERDAM ISLAND (FRANCE)                                | 1      | Gv3.3   | 7Sept2015 |
    | 701  |       | ARGENTINE BASE IN ANTARCTICA                             | 7      | Gv3.3   | 7Sept2015 |
    | 158  |       | ASCENSION ISLAND (U.K.)                                  | 1      | Gv3.3   | 7Sept2015 |
    | 536  |       | BELAU                                                    | 5      | Gv3.3   | 7Sept2015 |
    |      | IO    | British Indian Ocean Territory [United Kingdom]          |        | G4      | NULL      |
    | 147  |       | BRITISH OVERSEAS TERRITORIES                             | 1      | Gv3.3   | 7Sept2015 |
    | 428  |       | BRITISH VIRGIN ISLANDS (U.K.)                            | 4      | Gv3.3   | 7Sept2015 |
    | 159  |       | CANARY ISLANDS (SPAIN)                                   | 1      | Gv3.3   | 7Sept2015 |
    | 160  |       | CEUTA (SPAIN)                                            | 1      | Gv3.3   | 7Sept2015 |
    | 161  |       | CHAGOS ARCHIPELAGO (U.K.)                                | 1      | Gv3.3   | 7Sept2015 |
    | 112  |       | CONGO                                                    | 1      | Gv3.3   | 7Sept2015 |
    | 112  | CF    | Congo (Brazzaville)                                      |        | G4      | NULL      |
    | 112  | CG    | Congo (Kinshasa)                                         |        | G4      | NULL      |
    | 526  |       | CORAL SEA ISLANDS (AUSTRALIA)                            | 5      | Gv3.3   | 7Sept2015 |
    | 206  |       | DEMOCRATIC PEOPLE'S REPUBLIC OF KOREA                    | 2      | Gv3.3   | 7Sept2015 |
    |      | EK    | Equatorial Guinea                                        |        | G4      | NULL      |
    |      | EU    | Europa Island [France]                                   |        | G4      | NULL      |
    | 652  |       | FAROE ISLANDS (DENMARK)                                  | 6      | Gv3.3   | 7Sept2015 |
    |      | GK    | Guernsey                                                 |        | G4      | NULL      |
    |      | HK    | Hong Kong                                                |        | G4      | NULL      |
    |      | IM    | Isle of Man                                              |        | G4      | NULL      |
    |      | JN    | Jan Mayen [Norway]                                       |        | G4      | NULL      |
    |      | JE    | Jersey                                                   |        | G4      | NULL      |
    |      | JU    | Juan De Nova Island [France]                             |        | G4      | NULL      |
    |      | KN    | Korea, North                                             |        | G4      | NULL      |
    | 221  | KS    | Korea, South                                             |        | G4      | NULL      |
    |      | LS    | Liechtenstein                                            |        | G4      | NULL      |
    | 654  |       | MADEIRA ISLANDS (PORTUGAL)                               | 6      | Gv3.3   | 7Sept2015 |
    | 164  |       | MELILLA (SPAIN)                                          | 1      | Gv3.3   | 7Sept2015 |
    |      | MQ    | Midway Islands [United States}                           |        | G4      | NULL      |
    |      | MH    | Montserrat                                               |        | G4      | NULL      |
    | 216  |       | MYANMAR                                                  | 2      | Gv3.3   | 7Sept2015 |
    |      | PS    | Palau                                                    |        | G4      | NULL      |
    |      | LQ    | Palmyra Atoll [United States]                            |        | G4      | NULL      |
    | 221  |       | REPUBLIC OF KOREA                                        | 2      | Gv3.3   | 7Sept2015 |
    | 222  | RS    | Russia                                                   |        | G4      | NULL      |
    | 222  |       | RUSSIAN FEDERATION (ASIAN SECTOR)                        | 2      | Gv3.3   | 7Sept2015 |
    | 638  |       | RUSSIAN FEDERATION (EUROPEAN SECTOR)                     | 6      | Gv3.3   | 7Sept2015 |
    |      | SH    | Saint Helena [United Kingdom]                            |        | G4      | NULL      |
    | 800  |       | SHIP STATIONS                                            | 8      | Gv3.3   | 7Sept2015 |
    |      | SV    | Svalbard [Norway]                                        |        | G4      | NULL      |
    | 143  |       | TERRITORY OF THE FRENCH SOUTHERN AND ANTARCTIC LANDS     | 1      | Gv3.3   | 7Sept2015 |
    | 423  |       | THE BAHAMAS                                              | 4      | Gv3.3   | 7Sept2015 |
    | 150  |       | THE GAMBIA                                               | 1      | Gv3.3   | 7Sept2015 |
    |      | TT    | Timor-Leste                                              |        | G4      | NULL      |
    | 439  |       | TURKS AND CAICOS ISLANDS                                 | 4      | Gv3.3   | 7Sept2015 |
    |      | XX    | Unknown                                                  |        | G4      | NULL      |
    | 154  |       | ZAIRE                                                    | 1      | Gv3.3   | 7Sept2015 |
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    298 rows in set (0.01 sec)
    MariaDB [temps]>
  29. Larry Ledwick says:

    EM I just talked to my head dba and he thinks the issue is here:

    INTO OUTFILE ‘Country.list’

    He suspects it is trying to dump that file in the home directory of mysql so you need to do something like this with a fully qualified path name where your user has permission to write out files.

    INTO OUTFILE ‘/myhomedirectory/Country.list’

  30. E.M.Smith says:

    OK, I’ve put in an ascension date for G4, and I’ve found where North Korea was the D.R. of Korea and I’ve gone and assigned numbers in the x99-x90 range for FIPS countries without a number (as they will not be assigning one anyway and that is an unused range that’s easy to spot or filter if desired – so for each one I looked up what continent they were on, or near for all those islands, and assigned one.

    I assigned the 900 block to the XX code as it is unused in the Gv1-3 sets. As ships are also unused in v4 (but 800) I’ll be giving it XX as the FIPS code.

    As I run into information (like if a given island WMO number and LAT / LONG match it to a different cnum in the past) I may change it if appropriate.

    That just leaves 17 “question” records From V3 not in V4 or where the name has changed and I need to do some more matching. They are the ones with a blank FIPS code abbreviation. (I’ve also made cnum a key now, since all of v4 has one in my modified load file). These are stations where I need to find an FIPS code, assign them XX, or realize they were some other country then and have a FIPS abbreviation now and I just didn’t make the match.

    So here’s the size of the space now (I’ll leave in a couple of x99ish so you can see what they look like). The warnings on the load of v3 countries is just telling me I already claimed that cnum value with my v4 table load, which is by design.

    I’ve left in “British Indian Ocean Territory” and BRITISH OVERSEAS TERRITORIES until I can figure out what is in each and are they maybe the same? It may also be that the batch of islands scattered about was in the “Overseas” bit and have gotten un-sorted…

    There’s also some Zaire / Congo & All The Little Guineas sorting and validation needed… You would think that with only a few hundred countries we could given them all unique names, but Nooo…. everyone has to reuse parts…

    MariaDB [temps]> source bin/LC4.sql;
    Query OK, 236 rows affected, 6 warnings (0.82 sec)   
    Records: 240  Deleted: 0  Skipped: 4  Warnings: 6
    [...]
    MariaDB [temps]> source bin/LC3.sql
    Query OK, 17 rows affected, 220 warnings (0.45 sec)  
    Records: 237  Deleted: 0  Skipped: 220  Warnings: 220
    [...]
    MariaDB [temps]> source bin/rcountry.sql
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    | cnum | abrev | cname                                                    | region | version | ascension |
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    | 157  |       | AMSTERDAM ISLAND (FRANCE)                                | 1      | Gv3.3   | 7Sept2015 |
    | 701  |       | ARGENTINE BASE IN ANTARCTICA                             | 7      | Gv3.3   | 7Sept2015 |
    | 158  |       | ASCENSION ISLAND (U.K.)                                  | 1      | Gv3.3   | 7Sept2015 |
    | 536  |       | BELAU                                                    | 5      | Gv3.3   | 7Sept2015 |
    | 298  | IO    | British Indian Ocean Territory [United Kingdom]          |        | Gv4     | 17MAR2019 |
    | 147  |       | BRITISH OVERSEAS TERRITORIES                             | 1      | Gv3.3   | 7Sept2015 |
    | 428  |       | BRITISH VIRGIN ISLANDS (U.K.)                            | 4      | Gv3.3   | 7Sept2015 |
    | 159  |       | CANARY ISLANDS (SPAIN)                                   | 1      | Gv3.3   | 7Sept2015 |
    | 160  |       | CEUTA (SPAIN)                                            | 1      | Gv3.3   | 7Sept2015 |
    | 161  |       | CHAGOS ARCHIPELAGO (U.K.)                                | 1      | Gv3.3   | 7Sept2015 |
    | 526  |       | CORAL SEA ISLANDS (AUSTRALIA)                            | 5      | Gv3.3   | 7Sept2015 |
    | 199  | EK    | Equatorial Guinea                                        |        | Gv4     | 17MAR2019 |
    | 198  | EU    | Europa Island [France]                                   |        | Gv4     | 17MAR2019 |
    | 652  |       | FAROE ISLANDS (DENMARK)                                  | 6      | Gv3.3   | 7Sept2015 |
    | 696  | LS    | Liechtenstein                                            |        | Gv4     | 17MAR2019 |
    | 654  |       | MADEIRA ISLANDS (PORTUGAL)                               | 6      | Gv3.3   | 7Sept2015 |
    | 164  |       | MELILLA (SPAIN)                                          | 1      | Gv3.3   | 7Sept2015 |
    | 800  |       | SHIP STATIONS                                            | 8      | Gv3.3   | 7Sept2015 |
    | 900  | XX    | Unknown                                                  |        | Gv4     | 17MAR2019 |
    | 154  |       | ZAIRE                                                    | 1      | Gv3.3   | 7Sept2015 |
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    251 rows in set (0.01 sec)
    :

    So basically just need to figure out the FIPS abbreviation for those 17 with a blank, or decide to stick an XX in them and be done.

  31. E.M.Smith says:
    chiefio@PiM3Devuan2:~/SQL/bin$ ls -ld /tmp
    drwxrwxrwx 8 root root 157 Mar 20 19:40 /tmp
    [...]
    chiefio@PiM3Devuan2:~/SQL/bin$ cat dcountry.sql 
    SELECT cnum,abrev,cname,region,version,ascension
    INTO OUTFILE '/tmp/Country.list'
    FROM country
    GROUP BY cname
    ORDER BY cname
    ;
    [...]
    MariaDB [temps]> source bin/dcountry.sql
    ERROR 1045 (28000) at line 1 in file: 'bin/dcountry.sql': Access denied for user 'chiefio'@'localhost' (using password: YES)
    MariaDB [temps]>
    

    So “nice try”, but no…

    (I’d done something like that earlier when I then had to run off and find that “GRANT ALL” does not grant ALL… As I know I did a GRANT ALL and have not done a GRANT FILE I’m fairly certain that’s the issue… but having worked around it with a cut/paste here I’m not going to go down the permission black hole today…)

  32. E.M.Smith says:

    One Down – from the wiki:

    Zaire (/zɑːˈɪər/), officially the Republic of Zaire (French: République du Zaïre; French pronunciation: ​[za.iʁ]), was the name of a sovereign state between 1971 and 1997 in Central Africa that is now known as Democratic Republic of the Congo.
    […]
    Capital Kinshasa

    So it is the “Congo (Kinshasa)” now.

  33. E.M.Smith says:

    Looks like several of these are formally part of a major country, so likely now counted in it and under that FIPS code. Mellia, Madiera, Faroe Isl., Coral Sea Isl., I’ve checked so far, so I need to give them that country FIPS number.

    Then “Chagos” starts to answer that British Indian territories thing. Again from the Wiki::

    The Chagos Archipelago (/ˈtʃɑːɡəs, -ɡoʊs/) or Chagos Islands (formerly the Bassas de Chagas, and later the Oil Islands) are a group of seven atolls comprising more than 60 individual tropical islands in the Indian Ocean about 500 kilometres (310 mi) south of the Maldives archipelago. This chain of islands is the southernmost archipelago of the Chagos-Laccadive Ridge, a long submarine mountain range in the Indian Ocean.

    Officially a colonized territory by the United Kingdom, as part of its British Indian Ocean Territory,

    So looks like that BIOT may have been divided into a couple of parts, though that wiki just talks about Chagos / DIego Garcia too:

    https://en.wikipedia.org/wiki/British_Indian_Ocean_Territory

    So for now I’m just going to assume those two are an identity and match them.

  34. Bill in Oz says:

    E M I have never heard any reference to “Coral Sea Islands” before here in Oz. There are hundreds of islands in the Gra Barrier Reef but in the Coral sea ? Nah.”I wonder also how Lord Howe Island and Norfolk Island stand in this set up. Tey are both small islands in the Tasman Sea between New Zealand & Oz but territorially part of Australia.

  35. E.M.Smith says:

    Looks like British Virgin Islands are technically part of Britain, so the same FIPS (and British Citizens and so also EU citizens… at least for another week ;-)

    BELAU is now PALAU, so that x90s code can be replaced with the 536 code…

    We’re getting there, one gratuitous change at a time…

    https://en.wikipedia.org/wiki/Ascension_Island

    Ascension Island is an isolated volcanic island, 7°56′ south of the Equator in the South Atlantic Ocean. It is about 1,600 kilometres (1,000 mi) from the coast of Africa and 2,250 kilometres (1,400 mi) from the coast of Brazil. It is governed as part of the British Overseas Territory of Saint Helena, Ascension and Tristan da Cunha, of which the main island, Saint Helena, is around 1,300 kilometres (800 mi) to the southeast.

    So that looks to match the British Overseas Territory line… Except that the two have different cnum values and no FIPS codes.

    The Argentine Antarctic will be in the antarctic group somewhere so I’ll need to match them, then we have, from the wiki:

    Île Amsterdam (French pronunciation: ​[ilamstɛʁdam]), also known as Amsterdam Island, New Amsterdam, or Nouvelle Amsterdam, is an island of the French Southern and Antarctic Lands in the southern Indian Ocean that together with neighbouring Île Saint-Paul 85 km (53 mi) to the south forms one of the five districts of the territory.

    Which is entirely unclear. Looks sort of like their own country, but has the French anthem and the French were found of just making bits of the globe ‘France’ and moving on…So that one needs a bit more.

  36. E.M.Smith says:

    @Bill in Oz:

    Well, there is reality and then there is what NOAA / WMO do with thermometer records. My interest is in sorting the thermometer records into reasonably “like” buckets for comparison, so if I find some tropical islands now coded with Britain, but separate in the past, I’ve got to find some way to deal with that. This is complicated by there being lots of things in one set and just gone in the other…

    So my goal is to get them sorted “close enough” for comparisons, but not worry too much about the details of actual ownership.

    Realize that v1,2 &3 were geographic oriented sets (first digit continent) while v4 is political entity oriented (that FIPS Nation Code). As nations are volatile and mutable things, I think that a mistake…

    (Were I doing it, each station would get a LAT/LONG digit assigned and that wold be that… but since their LAT / LONG records have been dodgy at times and they didn’t keep the past record of moves so the LAT LONG now may not be what it was 40 years ago nor what it will be tomorrow, that’s got issues too).

    So what I’m doing right now is just unscrambling what can be unscrambled, and marking the rest as “a bit odd” with the x9x numbers and the XX FIPS abbreviations. That ought to leave me with at most a half dozen thermometers that could be a bit wrong in their assignment into some average (geographic region or country /political entity) and that’s not going to change things enough to matter for gross comparisons.

    IF then, that lets me unscramble the WMO changes, then I can get to One Pure Number per station and then make another pass through the data structure with that for the final preening. Or not if it doesn’t… and just use it as “close enough”…

    Clearly the folks managing the thermometers and the temperature record “have issues” with not understanding databases, data structure, archiving, etc. etc. Way too much gratuitous change going on and it smells greatly of “How can I justify next years pork allotment budget increase if I don’t change things?”

  37. E.M.Smith says:

    OK, in theory I’m ready to run with this one. The Region (or continent) field is still blank for most of the entries (but I’m pretty sure I can fix that pretty quick and it isn’t needed yet).

    It looks like about 14 lines left with some kind of XX flag record override or similar issue to think about, but it ought to be good enough to use to get the basic shakedown of the rest of the Gv4 load done and first fire evaluated.

    So with that, I’m done on the country stuff for a while and next posting ought to be about temperatures and thermometers by LAT LONG and such.

    Look it over. If you see anything daft or wrong in it, holler. I “moved fast” on this one and may have made assumptions that are not valid ;-)

    The first three things are just showing a new build of the country table and loading it first with the v4 data (that has had the cnum manually typed in for those records into the load file) and then the load of the v3.3 data (where it can not override the v4 data, only add to it the records that are not in v4 / where they don’t match). Those ought to be the ones with XX abbreviations.

    MariaDB [temps]> source tables/country
    Query OK, 0 rows affected (0.19 sec)
    [...]
    MariaDB [temps]> source bin/LC4.sql
    Query OK, 238 rows affected, 4 warnings (0.17 sec)   
    Records: 240  Deleted: 0  Skipped: 2  Warnings: 4
    [...]
    MariaDB [temps]> source bin/LC3.sql
    Query OK, 14 rows affected, 223 warnings (0.01 sec)  
    Records: 237  Deleted: 0  Skipped: 223  Warnings: 223
    
    MariaDB [temps]> source bin/rcountry.sql
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    | cnum | abrev | cname                                                    | region | version | ascension |
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    | 201  | AF    | Afghanistan                                              |        | Gv4     | 17MAR2019 |
    | 601  | AL    | Albania                                                  |        | Gv4     | 17MAR2019 |
    | 101  | AG    | Algeria                                                  |        | Gv4     | 17MAR2019 |
    | 521  | AQ    | American Samoa [United States]                           |        | Gv4     | 17MAR2019 |
    | 157  | XX    | AMSTERDAM ISLAND (FRANCE)                                | 1      | Gv3.3   | 7Sept2015 |
    | 102  | AO    | Angola                                                   |        | Gv4     | 17MAR2019 |
    | 700  | AY    | Antarctica                                               |        | Gv4     | 17MAR2019 |
    | 426  | AC    | Antigua and Barbuda                                      |        | Gv4     | 17MAR2019 |
    | 301  | AR    | Argentina                                                |        | Gv4     | 17MAR2019 |
    | 701  | XX    | ARGENTINE BASE IN ANTARCTICA                             | 7      | Gv3.3   | 7Sept2015 |
    | 602  | AM    | Armenia                                                  |        | Gv4     | 17MAR2019 |
    | 158  | XX    | ASCENSION ISLAND (U.K.)                                  | 1      | Gv3.3   | 7Sept2015 |
    | 501  | AS    | Australia                                                |        | Gv4     | 17MAR2019 |
    | 603  | AU    | Austria                                                  |        | Gv4     | 17MAR2019 |
    | 604  | AJ    | Azerbaijan                                               |        | Gv4     | 17MAR2019 |
    | 423  | BF    | Bahamas, The                                             |        | Gv4     | 17MAR2019 |
    | 202  | BA    | Bahrain                                                  |        | Gv4     | 17MAR2019 |
    | 203  | BG    | Bangladesh                                               |        | Gv4     | 17MAR2019 |
    | 401  | BB    | Barbados                                                 |        | Gv4     | 17MAR2019 |
    | 605  | BO    | Belarus                                                  |        | Gv4     | 17MAR2019 |
    | 606  | BE    | Belgium                                                  |        | Gv4     | 17MAR2019 |
    | 402  | BH    | Belize                                                   |        | Gv4     | 17MAR2019 |
    | 103  | BN    | Benin                                                    |        | Gv4     | 17MAR2019 |
    | 427  | BD    | Bermuda [United Kingdom]                                 |        | Gv4     | 17MAR2019 |
    | 302  | BL    | Bolivia                                                  |        | Gv4     | 17MAR2019 |
    | 607  | BK    | Bosnia and Herzegovina                                   |        | Gv4     | 17MAR2019 |
    | 104  | BC    | Botswana                                                 |        | Gv4     | 17MAR2019 |
    | 303  | BR    | Brazil                                                   |        | Gv4     | 17MAR2019 |
    | 161  | IO    | British Indian Ocean Territory [United Kingdom]          |        | Gv4     | 17MAR2019 |
    | 147  | XX    | BRITISH OVERSEAS TERRITORIES                             | 1      | Gv3.3   | 7Sept2015 |
    | 428  | XX    | BRITISH VIRGIN ISLANDS (U.K.)                            | 4      | Gv3.3   | 7Sept2015 |
    | 522  | BX    | Brunei                                                   |        | Gv4     | 17MAR2019 |
    | 608  | BU    | Bulgaria                                                 |        | Gv4     | 17MAR2019 |
    | 105  | UV    | Burkina Faso                                             |        | Gv4     | 17MAR2019 |
    | 216  | BM    | Burma                                                    |        | Gv4     | 17MAR2019 |
    | 106  | BY    | Burundi                                                  |        | Gv4     | 17MAR2019 |
    | 204  | CB    | Cambodia                                                 |        | Gv4     | 17MAR2019 |
    | 107  | CM    | Cameroon                                                 |        | Gv4     | 17MAR2019 |
    | 403  | CA    | Canada                                                   |        | Gv4     | 17MAR2019 |
    | 159  | XX    | CANARY ISLANDS (SPAIN)                                   | 1      | Gv3.3   | 7Sept2015 |
    | 108  | CV    | Cape Verde                                               |        | Gv4     | 17MAR2019 |
    | 429  | CJ    | Cayman Islands [United Kingdom]                          |        | Gv4     | 17MAR2019 |
    | 109  | CT    | Central African Republic                                 |        | Gv4     | 17MAR2019 |
    | 160  | XX    | CEUTA (SPAIN)                                            | 1      | Gv3.3   | 7Sept2015 |
    | 110  | CD    | Chad                                                     |        | Gv4     | 17MAR2019 |
    | 304  | CI    | Chile                                                    |        | Gv4     | 17MAR2019 |
    | 205  | CH    | China                                                    |        | Gv4     | 17MAR2019 |
    | 523  | KT    | Christmas Island [Australia]                             |        | Gv4     | 17MAR2019 |
    | 524  | CK    | Cocos (Keeling) Islands [Australia]                      |        | Gv4     | 17MAR2019 |
    | 305  | CO    | Colombia                                                 |        | Gv4     | 17MAR2019 |
    | 111  | CN    | Comoros                                                  |        | Gv4     | 17MAR2019 |
    | 112  | CF    | Congo (Brazzaville)                                      |        | Gv4     | 17MAR2019 |
    | 154  | CG    | Congo (Kinshasa)                                         |        | Gv4     | 17MAR2019 |
    | 525  | CW    | Cook Islands [New Zealand]                               |        | Gv4     | 17MAR2019 |
    | 526  | AS    | CORAL SEA ISLANDS (AUSTRALIA)                            | 5      | Gv3.3   | 7Sept2015 |
    | 405  | CS    | Costa Rica                                               |        | Gv4     | 17MAR2019 |
    | 113  | IV    | Cote D'Ivoire                                            |        | Gv4     | 17MAR2019 |
    | 609  | HR    | Croatia                                                  |        | Gv4     | 17MAR2019 |
    | 406  | CU    | Cuba                                                     |        | Gv4     | 17MAR2019 |
    | 610  | CY    | Cyprus                                                   |        | Gv4     | 17MAR2019 |
    | 611  | EZ    | Czech Republic                                           |        | Gv4     | 17MAR2019 |
    | 612  | DA    | Denmark                                                  |        | Gv4     | 17MAR2019 |
    | 114  | DJ    | Dijibouti                                                |        | Gv4     | 17MAR2019 |
    | 430  | DO    | Dominica                                                 |        | Gv4     | 17MAR2019 |
    | 407  | DR    | Dominican Republic                                       |        | Gv4     | 17MAR2019 |
    | 306  | EC    | Ecuador                                                  |        | Gv4     | 17MAR2019 |
    | 115  | EG    | Egypt                                                    |        | Gv4     | 17MAR2019 |
    | 408  | ES    | El Salvador                                              |        | Gv4     | 17MAR2019 |
    | 199  | EK    | Equatorial Guinea                                        |        | Gv4     | 17MAR2019 |
    | 116  | ER    | Eritrea                                                  |        | Gv4     | 17MAR2019 |
    | 613  | EN    | Estonia                                                  |        | Gv4     | 17MAR2019 |
    | 117  | ET    | Ethiopia                                                 |        | Gv4     | 17MAR2019 |
    | 198  | EU    | Europa Island [France]                                   |        | Gv4     | 17MAR2019 |
    | 316  | FK    | Falkland Islands (Islas Malvinas) [United Kingdom]       |        | Gv4     | 17MAR2019 |
    | 652  | DE    | FAROE ISLANDS (DENMARK)                                  | 6      | Gv3.3   | 7Sept2015 |
    | 527  | FM    | Federated States of Micronesia                           |        | Gv4     | 17MAR2019 |
    | 502  | FJ    | Fiji                                                     |        | Gv4     | 17MAR2019 |
    | 614  | FI    | Finland                                                  |        | Gv4     | 17MAR2019 |
    | 615  | FR    | France                                                   |        | Gv4     | 17MAR2019 |
    | 315  | FG    | French Guiana [France]                                   |        | Gv4     | 17MAR2019 |
    | 528  | FP    | French Polynesia                                         |        | Gv4     | 17MAR2019 |
    | 143  | FS    | French Southern and Antarctic Lands [France]             |        | Gv4     | 17MAR2019 |
    | 118  | GB    | Gabon                                                    |        | Gv4     | 17MAR2019 |
    | 150  | GA    | Gambia, The                                              |        | Gv4     | 17MAR2019 |
    | 616  | GG    | Georgia                                                  |        | Gv4     | 17MAR2019 |
    | 617  | GM    | Germany                                                  |        | Gv4     | 17MAR2019 |
    | 119  | GH    | Ghana                                                    |        | Gv4     | 17MAR2019 |
    | 653  | GI    | Gibraltar [United Kingdom]                               |        | Gv4     | 17MAR2019 |
    | 618  | GR    | Greece                                                   |        | Gv4     | 17MAR2019 |
    | 431  | GL    | Greenland [Denmark]                                      |        | Gv4     | 17MAR2019 |
    | 409  | GJ    | Grenada                                                  |        | Gv4     | 17MAR2019 |
    | 432  | GP    | Guadeloupe [France]                                      |        | Gv4     | 17MAR2019 |
    | 529  | GQ    | Guam [United States]                                     |        | Gv4     | 17MAR2019 |
    | 410  | GT    | Guatemala                                                |        | Gv4     | 17MAR2019 |
    | 699  | GK    | Guernsey                                                 |        | Gv4     | 17MAR2019 |
    | 120  | GV    | Guinea                                                   |        | Gv4     | 17MAR2019 |
    | 121  | PU    | Guinea-Bissau                                            |        | Gv4     | 17MAR2019 |
    | 307  | GY    | Guyana                                                   |        | Gv4     | 17MAR2019 |
    | 411  | HA    | Haiti                                                    |        | Gv4     | 17MAR2019 |
    | 412  | HO    | Honduras                                                 |        | Gv4     | 17MAR2019 |
    | 299  | HK    | Hong Kong                                                |        | Gv4     | 17MAR2019 |
    | 619  | HU    | Hungary                                                  |        | Gv4     | 17MAR2019 |
    | 620  | IC    | Iceland                                                  |        | Gv4     | 17MAR2019 |
    | 207  | IN    | India                                                    |        | Gv4     | 17MAR2019 |
    | 503  | ID    | Indonesia                                                |        | Gv4     | 17MAR2019 |
    | 208  | IR    | Iran                                                     |        | Gv4     | 17MAR2019 |
    | 209  | IZ    | Iraq                                                     |        | Gv4     | 17MAR2019 |
    | 621  | EI    | Ireland                                                  |        | Gv4     | 17MAR2019 |
    | 698  | IM    | Isle of Man                                              |        | Gv4     | 17MAR2019 |
    | 622  | IS    | Israel                                                   |        | Gv4     | 17MAR2019 |
    | 623  | IT    | Italy                                                    |        | Gv4     | 17MAR2019 |
    | 413  | JM    | Jamaica                                                  |        | Gv4     | 17MAR2019 |
    | 697  | JN    | Jan Mayen [Norway]                                       |        | Gv4     | 17MAR2019 |
    | 210  | JA    | Japan                                                    |        | Gv4     | 17MAR2019 |
    | 530  | JQ    | Johnston Atoll [United States]                           |        | Gv4     | 17MAR2019 |
    | 624  | JO    | Jordan                                                   |        | Gv4     | 17MAR2019 |
    | 197  | JU    | Juan De Nova Island [France]                             |        | Gv4     | 17MAR2019 |
    | 211  | KZ    | Kazakhstan                                               |        | Gv4     | 17MAR2019 |
    | 122  | KE    | Kenya                                                    |        | Gv4     | 17MAR2019 |
    | 504  | KR    | Kiribati                                                 |        | Gv4     | 17MAR2019 |
    | 206  | KN    | Korea, North                                             |        | Gv4     | 17MAR2019 |
    | 221  | KS    | Korea, South                                             |        | Gv4     | 17MAR2019 |
    | 212  | KU    | Kuwait                                                   |        | Gv4     | 17MAR2019 |
    | 213  | KG    | Kyrgyzstan                                               |        | Gv4     | 17MAR2019 |
    | 214  | LA    | Laos                                                     |        | Gv4     | 17MAR2019 |
    | 626  | LG    | Latvia                                                   |        | Gv4     | 17MAR2019 |
    | 627  | LE    | Lebanon                                                  |        | Gv4     | 17MAR2019 |
    | 162  | LT    | Lesotho                                                  |        | Gv4     | 17MAR2019 |
    | 123  | LI    | Liberia                                                  |        | Gv4     | 17MAR2019 |
    | 124  | LY    | Libya                                                    |        | Gv4     | 17MAR2019 |
    | 696  | LS    | Liechtenstein                                            |        | Gv4     | 17MAR2019 |
    | 628  | LH    | Lithuania                                                |        | Gv4     | 17MAR2019 |
    | 629  | LU    | Luxembourg                                               |        | Gv4     | 17MAR2019 |
    | 234  | MC    | Macau S.A.R                                              |        | Gv4     | 17MAR2019 |
    | 648  | MK    | Macedonia                                                |        | Gv4     | 17MAR2019 |
    | 125  | MA    | Madagascar                                               |        | Gv4     | 17MAR2019 |
    | 654  | PO    | MADEIRA ISLANDS (PORTUGAL)                               | 6      | Gv3.3   | 7Sept2015 |
    | 126  | MI    | Malawi                                                   |        | Gv4     | 17MAR2019 |
    | 505  | MY    | Malaysia                                                 |        | Gv4     | 17MAR2019 |
    | 235  | MV    | Maldives                                                 |        | Gv4     | 17MAR2019 |
    | 127  | ML    | Mali                                                     |        | Gv4     | 17MAR2019 |
    | 630  | MT    | Malta                                                    |        | Gv4     | 17MAR2019 |
    | 531  | RM    | Marshall Islands                                         |        | Gv4     | 17MAR2019 |
    | 433  | MB    | Martinique [France]                                      |        | Gv4     | 17MAR2019 |
    | 128  | MR    | Mauritania                                               |        | Gv4     | 17MAR2019 |
    | 129  | MP    | Mauritius                                                |        | Gv4     | 17MAR2019 |
    | 163  | MF    | Mayotte [France]                                         |        | Gv4     | 17MAR2019 |
    | 164  | ES    | MELILLA (SPAIN)                                          | 1      | Gv3.3   | 7Sept2015 |
    | 414  | MX    | Mexico                                                   |        | Gv4     | 17MAR2019 |
    | 598  | MQ    | Midway Islands [United States}                           |        | Gv4     | 17MAR2019 |
    | 631  | MD    | Moldova                                                  |        | Gv4     | 17MAR2019 |
    | 215  | MG    | Mongolia                                                 |        | Gv4     | 17MAR2019 |
    | 632  | MJ    | Montenegro                                               |        | Gv4     | 17MAR2019 |
    | 695  | MH    | Montserrat                                               |        | Gv4     | 17MAR2019 |
    | 130  | MO    | Morocco                                                  |        | Gv4     | 17MAR2019 |
    | 131  | MZ    | Mozambique                                               |        | Gv4     | 17MAR2019 |
    | 132  | WA    | Namibia                                                  |        | Gv4     | 17MAR2019 |
    | 506  | NR    | Nauru                                                    |        | Gv4     | 17MAR2019 |
    | 217  | NP    | Nepal                                                    |        | Gv4     | 17MAR2019 |
    | 633  | NL    | Netherlands                                              |        | Gv4     | 17MAR2019 |
    | 434  | NT    | Netherlands Antilles [Netherlands]                       |        | Gv4     | 17MAR2019 |
    | 532  | NC    | New Caledonia [France]                                   |        | Gv4     | 17MAR2019 |
    | 507  | NZ    | New Zealand                                              |        | Gv4     | 17MAR2019 |
    | 415  | NU    | Nicaragua                                                |        | Gv4     | 17MAR2019 |
    | 133  | XX    | NIGER                                                    | 1      | Gv3.3   | 7Sept2015 |
    | 134  | NI    | Nigeria                                                  |        | Gv4     | 17MAR2019 |
    | 533  | NE    | Niue [New Zealand]                                       |        | Gv4     | 17MAR2019 |
    | 534  | NF    | Norfolk Island [Australia]                               |        | Gv4     | 17MAR2019 |
    | 535  | CQ    | Northern Mariana Islands [United States]                 |        | Gv4     | 17MAR2019 |
    | 634  | NO    | Norway                                                   |        | Gv4     | 17MAR2019 |
    | 218  | MU    | Oman                                                     |        | Gv4     | 17MAR2019 |
    | 219  | PK    | Pakistan                                                 |        | Gv4     | 17MAR2019 |
    | 536  | PS    | Palau                                                    |        | Gv4     | 17MAR2019 |
    | 599  | LQ    | Palmyra Atoll [United States]                            |        | Gv4     | 17MAR2019 |
    | 416  | PM    | Panama                                                   |        | Gv4     | 17MAR2019 |
    | 508  | PP    | Papua New Guinea                                         |        | Gv4     | 17MAR2019 |
    | 308  | PA    | Paraguay                                                 |        | Gv4     | 17MAR2019 |
    | 309  | PE    | Peru                                                     |        | Gv4     | 17MAR2019 |
    | 509  | RP    | Philippines                                              |        | Gv4     | 17MAR2019 |
    | 537  | PC    | Pitcairn Islands [United Kingdom]                        |        | Gv4     | 17MAR2019 |
    | 635  | PL    | Poland                                                   |        | Gv4     | 17MAR2019 |
    | 636  | PO    | Portugal                                                 |        | Gv4     | 17MAR2019 |
    | 435  | RQ    | Puerto Rico [United States]                              |        | Gv4     | 17MAR2019 |
    | 220  | QA    | Qatar                                                    |        | Gv4     | 17MAR2019 |
    | 165  | RE    | Reunion [France]                                         |        | Gv4     | 17MAR2019 |
    | 637  | RO    | Romania                                                  |        | Gv4     | 17MAR2019 |
    | 222  | RS    | Russia                                                   |        | Gv4     | 17MAR2019 |
    | 166  | RW    | Rwanda                                                   |        | Gv4     | 17MAR2019 |
    | 298  | SH    | Saint Helena [United Kingdom]                            |        | Gv4     | 17MAR2019 |
    | 417  | SC    | Saint Kitts and Nevis                                    |        | Gv4     | 17MAR2019 |
    | 436  | ST    | Saint Lucia                                              |        | Gv4     | 17MAR2019 |
    | 438  | SB    | Saint Pierre and Miquelon [France]                       |        | Gv4     | 17MAR2019 |
    | 437  | VC    | Saint Vincent and the Grenadines                         |        | Gv4     | 17MAR2019 |
    | 541  | WS    | Samoa                                                    |        | Gv4     | 17MAR2019 |
    | 136  | TP    | Sao Tome and Principe                                    |        | Gv4     | 17MAR2019 |
    | 223  | SA    | Saudi Arabia                                             |        | Gv4     | 17MAR2019 |
    | 137  | SG    | Senegal                                                  |        | Gv4     | 17MAR2019 |
    | 639  | RI    | Serbia                                                   |        | Gv4     | 17MAR2019 |
    | 138  | SE    | Seychelles                                               |        | Gv4     | 17MAR2019 |
    | 800  | XX    | SHIP STATIONS                                            | 8      | Gv3.3   | 7Sept2015 |
    | 139  | SL    | Sierra Leone                                             |        | Gv4     | 17MAR2019 |
    | 511  | SN    | Singapore                                                |        | Gv4     | 17MAR2019 |
    | 641  | LO    | Slovakia                                                 |        | Gv4     | 17MAR2019 |
    | 642  | SI    | Slovenia                                                 |        | Gv4     | 17MAR2019 |
    | 512  | BP    | Solomon Islands                                          |        | Gv4     | 17MAR2019 |
    | 140  | SO    | Somalia                                                  |        | Gv4     | 17MAR2019 |
    | 141  | SF    | South Africa                                             |        | Gv4     | 17MAR2019 |
    | 317  | SX    | South Georgia and the South Sandwich Islands [United Kin |        | Gv4     | 17MAR2019 |
    | 643  | SP    | Spain                                                    |        | Gv4     | 17MAR2019 |
    | 224  | CE    | Sri Lanka                                                |        | Gv4     | 17MAR2019 |
    | 148  | SU    | Sudan                                                    |        | Gv4     | 17MAR2019 |
    | 312  | NS    | Suriname                                                 |        | Gv4     | 17MAR2019 |
    | 694  | SV    | Svalbard [Norway]                                        |        | Gv4     | 17MAR2019 |
    | 167  | WZ    | Swaziland                                                |        | Gv4     | 17MAR2019 |
    | 645  | SW    | Sweden                                                   |        | Gv4     | 17MAR2019 |
    | 646  | SZ    | Switzerland                                              |        | Gv4     | 17MAR2019 |
    | 647  | SY    | Syria                                                    |        | Gv4     | 17MAR2019 |
    | 236  | TW    | Taiwan                                                   |        | Gv4     | 17MAR2019 |
    | 227  | TI    | Tajikistan                                               |        | Gv4     | 17MAR2019 |
    | 149  | TZ    | Tanzania                                                 |        | Gv4     | 17MAR2019 |
    | 228  | TH    | Thailand                                                 |        | Gv4     | 17MAR2019 |
    | 597  | TT    | Timor-Leste                                              |        | Gv4     | 17MAR2019 |
    | 151  | TO    | Togo                                                     |        | Gv4     | 17MAR2019 |
    | 538  | TL    | Tokelau [New Zealand]                                    |        | Gv4     | 17MAR2019 |
    | 517  | TN    | Tonga                                                    |        | Gv4     | 17MAR2019 |
    | 424  | TD    | Trinidad and Tobago                                      |        | Gv4     | 17MAR2019 |
    | 168  | TE    | Tromelin Island [France]                                 |        | Gv4     | 17MAR2019 |
    | 152  | TS    | Tunisia                                                  |        | Gv4     | 17MAR2019 |
    | 649  | TU    | Turkey                                                   |        | Gv4     | 17MAR2019 |
    | 229  | TX    | Turkmenistan                                             |        | Gv4     | 17MAR2019 |
    | 439  | XX    | TURKS AND CAICOS ISLANDS                                 | 4      | Gv3.3   | 7Sept2015 |
    | 518  | TV    | Tuvalu                                                   |        | Gv4     | 17MAR2019 |
    | 153  | UG    | Uganda                                                   |        | Gv4     | 17MAR2019 |
    | 650  | UP    | Ukraine                                                  |        | Gv4     | 17MAR2019 |
    | 230  | AE    | United Arab Emirates                                     |        | Gv4     | 17MAR2019 |
    | 651  | UK    | United Kingdom                                           |        | Gv4     | 17MAR2019 |
    | 425  | US    | United States                                            |        | Gv4     | 17MAR2019 |
    | 900  | XX    | Unknown                                                  |        | Gv4     | 17MAR2019 |
    | 313  | UY    | Uruguay                                                  |        | Gv4     | 17MAR2019 |
    | 231  | UZ    | Uzbekistan                                               |        | Gv4     | 17MAR2019 |
    | 520  | NH    | Vanuatu                                                  |        | Gv4     | 17MAR2019 |
    | 314  | VE    | Venezuela                                                |        | Gv4     | 17MAR2019 |
    | 232  | VM    | Vietnam                                                  |        | Gv4     | 17MAR2019 |
    | 440  | VQ    | Virgin Islands [United States]                           |        | Gv4     | 17MAR2019 |
    | 539  | WQ    | Wake Island [United States]                              |        | Gv4     | 17MAR2019 |
    | 540  | WF    | Wallis and Futuna [France]                               |        | Gv4     | 17MAR2019 |
    | 169  | WI    | Western Sahara                                           |        | Gv4     | 17MAR2019 |
    | 233  | YM    | Yemen                                                    |        | Gv4     | 17MAR2019 |
    | 155  | ZA    | Zambia                                                   |        | Gv4     | 17MAR2019 |
    | 156  | ZI    | Zimbabwe                                                 |        | Gv4     | 17MAR2019 |
    +------+-------+----------------------------------------------------------+--------+---------+-----------+
    
  38. Bill in Oz says:

    This looks pretty comprehensive EM.

    And as fortune has it, there have been no major changes of territorial boundaries recently ~ 1945.

    Going through it I noticed one oddity though.

    The Channel islands off the coast of Normandy are not listed. ( A Crown territory which is technically not part of the UK- no MP’s in parliament for example. A weird British thing !! ! ) But Guernsey Island which is a part of the Channel Islands is listed. But that means that Jersey which is bigger & has far more people is omitted. Small but puzzling.

  39. Bill in Oz says:

    A PS : re Panama… How is data re the Panama Canal territory formerly controlled by the USA, treated ? Now it is all part of Panama..But prior to 1990’s was weather data from this area treated as part of the USA set ?

  40. E.M.Smith says:

    That is odd… I know Jersey is in the initial data load from GHCNv4…

    chiefio@PiM3Devuan2:~/SQL/v4$ grep Jersey country-tabs 
    698	JE	Jersey
    

    So it ought to have survived the install of GHCN v3.3….

    chiefio@PiM3Devuan2:~/SQL/v4$ cd ../v3
    chiefio@PiM3Devuan2:~/SQL/v3$ grep JERS country_tabs 
    chiefio@PiM3Devuan2:~/SQL/v3$ grep 689 country_tabs 
    

    As there is nothing there to over write it… (and v4 ought to dominate anyway)…

    Ah, there’s my mistake. Assigned the same number to two places:

    chiefio@PiM3Devuan2:~/SQL/v4$ grep 698 country-tabs 
    698	IM	Isle of Man
    698	JE	Jersey
    

    So Isle of Man beat Jersey (and it wasn’t even football season ;-)

    Thanks for the catch! I think I’ll do a report of how many of what exist…

  41. E.M.Smith says:

    I’ll bet it’s the “Skipped 2” part of the messages I ignored and that I’ve got another one:

    MariaDB [temps]> source bin/LC4.sql
    Query OK, 238 rows affected, 4 warnings (0.17 sec)   
    Records: 240  Deleted: 0  Skipped: 2  Warnings: 4
    
  42. E.M.Smith says:

    Well I changed 698 to 697 and tried again, got the same warning count for the same reason ;-) (AND I was certain I checked that 697 wasn’t in use but maybe the find needed to wrap back to the top first…)

    MariaDB [temps]> show warnings;
    +---------+------+----------------------------------------------+
    | Level   | Code | Message                                      |
    +---------+------+----------------------------------------------+
    | Warning | 1364 | Field 'region' doesn't have a default value  |
    | Warning | 1062 | Duplicate entry '697' for key 'PRIMARY'      |
    | Warning | 1062 | Duplicate entry '113' for key 'PRIMARY'      |
    | Warning | 1265 | Data truncated for column 'cname' at row 202 |
    

    Yeah, I know, I need to add region to all the v4 stuff. That is what I was going to do when I sat down…

    Then looks like I have two entries with duplicate primary keys. 113? Hmmm…

    chiefio@PiM3Devuan2:~/SQL/v4$ grep 113 country-tabs 
    113	IV	Cote D'Ivoire 
    113	NG	Niger 
    

    Well that one wasn’t one I assigned. Wonder if I typo’d it… I’d not be surprised at one mis-key out of a couple of hundred…

    chiefio@PiM3Devuan2:~/SQL/v4$ grep 697 country-tabs 
    697	JE	Jersey
    697	JN	Jan Mayen [Norway] 
    

    OK, so I moved Jerset right on top of Jan Mayen…. Got it.

    Note To Self: Do “show warnings” after running an SQL program….

    Once again, thanks for the eyeballs / QA catch!

    UPDATE:

    Looks like Niger is supposed to be 133 and I doubled the wrong digit…

    chiefio@PiM3Devuan2:~/SQL/v3$ grep NIGER country_tabs 
    133	XX	1	NIGER                                                   	Gv3.3	7Sept2015 
    134	XX	1	NIGERIA                                                 	Gv3.3	7Sept2015 
    

    So I’ve got two fixes and a re-run, then back to QA land…

  43. E.M.Smith says:
    MariaDB [temps]> drop table country
        -> ;
    Query OK, 0 rows affected (1.03 sec)
    
    MariaDB [temps]> source tables/country;
    Query OK, 0 rows affected (0.13 sec)
    
    MariaDB [temps]> source bin/LC4.sql
    Query OK, 240 rows affected, 1 warning (0.13 sec)    
    Records: 240  Deleted: 0  Skipped: 0  Warnings: 1
    
    MariaDB [temps]> show warnings;
    +---------+------+----------------------------------------------+
    | Level   | Code | Message                                      |
    +---------+------+----------------------------------------------+
    | Warning | 1265 | Data truncated for column 'cname' at row 202 |
    +---------+------+----------------------------------------------+
    1 row in set (0.00 sec)
    

    And that warming is for

    MariaDB [temps]> SELECT cnum, abrev,cname 
        -> FROM country WHERE abrev='SX';
    +------+-------+----------------------------------------------------------+
    | cnum | abrev | cname                                                    |
    +------+-------+----------------------------------------------------------+
    | 317  | SX    | South Georgia and the South Sandwich Islands [United Kin |
    +------+-------+----------------------------------------------------------+
    1 row in set (0.00 sec)
    

    Where the input file has “Kingdom”… so 4 more letters. I’m pretty sure the docs said 56 CHAR (it isn’t the kind of number I would make up…) but I can bump it to 60 and not lose any sleep…

    I’d not be at all surprised to find out either they didn’t follow their own docs or that the field “just grew”..

  44. E.M.Smith says:

    Well, to convert the v4 data to be full of tabs and ready to load:

    chiefio@PiM3Devuan2:~/SQL/v4$ time ./stationdat >temps.out
    
    real	7m25.760s
    user	6m50.644s
    sys	0m33.937s
    

    7 1/2 minutes… Not bad for the Little Pi That Could ;-)

    Since the way I went at this is a bit on the simple-but-daft side, it duplicates the basic station and reading data for each year 12 times making the month line of data to load. That blows up the size of the input to about 1.2 GB from about 162 MB…..

    -rw-r--r-- 1 chiefio chiefio  162513448 Mar 21 02:51 temps.in
    -rw-r--r-- 1 chiefio chiefio 1260880200 Mar 21 03:05 temps.out
    

    Then again, I’m doing this on a 500 GB disk that’s the smallest one I’ve got and it has 18% used… so it isn’t like I care about using the bits… Then, when the load is done, the database has very little redundancy in it. It is on a 1 TB disk so really doesn’t care ;-)

    Being on 2 different disks means the “loading” process can just linear read one and write into the other without a lot of head seeks. Still, at USB 2.0 speed and on a Pi M3 I’m likely pushing it. We’ll see when the load finishes. Then I’ll do a report or two and we’ll find out if this equipment is “enough machine” or do I need to go upscale to a larger SBC.

    OTOH, it’s kind of fun to tweak the nose of all the “Must Have New Supercomputer!!!” guys by doing things on minimal hardware ;-)

    When I first ported GIStemp, I deliberately used an old “White Box PC” (with a Pentium class CPU IIRC) that had started life as a 486 and still has the 5 inch floppy drive. Later I found out some folks were snickering about the crapy hardware… completely missing the joke on them of “if it runs on an old WB PC why do you need a $50 Million Super Computer, eh?” Oh Well, personal jokes are for your own enjoyment, not that of others…

    But the truth is that I’ve always been a minimalist. As a Database Consultant my specialty was efficiency reviews precisely because I like running in the minimal resources possible. So there’s a certain Haiku kind of charm / challenge in doing this on a system anyone in the world could buy. Who knows, maybe I’ll inspire some kid in Congo to get into this Climate Skeptic Kitchen Science thing ;-) I’d like that…

    Aaaannnd… I knew if I typed long enough it would finish!

    MariaDB [temps]> source bin/temps4L.sql
    Query OK, 16809258 rows affected, 50 warnings (22 min 20.32 sec)
    Records: 16809258  Deleted: 0  Skipped: 0  Warnings: 50
    

    So 22 minutes to load the temperature data table…

    | Warning | 1262 | Row 1071847 was truncated; it contained more data than there were input columns |
    | Warning | 1265 | Data truncated for column 'source' at row 1073732                               |
    | Warning | 1262 | Row 1073732 was truncated; it contained more data than there were input columns |
    | Warning | 1265 | Data truncated for column 'source' at row 1082193                               |
    | Warning | 1262 | Row 1082193 was truncated; it contained more data than there were input columns |
    +---------+------+---------------------------------------------------------------------------------+
    50 rows in set (0.06 sec)
    
    MariaDB [temps]> 
    

    Sporadic issues with the “source” column. OK, I’ll have to check on that. (May mean another data load… at just a few minutes that wasn’t an issue. At max 3 / hour, that’s a bother.) As “source” is the very last column, that could indicate a length issue on some records, or just an alignment issue. “We’ll see” with some reports & checks.

    Looks like most of it is fine. I did a spot check on “Albania” (figuring they might not have too much data…) and got over 3000 rows:

    | ALM00013629 | AL    | 13629 | 2012 | TAVG | Gv4     |  NOV  |   8.89 |
    | ALM00013629 | AL    | 13629 | 2012 | TAVG | Gv4     |  OCT  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2012 | TAVG | Gv4     | JULY  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2012 | TAVG | Gv4     | JUNE  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2012 | TAVG | Gv4     | SEPT  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  APR  |  11.67 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  AUG  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  DEC  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  FEB  |   4.00 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  JAN  |   2.39 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  MAR  |   6.42 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  MAY  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  NOV  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     |  OCT  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     | JULY  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     | JUNE  | -99.99 |
    | ALM00013629 | AL    | 13629 | 2013 | TAVG | Gv4     | SEPT  | -99.99 |
    +-------------+-------+-------+------+------+---------+-------+--------+
    3204 rows in set (1 min 45.76 sec)
    
    MariaDB [temps]> 
    

    Station ID looks right. AL country abbreviation is right. WMO picked out the 5 used digits. Then year is clearly properly aligned. So are the type of data, the version, month, and the floating point temperature as C (even the -99.99 missing data flags have their digits right).

    OK, I can live with that. I’m going to proceed with using this load to test / devo / debug some of the same reports I did on v3.3 while I look to stamp out those warnings. Since I don’t use the source flag for anything, if it is missing a beat I won’t care. It does repeat for each month, so hopefully I’ve not got any alignment issue. (Comparing some known temperatures will tell).

    For now, I’m happy. I’ll likely add an index on that Abbreviation field as it will be used in searches and selections. Still at a bit under 2 minutes to work the whole data set, not too bad.

    So “first fire” on a Version 4 temperature data report looks good and done!

    Whoot!!

    (And yeah, I kinda like Albania… in a Potsylvania kind of way ;-)

  45. E.M.Smith says:

    It would seem that if you suck a 1.2 GB file into VI to look at the data and see where it’s not aligned right, on a Pi with 1 GB of memory, you get 600+ MB of swap used… a few minutes later when your keyboard and mouse respond once again ;-)

    root@PiM3Devuan2:/SG500/xfs/chiefio/SQL/v3# swapon -s
    Filename				Type		Size	Used	Priority
    /dev/sda2                              	partition	1048572	0	64
    /dev/sda11                             	partition	2117628	652700	1024
    /dev/sdb2                              	partition	1048572	0	128
    

    Most I’ve ever seen the Pi use…

    I have a habit, from many years standing, of putting a swap partition on all my disks equal to memory size. As I have both 1 and 2 GB SBCs, some are 1, some are 2. In this case, I have both a 1 and a 2 on the same disk. That’s my system disk AND my home directory. The two swaps are on opposite ends of the disk. I change which is high vs low priority depending on when disk activity will be nearest the system area or my home directory area… Minimize long head seeks during swap…. The other disk is the one with the database on it. So right now priority is set to mostly use the swap partition near where I do all that input data munging, and not go to the other end of the disk unless desperate. Avoiding the DB disk, but using it if I run over 2 GB of swap (that ought never to happen)….

    But yeah, a R. Pi using 1/2 GB+ of swap ;-) Now that’s fun ;-)

  46. p.g.sharrow says:

    I am Delighted that this is being done on the Ras Pi 3 with Devuan2 by an old guy at his kitchen table! 8-) HURRAH ! quite a feat Maestro. Brains and experience trumps money and official connections. Maybe some the day real science advancement will again be the purview of Amateurs…pg…

  47. E.M.Smith says:

    @P.G.:

    Glad I can be the bringer of delight! While it is a minor purpose (meaning I’d toss a pile of cash at this if that was what it required as I think it is that important) one of my guiding lights is that whole “minimalist” thing. I just love using small hardware to do big jobs.

    I think it is the Amish roots. When I was 8 my Dad had use tear down and rebuild the “garage”. It had been a horse & buggy barn at the start of life, and used rough sawn REAL 2 x 4 s and square nails. We took it apart, poured a cement floor, and then put it back together. One of my jobs was to straighten the square nails to reuse them…. All the lumber was stacked and then bad bits replaced, but mostly going back where it came from.

    I learned a lot about minimal resources…

    Something about the Great Depression and all got soaked into my bones…

    So, like everyone, I covet the latest, greatest, fastest Whizzz BANG!… but what really make me satisfied is doing the most with as close to nothing as possible ;-) Like cooking a meal for 4 at 50 ¢ / person using a camp stove and 2 pots…. Just heaven ;-) Yeah, I’m that kind of strange… I’m happiest cooking a full dinner in a dutch oven over charcoal on the patio with vegetables from the garden. What was bought? A chunk of meat. Fuel being tree trimmings…

    Dinner tonight was what is left when the roast chicken can give no more sandwiches…. So pan drippings, bones, and maybe 1/2 pound of meat bits. Add a big onion, a few carrots, about 5 inches of celery (leaf end that many folks toss), salt, pepper, handful of lentils, sprinkle of French Seasoning mix and garlic… then a couple of bouillon cubes and in the last 10 minutes some “soup stars”. All up I doubt I’ve got $2 in it. We’ll have about 6 meals of a fine chicken and vegetable stew… I like it more than dinner out.

    Saw an interesting YouTube last night. Minimal “Small Home” made in a 6 x 8 “cargo” trailer. Found myself wanting one… Camp stove, miniature fridge, built in bed, TV on the wall. All up the guy had about $4000 in it built from new. It looked “just enough” and no more. Light enough I think the Subaru could tow it. Now if only I went anywhere that justified it ;-)

    There’s something about “Just enough and no more” that is curiously attractive… An optimization point.

  48. E.M.Smith says:

    FWIW, looking at the lines that gave a warning, they look to have a \ in the “source” position perhaps as an extra value.

    I’m pretty sure this is a “Dirty Data” issue. I have no use for a backslash in a field of “source” so I don’t mind it got truncated. I may “fix it” later, but not on any kind of priority…

  49. Pingback: GHCN v4 – First Graphs | Musings from the Chiefio

  50. Bob Koss says:

    Just saw this thread. I did some occasional voluntary work (2 years) on v4 data sending them corrections to erroneous data. Stopped when they went to beta 2 and suddenly added 1338 stations and removed 638 others which didn’t look bad to me. Got disgusted.

    Here is a really ugly one. It is co-located with three other IDs with entirely good data in the 1973 to 1993. This is because they merged in data from a couple 1000 km north.
    CA006092920 45.8833 -82.5667 189 GORE_BAY_CLIMATE
    Told them about it and a couple other similar stations and they told me they have confidence their adjustments would take care of it. Of course, due to the extreme latitude difference the shape of the annual temperature cycle is much different for about the first 20 years from the rest of the record even after adjustment, so it screws up the monthly anomalies.

    There is still a lot of erroneous data in the dbase e.g. Very first ID in inventory is incorrect because the data comes from Sweden and not Antigua and Barbuda. I told them 1-2 years ago, but it is still there. I bitched about them using only exact matches for qc ‘E’ flag because they couldn’t even copy the data from the original data files into v4 without having occasional differences of 1-2 100ths. They then expanded their duplicate range to +-0.015, but it still doesn’t nearly account for all cross-station duplicates. Here are a couple stations which should have been flagged in many years. Different math usage gave them different results.
    ASN00004028 -20.1078 119.5811 9 PARDOO_STATION
    ASN00004032 -20.3725 118.6317 6.4 PORT_HEDLAND_AIRPORT

    elevation 9999.0 is missing
    3rd character of ID is a particular network within the country if not zero.
    If ID char. 3-5 are XLT it is a record assembled from found data. Don’t know whether the rest of the number is newly minted or if they have an actual reference. A lot of Japan is XLT with many duplicates.

    I’ll try to get back later today or tomorrow. If you have any questions I might be able to answer them.

  51. bitchilly says:

    From the above post by Bob Koss “so it screws up the monthly anomalies.” It would be interesting to know in which direction. I hope screws up was just a figure of speech, though my cynical side says in reality if probably isn’t.

Anything to say?

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.