GHCN Country Codes V1, V2, V3

A very boring posting, so don’t get your expectations up.

I’m working on a comparison of GHCN v1, v2, and v3 versions of the temperature data. Yes, I know, NCDC has removed the v1 and v2 copies from their web site (leaving one to wonder what they have to hide… or where they’ve hid it). But as I’ve got my own copies saved from “back when” that’s not an impediment at all. ;-) If anyone else needs a copy of the v1 or v2 data, let me know and for a minor donation I’ll make a CD and mail it.

A larger impediment is their tendency to change things for no good reason. (Almost like they are trying to deliberately make it hard for anyone to do comparisons or check up on them… or just sloppy. Pick one.) Things like changing all the country codes from v1 to v2. So I’ve got a fair amount of data matching and translation tables to make. Not decided yet if I’m going to suck it into Open Office (likely) or stick with the FORTRAN where I’ve got a fair number of tools built, but it’s a PITA to do improvisational reporting stuff. I’ll likely try loading a couple of tables into O.O. and see how robust the database features might be, at least for the ‘translation’ to a common comparable data set.

This particular “file” is the “Country Code” file, made from the country code file from each of v1, v2, and v3.

Turns out there were a lot of countries came and went, so figuring out what temperature goes where is a bit complicated. There were also a few “screwing arounds” with what to do about islands and “dependencies”. So we see things like Yugoslavia break up into a bunch of countries, along with the USSR. But we also see a bunch of Islands that did not change status being moved about from “dependencies” to separate country codes. Likely to suit someone’s sense of what is P.C.

My first actual data compare will be to pick some reasonably stable country codes out of this mush and just do a ‘quick compare’ to see what might have changed; or if more comparisons are ‘worth it’. ( I’d started to do that and found the country code / WMO composite key all messed up between v1 and the rest, thus this file).

Why post this here? Well, first off, so y’all can see what I’m doing and not wonder what happened to me. Second, so folks who are interested can see ‘how things changed’. Finally, it gives me a kind of archive if my laptop suddenly got lost.

With that, the file:

The first three fields are the Country Code from v1, v2, and v3 respectively. The description is mostly from v2, but v3 seems to match. When v1 is very different, I’ve prepended the v1 name or a short form of it, then a slash, then the v2 name. In some cases I’ve added a note in parenthesis (such as (split Yugo) for it being split off of Yugoslavia). When you see a number range with a dash, that is from the v1 file and I think it’s the WMO number range in that ‘country’. (Now I need to pick stations from the geographies that changed, then see if the WMO numbers map correctly. So, for example, pick a Hong Kong station and see if that WMO is in country code Hong Kong in v1, and China in v2 and v3.)

Country codes of “000” do not exist in that particular version Country Code file. For those that looked like a split of a prior country, or where it was in a broad “dependent Islands” bucket, I’ve gone ahead and put in the CC that I think belongs. All of those need a cross check too.

And yes, for the terminally curious, I typed the v1 and v3 country codes by hand and did the cut / paste of names by hand. It was quicker than trying to do it automated and come up with a decent QA method. (Might have been faster with Unix / Linux tools, but I’m doing this on the PC laptop for now. The Linux box is shut down at the moment and hard to get set up quickly. Besides, this box has Open Office on it and I’m interested in trying to do a data load to its database system.)

Part of what I find interesting in looking at the file is just how much the country codes gratuitously change from v1 to v2. Another part is just how unstable the notion of “country” is in our modern world.

One curious loose end is Liechtenstein. It is 613 in v1, then simply evaporates. Last time I looked it still existed. Might be interesting to see if they have a very stable thermometer or one with dropping temperatures ;-)

106 101 101 ALGERIA                                 
144 102 102 ANGOLA                                  
138 103 103 BENIN                                   
152 104 104 BOTSWANA                                
141 105 105 BURKINA FASO                            
130 106 106 BURUNDI                                 
136 107 107 CAMEROON                                
101 108 108 CAPE VERDE                              
133 109 109 CENTRAL AFRICAN REPUBLIC                
134 110 110 CHAD                                    
145 111 111 COMOROS                                 
131 112 112 CONGO                                   
142 113 113 IVORY COAST / COTE D'IVOIRE
122 114 114 DJIBOUTI                                
119 115 115 EGYPT                                   
121 116 116 ERITREA                   (Ethiopia?)               
121 117 117 ETHIOPIA                                
132 118 118 GABON                                   
140 119 119 GHANA                                   
114 120 120 GUINEA                                  
113 121 121 GUINEA-BISSAU              
135 000 000 EQUATORIAL GUINEA      
124 122 122 KENYA                                   
143 123 123 LIBERIA                                 
118 124 124 LIBYA                                   
146 125 125 MADAGASCAR                              
149 126 126 MALAWI                                  
109 127 127 MALI                                    
110 128 128 MAURITANIA                              
117 129 129 MAURITIUS                               
103 130 130 MOROCCO                                 
147 131 131 MOZAMBIQUE                              
151 132 132 NAMIBIA                                 
108 133 133 NIGER                                   
137 134 134 NIGERIA                                 
116 136 136 SAO TOME AND PRINCIPE                   
111 137 137 SENEGAL                                 
127 138 138 SEYCHELLES                              
115 139 139 SIERRA LEONE                            
123 140 140 SOMALIA                                 
153 141 141 SOUTH AFRICA   
000 000 143 TERRITORY OF THE FRENCH SOUTHERN AND ANTARCTIC LANDS          
000 000 147 BRITISH OVERSEAS TERRITORIES            
120 148 148 SUDAN                                   
125 149 149 TANZANIA                                
112 150 150 THE GAMBIA                              
139 151 151 TOGO                                    
107 152 152 TUNISIA                                 
126 153 153 UGANDA                                  
128 154 154 ZAIRE                                   
148 155 155 ZAMBIA                                  
150 156 156 ZIMBABWE       
199 157 157 AMSTERDAM ISLAND (FRANCE)               
199 158 158 ASCENSION ISLAND (U.K.)                 
199 159 159 CANARY ISLANDS (SPAIN)                  
104 160 160 CEUTA (SPAIN)                           
199 161 161 CHAGOS ARCHIPELAGO (U.K.)               
155 162 162 LESOTHO                               
199 163 163 MAYOTTE (FRANCE)                        
105 164 164 MELILLA (SPAIN)                         
199 165 165 REUNION ISLAND (FRANCE)                 
129 166 166 RWANDA                                  
154 167 167 SWAZILAND                               
199 168 168 TROMELIN ISLAND (FRANCE)                
102 169 169 WESTERN SAHARA (MOROCCO)        
199 000 000 WMO REGION 1 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS 60000-68999

206 201 201 AFGHANISTAN                             
207 202 202 BAHRAIN                                 
214 203 203 BANGLADESH                              
229 204 204 KAMPUCHEA / CAMBODIA                                
230 205 205 CHINA                                    50000 - 59999
220 205 205 HONG KONG (U.K.)  45000-45010  45030-45040 (China?)
222 206 206 NORTH KOREA / DEMOCRATIC PEOPLE'S REPUBLIC OF KOREA   
215 207 207 INDIA                                   
205 208 208 IRAN                                    
204 209 209 IRAQ                                    
224 210 210 JAPAN                                   
201 211 211 USSR / KAZAKHSTAN                              
203 212 212 KUWAIT                                  
201 213 213 USSR / KYRGYZSTAN                              
228 214 214 LAOS                                    
218 215 215 MONGOLIA                                
225 216 216 BURMA / MYANMAR                                 
219 217 217 NEPAL                                   
210 218 218 OMAN                                    
213 219 219 PAKISTAN                                
208 220 220 QATAR                                   
223 221 221 SOUTH KOREA / REPUBLIC OF KOREA                       
201 222 222 RUSSIAN FEDERATION (ASIAN SECTOR)       
202 223 223 SAUDI ARABIA                            
216 224 224 SRI LANKA                               
201 227 227 USSR / TAJIKISTAN                              
226 228 228 THAILAND                                
201 229 229 USSR / TURKMENISTAN                            
209 230 230 UNITED ARAB EMIRATES                    
201 231 231 USSR / UZBEKISTAN                              
227 232 232 VIETNAM                                 
211 233 233 YEMEN                   
212 233 233 PEOPLE'S DEMOCRATIC REPUBLIC OF YEMEN                (Merged Yemen?)
221 234 234 MACAU (PORTUGAL)                        
217 235 235 MALDIVES                                
230 236 236 TAIWAN     ( 230 China ? )       50000 - 59999                     
299 000 000 WMO REGION 2 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS 20000-59999

313 301 301 ARGENTINA                               
309 302 302 BOLIVIA                                 
306 303 303 BRAZIL                                  
310 304 304 CHILE                                   
301 305 305 COLOMBIA                                
307 306 306 ECUADOR                                 
303 307 307 GUYANA                                  
311 308 308 PARAGUAY                                
308 309 309 PERU                                    
304 312 312 SURINAME                                
312 313 313 URUGUAY                                 
302 314 314 VENEZUELA                               
305 315 315 FRENCH GUIANA (FRANCE)                  
000 316 316 FALKLAND ISLANDS (U.K.)                 
000 317 317 SOUTH GEORGIA (U.K.)              
399 000 000 WMO REGION 3 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS 80000-88903 

422 401 401 BARBADOS                                
411 402 402 BELIZE                                  
402 403 403 CANADA                                  
416 405 404 COSTA RICA                              
407 406 406 CUBA                                    
410 407 407 DOMINICAN REPUBLIC                      
413 408 408 EL SALVADOR                             
423 409 409 GRENADA                                 
412 410 410 GUATEMALA                               
409 411 411 HAITI                                   
414 412 412 HONDURAS                                
408 413 413 JAMAICA                                 
405 414 414 MEXICO                                  
415 415 415 NICARAGUA                               
417 416 416 PANAMA                                  
499 417 417 SAINT KITTS AND NEVIS              499?      
406 423 423 BAHAMAS / THE BAHAMAS                             
424 424 424 TRINIDAD AND TOBAGO                     
404 425 425 UNITED STATES OF AMERICA    
401 425 425 ALASKA / USA            
418 426 426 ANTIGUA AND BARBUDA                     
499 427 427 BERMUDA (U.K.)                  499?        
499 428 428 BRITISH VIRGIN ISLANDS (U.K.)           499?
499 429 429 CAYMAN ISLANDS (U.K.)                   499?
419 430 430 DOMINICA                                
607 431 431 GREENLAND (DENMARK)                     
499 432 432 GUADELOUPE (FRANCE)                     499?
499 433 433 MARTINIQUE (FRANCE)                     499?
499 434 434 NETHERLANDS ANTILLES (NETHERLANDS)      499?
499 435 435 PUERTO RICO (U.S.A.)                    499?
420 436 436 SAINT LUCIA                             
421 437 437 SAINT VINCENT AND THE GRENADINES        
403 438 438 SAINT PIERRE & MIQUELON ISLAND (FRANCE) 
499 439 439 TURKS AND CAICOS ISLANDS                499?
499 440 440 VIRGIN ISLANDS (U.S.A.)       	499?
499 000 000 WMO REGION 4 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS  70000-79999

513 501 501 AUSTRALIA                               
508 502 502 FIJI                                    
515 503 503 INDONESIA                               
506 504 504 KIRIBATI                                
501 505 505 MALAYSIA                                
504 506 506 NAURU                                   
511 507 507 NEW ZEALAND                             
512 508 508 PAPUA NEW GUINEA                        
517 509 509 PHILIPPINES                             
502 511 511 SINGAPORE                               
503 512 512 SOLOMON ISLANDS                         
510 517 517 TONGA                                   
507 518 518 TUVALU                                  
505 520 520 VANUATU                                 
599 521 521 AMERICAN SAMOA (U.S.A.)                 599?
516 522 522 BRUNEI                                  
599 523 523 CHRISTMAS ISLAND (AUSTRALIA)            599?
599 524 524 COCOS ISLANDS (AUSTRALIA)               
599 525 525 COOK ISLANDS (NEW ZEALAND)              
599 526 526 CORAL SEA ISLANDS (AUSTRALIA)           
599 527 527 FEDERATED STATES OF MICRONESIA          
599 528 528 FRENCH POLYNESIA (FRANCE)               
599 529 529 GUAM (U.S.A.)                           
599 530 530 JOHNSTON ATOLL (U.S.A.)                 
599 531 531 MARSHALL ISLANDS                        
599 532 532 NEW CALEDONIA (FRANCE)                  
599 533 533 NIUE (NEW ZEALAND)                      
599 534 534 NORFOLK ISLAND (AUSTRALIA)              
599 535 535 NORTHERN MARIANA ISLANDS (U.S.A.)       
599 536 536 BELAU                                   
599 537 537 PITCAIRN ISLAND (U.K.)                  
599 538 538 TOKELAU                                 
599 539 539 WAKE ISLAND (U.S.A.)                    
599 540 540 WALLIS AND FUTUNA (FRANCE)              
509 541 541 WESTERN SAMOA / SAMOA                                  
599 000 000 WMO REGION 5 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS  90000-98999 
514 000 000 ADELIE LAND   95500-95510 (Antarctica?) 
 
624 601 601 ALBANIA                                 
632 602 602 USSR / ARMENIA                                 
619 603 603 AUSTRIA                                 
632 604 604 USSR / AZERBAIJAN                              
632 605 605 USSR / BELARUS                                 
610 606 606 BELGIUM                                 
623 607 607 BOSNIA AND HERZEGOVINA                  (split Yugo)
625 608 608 BULGARIA                                
623 609 609 CROATIA                                 (split Yugo)
631 610 610 CYPRUS                                  
620 611 611 CZECH REPUBLIC                  (split 641)        
608 612 612 DENMARK                                 
632 613 613 USSR / ESTONIA                                 
603 614 614 FINLAND                                 
614 615 615 FRANCE                                  
632 616 616 USSR / GEORGIA                                 
618 617 617 GERMANY                                 
629 618 618 GREECE                                  
622 619 619 HUNGARY                                 
606 620 620 ICELAND                                 
605 621 621 IRELAND                                 
635 622 622 ISRAEL                                  
627 623 623 ITALY                                   
636 624 624 JORDAN                                  
632 625 625 USSR / KAZAKHSTAN                              
632 626 626 USSR / LATVIA                                  
634 627 627 LEBANON                                 
632 628 628 USSR / LITHUANIA                               
611 629 629 LUXEMBOURG                              
628 630 630 MALTA                                   
632 631 631 USSR / MOLDOVA                                 
623 632 632 MONTENEGRO                              (split Yugo)
609 633 633 NETHERLANDS                             
601 634 634 NORWAY                                  
621 635 635 POLAND                                  
617 636 636 PORTUGAL                                
625 637 637 ROMANIA                                 
632 638 638 USSR / RUSSIAN FEDERATION (EUROPEAN SECTOR)       
623 639 639 SERBIA                                  (split Yugo)
620 641 641 SLOVAKIA                                (split 611 )
623 000 642 SLOVENIA                            (split Yugo)
615 643 643 SPAIN                                   
602 645 645 SWEDEN                                  
612 646 646 SWITZERLAND                             
633 647 647 SYRIA                                   
623 648 648 MACEDONIA                               (split Yugo)
630 649 649 TURKEY                                  
632 650 650 USSR / UKRAINE                                 
604 651 651 UNITED KINGDOM                          
699 652 652 FAROE ISLANDS (DENMARK)                 699?
616 653 653 GIBRALTAR (U.K.)                        
699 654 654 MADEIRA ISLANDS (PORTUGAL)              699?
613 000 000 LIECHTENSTEIN     06600-06999  (So where is it now?) 
699 000 000 WMO REGION 6 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS  01001-40349 

799 700 700 ANTARCTICA                  88904-89999            
000 000 701 ARGENTINE BASE IN ANTARCTICA

699 800 800 SHIP STATIONS 01001-40349 WMO REGION 6 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS

So there you go. Hopefully it is useful to someone or at least of minor interest.

Subscribe to feed

Advertisements

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons' are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in NCDC - GHCN Issues and tagged , , . Bookmark the permalink.

15 Responses to GHCN Country Codes V1, V2, V3

  1. Verity Jones says:

    E.M.,
    “If anyone else needs a copy …. I’ll make a CD and mail it.
    Even quicker and easier – you could set up a dropbox for free (also back up storage for you). Check email.

  2. Radovan Korda says:

    623 631 631 MOLDOVA (split Yugo)
    It is not split Yugo but split USSR

  3. Hugo M says:

    I doubt that y’ll get much amused regarding the usability of the OO database interface. It is really very far away from MS-access. When I’m doing such an analysis, I create a postgresql database and import the data using psql’s copy command (which is able to read arbitrary CSV formats). Then I use the really exciting R – SQL interface to read the data in and pass them to lattice graphics functions using their formula interface. It is very simple and very fast. Example:


    require(RPostgreSQL)
    require(lattice)

    drv <- dbDriver("PostgreSQL")

    con <- dbConnect(drv, user = "smith", password = "deltaT",
    dbname = "ghcn", host = "localhost")

    df <- dbGetQuery(con, "select station,date, temp, version from temps
    where country = 309 and station like '221A%')
    xyplot(temp ~ date | station+version,data=df)

  4. adolfogiurfa says:

    @E.M. The real “trick” is to make you think there is such a thing called “temperature”. We should study, instead, whence it comes, how it is produced. For example: If we eat “trash food” we generate more “heat” and originate the UHI effect. This is for sure. Now, as you have pointed out recently, there is a relation between changes in GMF and temperatures.
    These are the real issues not those temperature records of dubious origin, which have been adjusted so many times that no one ever will find the original data.

  5. E.M.Smith says:

    @Verity:

    Yeah, but if nobody wants one I’ve done that account set up for nothing… but yes, a good idea. Right now I’ve got a dozen accounts on different things where I’ve forgotten how to log in to them (passwords change in one place, not in a low usage one, after a few cycles the “three tries and you get locked out” bites…) so I’m trying to avoid accumulating even more accounts where they end up “bricked”…

    But if I get more than one request, hey, it’s worth it…. ( Or I suppose I could just set it up, load the data, and then forget about it as it won’t change in the future. I’m not NASA or NCDC after all ;-)

    @Radovan Korda:

    Thanks! All the QA possible helps. (And frankly, the USA education on ‘odd little countries in Europe and Asia’ is about as lame as one can get… I knew them all before the USSR breakup but some of the post-breakup have not been high on my ‘must keep straight’ list. Then you get similar names mixed in. Slovakia Slovenia Monrovia Moravia Moldova. Sigh. At least I had it as a break up of a Slavic Socialist State even if not the right one ;-)

    @Hugo M:

    Yeah, but it’s the hammer I have right now so the data looks kinda like a nail ;-)

    Part of this is just to find out if the OO database is worth using at all for things. It will take me all of about an hour to do a ‘trial fit’. If it can’t handle fixed format data ( i.e. doesn’t have positional field descriptors on input data, only CSV) then I’m likely to run it through FORTRAN to turn fixed format into CSV; as THAT takes me all of about 10 minutes, READ (format) WRITE (format with commas). Once I’m doing that FORTRAN input statement, well, I’ll likely just stay in FORTRAN land for the basic comparisons. Just to get the answer fastest.

    I don’t have “MS-access” and I’m not going to buy it. (Tried using it once a few years back. Didn’t like it. It couldn’t do fixed format then either, so unless they added that feature, I’m back in the land of CSV. Don’t know why the last few decades folks have thought fixed format input didn’t matter, but it was trendy for a while…)

    Frankly, if OO doesn’t do it, I’m most likely to use one of the SQL types on LINUX ( IFF I can find one of them that can handle fixed format data input… Yeah, I could dump it through the Linux sed or similar Stream EDitors and add commas, but if I’m doing that, I’m back at why not just use the FORTRAN and be done? Other than wanting to play with the database and prepare for some unknown future requests, it’s not needed at that point.

    Yeah, the typical “How to approach this problem most efficiently / easily” you get with any data handling problem. Trying to balance the immediate desire for a “any there there?” answer with the longer term “build some infrastructure for an easier round 2, if any”….

    @Adolfo:

    I’m following both threads.

    1) What is the history of data diddling reflected in the data? (How much have things changed from version to version, for example.)

    2) Are the temperature data really of any use? Do they connect to anything people do? (Also noting that enthalpy is ignored in temperature data and that alone is a fatal flaw. That almost all the thermometers are at airports that grew over time, went from grass fields to square km of tarmac in the sun; and that alone makes the data useless for climatology.)

    And more.

    So yes, asking things like “Did the sun do it via the aa modulation and UV variation and does that reflect in the magnetic fields?” is part of what we do here. But it is also important to keep an eye on how the data ‘drift’ over time; as that is direct evidence of data diddling… even if some folks assert it is valid diddling…

  6. E.M.Smith says:

    Well, using OO was not all that hard, but it has the usual “round about” way that OO tends to have.

    I found this page with ‘pointers’:

    http://sheepdogguides.com/fdb/fdb1imp1.htm

    Which had the helpful bit that you must pass through a spreadsheet on your way to a data load operation (for what I wanted to do), but had the misleading “push” toward CSV as the way and the light.

    To import a CSV file into an ooCalc spreadsheet:

    Open a spreadsheet. An old one will do, but a new one would be more usual for this sort of work.

    You can either…

    i) Open the CSV file into an new ooCalc workbook by using ooCalc’s “File | Open”
      ….or….
    ii) Open the CSV file as a new sheet within an already open workbook by using ooCalc’s “Insert | Sheet from file”.

    Turns out that you can similarly load data from a fixed format file and OO gives a nice little window where you make divider lines and slide them to positions to demark fields. Only ‘odd bits’ were needing to make a gratuitous ‘field’ for the spaces between columns (that end up as full columns in the spreadsheet that then you must delete later or OO makes a similar database field…) and being a bit careful about field types. ( It turned the numeric country code into an odd text field if the blank column was left as part of that field).

    Then it’s just a matter of ‘select all’ and ‘copy’ then go to the database panel, select ‘table’ and in the table box right click and paste (then fill out a lot of questions about field types, names, etc.)

    Remember that “a database” may consist of several tables, forms, queries, reports. What you may think of as “a database” may merely be a table.

    You are merely creating a new table. Either open an existing database, or, if appropriate, start a new database in the usual way. If starting a new one, select “Register / Open database”, but do not select “Create tables with wizard”. I created a new database and saved it in….

    C:\My Documents\FreeDB\FDB006\

    …as FDB006.odb

    Go back to the spreadsheet holding your data. Insert a row above the top of the data. Type column names into the new cells. The the contents of these cells will be used to name the fields.

    Select all of the data, and the field name cells. (Ctrl-A is the easy way to do this. Don’t be alarmed by the fact that the whole sheet gets selected…. unless you have “stuff” on the sheet besides the data (and headings).)

    Press ctrl-c (Shortcut for copy)

    Go back to the database. Select “Tables” in the “Database” pane.

    Right-click in the “Tables” pane. Click on “Paste”

    Select “Data and definitions” selected, if it isn’t already selected.

    Give the table a meaningful name

    Be sure to tick “Create primary key”.

    Click Next.

    The “Apply columns” page lets you decide which columns you want in the new table. You will probably want them all, and the >> button allows you to have them easily. You can delete superfluous columns later easily; you can’t add missed columns easily.

    Click next.

    In the “Type formatting” page, you may be able to leave most things unchanged. Note that in this dialog, you are setting properties for each of the fields in the table. As you change the field selected in the left hand column, you see that field’ properties in the rest of the dialog box.

    Setting the data type properly is especially important. If you have any numerical data with which you want to do arithmetic (for example, a “price” field in an inventory system), it would be best to change the field type for those fields. However, as explained earlier, you will probably want to use one of the “text” types for serial “numbers”, etc.

    Be careful if you decide to try to make any changes to the primary key field. If you aren’t sure that you know what you are doing, it would be best to leave the settings as ooBase made them.

    If you specify too few characters for a text field, then any records which are too long will trigger a message during the “create table” phase, and I’m afraid the records are just dropped. You can use the database’s ID field’s value as a guide for finding the lost records in the spreadsheet copy of the data.

    If you have some numbers which are not integers, for instance amounts of money, the type which would work for me was “Decimal”, AND I had to raise the number of decimal places above the default “0”.

    If you have dates to import, they are probably going to be their own special sort of nightmare…. the usually are, anyway… across all applications! But they “should” work, and sometimes will. Experiment, but check your data imports carefully! Among other things, be sure that 1/2/2006 comes out the way you want it. 1st of Feb or 2nd of Jan? It can mean either, depending on what part of the world you are in, and computers get confused. Having said all of that, I should add that I have seen dates work… with a following wind!

    Click “Create”, and, after a moment, your table should be done. Double-click on its name to open it.

    That wasn’t too painful, now, was it? (Unless dates were involved!)

    The bit about dropping records if the text was too long for the field ‘bit me’ so I get to either redo the operation with a longer text field ( I’d guessed 50 char but…) or find the ones that didn’t make it and hand enter. (Ought to be easy with the ID record counter)

    At any rate, looks like loading a fixed format data file is not too hard. That just leaves “join” and “report” as odd bits to learn.

    Open Office has a very “modal” way of working, with many things hidden on ‘right click’ menus or menus that only show up in certain contexts. (Apple was all about ‘non modal’ user interfaces where things are intuitive and discoverable… Doing the ‘modal’ stuff is non-intuitive and often non-discoverable; so a bit painful unless you can find a written guide sheet somewhere or have learned the manual by heart…) At any rate, knowing what things to click to get the right modes makes it not very hard to load data (now).

    It’s also unclear (as of now…) how to make specific fields ‘keys’ or match on them. That will be important for things like taking temperature records with one country code and matching them to other version records with a different code, but we’ll see. Any decent database must have a way to do a ‘match on field’.

    For now it’s “So far, so good”.

  7. Hugo M says:

    Well, it will get interesting how said hammer will react when you confront him with 200 metric tonnes of nails…. I’ve just seen that v1, v2 and v3 are online:

    ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v1
    ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2
    ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v3

  8. E.M.Smith says:

    @Hugo M:

    I was there a week+ ago.

    It’s sneaky. It looks like v1 and v2 are on line, but look inside those directories and it’s only the precipitation and pressure data. The inv (inventory) file and the v1.mean v1.min and v1.max files are all missing. The temperature data sets have been ‘pruned’ out of those online archives. Same thing for v2.mean v2.max v2.min.

    I’m sure it’s all innocent… /sarcoff>;

    BTW, loading the v1.mean temperature data into Open Office looked like it was going to be very straight forward. Unfortunately, it’s painfully slow and has now failed 3 times with an obscure error message. I think at 308,000 records it’s running into some kind of size limit or data dependent problem.

    If I don’t get it working inside an hour, I’m just going to head off to FORTRAN land. It works. It’s not got ‘size’ problem. It is very deterministic. Not ‘trendy’ but effective. (Ought to clear the desk and get the Linux box set up anyway…)

    Loading the data into OO spreadsheet had no problems. The “paste” into the database table proceeds normally, but when you try to open the table it says it can’t load the data. Smells like a size limit “wall” being run into, to me.

  9. Hugo M says:

    @E.M: Interesting. The v1 readme says: “This subdirectory contains 19 files”. However, there are only 16 files (actually only 15 when not counting the “README.ghcn~” backup file), with “temp.data” and “precip.data” missing. In v2, only max temperature and precipitation is left, whith minimum temperature, average temperature and pressure data missing. Whereas in v3, temperature min/max/avg data are available, but this time pressure and precipitation data are missing. What could be the possible sense? Plausible deniability?

    Regarding OO, while the fixed column width import is very neat feature that I did not know about, I was expecting that it would become increasingly slow and also very difficult to match inconsistently named stations. However, if you can load the data into a spread sheed, you can directly re-export them into a csv format using “save as”, and then import that file into a decent SQL database, having constraints in place that ensure that the deliberate chaos that has done to the naming scheme will be consistently dealt with.

  10. E.M.Smith says:

    @Hugo M:

    It is an “odd thing” that the old data have been evaporated…

    I must admit a private conceit here.

    Shortly after I came up with the dT/dt method (right hand category list on the side there) I got an inquiry about availability of the source code / method (which I’d already published during the process). I pointed to the older code postings and that was that.

    The dT/dt method starts with a “First Differences” approach, but does just a couple of bits differently. For one thing, it is run ‘backwards’. On the presumption that present measurement is done better than in 1800, it starts with the present temperature and calculates anomalies backward from that. Any ‘wacky’ temperature from way back when doesn’t cause all future anomalies to be wacky too… But what is most likely the strongest part of it is that when it comes to missing data, it simply does nothing. No “infilling” or “homogenizing” or whatever. It simply says the truth: “We do not know”. So if a +1.1 anomaly exists and it runs into 2 years of missing data, it holds that +1.1 anomaly for that whole time until the next valid data point. THEN it calculates the new anomaly. If, for example, that new temperature was -1 anomaly, it would find a +0.1 anomaly then existed. So you are never “speculating” in the code, only saying what you know.

    Running this code on the old form data showed up a lot of what looked like “splice artifacts” in the way other methods glued together station fragments. So if you had 3 stations contributing to the temperature series for, oh, Marble Bar Australia, you would see the splice artifact nature of the way they were merged in the other code (which merges first, then applies an anomaly calculation). Like I found at Marble Bar:

    https://chiefio.wordpress.com/2010/04/03/mysterious-marble-bar/

    My method “fixed” that (and tended to highlight it).

    The new GHCN (v3) was mysteriously delayed for a while.

    When it came out, the “splice” and merge step had been moved “upstream” into the GHCN creation phase. One could no longer demonstrate the ‘splice artifact’ nature of things nor apply the dT/dt method to demonstrate the difference if one did not do a ‘splice first’. The “Spice Was In”…

    I like to flatter myself that perhaps it was my little dT/dt method that caused the need to “hide the splice”… It’s not possible to know, really. At least not unless the folks doing it ‘come clean’. But all the timing fits. As does all the effects.

    I just figured I’d used the old data to “show up the differences” if ever needed.

    Then I saw the old data evaporate with nary a word. No “Last Call”. Just POOF! The magician and cloud of smoke…

    Strange that…

    However, I’d squirreled away a copy or two ;-)

    So at some point I’m going to just take the more recent bits of data spliced onto the end of V2 to make V3; unsplice them and call them a new series, and re-run dT/dt on the “old data” and have a minor discontinuity where V3 starts. Any “jump” at that point will stand out pretty clearly. The splice will be ‘unhidden’.

    But it IS a very strange thing that they are so determined to assure that the older versions of the data are hidden. Guess the ‘rewrite the past’ had gotten so extreme that there was nothing left to do but burn the old books…

    So that’s my private conceit. That perhaps in some small way my efforts goaded them into this rather public act of desperation. Or maybe they are just incredibly dumb and figured it was just old data nobody needed and dumped it. Malice or stupidity. Hell of a choice…

  11. Hugo M says:

    @E.M.: Never assume your opponents to be stupid… However, what else might be the reason that the NOAA publishes ghcn v3 with three different metadata files for tmin, tmax and tavg? Do they use a different set of station for each of these derived quantities?


    $ diff -u ghcnm.tmin.v3.1.0.20120421.qcu.inv.sorted ghcnm.tmax.v3.1.0.20120421.qcu.inv.sorted | grep "^-" | wc
    7 61 736

    $ diff -u ghcnm.tavg.v3.1.0.20120424.qcu.inv.sorted ghcnm.tmax.v3.1.0.20120421.qcu.inv.sorted | grep "^-" | wc
    1617 15864 176226

    Not even the meta data produced at the same day are congruent (7 different entries). Comparing metadata for Tmax versus Tavg reveals 1617 different entries, which is more than 10% of the metadata of all stations.

  12. Costa Rica: code 405 (not 404) is used in GHCN v3 as well as in GHCN v2

    v3:
    40578760000 9.9700 -84.8300 3.0 PUNTARENAS 136S 26FLxxCO 1x-9WATER C
    40578762000 10.0000 -84.2200 939.0 JUAN SANTAMAR 1060S 33MVxxno-9A 2TROP. SEASONAL C
    40578762001 10.0000 -84.1000 1141.0 SAN JOSE/CENTRAL OFFICE 1327U 391MVxxno-9x-9TROP. SEASONAL C
    40578767000 10.0000 -83.0500 3.0 PUERTO LIMON 60S 30FLxxCO 1x-9TROP. SEASONAL C

    Slovenia: was used in GHCN v2 with same country code 642 as in GHCN v3.

    64214014000 LJUBLJANA/BRN 46.218 14.473 384 427U 169MVxxno-9A15WARM CROPS B
    64214015000 LJUBLJANA/BEZ 46.066 14.512 298 319U 169MVxxno-9x-9WARM CROPS C
    64214026000 MARIBOR/SLIVN 46.480 15.682 265 293U 95MVxxno-9A 6WARM FOR./FIELD B
    64214105001 PORTOROZ 45.52 13.57 95 27R -9HIxxCO 1x-9MED. GRAZING B

    Liechtenstein: does not have any station in GHCN v2 or v3. WMO combines Liechtenstein with Switzerland. The single WMO station is Vaduz, 06990, at 47 07 39N, 09 31 03E, which would become 64606990000 if Vaduz were somehow to sneak into GHCN

  13. No my hero, this is not boring at all, very useful!

    Do you have a link to GHCN version 1 data?
    I hope so …

    K.R. Frank

  14. Pingback: GHCN V1 vs V2 (V3 soon) Teaser | Musings from the Chiefio

  15. Pingback: GHCN v3 Portugal and Spain | Musings from the Chiefio

Comments are closed.