MySQL-Server, SQL, and GHCN Database

Some long time ago I wanted to make a MySQL database of GHCN temperatures. Then I found someone had made a “load it up” ready to go set for V3 and just sort of “moved on”. https://chiefio.wordpress.com/2015/08/31/ghcn-in-mysql-quick-and-easy/

Now I see NOAA are up to GHCN V4 and I’m thinking it would be nice to have V1, V2, V3, and V4 all in one nice giant database where I could compare and contrast easily. So my round tuit has arrived.

I’ve sketched out the database (DB) schema I want to follow. (I know, SQL databases don’t really have a schema, but I’m an old dog and it’s a good mental trick to think in terms of a formal design graph). Now it’s time to try it and start finding where I forgot to put a “;” or spelled something wrongly.

First off, you must put MySQL Server on your box. Here we come to the fine point that the SQL language is not the server. (Unless you are a DBA Database Adminstrator for a living you are unlikely to care…) FWIW, most folks pronounce SQL “sequel”. I say it S-Q-L. The reason is simple. There once was a database named Sequel. They won a lawsuit against others using their name, and the others renamed it SQL but continued to say it as “sequel”. If find that offensive…

https://stackoverflow.com/questions/18523040/what-is-the-difference-between-sequel-and-sql

in the early 1970s. Initially called SEQUEL (Structured English Query Language) and based on their original language called SQUARE (Specifying Queries As Relational Expressions).SEQUEL was later renamed to SQL by dropping the vowels, because SEQUEL was a trade mark registered by the Hawker Siddeley aircraft company.

Say it however you want, just know when you are endorsing fraud and deception…

With that out of the way…

There are many database server back ends that support the S-Q-L Structured Query Language reporting abilities. The one that is common, and free, on Linux is MySQL-server. It also includes some administrative commands that are server oriented and not formally part of SQL The Language. (Microsoft also has an SQL Server with a whole lot of proprietary stuff in it but I’m entirely uninterested in it; even though it pays well to hold your nose and know / support it.) So some of this stuff will be generic SQL, hopefully most of it, but some will be different from The Micro$oft Way.

Install MySQL-server

Here’s how you install it on Devuan / Debian:

Is it already there?

root@odroidxu4:/SG2/ext/chiefio# which mysql
root@odroidxu4:/SG2/ext/chiefio# 

Nope.

So install it:

root@odroidxu4:/SG2/ext/chiefio# apt-get install mysql-server
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libjsoncpp0 libuuid-perl
Use 'apt-get autoremove' to remove them.
The following extra packages will be installed:
  libaio1 libdbd-mysql-perl libdbi-perl libterm-readkey-perl mysql-client-5.5 mysql-server-5.5 mysql-server-core-5.5
Suggested packages:
  libclone-perl libmldbm-perl libnet-daemon-perl libsql-statmysql> CREATE DATABASE temps;
Query OK, 1 row affected (0.00 sec)

mysql> GRANT ALL ON temps.* to chiefio@localhost
    -> IDENTIFIED BY 'LetMeIn!';
Query OK, 0 rows affected (0.00 sec)

mysql> DROP TABLE IF EXISTS inventory;
ERROR 1046 (3D000): No database selected
ement-perl mailx tinyca
Recommended packages:
  libhtml-template-perl
The following NEW packages will be installed:
  libaio1 libdbd-mysql-perl libdbi-perl libterm-readkey-perl mysql-client-5.5 mysql-server mysql-server-5.5 mysql-server-core-5.5
0 upgraded, 8 newly installed, 0 to remove and 5 not upgraded.
Need to get 7,262 kB of archives.
After this operation, 81.6 MB of additional disk space will be used.
Do you want to continue? [Y/n] y

After you answer “y” it will, somewhere in the middle of the stuff below, put up an ncurses (blue block menu stuff) panel asking for a “root” SQL manager password. I gave it “OpenUp!” for this demo case. Remember you must “tab” to the OK button to tell it to take your answer.

Get:1 http://auto.mirror.devuan.org/merged/ jessie/main libaio1 armhf 0.3.110-1 [9,160 B]
Get:2 http://auto.mirror.devuan.org/merged/ jessie/main libdbi-perl armhf 1.631-3+b1 [810 kB]
Get:3 http://auto.mirror.devuan.org/merged/ jessie/main libdbd-mysql-perl armhf 4.028-2+deb8u2 [113 kB]
Get:4 http://auto.mirror.devuan.org/merged/ jessie/main libterm-readkey-perl armhf 2.32-1+b1 [26.5 kB]
Get:5 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-client-5.5 armhf 5.5.62-0+deb8u1 [1,454 kB]
Get:6 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-server-core-5.5 armhf 5.5.62-0+deb8u1 [3,171 kB]
Get:7 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-server-5.5 armhf 5.5.62-0+deb8u1 [1,604 kB]                                      
Get:8 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-server all 5.5.62-0+deb8u1 [73.9 kB]                                             
Fetched 7,262 kB in 6s (1,047 kB/s)                                                                                                                     
Preconfiguring packages ...
Selecting previously unselected package libaio1:armhf.
(Reading database ... 92227 files and directories currently installed.)
Preparing to unpack .../libaio1_0.3.110-1_armhf.deb ...
Unpacking libaio1:armhf (0.3.110-1) ...
Selecting previously unselected package libdbi-perl.
Preparing to unpack .../libdbi-perl_1.631-3+b1_armhf.deb ...
Unpacking libdbi-perl (1.631-3+b1) ...
Selecting previously unselected package libdbd-mysql-perl.
Preparing to unpack .../libdbd-mysql-perl_4.028-2+deb8u2_armhf.deb ...
Unpacking libdbd-mysql-perl (4.028-2+deb8u2) ...
Selecting previously unselected package libterm-readkey-perl.
Preparing to unpack .../libterm-readkey-perl_2.32-1+b1_armhf.deb ...
Unpacking libterm-readkey-perl (2.32-1+b1) ...
Selecting previously unselected package mysql-client-5.5.
Preparing to unpack .../mysql-client-5.5_5.5.62-0+deb8u1_armhf.deb ...
Unpacking mysql-client-5.5 (5.5.62-0+deb8u1) ...
Selecting previously unselected package mysql-server-core-5.5.
Preparing to unpack .../mysql-server-core-5.5_5.5.62-0+deb8u1_armhf.deb ...
Unpacking mysql-server-core-5.5 (5.5.62-0+deb8u1) ...
Selecting previously unselected package mysql-server-5.5.
Preparing to unpack .../mysql-server-5.5_5.5.62-0+deb8u1_armhf.deb ...
Unpacking mysql-server-5.5 (5.5.62-0+deb8u1) ...
Selecting previously unselected package mysql-server.
Preparing to unpack .../mysql-server_5.5.62-0+deb8u1_all.deb ...
Unpacking mysql-server (5.5.62-0+deb8u1) ...
Processing triggers for man-db (2.7.0.2-5) ...
Processing triggers for systemd (215-17+deb8u7) ...
Setting up libaio1:armhf (0.3.110-1) ...
Setting up libdbi-perl (1.631-3+b1) ...
Setting up libdbd-mysql-perl (4.028-2+deb8u2) ...
Setting up libterm-readkey-perl (2.32-1+b1) ...
Setting up mysql-client-5.5 (5.5.62-0+deb8u1) ...
Setting up mysql-server-core-5.5 (5.5.62-0+deb8u1) ...
Setting up mysql-server-5.5 (5.5.62-0+deb8u1) ...
[ ok ] Stopping MySQL database server: mysqld.
190118 13:25:06 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
190118 13:25:06 [Note] Ignoring --secure-file-priv value as server is running with --bootstrap.
190118 13:25:06 [Note] /usr/sbin/mysqld (mysqld 5.5.62-0+deb8u1) starting as process 24590 ...
[ ok ] Starting MySQL database server: mysqld ..
[info] Checking for tables which need an upgrade, are corrupt or were 
not closed cleanly..
Setting up mysql-server (5.5.62-0+deb8u1) ...
Processing triggers for libc-bin (2.19-18+deb8u10) ...
Processing triggers for systemd (215-17+deb8u7) ...

So, is it there now?

root@odroidxu4:/SG2/ext/chiefio# which mysql
/usr/bin/mysql

Yes!

There is also a utility we will use a bit later to load some data from a file:

root@odroidxu4:/SG2/ext/chiefio# which mysqlimport
/usr/bin/mysqlimport

Is there other stuff? OH yeah…

root@odroidxu4:/SG2/ext/chiefio# ls /usr/bin/my*
/usr/bin/myisamchk	    /usr/bin/mysqlbinlog		 /usr/bin/mysql_find_rows	/usr/bin/mysql_secure_installation
/usr/bin/myisam_ftdump	    /usr/bin/mysqlbug			 /usr/bin/mysql_fix_extensions	/usr/bin/mysql_setpermission
/usr/bin/myisamlog	    /usr/bin/mysqlcheck			 /usr/bin/mysqlhotcopy		/usr/bin/mysqlshow
/usr/bin/myisampack	    /usr/bin/mysql_client_test		 /usr/bin/mysqlimport		/usr/bin/mysqlslap
/usr/bin/my_print_defaults  /usr/bin/mysql_convert_table_format  /usr/bin/mysql_install_db	/usr/bin/mysqltest
/usr/bin/mysql		    /usr/bin/mysqld_multi		 /usr/bin/mysqloptimize		/usr/bin/mysql_tzinfo_to_sql
/usr/bin/mysqlaccess	    /usr/bin/mysqld_safe		 /usr/bin/mysql_plugin		/usr/bin/mysql_upgrade
/usr/bin/mysqladmin	    /usr/bin/mysqldump			 /usr/bin/mysqlrepair		/usr/bin/mysql_waitpid
/usr/bin/mysqlanalyze	    /usr/bin/mysqldumpslow		 /usr/bin/mysqlreport		/usr/bin/mysql_zap

Basically to become a MySQL DBA, you learn what all of those do and how to make SQL work as you wish it to work. I’ll be using a subset and only learning the bits I need as I need them. “mysqudump” for example will be low on my list as I’m just loading static data from other sources and can simply recreate the data input load process.

A Test Case

Before getting around to actually loading the GHCN stuff, since it is large and with several variations, I’m going to do a couple of simple test cases to see if things work and have I got the syntax at all right.

Make A Database and A Table and Load It

I tried to log in the way most things tell you do do it, and failed.

sudo mysql -u root -p

Found a page saying you need to set up the security options first and did so, then it worked. Even though I just left everything the same. Sooo…. you probably need to do this:

chiefio@odroidxu4:/var/log/mysql$ sudo mysql_secure_installation

NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MySQL
      SERVERS IN PRODUCTION USE!  PLEASE READ EACH STEP CAREFULLY!


In order to log into MySQL to secure it, we'll need the current
password for the root user.  If you've just installed MySQL, and
you haven't set the root password yet, the password will be blank,
so you should just press enter here.

Enter current password for root (enter for none): 
OK, successfully used password, moving on...

Setting the root password ensures that nobody can log into the MySQL
root user without the proper authorisation.

You already have a root password set, so you can safely answer 'n'.

Change the root password? [Y/n] n
 ... skipping.

By default, a MySQL installation has an anonymous user, allowing anyone
to log into MySQL without having to have a user account created for
them.  This is intended only for testing, and to make the installation
go a bit smoother.  You should remove them before moving into a
production environment.

Remove anonymous users? [Y/n] n
 ... skipping.

Normally, root should only be allowed to connect from 'localhost'.  This
ensures that someone cannot guess at the root password from the network.

Disallow root login remotely? [Y/n] Y
 ... Success!

By default, MySQL comes with a database named 'test' that anyone can
access.  This is also intended only for testing, and should be removed
before moving into a production environment.

Remove test database and access to it? [Y/n] n
 ... skipping.

Reloading the privilege tables will ensure that all changes made so far
will take effect immediately.

Reload privilege tables now? [Y/n] Y
 ... Success!

Cleaning up...

All done!  If you've completed all of the above steps, your MySQL
installation should now be secure.

Thanks for using MySQL!

After which I could, in fact, log in to mysql.

chiefio@odroidxu4:/var/log/mysql$ sudo mysql -u root
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

Won’t let me do it without a password…

chiefio@odroidxu4:/var/log/mysql$ sudo mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 52
Server version: 5.5.62-0+deb8u1 (Debian)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Eventually I’ll add a user of my choosing and loosen up security so it doesn’t need a “sudo” to make it go. ( I found some pages saying you need a sudo after some 5.7? release or some such, with some directions about it) Lots of interesting “wandering in the woods” commentary here:

https://stackoverflow.com/questions/33991228/what-is-the-default-root-pasword-for-mysql-5-7

Then add the database, let “chiefio” use it with a particular password, and put a table in it:

mysql> CREATE DATABASE temps;
Query OK, 1 row affected (0.00 sec)

mysql> GRANT ALL ON temps.* to chiefio@localhost
    -> IDENTIFIED BY 'LetMeIn!';
Query OK, 0 rows affected (0.00 sec)

mysql> DROP TABLE IF EXISTS inventory;
ERROR 1046 (3D000): No database selected

Oh Dear… need to “select” our database. Note that I’m using ALL CAPS for keyworks. MySQL is not sensitive to case, but this makes it easier to see what I’m doing.

mysql> USE temps;
Database changed
mysql> DROP TABLE IF EXISTS inventory;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE TABLE inventory (
    -> region CHAR(1) NOT NULL,
    -> country CHAR(2) NOT NULL,
    -> wmo CHAR(5) NOT NULL,
    -> name CHAR(30) NOT NULL,
    -> attr CHAR(20),
    -> );
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 7
mysql> 

There’s a tutorial here with some interesting examples:
http://www.mysqltutorial.org/mysql-create-table/

By inspection of this one:

CREATE TABLE IF NOT EXISTS tasks (
    task_id INT AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,
    start_date DATE,
    due_date DATE,
    status TINYINT NOT NULL,
    priority TINYINT NOT NULL,
    description TEXT,
    PRIMARY KEY (task_id)
)  ENGINE=INNODB;

I’d guess my error was putting a comma after that last line before the ); or else maybe that “ENGINE=INNODB” implies I need something after the ) so decisions decisions…

mysql> CREATE TABLE inventory (
    -> region CHAR(1) NOT NULL,
    -> country CHAR(2) NOT NULL,
    -> wmo CHAR(5) NOT NULL,
    -> name CHAR(30) NOT NULL,
    -> attr CHAR(20)
    -> ) ;
Query OK, 0 rows affected (0.08 sec)

Looks like that attr line with the “,” was the issue.

mysql> show tables;
+-----------------+
| Tables_in_temps |
+-----------------+
| inventory       |
+-----------------+
1 row in set (0.01 sec)

mysql> show full tables;
+-----------------+------------+
| Tables_in_temps | Table_type |
+-----------------+------------+
| inventory       | BASE TABLE |
+-----------------+------------+
1 row in set (0.01 sec)
mysql> 

Yup, I’ve got a table. Note that I’ve not used any field as an index yet, nor have I divided the attributes into individual fields. This is a “toy” version. When it’s time for the real one, I’ll do that kind of detail stuff. I’ll also load the build script from a file and I’ll load the data from a file. For now I’m just getting the syntax set in my habits. Building the basic skills.

At some point I’ll need to explore what these other default databases might be:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| temps              |
+--------------------+
4 rows in set (0.00 sec)

“temps” is mine, I presume the others are either demo databases or are for some internal use. A minor “dig here” for later.

So let’s put something in our table.

If done for all fields in order, you need not list field names:

mysql> INSERT INTO inventory VALUES ('1','23','56789','Test Site One','Attr list of stuff');
Query OK, 1 row affected (0.04 sec)

As I’m going to load lots of data, doing it from a file is better. Mysql expects data with TAB as the separator and a Newln at the end of a row. Since the original data is in fixed format, I’ll need to write a little ‘stub’ FORTRAN program to read it in, then write it back out with TAB between each field. I’ll put that in another posting (as this one is a bit long already and it will take a while to write the FORTRAN…)

So where IS this database, anyway? Don’t see it in my home directory or in my current working directory?

root@odroidxu4:/var/lib# pwd
/var/lib
root@odroidxu4:/var/lib# ls -l mysql
total 28736
-rw-r--r-- 1 root  root         0 Jan 18 13:25 debian-5.5.flag
-rw-rw---- 1 mysql mysql 18874368 Jan 18 14:44 ibdata1
-rw-rw---- 1 mysql mysql  5242880 Jan 18 14:44 ib_logfile0
-rw-rw---- 1 mysql mysql  5242880 Jan 18 13:25 ib_logfile1
drwx------ 2 mysql root      4096 Jan 18 13:25 mysql
-rw------- 1 root  root         6 Jan 18 13:25 mysql_upgrade_info
drwx------ 2 mysql mysql     4096 Jan 18 13:25 performance_schema
drwx------ 2 mysql mysql     4096 Jan 18 14:35 temps
root@odroidxu4:/var/lib# 

Yeah, you can change that. And I will for the final production version. For now, for the toy version, that’s OK. I’ve got /var mounted from a hard disk partition already anyway:

root@odroidxu4:/var/lib# df .
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda5        3997376 1119740   2667924  30% /var

Just realize that if you are running from a 4 GB uSD card with /var in root (/) this might be an issue. We’ll cover moving locations in a future episode…

Then, just to show stuff got into the table:

mysql> SELECT * FROM inventory ;
+--------+---------+-------+---------------+--------------------+
| region | country | wmo   | name          | attr               |
+--------+---------+-------+---------------+--------------------+
| 1      | 23      | 56789 | Test Site One | Attr list of stuff |
+--------+---------+-------+---------------+--------------------+
1 row in set (0.00 sec)
mysql> 

The general format is:

SELECT fields
FROM table
WHERE condition-test;

More on that in the future too, once I have the final table up and loaded with real data.

In Conclusion

So that’s your basic basic intro to MySQL. Minimum needed to create a database, create a table, load some data, and dump it out; along with some minimal on the topic of permissions and operational issues.

I’m generally looking at a design with 2 major tables. One the “inventory” file with data about the individual site location. That attr list includes latitude, longitude, elevation, distance to shore, and a bunch of other stuff. All those specifics need to be broken out into distinct fields (for my purposes) and I need to think a bit about indexing and matching issue. Countries change over time, so that’s one of the issues with comparing V1 to V2 to V3 to V4. Similarly, WMO Numbers have some more minor changes. Like the use of a “mod number” in V2 that isn’t in other version where it was homogenized away.

One field I clearly want to add is “version” for that GHCN v1 vs GHCN v2 and USHCN v1 and Hadley… Then I’m thinking about having an “ascension” number as they have in seed banks. That would let me load GHCN V2 from different dated saved copies and compare them, too. That might be overkill for the first cut, though. I think at first I just need to get V1, V2, and V3 loaded from known good copies.

The other file needed is the Station_Data table with the actual monthly data (temperature) values along with the quality flags and such. In some ways it is more simple as the “key” is clearly Region-Country-WMO#-ModFlag in some combination. WMO#-Mod likely enough. I may need to make a V1 vs V2 vs V3 vs V4 WMO# table that lets me map all of them to one standard WMO number. That depends on the details of how the WMO number changes over time and space. Is it just dropping the Mod Flag, or what else changed?

Sometime tonight I ought to have the first cut of a live load from files of actual data for the inventory file, for at least 1 and maybe all 3 of the GHCN data sets. USHCN to follow. Hadley just an idea at this stage; and for some very later time.

Then I’ll start on all that Station Data, once I have clue how to tell what station is whom.

FWIW, the GHCH V4 adds something like 14,000 stations, so well over the prior 6k worth. The only problem is that almost all of them are in the USA, Germany and a couple of other places. It does NOT improve spacial coverage of all those remote pristine places and it DOES increase the bias to a few highly industrialized urban places. Oh, and it does a LOT more “homogenizing” and fixing up of the data into an even more complex “data food product”. I’m not sure if you can even get “raw” data versions anymore. As NOAA is down during this shutdown budget food fight, I’m not looking at V4 just yet. Maybe after The Wall is funded ;-)

Subscribe to feed

Advertisements

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons' are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in NCDC - GHCN Issues, Tech Bits and tagged , , . Bookmark the permalink.

28 Responses to MySQL-Server, SQL, and GHCN Database

  1. jim2 says:

    It’s been a long time I’ve done anything with MySQL, but IIRC, there are two database engines from which to choose, Innodb and myisam. (Yep, I recalled correctly :) They use different locking schemes.

    https://stackoverflow.com/questions/3818759/what-is-innodb-and-myisam-in-mysql

  2. A C Osborn says:

    It sounds like you are not bothering to normalise the data, just a straight download with some improvements.

  3. Eric Barnes says:

    You might want to change the storage engine to myisam. innodb does a lot processing to be https://en.wikipedia.org/wiki/ACID_(computer_science) compliant, which won’t be needed for etl/query/olap type of stuff. isam will be much faster for queries. Last time I looked there wasn’t a good olap storage engine for mysql, and it doesn’t look like that’s changed unfortunately.
    https://dev.mysql.com/doc/refman/8.0/en/storage-engines.html

  4. E.M.Smith says:

    @A.C.Osborn:

    Yes. First you just PRESERVE the data. Any and all changes get done from that Golden Master and get flagged as changed when the processing is done. I know, not the way “climate scientists” do things, but I’m more chemist than climastrologist…. (Thanks to Mr. McGuire… my High School chem teacher).

    @Eric & Jim:

    OK, I’ll look into that. As quantity of data is very very small, I think it won’t matter much how efficient things are, but I still like to tend to such things ;-)

    That’s an interesting link to BI software… but later… I’m in a “coding frenzy” now ;-)

    @All:

    Here’s a FORTRAN program to read in the GHCN v1 “country.codes” file, and spit it back out with TAB between the fields. (Yeah, next I get to load it into a database table…) This is the first exemplar of the (trivial) FORTRAN glue needed to turn the fixed position data into something MySQL will like as input data…

    chiefio@odroidxu4:~/SQL/v1$ cat ccodes.f 
    C FORTRAN to read the country.codes v1 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 5  STARTWMO, ENDWMO
          CHARACTER * 25 NAME
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='country.codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) STARTWMO, ENDWMO, CONT, COUNTRY, NAME
    C
       11 FORMAT (A5, X, A5, X, A1, A2, X, A25)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE (6, 6) STARTWMO,TAB, ENDWMO,TAB, CONT,TAB, COUNTRY,TAB, NAME
    C
        6 FORMAT (A5, A1, A5, A1, A1, A1, A2, A1, A25)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    

    I hacked up the supplied “inventory” reading program to make it (it is next on the list…) as a quick test case. I had the country.codes file in this particular set, but the V1 data files were not in it (this was from my last scrape of their site and they have deleted any old annoying historical data…) so now I get to “go fish” in my archives somewhere to find the actual data files ;-) But at least I could get this bit done

    Here’s a sample of the top of the country.codes file:

    chiefio@odroidxu4:~/SQL/v1$ head country.codes 
    08583-08594 101  CAPE VERDE                                                  
    60033-60033 102  WESTERN SAHARA                                              
    60096-60096 102  WESTERN SAHARA                                              
    60100-60319 103  MOROCCO                                                     
    60321-60337 103  MOROCCO                                                     
    60339-60349 103  MOROCCO                                                     
    60320-60320 104  CEUTA (SPAIN)                                               
    60338-60338 105  MELILLA (SPAIN)  
    

    and after the program above changes it:

    08583	08594	1	01	 CAPE VERDE              
    60033	60033	1	02	 WESTERN SAHARA          
    60096	60096	1	02	 WESTERN SAHARA          
    60100	60319	1	03	 MOROCCO                 
    60321	60337	1	03	 MOROCCO                 
    60339	60349	1	03	 MOROCCO                 
    60320	60320	1	04	 CEUTA (SPAIN)           
    60338	60338	1	05	 MELILLA (SPAIN)   
    

    I’ve also made a more accurate inventory table description (still subject to change as I move on to V2, V3, etc.)

    chiefio@odroidxu4:~/SQL$ cat inventory 
    CREATE TABLE inventory (
        version CHAR(5) NOT NULL,
        ascension date(8),
        region CHAR(1) NOT NULL,
        country CHAR(2) NOT NULL,
        wmo CHAR(5) NOT NULL,
        mod_flag CHAR(3),
        name CHAR(25) NOT NULL,
        lat CHAR(6),
        long CHAR(7),
        elev CHAR(4),
        first_year CHAR(4),
        last_year CHAR(4),
        missing_pct INT,
        discontinuity CHAR(1),
        map_elev CHAR(2),
        pop_type CHAR(2),
        pop_size CHAR(2),
        topo_type CHAR(2),
        veg CHAR(2),
        prox_water CHAR(2),
        dist_water CHAR(2),
        airport CHAR(2),
        dist_A_to_U CHAR(2),
        veg_grid CHAR(2)
        ) ;
    

    This is based on what I’ve seen so far. Note that the “date” fields are left lower case. That’s ’cause I’m not sure I want them that way. Might make it an INT or a CHAR instead… Also still no index fields identified. This is just the “draft” table description file at the part way point – but I wanted folks to see how it’s grown ;-)

    Some fields, like first_year, last_year, and missing_pct are only in v1 being dropped later.

    Once I’m ready to load some actual data (after next cup of coffee ;-) I’ll post a comment with the “load table description from file” and “load table from file” bits in it…

  5. E.M.Smith says:

    The V2 country.codes file is similar but different. Lacking the “range” data on wmo numbers. Just a 3 digit continent/country and the name. Here’s the FORTRAN to read it and add tabs:

    root@odroidxu4:/SG2/ext/chiefio/SQL/v2# cat ccodesV2.f 
    C FORTRAN to read the country.codes v2 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 25 NAME
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='v2.country.codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) CONT, COUNTRY, NAME
    C
       11 FORMAT (A1, A2, X, A25)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE (6, 6) CONT,TAB, COUNTRY,TAB, NAME
    C
        6 FORMAT (A1, A1, A2, A1, A25)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    

    Basically the same program with some stuff cut out and the input file name changed.

  6. E.M.Smith says:

    It looks like v3 has the data files in the last major scrape I did. Dated 2015:

    -rw-r--r-- 1 chiefio chiefio    10624 Sep 26  2010 country-codes
    -rw-r--r-- 1 chiefio chiefio 12535272 Sep  7  2015 ghcnm.tavg.latest.qca.tar.gz
    -rw-r--r-- 1 chiefio chiefio 12634086 Sep  7  2015 ghcnm.tavg.latest.qcu.tar.gz
    -rw-r--r-- 1 chiefio chiefio  9017384 Sep  5  2015 ghcnm.tmax.latest.qca.tar.gz
    -rw-r--r-- 1 chiefio chiefio  8973681 Sep  5  2015 ghcnm.tmax.latest.qcu.tar.gz
    -rw-r--r-- 1 chiefio chiefio  8978110 Sep  5  2015 ghcnm.tmin.latest.qca.tar.gz
    -rw-r--r-- 1 chiefio chiefio  8957689 Sep  5  2015 ghcnm.tmin.latest.qcu.tar.gz
    

    So that’s the one that will be loaded first. Then I need to “go fish” in my canonical collection of old data for the v1 and v2 data. I know I have it… somewhere ;-)

    The v3 country.codes file looks the same as the v2 in structure, so the same Tab_Adder can be used on it:

    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# head country-codes 
    101 ALGERIA                                 
    102 ANGOLA                                  
    103 BENIN                                   
    104 BOTSWANA                                
    105 BURKINA FASO                            
    106 BURUNDI                                 
    107 CAMEROON                                
    108 CAPE VERDE                              
    109 CENTRAL AFRICAN REPUBLIC  
    

    As the 2 digits of the specific country repeat between continents, I think I’ll be changing things to just add the “continent” as a field (to make search / sort stuff easier) while also leaving it on the “country” portion in the “country” field. Otherwise you would need to always concatenate the two for various searches and such. I know, violates good DB practice as it is redundant data… but simplifies operations ‘downstream’. I’m not worried about wasting a couple of hundred characters of storage…

    That would mean my final Tab_Adder will change a bit.

          WRITE (6, 6) CONT,TAB, COUNTRY,TAB, NAME
        6 FORMAT (A1, A1, A2, A1, A25)
    

    would become:

          WRITE (6, 6) CONT,TAB, CONT,COUNTRY,TAB, NAME
        6 FORMAT (A1, A1, A1,A2, A1, A25)
    

    Here’s the one for v3 (it uses a different file name too… with a – in it):

    root@odroidxu4:/SG2/ext/chiefio/SQL/v3#  cat ccodes.f 
    C FORTRAN to read the country.codes v2 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 25 NAME
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='country-codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) CONT, COUNTRY, NAME
    C
       11 FORMAT (A1, A2, X, A25)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE (6, 6) CONT,TAB, CONT,COUNTRY,TAB, NAME
    C
        6 FORMAT (A1, A1, A1,A2, A1, A25)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    

    Output now looks like this:

    oot@odroidxu4:/SG2/ext/chiefio/SQL/v3# ./a.out 
    1	101	ALGERIA                  
    1	102	ANGOLA                   
    1	103	BENIN                    
    1	104	BOTSWANA                 
    1	105	BURKINA FASO             
    1	106	BURUNDI                  
    1	107	CAMEROON                 
    1	108	CAPE VERDE    
    

    So I can make CountryCode the index key and “Continent” will just be a datum (though you can select on it for just one continent of data…) That way each country code is a unique key.

    Next up I need to create the country table and load it, then see if I can make any interesting or useful report of changes to country name by v by countrycode… Essentially compare v1 CC NAME to v2 CC NAME to v3 CC NAME and spit out any that are different… which IIRC was basically all of them between two of the versions as they did some kind of renumber… but we’ll see. If so, then I’ll need to make a map of who turned into what so I can compare matching records on one key.

    Essentially I may need to add a OldCountry field with the country as in the original data and use the v3 Country-Code as the constant key for any one country. Might “have issues” with things like USSR breaking up into multiples but…

    2	222	RUSSIAN FEDERATION (ASIAN
              
    2	227	TAJIKISTAN               
                
    2	229	TURKMENISTAN    
    

    I think that can be sorted pretty easy…

  7. E.M.Smith says:

    OK, the promised “how to load from a file”:

    First we make the files somewhere. I’ve just stuck them in a directory named SQL in my home dir.

    chiefio@odroidxu4:~/SQL/Table.Schemas$ ls
    continent.sql  inventory.sql       station_data.sql
    country.sql    load.continent.sql  wmo.sql
    

    Here’s what load.continent.sql looks like. I’ve “created” this data myself by inspection of the WMO numbers. It is just the direct INSERT command syntax from earlier since there are few records.

    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# cat load.continent.sql 
    INSERT INTO continent VALUES ('Gv3','7Sept2015','1','Africa');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','2','Asia');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','3','South America');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','4','North America');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','5','Australia Pacific Islands');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','6','Europe');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','7','Antarctica');
    INSERT INTO continent VALUES ('Gv3','7Sept2015','8','Ship Stations Ocean');
    

    Then I launch the MySQL system:

    chiefio@odroidxu4:~/SQL/Table.Schemas$ mysql -u root -p
    Enter password: 
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 38
    Server version: 5.5.62-0+deb8u1 (Debian)
    
    Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    

    Seems that the sudo wasn’t needed after all…

    The command to execute stuff from a file is pretty trivial. SOURCE…

    mysql> source continent.sql
    ERROR 1046 (3D000): No database selected
    mysql> use temps
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    
    Database changed
    

    Provided you remember to “use” the database… First we load the schema… create the table

    mysql> source continent.sql
    Query OK, 0 rows affected (0.16 sec)
    
    mysql> show tables;
    +-----------------+
    | Tables_in_temps |
    +-----------------+
    | continent       |
    | inventory       |
    +-----------------+
    2 rows in set (0.00 sec)
    mysql> 
    

    Here’s the schema I used in that continents file:

    CREATE TABLE continent (
        version CHAR(5) NOT NULL,
        ascension date(8),
        region CHAR(1) NOT NULL,
        region_name CHAR(15)
        ) ;
    

    Then we load the data:

    mysql> source load.continent.sql
    Query OK, 1 row affected (0.04 sec)
    
    Query OK, 1 row affected (0.04 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected, 1 warning (0.05 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.04 sec)
    
    Query OK, 1 row affected, 1 warning (0.03 sec)
    

    Warning? WARNING? We don’t want no steeenking WARNING! What’s up?

    mysql> SELECT * from continent;
    +---------+-----------+--------+-----------------+
    | version | ascension | region | region_name     |
    +---------+-----------+--------+-----------------+
    | Gv3     | 7Sept2015 | 1      | Africa          |
    | Gv3     | 7Sept2015 | 2      | Asia            |
    | Gv3     | 7Sept2015 | 3      | South America   |
    | Gv3     | 7Sept2015 | 4      | North America   |
    | Gv3     | 7Sept2015 | 5      | Australia Pacif |
    | Gv3     | 7Sept2015 | 6      | Europe          |
    | Gv3     | 7Sept2015 | 7      | Antarctica      |
    | Gv3     | 7Sept2015 | 8      | Ship Stations O |
    +---------+-----------+--------+-----------------+
    8 rows in set (0.00 sec)
    

    Oh, it’s chopped off “Pacific” and “Ocean”… So I’ll make the field longer. New schema:

    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# cat continent.sql 
    CREATE TABLE continent (
        version CHAR(5) NOT NULL,
        ascension CHAR (10),
        region CHAR(1) NOT NULL,
        region_name CHAR(25)
        ) ;
    

    region_name now 25 CHAR long… So now we delete the old table, reload the schema, and reload the data:

    mysql> DROP TABLE continent;
    Query OK, 0 rows affected (0.09 sec)
    
    mysql> source continent.sql
    Query OK, 0 rows affected (0.12 sec)
    
    mysql> source load.continent.sql
    Query OK, 1 row affected (0.05 sec)
    
    Query OK, 1 row affected (0.16 sec)
    
    Query OK, 1 row affected (0.04 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.02 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.04 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    mysql> SELECT * FROM continent;
    +---------+-----------+--------+---------------------------+
    | version | ascension | region | region_name               |
    +---------+-----------+--------+---------------------------+
    | Gv3     | 7Sept2015 | 1      | Africa                    |
    | Gv3     | 7Sept2015 | 2      | Asia                      |
    | Gv3     | 7Sept2015 | 3      | South America             |
    | Gv3     | 7Sept2015 | 4      | North America             |
    | Gv3     | 7Sept2015 | 5      | Australia Pacific Islands |
    | Gv3     | 7Sept2015 | 6      | Europe                    |
    | Gv3     | 7Sept2015 | 7      | Antarctica                |
    | Gv3     | 7Sept2015 | 8      | Ship Stations Ocean       |
    +---------+-----------+--------+---------------------------+
    8 rows in set (0.00 sec)
    
    mysql> 
    

    No more error / Warning messages. We now have the full names displayed on a data dump. Note that I’ve chosen, for now, to make date just a 10 CHAR text field. I can always change it later. I’ve also decided on G for GHCN, U for USHCN, and H for Hadley (should that day ever come) with the v# for each version. Yes, storing a bunch of redundant Gv characters. But easier to read dumps…

    So there you have the basic “Create a table from a description in a file” and “load data from a set of commands in a file”. Next I’m going to load the data for “inventory” records. That will be a batch load from a file of only data, no SQL commands.

    FWIW, I’m pretty sure the Continent data doesn’t change between GHCN versions, so having a version number and ascension date is likely wasted effort. We’ll see as I go through the rest.

  8. kneel63 says:

    umm, sorry to be late, but…
    load data infile
    command has options
    “separated by” and “lines separated by”
    and has “enclosed by” as well.
    might evren do what you need with no coding.

    just sayin’…

  9. E.M.Smith says:

    @kneal63:

    Thanks for the contribution.

    Took me less time to code it (as it is trivial) than to look up the relevant command manual pages… but the bigger issue is that all examples I could find were either tab or comma delimited and I could see no way to set a “no separator” option. Position dependent formatting is often left out in “modern” systems despite being highly useful and necessary.

    So it might be in there somewhere, and I will keep an eye out for it, but I’m not going to let it limit my progress. I can always come back and clean it up later if such an option presents.

    I did think of trying SEPARATED BY “” to see if then it would take the default widths with no separator, but could find no evidence that behaviour would work. So that will be an experiment for play time.

    IMHO CSV is a lousy way to make a file as so much of text and numbers has commas in it.
    TAB is only a little better as tabs are also common.
    Fixed position has the advantage that all characters can be in the data, AND you don’t waste space on separators; so no idea why folks leave it out of languages these days.

  10. A C Osborn says:

    As you said a coding frenzy & great progress.
    I am a DOS/BASIC MS EXCEL & ACCESS VBA programmer and have coded quite a few import routines to handle poorly controlled data.
    It is great to see how your approach with FORTRAN works.
    Not knowing FORTRAN I was a bit surprised by the
    C
    GO TO 10
    C
    C If end of file, then stop…

    In BASIC & VBA the “if EOF” would come before “goto 10” statement to prevent an endless loop or an error occurring.

  11. E.M.Smith says:

    @A.C.Osborn:

    That “C if end…” is just a comment. All lines with a C in the start are comments. That actual magic sauce was applied earlier:

    10 READ (1, 11, END=99) CONT, COUNTRY, NAME
    

    At the moment you opened the file to READ, it was supplied with what to do on END, that being go to line 99 and continue there. The comment is just in place to let folks know that somewhere “way up there” something has a “GO TO 99” or in this case an END=99 in a READ.

    It is common to put a “comes from” comment next to a target of a GOTO.

    20    GO TO 45
    .....
    C     Comes From 20 or Comes From READ Loop near 20 or... 
    45    CONTINUE
    

    It is also considered bad form to just send a GOTO to some random line that seems unconnected like having it “GOTO 45” where 45 does something like some calculation. Then there is no indication at line 45 that this calculation will suddenly be done after some jump instead of just inline in order.

    Some folks get their panties in a bunch about the GOTO… others of us just adapt and use the Comes From comment to make it work reasonably well and clearly…

    @All:

    I’d intended to just use the mysqlimport command first to demonstrate loading the database. Instead I’ve used the internal LOAD command. It, by default, blocks loading data from local sources (claiming some kind of security issue…). As I am the only person on this box, it isn’t a security issue… but I got to deal with it anyway.

    First off, if your session is idle too long, it must restart SQL… but you don’t need to actively do anything to make that happen (so why issue the error message anyway?…) Oh, and eventually I’ll remember that every command must terminate with both a ; and a /cr…

    mysql> source country.sql
    ERROR 2006 (HY000): MySQL server has gone away
    No connection. Trying to reconnect...
    Connection id:    39
    Current database: temps
    
    Query OK, 0 rows affected (0.11 sec)
    
    mysql> show tables
        -> ;
    +-----------------+
    | Tables_in_temps |
    +-----------------+
    | continent       |
    | country         |
    | inventory       |
    +-----------------+
    3 rows in set (0.00 sec)
    
    mysql> describe country
        -> ;
    +--------------+----------+------+-----+---------+-------+
    | Field        | Type     | Null | Key | Default | Extra |
    +--------------+----------+------+-----+---------+-------+
    | version      | char(5)  | NO   |     | NULL    |       |
    | ascension    | char(8)  | YES  |     | NULL    |       |
    | region       | char(1)  | NO   |     | NULL    |       |
    | country      | char(3)  | NO   |     | NULL    |       |
    | country_name | char(15) | YES  |     | NULL    |       |
    | start_wmo    | char(5)  | YES  |     | NULL    |       |
    | end_wmo      | char(5)  | YES  |     | NULL    |       |
    +--------------+----------+------+-----+---------+-------+
    7 rows in set (0.01 sec)
    

    All well and good, so now I’m ready to load some country data. I’ve put it nicely in a file for each version (and I’ve changed the format to put the fields in the same order as the database…) Now if I could just remember to type the ; before hitting return…

    mysql> LOAD DATA LOCAL INFILE '../v1/CCodes.txt' INTO TABLE country
        -> ;
    ERROR 1148 (42000): The used command is not allowed with this MySQL version
    
    mysql> LOAD DATA LOCAL INFILE '/SG2/ext/chiefio/SQL/v1/CCodes.txt' INTO TABLE country;
    ERROR 1148 (42000): The used command is not allowed with this MySQL version
    mysql> exit
    Bye
    

    Seems that you must launch mysql with a particular flag to load local data…

    chiefio@odroidxu4:~/SQL/Table.Schemas$ mysql -u root -p --local-infile temps
    Enter password: 
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 40
    Server version: 5.5.62-0+deb8u1 (Debian)
    
    Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> LOAD DATA LOCAL INFILE '../v1/CCodes.txt' INTO TABLE country
        -> ;
    Query OK, 208 rows affected, 41 warnings (0.07 sec)
    Records: 208  Deleted: 0  Skipped: 0  Warnings: 41
    

    Oddly, it didn’t complain about my not issuing a “USE temps”…

    mysql>  select * from country ;
    +---------+-----------+--------+---------+-----------------+-----------+---------+
    | version | ascension | region | country | country_name    | start_wmo | end_wmo |
    +---------+-----------+--------+---------+-----------------+-----------+---------+
    | Gv1     | UNKNOWN   | 1      | 101     |  CAPE VERDE     | 08583     | 08594   |
    | Gv1     | UNKNOWN   | 1      | 102     |  WESTERN SAHARA | 60033     | 60033   |
    | Gv1     | UNKNOWN   | 1      | 102     |  WESTERN SAHARA | 60096     | 60096   |
    | Gv1     | UNKNOWN   | 1      | 103     |  MOROCCO        | 60100     | 60319   
    

    Yay! It worked. Now I can load the v2 and v3 data as well and then do some comparison reports on how countries and numbers change between them. (Or find out it’s not very interesting and I just need to manually fix up one country marker and move on…) Note that STARTWMO and ENDWMO only exist in v1, so while I could figure them out for V2 and V3, they are not supplied.

    Here’s the current version of the ccodesv2.f program. Note I’ve added hard set constants for an “UNKNOWN” ascension date and “Gv2” VERSION value. Also fields have been reordered into DB order for easier loading. I’m not convinced I’ll actually end up using the ascension date, and I’ve still not put in place any index fields. That will come later after I’m done playing and learning a bit…

    root@odroidxu4:/SG2/ext/chiefio/SQL/v2# cat ccodesv2.f 
    C FORTRAN to read the country.codes v2 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 5 VERSION
          CHARACTER * 10 ASCEN
          CHARACTER * 25 NAME
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Set some constants
          ASCEN="UNKNOWN"
          VERSION="Gv2"
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='v2.country.codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) CONT, COUNTRY, NAME
    C
       11 FORMAT (A1, A2, X, A25)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE (6, 6) VERSION,TAB, ASCEN,TAB ,CONT,TAB, CONT,COUNTRY,TAB,  &
         &NAME, TAB, TAB
    C
        6 FORMAT (A5,A1,A10,A1, A1,A1, A1,A2,A1, A25, A1, A1)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    
  12. E.M.Smith says:

    Well that was quick…

    mysql> LOAD DATA LOCAL INFILE '../v2/CCodes.txt' INTO TABLE country
        -> ;
    Query OK, 233 rows affected, 56 warnings (0.06 sec)
    Records: 233  Deleted: 0  Skipped: 0  Warnings: 56
    
    mysql> LOAD DATA LOCAL INFILE '../v3/CCodes.txt' INTO TABLE country ;
    Query OK, 237 rows affected, 296 warnings (0.04 sec)
    Records: 237  Deleted: 0  Skipped: 0  Warnings: 296
    

    About as fast as I could type it…

    I do wonder what the 56 and 296 Warnings are about. Maybe I need to check a field length ;-)

    mysql> SELECT version, country_name, country FROM country WHERE country_name="TURKEY";
    +---------+--------------+---------+
    | version | country_name | country |
    +---------+--------------+---------+
    | Gv2     | TURKEY       | 649     |
    | Gv3     | TURKEY       | 649     |
    +---------+--------------+---------+
    2 rows in set (0.00 sec)
    
    mysql> 
    

    So wonder why no TURKEY in v1….

    mysql> SELECT version, country_name FROM country WHERE version="Gv1";
    +---------+-----------------+
    | version | country_name    |
    +---------+-----------------+
    | Gv1     |  CAPE VERDE     |
    | Gv1     |  WESTERN SAHARA |
    ...
    | Gv1     |  TURKEY         |
    | Gv1     |  CYPRUS         |
    | Gv1     |  UNION OF SOVIE |
    | Gv1     |  UNION OF SOVIE |
    | Gv1     |  UNION OF SOVIE |
    | Gv1     |  UNION OF SOVIE |
    | Gv1     |  UNION OF SOVIE |
    | Gv1     |  SYRIA          |
    | Gv1     |  LEBANON        |
    | Gv1     |  ISRAEL         |
    | Gv1     |  JORDAN         |
    | Gv1     |  WMO REGION 6 D |
    | Gv1     |  ANTARCTIC STAT |
    +---------+-----------------+
    

    Yet there it is… I wonder if we have variable “blank” space issues in the names… Looks like Turkey might have a leading blank in V1. Then clearly the Union of Soviet… got cut off. I think I see my “Warning” issue… I think I need to make NAME more like 25 long… Wonder if it got longer between versions?

    So, OK, it’s working. Need to polish it some but the basics are done for the inventory stuff. I’ll polish a bit today and likely get to loading temperature data tomorrow.

  13. E.M.Smith says:

    OK, the input file to the v1 program has 2X spaces before the name, not one. Got it. Also it looks like some names are very long. 54 characters:

    60000-68999 199  WMO REGION 1 DEPENDENT ISLANDS / OCEAN VESSEL STATIONS
    

    So I need to bump up the field size in the table s well as in the FORTRAN that picks names out of these input files. Maybe 56 CHAR just for grins… A nice 7 byte size…

    It looks like there are 6 of those WMO REGION 1-6 islands and ocean entries for the 6 major continents, so I suppose it’s important enough to capture the whole name… We’re talking about 600 total records so it isn’t like it’s a lot of bytes to make it 56 instead of 20…

  14. E.M.Smith says:

    I WILL type a ; at the end of the command.
    I WILL type a ; at the end of the command.
    I WILL…

    mysql> drop table continent
        -> ;
    Query OK, 0 rows affected (0.05 sec)
    
    mysql> source continent.sql
    Query OK, 0 rows affected (0.12 sec)
    
    mysql> describe continent;
    +-------------+----------+------+-----+---------+-------+
    | Field       | Type     | Null | Key | Default | Extra |
    +-------------+----------+------+-----+---------+-------+
    | version     | char(5)  | NO   |     | NULL    |       |
    | ascension   | char(10) | YES  |     | NULL    |       |
    | region      | char(1)  | NO   |     | NULL    |       |
    | region_name | char(56) | YES  |     | NULL    |       |
    +-------------+----------+------+-----+---------+-------+
    4 rows in set (0.01 sec)
    mysql> 
    
    Hmmm.... Now I've got too many Turkeys ;-)    
    
    And I've still got warnings.  Looks like I'll need to find out how to turn on full warning messages...
    
    
    mysql> LOAD DATA LOCAL INFILE '../v1/CCodes.txt' INTO TABLE country;
    Query OK, 208 rows affected, 38 warnings (0.06 sec)
    Records: 208  Deleted: 0  Skipped: 0  Warnings: 38
    
    mysql> LOAD DATA LOCAL INFILE '../v2/CCodes.txt' INTO TABLE country;
    Query OK, 233 rows affected, 56 warnings (0.06 sec)
    Records: 233  Deleted: 0  Skipped: 0  Warnings: 56
    
    mysql> LOAD DATA LOCAL INFILE '../v3/CCodes.txt' INTO TABLE country;
    Query OK, 237 rows affected, 296 warnings (0.07 sec)
    Records: 237  Deleted: 0  Skipped: 0  Warnings: 296
    
    mysql> SELECT version, country, country_name FROM country WHERE country_name="TURKEY";
    +---------+---------+--------------+
    | version | country | country_name |
    +---------+---------+--------------+
    | Gv2     | 649     | TURKEY       |
    | Gv3     | 649     | TURKEY       |
    | Gv1     | 630     | TURKEY       |
    | Gv2     | 649     | TURKEY       |
    | Gv3     | 649     | TURKEY       |
    +---------+---------+--------------+
    5 rows in set (0.01 sec)
    

    So the Country number changes between V1 and V2/V3, then I’ve got two entries for each of V2 and V3.

    There’s only one TURKEY in the v3/CCodes.txt file, so checking a “SELECT *” report, it shows 2 sets of data for all three versions (including the original v1 with leading space, which is why it doesn’t show up in the “country_name=” report).

    OK, I didn’t scrub and remake the table…. looks like I dropped “continent” instead… Maybe these need names that are more different ;-0

    Why am I showing all these “goofs”? So folks can see just how much programming is random walk through the error messages. Don’t be discouraged if you get errors and things get screwed up. We all do it. The “Pros” just know not to give up and to clean up the messes as they accrue…

    So now I’m going back to have another “do over” ;-)

  15. E.M.Smith says:
    mysql> drop table continent;
    Query OK, 0 rows affected (0.06 sec)
    
    mysql> source continent.sql
    Query OK, 0 rows affected (0.14 sec)
    
    mysql> describe continent
        -> ;
    +-------------+----------+------+-----+---------+-------+
    | Field       | Type     | Null | Key | Default | Extra |
    +-------------+----------+------+-----+---------+-------+
    | version     | char(5)  | NO   |     | NULL    |       |
    | ascension   | char(10) | YES  |     | NULL    |       |
    | region      | char(1)  | NO   |     | NULL    |       |
    | region_name | char(25) | YES  |     | NULL    |       |
    +-------------+----------+------+-----+---------+-------+
    4 rows in set (0.00 sec)
    mysql> drop table country;
    Query OK, 0 rows affected (0.06 sec)
    
    mysql> source country.sql
    Query OK, 0 rows affected (0.07 sec)
    
    mysql> describe country;
    +--------------+----------+------+-----+---------+-------+
    | Field        | Type     | Null | Key | Default | Extra |
    +--------------+----------+------+-----+---------+-------+
    | version      | char(5)  | NO   |     | NULL    |       |
    | ascension    | char(8)  | YES  |     | NULL    |       |
    | region       | char(1)  | NO   |     | NULL    |       |
    | country      | char(3)  | NO   |     | NULL    |       |
    | country_name | char(56) | YES  |     | NULL    |       |
    | start_wmo    | char(5)  | YES  |     | NULL    |       |
    | end_wmo      | char(5)  | YES  |     | NULL    |       |
    +--------------+----------+------+-----+---------+-------+
    7 rows in set (0.00 sec)
    

    So now they are both pristine again. And I can try this One More Time on the data load:

    mysql> source load.continent.sql ;
    Query OK, 1 row affected (0.05 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.05 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.04 sec)
    
    Query OK, 1 row affected (0.03 sec)
    
    Query OK, 1 row affected (0.05 sec)
    
    mysql> SELECT * FROM continent;
    +---------+-----------+--------+---------------------------+
    | version | ascension | region | region_name               |
    +---------+-----------+--------+---------------------------+
    | Gv3     | 7Sept2015 | 1      | Africa                    |
    | Gv3     | 7Sept2015 | 2      | Asia                      |
    | Gv3     | 7Sept2015 | 3      | South America             |
    | Gv3     | 7Sept2015 | 4      | North America             |
    | Gv3     | 7Sept2015 | 5      | Australia Pacific Islands |
    | Gv3     | 7Sept2015 | 6      | Europe                    |
    | Gv3     | 7Sept2015 | 7      | Antarctica                |
    | Gv3     | 7Sept2015 | 8      | Ship Stations Ocean       |
    +---------+-----------+--------+---------------------------+
    8 rows in set (0.00 sec)
    mysql> LOAD DATA LOCAL INFILE '../v1/CCodes.txt' INTO TABLE country;
    Query OK, 208 rows affected (0.24 sec)
    Records: 208  Deleted: 0  Skipped: 0  Warnings: 0
    
    mysql> LOAD DATA LOCAL INFILE '../v2/CCodes.txt' INTO TABLE country;
    Query OK, 233 rows affected (0.07 sec)
    Records: 233  Deleted: 0  Skipped: 0  Warnings: 0
    
    mysql> LOAD DATA LOCAL INFILE '../v3/CCodes.txt' INTO TABLE country;
    Query OK, 237 rows affected, 237 warnings (0.07 sec)
    Records: 237  Deleted: 0  Skipped: 0  Warnings: 237
    
    mysql> 
    

    All the warnings are not gone from the v2 load, but the v3 load is still complaining. I’ll need to kick it around some more… But this now looks pretty good:

    mysql> SELECT * FROM country WHERE country_name="TURKEY";
    +---------+-----------+--------+---------+--------------+-----------+---------+
    | version | ascension | region | country | country_name | start_wmo | end_wmo |
    +---------+-----------+--------+---------+--------------+-----------+---------+
    | Gv1     | UNKNOWN   | 6      | 630     | TURKEY       | 17000     | 17399   |
    | Gv2     | UNKNOWN   | 6      | 649     | TURKEY       |           |         |
    | Gv3     | 7Sept201  | 6      | 649     | TURKEY       |           |         |
    +---------+-----------+--------+---------+--------------+-----------+---------+
    3 rows in set (0.00 sec)
    
    mysql> 
    

    One entry from each version. We can see the change of country number between V1 and the others, and we can see the others have no STARTWMO or ENDWMO data. Names now nicely all start in the first character without a blank. It does look like my V3 date gets cut off and the last digit is missing. I’d wager that’s my error message..

    Gv3  	7Sept2015 	6	654	MADEIRA ISLANDS (PORTUGAL)                              		
    Gv3  	7Sept2015 	7	700	ANTARCTICA                                              		
    Gv3  	7Sept2015 	7	701	ARGENTINE BASE IN ANTARCTICA                            		
    Gv3  	7Sept2015 	8	800	SHIP STATIONS                                           		
    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# 
    

    It is in the source data file I create with the FORTRAN program and here’s where I don’t like TAB delimited files. You can not see the TAB. So is there a bogus SPACE in there somewhere or are the TABs where they need to be? You can’t tell. At least with Fixed Format you can lay a ruler on it and check positions…

    So OK, one more bit of debugging on the date field and then it looks like this one will be done.

  16. E.M.Smith says:

    Ah, that was easy. I’d not updated the country.sql shema file to reflect my decision to change from 8 bytes to 10 bytes for the Ascension number:

    ascension CHAR(8),
    

    As the other files have “UNKNOWN” and that’s only 7 CHAR, it fit without error. The added byte in the date didn’t fit, so tossed a warning….

    So I’ll drop the table and recreate it gain but with 10 CHAR field, then reload. I’m not going to post the details this time as it would be very redundant with the above. So “This time for sure!” ;-)

  17. E.M.Smith says:

    And it worked. No warning messages:

    mysql> LOAD DATA LOCAL INFILE '../v1/CCodes.txt' INTO TABLE country;
    Query OK, 208 rows affected (0.07 sec)
    Records: 208  Deleted: 0  Skipped: 0  Warnings: 0
    
    mysql> LOAD DATA LOCAL INFILE '../v2/CCodes.txt' INTO TABLE country;
    Query OK, 233 rows affected (0.08 sec)
    Records: 233  Deleted: 0  Skipped: 0  Warnings: 0
    
    mysql> LOAD DATA LOCAL INFILE '../v3/CCodes.txt' INTO TABLE country;
    Query OK, 237 rows affected (0.07 sec)
    Records: 237  Deleted: 0  Skipped: 0  Warnings: 0
    mysql> 
    

    So at this time I’m done with the ‘continent’ and ‘country’ tables (other than eventually adding index fields once I find a need… and which ones matter most).

    So this will conclude what I put in this posting. I’ll make a new one for when I load the temperature data table and for any interesting results from reports run.

    Just as completion, here’s the schema files for the two tables, and the final version of the FORTRAN programs:

    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# cat continent.sql 
    CREATE TABLE continent (
        version CHAR(5) NOT NULL,
        ascension CHAR (10),
        region CHAR(1) NOT NULL,
        region_name CHAR(25)
        ) ;
    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# cat country.sql 
    CREATE TABLE country (
        version CHAR(5) NOT NULL,
        ascension CHAR(10),
        region CHAR(1) NOT NULL,
        country CHAR(3) NOT NULL,
        country_name  CHAR(56),
        start_wmo CHAR(5),
        end_wmo CHAR(5)
        ) ;
    

    v1 FORTRAN:

    root@odroidxu4:/SG2/ext/chiefio/SQL/v1# cat ccodes.f 
    C FORTRAN to read the country.codes v1 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
          CHARACTER * 10 ASCEN
          CHARACTER * 5  STARTWMO ,ENDWMO, VERSION
          CHARACTER * 56 NAME
    C
    C Set the TAB character
          TAB=CHAR(09)
    C Set other constants
          VERSION="Gv1"
          ASCEN="UNKNOWN"
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='country.codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) STARTWMO, ENDWMO, CONT, COUNTRY, NAME
    C
       11 FORMAT (A5, X, A5, X, A1, A2, 2X, A56)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE(6, 6) VERSION,TAB, ASCEN,TAB, CONT,TAB,CONT,COUNTRY,TAB,    &
         &NAME,TAB, STARTWMO,TAB,ENDWMO,TAB
    C
        6 FORMAT (A5,A1, A10,A1, A1,A1, A1,A2,A1, A56,A1, A5,A1, A5,A1)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    

    The v2 FORTRAN:

    root@odroidxu4:/SG2/ext/chiefio/SQL/v2# cat ccodesv2.f 
    C FORTRAN to read the country.codes v2 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 5 VERSION
          CHARACTER * 10 ASCEN
          CHARACTER * 56 NAME
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Set some constants
          ASCEN="UNKNOWN"
          VERSION="Gv2"
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='v2.country.codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) CONT, COUNTRY, NAME
    C
       11 FORMAT (A1, A2, X, A56)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE (6, 6) VERSION,TAB, ASCEN,TAB ,CONT,TAB, CONT,COUNTRY,TAB,  &
         &NAME, TAB, TAB
    C
        6 FORMAT (A5,A1,A10,A1, A1,A1, A1,A2,A1, A56, A1, A1)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    

    And finally the v3 FORTRAN:

    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# cat ccodesv3.f 
    C FORTRAN to read the country.codes v3 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 1 TAB
          CHARACTER * 5 VERSION
          CHARACTER * 10 ASCEN
          CHARACTER * 56 NAME
          CHARACTER * 1 CONT
          CHARACTER * 2 COUNTRY
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Set some constants
          VERSION="Gv3"
          ASCEN="7Sept2015"
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='country-codes', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) CONT, COUNTRY, NAME
    C
       11 FORMAT (A1, A2, X, A56)
    C
    C Write out one line of data with TAB between fields
    C
          WRITE (6, 6) VERSION,TAB, ASCEN,TAB, CONT,TAB, CONT,COUNTRY,TAB,  &
         &NAME, TAB, TAB
    C
        6 FORMAT (A5,A1, A10,A1, A1,A1, A1,A2,A1, A56, A1, A1)
    C
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop...
    C
       99 STOP
          END
    

    With some ponder time on it, I’m likely to make a bit more changes happen. One I can easily see is having a single country code for each country, but also store the original country codes. That will take some matching of the versions, but not too hard. It looks like about 288 lines and most of v2 & v3 the same.

    A similar but much more complicated issue will come up with the WMO numbers as they changed between v1 and v2/v3 also; but there’s a whole lot more of them to unscramble… so this case might be a good learning tool.

    With that, I’m done posting code in comments here. I’ll just be doing discussion now in this thread.

  18. jim2 says:

    When I was playing around with MySQL, I used a GUI front-end. It removes a lot of the annoyances. Here are 10 GUIs …

    https://www.databasejournal.com/features/mysql/slideshows/top-10-mysql-gui-tools.html

  19. Pingback: GHCN Country Code Changes by Version – MySQL results | Musings from the Chiefio

  20. E.M.Smith says:

    @Jim2:

    Thanks! I’ll take a look.

    I like to learn the command line layer in any case, but eventually I may get around to using a graphical overlay bit. My experience has been that you end up at the command line for the tricky bits anyway, so might as well learn it. Don’t know if that applies to MySQL or not.

  21. jim2 says:

    Looking back at my archives, it appears I had SQLWorkbench. You can execute command lines from it. The blurb:

    MySQL Workbench is a unified visual tool for database architects, developers, and DBAs. MySQL Workbench provides data modeling, SQL development, and comprehensive administration tools for server configuration, user administration, backup, and much more. MySQL Workbench is available on Windows, Linux and Mac OS X.

    https://www.mysql.com/products/workbench/

  22. jim2 says:

    Oops, you can run ad-hoc sql queries and commands, not command line.

  23. E.M.Smith says:

    Arn’t those the same thing? Or is it a subset?….

  24. jim2 says:

    I don’t recall it having something like terminal built in to the app. But you can run SQL … create tables, views, select data … whatever u can do with SQL. But wouldn’t run grep, for example.

  25. E.M.Smith says:

    Ah, Got It! The mysql> command line, not the linux#> command line. Works for me… I have two open terminals, one of each, while doing this stuff.

  26. Pingback: MariaDB (MySQL) Login Failure New Install Fix | Musings from the Chiefio

  27. E.M.Smith says:

    This install procedure assumes you have mysql_native_password authentication set and as of Debian 9 that is not true, it uses “unix_socket”. The difference (and how to fix it) is recorded here:
    https://chiefio.wordpress.com/2019/03/08/mariadb-mysql-login-failure-new-install-fix/

Comments are closed.