Some long time ago I wanted to make a MySQL database of GHCN temperatures. Then I found someone had made a “load it up” ready to go set for V3 and just sort of “moved on”. https://chiefio.wordpress.com/2015/08/31/ghcn-in-mysql-quick-and-easy/
Now I see NOAA are up to GHCN V4 and I’m thinking it would be nice to have V1, V2, V3, and V4 all in one nice giant database where I could compare and contrast easily. So my round tuit has arrived.
I’ve sketched out the database (DB) schema I want to follow. (I know, SQL databases don’t really have a schema, but I’m an old dog and it’s a good mental trick to think in terms of a formal design graph). Now it’s time to try it and start finding where I forgot to put a “;” or spelled something wrongly.
First off, you must put MySQL Server on your box. Here we come to the fine point that the SQL language is not the server. (Unless you are a DBA Database Adminstrator for a living you are unlikely to care…) FWIW, most folks pronounce SQL “sequel”. I say it S-Q-L. The reason is simple. There once was a database named Sequel. They won a lawsuit against others using their name, and the others renamed it SQL but continued to say it as “sequel”. If find that offensive…
in the early 1970s. Initially called SEQUEL (Structured English Query Language) and based on their original language called SQUARE (Specifying Queries As Relational Expressions).SEQUEL was later renamed to SQL by dropping the vowels, because SEQUEL was a trade mark registered by the Hawker Siddeley aircraft company.
Say it however you want, just know when you are endorsing fraud and deception…
With that out of the way…
There are many database server back ends that support the S-Q-L Structured Query Language reporting abilities. The one that is common, and free, on Linux is MySQL-server. It also includes some administrative commands that are server oriented and not formally part of SQL The Language. (Microsoft also has an SQL Server with a whole lot of proprietary stuff in it but I’m entirely uninterested in it; even though it pays well to hold your nose and know / support it.) So some of this stuff will be generic SQL, hopefully most of it, but some will be different from The Micro$oft Way.
Here’s how you install it on Devuan / Debian:
Is it already there?
root@odroidxu4:/SG2/ext/chiefio# which mysql root@odroidxu4:/SG2/ext/chiefio#
So install it:
root@odroidxu4:/SG2/ext/chiefio# apt-get install mysql-server Reading package lists... Done Building dependency tree Reading state information... Done The following packages were automatically installed and are no longer required: libjsoncpp0 libuuid-perl Use 'apt-get autoremove' to remove them. The following extra packages will be installed: libaio1 libdbd-mysql-perl libdbi-perl libterm-readkey-perl mysql-client-5.5 mysql-server-5.5 mysql-server-core-5.5 Suggested packages: libclone-perl libmldbm-perl libnet-daemon-perl libsql-statmysql> CREATE DATABASE temps; Query OK, 1 row affected (0.00 sec) mysql> GRANT ALL ON temps.* to chiefio@localhost -> IDENTIFIED BY 'LetMeIn!'; Query OK, 0 rows affected (0.00 sec) mysql> DROP TABLE IF EXISTS inventory; ERROR 1046 (3D000): No database selected ement-perl mailx tinyca Recommended packages: libhtml-template-perl The following NEW packages will be installed: libaio1 libdbd-mysql-perl libdbi-perl libterm-readkey-perl mysql-client-5.5 mysql-server mysql-server-5.5 mysql-server-core-5.5 0 upgraded, 8 newly installed, 0 to remove and 5 not upgraded. Need to get 7,262 kB of archives. After this operation, 81.6 MB of additional disk space will be used. Do you want to continue? [Y/n] y
After you answer “y” it will, somewhere in the middle of the stuff below, put up an ncurses (blue block menu stuff) panel asking for a “root” SQL manager password. I gave it “OpenUp!” for this demo case. Remember you must “tab” to the OK button to tell it to take your answer.
Get:1 http://auto.mirror.devuan.org/merged/ jessie/main libaio1 armhf 0.3.110-1 [9,160 B] Get:2 http://auto.mirror.devuan.org/merged/ jessie/main libdbi-perl armhf 1.631-3+b1 [810 kB] Get:3 http://auto.mirror.devuan.org/merged/ jessie/main libdbd-mysql-perl armhf 4.028-2+deb8u2 [113 kB] Get:4 http://auto.mirror.devuan.org/merged/ jessie/main libterm-readkey-perl armhf 2.32-1+b1 [26.5 kB] Get:5 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-client-5.5 armhf 5.5.62-0+deb8u1 [1,454 kB] Get:6 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-server-core-5.5 armhf 5.5.62-0+deb8u1 [3,171 kB] Get:7 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-server-5.5 armhf 5.5.62-0+deb8u1 [1,604 kB] Get:8 http://auto.mirror.devuan.org/merged/ jessie-security/main mysql-server all 5.5.62-0+deb8u1 [73.9 kB] Fetched 7,262 kB in 6s (1,047 kB/s) Preconfiguring packages ... Selecting previously unselected package libaio1:armhf. (Reading database ... 92227 files and directories currently installed.) Preparing to unpack .../libaio1_0.3.110-1_armhf.deb ... Unpacking libaio1:armhf (0.3.110-1) ... Selecting previously unselected package libdbi-perl. Preparing to unpack .../libdbi-perl_1.631-3+b1_armhf.deb ... Unpacking libdbi-perl (1.631-3+b1) ... Selecting previously unselected package libdbd-mysql-perl. Preparing to unpack .../libdbd-mysql-perl_4.028-2+deb8u2_armhf.deb ... Unpacking libdbd-mysql-perl (4.028-2+deb8u2) ... Selecting previously unselected package libterm-readkey-perl. Preparing to unpack .../libterm-readkey-perl_2.32-1+b1_armhf.deb ... Unpacking libterm-readkey-perl (2.32-1+b1) ... Selecting previously unselected package mysql-client-5.5. Preparing to unpack .../mysql-client-5.5_5.5.62-0+deb8u1_armhf.deb ... Unpacking mysql-client-5.5 (5.5.62-0+deb8u1) ... Selecting previously unselected package mysql-server-core-5.5. Preparing to unpack .../mysql-server-core-5.5_5.5.62-0+deb8u1_armhf.deb ... Unpacking mysql-server-core-5.5 (5.5.62-0+deb8u1) ... Selecting previously unselected package mysql-server-5.5. Preparing to unpack .../mysql-server-5.5_5.5.62-0+deb8u1_armhf.deb ... Unpacking mysql-server-5.5 (5.5.62-0+deb8u1) ... Selecting previously unselected package mysql-server. Preparing to unpack .../mysql-server_5.5.62-0+deb8u1_all.deb ... Unpacking mysql-server (5.5.62-0+deb8u1) ... Processing triggers for man-db (18.104.22.168-5) ... Processing triggers for systemd (215-17+deb8u7) ... Setting up libaio1:armhf (0.3.110-1) ... Setting up libdbi-perl (1.631-3+b1) ... Setting up libdbd-mysql-perl (4.028-2+deb8u2) ... Setting up libterm-readkey-perl (2.32-1+b1) ... Setting up mysql-client-5.5 (5.5.62-0+deb8u1) ... Setting up mysql-server-core-5.5 (5.5.62-0+deb8u1) ... Setting up mysql-server-5.5 (5.5.62-0+deb8u1) ... [ ok ] Stopping MySQL database server: mysqld. 190118 13:25:06 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead. 190118 13:25:06 [Note] Ignoring --secure-file-priv value as server is running with --bootstrap. 190118 13:25:06 [Note] /usr/sbin/mysqld (mysqld 5.5.62-0+deb8u1) starting as process 24590 ... [ ok ] Starting MySQL database server: mysqld .. [info] Checking for tables which need an upgrade, are corrupt or were not closed cleanly.. Setting up mysql-server (5.5.62-0+deb8u1) ... Processing triggers for libc-bin (2.19-18+deb8u10) ... Processing triggers for systemd (215-17+deb8u7) ...
So, is it there now?
root@odroidxu4:/SG2/ext/chiefio# which mysql /usr/bin/mysql
There is also a utility we will use a bit later to load some data from a file:
root@odroidxu4:/SG2/ext/chiefio# which mysqlimport /usr/bin/mysqlimport
Is there other stuff? OH yeah…
root@odroidxu4:/SG2/ext/chiefio# ls /usr/bin/my* /usr/bin/myisamchk /usr/bin/mysqlbinlog /usr/bin/mysql_find_rows /usr/bin/mysql_secure_installation /usr/bin/myisam_ftdump /usr/bin/mysqlbug /usr/bin/mysql_fix_extensions /usr/bin/mysql_setpermission /usr/bin/myisamlog /usr/bin/mysqlcheck /usr/bin/mysqlhotcopy /usr/bin/mysqlshow /usr/bin/myisampack /usr/bin/mysql_client_test /usr/bin/mysqlimport /usr/bin/mysqlslap /usr/bin/my_print_defaults /usr/bin/mysql_convert_table_format /usr/bin/mysql_install_db /usr/bin/mysqltest /usr/bin/mysql /usr/bin/mysqld_multi /usr/bin/mysqloptimize /usr/bin/mysql_tzinfo_to_sql /usr/bin/mysqlaccess /usr/bin/mysqld_safe /usr/bin/mysql_plugin /usr/bin/mysql_upgrade /usr/bin/mysqladmin /usr/bin/mysqldump /usr/bin/mysqlrepair /usr/bin/mysql_waitpid /usr/bin/mysqlanalyze /usr/bin/mysqldumpslow /usr/bin/mysqlreport /usr/bin/mysql_zap
Basically to become a MySQL DBA, you learn what all of those do and how to make SQL work as you wish it to work. I’ll be using a subset and only learning the bits I need as I need them. “mysqudump” for example will be low on my list as I’m just loading static data from other sources and can simply recreate the data input load process.
A Test Case
Before getting around to actually loading the GHCN stuff, since it is large and with several variations, I’m going to do a couple of simple test cases to see if things work and have I got the syntax at all right.
Make A Database and A Table and Load It
I tried to log in the way most things tell you do do it, and failed.
sudo mysql -u root -p
Found a page saying you need to set up the security options first and did so, then it worked. Even though I just left everything the same. Sooo…. you probably need to do this:
chiefio@odroidxu4:/var/log/mysql$ sudo mysql_secure_installation NOTE: RUNNING ALL PARTS OF THIS SCRIPT IS RECOMMENDED FOR ALL MySQL SERVERS IN PRODUCTION USE! PLEASE READ EACH STEP CAREFULLY! In order to log into MySQL to secure it, we'll need the current password for the root user. If you've just installed MySQL, and you haven't set the root password yet, the password will be blank, so you should just press enter here. Enter current password for root (enter for none): OK, successfully used password, moving on... Setting the root password ensures that nobody can log into the MySQL root user without the proper authorisation. You already have a root password set, so you can safely answer 'n'. Change the root password? [Y/n] n ... skipping. By default, a MySQL installation has an anonymous user, allowing anyone to log into MySQL without having to have a user account created for them. This is intended only for testing, and to make the installation go a bit smoother. You should remove them before moving into a production environment. Remove anonymous users? [Y/n] n ... skipping. Normally, root should only be allowed to connect from 'localhost'. This ensures that someone cannot guess at the root password from the network. Disallow root login remotely? [Y/n] Y ... Success! By default, MySQL comes with a database named 'test' that anyone can access. This is also intended only for testing, and should be removed before moving into a production environment. Remove test database and access to it? [Y/n] n ... skipping. Reloading the privilege tables will ensure that all changes made so far will take effect immediately. Reload privilege tables now? [Y/n] Y ... Success! Cleaning up... All done! If you've completed all of the above steps, your MySQL installation should now be secure. Thanks for using MySQL!
After which I could, in fact, log in to mysql.
chiefio@odroidxu4:/var/log/mysql$ sudo mysql -u root ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
Won’t let me do it without a password…
chiefio@odroidxu4:/var/log/mysql$ sudo mysql -u root -p Enter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 52 Server version: 5.5.62-0+deb8u1 (Debian) Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>
Eventually I’ll add a user of my choosing and loosen up security so it doesn’t need a “sudo” to make it go. ( I found some pages saying you need a sudo after some 5.7? release or some such, with some directions about it) Lots of interesting “wandering in the woods” commentary here:
Then add the database, let “chiefio” use it with a particular password, and put a table in it:
mysql> CREATE DATABASE temps; Query OK, 1 row affected (0.00 sec) mysql> GRANT ALL ON temps.* to chiefio@localhost -> IDENTIFIED BY 'LetMeIn!'; Query OK, 0 rows affected (0.00 sec) mysql> DROP TABLE IF EXISTS inventory; ERROR 1046 (3D000): No database selected
Oh Dear… need to “select” our database. Note that I’m using ALL CAPS for keyworks. MySQL is not sensitive to case, but this makes it easier to see what I’m doing.
mysql> USE temps; Database changed mysql> DROP TABLE IF EXISTS inventory; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> CREATE TABLE inventory ( -> region CHAR(1) NOT NULL, -> country CHAR(2) NOT NULL, -> wmo CHAR(5) NOT NULL, -> name CHAR(30) NOT NULL, -> attr CHAR(20), -> ); ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 7 mysql>
There’s a tutorial here with some interesting examples:
By inspection of this one:
CREATE TABLE IF NOT EXISTS tasks ( task_id INT AUTO_INCREMENT, title VARCHAR(255) NOT NULL, start_date DATE, due_date DATE, status TINYINT NOT NULL, priority TINYINT NOT NULL, description TEXT, PRIMARY KEY (task_id) ) ENGINE=INNODB;
I’d guess my error was putting a comma after that last line before the ); or else maybe that “ENGINE=INNODB” implies I need something after the ) so decisions decisions…
mysql> CREATE TABLE inventory ( -> region CHAR(1) NOT NULL, -> country CHAR(2) NOT NULL, -> wmo CHAR(5) NOT NULL, -> name CHAR(30) NOT NULL, -> attr CHAR(20) -> ) ; Query OK, 0 rows affected (0.08 sec)
Looks like that attr line with the “,” was the issue.
mysql> show tables; +-----------------+ | Tables_in_temps | +-----------------+ | inventory | +-----------------+ 1 row in set (0.01 sec) mysql> show full tables; +-----------------+------------+ | Tables_in_temps | Table_type | +-----------------+------------+ | inventory | BASE TABLE | +-----------------+------------+ 1 row in set (0.01 sec) mysql>
Yup, I’ve got a table. Note that I’ve not used any field as an index yet, nor have I divided the attributes into individual fields. This is a “toy” version. When it’s time for the real one, I’ll do that kind of detail stuff. I’ll also load the build script from a file and I’ll load the data from a file. For now I’m just getting the syntax set in my habits. Building the basic skills.
At some point I’ll need to explore what these other default databases might be:
mysql> show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | temps | +--------------------+ 4 rows in set (0.00 sec)
“temps” is mine, I presume the others are either demo databases or are for some internal use. A minor “dig here” for later.
So let’s put something in our table.
If done for all fields in order, you need not list field names:
mysql> INSERT INTO inventory VALUES ('1','23','56789','Test Site One','Attr list of stuff'); Query OK, 1 row affected (0.04 sec)
As I’m going to load lots of data, doing it from a file is better. Mysql expects data with TAB as the separator and a Newln at the end of a row. Since the original data is in fixed format, I’ll need to write a little ‘stub’ FORTRAN program to read it in, then write it back out with TAB between each field. I’ll put that in another posting (as this one is a bit long already and it will take a while to write the FORTRAN…)
So where IS this database, anyway? Don’t see it in my home directory or in my current working directory?
root@odroidxu4:/var/lib# pwd /var/lib root@odroidxu4:/var/lib# ls -l mysql total 28736 -rw-r--r-- 1 root root 0 Jan 18 13:25 debian-5.5.flag -rw-rw---- 1 mysql mysql 18874368 Jan 18 14:44 ibdata1 -rw-rw---- 1 mysql mysql 5242880 Jan 18 14:44 ib_logfile0 -rw-rw---- 1 mysql mysql 5242880 Jan 18 13:25 ib_logfile1 drwx------ 2 mysql root 4096 Jan 18 13:25 mysql -rw------- 1 root root 6 Jan 18 13:25 mysql_upgrade_info drwx------ 2 mysql mysql 4096 Jan 18 13:25 performance_schema drwx------ 2 mysql mysql 4096 Jan 18 14:35 temps root@odroidxu4:/var/lib#
Yeah, you can change that. And I will for the final production version. For now, for the toy version, that’s OK. I’ve got /var mounted from a hard disk partition already anyway:
root@odroidxu4:/var/lib# df . Filesystem 1K-blocks Used Available Use% Mounted on /dev/sda5 3997376 1119740 2667924 30% /var
Just realize that if you are running from a 4 GB uSD card with /var in root (/) this might be an issue. We’ll cover moving locations in a future episode…
Then, just to show stuff got into the table:
mysql> SELECT * FROM inventory ; +--------+---------+-------+---------------+--------------------+ | region | country | wmo | name | attr | +--------+---------+-------+---------------+--------------------+ | 1 | 23 | 56789 | Test Site One | Attr list of stuff | +--------+---------+-------+---------------+--------------------+ 1 row in set (0.00 sec) mysql>
The general format is:
More on that in the future too, once I have the final table up and loaded with real data.
So that’s your basic basic intro to MySQL. Minimum needed to create a database, create a table, load some data, and dump it out; along with some minimal on the topic of permissions and operational issues.
I’m generally looking at a design with 2 major tables. One the “inventory” file with data about the individual site location. That attr list includes latitude, longitude, elevation, distance to shore, and a bunch of other stuff. All those specifics need to be broken out into distinct fields (for my purposes) and I need to think a bit about indexing and matching issue. Countries change over time, so that’s one of the issues with comparing V1 to V2 to V3 to V4. Similarly, WMO Numbers have some more minor changes. Like the use of a “mod number” in V2 that isn’t in other version where it was homogenized away.
One field I clearly want to add is “version” for that GHCN v1 vs GHCN v2 and USHCN v1 and Hadley… Then I’m thinking about having an “ascension” number as they have in seed banks. That would let me load GHCN V2 from different dated saved copies and compare them, too. That might be overkill for the first cut, though. I think at first I just need to get V1, V2, and V3 loaded from known good copies.
The other file needed is the Station_Data table with the actual monthly data (temperature) values along with the quality flags and such. In some ways it is more simple as the “key” is clearly Region-Country-WMO#-ModFlag in some combination. WMO#-Mod likely enough. I may need to make a V1 vs V2 vs V3 vs V4 WMO# table that lets me map all of them to one standard WMO number. That depends on the details of how the WMO number changes over time and space. Is it just dropping the Mod Flag, or what else changed?
Sometime tonight I ought to have the first cut of a live load from files of actual data for the inventory file, for at least 1 and maybe all 3 of the GHCN data sets. USHCN to follow. Hadley just an idea at this stage; and for some very later time.
Then I’ll start on all that Station Data, once I have clue how to tell what station is whom.
FWIW, the GHCH V4 adds something like 14,000 stations, so well over the prior 6k worth. The only problem is that almost all of them are in the USA, Germany and a couple of other places. It does NOT improve spacial coverage of all those remote pristine places and it DOES increase the bias to a few highly industrialized urban places. Oh, and it does a LOT more “homogenizing” and fixing up of the data into an even more complex “data food product”. I’m not sure if you can even get “raw” data versions anymore. As NOAA is down during this shutdown budget food fight, I’m not looking at V4 just yet. Maybe after The Wall is funded ;-)