GHCN Global Thermometers – Sine Globe with Labels

I said I’d post an updated version with labels and heading, and minus the gratuitous white border you get by default. Well, here it is:

GHCN v3.3 Thermometers on sine globe with labels

I’m now using IDLE so that I can save the text of programs between runs instead of restarting my environment by relaunching the Python interpreter each time. The Debian command idle3 launches a ‘terminal window’ that is “special”: it has a syntax checker, auto-indent, a Run menu (Run Module runs what is currently being written), Save / Save As choices, and Open for pulling in existing programs. There are LOTS of other choices too, like “Open Module” and “Comment Out Region”, but that’s the minimum that got me going. When you choose “Run Module” it pops open a Python shell window, runs the code, and displays any error messages there. Repeated runs add to the history in that window.

I’ve made a specific directory for it to work in and where I’m saving my Python programs:

chiefio@odroidxu4:~/Py$ ls
abase.py  My2nd.py  MyFirst.py  SineDotLabels.py

abase.py is my starting base full of the standard set up stuff and some model code. I can then delete what isn’t needed and add what is, while not needing to retype a lot of boilerplate stuff. Sine means the sine projection, dot is the dots, and labels tells me it has labels in it. ;-)

Here’s the program in total. (It includes some gratuitous “import FOO” lines from the base skeleton, which already has all the ‘set up’ stuff in it. I suppose I ought to remove them, but…):

chiefio@odroidxu4:~/Py$ cat SineDotLabels.py 

import datetime
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
import math

df = pd.read_csv('../SQL/v3/invent.csv')

plt.title("Global Thermometer GHCN v3.3")
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.xlim(-180,180)
plt.ylim(-90,90)

PLON=df["LON"]*np.cos(np.radians(df["LAT"]))
plt.scatter(PLON, df["LAT"],s=1)
plt.show()

Note that since the .py programs are now stored in a ~/Py directory, the path name for the data file has changed to ../SQL/… I suppose I ought to eventually set up some place like /SQL or whatever and make that an absolute path (so it would work wherever launched), but I can’t do everything at once…
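One way to make that path launch-independent later (a sketch, assuming the data stays under ~/SQL/v3/): build it from the home directory with os.path, so the same line works from any launch directory:

```python
import os

# Hypothetical: anchor the CSV path at $HOME instead of the launch directory.
# The SQL/v3/invent.csv location is taken from the directory layout above.
DATA_FILE = os.path.join(os.path.expanduser("~"), "SQL", "v3", "invent.csv")
print(DATA_FILE)

# df = pd.read_csv(DATA_FILE)   # same read_csv call as before, just a stable path
```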

Next steps will be moving to direct use of the MySQL database along with making 2 more graphs. One with baseline thermometers, the other with current. At least, that’s the work plan for the rest of the day. As I get bits done I’ll post an update here with the graphs.

The good news is that with IDLE this is going much faster, and also faster as I learn more bits of the code (and as it gets stuck in the brain in cache ;-) Spending part of the day making your environment work better is a Very Good Thing, but having a reason to do so because you have actually made a graph or bit of code is also a Very Good Thing. ;-) So for now at least, I’m a happy camper ;-)

UPDATE 1:

Here’s the graph made from the database data directly. Not yet in sine globe shape, but I’ve got the database stuff working now!

GHCN v3.3 from the MySQL database

UPDATE 2:

And here’s a Sine perspective globe with data from the database. I “kind of cheated” in that I compute the cos(latitude) in the FORTRAN db load file program, but I figured since I’m going to do it a lot, better to just compute it once and store it in the DB. Sometime later I’ll figure out how to do that in Python (as a Python data load prep program) and in SQL inside the MySQL program. Just so I know all the options and can use them later if needed for something else. With that, here’s your graph:
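For reference, the same adjustment is a one-liner in Python with NumPy (a sketch with made-up sample values; in MySQL the equivalent expression would be roughly longitude * COS(RADIANS(latitude))):

```python
import numpy as np
import pandas as pd

# Sample frame standing in for the inventory; the LAT/LON column names
# follow the CSV version used earlier in the post.
df = pd.DataFrame({"LAT": [0.0, 60.0], "LON": [100.0, 100.0]})

# Longitude scaled by cos(latitude): the sine-projection x coordinate.
df["COSLONG"] = df["LON"] * np.cos(np.radians(df["LAT"]))
print(df["COSLONG"].round(2).tolist())   # → [100.0, 50.0]
```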

Sine Perspective globe of GHCN V3.3 stations from database

It ought to look just like the ones done from the CSV file load, so not a lot of visual spice here; just a statement of an accomplishment. Another checklist item of progress ticked off.

Update 3:

This is a graph of only those stations that had reported data in the last year of the V3.3 dataset I have. It looks a bit “moth eaten”, but this is the set that makes the “now” temperature value that gets compared to the baseline.

Only the 2015 reporting stations in GHCN v3.3

Update 4:

Here’s the more or less GIStemp baseline graph:

GHCN v3.3 from 1950 to 1980 GIStemp baseline

Update 5:

After discovering that the “present in the last year of the dataset – the ‘now'” set was almost identical to the “now” set (only 3 stations missing), I decided to turn it around. Here’s the map of what IS in the baseline but is MISSING from the present. So these stations set the temperature in the past, to which a DIFFERENT set of stations are compared in the present.
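In pandas terms that set difference can be sketched like this (hypothetical station IDs; assumes a unique stationID column in both sets):

```python
import pandas as pd

# Baseline stations and "now" stations, with made-up IDs for illustration.
baseline = pd.DataFrame({"stationID": ["A", "B", "C"], "LAT": [10.0, 20.0, 30.0]})
now = pd.DataFrame({"stationID": ["B", "C", "D"]})

# In the baseline but missing from the present:
dropped = baseline[~baseline["stationID"].isin(now["stationID"])]
print(dropped["stationID"].tolist())   # → ['A']
```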

GHCN v3.3 stations present in the Baseline but missing in the final year

Update 6:

Oh this is a fun one. I’ve put both the baseline and the “now” stations on the same map, but those only NOW are red (1/2 transparency) while those only in the baseline are blue (full color density). This means a red on top of a blue will give a purple, but a whole lot of reds will tend to hide the blue, and out in the boonies you can see the light red vs the blue as what was dropped vs kept.

GHCN v3.3 NOW in red 50% transparent over Baseline in Blue

Then, since the RED tends to swamp the blue where there’s a lot of thermometers now, I’ve also made one with red first, then blue as 50% transparent over it.
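Either overlay is just two scatter calls in order, with alpha on the upper layer (a minimal sketch with made-up coordinates, rendered off-screen; swap the call order and move the alpha to get the other variant):

```python
import matplotlib
matplotlib.use("Agg")              # render without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Baseline stations first, full-density blue...
ax.scatter([0, 10, 20], [0, 5, 10], s=1, color="blue")
# ...then "now" stations in red at 50% transparency on top.
ax.scatter([0, 10, 30], [0, 5, 15], s=1, color="red", alpha=0.5)
fig.savefig("overlay.png")
```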

GHCN v3.3 Baseline blue 50% transparent over 2015 “now” in red




About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in NCDC - GHCN Issues, Tech Bits. Bookmark the permalink.

56 Responses to GHCN Global Thermometers – Sine Globe with Labels

  1. E.M.Smith says:

    Aaaand we’re back in non-compatible / incomplete land…

    To use a database, you need a database connector. To use MySQL, you include in your program the line:

    import MySQLdb
    

    That import depends on a connector library that isn’t included by default, but that you can install…

    https://packages.debian.org/search?keywords=python3-mysqldb

    root@odroidxu4:/tmp# apt-get install python3-mysqldb
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    Package python3-mysqldb is not available, but is referred to by another package.
    This may mean that the package is missing, has been obsoleted, or
    is only available from another source
    
    E: Package 'python3-mysqldb' has no installation candidate
    

    But the one for Python 2 is there:

    root@odroidxu4:/tmp# apt-get install python-mysqldb
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    Suggested packages:
      python-egenix-mxdatetime python-mysqldb-dbg
    The following NEW packages will be installed:
      python-mysqldb
    0 upgraded, 1 newly installed, 0 to remove and 5 not upgraded.
    Need to get 59.2 kB of archives.
    After this operation, 175 kB of additional disk space will be used.
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main python-mysqldb armhf 1.2.3-2.1 [59.2 kB]
    Fetched 59.2 kB in 2s (25.1 kB/s)   
    Selecting previously unselected package python-mysqldb.
    (Reading database ... 100842 files and directories currently installed.)
    Preparing to unpack .../python-mysqldb_1.2.3-2.1_armhf.deb ...
    Unpacking python-mysqldb (1.2.3-2.1) ...
    Setting up python-mysqldb (1.2.3-2.1) ...
    

    So it looks like I’ll be swapping over to Python 2 for my SQL stuff until such time as either:

    a) Someone ports the 3 version to arm and puts it in Debian / Devuan.
    b) “Some other connector” is figured out.

    That “b” is because it looks like there’s another connector that I now get to go investigate. I’m first going to just see if this works in Python2. If it does, I’ll press on. If not, I’ll go investigate alternative connectors.

    Python is looking ever more like a “DIY Kit Of Parts” and less like a finished language…

    UPDATE: (that was quick ;-)

    Had to install idle (as opposed to the idle3 I installed before), but then the import of the MySQL connector worked…

    root@odroidxu4:/tmp# apt-get install idle
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      idle-python2.7 python-tk
    Suggested packages:
      tix python-tk-dbg
    The following NEW packages will be installed:
      idle idle-python2.7 python-tk
    0 upgraded, 3 newly installed, 0 to remove and 5 not upgraded.
    Need to get 332 kB of archives.
    After this operation, 1,363 kB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main python-tk armhf 2.7.8-2+b1 [23.8 kB]
    Get:2 http://auto.mirror.devuan.org/merged/ jessie-security/main idle-python2.7 all 2.7.9-2+deb8u2 [305 kB]
    Get:3 http://auto.mirror.devuan.org/merged/ jessie/main idle all 2.7.9-1 [3,176 B]
    Fetched 332 kB in 2s (131 kB/s) 
    Selecting previously unselected package python-tk.
    (Reading database ... 101053 files and directories currently installed.)
    Preparing to unpack .../python-tk_2.7.8-2+b1_armhf.deb ...
    Unpacking python-tk (2.7.8-2+b1) ...
    Selecting previously unselected package idle-python2.7.
    Preparing to unpack .../idle-python2.7_2.7.9-2+deb8u2_all.deb ...
    Unpacking idle-python2.7 (2.7.9-2+deb8u2) ...
    Selecting previously unselected package idle.
    Preparing to unpack .../archives/idle_2.7.9-1_all.deb ...
    Unpacking idle (2.7.9-1) ...
    Processing triggers for desktop-file-utils (0.22-1) ...
    Processing triggers for mime-support (3.58) ...
    Processing triggers for man-db (2.7.0.2-5) ...
    Setting up python-tk (2.7.8-2+b1) ...
    Setting up idle-python2.7 (2.7.9-2+deb8u2) ...
    Setting up idle (2.7.9-1) ...
    

    Guess now I’m glad I went ahead and installed all the 2.x stuff along with the 3.x stuff.

    Yeah, very much getting that DIY Kit feeling. Every time you want to do something new, it’s back to “apt-get install” and figuring out if it will work… Were Julia already in the apt-get repositories I’d be swapping over to it about now. As it is a choice of “install julia from some tarball or ‘whatever'” vs “Kit of Python Parts”, I’ll stick with The Kit for now.

  2. tom0mason says:

    I’m not sure if you will find this useful but at https://www.tutorialspoint.com/python3/python_database_access.htm they do an example by importing PyMySQL —

    PyMySQL is an interface for connecting to a MySQL database server from Python. It implements the Python Database API v2.0 and contains a pure-Python MySQL client library. The goal of PyMySQL is to be a drop-in replacement for MySQLdb.
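The drop-in trick looks like this (a sketch; assumes PyMySQL has been installed, e.g. with pip install pymysql, and is guarded so it still runs where it hasn’t been):

```python
# PyMySQL can register itself under the MySQLdb name, so a program written
# against MySQLdb needs no further changes.
try:
    import pymysql
    pymysql.install_as_MySQLdb()   # aliases PyMySQL as MySQLdb
    import MySQLdb                 # now backed by PyMySQL
    have_pymysql = True
except ImportError:
    have_pymysql = False
print("PyMySQL available: %s" % have_pymysql)

# db = MySQLdb.connect("localhost", "root", "password", "temps")  # as before
```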

  3. tom0mason says:

    Also some other db modules/connectors are listed here, https://wiki.python.org/moin/MySQL

  4. Larry Ledwick says:

    Julia is a bit in the same category, as a rapidly developing language you have to add stuff as you need it.

    It uses Pkg.add to add the new features

    julia> using Pkg
    julia> Pkg.add("PyPlot")

    I found an interesting Julia book that just arrived today, called “The Julia Language Handbook”

    He mentions that due to the rapid changes in the distributions, things are still in flux, and some reference books list syntax that has since been deprecated and dropped from support.
    He is an engineering guy, and he made the choice of Julia over Python due to a different issue than the fixed spacing: he disliked Python indexing arrays from 0, where Julia indexes arrays starting with 1.
    He also notes that Julia, being a compiled language, does not run at full speed until the second time a job runs, as the first run includes compile time for the functions being used. A minor issue, but it slows development a bit if you are making lots of changes between runs.

    In chapter 14 he lists tips and hints on plotting in Julia using PyPlot to get the various types of plots an engineer needs, like log plots, scatter plots, etc. Might be useful for your project even if you stay with Python. (Only about 5 pages both sides on the plots, but quick and dirty hints can save a lot of frustration for some things.)

  5. E.M.Smith says:

    @Larry:

    Thanks for the book pointer. I’ll consider it. I’m from the American School of arrays starting at 1, not the European “Ground floor and 1st floor is one flight up” arrays start at zero. But I’m functional in European ;-) OTOH, early on I dealt with languages that let you declare an array starting anywhere, and now I find it somewhat annoying when I can’t just say “give me an array from -90 to 90”… You get used to things…

    @tom0mason:

    Thanks for the pointers! I’m going to press on a bit more with Python 2 to see if I can get the rest of ‘how to suck in a database’ running (like somehow you must open it / log in). IF I get stuck, OR after I’ve got it working OK but want to try something in Python3, then I’ll give those a try.

    It does illustrate, though, one of my gripes about OO languages and “import library” heavy designs. You end up with the moving river problem. (“You can never cross the same river twice”) Or even figure out what the ford is named now and where it is…

    I get to stop to make dinner now, so don’t know when I’ll have the next bit of progress. I’m working on the “open the database” bit, but took a break for the France / Italy Food Fight news & posting ;-)
    https://chiefio.wordpress.com/2019/02/08/the-soros-globalists-vs-nationalist-populist-split-hits-france-italy/

  6. E.M.Smith says:

    WOOT!

    I’ve successfully opened the Database, asked it for the version, and printed it out. The program still has some cruft in it and still has some bits for future use. It starts with the prior “make a graph” from the CSV stuff (as I’m going to modify it later to use the DataFrame filled from the database), so ignore that first bit. Pick it up after the imports at the “try:” label:

    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import sys
    import MySQLdb
    
    df = pd.read_csv('../SQL/v3/invent.csv')
    
    plt.title("Global Thermometer GHCN v3.3")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    PLON=df["LON"]*np.cos(np.radians(df["LAT"]))
    plt.scatter(PLON, df["LAT"],s=1)
    plt.show()
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
        cursor.execute('SELECT VERSION()')
        data=cursor.fetchone()
        print "Version %s" % data
    
    except MySQLdb.Error as e:
        print "Error %s: %s" % (e.args[0], e.args[1])
        sys.exit(1)
    
    finally:
        print "All Done"
        if db:
            db.close()
    

    and here’s the output:

    >>> ================================ RESTART ================================
    >>>
    Version 5.5.62-0+deb8u1
    All Done
    >>>

    That RESTART thing is something Python/IDLE puts in the results shell window to note when you are doing another run. I’ve left off the failed runs about it ;-)

    So, to recap:

    At this point I can open my database, extract information from it, catch error conditions, and close it nicely when I’m done.
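That open / query / catch / close discipline can be sketched in a form runnable anywhere, using the stdlib sqlite3 module as a stand-in for MySQLdb:

```python
import sqlite3

db = None
try:
    db = sqlite3.connect(":memory:")      # stand-in for MySQLdb.connect(...)
    cursor = db.cursor()
    cursor.execute("SELECT sqlite_version()")
    row = cursor.fetchone()
    version = row[0]
    print("Version %s" % version)
except sqlite3.Error as e:
    print("Error: %s" % e)
finally:
    if db:
        db.close()                        # always release the connection
```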

    Guess it is time to proceed to that “actually take out the data you want and use that to make a graph” step ;-)

    BTW:

    Dinner was a nice bit of pan fried chicken. Floured in a salt/pepper/poultry-seasoning flour mix. Oil at about 325 F halfway up the pieces in a cast iron skillet. 7 to 9 minutes a side, depending on thickness. Oven roasted “Tater Tots” and a small bowl of marinated Artichoke Hearts. Dessert of sliced peaches in heavy sugar syrup. YUM!

    But it did put me on the couch (dog in lap of course) for a couple of hours ;-) Thus the slow progress…

    With luck, I’ll get the “data out and first DB graph” fairly quickly.

  7. Bill In Oz says:

    I am not a computer bloke. I have just never got under the bonnet with computers being content to let others do that stuff.

    But I can see the value in this project you are engaged in. Getting an accurate global map of weather stations, both past & present !

    Now that is a valuable idea. Already it’s clear: 1: That the oceans & seas of Earth (66-68% of the earth’s surface) have very few weather stations, and certainly very few that have been operating for a significant length of time. 2: That there are major areas of the land surface with few weather stations as well: Antarctica, Outback Australia, Siberia, India, Central Asia, major parts of Africa & large parts of South America.

    So that leaves us with the USA, Canada, Mexico, Europe, European Russia, the UK, New Zealand, South East Asia & the settled parts of Australia.

    By pure coincidence that is where huge numbers of people also live.
    So are there rising mean temperatures in these parts of the world ? And are they due to a human warming effect ? Or due to other processes ?

    i’m looking forward to the updated map with just the existing operational weather stations. That will give an even more interesting picture of what is happening.


  8. E.M.Smith says:

    Well, I’ve managed to make a graph from the database data. Here’s the code. It is a “work in progress” with various things like “diagnostic prints” still in it and some ‘dead code’ commented out and… but it is what gave me the graph. So here it is. I’m adding the graph up top. It doesn’t have the sine layout yet, just the small points rectangular, but hey, one step at a time…

    I import more than I’m using at the moment.
    I’ve left in the commented-out line that loaded the CSV stuff, in case I needed it during debugging, but I didn’t, so it ought to go now.
    It sets the graph details like labels.
    It opens the database, connects, sends an SQL query for the version, and prints it out.
    It sends an SQL query to extract specific data from the inventory table (and lets me know what step I’m in with a diagnostic print; Python has nearly useless error messages: “Syntax error”…).

    At first I stuffed a string but it wasn’t working. So I went to a direct quoted string. That wasn’t working. Then I logged into MySQL directly and realized I was using abbreviated names for the two lat long fields and had the long names in the table proper… So I need to revert back to the string stuffing… I’m not certain the “use temps” is needed. I added it to the model I was following during one debugging run. I’ll test a run without it in a little while.

    I then print the type and a bit of the stn tuple. Again, during debugging trying to figure out what worked and where it was stuck.

    Found a model that showed converting a tuple to a numpy array and using it, so tried that. It worked.

    I’ve left in the prior line that made the sine globe. Next step (after stripping some of the debugging dead legacy code) is to do that for this array.

    Then I plot it.

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    #df = pd.read_csv('../SQL/v3/invent.csv')
    
    plt.title("Global Thermometer GHCN v3.3")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
        cursor.execute('SELECT VERSION()')
        data=cursor.fetchone()
        print "Version %s" % data
        
        sql="SELECT latitude, longitude FROM inventory "
        print "About to SELECT"
        cursor.execute("use temps")
        cursor.execute("SELECT latitude, longitude FROM inventory ")
        print "After the Select"
        stn=cursor.fetchall()
        print "After The Fetch of Data"
        print "Type= %s" % (type(stn))
        
        print "Station %s %s" % (stn[0],stn[1])
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
    #    PLON=df["LON"]*np.cos(np.radians(df["LAT"]))
    #plt.scatter(PLON, df["LAT"],s=1)
        plt.scatter(ys,xs,s=1)
        plt.show()
    #    for row in stn:
    #        lat= row[0]
    #        lon= row[1]
    #        print "in the for loop"
    #        print "Latitude= %s, Longitude= %s" % (lat,lon)
    
    
    except:
        print "This is the exception branch"
    #    print "Error %s: " % error.args[0]
    #    sys.exit(1)
    
    finally:
        print "All Done"
        if db:
            db.close()
    

    The “for row” loop and following lines were there while inspecting the tuple to make sure data got into it, during that “why isn’t it working?” stage due to wrong field names ;-)

    Then the error code I’d copied fails when executed (likely because a bare “except:” never binds anything to “e”, so the print itself raises a NameError, rather than a missing library). I’ve just commented it out for now. One print and done.

    I’ll be back in a bit with a clean up version. Either doing the sine globe or not, depending on how much longer I go and what I run into ;-)

  9. E.M.Smith says:

    Here’s the cleaned up version. Only real cruft in it is the excess imports:

    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
        cursor.execute('SELECT VERSION()')
        data=cursor.fetchone()
        print "Version %s" % data
        
        sql="SELECT latitude, longitude FROM inventory "
     
    #    cursor.execute("use temps")
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1)
        plt.show()
    
    except:
        print "This is the exception branch"
    
    finally:
        print "All Done"
        if db:
            db.close()
    

    It looks like the “use” statement isn’t needed. I’ve just commented it out so that if, after an exit and relaunch, it suddenly fails, I’ll know I do need it and it was just holding over the state from prior runs…

    Spent a little while trying to get that cos(latitude) code to work in this one. Something is “off” in my array.math.foo tonight… or my Python. Kept getting errors. As the error messages were often non-existent I’m not sure what the issue is. Even just doing stuff=np.cos(data) was tossing an error.
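One possible culprit (my assumption, not confirmed in the post): the schema stores latitude and longitude as CHAR, so fetchall() hands back strings, and np.cos() on a string array throws. Casting first would fix that:

```python
import numpy as np

# What fetchall() might return when the columns are CHAR: tuples of strings.
stn = (("10.0", "100.0"), ("60.0", "100.0"))

data = np.array(list(stn))            # dtype is a string type here
data = data.astype(float)             # cast before doing any math
coslong = data[:, 1] * np.cos(np.radians(data[:, 0]))
print(np.round(coslong, 2).tolist())  # → [98.48, 50.0]
```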

    I know enough that when it is late and that kind of thing happens it is best to just pack it in for the night.

    I am going to play around a bit with table joins and subset reporting as that’s all in MySQL land and less alien to my soggy brain ATM ;-) If I can get a JOIN and report of just the stations that are current, then I can plot them even if rectangular.

    In any case, as it is nearly midnight, I’m not going to play with it for long before it’s sack time… Besides, one major goal (use DB) done in a part time day is enough ;-) Two of them (DB & use IDLE / IDE) is more than enough… ;-)

  10. E.M.Smith says:

    A quick test later… exit of IDLE, relaunch. The program ran fine even with the “use temps” commented out and a reset of state. So it knows what database to use even though in the MySQL REPL you must tell it. (The fourth argument to MySQLdb.connect, ‘temps’, selects the default database, so the explicit “use” isn’t needed.)

  11. E.M.Smith says:

    This is a bit funny…

    What happens when you do a join on two tables that don’t have an index, and request a field from each? Well… I forgot to limit the search / report to unique values and to a select few, so it was all records from the data (station, year, month, reading…)…

    It was being slow, but I wanted to know how long it would take anyway. After 20 or so minutes I started hacking on something else and … time passes…

    | SHIP V                         | 1971 |
    | SHIP V                         | 1971 |
    | SHIP V                         | 1971 |
    | SHIP V                         | 1971 |
    +--------------------------------+------+
    14487816 rows in set (1 hour 57 min 21.98 sec)
    
    mysql> 
    

    Yes, just about 2 hours running around making a “report” of Station & Years… of 14 1/2 MILLION rows.

    So I’ve now got a new inventory file with an index in it. I’m putting in the StationID, all 11 characters of the Region, Country, WMO, and “near” fields, as a Primary Key. I’ll be doing a similar thing with the actual data file, but including Year and Month in the primary key, as each line is unique to that whole set.

    Then trying this again… but with some SELECT and UNIQUE going on ;-) The goal being to get just one UNIQUE wmo number for each station having data in YEAR > 2000. Then plot them.
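The same selection can be sketched in pandas terms (made-up data; the SQL version would be roughly SELECT DISTINCT stationID FROM data WHERE year > 2000):

```python
import pandas as pd

# Toy stand-in for the temperature data table.
data = pd.DataFrame({
    "stationID": ["A", "A", "B", "C", "C"],
    "year":      [1999, 2005, 1980, 2001, 2010],
})

# One row per station with any data after 2000 (the DISTINCT part).
recent = data[data["year"] > 2000]["stationID"].drop_duplicates()
print(sorted(recent.tolist()))   # → ['A', 'C']
```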

    Oh, and while I’m at it, I’ve put that COS(LAT) conversion into the FORTRAN dump program so it will be an already calculated field in the database. That way it’s not something I need to re-calculate all the time.

  12. jim2 says:

    I use loggers in other languages. They are nice. Here’s one way for Python.

    https://stackoverflow.com/questions/11927278/how-to-configure-logging-in-python

  13. E.M.Smith says:

    @Jim2:

    Just a regular old inner join, but I think there’s a lot of records…

    In any case, I’ll get back to that bit later. What I’ve done so far is add a field of the combined full StationID (region, country, wmo, near) all 11 characters and made it a primary key. Reloaded the data into a new table named invent3 and I also took the opportunity to move the cosine math into the FORTRAN data formatter so it’s now a field in the table. (At some future date I’ll come back to how to do that in SQL or Python/NumPy – I tend to just flow around problems, but then come back later with the road grader to flatten them ;-)

    I’ve made a Sine Graph from that database. Here’s the code for it (and I’ll add that graph up top). This still leaves the step of “select on year from a join” for later today:

    The Python:

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
        cursor.execute('SELECT VERSION()')
        data=cursor.fetchone()
        print "Version %s" % data
        
        sql="SELECT latitude, longitude, coslong FROM invent3 "
     
    #    cursor.execute("use temps")
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
        cs = data.transpose()[2]
    
        plt.scatter(cs,xs,s=1)
        plt.show()
    
    except:
        print "This is the exception branch"
    
    finally:
        print "All Done"
        if db:
            db.close()
    

    Doesn’t really need the print-version thing anymore, as that was more of a diagnostic. And the use of ys is superseded by the use of cs, so that whole “real value” could be left out. I’ve left it in for now in case of “bright ideas” or the desire to make both graphs in one program.

    BTW: Thanks for the exception handling link. Something else to fix up today ;-) I’d modeled on another posting / program, but they seem to have assumed things not in evidence (i.e. imported something somewhere?) so I need a deeper dive to actually know what I’m doing…

    The MySQL Schema:

    chiefio@odroidxu4:~/SQL/bin$ cat ../Schemas/invent3.sql 
    CREATE TABLE invent3 (
        stationID  CHAR(11) NOT NULL,
        version    CHAR(5) NOT NULL,
        ascension  CHAR(10),
        type       CHAR(4),
        region     CHAR(1) NOT NULL,
        country    CHAR(3) NOT NULL,
        wmo        CHAR(5) NOT NULL,
        wno_near   CHAR(3) NOT NULL,
        latitude   CHAR(8),
        longitude  CHAR(9),
        stn_elev   CHAR(6),
        name       CHAR(30) NOT NULL,
        grid_elev  CHAR(4),
        pop_class  CHAR(1),
        pop_size   CHAR(5),
        topo_type  CHAR(2),
        veg        CHAR(2),
        prox_water CHAR(2),
        dist_water CHAR(2),
        airport    CHAR(1),
        dist_A_to_U CHAR(2),
        veg_grid    CHAR(16),
        pop_class_nitelites CHAR(1),
        coslong    CHAR(9),
        PRIMARY KEY (stationID)
        ) ;
    

    Note the PRIMARY KEY. This ought to make joins more efficient and avoid “duplication” issues.

    Here’s the revised FORTRAN for the load:

    chiefio@odroidxu4:~/SQL/bin$ cat ../v3/3inven.f 
    C FORTRAN to read the inventory files v3 GHCN file and insert Tabs
    C in the output.  Also divides "country" into Continent and Country
    C
    C Variable declarations...
    C
          CHARACTER * 11 stationID
          CHARACTER * 1  TAB
          CHARACTER * 5  VERSION
          CHARACTER * 10 ASCEN
          CHARACTER * 4  TYPE, GRIDELEV
          CHARACTER * 1  CONT,POPC,AIR,POPNL
          CHARACTER * 2  COUNTRY,TOPOT,VEG,PROXW,DISTW,DISTAU
          CHARACTER * 5  WMO,PSIZE
          CHARACTER * 3  NEAR
          CHARACTER * 6  STNELEV
          CHARACTER * 8  LATITUDE
          CHARACTER * 9  LONGITUDE
          CHARACTER * 16 VEGGRID
          CHARACTER * 30 NAME
          REAL COSLONG,FLAT,FLONG
          REAL PI,RAD
    
          PI=3.1415926
          RAD=PI/180.0
    
    C
    C Set the TAB character
          TAB=CHAR(09)
    C
    C Set some constants
          VERSION="Gv3.3"
          ASCEN="7Sept2015"
          TYPE="QCU"
    C
    C Read in one line of data...
    C
        9 OPEN(1, FILE='inventory.in', STATUS='OLD', ACTION='READ')
       10 READ (1, 11, END=99) CONT, COUNTRY, WMO, NEAR,LATITUDE,           &
         &LONGITUDE,STNELEV,NAME,GRIDELEV,POPC,PSIZE,TOPOT,VEG,PROXW,       &
         &DISTW,AIR,DISTAU,VEGGRID,POPNL
    C
       11 FORMAT (A1,A2,A5,A3,X,A8,X,A9,X,A6,X,A30,X,A4,A1,A5,A2,A2,A2,     &
         &A2,A1,A2,A16,A1)
    C
    C Convert CHAR  to Float 
    
             READ (LATITUDE,*) FLAT
             READ (LONGITUDE,*) FLONG
    C       WRITE (6,22) LATITUDE,FLAT,LONGITUDE,FLONG
    C   22  FORMAT (A8,X,F7.2,X,A9,XF7.2)
    
           COSLONG=FLONG*COS(RAD*FLAT)
           stationID=CONT//COUNTRY//WMO//NEAR
    C       WRITE (6,23) COSLONG
    C   23  FORMAT (F7.2)
    C
    C Write out one line of data with TAB between fields
    C
    C      WRITE (6, 6) StationID,VERSION,ASCEN,TYPE,CONT,COUNTRY,WMO,NEAR,  &
    C     &LATITUDE,LONGITUDE,STNELEV,NAME,GRIDELEV,POPC,PSIZE,TOPOT,VEG,    &
    C     &PROXW,DISTW,AIR,DISTAU,VEGGRID,POPNL,COSLONG
    
    C    6 FORMAT (A11,A5,A10,A4,A1,A2,A5,A3,A8,A9,A6,A30,A4,A1,A5,A2,A2,A2, &
    C     &A2,A1,A2,A16,A1,F7.2)
    
          WRITE (6, 7) StationID,TAB,VERSION,TAB,ASCEN,TAB,TYPE,TAB,CONT,   &
         &TAB,CONT,COUNTRY,TAB,WMO,TAB,NEAR,TAB,LATITUDE,TAB,LONGITUDE,TAB, &
         &STNELEV,TAB,NAME,TAB,GRIDELEV,TAB,POPC,TAB,PSIZE,TAB,TOPOT,TAB,   &
         &VEG,TAB,PROXW,TAB,DISTW,TAB,AIR,TAB,DISTAU,TAB,VEGGRID,TAB,POPNL, &
         &TAB,COSLONG
    
        7 FORMAT (A11,A1,A5,A1,A10,A1,A4,A1,A1,A1,A1,A2,A1,A5,A1,A3,A1,A8,  &
         &A1,A9,A1,A6,A1,A30,A1,A4,A1,A1,A1,A5,A1,A2,A1,A2,A1,A2,A1,        &
         &A2,A1,A1,A1,A2,A1,A16,A1,A1,A1,F7.2)
    
    C Retrieve another line of data...
    C
          GO TO 10
    C
    C If end of file, then stop.
    C
       99 STOP
          END
    

    I need to double check that I’m calling things by the right name (i.e. COSLONG ought not be COSLAT ;-) and there’s still a lot of commented-out diagnostic code.

    The math to compute the cosine adjusted longitude is in these lines:

           READ (LATITUDE,*) FLAT
           READ (LONGITUDE,*) FLONG
           PI=3.1415926
           RAD=PI/180.0
           COSLONG=FLONG*COS(RAD*FLAT)
    

    I make float versions of LAT and LONG from the CHAR versions then use the formula Pi/180 to convert the degrees to radians and hand that to the cosine function; and stuff the “latitude cosine adjusted longitude” into COSLONG that is now a field in the database.
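    Those few FORTRAN lines translate almost directly to Python; a minimal sketch, with a made-up latitude/longitude pair standing in for the CHAR fields:

```python
import math

# Made-up CHAR fields as they'd be read from the inventory file
latitude, longitude = "  47.00", " -122.30"

flat = float(latitude)    # CHAR -> float, like READ (LATITUDE,*) FLAT
flong = float(longitude)

# degrees -> radians via pi/180, then scale longitude by cos(latitude)
coslong = flong * math.cos(math.radians(flat))
print(round(coslong, 2))  # -83.41
```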

    Here’s the line that concatenates the parts of the station ID into one key:

           stationID=CONT//COUNTRY//WMO//NEAR
    
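    In Python the FORTRAN `//` string concatenation is just `+`; a tiny sketch with made-up field values sized per the declarations (CONT 1 char, COUNTRY 2, WMO 5, NEAR 3, giving the 11-character key):

```python
# Made-up field values; FORTRAN // is string concatenation
cont, country, wmo, near = "4", "25", "72295", "000"
station_id = cont + country + wmo + near
print(station_id, len(station_id))  # 42572295000 11
```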

    And all the FORMAT statements and WRITE statements get adjusted accordingly. I got this all done and tested while that one “report” was running wild, and had time for a Doggy Doorman break and a few other things… I’m still waaaaay faster in FORTRAN than in Python. I’m improving in Python, but it’s still slow slogging.

    The important bit is that I can now choose either rectangular or Sine perspectives in plotting the stations, and I can also do SQL from inside Python. Next stop is joining the table data for temperatures to the inventory and then doing graphs for only the stations that exist in given years.

    As Friday is “Sushi Day”, I’ll be taking a long dinner break early for the Sushi run, so no idea if I’ll finish it today or not.

    Yes, I’m leaving a tail of unanswered “How To Foo” behind me. It is my way. Flow around the problem and press on to goal; then come back after you are done to “find a better way” and straighten out some of the detours. Don’t let “learning experiences” slow progress to goal, but don’t let reaching the goal fast prevent the learning, only delay it slightly.

  14. E.M.Smith says:

    @Jim2:

    Yeah, not a cross join. Per your link that involves leaving out the “on”

    What I did:

    SELECT I.name, T.year FROM inventory AS I INNER JOIN temps_data as T on I.wmo=T.wmo;
    

    So using a non-indexed WMO number from both files as my “Join” target. Figure 7000 stations max x an average of about 100 years x 12 months = 8,400,000 records, so we’re in the same order of magnitude as the 14 million I got. Make your average duration 166 years and it’s a match.
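    That back-of-the-envelope arithmetic is easy to check:

```python
stations, years, months = 7000, 100, 12
print(stations * years * months)  # 8400000 records estimated
# average record length needed to reach the ~14 million rows returned:
print(round(14_000_000 / (stations * months), 1))  # 166.7 years
```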

    There’s still some room for some kind of duplicates problem, but not a lot. For example, I’m not certain that there is zero overlap of WMO in the records. Especially those with “near” flags not being used to differentiate them from the WMO they are near; but I also don’t know for certain there is no reuse between countries (there ought not be, but I’m fixing that anyway with the new PRIMARY KEY).

  15. E.M.Smith says:

    I’ve got both the inventory and temperature tables now indexed on stationID. The join and report are much faster now ;-O

    Here’s an example that selects for only those stations that have data in the 2000’s and counts up how many records they have in those years:

    SELECT DISTINCT I.name, COUNT(T.year) FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year>1999 GROUP BY I.name;

    So for each distinct NAME, count how many year entries there are (which will be one for every month in that year, as the data are monthly and all have a YEAR attached) once you have joined the temperature and the inventory data on Station ID and where the year is greater than 1999.

    It returns with:

    [...]
    | ZIELONA GORA                   |            12 |
    | ZIGALOVO                       |           144 |
    | ZIGUINCHOR                     |           192 |
    | ZINDER                         |           180 |
    | ZION NP                        |           192 |
    | ZONGULDAK                      |           192 |
    | ZORG EN HOOP                   |            12 |
    | ZUARA                          |           144 |
    | ZUGSPITZE                      |           192 |
    | ZUMBROTA                       |           192 |
    | ZURICH (TOWN/                  |           192 |
    | ZUURBEKOM - RWB                |            36 |
    | ZYRJANKA                       |           192 |
    +--------------------------------+---------------+
    3529 rows in set (12.28 sec)
    
    mysql> 
    

    I find it interesting that stations like ZORG EN HOOP and ZIELONA GORA have all of 12 data items. In theory, since this V3.3 data ended in about 2015, there ought to be 12 x 16 = 192 months of data for any given station as a top value, so we see stations with a full record having that value.

    Guess now I just need to decide what all to plot. Those with any data in the 2000s? Those with data in the final year of 2015? Those with 192 records? Over what threshold, like maybe 100?

    Well, off to charting land, now that I’ve got a fast enough join and can select the groups I want to see. It looks like about 1/2 the thermometers have SOME data in the 2000s, but as noted above, for some it isn’t a whole lot…
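    For what it’s worth, the join-and-count query above can be sketched in pandas with two toy stand-in tables (made-up station IDs and only two names; the real tables are invent3 and temps3 in MySQL):

```python
import pandas as pd

# Toy stand-ins for the inventory and temperature tables
invent = pd.DataFrame({"stationID": ["A", "B"],
                       "name": ["ZIGALOVO", "ZUARA"]})
temps = pd.DataFrame({"stationID": ["A"] * 24 + ["B"] * 12,
                      "year": [2000] * 12 + [2001] * 12 + [2000] * 12})

joined = invent.merge(temps, on="stationID")     # INNER JOIN ... ON stationID
recent = joined[joined["year"] > 1999]           # WHERE year > 1999
counts = recent.groupby("name")["year"].count()  # GROUP BY name, COUNT(year)
print(counts.to_dict())  # {'ZIGALOVO': 24, 'ZUARA': 12}
```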

  16. E.M.Smith says:

    Interesting… using:

    SELECT I.name, T.year, T.month FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year=2015 GROUP BY I.name;

    So I’m only getting stations with data in the final year of 2015, and the count drops to:

    2545 rows in set (12.39 sec)
    

    So about 1000 stations less. Since we are ALWAYS being told how hot it is NOW, might there be a game being played of leaving out stations in the “now” and then putting them back in later as they “catch up”? So always a hotter set now than you get later? Hmmmm…..

  17. E.M.Smith says:

    Wow, what a moth eaten version of the graph you get for 2015 (the then “current” stations).

    I’ll add the graph up above and an update. Here’s the code I used in Python:

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
    
        sql="SELECT I.latitude, I.coslong FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year=2015 GROUP BY I.stationID;"
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1)
        plt.show()
    
    except:
        print("This is the exception branch")
    
    finally:
        print("All Done")
        if db:
            db.close()
    

    Still haven’t fixed the exception handling but I know when it tosses an error as I get the nag on my screen ;-)

    By having the cosine value pre-computed the Python / SQL become simpler. I ought to have made the legend more interesting… I think I’ll update that heading before I post the actual graph. Put 2015 in it.

  18. jim2 says:

    Indices should be used on fields used in joins. They don’t have to be unique indices, unless it is a primary key. This makes queries a lot faster.

    On the use of loggers, I use log4net and there is a log4j. What I like is that logging configuration is handled by one file. It includes the location of the log file, the logging level (usually fatal, severe, …, debug, informational), and pruning rules with different conditionals. You can write log statements to various levels and only those that meet the logging level requirements will log. For normal operation you can log only severe or fatal errors. But if you need more for troubleshooting, you just change the config file to debug and you’re off.

  19. E.M.Smith says:

    I’ve knocked together a “baseline” report. While the “now” is about 2570 stations, the baseline is much more full:

    |  47.0000 |  -11.59 |
    |  34.0000 |  135.96 |
    +----------+---------+
    6988 rows in set (14.48 sec)
    

    Almost 7000 stations.

    I’ll add that graph as an update too. CAVEAT: I’m not sure EXACTLY on the ends of the baseline. They say 1950-1981 but is that inclusive or not? It is supposed to be 30 years, not 31. I need to go back and read the source code to verify. I’m pretty sure it was 1950-1980 as the actual bounds, but even that isn’t 30 years if inclusive of both ends. For now I’ve just gone with 1950-1980 inclusive. I’ll validate exactly the end points for some future run. This ought to be “close enough” for now.
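    A quick sanity check on the bounds the SQL filter below actually admits (year>1949 AND year<1981):

```python
# Which years pass: WHERE year > 1949 AND year < 1981
years = [y for y in range(1900, 2020) if 1949 < y < 1981]
print(years[0], years[-1], len(years))  # 1950 1980 31
# ...so 1950-1980 inclusive of both ends is 31 years, not 30
```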

    Here’s the Python, the only real difference being the heading and the SQL SELECT filter:

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3 BASELINE GIStemp")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
    
        sql="""SELECT I.latitude, I.coslong FROM invent3 AS I
                 INNER JOIN temps3 as T
                 ON I.stationID=T.stationID
                 WHERE year>1949 AND year<1981
                 GROUP BY I.stationID;"""
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1)
        plt.show()
    
    except:
        print("This is the exception branch")
    
    finally:
        print("All Done")
        if db:
            db.close()
    
  20. E.M.Smith says:

    @Jim2:

    I made stationID the primary key so it ought to get an automatic index, if I remember the manual right ;-)

    It is part of the compound primary key in the temperature table, so do I need to make a specific index on it to split it out from that? Or is it already an index as it is part of a compound primary key?

    I, um, er, like to, ah, “bother” computers sometimes… It isn’t torture, really, it’s not like they feel things… so, um, I’ll, er, go ahead and do large non-indexed flat file things just to see how fast (or slow) it is… sometimes… not often, really ;-)

    My “specialty” as a DBA Consultant was an “Efficiency Review” where I’d find all the places folks didn’t do a useful index or a selection before a sort or… then fix them. Somewhere along the way I got interested in just what WAS the relative speed; and thus started my journey down the Dark Side into “let’s just run this all the wrong way and see how long it takes” prior to fixing things up the right way…

    I sure hope God isn’t a computer with a disk I/O “discomfort” ;-)

    For the present joins, things are happening in the 10s of seconds range and that’s fast enough for now. Dramatically better than the 2 hours no key no index ;-)

  21. E.M.Smith says:

    Scrolling back and forth between the “2015 NOW” and the “Baseline” maps, it just amazes me anyone could think they can be used as a comparison to each other.

    Africa is almost Gone! Canada not much better. Asia is way thinned out. Everything north of Germany is spotty. The Southern Ocean is essentially MIA and most of the Pacific is a big nothing too, with Hawaii over-weighted.

    Just Oh My God.

    I need to repeat this whole exercise now for GHCN v4… but not for a few days…

  22. H.R. says:

    By v9, there will just be one thermometer in Uncle Bob’s back yard. That will make data processing much easier.
    .
    .
    I like the description of ‘moth eaten.’ The animation someone suggested you try after you get things well in hand would show the holes appearing like moths munching on grandpa’s long johns.

  23. E.M.Smith says:

    just noticed WordPress ate some of the SQL text as it had angle brackets… I’ll fix it “soon” (he said, victory sake in hand…)
    UPDATE: Have fixed it, and Sake all gone 8-{ now…

  24. jim2 says:

    I’ve had problems with “natural” composite primary keys. They can grow to 6 or 7 fields just for the primary key. And if you miss a field that should have been part of the composite key, you’ve got a possibly problematic data cleanup. In your case, that won’t be a problem since you have source data, but imagine you don’t and your company is depending on this data to spend project money in a timely manner. Not good.

    Nowadays I usually just use a synthetic primary key. A sequence of integers that grows automatically as one adds rows. This simplifies queries and that means something if you start joining 4 or 6 tables. You can and should leave the original data such as station id in the table, even with a synthetic primary key. If you make station ID a unique index, then you can’t accidentally add one twice.

    Indices can be a performance issue when you have a lot of row insertion and deletion, but that won’t be the case with your data once you have it in a condition you like.

    However, if you are happy with your composite key, each member of it will be indexed in some way. Non-unique data may get a hash map for a primary key, but it’s still a faster lookup.

  25. E.M.Smith says:

    I just noticed that Spain is basically gone and Portugal is down to a couple. It seems warm places are missing along with polar regions…

    I think I need to work out how to graph only stations in both the baseline and “now”…

  26. E.M.Smith says:

    @Jim2:

    Thanks for the advice. I’ve not used composite keys much, so a bit out of my depth.

    OTOH, speed is well inside acceptable at 10s of seconds. There is no more complexity to show up, so this ought to be ok.

    Right now I’m working on a subquery approach to get me just those stations with data in both the Baseline and 2015… just how much “apples to apples” really exists…

    Unfortunately, a fifth of Sake is leading to slow progress :-)

    What is very clear is that the idea that a synthetic grid of hypothetical infilled values can fix the crappy real data is just a stupid idea.

    My problem is how to illustrate that….

  27. beththeserf says:

    Something coming from a null set? – Fuzzy cli-sci logic.
    But when higher goals require a local habitation and
    a name, well then, ‘anything goes!’

  28. tom0mason says:

    Excellent stuff E.M.
    “Africa is almost Gone! Canada not much better. Asia is way thinned out. Everything north of Germany is spotty. The Southern Ocean is essentially …”

    That was one of my reasons for asking about constant area mapping of the stations.
    I have noted that other bloggers had spotted the lack of modern and long record reporting stations in MidEast, etc. As well as all those empty oceans.
    My suspicion was aroused when people gave lame excuses for homogenizing over vast areas when there are no stations over thousands of square miles of non-uniform terrain. And yet (haha) they insist that their modeled global temperatures are good for 1/100th °C (or some such nonsense).
    Tony Heller’s well known .gif map of actual stations vs modeled warming with his highlighted region:

    Yes it is NOAA (GHCN v3.3.0.0) and land only but it might give you ideas about where else to look …

  29. jim2 says:

    There are a few ways to examine indices. MySQL Workbench makes such things obvious, fast, and easy. But for SQL only,

    https://stackoverflow.com/questions/5213339/how-to-see-indexes-for-a-database-or-table-in-mysql

  30. jim2 says:

    @ BTS https://chiefio.wordpress.com/2019/02/07/ghcn-global-thermometers-sine-globe-with-labels/#comment-107510

    A set of zero null sets = zero
    A set of one null set = counting number 1
    A set of two null sets = counting number 2
    A set of three null sets = counting number 3

  31. E.M.Smith says:

    @Tom0Mason:

    I noticed that the blue areas tend to be where current thermometers are being dropped… so drop the cold ones, fill in from the warm ones and make red the completely empty?

  32. E.M.Smith says:

    I was going to make a plot of ONLY those stations that were both in the Baseline and in 2015 (the “now” of this data set) but once I got the SQL written, it turns out that there are 2545 in “now” and 2542 in the “both” so only 3 stations missing. Not really worth uploading a graph that is nearly indistinguishable…

    The SQL is a bit interesting. Learned about the “sub-query” as a data source and as a selection device. I have 2 “joins” in this one:

    SELECT I.latitude, I.coslong FROM invent3 AS I 
    INNER JOIN temps3 as T ON I.stationID=T.stationID 
    WHERE year < 1981 AND year > 1949 AND I.stationID IN 
    (SELECT I.stationID FROM invent3 AS I 
    INNER JOIN temps3 as T ON I.stationID=T.stationID 
    WHERE  year=2015) GROUP BY I.stationID;
    

    The “sub-query” starts with the ‘(‘ in front of the second “SELECT” statement. It basically makes a list of elements, and the “IN” checks that the first SELECT’s I.stationID is in that list… So to make the “cut” a station must have data in the first range of years, and also appear in the list of stations with data in the final year.
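    In set terms the IN sub-query keeps the intersection of the two station lists; a toy sketch with made-up station IDs:

```python
# Made-up station sets standing in for the two SELECT results
baseline = {"A1", "B2", "C3"}   # stations with data 1950-1980
now_2015 = {"B2", "C3", "D4"}   # stations with data in 2015

# stationID IN (sub-query): keep only stations present in both lists
both = {s for s in baseline if s in now_2015}
print(sorted(both))  # ['B2', 'C3']
```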

  33. E.M.Smith says:

    I turned the question around the other way: What stations are IN the GIStemp baseline, but MISSING in the present? Turns out it’s about 4446 of them. (Subject to change if I need to move the baseline years by one… after I re-check their code base). The graph is now in the posting as Update 5 and right after the Baseline graph for easy A/B.

    So those are all the stations where some FICTIONAL temperature today is compared to what they were in the past.

    Here’s the Python code (again mostly differing in the SQL statement):

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3 Gone NOW - BASELINE GIStemp")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
        
        sql="SELECT I.latitude, I.coslong FROM invent3 AS I 
         INNER JOIN temps3 as T ON I.stationID=T.stationID 
         WHERE year>1949 AND year<1981 AND I.stationID 
         NOT IN (SELECT I.stationID FROM invent3 AS I 
         INNER JOIN temps3 as T ON I.stationID=T.stationID 
         WHERE  year=2015)GROUP BY I.stationID;"
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1)
        plt.show()
    
    except:
        print("This is the exception branch")
    
    finally:
        print("All Done")
        if db:
            db.close()
    

    I’ve “pretty printed” the SQL with added CR / newlines but in the actual program I just left it all on one very long line…
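    Turned around like this, NOT IN is just a set difference: baseline stations with no 2015 data. The same kind of toy sketch with made-up station IDs:

```python
# Made-up station sets standing in for the two SELECT results
baseline = {"A1", "B2", "C3"}   # in the GIStemp baseline years
now_2015 = {"B2"}               # still reporting in 2015

# stationID NOT IN (sub-query): baseline stations missing from 2015
gone = baseline - now_2015
print(sorted(gone))  # ['A1', 'C3']
```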

  34. Larry Ledwick says:

    So if I understand you correctly the old weather data is computed from the data assembled from about 6991 stations but today’s data consists of only about 2542 stations. If correct then they have dropped 63.6% of the old data.

    They are comparing apples and pineapples for the current increase in temperature data?

  35. E.M.Smith says:

    @L.L:

    Yes. They are comparing a hypothetical “Grid Cell Temperature” computed from one set of stations in the past, and from a different set in the present. “missing” data (i.e. of the 16,000 grid cells, the ~13,500 without a thermometer in them) are fabricated via the “reference station method” by looking sideways at some existing thermometer up to 1200 km away. Think there’s a nice chance of only keeping thermometers showing warming in the present while spreading them over places with a cold past?

    FWIW, I’ve added 2 more graphs. Color this time. Showing the NOW (of 2015 in this dataset) to the Baseline. Baseline in blue, 2015 in red. With the top color 50% transparent (and with each taking a turn on top) for a much easier A/B of what is here NOW and what was there THEN.

    See “Update 6” above. ( I think I need to start a new posting for any more “updates” ;-)

    Here’s the Python code in two parts (differing only in which color is on top and transparent):

    Baseline first, on the bottom, full density, 2015 on top 50% transparent.

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3 NOW over BASELINE GIStemp")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
        
        sql="SELECT I.latitude, I.coslong FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year>1949 AND year<1981 AND I.stationID NOT IN (SELECT I.stationID FROM invent3 AS I INNER JOIN temps3 as T ON I.stationID=T.stationID WHERE  year=2015)GROUP BY I.stationID;"
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1,color='blue')
    
    
        sql="SELECT I.latitude, I.coslong FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year=2015 GROUP BY I.stationID;"
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1,color='red',alpha=0.5)
    
        plt.show()
    
    except:
        print("This is the exception branch")
    
    finally:
        print("All Done")
        if db:
            db.close()
    

    Then 2015 on the bottom full density, Baseline on top 50% transparent. (Kind of fun learning new bits of stuff to do with the graphs… )

    # -*- coding: utf-8 -*-
    import datetime
    import pandas as pd
    import numpy as np
    import matplotlib.pylab as plt
    import math
    import MySQLdb
    
    plt.title("Global Thermometer GHCN v3.3 BASELINE over NOW GIStemp")
    plt.xlabel("Longitude")
    plt.ylabel("Latitude")
    plt.xlim(-180,180)
    plt.ylim(-90,90)
    
    try:
        db=MySQLdb.connect("localhost","root","OpenUp!",'temps')
        cursor=db.cursor()
    
        
        sql="SELECT I.latitude, I.coslong FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year=2015 GROUP BY I.stationID;"
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1,color='red',alpha=1)
        
        
        sql="SELECT I.latitude, I.coslong FROM invent3 AS I INNER JOIN temps3 as T on I.stationID=T.stationID WHERE year>1949 AND year<1981 AND I.stationID NOT IN (SELECT I.stationID FROM invent3 AS I INNER JOIN temps3 as T ON I.stationID=T.stationID WHERE  year=2015)GROUP BY I.stationID;"
     
        cursor.execute(sql)
        stn=cursor.fetchall()
        data = np.array(list(stn))
        xs = data.transpose()[0]   # or xs = data.T[0] or  xs = data[:,0]
        ys = data.transpose()[1]
    
        plt.scatter(ys,xs,s=1,color='blue',alpha=0.5)
    
        plt.show()
    
    except:
        print("This is the exception branch")
    
    finally:
        print("All Done")
        if db:
            db.close()
    
    
  36. Larry Ledwick says:

    Since this popped up today on twitter thought I would bring the link over.

    “Oh and don’t trust a word these people say, either. They are all charlatans.”
    https://www.breitbart.com/politics/2019/02/08/the-planet-is-cooling-alarmists-shriek-its-warming/

  37. E.M.Smith says:

    @Bill In Oz:

    No idea why your comment here: https://chiefio.wordpress.com/2019/02/07/ghcn-global-thermometers-sine-globe-with-labels/#comment-107470

    was in the TRASH bucket. Usually only a limited number of things can put a comment there (like F-bombs) but I see nothing wrong with it. I’ve fished it out, even if a couple of days late ;-)

    Yes, it is the case that the majority of thermometers are in the places with the most growth of their economies and populations. Not only that, but very often at Airports that grow from Grass Fields to acres of tarmac and cement with tons of kerosene being burned / hour.

  38. E.M.Smith says:

    Looking at those last two color maps is fascinating. Canada & Australia very heavy in the baseline, almost missing from the present data. Sub Sahara Africa also mostly gone now. A big chunk of what looks like China covered with very few now, but lots of blue in the baseline.

    Japan, South Korea and Turkey purplish, so in both, but still with blue fringing around the purple showing change of stations. What looks like Germany (and nearby?) also with decent coverage, but the UK looks to largely be England. Where’s Scotland (at all) and Ireland (in the present / red)?

    Then in the rest of the world you have lots of places (old USSR, South America, India, etc.) where there are blue dots and red dots but not a lot of purple showing. Huge change of instruments.

    There’s just not a lot that looks like a comparison of a thermometer with itself. It is almost entirely “this thermometer in the past to that other one now” and that is just never going to work for calorimetry. Not even to whole degrees worth. Ask any chemist with calorimetry experience what happens if you change (or even touch…) the thermometers…

  39. Pingback: An Intriguing Look At Temperatures In Australia | Musings from the Chiefio

  41. Larry Ledwick says:

    Just out of curiosity you have the locations of the thermometers, a query against the data base for all thermometers which have a matching pair (location) in the other would give you a hard number of stations which have not changed location over the time span.

    You might have to allow a small range of change to accommodate trivial shifts like moving to the back of the lot from the front of the lot. But it would be interesting to see the absolute count of instruments which have not changed locations over the span of measurements.

  42. E.M.Smith says:

    @Larry:

    Nice Try, but….

    One of my Major Gripes about GHCN is that a LOT of data that ought to be time series is stored as ONE data point. Take “Airstation”. The grass field that in the 1800s was a marching ground in Berlin has data from then, but it became an airport (the Berlin Airlift went to it…). In the earlier GHCN it was flagged as “Airstation” because it became one. A few years ago it was converted to a shopping mall.

    So what is it? You get ONE datapoint. A or no A?

    So, location: Any given WMO# has ONE Lat / Long pair. Move it, they got changed, the old pair is lost to the dustbin.

    My intent at one point was to do such a compare between GHCN v1, v2 and v3; but then found out V2 & V3 have the same Inventory File and V1 has an incompatible numbering system (they changed all the countries, for one thing…)

    So there is a HUGE manual process needed to figure out what station in V1 matches who in V2/V3. I have no idea what they have done to it in V4…

    So while it would be great to have Lat / Long / Vegetation / Airstation / etc. by year, it isn’t there. So anything that is an asphalt / cement Jet Airport today, is also one in 1860….

  43. Simon Derricutt says:

    Larry – I’m not sure that a shift from the front of the lot to the back can be regarded as trivial. I’ve seen several degrees difference only a few yards different location round here. AFAICT any change at all in the location, method, or the sensor used will produce data that can’t really be compared as apples to apples. Similarly, a change in the local environment will change the results (that field/airport/shopping mall EM talks about above).

  44. Larry Ledwick says:

    Well that sucks if the data is that devoid of useful information. I remember when WUWT was working on the station list that as people poked through the data they were finding that a fair number of stations were not in the locations they were supposed to be. One article he ran showed a coastal station had actual lat long data that put it in the middle of a bay.

    I realize just a short move can make a difference (i.e. when you move from the back lawn to next to the parking lot and into the hot exhaust of the window AC unit), but was thinking you could at least pin down that those two stations were in the same part of town, and general micro-climate minus human modifications like concrete runways.

    Bottom line looks like there is literally no valid reason to consider the different data lists comparable in any sense at all if you cannot at least demonstrate two stations share the same local area. So we are in the situation where they are not just comparing apples and oranges, but apples and ’52 Buicks: totally different temperature nets with totally different micro climate conditions, and they don’t even have the data available to show which sites have moved between one data set and the other.

    A year or so back there was an item that the temperature recording location in Death Valley had been moved to a significantly different location.

    https://wattsupwiththat.com/2013/06/30/it-seems-noaa-has-de-modernized-the-official-death-valley-station-to-use-older-equipment-to-make-a-record-more-likely/

  45. E.M.Smith says:

    @Larry L.:

    BINGO! Give that man a rubber ducky! Crap data, crappier metadata, all smeared around as the kiddies smear it around with their e-paint brush homogenizer.

  46. Larry Ledwick says:

    Steven Goddard certainly agrees with you.
    Very similar to the work you are doing in this article.

    https://realclimatescience.com/2019/02/61-of-noaa-ushcn-adjusted-temperature-data-is-now-fake/

  47. H.R. says:

    Hockey sticks everywhere you look.

  48. Pingback: GHCN v3.3 Regions Closeup Current vs Baseline | Musings from the Chiefio

  49. Pingback: Hadley Baseline Thermometers Vs Now – GHCN v3.3 | Musings from the Chiefio

  51. Pingback: GHCN v3.3 vs v4 – Top Level Entry Point | Musings from the Chiefio

Comments are closed.