Playing With Pythons & Pandas (& GHCN)

All I wanted to do was make a couple of simple, nearly trivial graphs…

After working out how to get the GHCN files loaded into SQL, and doing a couple of ad-hoc queries (proving it was worth the effort), I was ready to move on to “make a graph” at the leading edge of things (while still puttering around with indexes and key fields and efficiency and such on MySQL, as the low priority backend tasks).

A fair amount of searching for MySQL interface libraries for various languages showed they exist for some languages (notably Go has one) but they tended to be a bit of “assemble it yourself”. I didn’t find one for FORTRAN (it might exist, I just did a very quick web search), did find one on Sourceforge for DIY with Go, and there were hints of maybe something for R (but I really didn’t search much for it…). And, at every turn, oodles and oodles of links to stuff about Python.

OK, I can take a hint. When 80%+ of the stuff you turn up says “We use FOO!” (or, in Python code, “We use SPAM!”; they like Monty Python references, so it’s “SPAM and Eggs” instead of FOO and BAR), eventually you start to think “Maybe I ought to look at FOO?”. So I did.

I don’t hide my (mild) distaste for Python. It’s a fine language and all, but it does some things I just don’t like. Due to dynamic typing, a variable isn’t really a variable; it is a name that points to some data somewhere, and it is the data that DOES have a type. So: a = 1; or b = “SPAM” really just puts a pointer in a or b to those data items “somewhere”. Well, if you are used to thinking of an assignment as actually doing an assignment and pointers as being pointers, it is a bit annoying to need to keep in mind that these aren’t. MOST of the time you can ignore it. Yes, it is a nice touch that lets you do things like not bother to declare a data type; but… there can be the odd case where it bites you (two names pointing at the same changeable data, say), so you must know “this is different”. Similarly, I don’t like position dependent syntax. Change the white space, change the program… Having white space as part of the reserved word space you need to track is a bother… all to enforce their ideas about good pretty-printing practice for “structured programming”. OK, apply straitjacket and shut up…
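
For the record, here is a minimal sketch of the “odd case where it bites you”, in plain Python 3 with nothing assumed beyond the standard interpreter. Two names bound to one list see each other's changes, while rebinding a name just points it at something else.

```python
# Names are references to objects; assignment binds a name, it does not copy.
a = [1, 2, 3]
b = a            # b now refers to the SAME list object as a
b.append(4)      # mutating through b is visible through a

same_object = a is b

# Rebinding is different: it points the name at a new object.
c = "SPAM"
d = c
d = "EGGS"       # c is untouched, d simply refers to another string

# An explicit copy gives an independent list.
e = list(a)
e.append(5)

print(a, b, same_object)   # [1, 2, 3, 4] [1, 2, 3, 4] True
print(c, d)                # SPAM EGGS
print(a, e)                # [1, 2, 3, 4] [1, 2, 3, 4, 5]
```

So “b = a” followed by a change through b is the one to watch for; asking for a copy explicitly (list(a) here) is the usual way around it.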

So despite Python being the “3rd most popular” language among whoever bothered to be in the survey (the other two are Java and C), I’ve avoided using it unless there wasn’t much of an alternative (like doing maintenance on existing code).

But there it was: Just about every example was using Python… So OK, I’ve spent a couple of days “loading Python” into my brain again, and learning a little of how to use Pandas to graph things. I’m about to do my first “load stuff into Pandas and graph it”, but thought I would post up a couple of notes first. Any success at graphing will be added as an update.

One bit of irk was wandering for an hour or so down the list of Pythons. RPython, Cython, Jython (or some such), PyPy, and more. Why? Well, if you are going to use Python, you MUST PICK ONE, which means you must know enough to pick one, which means at least a minimal familiarity. This is one of my more general complaints about “trendy” things: They end up metastasizing into so many forms it is more work to choose one than to just use something else.

Jython is a Python that makes Java byte code for the JVM. OK, scratch that one.
Cython is a super-set that then spits out C or C++ code, mostly used for making other library-like stuff. Scratch.
RPython is Restricted Python, the subset used to write PyPy (the self-hosting Python). Scratch it, too.

The bottom line is that the Python you get by default on Linux (CPython, the reference implementation) is likely to be the one you want. Maybe.

Then we get 2.x vs 3.x Python. The 3.x Major Release is not entirely compatible with the 2.x one. OK, use python3 (that you may get by default, or not, depending on a lot of things, or you might need to call it out as python3).

It looks like I have both installed already:

root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas#  apt-get install python
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python is already the newest version.
python set to manually installed.
The following packages were automatically installed and are no longer required:
  libjsoncpp0 libuuid-perl
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.

root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python3
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python3 is already the newest version.
python3 set to manually installed.
The following packages were automatically installed and are no longer required:
  libjsoncpp0 libuuid-perl
Use 'apt-get autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.

But I’ll need to explicitly say “python3” to get the newer dialect.
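
A small sketch of how a script can check which dialect it was started under, using only the standard sys module. The helper names here are my own, made up for illustration.

```python
import sys

def describe_interpreter(version_info=sys.version_info):
    """Return a short label such as 'Python 3.4' for the given version tuple."""
    return "Python {}.{}".format(version_info[0], version_info[1])

def is_python3(version_info=sys.version_info):
    """True when the major version is 3 or later."""
    return version_info[0] >= 3

print(describe_interpreter())
if not is_python3():
    # Bail out early rather than fail later on a 2.x vs 3.x difference.
    sys.exit("This script expects to be run as python3")
```

Putting a guard like that at the top of a script gives a clear message if it gets launched with plain “python” on a box where that still means 2.x.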

Then, while researching how big a Dataframe (i.e. what EVERYONE else calls a table…) can be, I found that 32 bit Python will reportedly barf on about 1.8 GB of data (memory peaks at about 3.x GB during the load, and that exceeds what a 32 bit process can address), so you will want 64 bit Python then. How one gets 64 vs 32 bit was left unclear… though it was implied to be disjoint from machine word size. I did a quick check of the size of GHCN and either one will do:

chiefio@odroidxu4:~/SQL/v3$ ls -l
total 704796
[...]
-rw-r--r-- 1 root    root       786240 Jan 23 21:14 inventory.in
-rw-r--r-- 1 root    root      1048320 Jan 23 21:43 inv.out
[...]
-rw-r--r-- 1 chiefio chiefio  53690600 Jan 20 20:39 temps.in
-rw-r--r-- 1 chiefio chiefio 344360400 Jan 20 20:40 temps.u.in

So the Inventory File even as a tab column list is only 1 MB, and the entire GHCNv3 temperature data file similarly expanded is 344 MB. They can both be sucked into a Pandas Dataframe of either word size Python. (And fit nicely in memory of my 2 GB XU4…)

At least I can now ignore what “size” Python to run…
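
Should the question come back with bigger data, the bit width is a property of how the interpreter itself was built, and it can be asked from inside Python. A minimal sketch using only the standard library (the function names are mine):

```python
import struct
import sys

def pointer_bits():
    """Size of a C pointer in this interpreter build, in bits (32 or 64)."""
    return struct.calcsize("P") * 8

def is_64bit():
    """sys.maxsize exceeds 2**32 only on a 64 bit build."""
    return sys.maxsize > 2**32

print("This Python is a", pointer_bits(), "bit build")
```

On an armhf install I would expect that to report 32, though I have not shown a run of it here.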

CSV, Dataframes, and more

The examples of how to do this generally just call an “object” that is in the Pandas library to do the loading. They universally give lip service to some OTHER format load choices, then show only the CSV (Comma Separated Values) load. Why the world so loves CSV is beyond me. Text and numbers are often FULL of commas that cause grief for that mode. So I’m going to end up “discovering” what those other modes of data loading are and working it out myself. But for my first attempt, I’m going to go ahead and do CSV. On the “inventory” file as it is the smaller one.
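
To be fair to CSV, the format does have an answer to embedded commas: wrap the field in double quotes. A minimal sketch with Python's standard csv module, using two station names from the inventory file; it is not the route taken in this posting, just the alternative.

```python
import csv
import io

# Two station names taken from the inventory file; both contain commas.
rows = [
    ["21047678001", "30.5000", "140.3000", "TORISHIMA,JAPAN"],
    ["21544213001", "49.7000", "96.4000", "BAYAN UUL, DZAVHAN"],
]

# csv.writer wraps any field containing the delimiter in double quotes.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
text = buf.getvalue()
print(text)

# A quoting-aware reader gets the original four fields back.
parsed = list(csv.reader(io.StringIO(text)))
print(parsed == rows)   # True
```

The catch is that whatever writes the file has to do the quoting, and whatever reads it has to honor it.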

A ‘quick search’ showed it very comma infested:

root@odroidxu4:/SG2/ext/chiefio/SQL/v3# grep "," inventory.in 
10266460001 -17.1000   15.7000 1150.0 PEREIRA D,ECA       ANGO       1095R   -9FLxxno-9A-9SUCCULENT THORNSA
10468026000 -18.3700   21.8500 1000.0 SHAKAWE                         920R   -9FLDEno-9A-9MARSH, SWAMP    C
10764910000   4.0000    9.7300    9.0 DOUALA OBS.                      12U  458FLxxCO 5A 1MARSH, SWAMP    C
12061816000  10.9300  -14.3200   69.0 BOKE                             41R   -9HIFOno-9x-9MARSH, SWAMP    A
12161769000  11.5800  -15.4800   20.0 BOLAMA                            3S   10FLxxCO 3A 3MARSH, SWAMP    A
[...]
20550983000  45.7700  132.9700  103.0 HULIN                            76S   20FLxxno-9x-9BOGS, BOG WOODS C
20551334000  44.6200   82.9000  321.0 JINGHE                          441R   -9MVDEno-9x-9MARSH, SWAMP    B
20551716000  39.8000   78.5700 1117.0 BACHU                          1331R   -9FLDEno-9x-9MARSH, SWAMP    A
20551730000  40.5000   81.0500 1013.0 ALAR                           1070R   -9FLDEno-9x-9MARSH, SWAMP    A
[...]
21047678001  30.5000  140.3000   83.0 TORISHIMA,JAPAN                   0R   -9HIxxCO 1x-9WATER           A
21544212001  49.8000   92.1000  934.0 MONGOLIAN STATION,BAYAN-OL     1203S   20MVxxno-9x-9COOL DESERT     A
21544213001  49.7000   96.4000 -999.0 BAYAN UUL, DZAVHAN             1680R   -9MVxxno-9x-9COOL DESERT     A
21544214001  48.3000   89.5000 -999.0 ALTAY, BAYAN-OLIGY             2624R   -9MVxxno-9x-9COOL GRASS/SHRUBA
21544218001  47.1000   92.8000 -999.0 MONGOLIAN STATION, HOVD        1390R   -9MVDEno-9x-9WARM GRASS/SHRUBA
21544237001  48.2000   99.9000 -999.0 MONGOLIAN STATION, N. HANG     2529R   -9MVxxno-9x-9COOL CONIFER    A
21544239001  49.0000  104.1000 -999.0 ERDENET, BULGAN                1403R   -9HIxxno-9A-9COOL GRASS/SHRUBB
21544241000  48.9000  106.1000  807.0 BAYAN-GOL, SELENGE              914R   -9HIxxno-9x-9COOL GRASS/SHRUBA
21544241002  48.9000  106.9000 -999.0 MONGOLIAN STATION, SELENGE     1311R   -9MVxxno-9x-9WARM CROPS      A
21544241003  49.2000  105.4000 -999.0 ORBON, SELENGE                  964R   -9HIxxno-9x-9COOL GRASS/SHRUBA
21544241004  49.8000  106.7000 -999.0 YOROO, SELENGE                  958R   -9FLMAno-9x-9COOL GRASS/SHRUBA
21544241005  50.1000  106.2000 -999.0 SHAAM, SELENGE                  731R   -9HIxxno-9x-9COOL GRASS/SHRUBA
[...]

So I get to “de-comma” it… As I learned vi back in the dark ages, and it now resides in my brain stem so the magic key patterns just happen without my knowing what they are anymore, I use the vi editor (on Linux it is usually Vim, “Vi IMproved”, with vi as an alias). In vi, you can use “Regular Expressions” (just as in “grep”: global regular expression print). In vi, the “globally replace” command is, at the : prompt

:g/,/s//;/g

You may hate the excessive terseness of hard core *Nix, or love it, but for me being able to replace all “,” with “;” in 10 CHAR is a good thing. To decode that line: the “:” gets you a command prompt in vi; then “g” is “globally”, meaning “do all lines” (you can also do line ranges); the “/,/” says “search for comma”; the “s” is “substitute”; the empty “//” means “that thing you just found” (the comma); “;/” is the replacement, “a semicolon”; and the final “g” is globally on each line, i.e. not just the first one on the line but anywhere and everywhere in that row. All that in 10 CHAR.

Why use a “;”? Because there were none in the file:

root@odroidxu4:/SG2/ext/chiefio/SQL/v3# grep ";" inventory.in
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# 

So there will not be a collision with any other meaning.
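
For anyone without vi in the brain stem, the same two steps (check that the replacement character is unused, then swap every comma) can be sketched in Python. The function names and the list of candidate characters are my own choices for illustration.

```python
def pick_safe_char(text, candidates=";|~"):
    """Return the first candidate character that never occurs in text."""
    for ch in candidates:
        if ch not in text:
            return ch
    raise ValueError("every candidate already appears in the data")

def decomma(text, replacement):
    """Replace every comma, the same job as :g/,/s//;/g in vi."""
    return text.replace(",", replacement)

# Two comma-bearing fragments from the inventory file.
sample = (
    "TORISHIMA,JAPAN\n"
    "MARSH, SWAMP\n"
)
safe = pick_safe_char(sample)
print(decomma(sample, safe))
```

For the real file, the text would come from reading inventory.in and the result would be written out as semiinven.in.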

Now, with it stripped of the dreaded comma, I’ve modified my “Tab inserting FORTRAN program” to insert a comma between fields. I did a similar “global substitution” of “COM” for “TAB” and set COM=",". Also changed the input / output file names just to keep things clear.

root@odroidxu4:/SG2/ext/chiefio/SQL/v3# head semiinven.in 
10160355000  36.9300    6.9500    7.0 SKIKDA                           18U  107HIxxCO 1x-9WARM DECIDUOUS  C
10160360000  36.8300    7.8200    4.0 ANNABA                           33U  256FLxxCO 1A 7WARM CROPS      C
10160390000  36.7200    3.2500   25.0 DAR-EL-BEIDA                     34U 1365FLxxCO10A 6WARM CROPS      C
10160395001  36.5200    4.1800  942.0 FT. NATIONAL                    805R   -9MVDEno-9x-9WARM CROPS      A
10160400001  36.8000    5.1000  230.0 CAP CARBON                       28R   -9HIxxCO 1x-9WATER           A
10160402000  36.7200    5.0700    2.0 BEJAIA                          121U   90HIxxCO 1A 3WATER           B
10160403000  36.4700    7.4700  227.0 GUELMA                          287S   47HIxxno-9x-9WARM CROPS      C
10160419000  36.2800    6.6200  694.0 CONSTANTINE                     563U  335MVxxno-9A 7WARM FOR./FIELD B
10160425000  36.2200    1.3300  143.0 CHLEF                           242U  106HIxxno-9A 3WARM CROPS      C
10160425001  36.1700    1.5000  112.0 ORLEANSVILLE                    219R   -9HIDEno-9x-9WARM CROPS      A
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# ls *.f
ccodesv3.f  csvinvn.f  inven.f	stationdat.f

So that csvinvn.f is the FORTRAN to put this input file (where I’ve swapped commas to ;) in CSV condition so Pandas can load it “in the usual way”. Someday later I’ll work out how to use some non-CSV Dataframe load; but this gets me over the hump to “doing something in Pandas” with about 2 minutes of effort. Far less than it took to do this writeup about it ;-)

This input file is semiinven.in as it is full of semi-colons ;-)

root@odroidxu4:/SG2/ext/chiefio/SQL/v3# gfortran csvinvn.f 
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# ./a.out > invent.csv
[...]
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# head invent.csv 
Gv3  ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300,   6.9500,   7.0,SKIKDA                        ,  18,U,  107,HI,xx,CO, 1,x,-9,WARM DECIDUOUS  ,C
Gv3  ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300,   7.8200,   4.0,ANNABA                        ,  33,U,  256,FL,xx,CO, 1,A, 7,WARM CROPS      ,C
Gv3  ,7Sept2015 ,QCU ,1,101,60390,000, 36.7200,   3.2500,  25.0,DAR-EL-BEIDA                  ,  34,U, 1365,FL,xx,CO,10,A, 6,WARM CROPS      ,C
Gv3  ,7Sept2015 ,QCU ,1,101,60395,001, 36.5200,   4.1800, 942.0,FT. NATIONAL                  , 805,R,   -9,MV,DE,no,-9,x,-9,WARM CROPS      ,A
Gv3  ,7Sept2015 ,QCU ,1,101,60400,001, 36.8000,   5.1000, 230.0,CAP CARBON                    ,  28,R,   -9,HI,xx,CO, 1,x,-9,WATER           ,A
Gv3  ,7Sept2015 ,QCU ,1,101,60402,000, 36.7200,   5.0700,   2.0,BEJAIA                        , 121,U,   90,HI,xx,CO, 1,A, 3,WATER           ,B
Gv3  ,7Sept2015 ,QCU ,1,101,60403,000, 36.4700,   7.4700, 227.0,GUELMA                        , 287,S,   47,HI,xx,no,-9,x,-9,WARM CROPS      ,C
Gv3  ,7Sept2015 ,QCU ,1,101,60419,000, 36.2800,   6.6200, 694.0,CONSTANTINE                   , 563,U,  335,MV,xx,no,-9,A, 7,WARM FOR./FIELD ,B
Gv3  ,7Sept2015 ,QCU ,1,101,60425,000, 36.2200,   1.3300, 143.0,CHLEF                         , 242,U,  106,HI,xx,no,-9,A, 3,WARM CROPS      ,C
Gv3  ,7Sept2015 ,QCU ,1,101,60425,001, 36.1700,   1.5000, 112.0,ORLEANSVILLE                  , 219,R,   -9,HI,DE,no,-9,x,-9,WARM CROPS      ,A
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# 

And “Bob’s Your Uncle!” it’s done. A CSV file for input.
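
As a rough sketch of where this is headed, here is how a file shaped like invent.csv might be pulled into a Dataframe with pandas.read_csv. The column names are borrowed from the variable names in csvinvn.f (the file itself has no header row), and the two rows are copied from the head of invent.csv to stand in for the real file. It assumes pandas is installed and is not necessarily how the load finally gets done.

```python
import io
import pandas as pd

# Column names borrowed from csvinvn.f; an assumption, the file has no header.
COLUMNS = [
    "VERSION", "ASCEN", "TYPE", "CONT", "COUNTRY", "WMO", "NEAR",
    "LATITUDE", "LONGITUDE", "STNELEV", "NAME", "GRIDELEV", "POPC",
    "PSIZE", "TOPOT", "VEG", "PROXW", "DISTW", "AIR", "DISTAU",
    "VEGGRID", "POPNL",
]

# Two rows from the head of invent.csv, standing in for the real file.
sample = io.StringIO(
    "Gv3  ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300,   6.9500,   7.0,"
    "SKIKDA                        ,  18,U,  107,HI,xx,CO, 1,x,-9,"
    "WARM DECIDUOUS  ,C\n"
    "Gv3  ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300,   7.8200,   4.0,"
    "ANNABA                        ,  33,U,  256,FL,xx,CO, 1,A, 7,"
    "WARM CROPS      ,C\n"
)

# dtype=str keeps codes such as "000" from being turned into the integer 0.
df = pd.read_csv(sample, header=None, names=COLUMNS, dtype=str)

# Trim the fixed-width padding, then convert the coordinates to numbers.
df = df.apply(lambda col: col.str.strip())
df["LATITUDE"] = pd.to_numeric(df["LATITUDE"])
df["LONGITUDE"] = pd.to_numeric(df["LONGITUDE"])

print(df[["WMO", "NAME", "LATITUDE", "LONGITUDE"]])
```

With the real file, the io.StringIO object would be replaced by the file name "invent.csv".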

For completeness, this is a listing of the csvinvn.f program, though it is only trivially different from the TAB one.

root@odroidxu4:/SG2/ext/chiefio/SQL/v3# cat csvinvn.f 
C FORTRAN to read the inventory files v3 GHCN file and insert COMMA 
C in the output.  Also divides "country" into Continent and Country
C
C Variable declarations...
C
      CHARACTER * 1  COM
      CHARACTER * 5  VERSION
      CHARACTER * 10 ASCEN
      CHARACTER * 4  TYPE, GRIDELEV
      CHARACTER * 1  CONT,POPC,AIR,POPNL
      CHARACTER * 2  COUNTRY,TOPOT,VEG,PROXW,DISTW,DISTAU
      CHARACTER * 5  WMO,PSIZE
      CHARACTER * 3  NEAR
      CHARACTER * 6  STNELEV
      CHARACTER * 8  LATITUDE
      CHARACTER * 9  LONGITUDE
      CHARACTER * 16 VEGGRID
      CHARACTER * 30 NAME

C
C Set the COM character
      COM=","
C
C Set some constants
      VERSION="Gv3"
      ASCEN="7Sept2015"
      TYPE="QCU"
C
C Read in one line of data...
C
    9 OPEN(1, FILE='semiinven.in', STATUS='OLD', ACTION='READ')
   10 READ (1, 11, END=99) CONT, COUNTRY, WMO, NEAR,LATITUDE,           &
     &LONGITUDE,STNELEV,NAME,GRIDELEV,POPC,PSIZE,TOPOT,VEG,PROXW,       &
     &DISTW,AIR,DISTAU,VEGGRID,POPNL
C
   11 FORMAT (A1,A2,A5,A3,X,A8,X,A9,X,A6,X,A30,X,A4,A1,A5,A2,A2,A2,     &
     &A2,A1,A2,A16,A1)
C
C Convert CHAR  to Float 

C         READ (T(I),*,END=20) F(I)
C
C Write out one line of data with COM between fields
C
C      WRITE (6, 6) VERSION,ASCEN,TYPE,CONT,COUNTRY,WMO,NEAR,LATITUDE,   &
C     &LONGITUDE,STNELEV,NAME,GRIDELEV,POPC,PSIZE,TOPOT,VEG,PROXW,DISTW, &
C     &AIR,DISTAU,VEGGRID,POPNL

C    6 FORMAT (A5,A10,A4,A1,A2,A5,A3,A8,A9,A6,A30,A4,A1,A5,A2,A2,A2,     &
C     &A2,A1,A2,A16,A1)

      WRITE (6, 7) VERSION,COM,ASCEN,COM,TYPE,COM,CONT,COM,CONT,COUNTRY,&
     &COM,WMO,COM,NEAR,COM,LATITUDE,COM,LONGITUDE,COM,STNELEV,COM,NAME, &
     &COM,GRIDELEV,COM,POPC,COM,PSIZE,COM,TOPOT,COM,VEG,COM,PROXW,COM,  &
     &DISTW,COM,AIR,COM,DISTAU,COM,VEGGRID,COM,POPNL

    7 FORMAT (A5,A1,A10,A1,A4,A1,A1,A1,A1,A2,A1,A5,A1,A3,A1,A8,A1,A9,A1,&
     &A6,A1,A30,A1,A4,A1,A1,A1,A5,A1,A2,A1,A2,A1,A2,A1,                 &
     &A2,A1,A1,A1,A2,A1,A16,A1,A1)

C Retrieve another line of data...
C
      GO TO 10
C
C If end of file, then stop.
C
   99 STOP
      END
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# 
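
For comparison, the same slicing can be sketched in Python. This is not the program used here, just a stand-in that copies the field widths from FORMAT 11 and the output order from the WRITE statement. The sample line is assembled piece by piece so the widths are visible; the function names are mine.

```python
# (name, width) pairs in input order; None marks a skipped blank (the X).
LAYOUT = [
    ("CONT", 1), ("COUNTRY", 2), ("WMO", 5), ("NEAR", 3), (None, 1),
    ("LATITUDE", 8), (None, 1), ("LONGITUDE", 9), (None, 1),
    ("STNELEV", 6), (None, 1), ("NAME", 30), (None, 1),
    ("GRIDELEV", 4), ("POPC", 1), ("PSIZE", 5), ("TOPOT", 2), ("VEG", 2),
    ("PROXW", 2), ("DISTW", 2), ("AIR", 1), ("DISTAU", 2),
    ("VEGGRID", 16), ("POPNL", 1),
]

def split_fixed(line):
    """Cut one inventory line into a dict of field name -> raw text."""
    fields, pos = {}, 0
    for name, width in LAYOUT:
        if name is not None:
            fields[name] = line[pos:pos + width]
        pos += width
    return fields

def to_csv_row(line):
    """Build one comma separated output line, padding left intact."""
    f = split_fixed(line)
    # Like the FORTRAN WRITE: continent alone, then continent + country.
    out = ["Gv3  ", "7Sept2015 ", "QCU ", f["CONT"], f["CONT"] + f["COUNTRY"]]
    out += [f[name] for name, _ in LAYOUT
            if name not in (None, "CONT", "COUNTRY")]
    return ",".join(out)

# First record of semiinven.in, built from its pieces (107 characters).
sample = (
    "10160355000" + " " + " 36.9300" + " " + "   6.9500" + " " + "   7.0"
    + " " + "SKIKDA".ljust(30) + " " + "  18" + "U" + "  107" + "HI" + "xx"
    + "CO" + " 1" + "x" + "-9" + "WARM DECIDUOUS".ljust(16) + "C"
)
print(to_csv_row(sample))
```

Looping that over every line of semiinven.in and writing the results out would give the same kind of file as ./a.out > invent.csv.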

I’m now going to spend some unknown number of hours attempting to suck that CSV file into a Pandas Dataframe and graph something. Perhaps number of WMO by latitude… or a scatter chart of that…
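
Before any graphing, the “number of WMO by latitude” idea can be roughed out as a plain count per 10 degree band. A minimal sketch with the standard csv module; the band width and function names are my own choices, and the three rows come from the head of invent.csv (all Algeria, so they land in one band).

```python
import csv
import io
from collections import Counter

LAT_COL = 7  # zero-based position of LATITUDE in invent.csv

def band(lat, width=10):
    """Lower edge of the latitude band containing lat, e.g. 36.93 -> 30."""
    return int(lat // width) * width

def stations_per_band(lines, width=10):
    """Count rows (stations) per latitude band."""
    counts = Counter()
    for row in csv.reader(lines):
        counts[band(float(row[LAT_COL]), width)] += 1
    return counts

# Three rows from the head of invent.csv, standing in for the whole file.
sample = io.StringIO(
    "Gv3  ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300,   6.9500,   7.0,"
    "SKIKDA                        ,  18,U,  107,HI,xx,CO, 1,x,-9,"
    "WARM DECIDUOUS  ,C\n"
    "Gv3  ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300,   7.8200,   4.0,"
    "ANNABA                        ,  33,U,  256,FL,xx,CO, 1,A, 7,"
    "WARM CROPS      ,C\n"
    "Gv3  ,7Sept2015 ,QCU ,1,101,60390,000, 36.7200,   3.2500,  25.0,"
    "DAR-EL-BEIDA                  ,  34,U, 1365,FL,xx,CO,10,A, 6,"
    "WARM CROPS      ,C\n"
)
counts = stations_per_band(sample)
for lower_edge in sorted(counts):
    print(lower_edge, counts[lower_edge])
```

Passing an open file handle for invent.csv instead of the sample would give the counts for every station.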

When I’m done, this posting will get an UPDATE here.

Though first I think I’m going to go get breakfast and fresh coffee… Yes, this was done before breakfast ;-)

UPDATE!

After a lot of wandering in the forest (documented in comments below) I made my first plot. It is LATitude and LONgitude of stations on a graph. No idea if it is right, or not, or what. But here it is.

Latitude and Longitude plotted for all stations in GHCN v3

This will just be the latitude and longitude for stations plotted against the order of records. It has some quasi meaning as records are arranged by continent, so sort of in physical blobs. Yeah, mostly meaningless. Doesn’t matter. I’ve now done the whole thing from data format to load to graph. So from here on out it is just polishing and adding incremental skills.

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...
This entry was posted in NCDC - GHCN Issues, Tech Bits.

21 Responses to Playing With Pythons & Pandas (& GHCN)

  1. Steven Fraser says:

    Fun, so far….

  2. E.M.Smith says:

    @Steven:

    Yeah, but after breakfast I got side tracked into things like banking, paying bills, grocery run, getting dinner ready, 1/2 bottle of sake, washing dishes & the dog…. not together, mind you :-)

    So I’ve made no further progress today… maybe after the spouse goes to bed and the house is quiet again….

  3. E.M.Smith says:

    Well this sucks…

    Decided I could at least load some of the library stuff and prep…

    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# python3
    Python 3.4.2 (default, Sep 26 2018, 05:38:50) 
    [GCC 4.9.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pandas as pd
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named 'pandas'
    >>> import datetime
    >>> 
    >>> import pandas_datareader
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named 'pandas_datareader'
    >>> 
    

    OK…. So looks like there’s some “install libraries” step left out of the various tutorials (mostly based around Mac or Windows installs using an IDE of various sorts)…

    datetime is there, but the others are not.
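
A quicker way to find out what is and isn't there, without tripping over one traceback at a time, is to ask importlib whether each module can be located. A minimal sketch (importlib.util.find_spec needs Python 3.4 or later; the list of names is just the ones I expect to want):

```python
import importlib.util

def is_installed(module_name):
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None

for name in ("datetime", "pandas", "pandas_datareader", "matplotlib"):
    print(name, "ok" if is_installed(name) else "MISSING")
```

Anything reported MISSING then needs its own package installed before the tutorials will run.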

  4. E.M.Smith says:

    OK, looks like it is a separate package:

    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python-pandas
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      python-dateutil python-pandas-lib python-tz
    Suggested packages:
      python-pandas-doc
    Recommended packages:
      python-scipy python-matplotlib python-tables python-numexpr python-xlrd python-statsmodels python-openpyxl
      python-xlwt python-bs4
    The following NEW packages will be installed:
      python-dateutil python-pandas python-pandas-lib python-tz
    0 upgraded, 4 newly installed, 0 to remove and 5 not upgraded.
    Need to get 2,591 kB of archives.
    After this operation, 12.9 MB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main python-dateutil all 2.2-2 [51.3 kB]
    Get:2 http://auto.mirror.devuan.org/merged/ jessie/main python-tz all 2012c+dfsg-0.1 [31.9 kB]
    Get:3 http://auto.mirror.devuan.org/merged/ jessie/main python-pandas-lib armhf 0.14.1-2 [1,256 kB]
    Get:4 http://auto.mirror.devuan.org/merged/ jessie/main python-pandas all 0.14.1-2 [1,252 kB]
    Fetched 2,591 kB in 3s (750 kB/s)    
    Selecting previously unselected package python-dateutil.
    (Reading database ... 97490 files and directories currently installed.)
    Preparing to unpack .../python-dateutil_2.2-2_all.deb ...
    Unpacking python-dateutil (2.2-2) ...
    Selecting previously unselected package python-tz.
    Preparing to unpack .../python-tz_2012c+dfsg-0.1_all.deb ...
    Unpacking python-tz (2012c+dfsg-0.1) ...
    Selecting previously unselected package python-pandas-lib.
    Preparing to unpack .../python-pandas-lib_0.14.1-2_armhf.deb ...
    Unpacking python-pandas-lib (0.14.1-2) ...
    Selecting previously unselected package python-pandas.
    Preparing to unpack .../python-pandas_0.14.1-2_all.deb ...
    Unpacking python-pandas (0.14.1-2) ...
    Setting up python-dateutil (2.2-2) ...
    Setting up python-tz (2012c+dfsg-0.1) ...
    Setting up python-pandas-lib (0.14.1-2) ...
    Setting up python-pandas (0.14.1-2) ...
    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# 
    
  5. E.M.Smith says:

    And a separate Python3 version of pandas:

    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python3-pandas
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      python3-dateutil python3-numpy python3-pandas-lib python3-six python3-tz
    Suggested packages:
      python-numpy-doc python3-dev python3-nose python3-numpy-dbg python-pandas-doc
    Recommended packages:
      python3-scipy python3-matplotlib python3-numexpr python3-tables python3-bs4 python3-html5lib
    The following NEW packages will be installed:
      python3-dateutil python3-numpy python3-pandas python3-pandas-lib python3-six python3-tz
    0 upgraded, 6 newly installed, 0 to remove and 5 not upgraded.
    Need to get 4,065 kB of archives.
    After this operation, 20.6 MB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main python3-six all 1.8.0-1 [12.7 kB]
    Get:2 http://auto.mirror.devuan.org/merged/ jessie/main python3-dateutil all 2.2-2 [33.2 kB]       
    Get:3 http://auto.mirror.devuan.org/merged/ jessie/main python3-numpy armhf 1:1.8.2-2 [1,525 kB]
    Get:4 http://auto.mirror.devuan.org/merged/ jessie/main python3-tz all 2012c+dfsg-0.1 [25.4 kB]
    Get:5 http://auto.mirror.devuan.org/merged/ jessie/main python3-pandas-lib armhf 0.14.1-2 [1,219 kB]
    Get:6 http://auto.mirror.devuan.org/merged/ jessie/main python3-pandas all 0.14.1-2 [1,249 kB]
    Fetched 4,065 kB in 2s (1,408 kB/s)   
    Selecting previously unselected package python3-six.
    (Reading database ... 97896 files and directories currently installed.)
    Preparing to unpack .../python3-six_1.8.0-1_all.deb ...
    Unpacking python3-six (1.8.0-1) ...
    Selecting previously unselected package python3-dateutil.
    Preparing to unpack .../python3-dateutil_2.2-2_all.deb ...
    Unpacking python3-dateutil (2.2-2) ...
    Selecting previously unselected package python3-numpy.
    Preparing to unpack .../python3-numpy_1%3a1.8.2-2_armhf.deb ...
    Unpacking python3-numpy (1:1.8.2-2) ...
    Selecting previously unselected package python3-tz.
    Preparing to unpack .../python3-tz_2012c+dfsg-0.1_all.deb ...
    Unpacking python3-tz (2012c+dfsg-0.1) ...
    Selecting previously unselected package python3-pandas-lib.
    Preparing to unpack .../python3-pandas-lib_0.14.1-2_armhf.deb ...
    Unpacking python3-pandas-lib (0.14.1-2) ...
    Selecting previously unselected package python3-pandas.
    Preparing to unpack .../python3-pandas_0.14.1-2_all.deb ...
    Unpacking python3-pandas (0.14.1-2) ...
    Processing triggers for man-db (2.7.0.2-5) ...
    Setting up python3-six (1.8.0-1) ...
    Setting up python3-dateutil (2.2-2) ...
    Setting up python3-numpy (1:1.8.2-2) ...
    Setting up python3-tz (2012c+dfsg-0.1) ...
    Setting up python3-pandas-lib (0.14.1-2) ...
    Setting up python3-pandas (0.14.1-2) ...
    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# 
    

    This link implies there may be a lot more libraries that must be individually installed before you can actually use stuff:

    https://packages.debian.org/sid/python-pandas

    
    Other Packages Related to python-pandas
    
    
        dep: python
            interactive high-level object-oriented language (Python2 version) 
    
        dep: python (<= 2.7)
    
        dep: python-dateutil
            powerful extensions to the standard Python datetime module 
    
        dep: python-numpy (>= 1:1.7~)
            Numerical Python adds a fast array facility to the Python language 
    
        dep: python-pandas-lib (>= 0.23.3-1)
            low-level implementations and bindings for pandas 
    
        dep: python-pkg-resources
            Package Discovery and Resource Access using pkg_resources 
    
        dep: python-six
            Python 2 and 3 compatibility library (Python 2 interface) 
    
        dep: python-tz
            Python version of the Olson timezone database 
    
        rec: python-bs4
            error-tolerant HTML parser for Python 
    
        rec: python-html5lib
            HTML parser/tokenizer based on the WHATWG HTML5 specification 
    
        rec: python-lxml
            pythonic binding for the libxml2 and libxslt libraries 
    
        rec: python-matplotlib
            Python based plotting system in a style similar to Matlab 
    
        rec: python-numexpr
            Fast numerical array expression evaluator for Python and NumPy 
    
        rec: python-openpyxl
            Python module to read/write OpenXML xlsx/xlsm files 
    
        rec: python-scipy
            scientific tools for Python 
    
        rec: python-statsmodels
            Python module for the estimation of statistical models 
    
        rec: python-tables
            hierarchical database for Python based on HDF5 
    
        rec: python-xlrd
            extract data from Microsoft Excel spreadsheet files 
    
        rec: python-xlwt
            module for writing Microsoft Excel spreadsheet files - Python 2.7 
    
        sug: python-pandas-doc
            documentation and examples for pandas 
    

    That “python-pandas-doc” looks like one I’ll be wanting…

    I suppose it would be tacky to point out that I’m not keen on having things commonly done in a language not installed when you install the language…. But, OK, I’ll get to cruise that list looking for other things I expect to use / need (and check if there are python3-xxxxx variations also…).

  6. E.M.Smith says:

    As the math plot library was one I was planning to use…

    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python-matplotlib
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      fonts-lyx libjs-jquery libjs-jquery-ui libtcl8.6 libtk8.6 python-matplotlib-data python-mock python-nose
      python-pyparsing
    Suggested packages:
      libjs-jquery-ui-docs tcl8.6 tk8.6 dvipng gir1.2-gtk-3.0 inkscape ipython python-cairocffi python-configobj
      python-excelerator python-matplotlib-doc python-qt4 python-scipy python-sip python-tornado python-traits
      python-wxgtk3.0 texlive-extra-utils texlive-latex-extra ttf-staypuft python-mock-doc python-coverage
      python-nose-doc
    Recommended packages:
      javascript-common python-imaging python-tk
    The following NEW packages will be installed:
      fonts-lyx libjs-jquery libjs-jquery-ui libtcl8.6 libtk8.6 python-matplotlib python-matplotlib-data
      python-mock python-nose python-pyparsing
    0 upgraded, 10 newly installed, 0 to remove and 5 not upgraded.
    Need to get 9,297 kB of archives.
    After this operation, 27.2 MB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main libtcl8.6 armhf 8.6.2+dfsg-2 [882 kB]                 
    Get:2 http://auto.mirror.devuan.org/merged/ jessie/main libtk8.6 armhf 8.6.2-1 [686 kB]             
    Get:3 http://auto.mirror.devuan.org/merged/ jessie/main fonts-lyx all 2.1.2-2 [176 kB]             
    Get:4 http://auto.mirror.devuan.org/merged/ jessie/main libjs-jquery all 1.7.2+dfsg-3.2 [97.5 kB]
    Get:5 http://auto.mirror.devuan.org/merged/ jessie/main libjs-jquery-ui all 1.10.1+dfsg-1 [499 kB]
    Get:6 http://auto.mirror.devuan.org/merged/ jessie/main python-matplotlib-data all 1.4.2-3.1 [3,041 kB]
    Get:7 http://auto.mirror.devuan.org/merged/ jessie/main python-pyparsing all 2.0.3+dfsg1-1 [64.0 kB]
    Get:8 http://auto.mirror.devuan.org/merged/ jessie/main python-mock all 1.0.1-3 [33.2 kB]
    Get:9 http://auto.mirror.devuan.org/merged/ jessie/main python-nose all 1.3.4-1 [134 kB]
    Get:10 http://auto.mirror.devuan.org/merged/ jessie/main python-matplotlib armhf 1.4.2-3.1 [3,684 kB]
    Fetched 9,297 kB in 4s (2,091 kB/s)             
    Selecting previously unselected package libtcl8.6:armhf.
    (Reading database ... 98751 files and directories currently installed.)
    Preparing to unpack .../libtcl8.6_8.6.2+dfsg-2_armhf.deb ...
    Unpacking libtcl8.6:armhf (8.6.2+dfsg-2) ...
    Selecting previously unselected package libtk8.6:armhf.
    Preparing to unpack .../libtk8.6_8.6.2-1_armhf.deb ...
    Unpacking libtk8.6:armhf (8.6.2-1) ...
    Selecting previously unselected package fonts-lyx.
    Preparing to unpack .../fonts-lyx_2.1.2-2_all.deb ...
    Unpacking fonts-lyx (2.1.2-2) ...
    Selecting previously unselected package libjs-jquery.
    Preparing to unpack .../libjs-jquery_1.7.2+dfsg-3.2_all.deb ...
    Unpacking libjs-jquery (1.7.2+dfsg-3.2) ...
    Selecting previously unselected package libjs-jquery-ui.
    Preparing to unpack .../libjs-jquery-ui_1.10.1+dfsg-1_all.deb ...
    Unpacking libjs-jquery-ui (1.10.1+dfsg-1) ...
    Selecting previously unselected package python-matplotlib-data.
    Preparing to unpack .../python-matplotlib-data_1.4.2-3.1_all.deb ...
    Unpacking python-matplotlib-data (1.4.2-3.1) ...
    Selecting previously unselected package python-pyparsing.
    Preparing to unpack .../python-pyparsing_2.0.3+dfsg1-1_all.deb ...
    Unpacking python-pyparsing (2.0.3+dfsg1-1) ...
    Selecting previously unselected package python-mock.
    Preparing to unpack .../python-mock_1.0.1-3_all.deb ...
    Unpacking python-mock (1.0.1-3) ...
    Selecting previously unselected package python-nose.
    Preparing to unpack .../python-nose_1.3.4-1_all.deb ...
    Unpacking python-nose (1.3.4-1) ...
    Selecting previously unselected package python-matplotlib.
    Preparing to unpack .../python-matplotlib_1.4.2-3.1_armhf.deb ...
    Unpacking python-matplotlib (1.4.2-3.1) ...
    Processing triggers for fontconfig (2.11.0-6.3+deb8u1) ...
    Processing triggers for man-db (2.7.0.2-5) ...
    Setting up libtcl8.6:armhf (8.6.2+dfsg-2) ...
    Setting up libtk8.6:armhf (8.6.2-1) ...
    Setting up fonts-lyx (2.1.2-2) ...
    Setting up libjs-jquery (1.7.2+dfsg-3.2) ...
    Setting up libjs-jquery-ui (1.10.1+dfsg-1) ...
    Setting up python-matplotlib-data (1.4.2-3.1) ...
    Setting up python-pyparsing (2.0.3+dfsg1-1) ...
    Setting up python-mock (1.0.1-3) ...
    Setting up python-nose (1.3.4-1) ...
    Setting up python-matplotlib (1.4.2-3.1) ...
    Processing triggers for libc-bin (2.19-18+deb8u10) ...
    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python3-matplotlib
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      python3-nose python3-pkg-resources python3-pyparsing
    Suggested packages:
      dvipng gir1.2-gtk-3.0 inkscape ipython3 python-matplotlib-doc python3-cairocffi python3-gi-cairo
      python3-pyqt4 python3-scipy python3-sip python3-tornado texlive-extra-utils texlive-latex-extra
      ttf-staypuft python-nose-doc python3-setuptools
    Recommended packages:
      python3-pil python3-tk
    The following NEW packages will be installed:
      python3-matplotlib python3-nose python3-pkg-resources python3-pyparsing
    0 upgraded, 4 newly installed, 0 to remove and 5 not upgraded.
    Need to get 3,891 kB of archives.
    After this operation, 13.5 MB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main python3-pyparsing all 2.0.3+dfsg1-1 [64.1 kB]
    Get:2 http://auto.mirror.devuan.org/merged/ jessie/main python3-pkg-resources all 5.5.1-1 [34.2 kB]
    Get:3 http://auto.mirror.devuan.org/merged/ jessie/main python3-nose all 1.3.4-1 [131 kB]
    Get:4 http://auto.mirror.devuan.org/merged/ jessie/main python3-matplotlib armhf 1.4.2-3.1 [3,661 kB]
    Fetched 3,891 kB in 3s (1,233 kB/s)             
    Selecting previously unselected package python3-pyparsing.
    (Reading database ... 100013 files and directories currently installed.)
    Preparing to unpack .../python3-pyparsing_2.0.3+dfsg1-1_all.deb ...
    Unpacking python3-pyparsing (2.0.3+dfsg1-1) ...
    Selecting previously unselected package python3-pkg-resources.
    Preparing to unpack .../python3-pkg-resources_5.5.1-1_all.deb ...
    Unpacking python3-pkg-resources (5.5.1-1) ...
    Selecting previously unselected package python3-nose.
    Preparing to unpack .../python3-nose_1.3.4-1_all.deb ...
    Unpacking python3-nose (1.3.4-1) ...
    Selecting previously unselected package python3-matplotlib.
    Preparing to unpack .../python3-matplotlib_1.4.2-3.1_armhf.deb ...
    Unpacking python3-matplotlib (1.4.2-3.1) ...
    Processing triggers for man-db (2.7.0.2-5) ...
    Setting up python3-pyparsing (2.0.3+dfsg1-1) ...
    Setting up python3-pkg-resources (5.5.1-1) ...
    Setting up python3-nose (1.3.4-1) ...
    Setting up python3-matplotlib (1.4.2-3.1) ...
    root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# 
    

    Probably would have been better to just do python3 and ignore python… and not need to install everything twice. Oh Well…

  7. E.M.Smith says:

    And now it imports:

    >>> import pandas as pd
    >>>
    >>> import numpy as np
    >>> 
    >>> 
    >>> df = pd.read_csv('invent.csv') 
    >>> 
    >>> df.head()
       Gv3    7Sept2015   QCU   1  101  60355  000   36.9300     6.9500     7.0  \
    0  Gv3    7Sept2015   QCU   1  101  60360    0     36.83       7.82       4   
    1  Gv3    7Sept2015   QCU   1  101  60390    0     36.72       3.25      25   
    2  Gv3    7Sept2015   QCU   1  101  60395    1     36.52       4.18     942   
    3  Gv3    7Sept2015   QCU   1  101  60400    1     36.80       5.10     230   
    4  Gv3    7Sept2015   QCU   1  101  60402    0     36.72       5.07       2   
    
        ...    U    107  HI  xx  CO   1  x  -9  WARM DECIDUOUS    C  
    0   ...    U    256  FL  xx  CO   1  A   7  WARM CROPS        C  
    1   ...    U   1365  FL  xx  CO  10  A   6  WARM CROPS        C  
    2   ...    R     -9  MV  DE  no  -9  x  -9  WARM CROPS        A  
    3   ...    R     -9  HI  xx  CO   1  x  -9  WATER             A  
    4   ...    U     90  HI  xx  CO   1  A   3  WATER             B  
    
    [5 rows x 22 columns]
    >>> 
    

    OK, I’ve got a basic data load done… now to figure out that graphing thing ;-)

  8. E.M.Smith says:

    Inspecting the dataframe (table) a bit more, it doesn’t look quite right. In particular, not seeing the name field. I think I’m going to be in debugging the data load land for a while…
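    One likely culprit is visible in the `df.head()` output above: by default `read_csv` treats the first line of the file as the header, so the first station record was consumed as column names. A minimal sketch of supplying the names at load time instead, assuming the file has no header row; the column labels and the two sample records are taken from this thread, and `skipinitialspace=True` plus `.str.strip()` are one possible way to deal with the padded fields:

```python
import io
import pandas as pd

# Column labels supplied in code rather than in the file (22 fields per record).
COLUMNS = ["Vers", "Ascen", "Type", "Cont", "Country", "WMO", "Near", "LAT",
           "LON", "Elev", "Name", "Grelev", "Urban", "Psize", "Topo", "Veg",
           "ProxW", "DistW", "Air", "DistAU", "Veggrid", "Popnitel"]

# Two records in the comma-separated layout of invent.csv, with no header row.
SAMPLE = (
    "Gv3  ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300,   6.9500,   7.0,SKIKDA                        ,  18,U,  107,HI,xx,CO, 1,x,-9,WARM DECIDUOUS  ,C\n"
    "Gv3  ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300,   7.8200,   4.0,ANNABA                        ,  33,U,  256,FL,xx,CO, 1,A, 7,WARM CROPS      ,C\n"
)

# header=None says "the file has no header line"; names= provides the labels.
df = pd.read_csv(io.StringIO(SAMPLE), header=None, names=COLUMNS,
                 skipinitialspace=True)

# The Name field is padded with trailing blanks; strip them.
df["Name"] = df["Name"].str.strip()

print(df[["WMO", "Name", "LAT", "LON"]])
```

    With a real file the `io.StringIO(SAMPLE)` argument would simply be the file name.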

  9. E.M.Smith says:

    OK, I hand added a first line to the CSV file with header names and now I get those. In general it looks like things loaded OK, except “name” is not printing when the dataframe is printed. Size too big?

    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# head invent.csv
    Vers,Ascen,Type,Cont,Country,WMO,Near,LAT,LON,Elev,Name,Grelev,Urban,Psize,Topo,Veg,ProxW,DistW,Air,DistAU,Veggrid,Popnitel
    Gv3  ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300,   6.9500,   7.0,SKIKDA                        ,  18,U,  107,HI,xx,CO, 1,x,-9,WARM DECIDUOUS  ,C
    Gv3  ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300,   7.8200,   4.0,ANNABA                        ,  33,U,  256,FL,xx,CO, 1,A, 7,WARM CROPS      ,C
    Gv3  ,7Sept2015 ,QCU ,1,101,60390,000, 36.7200,   3.2500,  25.0,DAR-EL-BEIDA                  ,  34,U, 1365,FL,xx,CO,10,A, 6,WARM CROPS      ,C
    

    and when I print it from Python:

    >>> df.head()
        Vers       Ascen  Type  Cont  Country    WMO  Near    LAT   LON  Elev  \
    0  Gv3    7Sept2015   QCU      1      101  60355     0  36.93  6.95     7   
    1  Gv3    7Sept2015   QCU      1      101  60360     0  36.83  7.82     4   
    2  Gv3    7Sept2015   QCU      1      101  60390     0  36.72  3.25    25   
    3  Gv3    7Sept2015   QCU      1      101  60395     1  36.52  4.18   942   
    4  Gv3    7Sept2015   QCU      1      101  60400     1  36.80  5.10   230   
    
       ...  Urban  Psize Topo Veg ProxW DistW Air DistAU           Veggrid  \
    0  ...      U    107   HI  xx    CO     1   x     -9  WARM DECIDUOUS     
    1  ...      U    256   FL  xx    CO     1   A      7  WARM CROPS         
    2  ...      U   1365   FL  xx    CO    10   A      6  WARM CROPS         
    3  ...      R     -9   MV  DE    no    -9   x     -9  WARM CROPS         
    4  ...      R     -9   HI  xx    CO     1   x     -9  WATER              
    
      Popnitel  
    0        C  
    1        C  
    2        C  
    3        A  
    4        A  
    
    [5 rows x 22 columns]
    >>> 
    

    Name ends up … for some reason…
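    The “...” is most likely just pandas eliding the middle columns so the printout fits the display, not a sign that Name failed to load. A minimal sketch, assuming that is the cause, of lifting the column limit temporarily with `pd.option_context`; the width of 200 is an arbitrary choice for illustration and the sample rows are the ones shown above:

```python
import io
import pandas as pd

SAMPLE = (
    "Vers,Ascen,Type,Cont,Country,WMO,Near,LAT,LON,Elev,Name,Grelev,Urban,Psize,Topo,Veg,ProxW,DistW,Air,DistAU,Veggrid,Popnitel\n"
    "Gv3  ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300,   6.9500,   7.0,SKIKDA                        ,  18,U,  107,HI,xx,CO, 1,x,-9,WARM DECIDUOUS  ,C\n"
    "Gv3  ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300,   7.8200,   4.0,ANNABA                        ,  33,U,  256,FL,xx,CO, 1,A, 7,WARM CROPS      ,C\n"
)

df = pd.read_csv(io.StringIO(SAMPLE), skipinitialspace=True)

# Inside this block no column is replaced by "..."; the settings revert
# to their previous values when the block ends.
with pd.option_context("display.max_columns", None, "display.width", 200):
    full_text = repr(df)
print(full_text)

# Alternatively, print only the columns of interest.
print(df[["WMO", "Name", "LAT", "LON"]])
```

    A quick `df.columns` or `df["Name"].head()` is another way to confirm the field is really there.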

  10. E.M.Smith says:

    Trying to import the matplotlib libraries barfed on no tk but told me what to install:

    >>> import matplotlib.pylab as plt
    Traceback (most recent call last):
      File "/usr/lib/python3.4/tkinter/__init__.py", line 39, in <module>
        import _tkinter
    ImportError: No module named '_tkinter'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3/dist-packages/matplotlib/pylab.py", line 274, in <module>
        from matplotlib.pyplot import *
      File "/usr/lib/python3/dist-packages/matplotlib/pyplot.py", line 109, in <module>
        _backend_mod, new_figure_manager, draw_if_interactive, _show = pylab_setup()
      File "/usr/lib/python3/dist-packages/matplotlib/backends/__init__.py", line 32, in pylab_setup
        globals(),locals(),[backend_name],0)
      File "/usr/lib/python3/dist-packages/matplotlib/backends/backend_tkagg.py", line 6, in <module>
        from six.moves import tkinter as Tk
      File "/usr/lib/python3/dist-packages/six.py", line 89, in __get__
        result = self._resolve()
      File "/usr/lib/python3/dist-packages/six.py", line 108, in _resolve
        return _import_module(self.mod)
      File "/usr/lib/python3/dist-packages/six.py", line 79, in _import_module
        __import__(name)
      File "/usr/lib/python3.4/tkinter/__init__.py", line 41, in <module>
        raise ImportError(str(msg) + ', please install the python3-tk package')
    ImportError: No module named '_tkinter', please install the python3-tk package
    

    So I did:

    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# apt-get install python3-tk 
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      blt tk8.6-blt2.5
    Suggested packages:
      blt-demo tix python3-tk-dbg
    The following NEW packages will be installed:
      blt python3-tk tk8.6-blt2.5
    0 upgraded, 3 newly installed, 0 to remove and 5 not upgraded.
    Need to get 521 kB of archives.
    After this operation, 1,391 kB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie/main tk8.6-blt2.5 armhf 2.5.3+dfsg-1 [484 kB]
    Get:2 http://auto.mirror.devuan.org/merged/ jessie/main blt armhf 2.5.3+dfsg-1 [14.3 kB]
    Get:3 http://auto.mirror.devuan.org/merged/ jessie/main python3-tk armhf 3.4.2-1+b1 [22.7 kB]
    Fetched 521 kB in 2s (199 kB/s)  
    Selecting previously unselected package tk8.6-blt2.5.
    (Reading database ... 100456 files and directories currently installed.)
    Preparing to unpack .../tk8.6-blt2.5_2.5.3+dfsg-1_armhf.deb ...
    Unpacking tk8.6-blt2.5 (2.5.3+dfsg-1) ...
    Selecting previously unselected package blt.
    Preparing to unpack .../blt_2.5.3+dfsg-1_armhf.deb ...
    Unpacking blt (2.5.3+dfsg-1) ...
    Selecting previously unselected package python3-tk.
    Preparing to unpack .../python3-tk_3.4.2-1+b1_armhf.deb ...
    Unpacking python3-tk (3.4.2-1+b1) ...
    Setting up tk8.6-blt2.5 (2.5.3+dfsg-1) ...
    Setting up blt (2.5.3+dfsg-1) ...
    Setting up python3-tk (3.4.2-1+b1) ...
    Processing triggers for libc-bin (2.19-18+deb8u10) ...
    root@odroidxu4:/SG2/ext/chiefio/SQL/v3# 
    

    Then it worked:

    >>> import matplotlib.pylab as plt
    >>> 
    

    Basically I’m going to skip over debugging the Name field issue and just try to get something numeric to plot…

  11. E.M.Smith says:

    Well, I got something to plot. It doesn’t have any meaning in it, but hey… So I stuff LAT and LON into an array and plot that.

    >>> mydata= df[["LAT", "LON"]].dropna(how="any")
    >>> vals= mydata.values
    >>> plt.plot(vals)
    
    >>> plt.show()
    

    I’ve added the plot at the end of the article body as an “UPDATE”. Now I’m going to bed ;-)

    Yeah, it is an ugly and mostly meaningless plot; but it means I can do all the bits… now it’s just fine tuning and deciding what I want to do (and learning more details on the commands available).
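    Part of why that plot is meaningless: handing `plt.plot` a two-column array draws each column as its own line against the row number. A small sketch of one alternative, plotting LON on the x axis against LAT on the y axis as a scatter, using the five stations from the `df.head()` output above. The Agg backend and the in-memory PNG buffer are assumptions made only so the sketch runs without a display:

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for an on-screen window
import matplotlib.pyplot as plt

# LAT / LON of the first five stations shown in df.head().
lats = [36.93, 36.83, 36.72, 36.52, 36.80]
lons = [6.95, 7.82, 3.25, 4.18, 5.10]

fig, ax = plt.subplots()
ax.scatter(lons, lats)           # one dot per station
ax.set_xlabel("LON")
ax.set_ylabel("LAT")
ax.set_title("Station locations")

# Render to an in-memory PNG; fig.savefig("some_name.png") would write a file.
png_bytes = io.BytesIO()
fig.savefig(png_bytes, format="png")
```

    With the real dataframe the two lists would be `df["LON"]` and `df["LAT"]`.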

  12. rms says:

    I admit I didn’t read all your post and all the comments. I get it that you evidently don’t want Python and Pandas to work for you. No problem. I won’t try to convince you otherwise, but why didn’t you just import direct from MySQL into Pandas using something like:

    df = pd.read_sql('SELECT * FROM table_name', con=db_connection)

    Not sure where you got the “lib service” idea and forced to go back to CSV again. But enjoy the journey!

  13. kneel63 says:

    Maybe perl is more your style? The DBI interface is easy to use, and MySQL, PostgreSQL, etc (even flat files) are all supported with identical calls. Perl would also easily do the file parsing. I’ve even used a recursive descent parser generator in perl to create a parser from a grammar at run-time for non-standard variations of JSON data streams. Nice ability to use the object interface for the libraries (where it makes sense) while keeping the rest procedural too.

  14. Simon Derricutt says:

    I had some problems with Python. A program I wanted to use was written in Python, and turned out it was originally in Python 2.6. (The program converts a picture file to a CNC milling file to make pcbs.) Turned out that the available 2.7 Python was incompatible, and that the libraries were in different places with different names. Took me a while to find out the fixes. 3.x Python is also incompatible. If you find that a program that worked fine suddenly screws up because the language definition and setup changes, then that puts me off using Python for anything. That’s a Micro$oft attitude – change the foundations because they didn’t think it out well-enough the first time and they don’t have to pay for the changes. If the language depends so much on having libraries for the normal functions, then it’s not acceptable to change the names or functions of those libraries, and it’s also not acceptable to point to a specific location and then change that location in an update. If you want to do that, change the name of the language so it’s obvious you need a different interpreter/compiler. Upgrades are supposed to fix bugs, not introduce them.

  15. E.M.Smith says:

    @rms:

    No.

    I make a firm distinction between what I WANT, what I LIKE, and what is USEFUL. So while I don’t LIKE Python for implementation details I very much DO WANT it “to work for me”. It is useful, therefore I use it; like it or not. As I have said many times: it isn’t about me. It is only about the path to a solution.

    And I did get it to work. I made my first, crude, plot.

    Why use CSV: Stated in the stuff you say you didn’t read. It was all the examples I found.

    Why not use the direct SQL? Because I don’t know how and saw no examples, so figured I would make that a second step. As my major goal for the day was to learn to plot something, anything, just swapping to a CSV load was fast and relatively easy. With your SQL example, I’ll model off it too. This is one of the problems with OO & Libraries. You don’t know what you don’t know, so there is a big hump to get over to learn all the relevant objects and libraries (or even just their names and what exists). Late last night I did find an online Matplotlib manual, and that will help. People only learn and discover stuff so fast, and this was what I could do in a couple of part time days. Go from near zero to installing and using Python to load and plot some data.

    I think it’s pretty good progress for 2 days.
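    For reference, the `pd.read_sql` call rms suggests can be sketched as below. This is only a stand-in under loud assumptions: an in-memory SQLite database (from the standard library) takes the place of the MySQL server so the sketch is self-contained, and the `inventory` table name and its four columns are hypothetical. Against MySQL the `con=` argument would instead be a connection made with whatever MySQL driver is installed (pandas documents SQLAlchemy connections for that purpose).

```python
import sqlite3
import pandas as pd

# Stand-in for the MySQL connection: a throwaway in-memory SQLite database.
conn = sqlite3.connect(":memory:")

# Hypothetical table holding a few inventory fields.
conn.execute(
    "CREATE TABLE inventory (WMO INTEGER, Name TEXT, LAT REAL, LON REAL)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        (60355, "SKIKDA", 36.93, 6.95),
        (60360, "ANNABA", 36.83, 7.82),
        (60390, "DAR-EL-BEIDA", 36.72, 3.25),
    ],
)
conn.commit()

# The query result arrives directly as a DataFrame, no CSV step in between.
df = pd.read_sql("SELECT * FROM inventory", con=conn)
print(df)

conn.close()
```

    The appeal is that column names and types come from the table definition, so there is no header line to hand-edit.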

    @kneel63:

    I’m a bit ambivalent on Perl. Since most folks are using Python, there are more examples in it.

    FWIW, while shutting down for the night, I did discover Julia has an interface to Python matplotlib. Were I going to swap paths now, it would be to Julia. Designed for parallel processing and about as fast as C or FORTRAN (all 3 about 10x faster than Python). Then I still could learn the same plotting library…

    @Simon:

    That’s pretty much my attitude about it all. A big part of my low esteem for both Perl and Python. Incompatible versions.

    Why do I still write FORTRAN? Because despite not touching it for a good 30+ years after my FORTRAN IV class (i.e. even before f77) when I ported GIStemp (in f77 and f95) it was basically the same. Some additions, mostly all compatible or obvious. With roughly zero work to catch up, I was again fully functional in the language.

    Knowing that anything I write will work the same in 30 or 40 years is a nice feature… then to swap versions is just a compiler flag. Season with it still being one of the fastest 2 languages for science stuff and you see why it is still dominant in R&D and Engineering things.

    Though, to be fair, the fixed layout of columns in f77 is a pain (freeform in newer standards) and I like C syntax for operators and begin end blocks better….

    There is no perfect language, and, IMHO, every one of them has things to gripe about: but languages that change so much they are incompatible between times I use them are particularly annoying.

  16. Pingback: Notes On Julia – The Language | Musings from the Chiefio

  17. vcmathjm says:

    For basic python stuff a nice guide is
    “Automate the boring Stuff” which is now free at
    https://automatetheboringstuff.com/
    It has a chapter on working with CSV. I used the working with Excel chapter, which was very useful to me.
    Jim

  18. Pingback: Graph Of Global Thermometers | Musings from the Chiefio

  19. rms says:

    Recommendations:
    : Recommend you use Anaconda to install and use Python et al. Much easier than using a virtual env, and much much better than not. This will let you more easily control the environment and have multiple versions of things if you want, e.g. Python 3 and 2.
    : See the documentation for the Pandas import from MySQL (and others) at http://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-sql

  20. E.M.Smith says:

    Anaconda looks interesting but is a bit more than I want to add to my learning curve at the moment. (Something about conda as Yet Another Package Manager puts me off…). I’ll likely eventually get there, but right now I’m just looking to get enough “basics” to do a few graphs.

    I did discover that IDLE is not installed by default and have installed it. I’ll explore it first. It may be enough for just loading / saving programs. I’m pretty much settled in on the version of Python 3 that comes with the Devuan release. No real need for “multiple versions” and as the limit on what’s on the Arm chipset tends to be “who has ported it”, I don’t expect Anaconda to have access to any newer version of Python than comes in the basic installation.

    As an FYI, for anyone else doing this, here’s what the IDLE install looks like:

    root@odroidxu4:/# apt-get install idle3
    Reading package lists... Done
    Building dependency tree       
    Reading state information... Done
    The following packages were automatically installed and are no longer required:
      libjsoncpp0 libuuid-perl
    Use 'apt-get autoremove' to remove them.
    The following extra packages will be installed:
      idle-python3.4
    The following NEW packages will be installed:
      idle-python3.4 idle3
    0 upgraded, 2 newly installed, 0 to remove and 5 not upgraded.
    Need to get 85.9 kB of archives.
    After this operation, 211 kB of additional disk space will be used.
    Do you want to continue? [Y/n] y
    Get:1 http://auto.mirror.devuan.org/merged/ jessie-security/main idle-python3.4 all 3.4.2-1+deb8u2 [82.8 kB]
    Get:2 http://auto.mirror.devuan.org/merged/ jessie/main idle3 all 3.4.2-2 [3,120 B]
    Fetched 85.9 kB in 2s (42.4 kB/s)
    Selecting previously unselected package idle-python3.4.
    (Reading database ... 100821 files and directories currently installed.)
    Preparing to unpack .../idle-python3.4_3.4.2-1+deb8u2_all.deb ...
    Unpacking idle-python3.4 (3.4.2-1+deb8u2) ...
    Selecting previously unselected package idle3.
    Preparing to unpack .../archives/idle3_3.4.2-2_all.deb ...
    Unpacking idle3 (3.4.2-2) ...
    Processing triggers for desktop-file-utils (0.22-1) ...
    Processing triggers for mime-support (3.58) ...
    Processing triggers for man-db (2.7.0.2-5) ...
    Setting up idle-python3.4 (3.4.2-1+deb8u2) ...
    Setting up idle3 (3.4.2-2) ...
    root@odroidxu4:/# 
    

    So now all those tutorial / example pages that say “Now do FOO in IDLE” become useful to me…

    Some stuff about IDLE:
    https://realpython.com/interacting-with-python/#running-a-python-script-from-the-command-line

    One of my complaints about Python is that there’s 1/2 dozen “choices” at key points. This means 1/2 dozen “evaluation processes” to pick one to use and it means “look at 6 advice pages to throw out 5” as they don’t match what combination you have chosen. Then multiply by 2 for Python 2 vs 3…

    Similarly I’ve found at least 2 “methods” for loading an SQL database into Python. Now I get to figure out which one is “best” or what to use or will it work or… Another of the problems with OO Design. You get some facility in a loaded library, but someone else doesn’t know it, or is lost trying to find it, or just doesn’t like it; so does the rational thing. They write another one from scratch. Now you have 2 such libraries of ‘objects’ to search for / through / learn in depth before you can reasonably choose and use. Now multiply that by 2 for releases 2 and 3, and then by 3 to 6 for IDE / REPL / IDLE / Whatever… You get an exponential explosion of “Mix and Match” before you can ever get any work done. Then an exponential explosion of “Search & Reject” trying to find an example to model.

    Eventually the workload for just wandering in that forest becomes oppressive and folks wander off to write Yet Another Language; and the cycle repeats. (Thus Go, and Julia, and…)

    Then folks wonder why I’ll knock out a FORTRAN program to do something… Because it requires zero search time, zero choice time, zero rejections of models, zero incompatibility issues; and I can do it in about 2 minutes wall clock for most things when I’ve done anything similar before. (Like turning the TABS filler into a COMMA filler – maybe a minute all told).

    So yeah, I’m just griping that learning a new language is hard, and made harder by too many optional paths, and made harder by “fancy environment” choices. And made harder by hiding what, IMHO, ought to be fundamental language properties off in OO Libraries that you “import” but are not in the language spec (so you get to “go fish” for documentation then search that all…). Yes, I know that arrays are handled in NumPy, now. But in FORTRAN they are done in the core language and it’s in the manuals…

    Oh Well. I chose to take this path, and I’m a willing student while learning more about this OO Library way of things; but that also grants me the right to gripe about it. ;-) I’ve learned a lot of languages over the years and so far, while basic Python is very easy, once you go off to “import FOO” land it gets harder than most very fast. (Largely as FOO is not part of the language… and the manual for FOO is off who knows where… and I didn’t set out to learn FOO… and…)

  21. E.M.Smith says:

    Oh Boy, my first saved, loaded, and run Python Program (i.e. not in the REPL just typed in live):

    f = open('/tmp/junk.py', 'w')
    print("This is my first saved program",file=f,flush=True)
    

    then the result:

    root@odroidxu4:/tmp# cat junk.py
    This is my first saved program
    

    I finally feel like I’m getting where I want to be with Python. Not just fooling around in an interpreter in dribs and drabs. I can now save / recall Real Programs, and read / write to files.

    Near as I can tell, you must have something like IDLE or an IDE installed to do the save / read in. It may be buried in the interpreter REPL somewhere but I could only find “launch it as ‘python -i foo.py’ to start a program” and could not find any save / read in options once inside the REPL. (Read Evaluate Print Loop – basically an interpreter) Besides, all the examples immediately run off to any one of several IDEs or IDLE… so it looks like “the one (of several…) true way” in Python.
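    For what it is worth, IDLE is one convenient route rather than a requirement: a program file is plain text, so any editor can save it, `python3 foo.py` runs it from the shell, and from inside the REPL the standard library's `runpy` module can load and run a saved file. A minimal sketch, where the temporary file and its contents are made up purely for illustration:

```python
import os
import runpy
import tempfile

# Stand-in for a program saved from any text editor.
script_text = (
    'greeting = "This is my first saved program"\n'
    'print(greeting)\n'
)

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as fh:
    fh.write(script_text)
    script_path = fh.name

# Run the saved file; the names it defined come back in a dictionary.
namespace = runpy.run_path(script_path)
print(namespace["greeting"])

os.remove(script_path)  # clean up the temporary file
```

    In a live REPL session only the `runpy.run_path("foo.py")` line is needed; `exec(open("foo.py").read())` is a common shorthand that runs the file in the current session instead.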

    So, from here on out, as I’m writing more complicated things and typing them in from scratch into the interpreter you get when you do “sh> python” is getting old fast: I’ll be using IDLE and writing saved programs. I’ll also, now, be able to save more of my output to files if desired. Graphs to the screen, but things like data tables post processing as data tables can go to a file…

    I probably ought to have named that /tmp/junk.pyout just to be clear it isn’t Python text… maybe next time ;-)

    So one more tiny step forward…

    Maybe I’ll explore whatever the Python equivalent of a FORMAT statement for fixed data input might be and see if I can replace those FORTRAN glue-ware programs…
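    As a starting point for that, the nearest plain-Python analogue of a FORMAT-driven read is slicing each line at fixed column positions. A minimal sketch; the layout below (roughly what `A11,F9.4,F10.4,F7.1,1X,A30` would describe) and the sample record are hypothetical, chosen only to illustrate the technique and not taken from the GHCN documentation:

```python
# Hypothetical fixed-width layout: (field name, start, end, converter).
# Positions are 0-based and the end position is exclusive, as in Python slices.
LAYOUT = [
    ("station_id", 0, 11, str),
    ("lat", 11, 20, float),
    ("lon", 20, 30, float),
    ("elev", 30, 37, float),
    ("name", 38, 68, str),
]


def parse_record(line):
    """Cut one fixed-width line into a dict of converted fields."""
    record = {}
    for field, start, end, convert in LAYOUT:
        record[field] = convert(line[start:end].strip())
    return record


# Build a sample line with printf-style widths that match the layout above.
sample = "%-11s%9.4f%10.4f%7.1f %-30s" % (
    "10160355000", 36.93, 6.95, 7.0, "SKIKDA")

record = parse_record(sample)
print(record)
```

    For whole files, pandas also offers `read_fwf`, which takes column positions or widths and returns a dataframe directly.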

Comments are closed.