All I wanted to do was make a couple of simple nearly trivial graphs…
After working out how to get the GHCN files loaded into SQL, and doing a couple of ad-hoc queries (proving it was worth the effort), I was ready to move on to “make a graph” at the leading edge of things (while still puttering around with indexes and key fields and efficiency and such on MySQL, as the low priority backend tasks).
A fair amount of searching for MySQL interface libraries for various languages showed they exist for some languages (notably Go has one) but they tended to be a bit of “assemble it yourself”. I didn’t find one for FORTRAN (it might exist, I just did a very quick web search), did find one on Sourceforge for DIY with Go, and there were hints of maybe something for R (but I really didn’t search much for it…). And, at every turn, oodles and oodles of links to stuff about Python.
OK, I can take a hint. When 80%+ of the stuff you turn up says “We use FOO!” (or in Python code “We use SPAM!” they like using Monty Python references so use “SPAM and Eggs” instead of FOO and BAR) eventually you start to think “Maybe I ought to look at FOO?”. So I did.
I don’t hide my (mild) distaste for Python. It’s a fine language and all, but does some things I just don’t like. Due to dynamic typing a variable isn’t really a variable but is a pointer to some data somewhere that DOES have a type. So: a = 1; or b = “SPAM” really just puts a pointer in a or b to those data items “somewhere”. Well, if you are used to thinking of an assignment as actually doing an assignment and pointers as being pointers it is a bit annoying to need to keep in mind that these aren’t. MOST of the time you can ignore it. Yes, it is a nice touch that lets you do things like not bother to assign a data type; but… there can be the odd case where it bites you so you must know “this is different”. Similarly, I don’t like position dependent syntax. Change the white space, change the program… Having white space as part of the reserved word space you need to track is a bother…all to enforce their ideas about good pretty-printing practice for “structured programming”. OK, apply straight jacket and shut up…
So despite Python being the “3rd most popular” language among whoever bothered to be in the survey, (the other two are Java and C) I’ve avoided using it unless there wasn’t much of an alternative. (Like doing maintenance on existing code).
But there it was: Just about every example was using Python… So OK, I’ve spent a couple of days “loading Python” into my brain again, and learning a little of how to use Pandas to graph things. I’m about to do my first “load stuff into Pandas and graph it”, but thought I would post up a couple of notes first. Any success at graphing will be added as an update.
One bit of irk was wandering for an hour or so down the list of Pythons. RPython, Cython, Jython (or some such), PyPy, and more. Why? Well, if you are going to use Python, you MUST PICK ONE, which means you must know enough to pick one, which means at least a minimal familiarity. This is one of my more general complaints about “trendy” things: They end up metastasizing into so many forms it is more work to choose one that to just use something else.
Jython is a Python that makes Java byte code for the JVM. OK, scratch that one.
Cython is a super-set that then spits out C or C++ code, mostly used for making other library like stuff. Scratch.
RPython is Restricted Python, used as a subset to make self-hosting Python. Scratch it, too.
The bottom line is that the Python you get by default on Linux is likely to be the one you want. Maybe.
Then we get 2.x vs 3.x Python. The 3.x Major Release is not entirely compatible with the 2.x one. OK, use python3 (that you may get by default, or not, depending on a lot of things, or you might need to call it out as python3).
It looks like I have both installed already:
root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python Reading package lists... Done Building dependency tree Reading state information... Done python is already the newest version. python set to manually installed. The following packages were automatically installed and are no longer required: libjsoncpp0 libuuid-perl Use 'apt-get autoremove' to remove them. 0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded. root@odroidxu4:/SG2/ext/chiefio/SQL/Table.Schemas# apt-get install python3 Reading package lists... Done Building dependency tree Reading state information... Done python3 is already the newest version. python3 set to manually installed. The following packages were automatically installed and are no longer required: libjsoncpp0 libuuid-perl Use 'apt-get autoremove' to remove them. 0 upgraded, 0 newly installed, 0 to remove and 5 not upgraded.
But I’ll need to explicitly say “python3” to get the newer dialect.
Then, while researching how big a Dataframe (i.e. what EVERYONE else calls a table…) can be, found that 32 bit Python will barf on about 1.8 GB of data (as memory peaks about 3.x GB during the load and that exceeds what a 32 bit can address) so you will want 64 bit Python then. How one gets 64 vs 32 bit left unclear… though implied it was disjoint from machine word size. I did a quick check of the size of GHCN and either one will do:
chiefio@odroidxu4:~/SQL/v3$ ls -l total 704796 [...] -rw-r--r-- 1 root root 786240 Jan 23 21:14 inventory.in -rw-r--r-- 1 root root 1048320 Jan 23 21:43 inv.out [...] -rw-r--r-- 1 chiefio chiefio 53690600 Jan 20 20:39 temps.in -rw-r--r-- 1 chiefio chiefio 344360400 Jan 20 20:40 temps.u.in
So the Inventory File even as a tab column list is only 1 MB, and the entire GHCNv3 temperature data file similarly expanded is 344 MB. They can both be sucked into a Pandas Dataframe of either word size Python. (And fit nicely in memory of my 2 GB XU4…)
At least I can now ignore what “size” Python to run…
CSV, Dataframes, and more
The examples of how to do this generally just call an “object” that is in the Pandas library to do the loading. They universally give lip service to some OTHER format load choices, then show only the CSV (Comma Separated Values) load. Why the world so loves CSV is beyond me. Text and numbers are often FULL of commas that cause grief for that mode. So I’m going to end up “discovering” what those other modes of data loading are and working it out myself. But for my first attempt, I’m going to go ahead and do CSV. On the “inventory” file as it is the smaller one.
A ‘quick search’ showed it very comma infested:
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# grep "," inventory.in 10266460001 -17.1000 15.7000 1150.0 PEREIRA D,ECA ANGO 1095R -9FLxxno-9A-9SUCCULENT THORNSA 10468026000 -18.3700 21.8500 1000.0 SHAKAWE 920R -9FLDEno-9A-9MARSH, SWAMP C 10764910000 4.0000 9.7300 9.0 DOUALA OBS. 12U 458FLxxCO 5A 1MARSH, SWAMP C 12061816000 10.9300 -14.3200 69.0 BOKE 41R -9HIFOno-9x-9MARSH, SWAMP A 12161769000 11.5800 -15.4800 20.0 BOLAMA 3S 10FLxxCO 3A 3MARSH, SWAMP A [...] 20550983000 45.7700 132.9700 103.0 HULIN 76S 20FLxxno-9x-9BOGS, BOG WOODS C 20551334000 44.6200 82.9000 321.0 JINGHE 441R -9MVDEno-9x-9MARSH, SWAMP B 20551716000 39.8000 78.5700 1117.0 BACHU 1331R -9FLDEno-9x-9MARSH, SWAMP A 20551730000 40.5000 81.0500 1013.0 ALAR 1070R -9FLDEno-9x-9MARSH, SWAMP A[...] 21047678001 30.5000 140.3000 83.0 TORISHIMA,JAPAN 0R -9HIxxCO 1x-9WATER A 21544212001 49.8000 92.1000 934.0 MONGOLIAN STATION,BAYAN-OL 1203S 20MVxxno-9x-9COOL DESERT A 21544213001 49.7000 96.4000 -999.0 BAYAN UUL, DZAVHAN 1680R -9MVxxno-9x-9COOL DESERT A 21544214001 48.3000 89.5000 -999.0 ALTAY, BAYAN-OLIGY 2624R -9MVxxno-9x-9COOL GRASS/SHRUBA 21544218001 47.1000 92.8000 -999.0 MONGOLIAN STATION, HOVD 1390R -9MVDEno-9x-9WARM GRASS/SHRUBA 21544237001 48.2000 99.9000 -999.0 MONGOLIAN STATION, N. HANG 2529R -9MVxxno-9x-9COOL CONIFER A 21544239001 49.0000 104.1000 -999.0 ERDENET, BULGAN 1403R -9HIxxno-9A-9COOL GRASS/SHRUBB 21544241000 48.9000 106.1000 807.0 BAYAN-GOL, SELENGE 914R -9HIxxno-9x-9COOL GRASS/SHRUBA 21544241002 48.9000 106.9000 -999.0 MONGOLIAN STATION, SELENGE 1311R -9MVxxno-9x-9WARM CROPS A 21544241003 49.2000 105.4000 -999.0 ORBON, SELENGE 964R -9HIxxno-9x-9COOL GRASS/SHRUBA 21544241004 49.8000 106.7000 -999.0 YOROO, SELENGE 958R -9FLMAno-9x-9COOL GRASS/SHRUBA 21544241005 50.1000 106.2000 -999.0 SHAAM, SELENGE 731R -9HIxxno-9x-9COOL GRASS/SHRUBA [...]
So I get to “de-comma” it… As I learned vi back in the dark ages and it now resides in my brain stem so the magic key patterns just happen without my knowing what they are anymore… I use the vi editor (renamed “vim” in Linux but they have an alias to vi). In vi, you can use “Regular Expressions” (just as in “grep” – global regular expression print). In vi, the “globally replace” command is, at the : prompt
You may hate the excessive terseness of hard core *Nix, or love it, but for me being able to replace all “,” with “;” in 10 CHAR is a good thing. To decode that line, the “:” gets you a command prompt in vi, then “g” is “globally”, meaning “do all lines” (you can also do line ranges) the “/,/” says “search for comma”, the “s” is “substitute” “//” for that thing you found (the comma) “;/” “a semicolon”, “g” globally on each line – i.e. not just the first one on the line but anywhere and everywhere in that row. All that in 10 CHAR.
Why use a “;”? Because there were none in the file:
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# grep ";" inventory.in root@odroidxu4:/SG2/ext/chiefio/SQL/v3#
So there will not be a collision with any other meaning.
Now, with it stripped of the dreaded comma, I’ve modified my “Tab inserting FORTRAN program” to insert a comma between fields. I did a similar “global substitution” of “COM” for “TAB” and set COM=”,”. Also changed the input / output file names just to keep things clear.
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# head semiinven.in 10160355000 36.9300 6.9500 7.0 SKIKDA 18U 107HIxxCO 1x-9WARM DECIDUOUS C 10160360000 36.8300 7.8200 4.0 ANNABA 33U 256FLxxCO 1A 7WARM CROPS C 10160390000 36.7200 3.2500 25.0 DAR-EL-BEIDA 34U 1365FLxxCO10A 6WARM CROPS C 10160395001 36.5200 4.1800 942.0 FT. NATIONAL 805R -9MVDEno-9x-9WARM CROPS A 10160400001 36.8000 5.1000 230.0 CAP CARBON 28R -9HIxxCO 1x-9WATER A 10160402000 36.7200 5.0700 2.0 BEJAIA 121U 90HIxxCO 1A 3WATER B 10160403000 36.4700 7.4700 227.0 GUELMA 287S 47HIxxno-9x-9WARM CROPS C 10160419000 36.2800 6.6200 694.0 CONSTANTINE 563U 335MVxxno-9A 7WARM FOR./FIELD B 10160425000 36.2200 1.3300 143.0 CHLEF 242U 106HIxxno-9A 3WARM CROPS C 10160425001 36.1700 1.5000 112.0 ORLEANSVILLE 219R -9HIDEno-9x-9WARM CROPS A root@odroidxu4:/SG2/ext/chiefio/SQL/v3# ls *.f ccodesv3.f csvinvn.f inven.f stationdat.f
So that csvinvn.f is the FORTRAN to put this input file (where I’ve swapped commas to ;) in CSV condition so Pandas can load it “in the usual way”. Someday later I’ll work out how to use some non-CSV Dataframe load; but this gets me over the hump to “doing something in Pandas” with about 2 minutes of effort. Far less than it took to do this writeup about it ;-)
This input file is semiinven.in as it is full of semi-colons ;-)
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# gfortran csvinvn.f root@odroidxu4:/SG2/ext/chiefio/SQL/v3# ./a.out > invent.csv [...] root@odroidxu4:/SG2/ext/chiefio/SQL/v3# head invent.csv Gv3 ,7Sept2015 ,QCU ,1,101,60355,000, 36.9300, 6.9500, 7.0,SKIKDA , 18,U, 107,HI,xx,CO, 1,x,-9,WARM DECIDUOUS ,C Gv3 ,7Sept2015 ,QCU ,1,101,60360,000, 36.8300, 7.8200, 4.0,ANNABA , 33,U, 256,FL,xx,CO, 1,A, 7,WARM CROPS ,C Gv3 ,7Sept2015 ,QCU ,1,101,60390,000, 36.7200, 3.2500, 25.0,DAR-EL-BEIDA , 34,U, 1365,FL,xx,CO,10,A, 6,WARM CROPS ,C Gv3 ,7Sept2015 ,QCU ,1,101,60395,001, 36.5200, 4.1800, 942.0,FT. NATIONAL , 805,R, -9,MV,DE,no,-9,x,-9,WARM CROPS ,A Gv3 ,7Sept2015 ,QCU ,1,101,60400,001, 36.8000, 5.1000, 230.0,CAP CARBON , 28,R, -9,HI,xx,CO, 1,x,-9,WATER ,A Gv3 ,7Sept2015 ,QCU ,1,101,60402,000, 36.7200, 5.0700, 2.0,BEJAIA , 121,U, 90,HI,xx,CO, 1,A, 3,WATER ,B Gv3 ,7Sept2015 ,QCU ,1,101,60403,000, 36.4700, 7.4700, 227.0,GUELMA , 287,S, 47,HI,xx,no,-9,x,-9,WARM CROPS ,C Gv3 ,7Sept2015 ,QCU ,1,101,60419,000, 36.2800, 6.6200, 694.0,CONSTANTINE , 563,U, 335,MV,xx,no,-9,A, 7,WARM FOR./FIELD ,B Gv3 ,7Sept2015 ,QCU ,1,101,60425,000, 36.2200, 1.3300, 143.0,CHLEF , 242,U, 106,HI,xx,no,-9,A, 3,WARM CROPS ,C Gv3 ,7Sept2015 ,QCU ,1,101,60425,001, 36.1700, 1.5000, 112.0,ORLEANSVILLE , 219,R, -9,HI,DE,no,-9,x,-9,WARM CROPS ,A root@odroidxu4:/SG2/ext/chiefio/SQL/v3#
And “Bob’s Your Uncle!” it’s done. A CSV file for input.
For completion, this is a listing of the csvinvn.f program, though it is only trivially different from the TAB one.
root@odroidxu4:/SG2/ext/chiefio/SQL/v3# cat csvinvn.f C FORTRAN to read the inventory files v3 GHCN file and insert COMMA C in the output. Also divides "country" into Continent and Country C C Variable declarations... C CHARACTER * 1 COM CHARACTER * 5 VERSION CHARACTER * 10 ASCEN CHARACTER * 4 TYPE, GRIDELEV CHARACTER * 1 CONT,POPC,AIR,POPNL CHARACTER * 2 COUNTRY,TOPOT,VEG,PROXW,DISTW,DISTAU CHARACTER * 5 WMO,PSIZE CHARACTER * 3 NEAR CHARACTER * 6 STNELEV CHARACTER * 8 LATITUDE CHARACTER * 9 LONGITUDE CHARACTER * 16 VEGGRID CHARACTER * 30 NAME C C Set the COM character COM="," C C Set some constants VERSION="Gv3" ASCEN="7Sept2015" TYPE="QCU" C C Read in one line of data... C 9 OPEN(1, FILE='semiinven.in', STATUS='OLD', ACTION='READ') 10 READ (1, 11, END=99) CONT, COUNTRY, WMO, NEAR,LATITUDE, & &LONGITUDE,STNELEV,NAME,GRIDELEV,POPC,PSIZE,TOPOT,VEG,PROXW, & &DISTW,AIR,DISTAU,VEGGRID,POPNL C 11 FORMAT (A1,A2,A5,A3,X,A8,X,A9,X,A6,X,A30,X,A4,A1,A5,A2,A2,A2, & &A2,A1,A2,A16,A1) C C Convert CHAR to Float C READ (T(I),*,END=20) F(I) C C Write out one line of data with COM between fields C C WRITE (6, 6) VERSION,ASCEN,TYPE,CONT,COUNTRY,WMO,NEAR,LATITUDE, & C &LONGITUDE,STNELEV,NAME,GRIDELEV,POPC,PSIZE,TOPOT,VEG,PROXW,DISTW, & C &AIR,DISTAU,VEGGRID,POPNL C 6 FORMAT (A5,A10,A4,A1,A2,A5,A3,A8,A9,A6,A30,A4,A1,A5,A2,A2,A2, & C &A2,A1,A2,A16,A1) WRITE (6, 7) VERSION,COM,ASCEN,COM,TYPE,COM,CONT,COM,CONT,COUNTRY,& &COM,WMO,COM,NEAR,COM,LATITUDE,COM,LONGITUDE,COM,STNELEV,COM,NAME, & &COM,GRIDELEV,COM,POPC,COM,PSIZE,COM,TOPOT,COM,VEG,COM,PROXW,COM, & &DISTW,COM,AIR,COM,DISTAU,COM,VEGGRID,COM,POPNL 7 FORMAT (A5,A1,A10,A1,A4,A1,A1,A1,A1,A2,A1,A5,A1,A3,A1,A8,A1,A9,A1,& &A6,A1,A30,A1,A4,A1,A1,A1,A5,A1,A2,A1,A2,A1,A2,A1, & &A2,A1,A1,A1,A2,A1,A16,A1,A1) C Retrieve another line of data... C GO TO 10 C C If end of file, then stop. C 99 STOP END root@odroidxu4:/SG2/ext/chiefio/SQL/v3#
I’m now going to spend some unknown number of hours attempting to suck that CSV file into a Pandas Dataframe and graph something. Perhaps number of WMO by latitude… or a scatter chart of that…
Whem I’m done, this posting will get an UPDATE here.
Though first I think I’m going to go get breakfast and fresh coffee… Yes, this was done before breakfast ;-)
After a lot of wandering in the forest (documented in comments below) I made my first plot. It is LATitude and LONgitude of stations on a graph. No idea if it is right, or not, or what. But here it is.
This will just be the latitude and longitude for stations plotted against the order of records. It has some quasi meaning as records are arranged by continent, so sort of in physical blobs. Yeah, mostly meaningless. Doesn’t matter. I’ve now done the whole thing from data format to load to graph. So from here on out it is just polishing and adding incremental skills.
Fun, so far….
Yeah, but after breakfast I got side tracked into things like banking, paying bills, grocery run, getting dinner ready, 1/2 bottle of saki, washing dishes & the dog…. not together, mind you :-)
So I’ve made no further progress today… maybe after the spouse goes to bed and the house is quiet again….
Well this sucks…
Decided I could at least load some of the library stuff and prep…
OK…. So looks like there’s some “install libraries” step left out of the various tutorials (mostly based around Mac or Windows installs using an IDE of various sorts…
datetime is there, but the others are not.
OK, looks like it is a separate package:
And a separate Python3 version of pandas:
This link implies there may be a lot more libraries that must be individually installed before you can actually use stuff:
That “python-pandas-doc” looks like one I’ll be wanting…
I suppose it would be tacky to point out that I’m not keen on having things commonly done in a language not installed when you install the language…. But, OK, I’ll get to cruise that list looking for other things I expect to use / need (and check if there are python3-xxxxx variations also…).
As the math plot library was one I was planning to use…
Probably would have been better to just do python3 and ignore python… and not need to install everything twice. Oh Well…
And now it imports:
OK, I’ve got a basic data load done… now to figure out that graphing thing ;-)
Inspecting the dataframe (table) a bit more, it doesn’t look quite right. In particular, not seeing the name field. I think I’m going to be in debugging the data load land for a while…
OK, I hand added a first line to the CSV file with header names and now I get those. In general it looks like things loaded OK, except “name” is not printing when the dataframe is printed. Size too big?
and when I print it from Python:
Name ends up … for some reason…
Trying to import the matplotlib libraries barfed on no tk but told me what to install:
So I did:
Then it worked:
Basically I’m going to skip over debugging the Name field issue and just try to get something numeric to plot…
Well, I got something to plot. It doesn’t have any meaning in it, but hey… So I stuff LAT and LON into an array and plot that.
I’ve added the plot at the end of the article body as an “UPDATE”. Now I’m going to bed ;-)
Yeah, it is an ugly and mostly meaningless plot; but it means I can do all the bits… now it’s just fine tuning and deciding what I want to do (and learning more details on the commands available).
I admit I didn’t read all your post and all the comments. I get it that you evidently don’t want Python and Pandas to work for you. No problem. I won’t try to convince you otherwise, but why didn’t you just import direct from MySQL into Pandas using something like:
df = pd.read_sql(‘SELECT * FROM table_name’, con=db_connection)
Not sure where you got the “lib service” idea and forced to go back to CSV again. But enjoy the journey!
Maybe perl is more your style? The DBI interface is easy to use, and MySQL, PostgreSQL, etc (even flat files) are all supported with identical calls. Perl would also easily do the file parsing. I’ve even used a recursive descent parser generator in perl to create a parser from a grammar at run-time for non-standard variations of JSON data streams. Nice ability to use the object interface for the libraries (where it makes sense) while keeping the rest procedural too.
I had some problems with Python. A program I wanted to use was written in Python, and turned out it was originally in Python 2.6. (The program converts a picture file to a CNC milling file to make pcbs.) Turned out that the available 2.7 Python was incompatible, and that the libraries were in different places with different names. Took me a while to find out the fixes. 3.x Python is also incompatible. If you find that a program that worked fine suddenly screws up because the language definition and setup changes, then that puts me off using Python for anything. That’s a Micro$oft attitude – change the foundations because they didn’t think it out well-enough the first time and they don’t have to pay for the changes. If the language depends so much on having libraries for the normal functions, then it’s not acceptable to change the names or functions of those libraries, and it’s also not acceptable to point to a specific location and then change that location in an update. If you want to do that, change the name of the language so it’s obvious you need a different interpreter/compiler. Upgrades are supposed to fix bugs, not introduce them.
I make a firm distinction between what I WANT, what I LIKE, and what is USEFUL. So while I don’t LIKE Python for implementation details I very much DO WANT it “to work for me”. It is useful, therefor I use it; like it or not. As I have said many times: it isn’t about me. It is only about the path to a solution.
And I did get it to work. I made my first, crude, plot.
Why use CVS: Stated in the stuff you say you didn’t read. It was all the examples I found.
Why not use the direct SQL? Because I don’t know how and saw no examples, so figured I would make that a second step. As my major goal for the day was to learn to plot something, anything, just swapping to a CVS load was fast and relatively easy. With your SQL example, I’ll model off it too. This is one of the problems with OO & Libraries. You don’t know what you don’t know, so there is a big hump to get over to learn all the relevant objects and libraries (or even just their names and what exists). Late last night I did find an online Matplotlib manual, and that will help. People only learn and discover stuff so fast, and this was what I could do in a couple of part time days. Go from near zero to installing and using Python to load and plot some data.
I think it pretty good progress for 2 days.
I’m a bit ambivalent on Perl. Since most folks are using Python, there’s more examples in it.
FWIW, while shutting down for the night, I did discover Julia has an interface to Python matplotlib. Were I going to swap paths now, it would be to Julia. Designed for parallel processing and about as fast as C or FORTRAN (all 3 about 10x faster than Python). Then I still could learn the same plotting library…
That’s pretty much my attitude about it all. A big part of my low esteem for both Perl and Python. Incompatible versions.
Why do I still write FORTRAN. Because despite not touching it for a good 30+ years after my FORTRAN IV class (I.e. even before f77) when I ported GIStemp (in f77 and f95) it was basically the same. Some additions, mostly all compatible or obvious. With roughly zero work to catch up, I was again fully functional in the language.
Knowing that anything I write will work the same in 30 or 40 years is a nice feature… then to swap versions is just a compiler flag. Season with it still being one of the fastest 2 languages for science stuff and you see why it is still dominant in R&D and Engineering things.
Though, to be fair, the fixed layout of columns in f77 is a pain (freeform in newer standards) and I like C syntax for operators and begin end blocks better….
There is no perfect language, and, IMHO, every one of them has things to gripe about: but languages that change so much they are incompatible between times I use them are particularly annoying.
Pingback: Notes On Julia – The Language | Musings from the Chiefio
For basic python stuff a nice guide is
“Automate the boring Stuff” which is now free at
It has a chapter on working with CSV . I used the working with excel chapter which was very useful to me.
Pingback: Graph Of Global Thermometers | Musings from the Chiefio
: Recommend you use Anaconda to install and use Python et. al. Much easier than using a virtual env, and much much better than not. This will allow more easily control the environment, have multiple versions of things if you want, e.g. Python 3 and 2.
: See the documentation for the Pandas import from MySQL (and others) at http://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-sql
Anaconda looks interesting but is a bit more than I want to add to my learning curve at the moment. (Something about condas as Yet Another Package Manager puts me off…). I’ll likely eventually get there, but right now I’m just looking to get enough “basics” to do a few graphs.
I did discover that IDLE is not installed by default and have installed it. I’ll explore it first. It may be enough for just loading / saving programs. I’m pretty much settled in on the version of Python that comes with Python3 on the Devuan release. No real need for “multiple versions” and as the limit on what’s on the Arm chipset tends to be “who has ported it”, I don’t expect Anaconda to have access to any newer version of Python than comes in the basic installation.
As an FYI, for anyone else doing this, here’s what the IDLE install looks like:
So now all those tutorial / example pages that say “Now do FOO in IDLE” become useful to me…
Some stuff about IDLE:
One of my complaints about Python is that there’s 1/2 dozen “choices” at key points. This means 1/2 dozen “evaluation processes” to pick one to use and it means “look at 6 advice pages to throw out 5” as they don’t match what combination you have chosen. Then multiply by 2 for Python 2 vs 3…
Similarly I’ve found at least 2 “methods” for loading an SQL database into Python. Now I get to figure out which one is “best” or what to use or will it work or… Another of the problems with OO Design. You get some facility in a loaded library, but someone else doesn’t know it, or is lost trying to find it, or just doesn’t like it; so does the rational thing. They write another one from scratch. Now you have 2 such libraries of ‘objects’ to search for / through / learn in depth before you can reasonably choose and use. Now add that to the two x for releases 2 and 3 and then the 3 to 6 times for IDE / REPL / IDLE / Whatever… You get an exponential explosion of “Mix and Match” before you can ever get any work done. Then an exponential explosion of “Search & Reject” trying to find an example to model.
Eventually the workload for just wandering in that forest become oppressive and folks wander off to write Yet Another Language; and the cycle repeats. (Thus Go, and Julia, and…)
Then folks wonder why I’ll knock out a FORTRAN program to do something… Because it requires zero search time, zero choice time, zero rejections of models, zero incompatibles issues; and I can do it in about 2 minutes wall clock for most things when I’ve done anything similar before. (Like turning the TABS filler into a COMMA filler – maybe a minute all told).
So yeah, I’m just griping that learning new language is hard, and made harder by too many optional paths, and made harder by “fancy environment” choices. And made harder by hiding what, IMHO, ought to be fundamental language properties off in OO Libraries that you “import” but are not in the language spec (so you get to “go fish” for documentation then search that all…). Yes, I know that arrays are handled in NumPy, now. But in FORTRAN they are done in the core language and it’s in the manuals…
Oh Well. I chose to take this path, and I’m a willing student while learning more about this OO Library way of things; but that also grants me the right to gripe about it. ;-) I’ve learned a lot of languages over the years and so far, while basic Python is very easy, once you go off to “import FOO” land it gets harder than most very fast. (Largely as FOO is not part of the language… and the manual for FOO is off who knows where… and I didn’t set out to learn FOO… and…)
Oh Boy, my first saved, loaded, and run Python Program (i.e. not in the REPL just typed in live):
then the result:
I finally feel like I’m getting where I want to be with Python. Not just fooling around in an interpreter in dribs and drabs. I can now save / recall Real Programs, and read / write to files.
Near as I can tell, you must have something like IDLE or an IDE installed to do the save / read in. It may be buried in the interpreter REPL somewhere but I could only find “launch it as ‘python -i foo.py’ to start a program” and could not find any save / read in options once inside the REPL. (Read Evaluate Print Loop – basically an interpreter) Besides, all the examples immediately run off to any one of several IDEs or IDLE… so it looks like “the one (of several…) true way” in Python.
So, from here on out, as I’m writing more complicated things and typing them in from scratch into the interpreter you get when you do “sh> python” is getting old fast: I’ll be using IDLE and writing saved programs. I’ll also, now, be able to save more of my output to files if desired. Graphs to the screen, but things like data tables post processing as data tables can go to a file…
I probably ought to have named that /tmp/junk.pyout just to be clear it isn’t Python text… maybe next time ;-)
So one more tiny step forward…
Maybe I’ll explore whatever the Python equivalent of a FORMAT statement for fixed data input might be and see if I can replace those FORTRAN glue-ware programs…