GIStemp STEP3 – the process

This is step 3

STEP3 Overview

STEP3 begins the zonal box anomaly process. Substantially the same code is run again in parts of STEP4_5, so this review is a foundational code review for that step. Strangely, the source code is repeated in STEP4_5 unchanged (though there is the potential for it to mutate, since we now have multiple copies lying about). Do maintenance on one and forget about the other copy?…
 

Sizes and Listings

 

Let’s look inside STEP3 for how big things are. Files ending in “.f” are FORTRAN program source code (that when compiled into something the computer can run gets the “.exe” executable binary suffix) while those ending in “.sh” are Unix / Linux shell scripts. Oh, and “*” is the “wildcard character” that says “match anything” so *smith would match “Goldsmith” and “Tinsmith” and… So just taking a look at the line count numbers in the text files (cd STEP3, wc -l *) :

   Lines   Words   Bytes File Name

     247     851    8909 annzon.f
      55     252    1714 do_comb_step3.sh
     675    2330   23070 to.SBBXgrid.f
      21      76     459 trimSBBX
     130     525    4412 trimSBBX.f
      35     102     881 zonav
     397    1556   13952 zonav.f
    1560    5692   53397 total

 

For STEP3 there are about 1560 lines of code in total. I will be reviewing the actual code a bit more loosely than we did in the early parts, since otherwise the postings will get a bit long. In particular, to.SBBXgrid.f and zonav.f will be a bit abbreviated in depth.

 

Scripts:

 

Note – all FORTRAN is compiled to runnable binaries in-line in the scripts, executed, then the runnable binaries deleted. You would expect the human readable source code to be in a distinct archive, compiled to runnable binaries in a distinct program library once, and only deleted when a newer version had passed a Quality Assurance test.
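As a rough illustration of the "compile once" idea, here is a sketch of a test a wrapper could make before compiling. The helper name needs_rebuild and the bin/ directory are my own inventions for the example, not anything found in GIStemp, and it is written in portable sh rather than ksh-only syntax.

```shell
#!/bin/sh
# needs_rebuild SRC BIN
# Succeeds (status 0) when BIN does not exist or SRC is newer than BIN.
# Hypothetical helper; not part of the GIStemp scripts.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Example use (bin/ is an assumed directory for compiled programs):
#   if needs_rebuild zonav.f bin/zonav.exe; then
#       $FC zonav.f -o bin/zonav.exe
#   fi
```

With something like this the binary survives between runs and is only replaced when the source actually changes, which is also the natural point to hang a QA step on.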

One comment on style. I always learned that you created and used your work files in a work file directory; this code creates and uses them in the same directory as the program source code, then when done, moves them into a work directory. Not what you would expect.

There are three scripts. The top level controlling script, do_comb_step3.sh will be reviewed inline here. The “wrapper scripts” that compile the FORTRAN will be reviewed along with the programs they wrap. Those two scripts are trimSBBX and zonav.

zonav

This is called first in the do_comb_step3.sh process. It compiles and runs zonav.f and also does some file housekeeping. Please note that zonav also compiles and runs annzon.f, for no reason I can find, which breaks the style of one dedicated wrapper script per program.

trimSBBX

This script is called right after zonav. It is a bare bones compile, run, delete executable, change file names job.

do_comb_step3.sh

This is the start. It runs the show.

A Deeper Look at the Top Level Script

 

We will take a brief look at the top level controlling script do_comb_step3.sh and, after the listing of it, I’ll describe what it does.

Begin listing of “do_comb_step3.sh”:

#!/bin/ksh

fortran_compile=$FC
if [[ $FC = '' ]]
then echo "set an environment variable FC to the fortran_compile_command like f90"
     echo "or do all compilation first and comment the compilation lines"
     exit
fi

label='GHCN.CL.PA' ; rad=1200
if [[ $# -gt 0 ]] ; then rad=$1 ; fi

i="to_next_step/Ts.${label}"

if [[ ! -s $i.1 ]] ; then echo "input files ${i}.1-6 missing"; exit; fi

##  Input files:
n=1
while [[ $n -le 6 ]]
do
   ln ${i}.$n  fort.3$n
   (( n=$n + 1 ))
done

${fortran_compile} to.SBBXgrid.f -o to.exe
to.exe 1880 $rad > to.SBBXgrid.1880.$label.$rad.log

##   Output files:

echo "The following files were created:"
echo "SBBX1880.Ts.${label}.$rad    BX.Ts.${label}.$rad  "
mv fort.10 SBBX1880.Ts.${label}.$rad
mv fort.11   BX.Ts.${label}.$rad
mv fort.77   statn.use.Ts.${label}.$rad

if [[ $rad -eq 1200 ]] ; then ./zonav $label ; fi

./trimSBBX SBBX1880.Ts.${label}.$rad

# Clean-up
rm fort.3[0-6] ; rm  to.exe
mv *exe *log *use* work_files/.
a=$( ls SBBX*.trim ) ; mv $a to_next_step/${a%.trim}
mv BX* SBBX* work_files/.

echo ; echo "If you don't want to use ocean data, you may stop at this point"
echo "You may use the utilities provided on our web site to create maps etc"
echo "using to_next_step/SBBX1880.Ts.${label}.$rad as input file" ; echo

echo "In order to combine this with ocean data, proceed as follows:"
echo "move SBBX1880.Ts.${label}.$rad from STEP3/to_next_step to STEP4_5/input_files/."
echo "create/update the SST-file SBBX.HadR2 and move it to STEP4_5/input_files/."
echo "You may use do_comb_step4.sh to update an existing SBBX.HadR2 file"
echo "You may use do_comb_step5.sh to create the temperature anomaly tables"
echo "that are based on land and ocean data"

 
End of script.

This script is once again Korn shell. It starts off with a check that the environment variable “FC” is set to your FORTRAN compiler. Given that we’ve seen f90 constructs in some programs and f77 explicitly called in others, it would be “nice” if they had settled on one and had a script that explicitly set the environment variable for you “up front” at the start of the whole GIStemp run. Doing that would reduce some of the confusion and constant checking for what has / has not been done.
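For what it is worth, an "up front" check could look something like the sketch below. The function name require_fc is hypothetical, and the idea of calling it once from a master script is my assumption about how one might organize things, not something GIStemp does.

```shell
#!/bin/sh
# require_fc: hypothetical one-time check, meant to be called at the top
# of a master script. Fails if FC is empty or if its first word is not a
# command found on PATH.
require_fc() {
    if [ -z "${FC:-}" ]; then
        echo "FC is not set; export FC=<your Fortran compiler> first" >&2
        return 1
    fi
    # FC may carry options (e.g. "f90 -O2"), so test only the first word.
    set -- $FC
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "FC='$FC': compiler '$1' not found on PATH" >&2
        return 1
    fi
}
```

A caller would write "require_fc || exit 1" so that a failure is visible to whatever started the run. The bare "exit" in the listing returns the status of the echo before it, which is success, so a calling script cannot tell that anything went wrong.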

We set the default radius variable, rad, to 1200 (one presumes km) then check to see if we have been passed an override value.
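The default-plus-override pattern can also be written with a parameter expansion. This is only a sketch of the idiom; pick_radius is a made-up name, and unlike the "$# -gt 0" test in the script it also falls back to 1200 when an empty argument is passed.

```shell
#!/bin/sh
# pick_radius [RADIUS]
# Prints RADIUS if one was given (and is non-empty), otherwise 1200.
# Hypothetical helper for illustration only.
pick_radius() {
    echo "${1:-1200}"
}

rad=$(pick_radius)               # no override supplied
rad_override=$(pick_radius 250)  # caller asks for 250
echo "default=$rad override=$rad_override"
```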

We also set a fixed label of “GHCN.CL.PA” that is used throughout the script. One presumes this is to enable easy change later, though why is, as usual, unclear. Heck, it might just be to reduce the typing of capital letters and punctuation. I’ve done it.

We set our input file name “i” to be “to_next_step/Ts.GHCN.CL.PA” and one is tempted to ask why an input file is in a folder named STEP3/to_next_step but: “Why? Don’t ask why, down that path lies insanity and ruin. – e.m.smith”…

We then check to see if the first one of six input files is missing and if it is, we exit. One is left to wonder why “to_next_step/Ts.GHCN.CL.PA[2-6]” are less important than 1, but we all know where “why” leads by now, don’t we… So we use a canary and we hope that means everything is there and the prior program did not puke part way through leaving only a single output file. OK, good enough for G…
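Checking all six files rather than one canary is not much more work. A minimal sketch, assuming the same PREFIX.1 through PREFIX.6 naming the script uses (the function name check_inputs is mine):

```shell
#!/bin/sh
# check_inputs PREFIX
# Succeeds only if PREFIX.1 .. PREFIX.6 all exist and are non-empty;
# otherwise lists the offenders on stderr and returns 1.
# Hypothetical helper, not GIStemp code.
check_inputs() {
    prefix=$1
    missing=""
    n=1
    while [ "$n" -le 6 ]; do
        [ -s "$prefix.$n" ] || missing="$missing $prefix.$n"
        n=$((n + 1))
    done
    if [ -n "$missing" ]; then
        echo "missing or empty input files:$missing" >&2
        return 1
    fi
}
```

In the script this would be called as "check_inputs to_next_step/Ts.GHCN.CL.PA || exit 1", which catches the case where the prior step died after writing only some of its output.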

We then link these six files to 6 new names of the form “fort.3x” where x ranges from 1 to 6. So fort.31, fort.32, fort.33, fort.34, fort.35, and fort.36 will be our input file name surrogates for the real file names. Why? Excuse me, I think I need a beer… I’ll be right back…

Finally, some real action. We compile the program to.SBBXgrid.f (but name the output executable to.exe) run it (as to.exe) with a passed parameter of 1880 and our 1200 km radius then put the output into the file to.SBBXgrid.1880.GHCN.CL.PA.1200.log and move to a very polite statement that we created some other files too.

These are:

SBBX1880.Ts.GHCN.CL.PA.1200 created from fort.10
BX.Ts.GHCN.CL.PA.1200 created from fort.11
statn.use.Ts.GHCN.CL.PA.1200 created from fort.77

Watch for those fort.xx names inside the FORTRAN code…

Next, a strange thing happens. Only in the case where the radius is 1200 do we run the program “./zonav GHCN.CL.PA” and one is left to wonder. No, I won’t say it. You ought to understand what it means by now when you start to wonder. Especially if it is one of those whx words…

Proceeding on, we get to “./trimSBBX SBBX1880.Ts.GHCN.CL.PA.1200” that we will need to investigate later.

Finally, at the end, we delete the “fort.3x” links and move our scratch temp work files into “work_files” where they ought to have been made in the first place. This is getting just a bit irksome, since the programmer clearly knows how to prepend “input_files” and could just as easily prepend “work_files”. One is left to presume they either don’t care or “don’t get it” that it’s a bad thing to scribble files in with your source code and delete things from the “source archive” during operations.
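To show what "made in the first place" might look like, here is a sketch that puts the fort.3x surrogate names straight into a work directory. The helper name link_inputs is hypothetical, and it uses symbolic links where the original script uses hard links.

```shell
#!/bin/sh
# link_inputs PREFIX WORKDIR
# Creates WORKDIR/fort.31 .. WORKDIR/fort.36 as symbolic links to
# PREFIX.1 .. PREFIX.6. PREFIX should be an absolute path so the links
# resolve from inside WORKDIR. Hypothetical helper, not GIStemp code.
link_inputs() {
    prefix=$1
    workdir=$2
    mkdir -p "$workdir" || return 1
    n=1
    while [ "$n" -le 6 ]; do
        ln -sf "$prefix.$n" "$workdir/fort.3$n" || return 1
        n=$((n + 1))
    done
}
```

The program would then be run with work_files as its current directory, on the assumption (which the existing script already relies on) that the fort.10, fort.11 and fort.77 outputs are written to wherever the program is started from. Nothing would ever be created in, or deleted from, the source directory.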

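We then do an interesting twist of stuffing “a” with the name of whatever is left lying about with the name SBBX[anything].trim and moving that into to_next_step/ under the same name with the “.trim” suffix stripped off. That is what ${a%.trim} does: it removes the shortest trailing match of “.trim”. Then we move all the BX[anything] and SBBX[anything] leftovers into work_files. So the input file for the next step ends up as to_next_step/SBBX1880.Ts.GHCN.CL.PA.1200, with no “.trim” on the end, which matches the file name in the messages the script prints next. Do note that this only behaves if exactly one SBBX*.trim file is lying about; with two or more, “a” holds several names and the mv will not do what was intended.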
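The suffix-stripping expansion is easy to try on its own. In this small sketch the file name is an assumed example; the real one is whatever trimSBBX leaves behind.

```shell
#!/bin/sh
# Assumed example name; the real file is whatever trimSBBX produces.
a="SBBX1880.Ts.GHCN.CL.PA.1200.trim"

# ${a%.trim} removes the shortest trailing match of ".trim" from $a.
dest="to_next_step/${a%.trim}"
echo "$dest"    # prints: to_next_step/SBBX1880.Ts.GHCN.CL.PA.1200
```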

Finally, we get the useful message that we have computed our land anomalies and only need to proceed on to STEP4_5 if we want to add in Sea / Ocean anomalies. This is good to know. It’s also good to realize that all STEP4_5 is going to do is rehash this step, but adding in oceans.

Were I redesigning this, I’m certain I’d find a way to collapse 3, 4, and 5 into a single step with parameters to select desired run type.

Instead we are told that there is a manual step here:

move SBBX1880.Ts.GHCN.CL.PA.1200 from STEP3/to_next_step to STEP4_5/input_files/

Hurray! Rejoice! At last an output moves from to_next_step in the present step into “input_files” of the following step. Now if only such usage were consistent. As it stands now, right when they got you trained that “to_next_step” meant “from_last_step” too, they change it. Sigh. Draft or lager? Maybe I’ll go for Pilsner…

create/update the SST-file SBBX.HadR2 and move it to STEP4_5/input_files/

OK. Some information on how to ‘create/update’ it would be nice. I guess I’m going fishing again in the next step STEP4_5. I’d guess from the name that it’s a sea surface set of data from Hadley. Probably listed in the input files section.

“You may use do_comb_step4.sh to update an existing SBBX.HadR2 file”

So there will be some sorting out about what’s a creation and what’s an update and when you do which. But at least we know that STEP4 somehow does some update.

“You may use do_comb_step5.sh to create the temperature anomaly tables that are based on land and ocean data”

And at least we know that STEP5 is going to cook up the anomaly tables based on land and oceans.

But STEP4_5 is another step, and this brings us to the end of this one. Those mysteries will be explored elsewhere… For now, we need to take a tiny peek at what FORTRAN programs are lying about and what they do.

So what did we just do?

FORTRAN source:

 

There are 4 FORTRAN programs:

zonav.f

The header:

C*********************************************************************
C *** PROGRAM READS BXdata
C *** Input files:  11    box.data (BX1977.T1200)
C ***
C *** Output files: 10    zonal.means (ZON1977.T1200)
C ***
C*********************************************************************
C****
C**** This program combines the given gridded data (anomalies)
C**** to produce AVERAGES over various LATITUDE BELTS.

Next program:

annzon.f

The header:

C*********************************************************************
C *** program reads ZONAL monthly means and recomputes REGIONAL means
C *** as well as annual means.
C *** Input file:  10    zonal.means (ZON1977.T1200)
C ***
C *** Output files: 11    annual.zonal.means (ANNZON1977.T1200)
C ***               12    zonal.means (ZON1977.T1200) changed
C*********************************************************************
C****
C****
C**** Displays data and their annual means
C****

Next program:

trimSBBX.v2.f

The Header:

C**** This program trims SBBX files by replacing a totally missing
C**** time series by its first element. The number of elements of the
C**** next time series is added at the BEGINNING of the previous record.

Next program:

to.SBBXgrid.f

The Header:

C*********************************************************************
C *** PROGRAM READS NCAR FILES
C *** Input files:  31-36 NEWUPD.1-NEWUPD.6
C ***
C *** Output files: 10    subbox.data
C ***               11    box.data
C ***
C *** This program grids the station data
C*********************************************************************
C****
C**** This program interpolates the given station data or their
C**** ANOMALIES with respect to 1951-1980 to a prescribed grid.

And that concludes the overview of STEP3 FORTRAN.

There is no STEP3 “readme” file to reproduce below.


About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...

4 Responses to GIStemp STEP3 – the process

  1. Ian W says:

    I have spent some time testing code – and I have usually found that basic mis-assumptions in design are the source of most errors.

    What is unclear in your check through here is the actual algorithm(s) used to alter whatever projection that is being used into a valid ‘geographic area’ and not double count the same ‘area’. The surface of a sphere being what it is, the area of a degree of latitude and longitude at the poles will be _very_ small whereas the same at the equator will be relatively large. Is this ‘projection’ change catered for in the Fortran code? Is the arbitrary radius of 1200km a safe approach close to the poles? There seems to be the potential for a lot of error – and the messy coding does not enforce confidence in the algorithms used.

  2. E.M.Smith says:

    As near as I can tell, with about 1000 stations making it to the STEP3 process and it filling in 8000 boxes, there is no attempt at all to prevent stations from being reused many times. The log files bear this out too.

    There is a method used that seems to adequately deal with the polar issue. The globe is divided into “Regions” then each region is divided into 100 boxes. There are fewer regions near the poles ( 4 IIRC) than at the equator.

    Yeah, not a lot of trust in it from me either…

  3. Ian W says:

    But that is just the point – I would expect some clever spherical trigonometry to ensure that the boxes were all the same size. Otherwise averaging different size boxes as if they were the same size will grossly skew the results of the ‘average’ temperature. Worse would be if the one degree by one degree simplistic algorithm were used where small areas were equated to huge areas. So a very small change in the polar boxes would have far more impact on the ‘average’ than a large change in the equatorial regions.

    REPLY [ I think the use of a radius from the center is supposed to equalize that. But since the actual stations "spread around" varies dramatically by station density in any given area, I don't think that the shape is the big issue. The whole thing is a bit daft and admiring any one part of the daftness more than another is rather, er, daft ;-) -E.M.Smith ]

  4. Pingback: Temperature in Guam: Filling a Gap | Digging in the Clay
