FORTRAN Fixed File To C Tab Delimited

Note: After finding a posting in “draft” that I thought I’d published, I’m going through “drafts”, finding a few more, and posting them. They may not be germane to anything current.

https://stackoverflow.com/questions/428109/extract-substring-in-bash

If x is constant, the following parameter expansion performs substring extraction:

b=${a:12:5}
where 12 is the offset (zero-based) and 5 is the length

If the underscores around the digits are the only ones in the input, you can strip off the prefix and suffix (respectively) in two steps:

tmp=${a#*_} # remove prefix ending in "_"
b=${tmp%_*} # remove suffix starting with "_"
If there are other underscores, it’s probably feasible anyway, albeit more tricky. If anyone knows how to perform both expansions in a single expression, I’d like to know too.

Both solutions presented are pure bash, with no process spawning involved, hence very fast.

https://www.cs.bu.edu/teaching/c/file-io/intro/

Continuing our example from above, suppose the input file consists of lines with a username and an integer test score, e.g.:

in.list
——
foo 70
bar 98

and that each username is no more than 8 characters long.
We might use the files we opened above by copying each username and score from the input file to the output file. In the process, we’ll increase each score by 10 points for the output file:

char username[9]; /* One extra for nul char. */
int score;

while (fscanf(ifp, "%s %d", username, &score) != EOF) {
    fprintf(ofp, "%s %d\n", username, score+10);
}


The function fscanf(), like scanf(), normally returns the number of values it was able to read in. However, when it hits the end of the file, it returns the special value EOF. So, testing the return value against EOF is one way to stop the loop.

The bad thing about testing against EOF is that if the file is not in the right format (e.g., a letter is found when a number is expected):

in.list
——
foo 70
bar 98
biz A+

then fscanf() will not be able to read that line (since there is no integer to read) and it won’t advance to the next line in the file. For this error, fscanf() will not return EOF (it’s not at the end of the file)….

Errors like that will at least mess up how the rest of the file is read. In some cases, they will cause an infinite loop.

One solution is to test against the number of values we expect to be read by fscanf() each time. Since our format is “%s %d”, we expect it to read in 2 values, so our condition could be:

while (fscanf(ifp, "%s %d", username, &score) == 2) {

Now, if we get 2 values, the loop continues. If we don’t get 2 values, either because we are at the end of the file or some other problem occurred (e.g., it sees a letter when it is trying to read in a number with %d), then the loop will end.

Another way to test for end of file is with the library function feof(). It just takes a file pointer and returns a true/false value based on whether we are at the end of the file.

To use it in the above example, you would do:

while (!feof(ifp)) {
    if (fscanf(ifp, "%s %d", username, &score) != 2)
        break;
    fprintf(ofp, "%s %d\n", username, score+10);
}
Note that, like testing != EOF, it might cause an infinite loop if the format of the input file was not as expected. However, we can add code to make sure it reads in 2 values (as we’ve done above).

Note: When you use fscanf(…) != EOF or feof(…), they will not detect the end of the file until they try to read past it. In other words, they won’t report end-of-file on the last valid read, only on the one after it.

#include <stdio.h>
typedef struct
{
   char id_num[6];
   char Name[51];
   char Departure[11];
   char Arrival[11];
   char Day[3];
} PASSENGER;
void open_file(const char *filename)
{
   PASSENGER p[6] = {0};
   int i = 0;
   FILE *file = fopen(filename, "r");
   if ( file )
   {
      char line[80];
      /* Read the file line by line. */
      while ( fgets(line, sizeof line, file) && i < 6 )
      {
         fputs(line, stdout); /* Display the line. */
         if ( sscanf(line, "%5s %50c%10c%10c%2s", /* %2s: Day holds 2 chars + nul */
                     p[i].id_num,
                     p[i].Name,
                     p[i].Departure,
                     p[i].Arrival,
                     p[i].Day) == 5 )
         {
            printf("id_num = \"%s\"\n", p[i].id_num);
            printf("Name = \"%s\"\n", p[i].Name);
            printf("Departure = \"%s\"\n", p[i].Departure);
            printf("Arrival = \"%s\"\n", p[i].Arrival);
            printf("Day = \"%s\"\n", p[i].Day);
            ++i;
         }
      }
      fclose(file);
   }
   else
   {
      perror(filename);
   }
}
int main(void)
{
   open_file("file.txt");
   return 0;
}

About E.M.Smith

A technical managerial sort interested in things from Stonehenge to computer science. My present "hot buttons" are the mythology of Climate Change and ancient metrology; but things change...