formatted i/o improvements #69

Open
@gronki

In the examples below we will attempt to print the following message:

important parameter = 1.5 for n = 4

given by the format:

("important parameter = ", f6.1, " for n = ", i3)

syntax coloring in formats

back in the times when space shuttles were still flying, you would do something along the lines of

666 format ("important parameter = ", f6.1, " for n = ", i3)
print 666, 1.5, 4

this is terrible for many reasons, one of them being that 666 does not tell you anything about the format (surely not anything good) and you might spend the next 15 minutes looking for it in the code. Later, character literals were allowed as formats, which brought us to the following:

print '("important parameter = ", f6.1, " for n = ", i3)', 9.5, 11

this would be good if not for the fact that GitHub, like all text editors I've worked with, displays the entire format in string color. If that was not bad enough, you need to use a different kind of quotes inside the format than the kind enclosing the character literal (or double every inner quote), and you have to remember which one you used.
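To make the quoting problem concrete, here is a small sketch in current standard Fortran. The program name `quotes_in_formats` and the constant name `fmt_param` are only illustrative. It shows the two ways of nesting quotes that work today, and that a named constant can at least replace the numeric label.

```fortran
program quotes_in_formats
    implicit none
    ! The format kept in a named constant instead of a numeric label.
    ! Outer apostrophes, inner double quotes.
    character(len=*), parameter :: fmt_param = &
        '("important parameter = ", f6.1, " for n = ", i3)'

    print fmt_param, 1.5, 4

    ! Same kind of quote inside and outside: every inner apostrophe
    ! has to be doubled, which is easy to get wrong.
    print '(''important parameter = '', f6.1, '' for n = '', i3)', 1.5, 4
end program quotes_in_formats
```

Both statements print the same line. In either form an editor still sees one long string literal, which is the highlighting problem described above.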

My proposal is to introduce something to distinguish between format strings and other strings. Internally they could be handled the same, but text editors would be able to highlight formats differently.

  • Option 1: use backticks ``. The downside is that they are difficult to distinguish from single quotes, but some languages (like MySQL) actually do use them. Alternatively, other characters could be used, like // (whatever doesn't conflict with already existing syntax). Example:
    print `("important parameter = ", f6.1, " for n = ", i3)`, 2.4, 3
    print /("important parameter = ", f6.1, " for n = ", i3)/, 2.4, 3
  • Option 2: revive format as a "function". The downside is that it's quite a long word and would not be very clear when used directly in print/write/read. Example:
    character(len = *), parameter :: fmt = format("important parameter = ", f6.1, " for n = ", i3)
  • Option 3: use a character prefix, similar to z being used for hex constants. The downside is that you still have to escape the quotes. Example:
    print f'("important parameter = ", f6.1, " for n = ", i3)', 2.4, 3

Yes, this is syntactic sugar, but it is something that compilers just need to allow (it does not introduce new rules into the language), and it makes the life of developers much easier.

allow format without parentheses

consider the following:

print '(3i3)', 1, 2, 3, 4, 5

Of course, the output will be

  1  2  3
  4  5

Now, in 99% of cases the format is never repeated, because the number of arguments matches the number of format fields. Therefore, the parentheses () only obfuscate the syntax. I propose the following form (currently invalid):

print '3i3', 1, 2, 3, 4, 5

which, with no parentheses and therefore no repetition of the format, would produce the output:

  1  2  3

In that case, our example would be along the lines of

print '"important parameter = ", f6.1, " for n = ", i3', 9.5, 11
print /"important parameter = ", f6.1, " for n = ", i3/, 9.5, 11

Obviously, not all of them look very good, but there are many choices here that can be made to improve readability.

allow character literals in input formats

this was discussed in comp.lang.fortran. Yet, I still think it is worth considering as it is a simple thing (also to implement) that would increase the functionality of the language.

Consider the following code:

integer i
real f

666 format ("important parameter = ", f6.1, " for n = ", i3)
667 format (22x, f6.1, 9x, i3)

read (*, *) f, i
open (33, status = 'scratch')
write (33, 666) f, i
rewind (33)
! note: we have to use 667 format instead of 666
read (33, 667) f, i
close(33)

print 666, f, i

end

In action:

$ gfortran test.f90 && ./a.out
3 3
important parameter =    3.0 for n =   3

As you can see, a whole new format 667 needs to be constructed to read the data written with format 666. That defeats the whole idea of separating formatting from data. In this case, I have written a simple function that replaces any character literal within the format with Nx, where N is the length of that literal. But I see no reason why this could not be handled internally by the compiler, removing the burden of maintaining two formats (one for write and one for read). There are two ways this could be done:

  • Option 1: simply treat any character literal as Nx, where N is the length of that literal
  • Option 2: read the next N characters into a hidden variable and compare them with what's in the format, throwing an error if the two sequences do not match.
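The helper function mentioned above is not shown in this issue. As a rough illustration of what Option 1 amounts to, here is a minimal sketch of doing the same conversion by hand in current Fortran. The names `fmt_tools` and `literals_to_skips` are made up for this example, and the sketch assumes that literals are non-empty and contain no doubled quotes.

```fortran
module fmt_tools
    implicit none
contains

    ! Hypothetical helper: returns a copy of fmt in which every quoted
    ! literal is replaced by "Nx", N being the length of the literal.
    ! Assumption: no doubled quotes inside literals, no empty literals.
    function literals_to_skips(fmt) result(out)
        character(len=*), intent(in) :: fmt
        character(len=:), allocatable :: out
        character(len=12) :: num
        character :: q
        integer :: i, j

        out = ''
        i = 1
        do while (i <= len(fmt))
            if (fmt(i:i) == '"' .or. fmt(i:i) == "'") then
                q = fmt(i:i)
                j = index(fmt(i+1:), q)      ! offset of the closing quote
                if (j == 0) error stop 'unterminated literal in format'
                write (num, '(i0)') j - 1    ! length of the literal
                out = out // trim(num) // 'x'
                i = i + j + 1                ! continue after the closing quote
            else
                out = out // fmt(i:i)
                i = i + 1
            end if
        end do
    end function literals_to_skips

end module fmt_tools

program demo
    use fmt_tools
    implicit none
    character(len=*), parameter :: fmt = &
        '("important parameter = ", f6.1, " for n = ", i3)'
    character(len=:), allocatable :: rfmt
    character(len=64) :: line
    real :: f
    integer :: n

    rfmt = literals_to_skips(fmt)
    if (rfmt /= '(22x, f6.1, 9x, i3)') error stop 'unexpected read format'

    write (line, fmt) 1.5, 4     ! write with the original format
    read (line, rfmt) f, n       ! read back with the derived format
    print '(a)', rfmt
    print fmt, f, n
end program demo
```

For the format used throughout this issue, the derived read format is `(22x, f6.1, 9x, i3)`, the same as the hand-written format 667. Option 2 would additionally require comparing the skipped characters against the literal, which this sketch does not attempt.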

Example in C:

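#include <stdio.h>

int main(void) {

int d, r;
const char fmt1[] = "wolf has %5d sheep\n";
const char fmt2[] = "sheep is %5d kilos\n";
char buf[128];

if (scanf("%d", &d) != 1) return 1;

/* write d into the buffer using fmt1 */
sprintf(buf, fmt1, d);

/* read it back with the same format: the literal text matches */
d = 0;
r = sscanf(buf, fmt1, &d);
printf(fmt1, d);
printf("sscanf exited with %d\n", r);

/* read it back with a different format: the literal text does not match */
d = 0;
r = sscanf(buf, fmt2, &d);
printf(fmt2, d);
printf("sscanf exited with %d\n", r);

return 0;
}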

result:

$ gcc test.c && ./a.out
7
wolf has     7 sheep
sscanf exited with 1
sheep is     0 kilos
sscanf exited with 0

As you can see, when the wrong format was used, sscanf returned 0, which means that no variables were assigned a value.

As far as I know there should be no conflict with legacy code, as character constants were not allowed in input formats before.

summary

These are just a few ideas. I think there are many more simple improvements possible to Fortran formatted i/o. They can be collected in this thread, and then the best ones extracted in a clean and polished version.

Dominik