formatted i/o improvements #69

Open
gronki opened this issue Nov 3, 2019 · 2 comments
Labels
Clause 13 Standard Clause 13: Input/output editing

Comments


gronki commented Nov 3, 2019

In this example we will try to print the following message:

important parameter = 1.5 for n = 4

given by the format:

("important parameter = ", f6.1, " for n = ", i3)

syntax coloring in formats

Back when space shuttles were still flying, you would do something along the lines of

666 format ("important parameter = ", f6.1, " for n = ", i3)
print 666, 1.5, 4

This is terrible for many reasons. One of them is that 666 tells you nothing about the format (surely nothing good), and you might spend the next 15 minutes looking for it in the code. Later, character literals were allowed as formats, which brought us to the following:

print '("important parameter = ", f6.1, " for n = ", i3)', 9.5, 11

This would be good if not for the fact that GitHub, like every text editor I've worked with, displays the entire format in string color. As if that were not bad enough, you need to use a different kind of quote inside the format than the one enclosing the character literal, and you have to remember which one you used.

My proposal is to introduce something to distinguish between format strings and other strings. Internally they could be handled the same, but text editors would be able to highlight them differently.

  • Option 1: use backticks ``. The downside is that they are difficult to distinguish from single quotes, but some languages (like MySQL) actually do use them. Alternatively, other delimiters could be used, like // (whatever doesn't conflict with already existing syntax). Example:
    print `("important parameter = ", f6.1, " for n = ", i3)`, 2.4, 3
    print /("important parameter = ", f6.1, " for n = ", i3)/, 2.4, 3
  • Option 2: revive format as a "function". The downside is that it's quite a long word and would not be very clear when used directly in print/write/read. Example:
    character(len = *), parameter :: fmt = format("important parameter = ", f6.1, " for n = ", i3)
  • Option 3: use a character prefix, similar to z being used for hex constants. The downside is that you still have to escape the quotes. Example:
    print f'("important parameter = ", f6.1, " for n = ", i3)', 2.4, 3

Yes, this is syntactic sugar, but it is something that compilers just need to allow (it does not introduce new rules into the language), and it makes developers' lives much easier.

allow format without parentheses

Consider the following:

print '(3i3)', 1, 2, 3, 4, 5

Of course, the output will be

  1  2  3
  4  5

Now, in 99% of cases the format is never repeated, because the number of arguments matches the number of format fields. The parentheses () therefore only obfuscate the syntax. I propose the following form (currently invalid):

print '3i3', 1, 2, 3, 4, 5

which would produce the output:

  1  2  3

In that case, our example would be along the lines of

print '"important parameter = ", f6.1, " for n = ", i3', 9.5, 11
print /"important parameter = ", f6.1, " for n = ", i3/, 9.5, 11

Obviously, not all of them look very good, but there are many choices here that can be made to improve readability.
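
To make the difference between the two behaviours concrete, here is a rough C analogue (C has no format reversion, so this is only a sketch). The function name `emit_i3` and its `revert` flag are hypothetical: `revert = 1` mimics what '(3i3)' does today, `revert = 0` mimics the proposed '3i3'.

```c
#include <stdio.h>

/* Hypothetical helper: write n integers to fp in fields of width 3,
   `group` values per line.
   revert != 0: keep starting new lines until every value is written
                (a rough analogue of Fortran format reversion).
   revert == 0: stop after the first group (the proposed '3i3' behaviour).
   Returns the number of values actually written. */
int emit_i3(FILE *fp, const int *v, int n, int group, int revert)
{
    int written = 0;
    while (written < n) {
        for (int k = 0; k < group && written < n; k++)
            fprintf(fp, "%3d", v[written++]);
        fputc('\n', fp);
        if (!revert)
            break;          /* no reversion: leftover values are ignored */
    }
    return written;
}
```

With the five values from the example, `emit_i3(stdout, v, 5, 3, 1)` writes all five over two lines, while `emit_i3(stdout, v, 5, 3, 0)` stops after the first three.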

allow character literals in input formats

This was discussed in comp.lang.fortran. Still, I think it is worth considering, as it is a simple thing (also to implement) that would increase the functionality of the language.

Consider the following code:

integer i
real f

666 format ("important parameter = ", f6.1, " for n = ", i3)
667 format (22x, f6.1, 9x, i3)

read (*, *) f, i
open (33, status = 'scratch')
write (33, 666) f, i
rewind (33)
! note: we have to use 667 format instead of 666
read (33, 667) f, i
close(33)

print 666, f, i

end

In action:

$ gfortran test.f90 && ./a.out
3 3
important parameter =    3.0 for n =   3

As you can see, a whole new format 667 has to be constructed to read the data written with format 666. That defeats the whole idea of separating formatting from data. For this case, I have written a simple function that replaces every character literal within the format with Nx, where N is the length of that literal. But I see no reason why this could not be handled internally by the compiler, which would remove the burden of maintaining two formats (one for write, one for read). There are two ways this could be done:

  • Option 1: simply treat any character literal as Nx, where N is the length of that literal
  • Option 2: read next N characters into a hidden variable and compare with what's in the format, throwing an error if these sequences do not match.
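
For reference, the Option 1 rewriting can be sketched in a few lines of C. This is not the helper function mentioned above (which is not shown in this post), only a minimal illustration of the idea; the name `literals_to_skips` is made up, and it assumes the format contains only quoted literals (no H edit descriptors) and rejects empty literals rather than emitting 0x.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the Fortran format string `fmt` into `out`,
   replacing every quoted character literal with "Nx", where N is the
   literal's length. A doubled delimiter inside a literal ('' or "")
   counts as a single character.
   Returns 0 on success, -1 if `out` is too small, a literal is
   unterminated, or a literal is empty. */
int literals_to_skips(const char *fmt, char *out, size_t outsz)
{
    size_t o = 0;
    if (outsz == 0)
        return -1;
    while (*fmt) {
        if (*fmt == '"' || *fmt == '\'') {
            char q = *fmt++;            /* remember which quote opened it */
            size_t n = 0;
            for (;;) {
                if (*fmt == '\0')
                    return -1;          /* unterminated literal */
                if (*fmt == q) {
                    if (fmt[1] == q) {  /* doubled quote: one character */
                        fmt += 2;
                        n++;
                        continue;
                    }
                    fmt++;              /* closing quote */
                    break;
                }
                fmt++;
                n++;
            }
            if (n == 0)
                return -1;              /* empty literal: not handled here */
            int w = snprintf(out + o, outsz - o, "%zux", n);
            if (w < 0 || (size_t)w >= outsz - o)
                return -1;              /* output buffer too small */
            o += (size_t)w;
        } else {
            if (o + 1 >= outsz)
                return -1;              /* keep room for the terminator */
            out[o++] = *fmt++;
        }
    }
    out[o] = '\0';
    return 0;
}
```

Applied to the format labelled 666 above, this yields `(22x, f6.1, 9x, i3)`, which is exactly the hand-written format 667.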

Example in C:

#include <string.h>
#include <stdio.h>

int main(void) {

int d, r;
const char fmt1[] = "wolf has %5d sheep\n";
const char fmt2[] = "sheep is %5d kilos\n";
char buf[128];

scanf("%d", &d);

sprintf(buf, fmt1, d);

d = 0;
r = sscanf(buf, fmt1, &d);
printf(fmt1, d);
printf("sscanf exited with %d\n", r);

d = 0;
r = sscanf(buf, fmt2, &d);
printf(fmt2, d);
printf("sscanf exited with %d\n", r);

}

result:

$ gcc test.c && ./a.out
7
wolf has     7 sheep
sscanf exited with 1
sheep is     0 kilos
sscanf exited with 0

As you can see, when the wrong format was used, sscanf returned 0, which means that no variables were assigned a value.

As far as I know there should be no conflict with legacy code, as character constants were not allowed in input formats before.
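
One caveat if C is taken as the model for Option 2: the return value of sscanf only counts assigned conversions, so literal text after the last conversion (here " sheep") is not reflected in it. A common way to check the whole line is the %n directive, which is only reached if everything before it matched. A minimal sketch, with the hypothetical function name `read_wolf_line`:

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 only if `buf` matches the whole pattern
   "wolf has %5d sheep", including the trailing literal text.
   The parsed number is stored in *d whenever the %5d conversion succeeds,
   even if the trailing literal then fails to match. */
int read_wolf_line(const char *buf, int *d)
{
    int consumed = -1;  /* stays -1 unless sscanf gets as far as %n */
    int r = sscanf(buf, "wolf has %5d sheep%n", d, &consumed);
    return r == 1 && consumed >= 0;
}
```

With this check, a line such as "wolf has     7 goats" is rejected even though sscanf itself would still report one assigned value.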

summary

These are just a few ideas. I think there are many more simple improvements possible for Fortran formatted i/o. They can be collected in this thread, and the best ones can then be extracted into a clean and polished version.

Dominik


klausler commented Nov 4, 2019

The outermost parentheses in CHARACTER formats are useful to determine the end of the format without having to skip over trailing blanks.


gronki commented Nov 4, 2019 via email
