Skip to content

SeekFrom with different inner types is confusing (and non-POSIX) #1889

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
amosonn opened this issue Feb 8, 2017 · 7 comments
Closed

SeekFrom with different inner types is confusing (and non-POSIX) #1889

amosonn opened this issue Feb 8, 2017 · 7 comments
Labels
breaking-change The RFC proposes a breaking change. T-libs-api Relevant to the library API team, which will review and decide on the RFC.

Comments

@amosonn
Copy link

amosonn commented Feb 8, 2017

Currently, SeekFrom has different "inner" type depending on whether we seek from the Start (u64) or from the Current or End position (i64). This of course only makes sense, if we assume that Seek objects can (and should) support a u64 size. This has a few problems (in rising order of importance, IMO):

  1. This deviates from POSIX standard for files, which dictates seeking by off_t (http://man7.org/linux/man-pages/man2/lseek.2.html), which is defined to be signed (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html#tag_13_67), for all SeekFrom-s.
  2. You can only access part of the file when seeking from Current or End, which kinda sucks. Not necessarily terrible, but also not the best API.
  3. It makes the arithmetic very tricky. See here: Overflow in libstd/io/cursor.rs "seek" returns incorrect error with large positions rust#39631. Of course, this can be remedied, (one possible solution described there), but I think that almost anyone who wants to implement Seek for a custom type (which is not a wrapper of an inner, Seek type) will encounter this problem. That's how I found it - I was at loss how to properly do the arithmetic, looked for a stdlib example, and found it was buggy. The other place I found in stdlib implementing a custom Seek type was in test_buffered_reader_seek_underflow() in libstd/io/buffered.rs, and there they overcame the problem by making the type a "circular" reader. I presume this is because implementing proper arithmetic considering the two different types was too much work (and irrelevant for the test), but of course isn't applicable for many real-life cases.

Generally speaking, I think that it is better to change an API that pushes people to have buggy implementations, especially when the ramifications are quite measly. The reason the bug above is still in stdlib is because nobody cares, because we never really have objects that are this big, which means we might as well use i64 to store our position. For files, this also the standard, so no trouble with the OS interface there.

@sfackler
Copy link
Member

sfackler commented Feb 8, 2017

We are not going to break backwards compatibility and change the type in this field.

Do you work with seekable structures that are over 9.2 exabytes? How would point 2 be improved if Start and End stored an i64?

@amosonn
Copy link
Author

amosonn commented Feb 8, 2017 via email

@amosonn
Copy link
Author

amosonn commented Feb 8, 2017 via email

@sfackler
Copy link
Member

sfackler commented Feb 8, 2017

Again, it is a year and a half too late to make that change.

On the other hand, if Start were to hold i64, it would be implied that
pos, as well as size, should be i64

I don't think that would be the case - off_t is 32 bits in many cases on systems that support files over 2GB and you have to seek around with multiple calls if _FILE_OFFSET_BITS=64 isn't an option.

@amosonn
Copy link
Author

amosonn commented Feb 12, 2017

Well, "Too late" is definitely a good argument.
Just out of curiosity, how does Rust handle seeking in 32-bit platforms, given that the types in SeekFrom are fixed to 64-bit?
Thanks for your time!

@sfackler
Copy link
Member

It uses lseek64.

@Centril Centril added breaking-change The RFC proposes a breaking change. T-libs-api Relevant to the library API team, which will review and decide on the RFC. labels Feb 23, 2018
@Centril
Copy link
Contributor

Centril commented Oct 8, 2018

Seems this was resolved and the libs team is not inclined to do anything here so I'll close this.

@Centril Centril closed this as completed Oct 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change The RFC proposes a breaking change. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

3 participants