Incorrect handling of errors #2

winstonewert · 2020-09-03T16:19:12Z

I'm not sure if this project is active, but I've found a problem.

read.rs:258

        let out = func(&mut reader)?;
        reader.done();

If func returns an error, then reader.done() isn't called. This means that the reading thread won't stop running and will block forever waiting for its content to be read.

The text was updated successfully, but these errors were encountered:

markschl · 2020-09-03T21:26:44Z

Thanks for reporting, that's indeed very bad. It's a bit strange that the read_fail test didn't bring up this problem. I guess, the following should fix this::

    let result = func(&mut reader);
    reader.done();
    let out = result?;

Could you maybe confirm that this works? Since I'm in holiday, it might take two weeks until I'll be able to take a close look and publish the fix.

winstonewert · 2020-09-03T22:45:33Z

That's exactly how I fixed it in my local copy. If you'd like I can do a pull request. I think that `read_fail` misses this because `read_fail` only tests an error generated by the reading thread at the end of the input. In my case, I have a failure in the processing thread at the start of the output.

…

On Thu, Sep 3, 2020 at 2:27 PM markschl ***@***.***> wrote: Thanks for reporting, that's indeed very bad. It's a bit strange that the read_fail test didn't bring up this problem. I guess, the following should fix this:: let result = func(&mut reader); reader.done(); let out = result?; Since I'm in holiday, it might take two weeks until I'll be able to take a close look and publish the fix. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#2 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGRS7NBB2YDII753ANFWTDSEAC2NANCNFSM4QVCCYBA> .

markschl · 2020-09-04T08:39:38Z

That's exactly how I fixed it in my local copy. If you'd like I can do a pull request.

A PR would be nice if you have time, thanks. In addition, after rethinking this, I suspect that let out = result?; should go after handle.join().unwrap()?;. Otherwise, remaining data still in the queue when the error occurs will be lost. Theoretically, it's also possible that a reading error happened and is waiting in the queue to be reported. This reading error will then be returned instead of the other error in the main thread.

I think that read_fail misses this because read_fail only tests an error generated by the reading thread at the end of the input. In my case, I have a failure in the processing thread at the start of the output.

You are right. Maybe adding a test for this would be nice.

winstonewert · 2020-09-04T14:32:00Z

On Fri, Sep 4, 2020 at 1:39 AM markschl ***@***.***> wrote: I suspect that let out = result?; should go *after* handle.join().unwrap()?;. Otherwise, remaining data still in the queue when the error occurs will be lost. Theoretically, it's also possible that a reading error happened and is waiting in the queue to be reported. This reading error will then be returned instead of the other error in the main thread.

What else could happen but lose the remaining data? The processing has already returned with an error. But, yes, if a reading error happened then we'll want to make sure that gets returned.

markschl · 2020-09-04T18:41:29Z

What else could happen but lose the remaining data? The processing has already returned with an error. But, yes, if a reading error happened then we'll want to make sure that gets returned.

Ok, I admit that also returning from the main thread on error without waiting for more data or io::Errors is a valid behaviour, as long as it's documented. If doing so, we would loose some data / errors waiting in the BackgroundReader::full_send queue. On the other hand, if waiting for the background thread to finish with handle.join().unwrap()?; before returning the error (as I proposed in my last comment), there will be reads happening after the error in the main thread occurred because Reader::done() sends a stop signal to the BackgroundReader::empty_recv queue, and the background thread will not stop instantly (delay depends on queuelen). I'm not sure right now, which one is the smarter solution.

Btw, I think the background writer (write.rs) has the same bug. I can address this in another commit, or if you have time, you could also include analogous changes in the PR.

winstonewert · 2020-09-05T00:21:36Z

I think that you should simply throw away any errors that the client code didn't read. My reasons:

The client code may be deliberately not reading to the end of the stream, knowing that it will error out for some reason.
Returning the errors from the background thread creates a race condition. The behavior of the program will be different depending on how far the background thread got.
The client code may be adding context/transforming the io error in some way, which is defeated by trying to return the errors.
Not returning errors that would be later in the file matches the semantics of a typical file more closely.
Your code could be somewhat more generic, because the error type wouldn't need to be From<io::Error>

So, if it were up to me (and its your crate, so its not), i'd delete reader.get_errors()?.

The primary reason I can see to handle those errors is for symmetry with writer. Obviously in the case of the writer you want to capture any errors that might have happened trying to write after processing was finished.

markschl · 2020-09-05T18:42:05Z

Thanks for your reasoning, I agree on points 1-4. It's problematic to do further reads after the error in the main thread occurred, and if an io::Error would be returned (and converted) instead, it would be confusing. The call to Reader::get_errors (and the method itself) can be removed.

Regarding 5., I'm of course open for changes :) How would you propose the reader function signature to be? I think, symmetry to the writer module is not that important.

Here a few thoughts about the write module (it helps me to write them down, since it's not so easy to wrap my head around this, especially in holiday): The get_errors function is important for reporting possible io::Errors, although delayed in relation to the write call in the main thread. When func ends successfully, the last (usually partially filled) buffer is sent to the background for writing (writer.done()?;), and writer.get_errors()?; ensures that errors are reported back. On the other hand, if func ends early with a custom, we don't want to send any more data, writer.done()?; shouldn't be called. However, the writer in the background thread is still writing what is in the queue, and these writes may fail. These failures can happen both before and after the error in the main thread occurred, we cannot just stop the writer at this exact time point. If an error happened in both the main and the background writer thread, would we need to report both to be totally clear? I'm not sure right now how to make an ergonomic API with such a behaviour. But if so, we could separate the two error types like in the reader.

winstonewert · 2020-09-05T19:39:51Z

I think we can just drop the + From<io::Error> from reader and reader_init because without get_errors(), there's no need to convert io::Error into E.

On writing, I think the best option is to return the func error and ignore further io:Error. It would be nice to return both errors in some way, but I think it wouldn't be ergonomic. Given the choice of the two errors, I think I'd rather see the func error.

Also, looking at this code, I think I'd expect flush to actually force the background thread to flush before returning instead of requesting a flush in the background. (Although I can't see a use case for flushing when using this crate.)

markschl · 2020-09-06T19:14:29Z

I think we can just drop the + From<io::Error> from reader and reader_init because without get_errors(), there's no need to convert io::Error into E.

Ah right, I don't know why I missed that this is simply possible. I'm a bit distracted right now.

On writing, I think the best option is to return the func error and ignore further io:Error. It would be nice to return both errors in some way, but I think it wouldn't be ergonomic. Given the choice of the two errors, I think I'd rather see the func error.

Fine for me, I'll document the behaviour in the next release.

Also, looking at this code, I think I'd expect flush to actually force the background thread to flush before returning instead of requesting a flush in the background. (Although I can't see a use case for flushing when using this crate.)

flush() is a required trait method of io::Write, and I prefer to mimic the behaviour of io::Write as closely as possible. The flush call will be in the same order relative to write calls. If flushing after all writes are done, it is guaranteed that no further writes will be done after flushing in the background thread. io::BufWriter will flush automatically when it goes out of scope. Which would be after func returns for writer_init and after the provided writer goes out of scope for writer.

Regarding fs::File, I just found this issue. I haven't read it through completely, not sure if e.g. sync_all and flush are equivalent, and in which case one of them may be needed to be absolutely sure that all data is written. sync_all can be called in the finish closure of writer_finish or writer_init_finish. But unless I miss something, I don't think that it is up to thread_io to provide such functionality.

markschl · 2020-09-06T20:57:53Z

If flushing after all writes are done, it is guaranteed that no further writes will be done after flushing in the background thread

Correcting my own statement, there can be a further write even if the flush call is last. So, if wanting to flush before the handle is closed, it would be necessary to do this in the finish closure. Still, I'm not sure if this should be automatically done (always).

winstonewert · 2020-09-06T20:58:07Z

My understanding is that flush() is supposed to ensure that all application level buffers are written out. sync_all() ensures that all OS level buffers are written to disk. As I think about it, your writer is an application level buffer which is why I expect it to be flushed when `flush()` returns, the same as I'd expect BufWriter::flush() to have an empty buffer when it returns. But that's an entirely distinct issue from the error handling, and I'm not even using your writer (at least not yet), so I'll leave it at that. I plan to send you a pull request sometime this week for the error handling.

…

On Sun, Sep 6, 2020 at 12:14 PM markschl ***@***.***> wrote: I think we can just drop the + From<io::Error> from reader and reader_init because without get_errors(), there's no need to convert io::Error into E. Ah right, I don't know why I missed that this is simply possible. I'm a bit distracted right now. On writing, I think the best option is to return the func error and ignore further io:Error. It would be nice to return both errors in some way, but I think it wouldn't be ergonomic. Given the choice of the two errors, I think I'd rather see the func error. Fine for me, I'll document the behaviour in the next release. Also, looking at this code, I think I'd expect flush to actually force the background thread to flush before returning instead of requesting a flush in the background. (Although I can't see a use case for flushing when using this crate.) flush() is a required trait method of io::Write, and I prefer to mimic the behaviour of io::Write as closely as possible. The flush call will be in the same order relative to write calls. If flushing after all writes are done, it is guaranteed that no further writes will be done after flushing in the background thread. io::BufWriter will flush automatically when it goes out of scope. Which would be after func returns for writer_init and after the provided writer goes out of scope for writer. Regarding fs::File, I just found this issue <rust-lang/rust#32255>. I haven't read it through completely, not sure if e.g. sync_all and flush are equivalent, and in which case one of them may be needed to be absolutely sure that all data is written. sync_all can be called in the finish closure of writer_finish or writer_init_finish. But unless I miss something, I don't think that it is up to thread_io to provide such functionality. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#2 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGRS7NAP3S3S2DYED267JDSEPNSFANCNFSM4QVCCYBA> .

markschl · 2020-09-06T21:00:07Z

I plan to send you a pull request sometime this week for the error handling.

But that's an entirely distinct issue from the error handling, and I'm not even using your writer (at least not yet), so I'll leave it at that.

True

I plan to send you a pull request sometime this week for the error handling.

Thank you!

Both read and write did not handle errors properly. In particular, errors happening in either the functions passed into thread_io by the client caused unexpected panics or hangs. Added a bunch of tests to handle the various conditions, and improved some existing tests. Ensured as much as possible that the most informative errors is returned in all cases.

Improved error handling #2

markschl · 2020-09-14T18:43:02Z

Thanks for the PR to fix this issue, especially for the very nice handling of panics.

I have added some documentation on error handling. I also followed your advice on flushing at the end, I agree that it definitely makes sense.

There should be a 0.3 release soon :)

markschl added a commit that referenced this issue Sep 14, 2020

Merge pull request #3 from winstonewert/master

af912ad

Improved error handling #2

markschl closed this as completed Sep 14, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Incorrect handling of errors #2

Incorrect handling of errors #2

winstonewert commented Sep 3, 2020

markschl commented Sep 3, 2020 •

edited

Loading

Uh oh!

winstonewert commented Sep 3, 2020 via email

Uh oh!

markschl commented Sep 4, 2020

Uh oh!

winstonewert commented Sep 4, 2020 via email

Uh oh!

markschl commented Sep 4, 2020

Uh oh!

winstonewert commented Sep 5, 2020

Uh oh!

markschl commented Sep 5, 2020

Uh oh!

winstonewert commented Sep 5, 2020

Uh oh!

markschl commented Sep 6, 2020

Uh oh!

markschl commented Sep 6, 2020 •

edited

Loading

Uh oh!

winstonewert commented Sep 6, 2020 via email

Uh oh!

markschl commented Sep 6, 2020

Uh oh!

markschl commented Sep 14, 2020

Uh oh!

Incorrect handling of errors #2

Incorrect handling of errors #2

Comments

winstonewert commented Sep 3, 2020

markschl commented Sep 3, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

winstonewert commented Sep 3, 2020 via email

Uh oh!

markschl commented Sep 4, 2020

Uh oh!

winstonewert commented Sep 4, 2020 via email

Uh oh!

markschl commented Sep 4, 2020

Uh oh!

winstonewert commented Sep 5, 2020

Uh oh!

markschl commented Sep 5, 2020

Uh oh!

winstonewert commented Sep 5, 2020

Uh oh!

markschl commented Sep 6, 2020

Uh oh!

markschl commented Sep 6, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

winstonewert commented Sep 6, 2020 via email

Uh oh!

markschl commented Sep 6, 2020

Uh oh!

markschl commented Sep 14, 2020

Uh oh!

markschl commented Sep 3, 2020 •

edited

Loading

markschl commented Sep 6, 2020 •

edited

Loading