Skip to content

Incorrect handling of errors #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
winstonewert opened this issue Sep 3, 2020 · 13 comments
Closed

Incorrect handling of errors #2

winstonewert opened this issue Sep 3, 2020 · 13 comments

Comments

@winstonewert
Copy link
Contributor

I'm not sure if this project is active, but I've found a problem.

read.rs:258

        let out = func(&mut reader)?;
        reader.done();

If func returns an error, then reader.done() isn't called. This means that the reading thread won't stop running and will block forever waiting for its content to be read.

@markschl
Copy link
Owner

markschl commented Sep 3, 2020

Thanks for reporting, that's indeed very bad. It's a bit strange that the read_fail test didn't bring up this problem. I guess, the following should fix this::

    let result = func(&mut reader);
    reader.done();
    let out = result?;

Could you maybe confirm that this works? Since I'm in holiday, it might take two weeks until I'll be able to take a close look and publish the fix.

@winstonewert
Copy link
Contributor Author

winstonewert commented Sep 3, 2020 via email

@markschl
Copy link
Owner

markschl commented Sep 4, 2020

That's exactly how I fixed it in my local copy. If you'd like I can do a pull request.

A PR would be nice if you have time, thanks. In addition, after rethinking this, I suspect that let out = result?; should go after handle.join().unwrap()?;. Otherwise, remaining data still in the queue when the error occurs will be lost. Theoretically, it's also possible that a reading error happened and is waiting in the queue to be reported. This reading error will then be returned instead of the other error in the main thread.

I think that read_fail misses this because read_fail only tests an error generated by the reading thread at the end of the input. In my case, I have a failure in the processing thread at the start of the output.

You are right. Maybe adding a test for this would be nice.

@winstonewert
Copy link
Contributor Author

winstonewert commented Sep 4, 2020 via email

@markschl
Copy link
Owner

markschl commented Sep 4, 2020

What else could happen but lose the remaining data? The processing has already returned with an error. But, yes, if a reading error happened then we'll want to make sure that gets returned.

Ok, I admit that also returning from the main thread on error without waiting for more data or io::Errors is a valid behaviour, as long as it's documented. If doing so, we would loose some data / errors waiting in the BackgroundReader::full_send queue. On the other hand, if waiting for the background thread to finish with handle.join().unwrap()?; before returning the error (as I proposed in my last comment), there will be reads happening after the error in the main thread occurred because Reader::done() sends a stop signal to the BackgroundReader::empty_recv queue, and the background thread will not stop instantly (delay depends on queuelen). I'm not sure right now, which one is the smarter solution.

Btw, I think the background writer (write.rs) has the same bug. I can address this in another commit, or if you have time, you could also include analogous changes in the PR.

@winstonewert
Copy link
Contributor Author

I think that you should simply throw away any errors that the client code didn't read. My reasons:

  1. The client code may be deliberately not reading to the end of the stream, knowing that it will error out for some reason.
  2. Returning the errors from the background thread creates a race condition. The behavior of the program will be different depending on how far the background thread got.
  3. The client code may be adding context/transforming the io error in some way, which is defeated by trying to return the errors.
  4. Not returning errors that would be later in the file matches the semantics of a typical file more closely.
  5. Your code could be somewhat more generic, because the error type wouldn't need to be From<io::Error>

So, if it were up to me (and its your crate, so its not), i'd delete reader.get_errors()?.

The primary reason I can see to handle those errors is for symmetry with writer. Obviously in the case of the writer you want to capture any errors that might have happened trying to write after processing was finished.

@markschl
Copy link
Owner

markschl commented Sep 5, 2020

Thanks for your reasoning, I agree on points 1-4. It's problematic to do further reads after the error in the main thread occurred, and if an io::Error would be returned (and converted) instead, it would be confusing. The call to Reader::get_errors (and the method itself) can be removed.

Regarding 5., I'm of course open for changes :) How would you propose the reader function signature to be? I think, symmetry to the writer module is not that important.

Here a few thoughts about the write module (it helps me to write them down, since it's not so easy to wrap my head around this, especially in holiday): The get_errors function is important for reporting possible io::Errors, although delayed in relation to the write call in the main thread. When func ends successfully, the last (usually partially filled) buffer is sent to the background for writing (writer.done()?;), and writer.get_errors()?; ensures that errors are reported back. On the other hand, if func ends early with a custom, we don't want to send any more data, writer.done()?; shouldn't be called. However, the writer in the background thread is still writing what is in the queue, and these writes may fail. These failures can happen both before and after the error in the main thread occurred, we cannot just stop the writer at this exact time point. If an error happened in both the main and the background writer thread, would we need to report both to be totally clear? I'm not sure right now how to make an ergonomic API with such a behaviour. But if so, we could separate the two error types like in the reader.

@winstonewert
Copy link
Contributor Author

I think we can just drop the + From<io::Error> from reader and reader_init because without get_errors(), there's no need to convert io::Error into E.

On writing, I think the best option is to return the func error and ignore further io:Error. It would be nice to return both errors in some way, but I think it wouldn't be ergonomic. Given the choice of the two errors, I think I'd rather see the func error.

Also, looking at this code, I think I'd expect flush to actually force the background thread to flush before returning instead of requesting a flush in the background. (Although I can't see a use case for flushing when using this crate.)

@markschl
Copy link
Owner

markschl commented Sep 6, 2020

I think we can just drop the + From<io::Error> from reader and reader_init because without get_errors(), there's no need to convert io::Error into E.

Ah right, I don't know why I missed that this is simply possible. I'm a bit distracted right now.

On writing, I think the best option is to return the func error and ignore further io:Error. It would be nice to return both errors in some way, but I think it wouldn't be ergonomic. Given the choice of the two errors, I think I'd rather see the func error.

Fine for me, I'll document the behaviour in the next release.

Also, looking at this code, I think I'd expect flush to actually force the background thread to flush before returning instead of requesting a flush in the background. (Although I can't see a use case for flushing when using this crate.)

flush() is a required trait method of io::Write, and I prefer to mimic the behaviour of io::Write as closely as possible. The flush call will be in the same order relative to write calls. If flushing after all writes are done, it is guaranteed that no further writes will be done after flushing in the background thread. io::BufWriter will flush automatically when it goes out of scope. Which would be after func returns for writer_init and after the provided writer goes out of scope for writer.

Regarding fs::File, I just found this issue. I haven't read it through completely, not sure if e.g. sync_all and flush are equivalent, and in which case one of them may be needed to be absolutely sure that all data is written. sync_all can be called in the finish closure of writer_finish or writer_init_finish. But unless I miss something, I don't think that it is up to thread_io to provide such functionality.

@markschl
Copy link
Owner

markschl commented Sep 6, 2020

If flushing after all writes are done, it is guaranteed that no further writes will be done after flushing in the background thread

Correcting my own statement, there can be a further write even if the flush call is last. So, if wanting to flush before the handle is closed, it would be necessary to do this in the finish closure. Still, I'm not sure if this should be automatically done (always).

@winstonewert
Copy link
Contributor Author

winstonewert commented Sep 6, 2020 via email

@markschl
Copy link
Owner

markschl commented Sep 6, 2020

I plan to send you a pull request sometime this week for the error handling.

But that's an entirely distinct issue from the error handling, and I'm not even using your writer (at least not yet), so I'll leave it at that.

True

I plan to send you a pull request sometime this week for the error handling.

Thank you!

winstonewert added a commit to winstonewert/thread_io that referenced this issue Sep 8, 2020
Both read and write did not handle errors properly. In particular,
errors happening in either the functions passed into thread_io by the
client caused unexpected panics or hangs.

Added a bunch of tests to handle the various conditions, and improved
some existing tests. Ensured as much as possible that the most
informative errors is returned in all cases.
markschl added a commit that referenced this issue Sep 14, 2020
@markschl
Copy link
Owner

Thanks for the PR to fix this issue, especially for the very nice handling of panics.

I have added some documentation on error handling. I also followed your advice on flushing at the end, I agree that it definitely makes sense.

There should be a 0.3 release soon :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants