Use simple io::Error for &[u8] Read and Write impl #42156

arthurprs · 2017-05-22T18:01:05Z

I'm not sure this change is gonna fly because of the worse error message (see end of post), but here's the case.

I was checking https://www.reddit.com/r/rust/comments/6cjs7u/analyzing_the_performance_of_a_hash/ and I pinpointed the cause of slowness to the io::read_exact implementation that byteorder read_u32 calls.
It turns out that the optimizer isn't smart enough to avoid the creation of the result altogether and creates a relatively costly io::Error just to discard it after (edit: reason being that creating an io::Error involves a function call and the optimizer can't blindly remove that side effect).

I played a bit with the code and if the code is simple enough (simple enum creation after the inlines) it optimizes out just as well as the version with the branch+unwrap().

Playground link for before: http://play.integer32.com/?gist=04569213693e4862ef1e13a19e408cd5
Playground link for after: http://play.integer32.com/?gist=3755fc64d339c81d8cb01ea41cc0862b

code

fn write(&mut self, mut bytes: &[u8]) {
    while let Ok(n) = bytes.read_u32::<NE>() {
        self.write_u32(n);
    }

    for byte in bytes {
        let i = *byte;
        self.add_to_hash(i as usize);
    }
}

before

cmpq    $4, %rbx
jb  .LBB0_6
....
.LBB0_6:
leaq    str.0(%rip), %rdi
movl    $27, %esi
callq   _ZN3std5error205_$LT$impl$u20$core..convert..From$LT$$RF$$u27$b$u20$str$GT$$u20$for$u20$alloc..boxed..Box$LT$std..error..Error$u20$$u2b$$u20$core..marker..Send$u20$$u2b$$u20$core..marker..Sync$u20$$u2b$$u20$$u27$a$GT$$GT$4from17hf12a25e4e8752742E@PLT
movq    %rdx, %rcx
leaq    8(%rsp), %rdi
movl    $17, %esi
movq    %rax, %rdx
callq   _ZN3std2io5error5Error4_new17h3674736498de977fE@PLT
cmpb    $2, 8(%rsp)
jb  .LBB0_11

after

cmpq	$4, %rsi
jb	.LBB0_6
...
.LBB0_6:
testq	%rsi, %rsi
je	.LBB0_7

Error message changes:

println!("display -> {}", read_u32(&mut bytes).unwrap_err());
println!("debug -> {:?}", read_u32(&mut bytes).unwrap_err());

before:
display -> failed to write whole buffer
debug -> Error { repr: Custom(Custom { kind: WriteZero, error: StringError("failed to write whole buffer") }) }

after:
display -> write zero
debug -> Error { repr: Kind(WriteZero) }

sfackler · 2017-05-22T18:03:28Z

This seems pretty reasonable to me I think.

@rust-lang/libs Does anyone feel like the new errors will be too confusing?

alexcrichton · 2017-05-22T18:26:09Z

Nah this seems reasonable to me, especially because they're just an implementation on slices and not for the whole trait

nagisa · 2017-05-22T18:34:08Z

I’m not following the reasoning. You say:

It turns out that the optimizer isn't smart enough to avoid the creation of the result altogether and creates a relatively costly io::Error just to discard it after.

AFAICT it only creates the Error in the failure case. That is, the Error::new() will not be called if the function is not returning the error. I can see more cleanups in the old version, but the exceptions should be “free” in terms of execution speed. The stack usage between the versions is also the same.

… I guess what you describe does happen ONCE, when the actual failure case happens and the function returns the Error (the inline(never) thus 100% preventing from optimisation taking place), but then I’m doubting that this is the right way to fix the problem?

arthurprs · 2017-05-22T18:44:41Z

The sample code uses that to stop the loop. So yes, it happens at least once. Creating the common io::Error though involves a function call which the optimizer can't just remove (side effects, etc.) so stopping the loop is really expensive compared to the rest of the function.

It's probably fair to accept this optimization as it's reading over the simplest readable type in the language.

nagisa · 2017-05-22T18:58:19Z

See, my point here is not about it being a bad optimisation to make, but about whether it is the right way to optimise this. Any io::Error that has Repr::Custom in it is the target of such inefficiency. This PR fixes the problem for &[u8] (great!), then somebody comes along, puts the &[u8] into Cursor<T> and complains about exactly the same problem (ewww?).

This could also be fixed in general by fixing #24194 + a targeted #[inline] on a relevant constructor function.

arthurprs · 2017-05-22T19:59:33Z

I confirmed that it could work if that issue is fixed [1]. We would need to mark the str/String Into<Box>> inline accordingly. Can we rely on that?

[1] choose stable&release in http://play.integer32.com/?gist=dd721e339a507e560016b8baee698a60 doesn't work on nightly.

carols10cents · 2017-05-29T14:00:44Z

I confirmed that it could work if that issue is fixed [1]. We would need to mark the str/String Into> inline accordingly. Can we rely on that?

@nagisa do you have an answer for @arthurprs's question here?

arthurprs · 2017-05-29T15:03:30Z

To be clearer, can we rely on llvm to optimize out the error path allocations? Maybe it considers it might have side effects or something (like normal functions).

nagisa · 2017-05-29T19:41:16Z

I general you can’t rely on optimisations in any way at all. If you need some particular sequence of instructions, write them down in assembly.

That being said, getting rid of the allocations here by constructing a different variant of the error enumeration and getting rid of the allocations here by a carefully placed inline hint are pretty much equivalent as far as reliability is concerned IMO.

arthurprs · 2017-05-29T20:39:20Z

Fair enough. I suggesting using the simple io::Error then, it was created for this exact purpose.

arthurprs · 2017-05-29T20:47:05Z

Ahh, maybe not... it's too specific. I'll probably close and we can revisit once #24194 is resolved.

sfackler added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 22, 2017

Use simple io::Error for &[u8] Read and Write impl

4ee0320

arthurprs force-pushed the io_error branch from c1b1a55 to 4ee0320 Compare May 22, 2017 18:49

arielb1 added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label May 23, 2017

arthurprs closed this May 31, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use simple io::Error for &[u8] Read and Write impl #42156

Use simple io::Error for &[u8] Read and Write impl #42156

arthurprs commented May 22, 2017 •

edited

Loading

sfackler commented May 22, 2017

alexcrichton commented May 22, 2017

nagisa commented May 22, 2017 •

edited

Loading

arthurprs commented May 22, 2017 •

edited

Loading

nagisa commented May 22, 2017

arthurprs commented May 22, 2017

carols10cents commented May 29, 2017

arthurprs commented May 29, 2017

nagisa commented May 29, 2017

arthurprs commented May 29, 2017

arthurprs commented May 29, 2017

Use simple io::Error for &[u8] Read and Write impl #42156

Use simple io::Error for &[u8] Read and Write impl #42156

Conversation

arthurprs commented May 22, 2017 • edited Loading

sfackler commented May 22, 2017

alexcrichton commented May 22, 2017

nagisa commented May 22, 2017 • edited Loading

arthurprs commented May 22, 2017 • edited Loading

nagisa commented May 22, 2017

arthurprs commented May 22, 2017

carols10cents commented May 29, 2017

arthurprs commented May 29, 2017

nagisa commented May 29, 2017

arthurprs commented May 29, 2017

arthurprs commented May 29, 2017

arthurprs commented May 22, 2017 •

edited

Loading

nagisa commented May 22, 2017 •

edited

Loading

arthurprs commented May 22, 2017 •

edited

Loading