Optimize initialization of arrays using repeat expressions #43488

Florob · 2017-07-26T14:46:00Z

This PR was inspired by this thread on Reddit.
It tries to bring array initialization in the same ballpark as Vec::from_elem() for unoptimized builds.
For optimized builds this should relieve LLVM of having to figure out that the construct we generate is in fact a memset().

To that end this emits llvm.memset() when:

* the array is of integer type and all elements are zero (Vec::from_elem() also explicitly optimizes for this case)
* the array elements are byte sized

If the array is zero-sized, initialization is omitted entirely.
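As a rough illustration of the cases listed above, the repeat expressions below correspond to each one. This is a sketch and not code from the PR's diff or test file: the function names are hypothetical, and whether a particular compiler version lowers each of them to llvm.memset() (or to nothing) is an assumption based on the description.

```rust
// Hypothetical examples of the repeat expressions the description refers to.
// The names are illustrative only; they are not taken from the PR.

/// Integer array with every element zero: the first memset case.
pub fn zeroed_integers() -> [u32; 4] {
    [0u32; 4]
}

/// Byte-sized elements with a non-zero fill value: the second memset case.
pub fn filled_bytes() -> [u8; 4] {
    [7u8; 4]
}

/// Zero-sized element type: the array occupies no memory,
/// so there is nothing to initialize.
pub fn zero_sized_elements() -> [(); 4] {
    [(); 4]
}

/// Zero-length array: likewise nothing to initialize.
pub fn zero_length() -> [u32; 0] {
    [0u32; 0]
}
```

The observable values are the same whichever lowering is used; only the generated LLVM IR differs.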

This elides initialization for zero-sized arrays:

* for zero-sized elements we previously emitted an empty loop
* for arrays with a length of zero we previously emitted a loop with zero iterations

This emits llvm.memset() instead of a loop over each element when:

* all elements are zero integers
* elements are byte sized

rust-highfive · 2017-07-26T14:46:06Z

r? @arielb1

(rust_highfive has picked a reviewer for you, use r? to override)

carols10cents · 2017-07-31T14:41:55Z

friendly ping @arielb1! i think you were on vacation but i think you're back now? checking on IRC too

arielb1 · 2017-08-01T09:41:42Z

I'm back. Wait I thought this was WIP

Florob · 2017-08-01T11:39:31Z

@arielb1 I'm not sure how to take that comment. What made you think this was WIP?
It bootstraps, passes the test suite, and adds a new passing codegen test.
It might still be completely bogus, because I've never really worked on rustc, but I guess that is for you to determine.

arielb1 · 2017-08-01T19:11:37Z

I just had it mixed with another PR. Am reviewing your PR now.

arielb1 · 2017-08-01T19:15:07Z

It would be nice if we had MIRI to deal with more complicated cases like None::<SomethingBig>, but I see no problem with this PR.

@bors r+

bors · 2017-08-01T19:15:08Z

📌 Commit ac43d58 has been approved by arielb1

arielb1 · 2017-08-01T19:17:30Z

Actually, I think the second case could be implemented in a nicer way

@bors r-

arielb1 · 2017-08-01T19:19:52Z

src/librustc_trans/mir/rvalue.rs

+                // Use llvm.memset.p0i8.* to initialize byte arrays
+                let elem_layout = bcx.ccx.layout_of(tr_elem.ty).layout;
+                match *elem_layout {
+                    Layout::Scalar { value: Primitive::Int(layout::I8), .. } |


This "knows" that all scalars are immediates of LLVM type i8. I'm not sure that is always true, and it might break in the future. Can you move this to the previous if with a check that val_ty(v) == Type::i8(ccx)?

Do you have similar concerns for the CEnum case?
Also I'd appreciate thoughts on how to avoid duplicating the call_memset() setup code in the process.

It's better to merge them and just check the LLVM type. Also use from_immediate to catch booleans too.

arielb1 · 2017-08-01T19:21:31Z

r+ with the if-cases merged. Nice optimization - we could make it more general with MIRI, but that's somewhat in the future.

Florob · 2017-08-01T22:35:16Z

Updated to check the LLVM type. I somehow hadn't realized both cases would be i8, though it is obvious in retrospect.
r? @arielb1

arielb1 · 2017-08-03T16:32:03Z

src/librustc_trans/mir/rvalue.rs

+                    if common::val_ty(v) == Type::i8(bcx.ccx) {
+                        let align = align.unwrap_or_else(|| bcx.ccx.align_of(tr_elem.ty));
+                        let align = C_i32(bcx.ccx, align as i32);
+                        let fill = tr_elem.immediate();


use v here rather than calling immediate again.

arielb1 · 2017-08-03T16:33:56Z

Sorry for being a lazy reviewer. r=me with that nit resolved.

Florob · 2017-08-04T00:28:37Z

Nit fixed. r?

arielb1 · 2017-08-04T12:10:31Z

@bors r+

bors · 2017-08-04T12:10:32Z

📌 Commit 6704450 has been approved by arielb1

bors · 2017-08-04T12:44:57Z

⌛ Testing commit 6704450 with merge 4b300779da21658990584e0d64514a23f0e71d3a...

bors · 2017-08-04T13:44:40Z

💔 Test failed - status-travis

kennytm · 2017-08-04T14:05:13Z

The new test case failed on dist-i586-gnu-i686-musl.

[00:55:24] failures:
[00:55:24] 
[00:55:24] ---- [codegen] codegen/slice-init.rs stdout ----
[00:55:24] 	
[00:55:24] error: verification with 'FileCheck' failed
[00:55:24] status: exit code: 1
[00:55:24] command: /checkout/obj/build/x86_64-unknown-linux-gnu/llvm/build/bin/FileCheck -input-file=/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll /checkout/src/test/codegen/slice-init.rs
[00:55:24] stdout:
[00:55:24] ------------------------------------------
[00:55:24] 
[00:55:24] ------------------------------------------
[00:55:24] stderr:
[00:55:24] ------------------------------------------
[00:55:24] /checkout/src/test/codegen/slice-init.rs:36:12: error: expected string not found in input
[00:55:24]  // CHECK: call void @llvm.memset.p0i8.i{{[0-9]+}}(i8* {{.*}}, i8 7, i64 4
[00:55:24]            ^
[00:55:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll:81:24: note: scanning from here
[00:55:24] define void @byte_array() unnamed_addr #1 {
[00:55:24]                        ^
[00:55:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll:87:2: note: possible intended match here
[00:55:24]  call void @llvm.memset.p0i8.i32(i8* %1, i8 7, i32 4, i32 1, i1 false)
[00:55:24]  ^
[00:55:24] /checkout/src/test/codegen/slice-init.rs:52:12: error: expected string not found in input
[00:55:24]  // CHECK: call void @llvm.memset.p0i8.i{{[0-9]+}}(i8* {{.*}}, i8 {{.*}}, i64 4
[00:55:24]            ^
[00:55:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll:99:29: note: scanning from here
[00:55:24] define void @byte_enum_array() unnamed_addr #1 {
[00:55:24]                             ^
[00:55:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll:109:2: note: possible intended match here
[00:55:24]  call void @llvm.memset.p0i8.i32(i8* %2, i8 %1, i32 4, i32 1, i1 false)
[00:55:24]  ^
[00:55:24] /checkout/src/test/codegen/slice-init.rs:61:12: error: expected string not found in input
[00:55:24]  // CHECK: call void @llvm.memset.p0i8.i{{[0-9]+}}(i8* {{.*}}, i8 0, i64 16
[00:55:24]            ^
[00:55:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll:122:34: note: scanning from here
[00:55:24] define void @zeroed_integer_array() unnamed_addr #1 {
[00:55:24]                                  ^
[00:55:24] /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/slice-init.ll:129:2: note: possible intended match here
[00:55:24]  call void @llvm.memset.p0i8.i32(i8* %2, i8 0, i32 16, i32 4, i1 false)
[00:55:24]  ^
[00:55:24] 
[00:55:24] ------------------------------------------
[00:55:24] 
[00:55:24] thread '[codegen] codegen/slice-init.rs' panicked at 'explicit panic', /checkout/src/tools/compiletest/src/runtest.rs:2499:8
[00:55:24] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:55:24] 
[00:55:24] 
[00:55:24] failures:
[00:55:24]     [codegen] codegen/slice-init.rs
[00:55:24] 
[00:55:24] test result: FAILED. 43 passed; 1 failed; 3 ignored; 0 measured; 0 filtered out

Florob · 2017-08-04T15:01:12Z

The type of the len argument to llvm.memset.*() varies between architectures. The test case is now generic over this. Sorry for not paying close enough attention to this.
r? @arielb1
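One way to make a FileCheck pattern generic over the length type is to capture the integer width from the intrinsic's name and reuse it for the len argument. The following is a sketch under that assumption, not necessarily the exact change pushed to the branch; WIDTH is a variable name chosen for this example, and the function is simplified (a real codegen test would also need an unmangled symbol name and a CHECK-LABEL line).

```rust
// Sketch of a codegen-test style function with a FileCheck directive in a comment.
// `[[WIDTH:[0-9]+]]` defines a FileCheck variable holding the digits of the
// intrinsic's length-type suffix (for example 32 or 64); `i[[WIDTH]]` then
// requires the `len` argument to have that same integer type.
pub fn byte_array() -> [u8; 4] {
    // CHECK: call void @llvm.memset.p0i8.i[[WIDTH:[0-9]+]](i8* {{.*}}, i8 7, i[[WIDTH]] 4
    [7u8; 4]
}
```

With a pattern like this, both the i32 call seen in the failing log and the i64 call emitted on 64-bit targets would match, while a mismatch between the suffix and the argument type would still be rejected.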

arielb1 · 2017-08-04T22:50:29Z

@bors r+

bors · 2017-08-04T22:50:29Z

📌 Commit 3aa3a5c has been approved by arielb1

bors · 2017-08-05T01:00:14Z

⌛ Testing commit 3aa3a5cf983cf5dfcff23b1485279afd977c8562 with merge fd92186a766ca69fa102f921b1715b8e60b25ef7...

bors · 2017-08-05T01:58:12Z

💔 Test failed - status-travis

Florob · 2017-08-05T02:27:51Z

So it turns out I clearly can't be trusted if I haven't slept properly. Apparently while double checking if I got all instances of llvm.memset(), I still missed one occurrence and completely messed up the commit message m(. Should be okay now. Definitely on x86_64, can't easily test for x86_32.
r? @arielb1 (and maybe pretend I'm an idiot while you're reviewing)

arielb1 · 2017-08-06T08:09:55Z

@bors r+

bors · 2017-08-06T08:09:55Z

📌 Commit 11d6312 has been approved by arielb1

arielb1 · 2017-08-06T08:10:07Z

That happens to everyone. That's why we have bors.

bors · 2017-08-06T08:10:07Z

⌛ Testing commit 11d6312 with merge a9c24fd...

Optimize initialization of arrays using repeat expressions

This PR was inspired by [this thread](https://www.reddit.com/r/rust/comments/6o8ok9/understanding_rust_performances_a_newbie_question/) on Reddit.
It tries to bring array initialization in the same ballpark as `Vec::from_elem()` for unoptimized builds.
For optimized builds this should relieve LLVM of having to figure out the construct we generate is in fact a `memset()`.

To that end this emits `llvm.memset()` when:

* the array is of integer type and all elements are zero (`Vec::from_elem()` also explicitly optimizes for this case)
* the array elements are byte sized

If the array is zero-sized initialization is omitted entirely.

bors · 2017-08-06T10:25:49Z

☀️ Test successful - status-appveyor, status-travis
Approved by: arielb1
Pushing a9c24fd to master...

Use llvm.memset.p0i8.* to initialize all same-bytes arrays

It doesn't affect tests, LLVM seems smart enough for it, but then I wonder why we have the zero case at all (it was introduced in rust-lang#43488, maybe LLVM wasn't smart enough then). So let's run perf to see if there's any build time effect, and if no, I'll remove the zero special case and also run perf.

Use llvm.memset.p0i8.* to initialize all same-bytes arrays

Similar to rust-lang#43488 debug builds can now handle `0x0101_u16` and other multi-byte scalars that have all the same bytes (instead of special casing just `0`)
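The "all the same bytes" condition can be illustrated with a small helper. This is a hypothetical sketch, not the compiler's implementation; `repeated_byte` is a name made up for this example.

```rust
/// Returns the fill byte if every byte in `bytes` is identical, i.e. if the
/// value could in principle be written with a single byte-wise memset.
/// Hypothetical helper for illustration only.
pub fn repeated_byte(bytes: &[u8]) -> Option<u8> {
    // An empty slice has no fill byte, so `?` returns None here.
    let (&first, rest) = bytes.split_first()?;
    if rest.iter().all(|&b| b == first) {
        Some(first)
    } else {
        None
    }
}
```

For example, the bytes of `0x0101_u16` are both `0x01`, so the helper returns `Some(0x01)`, while `0x0102_u16` has two different bytes and yields `None`.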

Florob added 2 commits July 26, 2017 16:23

trans: Reorder basic blocks in slice_for_each

d721c1f

This is mainly for readability of the generated LLVM IR and subsequently assembly. There is a slight positive performance impact, likely due to I-cache effects.

rust-highfive assigned arielb1 Jul 26, 2017

alexcrichton added the S-waiting-on-review label (Status: Awaiting review from the assignee but also interested parties.) Jul 27, 2017

arielb1 reviewed Aug 1, 2017

trans: Check LLVM type instead of Layout

c3603f3

arielb1 reviewed Aug 3, 2017

trans: Reuse immediate value in call to call_memset()

6704450

codegen tests: Check type of len argument to llvm.memset.* based on the exact intrinsic used

11d6312

Florob force-pushed the repeat-opt branch from 3aa3a5c to 11d6312 Compare August 5, 2017 02:19

bors merged commit 11d6312 into rust-lang:master Aug 6, 2017

oli-obk mentioned this pull request Jan 8, 2025

Use llvm.memset.p0i8.* to initialize all same-bytes arrays #135258

Merged
