x.trailing_zeros() > n is not optimized as well as x & ((1 << n) - 1) == 0 on x86
#43024
Comments
It looks like trailing_zeros currently calls an LLVM intrinsic (cttz), which has a patch posted against it (https://reviews.llvm.org/D9284). Perhaps someone familiar with LLVM (cc @arielb1) could push that through review and then change libcore to drop the conditional it has today: https://github.com/rust-lang/rust/blob/master/src/libcore/num/mod.rs#L1375-L1390
@Mark-Simulacrum A variant of that patch was implemented via llvm-mirror/llvm@1886c8e in 2016, so we should definitely have it in all LLVM versions we support and can drop that workaround. However, I don't think this is really related to the issue seen here -- LLVM just doesn't recognize this particular pattern (probably because it would be rather odd in C). Godbolt for reference: https://godbolt.org/z/ovqsCg I'm a bit stumped about the u8 cttz codegen though. If I disable all optimizations, I get:
which directly calls
Yeah, looks like the macro just isn't doing what it's intended to do: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=33539371298f2c808a22ba3cf40db567
Remove u8 cttz hack
This issue has since been fixed in LLVM: llvm-mirror/llvm@1886c8e
Furthermore, this code doesn't actually work, because the 8 literal does not match the $BITS provided from the macro invocation, so effectively this was just dead code.
Ref rust-lang#43024.
What LLVM does is still not ideal for CPUs that only have bsf but not tzcnt; will create a patch for that later.
r? @nagisa
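For readers wondering why the 8 literal never matched: once a token has been captured as an `expr` fragment and forwarded to another `macro_rules!` macro, the inner macro sees an opaque expression and can no longer match it against a literal token. The sketch below reproduces that behaviour in isolation. The macro names `pick` and `forward` are hypothetical and are not the ones used in libcore; only the matching behaviour is the point.

```rust
// Hypothetical stand-in for the inner macro: one arm special-cases the
// literal `8`, the other accepts any expression.
macro_rules! pick {
    ($value:expr, 8) => {
        "special-case arm"
    };
    ($value:expr, $bits:expr) => {
        "generic arm"
    };
}

// Hypothetical stand-in for the outer macro: it captures the bit width as
// an `expr` fragment and forwards it, as an impl-generating macro might.
macro_rules! forward {
    ($bits:expr) => {
        pick!(0u8, $bits)
    };
}

fn main() {
    // Called directly with a literal token, the special-case arm matches.
    assert_eq!(pick!(0u8, 8), "special-case arm");
    // Forwarded through an `expr` capture, the literal pattern no longer
    // matches, so the generic arm is selected and the special case is dead.
    assert_eq!(forward!(8), "generic arm");
    println!("direct: {}, forwarded: {}", pick!(0u8, 8), forward!(8));
}
```

Capturing the width as a `tt` fragment instead of `expr` would keep the literal visible to the inner macro, though in this case the special case was simply removed.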
Partially implemented in https://reviews.llvm.org/D55745. This will handle only
This was fully fixed by https://reviews.llvm.org/D56355, which unfortunately did not make it into the last LLVM update.
There was another LLVM update since then, and this issue is now fully fixed.
The bitshift and bitand version optimizes to better code than the trailing_zeros version, even though I find the trailing_zeros version much more straightforward.
Should this be reported upstream in LLVM, or is this something a MIR pass should do?
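As a minimal sketch of the two forms being compared, the following checks that they agree on a handful of inputs. The function names are made up for illustration and the original snippets from the report are not reproduced here. Note that the mask `(1 << n) - 1` tests the low n bits, which corresponds to `trailing_zeros() >= n`; the strict `> n` comparison would correspond to a mask covering n + 1 bits.

```rust
/// Mask form: true when the low `n` bits of `x` are all zero.
/// Assumes `n < 32` so that the shift does not overflow.
fn low_bits_clear_mask(x: u32, n: u32) -> bool {
    // In Rust, `&` binds tighter than `==`, so this is `(x & mask) == 0`.
    x & ((1u32 << n) - 1) == 0
}

/// trailing_zeros form: the same predicate expressed as a bit count.
fn low_bits_clear_tz(x: u32, n: u32) -> bool {
    x.trailing_zeros() >= n
}

fn main() {
    // Spot-check that both forms agree for every valid shift amount.
    for n in 0..32 {
        for x in [0u32, 1, 2, 8, 12, 16, 0x8000_0000, u32::MAX] {
            assert_eq!(low_bits_clear_mask(x, n), low_bits_clear_tz(x, n));
        }
    }
    println!("both forms agree on the sampled inputs");
}
```

Since the two functions compute the same predicate, any difference in generated code is purely an optimizer matter, which is what the report is about.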