`matches!(n, -1 | 1)` for signed NonZero suboptimal #84311
Labels:
A-codegen (Area: Code generation)
C-bug (Category: This is a bug.)
E-needs-test (Call for participation: An issue has been fixed and does not reproduce, but no test has been added.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
The code generated when comparing a signed non-zero type to `-1` or `1` varies wildly, and appears to generate suboptimal branching code. A more in-depth analysis of the generated assembly can be found here: https://godbolt.org/z/qxzGrv9nn.
Basic
Looking at these two functions, one would expect them to generate the same assembly.
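A minimal sketch of what such a pair might look like, assuming `is_one_1` uses the `matches!` form from the title and `is_one_2` spells the check out as two equality comparisons. The function bodies are assumptions for illustration, not necessarily the exact code from the issue; only the names `is_one_1`, `is_one_2`, `len`, and the `NonZeroI32` type come from the text.

```rust
use std::num::NonZeroI32;

// Assumed shape: the `matches!` form named in the issue title.
pub fn is_one_1(len: NonZeroI32) -> bool {
    matches!(len.get(), -1 | 1)
}

// Assumed shape: the same check written as two explicit comparisons.
pub fn is_one_2(len: NonZeroI32) -> bool {
    len.get() == -1 || len.get() == 1
}
```

Both functions return `true` exactly when the value is `-1` or `1`, so any difference in the generated assembly comes from how the check is lowered and not from its meaning.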
However, that is not the case, with `is_one_1` in fact branching as seen below. Whilst I have not run benchmarks to see if branching is more performant, one would expect it not to be.
What if we include `0` in the comparison?
Since `len` is a `NonZeroI32`, it is guaranteed that `len.get() != 0`; otherwise it is undefined behaviour. This means that the function below should behave the same as `is_one_1`.
In fact the generated assembly for this is the smallest yet.
This assembly is actually also generated when we include `0` in the comparison in the style of `is_one_2`.
Expected behaviour
The expected behaviour of the compiler would be to generate assembly that matches between the different implementations of this check.
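The variants discussed above are interchangeable for any non-zero input, which is why matching assembly seems a reasonable expectation. A minimal sketch of a check for that equivalence follows. The function names `via_matches`, `via_eq`, `via_matches_with_zero`, and `variants_agree` are hypothetical and chosen for illustration. The check covers behaviour only and says nothing about the generated code.

```rust
use std::num::NonZeroI32;

// Hypothetical variant: `matches!` on -1 and 1.
pub fn via_matches(len: NonZeroI32) -> bool {
    matches!(len.get(), -1 | 1)
}

// Hypothetical variant: two explicit equality comparisons.
pub fn via_eq(len: NonZeroI32) -> bool {
    len.get() == -1 || len.get() == 1
}

// Hypothetical variant: 0 is included in the pattern. A `NonZeroI32`
// can never hold 0, so this arm is unreachable in practice.
pub fn via_matches_with_zero(len: NonZeroI32) -> bool {
    matches!(len.get(), -1 | 0 | 1)
}

/// Returns `true` if all three variants agree on every non-zero value
/// produced by `values`. Zero is skipped because it has no `NonZeroI32`
/// representation.
pub fn variants_agree(values: impl IntoIterator<Item = i32>) -> bool {
    values
        .into_iter()
        .filter_map(NonZeroI32::new)
        .all(|n| {
            let expected = via_matches(n);
            via_eq(n) == expected && via_matches_with_zero(n) == expected
        })
}
```

Calling `variants_agree(-1000..=1000)` exercises the values around zero where the variants could plausibly differ. A codegen regression test would additionally need to inspect the emitted assembly or LLVM IR.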