Improve `llvm.ucmp.iN.i1` codegen (specifically the 1-bit inputs case) #129401
Comments
The arguments of Does these rules apply to the second approach? |
I wrote this specifically about when the inputs are (I suppose this could also be about anything that's |
We can handle this special case in |
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below. |
@llvm/issue-subscribers-good-first-issue Author: None (scottmcm)
(Context: I was making a rustc PR and accidentally regressed `bool::cmp` by having it use `llvm.ucmp`, thus this bug that it would be nice if `ucmp` just was smart about it.)
But today they don't codegen the same: <https://llvm.godbolt.org/z/nxWdYhvTo>

define noundef range(i8 -1, 2) i8 @src(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
  %0 = call i8 @llvm.ucmp.i8.i1(i1 %a, i1 %b)
  ret i8 %0
}
define noundef range(i8 -1, 2) i8 @tgt(i1 noundef zeroext %a, i1 noundef zeroext %b) unnamed_addr {
start:
  %aa = zext i1 %a to i8
  %bb = zext i1 %b to i8
  %0 = sub nsw i8 %aa, %bb
  ret i8 %0
}

on x64 gives

src: # @src
  cmp dil, sil
  seta al
  sbb al, 0
  ret
tgt: # @tgt
  mov eax, edi
  sub al, sil
  ret

I don't know if it's better to InstSimplify the |
Hi @scottmcm, I would like to work on this issue. Can you assign it to me? |
@dipeshs809 I have no permissions to assign people here, but your note in chat here is probably sufficient to claim it. |
(Context: I was making a rustc PR and accidentally regressed `bool::cmp` by having it use `llvm.ucmp`, thus this bug that it would be nice if `ucmp` just was smart about it.)

`llvm.ucmp.i8.i1(a, b)` is actually the same as just `zext(a) - zext(b)`: https://alive2.llvm.org/ce/z/oHq3bh

But today they don't codegen the same on x64: https://llvm.godbolt.org/z/nxWdYhvTo

I don't know if it's better to InstSimplify the `ucmp` to `sext(b) + zext(a)` or to improve the codegen for the `i1` case, but either way, it'd be nice if the intrinsic worked optimally for `i1` in addition to the wider widths.

EDIT: based on comments below, sounds like it'd be better to have the codegen special-case this, rather than optimize it away in the middle-end.
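
Since 1-bit inputs have only four combinations, the equivalence between `ucmp`, `zext(a) - zext(b)`, and `sext(b) + zext(a)` can be checked exhaustively. The following is a minimal Rust sketch of that check, not LLVM code: the function names `ucmp_i1`, `zext_sub`, `sext_add`, and `all_forms_agree` are hypothetical, and `bool::cmp` is used here as a stand-in model for the intrinsic's three-way unsigned comparison.

```rust
use std::cmp::Ordering;

/// Model of `llvm.ucmp.i8.i1` (an assumption for illustration):
/// a three-way unsigned compare that yields -1, 0, or 1.
fn ucmp_i1(a: bool, b: bool) -> i8 {
    // `Ordering` has discriminants Less = -1, Equal = 0, Greater = 1.
    let ord: Ordering = a.cmp(&b);
    ord as i8
}

/// `zext(a) - zext(b)`: zero-extend both bits to i8, then subtract.
fn zext_sub(a: bool, b: bool) -> i8 {
    (a as i8) - (b as i8)
}

/// `sext(b) + zext(a)`: sign-extending an i1 gives 0 or -1.
fn sext_add(a: bool, b: bool) -> i8 {
    -(b as i8) + (a as i8)
}

/// Checks all four input combinations.
fn all_forms_agree() -> bool {
    [false, true].iter().all(|&a| {
        [false, true].iter().all(|&b| {
            let expected = ucmp_i1(a, b);
            zext_sub(a, b) == expected && sext_add(a, b) == expected
        })
    })
}

fn main() {
    for a in [false, true] {
        for b in [false, true] {
            println!(
                "a={} b={} ucmp={} zext_sub={} sext_add={}",
                a,
                b,
                ucmp_i1(a, b),
                zext_sub(a, b),
                sext_add(a, b)
            );
        }
    }
    assert!(all_forms_agree());
}
```

This only demonstrates that the three forms agree on values; it says nothing about which form a backend lowers to better machine code, which is the actual subject of the issue.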