[llvm9 regression] i586-unknown-linux-gnu does not generate movmskps anymore #794

Closed
gnzlbg opened this issue Aug 2, 2019 · 5 comments · Fixed by #1353

@gnzlbg
Contributor

gnzlbg commented Aug 2, 2019

This looks like a regression from the LLVM 9 update, @nikic. With LLVM 8, the _mm_movemask_ps intrinsic compiled to a single movmskps instruction on the i586-unknown-linux-gnu target. It now compiles to a scalar sequence, and the assert_instr test fails:

---- core_arch::x86::sse::assert__mm_movemask_ps_movmskps stdout ----
disassembly for stdarch_test_shim__mm_movemask_ps_movmskps: 
	 0: sub $0x1c,%esp
	 1: movaps %xmm0,(%esp)
	 2: mov $0x811698e,%eax
	 3: mov %eax,0x81cd008
	 4: cmpl $0x0,(%esp)
	 5: sets %al
	 6: cmpl $0x0,0x4(%esp)
	 7: sets %cl
	 8: add %cl,%cl
	 9: or %al,%cl
	10: cmpl $0x0,0x8(%esp)
	11: sets %al
	12: cmpl $0x0,0xc(%esp)
	13: sets %dl
	14: add %dl,%dl
	15: or %al,%dl
	16: shl $0x2,%dl
	17: or %cl,%dl
	18: movzbl %dl,%eax
	19: add $0x1c,%esp
	20: ret
	21: xchg %ax,%ax
	22: xchg %ax,%ax
	23: xchg %ax,%ax
	24: xchg %ax,%ax
	25: xchg %ax,%ax
	26: xchg %ax,%ax
thread 'core_arch::x86::sse::assert__mm_movemask_ps_movmskps' panicked at 'failed to find instruction `movmskps` in the disassembly', crates/stdarch-test/src/lib.rs:152:9
@gnzlbg
Contributor Author

gnzlbg commented Aug 2, 2019

Rust nightly MWE: https://godbolt.org/z/fA5UNl

Using the generated LLVM-IR: https://godbolt.org/z/LU7i9q

source_filename = "example.3a1fbbbh-cgu.0"
target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i586-unknown-linux-gnu"

define i32 @_ZN7example15_mm_movemask_ps17h7f545271133b7928E(<4 x float>* noalias nocapture readonly dereferenceable(16) %a) unnamed_addr #0 {
start:
  %0 = bitcast <4 x float>* %a to <4 x i32>*
  %1 = load <4 x i32>, <4 x i32>* %0, align 16
  %2 = icmp slt <4 x i32> %1, zeroinitializer
  %3 = bitcast <4 x i1> %2 to i4
  %4 = zext i4 %3 to i32
  ret i32 %4
}

attributes #0 = { norecurse nounwind nonlazybind readonly "probe-stack"="__rust_probestack" "target-cpu"="pentium" "target-features"="+sse" }

!llvm.module.flags = !{!0}

!0 = !{i32 2, !"RtLibUseGOT", i32 1}

I cannot generate the old machine code from this IR with any llc version. So this may be unrelated to the LLVM 9 upgrade; perhaps we changed something about the i586-unknown-linux-gnu target?
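
For readers less familiar with LLVM IR, here is a rough scalar model in Rust of what the IR above computes. It is only a sketch for illustration: the name `movemask_ps_model` is made up here and is not part of stdarch, and it assumes the little-endian lane order used on x86, where lane 0 maps to bit 0.

```rust
/// Hypothetical scalar model of the LLVM 9 IR above (not stdarch code).
/// Bit `i` of the result is set when lane `i` has its sign bit set.
fn movemask_ps_model(lanes: [f32; 4]) -> i32 {
    let mut mask = 0i32;
    for (i, lane) in lanes.iter().enumerate() {
        // Mirrors `icmp slt <4 x i32> %1, zeroinitializer`: reinterpret the
        // float bits as a signed integer and test whether it is negative.
        let negative = (lane.to_bits() as i32) < 0;
        // Mirrors `bitcast <4 x i1> ... to i4` followed by `zext ... to i32`:
        // lane i lands in bit i, and the upper bits stay zero.
        mask |= (negative as i32) << i;
    }
    mask
}

fn main() {
    // Lanes 0 and 2 have the sign bit set (note that -0.0 counts as well).
    println!("{:#b}", movemask_ps_model([-1.0, 2.0, -0.0, 3.0]));
}
```

The disassembly in the first comment follows the same shape: four `cmpl $0x0` / `sets` pairs, one per lane, combined with shifts and ORs.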

@gnzlbg
Contributor Author

gnzlbg commented Aug 2, 2019

Using Rust 1.36.0: https://godbolt.org/z/k22p47

does generate

example::_mm_movemask_ps:
        mov     eax, dword ptr [esp + 4]
        movaps  xmm0, xmmword ptr [eax]
        movmskps        eax, xmm0
        ret
.Lfunc_end0:

and the LLVM-IR generated: https://godbolt.org/z/oMW4OK

is

source_filename = "example.3a1fbbbh-cgu.0"
target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i586-unknown-linux-gnu"

define i32 @_ZN7example15_mm_movemask_ps17h9d7ca884d8f840c4E(<4 x float>* noalias nocapture readonly dereferenceable(16) %a) unnamed_addr #0 {
start:
  %0 = load <4 x float>, <4 x float>* %a, align 16
  %1 = tail call i32 @llvm.x86.sse.movmsk.ps(<4 x float> %0) #2
  ret i32 %1
}

declare i32 @llvm.x86.sse.movmsk.ps(<4 x float>) unnamed_addr #1

attributes #0 = { nounwind nonlazybind readonly "probe-stack"="__rust_probestack" "target-cpu"="pentium" "target-features"="+sse" }
attributes #1 = { nounwind readnone }
attributes #2 = { nounwind }

!llvm.module.flags = !{!0}

!0 = !{i32 2, !"RtLibUseGOT", i32 1}

Notice that the @llvm.x86.sse.movmsk.ps call is left intact here, whereas with LLVM 9 it is replaced by the icmp slt / bitcast / zext sequence shown in the previous comment.

@gnzlbg
Contributor Author

gnzlbg commented Aug 2, 2019

Found the bug. The bug is in opt: https://godbolt.org/z/vVVDFd

@gnzlbg
Contributor Author

gnzlbg commented Aug 2, 2019

Opened LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=42870

@Nugine
Contributor

Nugine commented Nov 14, 2022

The LLVM bug was resolved on 2020-03-24.

I found this issue through the following code, which still carries the workaround:

#[inline]
#[target_feature(enable = "sse")]
// FIXME: LLVM9 trunk has the following bug:
// https://github.com/rust-lang/stdarch/issues/794
// so we only temporarily test this on i686 and x86_64 but not on i586:
#[cfg_attr(all(test, target_feature = "sse2"), assert_instr(movmskps))]
#[stable(feature = "simd_x86", since = "1.27.0")]
pub unsafe fn _mm_movemask_ps(a: __m128) -> i32 {
    movmskps(a)
}
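
As a usage illustration, the sketch below calls the public `_mm_movemask_ps` intrinsic through `std::arch` and returns the resulting mask. The wrapper name `movemask_sse` is hypothetical, and the sketch assumes an x86 or x86_64 host with SSE available at runtime. It checks only the returned value; it says nothing about which instruction is emitted, which is what `assert_instr` verifies in the stdarch test suite.

```rust
/// Hypothetical wrapper (not part of stdarch): returns the sign-bit mask of
/// `lanes` via `_mm_movemask_ps`, or `None` when SSE is not available.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn movemask_sse(lanes: [f32; 4]) -> Option<i32> {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::{_mm_loadu_ps, _mm_movemask_ps};
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::{_mm_loadu_ps, _mm_movemask_ps};

    if !is_x86_feature_detected!("sse") {
        return None;
    }
    // SAFETY: SSE support was checked above, and `_mm_loadu_ps` reads four
    // f32 values from the pointer without any alignment requirement.
    Some(unsafe { _mm_movemask_ps(_mm_loadu_ps(lanes.as_ptr())) })
}

/// Fallback for non-x86 hosts so the sketch still compiles there.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn movemask_sse(_lanes: [f32; 4]) -> Option<i32> {
    None
}

fn main() {
    match movemask_sse([-1.0, 2.0, -0.0, 3.0]) {
        Some(mask) => println!("mask = {:#b}", mask),
        None => println!("SSE not available on this host"),
    }
}
```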

Amanieu added a commit to Amanieu/stdarch that referenced this issue Nov 15, 2022
Amanieu added a commit that referenced this issue Nov 16, 2022