[MMX] POSTRAScheduler disarrange emms and mmx instruction #35330
Comments
Oops, I mixed up nice-194.s and nice-195.s. So it should be:
|
Simon, do we just need to mark EMMS/FEMMS as defining the MMX registers? That seems to at least fix this case. Maybe mark it as using them too? |
Yes, and any x87 registers as well. I was considering just making them terminators to prevent any crossover - even though that'd affect all instructions not just x87/mmx. |
I don't think either of those is sufficient, given the definitions would be dead (so you're effectively just clobbering the registers). The x86-64 ABI says "The CPU shall be in x87 mode upon entry to a function." i.e. the x87 tag word should be set to all ones. So the right way to handle this is to explicitly model the tag word as a register: MMX instructions clobber it, emms defines it, and call/return instructions use it. There are also other potential "scheduling" problems: currently, MMX intrinsics are marked IntrNoMem, so an IR transform could sink an MMX instruction past an EMMS. But that's sort of orthogonal to the MachineInstr modeling. |
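To make the tag-word proposal above concrete, here is a minimal toy sketch of how register defs and uses create ordering constraints for a scheduler. It is not LLVM's MachineInstr API: the `Instr` class, the `must_stay_ordered` helper, and the register name `FPTW` are all hypothetical names chosen for illustration, and the model only checks pairwise register dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: a name plus the registers it writes and reads."""
    name: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()


def must_stay_ordered(first: Instr, second: Instr) -> bool:
    """Return True if a register dependency forbids swapping the pair."""
    raw = first.defs & second.uses  # read after write
    war = first.uses & second.defs  # write after read
    waw = first.defs & second.defs  # write after write
    return bool(raw or war or waw)


MMX_REGS = frozenset(f"MM{i}" for i in range(8))
TAG = "FPTW"  # hypothetical register standing in for the x87 tag word

# 1) No modeling: emms touches no registers, so nothing pins it in place.
paddb = Instr("paddb mm0, mm1", defs=frozenset({"MM0"}),
              uses=frozenset({"MM0", "MM1"}))
emms_plain = Instr("emms")
ret_plain = Instr("ret")

# 2) Clobber-only modeling: emms writes MM0-7, which orders it against
#    MMX instructions but still leaves it unrelated to the return.
emms_clobber = Instr("emms", defs=MMX_REGS)

# 3) Tag-word modeling: MMX instructions clobber the tag word, emms
#    defines it, and the return uses it.
paddb_tw = Instr("paddb mm0, mm1", defs=frozenset({"MM0", TAG}),
                 uses=frozenset({"MM0", "MM1"}))
emms_tw = Instr("emms", defs=MMX_REGS | {TAG})
ret_tw = Instr("ret", uses=frozenset({TAG}))

if __name__ == "__main__":
    print(must_stay_ordered(paddb, emms_plain))        # unmodeled emms
    print(must_stay_ordered(paddb, emms_clobber))      # clobber-only emms
    print(must_stay_ordered(emms_clobber, ret_plain))  # emms vs. return
    print(must_stay_ordered(emms_tw, ret_tw))          # tag word modeled
```

In this toy model, the clobber-only variant orders emms against MMX instructions but records no relationship between emms and the return, whereas the tag-word variant gives the return a use that only emms satisfies.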
Is this a duplicate of #15760? It does look suspiciously similar. I believe this bug causes undefined behavior in Rust code using MMX registers. We have had a lot of recurring bugs on Linux, Windows, and macOS targets (mostly on 32-bit targets) over the last two years where suddenly some floating-point tests would intermittently fail on some systems, and a couple of compiler builds later the failures would disappear, just to reappear some time later. We have started to manually emit Frontends using the floating-point LLVM intrinsics should have to use inline assembly to avoid undefined behavior. |
A partial fix to clobber MM0-7 and ST0-7 on emms/femms was committed in r352642. This prevents the postRA scheduler from affecting the test case here. This is not a complete fix and it can break in other ways. |
The fix committed in r352642 was reverted in r352782. It was recommitted in r353016. Hopefully it will stick this time. |
Richard Smith came up with this test (I've slightly modified it for clarity). Compile for 64-bit, -O2 -- https://godbolt.org/z/drn6zs.
|
James, this most recent issue isn't unique to llvm is it? I think gcc had the same issue before they switched to converting all mmx operations to SSE2 in recent versions of gcc. |
mentioned in issue #41664 |
Extended Description
POSTRAScheduler reorders emms and MMX instructions, so we get a wrong result: