Skip to content

LTO scan of module-level inline assembly does not respect CPU #67698

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mysterymath opened this issue Sep 28, 2023 · 3 comments
Open

LTO scan of module-level inline assembly does not respect CPU #67698

mysterymath opened this issue Sep 28, 2023 · 3 comments
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)

Comments

@mysterymath
Copy link
Contributor

mysterymath commented Sep 28, 2023

The scan of module-level inline assembly performed to extract the referenced symbols does not forward the CPU specified on the clang command line. Because of this, opcodes that are only legal in that CPU will cause the compile to fail in -flto where it would have succeeded without. For example:

asm(".globl func; func: cm.mvsa01 s1, s0; ret");

This succeeds when compiled for RISC-V -march=rv64izcmp (https://godbolt.org/z/8dxT3nxs9), but fails when -flto is added (https://godbolt.org/z/GK5ohKnK9).

See

T->createMCSubtargetInfo(TT.str(), "", ""));

@mysterymath mysterymath added the LTO Link time optimization (regular/full LTO or ThinLTO) label Sep 28, 2023
@mysterymath mysterymath added the bug Indicates an unexpected problem or unintended behavior label Sep 28, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 28, 2023

@llvm/issue-subscribers-bug

The scan of module-level inline assembly performed to extract the referenced symbols does not forward the CPU specified on the clang command line. Because of this, opcodes that are only legal in that CPU will cause the compile to fail in `-flto` where it would have succeeded without. For example:
asm(".globl func; func: cm.mvsa01 s1, s0; ret");

This succeeds when compiled for RISC-V -march=rv64izcmp (https://godbolt.org/z/8dxT3nxs9), but fails when -flto is added (https://godbolt.org/z/GK5ohKnK9).

See

T->createMCSubtargetInfo(TT.str(), "", ""));

gnoliyil pushed a commit to gnoliyil/fuchsia that referenced this issue Jan 27, 2024
Clang has a bug whereby module level inline ASM requiring target
features may result in codegen errors that will not cause a link to
fail, but may corrupt the symbol table (e.g., by omitting the function).
Note that this only affects module level ASM, and not all instances of
inline assembly, so the number of places this affects in Fuchsia  is
relatively small. Thus far, it only appears to affect riscv64 in our
codebase, but the bug affects all architectures, and needs to be fixed
in upstream LLVM.

Specifically, we see this causing an issue in dynlink.c. The proposed
workarounds for now are to disable LTO or to move the ASM into assembly
files. This patch adopts the later approach instead of disabling LTO.

More details on the codegen issue can be found on
llvm/llvm-project#67698

Bug: 136220
Test: fx set bringup.riscv64 --variant=lto --with //bundles/boot_tests --auto-dir
Test: fx build
Change-Id: Id96ea732d2b6fd1d46a6f82ec2f91ea857dd9858
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/945114
Reviewed-by: Caslyn Tonelli <[email protected]>
Reviewed-by: Petr Hosek <[email protected]>
Fuchsia-Auto-Submit: Paul Kirth <[email protected]>
Commit-Queue: Auto-Submit <[email protected]>
gnoliyil pushed a commit to gnoliyil/fuchsia that referenced this issue Jan 27, 2024
The underlying issue seems to be that the linker does not have target
attributes set correctly in all cases, and LTO cannot diagnose this.
This is tracked in upstream LLVM, and adding these flags to the linker
invocation is the suggested work around (see
llvm/llvm-project#65090,
llvm/llvm-project#59350 (comment),
llvm/llvm-project#67698,
and
https://discourse.llvm.org/t/myterious-soft-float-output-in-lto-cache/70753
 for more details).

Bug: 129493
Change-Id: Ib3b9f387d61c5a855ba57ede191881c567414de7
Reviewed-on: https://fuchsia-review.googlesource.com/c/fuchsia/+/949677
Reviewed-by: Petr Hosek <[email protected]>
Fuchsia-Auto-Submit: Paul Kirth <[email protected]>
Commit-Queue: Auto-Submit <[email protected]>
@ilovepi
Copy link
Contributor

ilovepi commented May 15, 2025

@mysterymath did we ever determine what the fix would be here? IIRC it was to just use the asm scanning logic in clang, and have it emit the symbol table? at least that's my take away from https://discourse.llvm.org/t/rfc-target-cpu-and-features-for-module-level-inline-assembly/74713

@EugeneZelenko EugeneZelenko removed the bug Indicates an unexpected problem or unintended behavior label May 15, 2025
@mysterymath
Copy link
Contributor Author

@mysterymath did we ever determine what the fix would be here? IIRC it was to just use the asm scanning logic in clang, and have it emit the symbol table? at least that's my take away from https://discourse.llvm.org/t/rfc-target-cpu-and-features-for-module-level-inline-assembly/74713

Yeah, I think that's still the good move. I think it would also be ideal for this list to be attached to inline assembly constructs in the IR, too. It's a huge wart that there are differences in behavior between the two. Plus, like jyknight mentioned, this would make the contents of inline asm fully described by the surrounding constraints and IR attributes up until the text is expanded at the MC layer, which is much simpler to think about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

No branches or pull requests

4 participants