This repository was archived by the owner on Feb 21, 2026. It is now read-only.

[CIR][Dialect] Add tmp attribute to AllocaOp #2114

Open
ivanmurashko wants to merge 2 commits into llvm:main from ivanmurashko:is-temporary-attribute-pr

ivanmurashko commented Jan 31, 2026

Summary

  • Introduce a semantic tmp unit attribute on cir.alloca to mark compiler-generated temporaries.
  • Replace name-prefix heuristics (ref.tmp*/agg.tmp*) in the lifetime checker with the new attribute.
  • Mark temporaries at creation sites (CreateRefTempWithAutoName,
    CreateAggTempAddressWithAutoName, CreateAggTempWithName, and
    CreateMemTempWithName paths).
  • Update CIR CodeGen tests to expect ", tmp" in the flag list of temporary allocas.

Rationale

The prior approach relied on string prefixes to detect temporaries, which is fragile and tied to internal naming conventions. The new tmp attribute is set at semantic creation sites, making the intent explicit and robust.
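To make the contrast concrete, here is a minimal standalone sketch, not the actual lifetime checker code. `ToyAlloca`, `isTemporaryByName`, and `isTemporaryByAttr` are hypothetical names invented for illustration; only the `ref.tmp`/`agg.tmp` prefixes and the idea of a `tmp` flag come from this PR.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for a cir.alloca; this is NOT the real CIR op class.
struct ToyAlloca {
  std::string name;
  bool tmp = false; // models the proposed `tmp` unit attribute
};

// Old style: infer "temporary" from the name prefix.
inline bool isTemporaryByName(const ToyAlloca &a) {
  std::string_view n = a.name;
  // rfind(prefix, 0) == 0 is a C++17 way to spell "starts with".
  return n.rfind("ref.tmp", 0) == 0 || n.rfind("agg.tmp", 0) == 0;
}

// New style: read the flag that was set at the creation site.
inline bool isTemporaryByAttr(const ToyAlloca &a) { return a.tmp; }
```

In this model, a temporary whose name no longer starts with `ref.tmp` is missed by the name check but still found through the flag, and an alloca that merely happens to carry the prefix is no longer misclassified.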

We also avoid setting init on temporaries. The init flag reflects VarDecl-based locals (derived from currVarDecl), while compiler-generated temporaries are not VarDecls. Their initialization is represented by explicit stores, not the init flag. Marking temporaries with tmp and omitting init avoids misleading metadata and improves lifetime analysis accuracy. This explains changes like:

["ref.tmp1", init]  ->  ["ref.tmp1", tmp]

in some LIT tests such as clang/test/CIR/CodeGen/coro-task.cpp and clang/test/CIR/CodeGen/lambda.cpp.

We intentionally keep the scope narrow: only semantic C++ temporaries created via the *WithAutoName helpers (ref.tmp*/agg.tmp*) are tagged tmp. Other scratch/ABI temps with explicit names (e.g., "coerce", "tmp.try.call.res") are not marked to avoid masking real lifetime issues.
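A rough sketch of that split, as a toy model rather than the CIRGenFunction API: `ToyBuilder` and its two methods are hypothetical, and the counter starting at 0 is an assumption made only for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cir.alloca; not the real CIR op class.
struct ToyAlloca {
  std::string name;
  bool tmp = false;
};

class ToyBuilder {
public:
  // Auto-named semantic temporary: the name comes from a counter
  // and the alloca is tagged tmp.
  ToyAlloca createRefTempWithAutoName() {
    ToyAlloca a{"ref.tmp" + std::to_string(refCounter++), true};
    allocas.push_back(a);
    return a;
  }

  // Caller-named scratch slot (e.g. "coerce"): deliberately left untagged.
  ToyAlloca createMemTempWithName(std::string name) {
    ToyAlloca a{std::move(name), false};
    allocas.push_back(a);
    return a;
  }

private:
  std::vector<ToyAlloca> allocas; // everything created so far
  unsigned refCounter = 0;        // assumed to start at 0 in this sketch
};
```

Only the auto-named path sets the flag here, so a checker that reads `tmp` would keep treating "coerce"-style slots as ordinary allocas.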

Fixes #2113

Replace fragile string matching (ref.tmp*/agg.tmp*) with a semantic
tmp unit attribute for compiler-generated temporaries.

The lifetime checker now uses getTmp() instead of checking
if allocation names start with "ref.tmp". This approach is more
robust and doesn't depend on internal naming conventions.

Temporaries are marked at creation sites (CreateMemTemp, CreateAggTemp,
CreateRefTmp, CreateAggTmpAddress) to ensure semantic correctness.

Fixes llvm#2113

Rename CreateMemTemp/CreateAggTemp to *WithName to make it explicit that the
caller supplies the name (e.g., "coerce", "tmp.try.call.res"). This does not
imply non-compiler-generated; it just means the name is explicitly chosen.

Rename CreateRefTmp/CreateAggTmp* to *WithAutoName to make explicit that they
use the ref.tmp*/agg.tmp* counters and represent semantic temporaries (tagged
with tmp). This distinguishes explicit-name scratch temps from auto-named
temporaries without changing behavior.
 // into a temporary alloca.
 static Address emitValToTemp(CIRGenFunction &CGF, Expr *E) {
-  Address DeclPtr = CGF.CreateMemTemp(
+  Address DeclPtr = CGF.CreateMemTempWithName(
Member

I'm not sure I understand why all these changes to CreateMemTempWithName are needed. Can't you just change CreateMemTemp* to tag the allocas as "tmp"? The name of the function is already enforcing the semantics here :)

Member Author

> I'm not sure I understand why all these changes to CreateMemTempWithName are needed. Can't you just change CreateMemTemp* to tag the allocas as tmp? The name of the function is already enforcing the semantics here :)

I introduced the naming split intentionally to keep API intent explicit:

  • *WithName: explicit caller-provided names.
  • *WithAutoName: compiler-generated ref.tmp* / agg.tmp*.

If API intent clarity is valuable here, we can keep this split.

If you prefer less API surface, I can also remove the split and follow your suggestion directly: keep CreateMemTemp*/CreateAggTemp*, tag tmp inside CreateMemTemp*, and drop the extra auto-name helpers. That is simpler, but it also tags more scratch allocas as tmp.

@bcardosolopes what do you prefer?

Member

Thanks, please follow my suggestion then!

-    (`,` `init` $init^)?
-    (`,` `const` $constant^)?
-    `]`
+    custom<AllocaNameAndFlags>($name, $init, $constant, $tmp)
Member

Shouldn't init be orthogonal to tmp? I'm not sure I understand why they need to be exclusive (and if that's the case, we should have had a verifier to guarantee it). Can you elaborate?

Seems like adding another (`,` `tmp` $tmp^)? here would simplify parsing/printing significantly.

Member Author

> Shouldn't init be orthogonal to tmp? I'm not sure I understand why they need to be exclusive (and if that's the case, we should have had a verifier to guarantee it). Can you elaborate?

I agree they should be orthogonal, not mutually exclusive. With the custom parser we can already represent both together (["name", init, tmp]), so this is not a verifier/exclusivity issue. The remaining question is semantic policy in CodeGen: making this fully orthogonal in emitted IR would require setting init on compiler-generated temporaries too, which broadens the meaning of init.

> Seems like adding another (`,` `tmp` $tmp^)? here would simplify parsing/printing significantly.

(init?, const?, tmp?) was actually my first approach. I avoided it because implementing it cleanly required changing init semantics and marking compiler-generated temporaries with init as well.

Even after doing that, there is still an MLIR ODS parsing limitation with adjacent optional groups that all start with the same `,` literal: forms like ["name", init, tmp] fail to parse when const is absent. So this is not only a semantic concern; the format also has a parser robustness issue.
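For readers unfamiliar with the shape of the problem, the following standalone sketch shows one way a hand-written flag-list parser can sidestep it: consume the comma once, then decide which flag follows. It is a toy string parser, not the `AllocaNameAndFlags` implementation from this PR and not MLIR's parser API. `AllocaFlags`, `parseNameAndFlags`, and the fixed init/const/tmp ordering are assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for this sketch.
struct AllocaFlags {
  std::string name;
  bool init = false, constant = false, tmp = false;
};

namespace detail {
inline void skipSpaces(std::string_view &s) {
  while (!s.empty() && std::isspace(static_cast<unsigned char>(s.front())))
    s.remove_prefix(1);
}
// Consume `tok` (after optional whitespace) if it is next in the input.
inline bool consume(std::string_view &s, std::string_view tok) {
  skipSpaces(s);
  if (s.substr(0, tok.size()) != tok)
    return false;
  s.remove_prefix(tok.size());
  return true;
}
} // namespace detail

// Parses ["name"(, init)?(, const)?(, tmp)?] and rejects anything else.
inline std::optional<AllocaFlags> parseNameAndFlags(std::string_view s) {
  AllocaFlags out;
  if (!detail::consume(s, "[") || !detail::consume(s, "\""))
    return std::nullopt;
  auto close = s.find('"');
  if (close == std::string_view::npos)
    return std::nullopt;
  out.name = std::string(s.substr(0, close));
  s.remove_prefix(close + 1);

  // A single loop owns the comma, so no flag has to be guessed from it.
  int lastRank = 0;
  while (detail::consume(s, ",")) {
    int rank = 0;
    if (detail::consume(s, "init")) { rank = 1; out.init = true; }
    else if (detail::consume(s, "const")) { rank = 2; out.constant = true; }
    else if (detail::consume(s, "tmp")) { rank = 3; out.tmp = true; }
    else return std::nullopt;       // unknown flag
    if (rank <= lastRank)
      return std::nullopt;          // duplicate or out-of-order flag
    lastRank = rank;
  }
  if (!detail::consume(s, "]"))
    return std::nullopt;
  detail::skipSpaces(s);
  if (!s.empty())
    return std::nullopt;            // trailing input
  return out;
}
```

With this structure, `["ref.tmp1", init, tmp]` parses even though `const` is absent, because the comma is consumed before the parser commits to a particular flag.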

Member

> The remaining question is semantic policy in CodeGen: making this fully orthogonal in emitted IR would require setting init on compiler-generated temporaries too, which broadens the meaning of init.

How so? You can always call setInit in the appropriate place. This functionality shouldn't be part of createTemp*; it can operate on the result / use side.
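One possible reading of this suggestion, sketched with a toy class: the creation helper only sets tmp, and whichever caller knows an initializer is stored sets init on the returned op. `ToyAllocaOp`, `createTemp`, and `markInitialized` are hypothetical; only the accessor names setInit/getTmp echo the ones mentioned in this PR.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the alloca op; not the generated CIR class.
class ToyAllocaOp {
public:
  explicit ToyAllocaOp(std::string name) : name_(std::move(name)) {}
  void setInit(bool v) { init_ = v; }
  void setTmp(bool v) { tmp_ = v; }
  bool getInit() const { return init_; }
  bool getTmp() const { return tmp_; }
  const std::string &getName() const { return name_; }

private:
  std::string name_;
  bool init_ = false;
  bool tmp_ = false;
};

// Creation side: the helper only knows that the slot is a temporary.
inline ToyAllocaOp createTemp(std::string name) {
  ToyAllocaOp op(std::move(name));
  op.setTmp(true);
  return op;
}

// Use side: a caller that knows an initializer is stored sets init itself.
inline void markInitialized(ToyAllocaOp &op) { op.setInit(true); }
```

In this model the two flags stay independent: tmp records how the slot was created, and init is decided separately by the code that emits the initializing store.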

> implementing it cleanly required changing init semantics and marking compiler-generated temporaries with init as well.

I don't see how that is true. What am I missing?

> Even after doing that, there is still an MLIR ODS parsing limitation with adjacent optional groups that all start with the same `,` literal: forms like ["name", init, tmp] fail to parse when const is absent. So this is not only a semantic concern; the format also has a parser robustness issue.

The const thing looks like a silly limitation, perhaps that should be fixed.

Development

Successfully merging this pull request may close these issues.

[CIR][LifetimeCheck] Add is_temporary unit attribute to AllocaOp for reliable temporary detection

2 participants