[CIR][Dialect] Add tmp attribute to AllocaOp#2114
[CIR][Dialect] Add tmp attribute to AllocaOp#2114ivanmurashko wants to merge 2 commits intollvm:mainfrom
Conversation
Replace fragile string matching (ref.tmp*/agg.tmp*) with a semantic tmp unit attribute for compiler-generated temporaries. The lifetime checker now uses getTmp() instead of checking if allocation names start with "ref.tmp". This approach is more robust and doesn't depend on internal naming conventions. Temporaries are marked at creation sites (CreateMemTemp, CreateAggTemp, CreateRefTmp, CreateAggTmpAddress) to ensure semantic correctness. Fixes llvm#2113
Rename CreateMemTemp/CreateAggTemp to *WithName to make it explicit that the caller supplies the name (e.g., "coerce", "tmp.try.call.res"). This does not imply non-compiler-generated; it just means the name is explicitly chosen. Rename CreateRefTmp/CreateAggTmp* to *WithAutoName to make explicit that they use the ref.tmp*/agg.tmp* counters and represent semantic temporaries (tagged with tmp). This distinguishes explicit-name scratch temps from auto-named temporaries without changing behavior.
| // into a temporary alloca. | ||
| static Address emitValToTemp(CIRGenFunction &CGF, Expr *E) { | ||
| Address DeclPtr = CGF.CreateMemTemp( | ||
| Address DeclPtr = CGF.CreateMemTempWithName( |
There was a problem hiding this comment.
I'm not sure I understand why all these changes to CreateMemTempWithName, can't you just change CreateMemTemp* to tag the allocas as "tmp"? The name of the function is already enforcing the semantics here :)
There was a problem hiding this comment.
I'm not sure I understand why all these changes to
CreateMemTempWithName, can't you just changeCreateMemTemp*to tag the allocas astmp? The name of the function is already enforcing the semantics here :)
I introduced the naming split intentionally to keep API intent explicit:
*WithName: explicit caller-provided names.*WithAutoName: compiler-generatedref.tmp*/agg.tmp*.
If API intent clarity is valuable here, we can keep this split.
If you prefer less API surface, I can also remove the split and follow your suggestion directly: keep CreateMemTemp*/CreateAggTemp*, tag tmp inside CreateMemTemp*, and drop the extra auto-name helpers. That is simpler, but it also tags more scratch allocas as tmp.
@bcardosolopes what do you prefer?
There was a problem hiding this comment.
Thanks, please follow my suggestion then!
| (`,` `init` $init^)? | ||
| (`,` `const` $constant^)? | ||
| `]` | ||
| custom<AllocaNameAndFlags>($name, $init, $constant, $tmp) |
There was a problem hiding this comment.
Shouldn't init be orthogonal to tmp? Not sure I understand why they need to be exclusive (and if that's the case we should have had a verifier to guarantee), can you elaborate?
Seems like adding another (, tmp $tmp^)? here would simplify parsing/printing significantly.
There was a problem hiding this comment.
Shouldn't
initbe orthogonal totmp? Not sure I understand why they need to be exclusive (and if that's the case we should have had a verifier to guarantee), can you elaborate?
I agree they should be orthogonal, not mutually exclusive. With the custom parser we can already represent both together (["name", init, tmp]), so this is not a verifier/exclusivity issue. The remaining question is semantic policy in CodeGen: making this fully orthogonal in emitted IR would require setting init on compiler-generated temporaries too, which broadens the meaning of init.
Seems like adding another (, tmp $tmp^)? here would simplify parsing/printing significantly.
(init?, const?, tmp?) was actually my first approach. I avoided it because implementing it cleanly required changing init semantics and marking compiler-generated temporaries with init as well.
Even after doing that, there is still an MLIR ODS parsing limitation with adjacent optional groups that share the same leading ,: forms like ["name", init, tmp] fail when const is absent. So this is not only a semantic concern; it also has a parser robustness issue in this format.
There was a problem hiding this comment.
The remaining question is semantic policy in CodeGen: making this fully orthogonal in emitted IR would require setting init on compiler-generated temporaries too, which broadens the meaning of init.
How, you can always setInit in the appropriate place, this functionality shouldn't be part of createTemp*, but can operate on the result / use side.
implementing it cleanly required changing init semantics and marking compiler-generated temporaries with init as well.
I don't see how that is true, what am I missing?
Even after doing that, there is still an MLIR ODS parsing limitation with adjacent optional groups that share the same leading ,: forms like ["name", init, tmp] fail when const is absent. So this is not only a semantic concern; it also has a parser robustness issue in this format.
The const thing looks like a silly limitation, perhaps that should be fixed.
Summary
tmpunit attribute oncir.allocato mark compiler-generated temporaries.ref.tmp*/agg.tmp*) in the lifetime checker with the new attribute.CreateRefTempWithAutoName,CreateAggTempAddressWithAutoName,CreateAggTempWithName, andCreateMemTempWithNamepaths)., tmpon temporary allocas.Rationale
The prior approach relied on string prefixes to detect temporaries, which is fragile and tied to internal naming conventions. The new
tmpattribute is set at semantic creation sites, making the intent explicit and robust.We also avoid setting
initon temporaries. Theinitflag reflects VarDecl-based locals (derived fromcurrVarDecl), while compiler-generated temporaries are not VarDecls. Their initialization is represented by explicit stores, not theinitflag. Marking temporaries withtmpand omittinginitavoids misleading metadata and improves lifetime analysis accuracy. This explains changes like:in some LIT tests such as
clang/test/CIR/CodeGen/coro-task.cppandclang/test/CIR/CodeGen/lambda.cpp.We intentionally keep the scope narrow: only semantic C++ temporaries created via the
*WithAutoNamehelpers (ref.tmp*/agg.tmp*) are taggedtmp. Other scratch/ABI temps with explicit names (e.g., "coerce", "tmp.try.call.res") arenot marked to avoid masking real lifetime issues.
Fixes #2113