-
-
Notifications
You must be signed in to change notification settings - Fork 271
Implement cross-module inlining #1577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
|
|
||
| // Set inlining attribute | ||
| if (fdecl->neverInline) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is of course completely orthogonal to your PR, but we should probably make LDC_never_inline just set ::inlining appropriately instead of having ::neverInline, now that the former exists.
|
In case you are wondering why I'm being excessively picky here: From trying to get this to work previously, I'm rather anxious about things randomly breaking with weird symptoms from trying to run semantic analysis out-of-order. In particular, I would sometimes end up with In other words, I pretty much expect that we are going to have to come back to this piece of code a number of times to debug whatever issues turn up once this is used in the wild, which is why I think it makes sense to strive to make it as easily understandable as possible. Along the same lines, I think it would make sense – whatever heuristic we actually end up with for |
|
Thanks for your comments David. Trying to incorporate them all. Indeed, a specific switch to turn this off is good to have (also for testing purposes). |
|
Note for future work. Some of the functions in the testcase are not inlined yet with this: |
|
The AppVeyor -O3 bug appears to be an optimization bug with LLVM3.9 (can reproduce on Mac, but not with LLVM3.8). Edit: nope, not at all. It's a bug in Phobos: dlang/phobos#4509 |
|
For future work: nested functions are not chain-inlined. pragma(inline, true) bool addAndCheckInsideNestedFunc(int checkbefore, int increment)
{
pragma(inline, true)
bool addCheckNested(int checkbefore, int increment)
{
return true;
}
return addCheckNested(checkbefore, increment);
} |
|
The only remaining problem is a linking problem: for LLVM < 3.7 we don't emit symbols in COMDAT any groups. So for some symbols there are multiple definitions link errors (e.g. initializers of structs nested inside an inlined function). Now I can either complicate the LDC logic. Or simple disable cross-module inlining for LLVM < 3.7. Strongly levitating toward the latter... |
ddmd/globals.d
Outdated
| const(char)* ldc_version; | ||
| const(char)* llvm_version; | ||
|
|
||
| bool gaggedForInlining; // Set for functionSemantic3 for external inlining canditates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just caught my eye, no time for a proper review: candidates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed :)
|
With disabling for LLVM < 3.7, it is all green now. |
|
Imho just inlining always-inline functions would be enough as a first step to get this merged. Coming up with/tuning a good heuristic for striking the balance between compile time and executable performance could take some time. (One possibility to think about: Different "thresholds" for -O2 and -O3.) |
|
The compile slowdown may not be too terrible. LLVM will do its optimizations first, and non-inlined functions are discarded before machine codegen. So I think we can live with a superdumb inlining heuristic, and leave the good stuff up to LLVM? I've implemented a simple statement counting heuristic now. |
d54ae98 to
12b98e0
Compare
…e statement-counting inlining heuristic.
|
no more "WIP" ! :) |
tests/codegen/inlining_staticvar.d
Outdated
| checkNestedStruct_1(); | ||
| checkNestedStruct_2(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many trailing newlines...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:/ fixed!
| Logger::println("Does not need codegen, skipping."); | ||
| return; | ||
| // Force codegen if this is a templated function with pragma(inline, true). | ||
| if ((decl->members->dim == 1) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you ever get around to looking into using onemember?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I haven't yet. Thanks for the reminder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having looked more closely at it now, it is not completely clear to me how to work with the onemember functions and the AST :S. It'd be good to have a testcase where something that should be inlined isn't.
In some parts of DMD I see that instead decl.aliasdecl is checked to see if it is a template with eponymous function... I'm confused. Testcase needed! ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have misremembered how onemember works myself, then.
|
Looks good to merge to me. |
| uint oldgag = global.gag; | ||
| version (IN_LLVM) | ||
| { | ||
| if (global.gag && !spec && !global.gaggedForInlining) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be beneficial to discuss whether this is the "proper" way of doing it with upstream, although I understand if you don't want to spend the energy on it.
|
Random side note: It might be interesting to add diagnostic counters (like in LLVM proper) to |
I've been thinking the same. Clang/LLVM's code seems to use an interface for reporting internal compiler diagnostics. Would be great to have something similar in LDC (also for e.g. PGO things). |
|
AppVeyor actually ran is green. A unique opportunity. |
|
Great! |
|
Nice work, thanks! I fast-forwarded the
;) AppVeyor is currently the only fully working CI system, the only green spark in a sea of red. ;) It does take a painful lot of time though. |
That is true, but it's so slow that it's faster to verify that the only Travis failures are due to build timeouts than waiting for it to complete. ;) |
|
Checked just now: llvm.org/apt shows that the APT repo is now up-to-date with trunk head again, so CircleCI should be green soon (because it is at roughly the same version as AppVeyor). |
|
Thanks! |
An "inlined" function will be defined with the LLVM attribute
available_externally, so that LLVM can decide to actually inline the function or not.pragma(inline, false)--> the function receives thenoinlineattribute.pragma(inline, true)--> the function receives thealwaysinlineattribute and cross-module inlining for this function is performed even at-O0.The current heuristic for (defining available_externally functions to enable) inlining functions is:
true:-)Very aggressive indeed; good for testing :). Perhaps a simple statement count heuristic can be used .
Functions that have
__FILE__as default template parameter need to always be inlinedpragma(inline, true), otherwise it results in link errors. See the forum discussion:https://forum.dlang.org/thread/[email protected]
To inline a function, semantic analysis needs to be completed for that function (DMD does the same). Doing that however, may add new symbols to the module member list for codegenning. This means that the order of how modules are codegenned needs to be reversed (or, members[0] needs to be codegenned last, as that one receives all new symbols). This showed up while testing Weka's codebase and cost me a lot of time. Unfortunately, I have not managed to recreate this problem in a testcase.
Tested succesfully on Weka's codebase.