cmd/compile: CSE some function calls args/results #20934
Our calling convention allows callees to modify their input args on the stack. So I don't think you can CSE v15 & v24, as the first call may modify what v15 put on the stack.
Can you explain a bit more? I still don't see how this could cause trouble, at least as long as the sequence of CSE'd writes are all setting up the stack before a call.
Sorry, I was misinterpreting what you were suggesting. The calls are in different branches, not in serial. Yes, we could CSE those writes. So far we haven't been CSEing any pair of values where neither instance dominates the other (memory type or otherwise). In general, this is to avoid putting work at the dominator that doesn't need to be done on some path out of the dominator. But when that work needs to be done on every path out of that dominator, we could move the values up with no chance of doing unneeded work.
For a simpler example, try
Currently we don't CSE the x*5 computation. It would be nice if we could, and such an optimization is probably a prerequisite for the optimization you're talking about in this issue.
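The snippet originally posted with this comment is not preserved here. As a hypothetical stand-in (the names `f` and `fHoisted` are illustrative, not from the thread), the shape being described is a multiplication that appears in both arms of a branch, where neither copy dominates the other. `fHoisted` is the hand-written form of what the hoisting would produce:

```go
package main

import "fmt"

// f computes x*5 on both sides of a branch. Neither multiplication
// dominates the other, so a dominance-based CSE pass keeps both.
func f(x int, b bool) int {
	if b {
		return x*5 + 1
	}
	return x*5 + 2
}

// fHoisted is the manually optimized equivalent: the shared
// multiplication is moved up into the dominating block, which is
// safe because every path out of that block needs the result.
func fHoisted(x int, b bool) int {
	y := x * 5
	if b {
		return y + 1
	}
	return y + 2
}

func main() {
	fmt.Println(f(3, true), fHoisted(3, true))
	fmt.Println(f(3, false), fHoisted(3, false))
}
```

Both functions return the same results for every input; the only difference is where the multiplication is placed.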
In doing the "all dominated paths share expression" CSE, we should exclude panic edges.
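A toy sketch of the "needed on every path, ignoring panic edges" check might look like the following. Everything here is an assumption made for illustration: the `block` type and `neededOnEveryPath` are hypothetical and are not the compiler's SSA data structures, expressions are compared as plain strings, and the control-flow graph is assumed to be acyclic.

```go
package main

import "fmt"

// block is a toy CFG node (hypothetical; not the compiler's ssa.Block).
type block struct {
	name    string
	exprs   map[string]bool // expressions computed in this block
	succs   []*block
	isPanic bool // block ends in a panic
}

// neededOnEveryPath reports whether expr is computed on every
// non-panicking path leaving b. It assumes the CFG has no cycles.
// If b has no non-panicking successors it conservatively returns false.
func neededOnEveryPath(b *block, expr string) bool {
	live := 0
	for _, s := range b.succs {
		if s.isPanic {
			continue // panic edges are excluded from the check
		}
		live++
		if s.exprs[expr] {
			continue // this path computes expr right away
		}
		if !neededOnEveryPath(s, expr) {
			return false // some path through s never computes expr
		}
	}
	return live > 0
}

func main() {
	thenB := &block{name: "then", exprs: map[string]bool{"x*5": true}}
	elseB := &block{name: "else", exprs: map[string]bool{"x*5": true}}
	panicB := &block{name: "panic", isPanic: true}
	entry := &block{name: "entry", succs: []*block{thenB, elseB, panicB}}
	fmt.Println(neededOnEveryPath(entry, "x*5"))
}
```

When the check succeeds for a dominating block, hoisting the expression into that block cannot add work on any ordinary path. The only path that might pay for an unused computation is one that ends in a panic.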
This kind of code ("call either g or h with the same args") is not uncommon. It is compiled suboptimally; there is lots of duplicate code to set up and read the stack:
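The code and assembly listing originally attached at this point are not preserved here. A minimal hypothetical example of the pattern (the names `g`, `h`, and `callEither` and their bodies are illustrative only) could be:

```go
package main

import "fmt"

// g and h are stand-ins for two callees with identical signatures.
// noinline keeps the calls visible in the compiled output.
//
//go:noinline
func g(a, b int) int { return a + b }

//go:noinline
func h(a, b int) int { return a * b }

// callEither calls g or h with the same arguments. Under the
// stack-based calling convention discussed in this issue, each
// branch separately stores a and b to the outgoing argument area
// and separately loads the result after the call.
func callEither(cond bool, a, b int) int {
	if cond {
		return g(a, b)
	}
	return h(a, b)
}

func main() {
	fmt.Println(callEither(true, 2, 3), callEither(false, 2, 3))
}
```

Building with the `GOSSAFUNC=callEither` environment variable set makes the compiler write an `ssa.html` dump, which is one way to see the per-branch argument setup after each pass.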
Setting up the stack for the call and processing the return could be CSE'd to dominating and post-dominating blocks. In pseudo code:
I suspect that at least for function arguments, our current CSE pass would work, but for the prohibition on CSEing values of memory type.
At the end of the generic cse pass, the code above looks like:
Observe that v15/v24 and v18/v26 certainly look ripe for CSE.
Rolling back the prohibition on CSEing memory values would require care, though, since that helped a lot with CSE toolspeed problems. Continuing to prohibit CSE of calls might be a workable middle ground.
cc @randall77 @tzneal @dr2chase