cmd/compile: consumes huge amount of memory when compiling table driven test code #11612
Please try with Go tip (which will shortly become Go 1.5) from the master branch. |
On tip, there is the same problem.
|
This seems to be better now. On my Linux system:
That seems to be saying the maximum compiler footprint was 235 kB for the input file from the playground. |
I think by 235kB you actually mean 235MB.
(The RSS size is measured in kB, or actually
in pages.)
|
This got much, much worse with the new SSA backend. Even the 500-line example requires gigabytes of memory to be compiled.
go1.6:
tip:
|
The example code seems to cause lookupOutgoingVar to generate a huge amount of garbage. |
This looks like a consequence of our phi building algorithm. For variables which are long-lived but unmodified across lots of basic blocks, we end up placing lots of FwdRef markers that are unnecessary. We use http://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf , and this seems a fairly fundamental limitation of that algorithm. |
I couldn't build https://github.com/sourcegraph/webloop on my VPS, which has 512MB of memory and 256MB of swap space. |
@randall77 The algorithm presented in http://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf removes unnecessary phi functions during construction (as soon as the block is sealed). Since the Go implementation delays the phi removal until the whole CFG is built, you end up with a lot of unnecessary phi functions/FwdRef markers. Thus, a possible solution would be to create and optimize phi functions during the construction. Even with FwdRef the memory usage looks pretty high.
It would also be nice to see what the CFG for one line looks like. |
We do optimize phi functions during construction. I think we do exactly what Braun et al describe, as if all blocks were sealed to start. If there are situations where we generate more phis than necessary, I'm happy to accept patches to fix that. |
https://github.com/golang/go/blob/master/src/cmd/compile/internal/gc/ssa.go#L3793 says "Phi optimization is a separate pass". So it is performed after construction, right? |
The optimizations described in Braun are done during phi construction. There is in addition a phi optimization pass which is run after building. |
Yes, but the phi construction is done after the whole CFG is constructed. In the paper, phi functions are constructed and optimized as soon as the block is sealed. This makes a significant difference in terms of memory usage. |
I think the comment in linkForwardReferences is stale; that may be the confusing part. All the things described as "phi optimizations" in the Braun paper are done during phi construction. Another way of looking at it: we do what Braun does, but we don't start making phi values until all the blocks are sealed. The one exception is that we don't do the recursive removal they do in tryRemoveTrivialPhi, as we don't have pointers to users in our IR. I'm open to suggestions as to how to make that work. Part of the point of phielim is to catch those cases. The "phi optimization" pass in our compiler is a separate set of optimizations. |
Can you point me to the corresponding code? Again, the crucial question (regarding memory usage) is: When do we optimize the phi functions (or FwdRefs)? In my understanding of the code, this happens after the CFG construction rather than during the CFG construction.
You can catch some (but not all) cases when optimizing the unique operand instead (if it is a phi function). |
cmd/compile/internal/gc/ssa.go:resolveFwdRef. We find the value of a variable on each incoming edge and only make the phi if there are two distinct witnesses. |
I think I finally figured out my wrong assumption. I thought you applied the algorithm during CFG construction, but you applied it afterwards. I found at least two differences from the original algorithm in
Example: Let's assume we resolve the FwdRef after the outer
In contrast, the paper would construct: |
AFAICT there are only 2 variables ('r_INT' and 'samples') for SSA construction in the code snippet. Some statistics for compiling the code snippet would be very interesting: Disclaimer: |
For the referenced "test.go", 6784 blocks, 55190 vars.
Space use (with sparse assist):
Without sparse assist:
There are still some inputs we don't compile as fast as we'd like, or as fast as we once did, and I'd love to have any mistakes in our implementation of the paper's algorithm sorted out, but this particular slow compilation is solved. |
Sorry, I do not understand the following aspect: |
uname -a
Linux nico-Qiana-Xfce 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
go version go1.4.2 linux/amd64
When I click on the "share" button in the Go playground to share the problematic code, it fails with a "server error: try again" message.
I imagine it is because the source code is too big (~8000 lines).
But you can find a small version here:
http://play.golang.org/p/qpxcVGkzuk
It compiles quickly without any problem because it has only 500 lines in the "samples" table.
But if you copy-paste these sample lines to have 8000 of them in the "samples" table, 6g consumes more than 2.5 GB and then it is killed, or it crashes VirtualBox.
This problem is related to issue "cmd/go: go build takes too long to compile table driven test code":
#8259
But this time, it is not the compile time which is the problem, but the memory consumption, which is really high.
Even with 2.5 GB of RAM allocated to VirtualBox, 6g consumes all the available RAM before being killed.
With Go 1.3.2, I could compile it, though I had to allocate at least 2.4 GB of RAM to VirtualBox.
But with Go 1.4.2, the memory consumption has increased so much that I cannot compile the same code any more.