runtime: "goroutine stack exceeds 250000000-byte limit" on linux-arm #35470
Comments
These stacks are not large at all, so the problem is not an actual stack overflow but a false report of one.
I can imagine one possibility: a synchronous preemption request (made by clobbering the stack guard) and an asynchronous one (made by signal) are both pending. The goroutine, in a function prologue, first sees the clobbered stack guard, so it is going to call morestack. If the signal lands after the CMP instruction but before the call to morestack, the goroutine is asynchronously preempted and enters the scheduler. When it is resumed, the scheduler clears the preemption request and unclobbers the stack guard. But the resumed goroutine still calls morestack, because it has already passed the CMP instruction. With no preemption request pending, morestack doubles the stack unnecessarily. If this happens multiple times, the stack may grow too big even though only a small amount is actually used. I had it print the current stack bounds in the stack-too-large error message, and the stack is indeed quite large, with only a small amount used.
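To get a feel for how few spurious doublings this would take, here is a rough back-of-the-envelope sketch. The 250000000-byte limit comes from the issue title; the 2048-byte starting stack and the 1000000000-byte limit for 64-bit platforms are assumptions on my part, and `spuriousDoublings` is a made-up helper, not a runtime function.

```go
package main

import "fmt"

// spuriousDoublings returns how many times a stack that starts at
// initial bytes must be doubled before its size exceeds limit.
// Hypothetical helper for illustration only.
func spuriousDoublings(initial, limit int) int {
	n := 0
	for size := initial; size <= limit; size *= 2 {
		n++
	}
	return n
}

func main() {
	// Assumed 2048-byte initial stack; limit taken from the issue title.
	fmt.Println(spuriousDoublings(2048, 250000000))
	// Assumed 64-bit limit, for comparison.
	fmt.Println(spuriousDoublings(2048, 1000000000))
}
```

Under these assumptions the smaller limit is reached two doublings earlier than the larger one, which might be part of why a 32-bit builder trips over this first, but that is speculation.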
In theory this can happen on other platforms. I'm not sure why it is only seen on the ARM builder.
Maybe we want to disable async preemption in the function prologue between the CMP instruction and the call to morestack? Since the goroutine will call morestack, it will be preempted anyway.
@cherrymui That sounds like a good idea. I think we might have to start at the load of the stack guard, as the CMP result is predestined at that point. But then maybe we need to only prevent async preemption if that loaded value is in fact the preempted guard. Tricky.
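The race and the proposed fix can be sketched with a toy model. Nothing here is runtime code: the `goroutine` struct, `poisonGuard`, and the `maskWindow` flag are all invented for illustration, and the model assumes a 2048-byte starting stack and that the signal always lands in the window between the guard comparison and the call to morestack.

```go
package main

import "fmt"

const (
	initialStack = 2048      // assumed starting stack size in bytes
	stackLimit   = 250000000 // limit quoted in the issue title
	poisonGuard  = -1        // stand-in for the clobbered stack guard
)

// goroutine is a toy model, not the runtime's g struct.
type goroutine struct {
	stackSize int
	guard     int  // poisonGuard means a synchronous request is pending
	preempt   bool // a preemption request is outstanding
}

// requestPreempt models issuing both kinds of preemption request.
func (g *goroutine) requestPreempt() {
	g.preempt = true
	g.guard = poisonGuard
}

// asyncPreempt models the scheduler running after the signal: it clears
// the request and restores the stack guard.
func (g *goroutine) asyncPreempt() {
	g.preempt = false
	g.guard = 0
}

// morestack yields if a request is pending, otherwise doubles the stack.
func (g *goroutine) morestack() {
	if g.preempt {
		g.asyncPreempt() // yield to the scheduler; no growth needed
		return
	}
	g.stackSize *= 2
}

// prologue models one function prologue with the signal arriving in the
// window after the comparison. maskWindow models the proposed fix:
// the window is treated as an unsafe point, so the signal is ignored.
func (g *goroutine) prologue(maskWindow bool) {
	needMore := g.guard == poisonGuard // the CMP: the outcome is now fixed
	if !maskWindow {
		g.asyncPreempt()
	}
	if needMore {
		g.morestack()
	}
}

// run repeats the unlucky interleaving and returns the final stack size.
func run(rounds int, maskWindow bool) int {
	g := &goroutine{stackSize: initialStack}
	for i := 0; i < rounds; i++ {
		g.requestPreempt()
		g.prologue(maskWindow)
	}
	return g.stackSize
}

func main() {
	fmt.Println("unmasked exceeds limit:", run(17, false) > stackLimit)
	fmt.Println("masked stack size:", run(17, true))
}
```

In this model, masking the window means morestack always sees the pending request and yields instead of growing. The subtlety raised above, that the window may need to start at the load of the stack guard rather than at the CMP, is not captured here.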
Change https://golang.org/cl/207350 mentions this issue:
Change https://golang.org/cl/207351 mentions this issue:
Change https://golang.org/cl/207349 mentions this issue:
See also #35784.
Currently we use stack map index -2 to mark unsafe points, i.e. PC ranges that are not safe for async preemption. This has a problem: it cannot mark CALL instructions, because stack scanning needs a valid stack map index there. This CL switches to using the register map index to mark unsafe points instead, which does not conflict with stack scanning and can be applied to CALL instructions. This is necessary because the next CL will mark the call to morestack as nonpreemptible.

For #35470.

Change-Id: I357bf26c996e1fee1e7eebe4e6bb07d62930d3f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/207349
Reviewed-by: David Chase
Print the current SP and (old) stack bounds when the stack grows too large. This helps to identify the problem: whether a large stack is used, or something else goes wrong.

For #35470.

Change-Id: I34a4064d5c7280978391d835e171b90d06f87222
Reviewed-on: https://go-review.googlesource.com/c/go/+/207351
Reviewed-by: Emmanuel Odeke
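As a rough illustration of how the SP and stack bounds distinguish the two cases: stacks grow downward, so the bytes in use are the distance from the high bound down to SP, while the allocated size is the distance between the bounds. The helper name and the addresses below are hypothetical and not taken from any builder log.

```go
package main

import "fmt"

// stackUsage reports the bytes in use and the total bytes allocated for a
// downward-growing stack with bounds [lo, hi) and stack pointer sp.
// Hypothetical helper, not a runtime function.
func stackUsage(sp, lo, hi uintptr) (used, total uintptr) {
	return hi - sp, hi - lo
}

func main() {
	// Made-up values: a 128 MiB stack with SP only 1 KiB below the top.
	lo, hi := uintptr(0x10000000), uintptr(0x18000000)
	sp := hi - 0x400
	used, total := stackUsage(sp, lo, hi)
	fmt.Printf("used=%d total=%d\n", used, total)
}
```

A small `used` next to a very large `total` would point to spurious growth rather than deep recursion.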
Should we also close #35784?
Yeah, I think we can close that.
We're seeing stack overflows this week in various tests on the linux-arm builder.
This may be related to #35349, but the stack traces on linux-arm are more diverse.
CC @ianlancetaylor @aclements @mknyszek @cherrymui
2019-11-08T19:24:30-e6c12c3/linux-arm
2019-11-07T19:20:35-ceca99b/linux-arm
2019-11-07T18:39:03-05aa4a7/linux-arm
2019-11-07T16:13:31-0bf2eb5/linux-arm
2019-11-05T20:56:05-81559af/linux-arm
2019-11-05T17:19:16-1b3a1db/linux-arm