runtime/trace: "preempted" StateTransition sometimes has Stack of single zeroed StackFrame #68090
Comments
I have a suspicion as to how this is happening, but not a complete picture yet. The problematic transition you point out also happens from a thread stack. The two cases where such a transition may appear are I suspect that in one of these paths, |
#68093 may be related, but I suspect not. |
Here's a simple reproducer:
It disproves my theory about |
OK, I figured it out. It's that the stack trace has exactly 1 frame in it, but the In the reproducer, the victim goroutine is the GC mark worker (just like in the original post) and when I lower the skip count from 1 to 0, I see:
Unfortunately, there's still the question as to why the bottom frame in the mark worker isn't showing up. |
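Part of the skip-count explanation above is missing from this page, so the following is only a toy model of one reading of it: if the captured stack has exactly one frame and the skip count is also 1, nothing is left to report. The function name `framesAfterSkip` and the PC value are made up for illustration; this is not the runtime's code.

```go
package main

import "fmt"

// framesAfterSkip is a toy model, NOT runtime code. It drops the first
// `skip` entries from a captured list of PCs and returns what remains.
func framesAfterSkip(pcs []uintptr, skip int) []uintptr {
	if skip < 0 {
		skip = 0
	}
	if skip >= len(pcs) {
		// Every captured frame was skipped: nothing left to report.
		return nil
	}
	return pcs[skip:]
}

func main() {
	oneFrame := []uintptr{0x1000} // hypothetical PC, for illustration only

	// Skip count of 1 on a one-frame stack leaves zero frames.
	fmt.Println(len(framesAfterSkip(oneFrame, 1)))

	// Lowering the skip count to 0 keeps the single frame.
	fmt.Println(len(framesAfterSkip(oneFrame, 0)))
}
```

In this model a consumer that still expects at least one frame after skipping would have nothing meaningful to show, which is at least consistent with the symptom in the report.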
The behavior of the reproducer you posted doesn't seem the same as the original failure I see in the net/http benchmark: in my testing, it doesn't include "0x0" frames. In my debugging so far it looks like there are approximately two paths to getting a stack that consists of a single 0x0 frame. The behavior of Sometimes the problem is apparent as soon as @mknyszek, maybe something there is enough of a hint that you can immediately solve the puzzle. But I plan to keep hacking on this until the Go 1.25 freeze sets in (and maybe the fix is small enough to accept at this point in the release cycle).
|
I'll try to take a closer look today, but before that I'll just say that bug fixes are generally fair game for the freeze, even if they're for old bugs (we just prioritize new bugs). |
Thanks. I don't have an estimate of how invasive the fix may be, so I'll keep aiming for pre/early freeze. I see that So the execution trace is trying to describe a real preemption event, about a real goroutine, that really doesn't have any (user-level) functions on its call stack. That's an unusual preemption event! Given that it exists, I'm not sure how to describe it in ways that won't be a surprise to consumers of execution trace data. Maybe we could make it not happen at all: There's very little in This doesn't explain the |
I'm debugging this mainly on darwin/arm64, so the following details are about that platform. (I've re-confirmed that I also see occasional There's some disagreement between runtime/tracestack.go's Preemption events are generated while running on the system stack, which requires special backtracing behavior. It looks like I've included an example below, via a crash I added in There should be a frame for
|
Agreed, this does seem like it's trying to preempt an exiting goroutine. I was gonna respond that newly-created goroutines always start with a stack, so they shouldn't
That's a good question. I wonder if we should just show
Nice find on the disagreement between the two! Looks like yes, we're starting traceback from the wrong point. I'll have to take a closer look to try to understand what the 'right' point is. I suspect that this is probably the reason why there's a frame missing from the mark background worker, after I fix the skip count, in my example, above. (I do think the skip count is probably at least one reason things are wrong. I spot-checked the counts when changing out the tracer, but I wasn't super thorough.) |
Running with Adding Following those, I see that goroutines can experience preemption before their
The bottom of the stack is usually The generated Maybe we say that a goroutine really does have no calls on its call stack at the start and end of its life, and report that as a zero-length call stack rather than as a call stack with a single
Right, but preemption requests change the stack guard, to trigger a Maybe there are also bugs with the Here's a preemption event at the start of a goroutine's life, seen in a self-inflicted crash:
And from
|
Go version
go version devel go1.23-477ad7dd51 Thu Jun 20 16:46:54 2024 +0000 darwin/arm64
Output of go env in your module/workspace:
What did you do?
What did you see happen?
Some StateTransition Events include a Stack and StateTransition.Stack that are not equal to NoStack, but which also don't contain a stack from the Event's goroutine. Instead, they yield a single zeroed StackFrame (PC of 0x0, Line of 0, File and Func of "").
I've only seen this on Running->Runnable transitions, with Reason="preempted". It's also present in go1.22.4.
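For anyone who wants to scan their own traces for this symptom, here is a minimal sketch of the check. It uses local stand-in types rather than the trace reader's real API: `StackFrame` only mirrors the fields named above (PC, Func, File, Line), and `Transition`, `isSingleZeroFrame`, and `suspectTransitions` are hypothetical names invented for this example.

```go
package main

import "fmt"

// StackFrame is a local stand-in that mirrors the fields named in this
// report. It is NOT the trace package's own type.
type StackFrame struct {
	PC   uint64
	Func string
	File string
	Line uint64
}

// Transition is a hypothetical, flattened view of one goroutine state
// transition, assumed to have been extracted from a parsed trace.
type Transition struct {
	Goroutine int64
	From, To  string
	Reason    string
	Stack     []StackFrame
}

// isSingleZeroFrame reports whether the stack is exactly one frame whose
// PC and Line are 0 and whose File and Func are empty.
func isSingleZeroFrame(stack []StackFrame) bool {
	return len(stack) == 1 && stack[0] == (StackFrame{})
}

// suspectTransitions returns the Running->Runnable transitions with reason
// "preempted" whose stack is a single zeroed frame.
func suspectTransitions(ts []Transition) []Transition {
	var out []Transition
	for _, t := range ts {
		if t.From == "Running" && t.To == "Runnable" &&
			t.Reason == "preempted" && isSingleZeroFrame(t.Stack) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Made-up sample data; the goroutine IDs and frame contents are illustrative.
	ts := []Transition{
		{Goroutine: 1, From: "Running", To: "Runnable", Reason: "preempted",
			Stack: []StackFrame{{}}},
		{Goroutine: 2, From: "Running", To: "Runnable", Reason: "preempted",
			Stack: []StackFrame{{PC: 0x1000, Func: "main.work", File: "main.go", Line: 10}}},
		{Goroutine: 3, From: "Running", To: "Runnable", Reason: "preempted"},
	}
	for _, t := range suspectTransitions(ts) {
		fmt.Println("suspect transition on goroutine", t.Goroutine)
	}
}
```

An empty stack (goroutine 3 in the sample) is deliberately not flagged, since that is the shape a consumer would expect when no stack is available.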
Here's the sort of stack I'd expect to see from that execution trace's view of goroutines 25, 26, and 2881:
What did you expect to see?
I expected the stack to be trace.NoStack when no stack was available, or for the stack to contain PC/Func/File/Line corresponding to code that the goroutine had on its stack. I should not see PC of 0x0.
CC @mknyszek @golang/runtime