runtime: signal 2 received on thread with no signal stack #18600
Comments
/cc @rsc @ianlancetaylor |
Which version of Darwin? Is there any more to that stack trace? Please try setting the environment variable |
Sorry, running on macOS 10.12.2 (16C67). Here's a full system stack trace: https://gist.github.com/jbardin/0421e4a7689ebc46732894f3f8e2af8d. I wasn't able to trigger it on Linux, but I may also just be missing the timing window on that machine. |
I've looked over the Go runtime code, and it looks fine. @quentinmit Is it still possible to ssh into that Sierra system? |
@ianlancetaylor Yeah, I don't remember what we had actually set up though. Ping me on Hangouts and I can set it up again. @jbardin Do you have any antivirus software installed? Or haxies? |
@quentinmit, nope. Running vanilla macOS on real hardware. |
"haxies" = things that inject code into running programs; Application Enhancer and SIMBL are examples of that. |
Not running anything like that that I know of, unless macOS is injecting something by default. I did see that the |
Reproduced on 10.12.2 16C67 using the while loop in $GOPATH/src/camlistore.org. |
I confirmed that the signal is arriving on the ordinary goroutine stack, and the stack trace for that goroutine at the time of the signal is:
That's the RawSyscall(SYS_FORK, ...) call that creates the child process. I changed the runtime to record its own pid at init time and then, at the time of the signal, print both the init-time pid and the current pid. That appears to confirm that the signal has been received in the child process (at least not in the parent process!), before the child process has gotten a chance to continue executing after the fork system call.
It looks like, when we create a new thread, we mask all signals, make the new thread, and then restore the signals. We could do the same for fork, using the beforefork and afterfork runtime hooks, except there is no good time to run the afterfork hook and turn signals back on. Any window between 'after the fork' and 'before the exec' where signals are enabled will confuse the runtime if a signal is received. And we can't start the child with signals masked off, or it will likely keep them masked off.
It seems like the best we can do is detect when we receive a signal in a child process that hasn't quite gotten to exec yet and make the child die quietly? I assume that using posix_spawn would be another way around this.
I'm a little surprised this hasn't turned up on other systems too, but I guess the race window is small and it only affects killpg. Or maybe macOS is the only one that clears alternate signal stacks during fork.
/cc @ianlancetaylor for thoughts about how to proceed |
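The pid comparison described above can be approximated outside the runtime. What follows is only a user-level sketch in Go, not the runtime change itself: initPID, isForkedChild, and describe are hypothetical names, and the program sends SIGINT to itself purely to exercise the handler path. It assumes a Unix system.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// initPID is recorded once at program start (hypothetical name, mirroring
// the "record its own pid at init time" experiment from the comment).
var initPID = os.Getpid()

// isForkedChild reports whether the current pid differs from the pid
// recorded at startup, which would mean we are running in a forked child.
func isForkedChild(recorded, current int) bool {
	return recorded != current
}

// describe formats both pids so the two cases are easy to tell apart in logs.
func describe(recorded, current int) string {
	if isForkedChild(recorded, current) {
		return fmt.Sprintf("signal in forked child: init pid %d, current pid %d", recorded, current)
	}
	return fmt.Sprintf("signal in original process: pid %d", current)
}

func main() {
	// Ask for SIGINT to be delivered on a channel instead of killing us.
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGINT)

	// Send SIGINT to this process only (positive pid, no process group).
	if err := syscall.Kill(os.Getpid(), syscall.SIGINT); err != nil {
		fmt.Fprintln(os.Stderr, "kill:", err)
		os.Exit(1)
	}

	select {
	case <-ch:
		fmt.Println(describe(initPID, os.Getpid()))
	case <-time.After(2 * time.Second):
		fmt.Fprintln(os.Stderr, "timed out waiting for SIGINT")
		os.Exit(1)
	}
}
```

In an ordinary Go program the two pids always match, because user code never runs between fork and exec; the mismatch only becomes visible inside the runtime, which is where the experiment above was done.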
Confirmed that using 'kill -2 $!' instead of 'kill -2 %' makes the problem go away: this is about killpg, not a kernel bug where parent signals get delivered to a child. |
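The difference between the two commands comes down to the target passed to kill(2): a positive pid signals one process, while a negative value signals every member of that process group, which is how a child caught between fork and exec can receive the signal too. A minimal Go sketch of that convention follows, assuming a Unix system with a sleep binary on PATH; killTarget and the demo in main are illustrative and not taken from the issue.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// killTarget returns the first argument to pass to kill(2): the id itself
// to signal a single process, or its negation to signal the whole process
// group whose group id is id (the killpg behaviour).
func killTarget(id int, group bool) int {
	if group {
		return -id
	}
	return id
}

func main() {
	// Start a child in its own process group so that signalling the group
	// cannot reach this program or the shell that launched it.
	cmd := exec.Command("sleep", "5")
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		fmt.Fprintln(os.Stderr, "start:", err)
		os.Exit(1)
	}

	// With Setpgid set and no explicit Pgid, the child's process group id
	// equals its own pid.
	pgid := cmd.Process.Pid

	// Signal the whole group, as a job-control 'kill -2 %' would.
	if err := syscall.Kill(killTarget(pgid, true), syscall.SIGINT); err != nil {
		fmt.Fprintln(os.Stderr, "kill:", err)
		os.Exit(1)
	}

	// Wait reports how the child ended.
	fmt.Println("child finished:", cmd.Wait())
}
```

Passing killTarget(pid, false) instead would reach only the one process, which corresponds to the 'kill -2 $!' case that does not trigger the crash.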
This problem does not happen on GNU/Linux because the child process still has an alternate signal stack in effect. At least, that is what I see from this program, which on GNU/Linux prints
Program:
|
On macOS 10.12.2 I get
OK, so we know that Linux doesn't clear sigaltstack at fork (that's nice), but there's still technically a problem. A signal has arrived and we're handling it as if we're in the parent, but we're in the child. If the signal handler did anything that depended on being in the real parent process, we'd be in trouble. We probably do need to either make this impossible or detect it and react accordingly. |
The most correct procedure I can think of is
I will work on a CL. |
To clarify, the reason for changing all the signal handlers to |
SGTM. Thanks. |
I think this can wait for Go 1.9. We're more likely to break something common than fix this rare event. |
Ping @ianlancetaylor. |
CL https://golang.org/cl/45471 mentions this issue. |
@ianlancetaylor should this (plus the follow-up fix 28f650a) get back-ported to Go 1.8.4? |
Russ decided above that this was rare enough that we wouldn't worry about it for 1.8. |
go1.8rc1 && master on darwin
Sending a SIGINT to a
go build ./...
caused a runtime error:
Managed to replicate this by building in a fairly large project, and running for a while