runtime: infinite loop in lockextra on linux/amd64 #42207
Comments
Go1.12 is no longer supported. Does this happen in Go1.15?
It's hard to reproduce, and I couldn't reproduce the deadlock again in either 1.12 or 1.15. The needm code is the same in 1.15 and 1.12, so I suppose the problem is also relevant for Go1.15.

```go
func needm(x byte) {
	if (iscgo || GOOS == "windows") && !cgoHasExtraM {
		// Can happen if C/C++ code calls Go from a global ctor.
		// Can also happen on Windows if a global ctor uses a
		// callback created by syscall.NewCallback. See issue #6751
		// for details.
		//
		// Can not throw, because scheduler is not initialized yet.
		write(2, unsafe.Pointer(&earlycgocallback[0]), int32(len(earlycgocallback)))
		exit(1)
	}

	// Lock extra list, take head, unlock popped list.
	// nilokay=false is safe here because of the invariant above,
	// that the extra list always contains or will soon contain
	// at least one m.
	mp := lockextra(false)

	// Set needextram when we've just emptied the list,
	// so that the eventual call into cgocallbackg will
	// allocate a new m for the extra list. We delay the
	// allocation until then so that it can be done
	// after exitsyscall makes sure it is okay to be
	// running at all (that is, there's no garbage collection
	// running right now).
	mp.needextram = mp.schedlink == 0
	extraMCount--
	unlockextra(mp.schedlink.ptr()) // <--- signal raised before unlockextra
	...
```

sigtrampgo will call badsignal, since g == nil in a cgo callback. badsignal will call needm, and we get a deadlock:

```go
func sigtrampgo(sig uint32, info *siginfo, ctx unsafe.Pointer) {
	if sigfwdgo(sig, info, ctx) {
		return
	}
	c := &sigctxt{info, ctx}
	g := sigFetchG(c)
	setg(g)
	if g == nil {
		if sig == _SIGPROF {
			sigprofNonGoPC(c.sigpc())
			return
		}
		if sig == sigPreempt && preemptMSupported && debug.asyncpreemptoff == 0 {
			// This is probably a signal from preemptM sent
			// while executing Go code but received while
			// executing non-Go code.
			// We got past sigfwdgo, so we know that there is
			// no non-Go signal handler for sigPreempt.
			// The default behavior for sigPreempt is to ignore
			// the signal, so badsignal will be a no-op anyway.
			return
		}
		c.fixsigcode(sig)
		badsignal(uintptr(sig), c) // <--- here we will get deadlock
		return
	}
	...
```

P.S. This code is from the Go1.15 runtime.
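To make the failure mode above concrete, here is a minimal user-space sketch of a spin lock being re-entered on the same thread. It is not runtime code: `tryLock`, `takeM`, `simulate`, and the `locked` sentinel are hypothetical names, the list head is modelled as a simple counter, and the spin is bounded only so that the demo terminates. The assumption is that the real lockextra retries without any bound, which is why the real program hangs instead of returning.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// locked is a sentinel stored in the list head while the list is held.
// All names in this file are hypothetical; this is not runtime code.
const locked = ^uintptr(0)

// tryLock spins until it swaps a non-empty, unlocked head for the sentinel.
// The spin is bounded only so that this demo terminates.
func tryLock(head *uintptr, maxSpins int) (uintptr, bool) {
	for i := 0; i < maxSpins; i++ {
		old := atomic.LoadUintptr(head)
		if old == locked || old == 0 {
			continue // list is held or empty: retry
		}
		if atomic.CompareAndSwapUintptr(head, old, locked) {
			return old, true
		}
	}
	return 0, false
}

// takeM pops one entry from the list. inWindow, if non-nil, runs while the
// list is still locked, standing in for a signal handler that interrupts
// the same thread between lock and unlock.
func takeM(head *uintptr, inWindow func()) bool {
	n, ok := tryLock(head, 1000)
	if !ok {
		return false
	}
	if inWindow != nil {
		inWindow()
	}
	atomic.StoreUintptr(head, n-1) // unlock, leaving one fewer entry
	return true
}

// simulate takes an entry and, inside the locked window, tries to take
// another one the way a nested handler would.
func simulate() (outer, nested bool) {
	head := uintptr(2) // two entries available
	outer = takeM(&head, func() {
		nested = takeM(&head, nil)
	})
	return outer, nested
}

func main() {
	outer, nested := simulate()
	fmt.Println("outer acquired:", outer)
	fmt.Println("nested acquired:", nested)
}
```

The nested attempt can never succeed, because the only code that would release the list is the interrupted caller, and that caller cannot resume until the handler returns.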
UPD: I've written a simple cgo program that reproduces this bug in Go1.15.3.

```
$ go version
go version go1.15.3 linux/amd64
```
Thanks for the reproducer.
Thanks for the test case. I can't reproduce the problem on my system, but I think I see the problem.
Change https://golang.org/cl/265759 mentions this issue:
Change https://golang.org/cl/265778 mentions this issue:
It requires cgo. Also, skip the test on windows and plan9.

For #42207

Change-Id: I8522773f93bc3f9826506a41a08b86a083262e31
Reviewed-on: https://go-review.googlesource.com/c/go/+/265778
Trust: Ian Lance Taylor
Run-TryBot: Ian Lance Taylor
Reviewed-by: Brad Fitzpatrick
Thanks for the quick fix!
@gopherbot Please open backport issues. This bug can cause a deadlock for programs that create threads in C code such that those threads call into Go code, if a signal is received at the wrong time. There is no workaround. Note that CL 265759 had a bug in the test, and that CL 265778 (a test-only change) is also required.
Backport issue(s) opened: #42635 (for 1.14), #42636 (for 1.15). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
Change https://golang.org/cl/271847 mentions this issue:
Change https://golang.org/cl/271848 mentions this issue:
…ting M

Otherwise, if a signal occurs just after we allocated the M, we can deadlock if the signal handler needs to allocate an M itself.

For #42207
Fixes #42636

Change-Id: I76f44547f419e8b1c14cbf49bf602c6e645d8c14
Reviewed-on: https://go-review.googlesource.com/c/go/+/265759
Trust: Ian Lance Taylor
Run-TryBot: Ian Lance Taylor
TryBot-Result: Go Bot
Reviewed-by: Bryan C. Mills
(cherry picked from commit 368c401)
Reviewed-on: https://go-review.googlesource.com/c/go/+/271847

…ting M

Otherwise, if a signal occurs just after we allocated the M, we can deadlock if the signal handler needs to allocate an M itself.

For #42207
Fixes #42635

Change-Id: I76f44547f419e8b1c14cbf49bf602c6e645d8c14
Reviewed-on: https://go-review.googlesource.com/c/go/+/265759
Trust: Ian Lance Taylor
Run-TryBot: Ian Lance Taylor
TryBot-Result: Go Bot
Reviewed-by: Bryan C. Mills
(cherry picked from commit 368c401)
Reviewed-on: https://go-review.googlesource.com/c/go/+/271848
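The commit messages above say that a signal arriving just after the M is allocated can deadlock if the handler needs an M of its own. One plausible reading, and it is only an assumption here, is that the fix keeps signals from running on the thread during that window. The sketch below models that idea in plain Go: `fakeThread`, `raise`, `unblock`, `takeM`, and `run` are all hypothetical names, a "signal" is just a function value, and the list head is a counter. It is not the runtime's implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const locked = ^uintptr(0) // sentinel: list is held (hypothetical)

// fakeThread models a per-thread signal mask: while blocked, raised
// handlers are queued instead of run.
type fakeThread struct {
	blocked bool
	pending []func()
}

// raise runs the handler now, or queues it if signals are blocked.
func (t *fakeThread) raise(h func()) {
	if t.blocked {
		t.pending = append(t.pending, h)
		return
	}
	h()
}

// unblock clears the mask and delivers every queued handler.
func (t *fakeThread) unblock() {
	t.blocked = false
	queued := t.pending
	t.pending = nil
	for _, h := range queued {
		h()
	}
}

// tryLock is a bounded spin lock on the list head (bounded for the demo).
func tryLock(head *uintptr, maxSpins int) (uintptr, bool) {
	for i := 0; i < maxSpins; i++ {
		old := atomic.LoadUintptr(head)
		if old != locked && old != 0 && atomic.CompareAndSwapUintptr(head, old, locked) {
			return old, true
		}
	}
	return 0, false
}

// takeM pops one entry. sig, if non-nil, is raised inside the locked window.
func takeM(t *fakeThread, head *uintptr, blockSignals bool, sig func()) bool {
	if blockSignals {
		t.blocked = true
		defer t.unblock() // deliver deferred signals only after unlocking
	}
	n, ok := tryLock(head, 1000)
	if !ok {
		return false
	}
	if sig != nil {
		t.raise(sig)
	}
	atomic.StoreUintptr(head, n-1) // unlock
	return true
}

// run reports whether the outer call and the handler's nested call succeed.
func run(blockSignals bool) (outer, nested bool) {
	t := &fakeThread{}
	head := uintptr(2)
	outer = takeM(t, &head, blockSignals, func() {
		nested = takeM(t, &head, blockSignals, nil)
	})
	return outer, nested
}

func main() {
	o, n := run(false)
	fmt.Println("signals not blocked: outer =", o, "nested =", n)
	o, n = run(true)
	fmt.Println("signals blocked:     outer =", o, "nested =", n)
}
```

With the mask in place, the handler only runs after the list has been released, so its own acquisition finds an unlocked list.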
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
The Go program uses cgo. The C part of the program creates two threads with pthread_create: event_loop and watchdog (which monitors for a hung event loop). The C code doesn't install any signal handlers and doesn't change the signal mask.
The C code calls Go code for logging.
Simple C code of watchdog thread that uses popen/pclose and generates SIGCHLD signals:
Backtrace of deadlock:
What did you expect to see?
Successful execution of cgo callback.
What did you see instead?
Execution got stuck in an infinite loop in runtime.lockextra.
What happened:
It looks like there is a deadlock when a signal is raised in the section between lockextra and unlockextra.
So is this a valid bug, or have I violated some cgo rules? I found issue #34391, but its root cause looks different.