runtime: timeouts in os/signal tests #27520
I traced this into a bug in semasleep in runtime/os_darwin.go. |
Change https://golang.org/cl/133655 mentions this issue: |
@gopherbot, please open an issue for backport to 1.11.1. |
Backport issue(s) opened: #27521 (for 1.11). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases. |
@randall77 you were right on target with your prognosis and fix of this issue, nice! I got some time tonight to work on a repro for a regression test, as I had promised in a tertiary issue. This simple program can reproduce this issue on unpatched versions:

//+build !windows

package main

import (
	"log"
	"os"
	"time"
)

func main() {
	log.Printf("PID: %d", os.Getpid())
	<-time.After(4 * time.Second)
}

and given this bash script:

#!/bin/bash
function si {
	for ((i=0; i<40; i++))
	do
		kill -SIGIO $1 && echo "Sent SIGIO" || break
		sleep 0.6
	done
}
si $1

Running the pair
If you run
With the fix
The Go program returns right after its fixed number of seconds. The bash script prints "Sent SIGIO" only about 4 times and exits, and the Go program exits properly:
$ time go run samp.go
2018/09/12 02:06:09 PID: 77259
real 0m4.212s
user 0m0.201s
sys 0m0.122s
Without the fix
The bash shell will print 40 times:
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
Sent SIGIO
and then finally the Go program will exit after >= 40 * 0.6s, i.e. >= 24s:
$ time go run samp.go
2018/09/12 02:03:47 PID: 77177
real 0m31.535s
user 0m0.191s
sys 0m0.118s
Go regression test
I tried to work on a repro but perhaps I need a fresh mind in the morning, as the repro when translated should be trivial. Nonetheless here it is:

//+build !windows

package main

import (
	"io/ioutil"
	"log"
	"os"
	"os/exec"
	"path/filepath"
	"syscall"
	"time"
)

func main() {
	tempDir, err := ioutil.TempDir("", "signal-refresh")
	if err != nil {
		log.Fatalf("Failed to create the temp directory: %v", err)
	}
	defer os.RemoveAll(tempDir)

	// Given the simple program below
	repro := `
package main
import "time"
func main() {
	<-time.After(2 * time.Second)
}
`
	reproPath := filepath.Join(tempDir, "repro.go")
	if err := ioutil.WriteFile(reproPath, []byte(repro), 0755); err != nil {
		log.Printf("Failed to create temp file for repro.go: %v", err)
		os.Exit(-1)
	}

	// Once it is written now we need to run it
	cmd := exec.Command("go", "run", reproPath)
	if err := cmd.Start(); err != nil {
		log.Printf("Failed to start command: %v", err)
		os.Exit(-1)
	}
	doneCh := make(chan error)
	go func() {
		doneCh <- cmd.Wait()
	}()

	// Now that we've started the repro, we
	// can continuously send to it signal SIGIO.
	unfixedTimer := time.NewTimer(4 * time.Second)
	for {
		select {
		case <-time.After(600 * time.Millisecond):
			// Send the pesky signal that toggle spinning
			// till infinity if #27520 is not fixed!!
			cmd.Process.Signal(syscall.SIGIO)
			log.Println("Sent SIGIO")
		case <-unfixedTimer.C:
			log.Println("Unfortunately the issue hasn't yet been fixed!")
			cmd.Process.Signal(syscall.SIGKILL)
			return
		case err := <-doneCh:
			if err != nil {
				log.Printf("The program returned but unfortunately with an error: %v", err)
			} else {
				log.Println("Hooray, the issue is fixed!!")
			}
			return
		}
	}
}

but something isn't right with my Go translation and unfortunately it doesn't reproduce the issue as a standalone program, whereas the previous Go and shell programs reproduce it reliably. Perhaps it is because the parent is the one sending the SIGIO? I'll also kindly page @ianlancetaylor to help me as this involves signals. |
When you use |
Change https://golang.org/cl/135015 mentions this issue: |
@ianlancetaylor awesome, thank you for pointing that out! Indeed, that was my problem. With that update, the standalone Go repro below can now reliably serve as a regression test:

//+build !windows

package main

import (
	"io/ioutil"
	"log"
	"os"
	"os/exec"
	"path/filepath"
	"syscall"
	"time"
)

func main() {
	tempDir, err := ioutil.TempDir("", "signal-refresh")
	if err != nil {
		log.Fatalf("Failed to create the temp directory: %v", err)
	}
	defer os.RemoveAll(tempDir)

	// Given the simple program below
	repro := `
package main
import "time"
func main() {
	<-time.After(2 * time.Second)
}
`
	mainPath := filepath.Join(tempDir, "main.go")
	if err := ioutil.WriteFile(mainPath, []byte(repro), 0755); err != nil {
		log.Printf("Failed to create temp file for repro.go: %v", err)
		return
	}
	binaryPath := filepath.Join(tempDir, "binary")
	out, err := exec.Command("go", "build", "-o", binaryPath, mainPath).CombinedOutput()
	if err != nil {
		log.Printf("Failed to compile the binary: err: %v\nOutput: %s\n", err, out)
		return
	}
	if err := os.Chmod(binaryPath, 0755); err != nil {
		log.Printf("Failed to chmod binary: %v", err)
		return
	}

	// Now run the binary
	cmd := exec.Command(binaryPath)
	if err := cmd.Start(); err != nil {
		log.Printf("Failed to start command: %v", err)
		return
	}
	doneCh := make(chan error)
	go func() {
		doneCh <- cmd.Wait()
	}()

	// Now that we've started the repro, we
	// can continuously send to it signal SIGIO.
	unfixedTimer := time.NewTimer(4 * time.Second)
	for {
		select {
		case <-time.After(600 * time.Millisecond):
			// Send the pesky signal that toggle spinning
			// till infinity if #27520 is not fixed!!
			cmd.Process.Signal(syscall.SIGIO)
			log.Println("Sent SIGIO")
		case <-unfixedTimer.C:
			log.Println("Unfortunately the issue hasn't yet been fixed!")
			cmd.Process.Signal(syscall.SIGKILL)
			return
		case err := <-doneCh:
			if err != nil {
				log.Printf("The program returned but unfortunately with an error: %v", err)
			} else {
				log.Println("Hooray, the issue is fixed!!")
			}
			return
		}
	}
}

Results
$ go run main.go
2018/09/12 11:48:33 Sent SIGIO
2018/09/12 11:48:34 Sent SIGIO
2018/09/12 11:48:35 Sent SIGIO
2018/09/12 11:48:35 Sent SIGIO
2018/09/12 11:48:36 Sent SIGIO
2018/09/12 11:48:36 Sent SIGIO
2018/09/12 11:48:37 Unfortunately the issue hasn't yet been fixed!
$ go run main.go
2018/09/12 11:49:03 Sent SIGIO
2018/09/12 11:49:04 Sent SIGIO
2018/09/12 11:49:04 Sent SIGIO
2018/09/12 11:49:05 Hooray, the issue is fixed!!
With that, I've mailed https://go-review.googlesource.com/c/go/+/135015 |
A regression test in which: for a program that invokes semasleep, we send non-terminal signals such as SIGIO. Since the signal wakes up pthread_cond_timedwait_relative_np, after CL 133655, we should only re-spin for the amount of time left, instead of re-spinning with the original duration which would cause an indefinite spin.
Updates #27520
Change-Id: I744a6d04cf8923bc4e13649446aff5e42b7de5d8
Reviewed-on: https://go-review.googlesource.com/135015
Run-TryBot: Emmanuel Odeke <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
Not sure if this is the right place for it, but on go1.13 I'm seeing this issue pop up intermittently. In a CentOS 6 chroot on CentOS 7 x86_64 (using
|
@mikesmitty This issue has been closed for almost a year and we think it is fixed. Please open a new issue with full details. Thanks. |
Change https://golang.org/cl/233361 mentions this issue: |
From @mwf on issue #25696 :
Hi! I ran into a similar problem here: #26317 (comment)
$ time go test all
...
*** Test killed: ran too long (10m0s).
FAIL os/signal 605.012s
...
go test all 1521.37s user 69.95s system 228% cpu 11:35.33 total
It hangs even with -short flag passed.
The test passes without any problem in go1.10.3.
The most interesting thing is that
go test -short os/signal
passes OK if you've never run go test os/signal
after installing go. Once you run it, short tests also hang; even go clean -testcache
doesn't help.
Sorry, even the -short sometimes hangs.
Please try running this:
I killed the hanging tests manually with
ps aux | grep signal.test | grep -v "grep" | awk '{print $2}' | xargs kill -9
So it has nothing to do with short/non-short difference.