-
Notifications
You must be signed in to change notification settings - Fork 18k
runtime: frequent timeouts in doTestParallelReaders since 2021-11-04 #49680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Marking as release-blocker for Go 1.18 because the failure rate is markedly higher as of 2021-11-04: this looks like a regression, or at the very least a flaky test exacerbated by 1.18 changes. (@mknyszek, another one for the “GC running during test” pile, perhaps?) |
Oh, I think I see what's wrong here. The test already says that it's unsafe for a GC to run, and calls I'll try to confirm it. |
Change https://golang.org/cl/366534 mentions this issue: |
I've had trouble reproducing the issue, but I do think my previous theory does indicate a real potential problem. That theory does also fit into the change in minimum heap size triggering the issue. Sent golang.org/cl/366534. We can try landing it and see if it fixes it, placing this issue in a WaitingForInfo state. I think it's something that the test should be doing anyway. |
Actually, I'm more confident this is actually the fix. If I simply start a GC on another goroutine before the test starts, it easily reproduces. I guess it's possible this is some other bug, but it's unlikely at this point. Once I land https://golang.org/cl/366534 I'll close this optimistically. |
Currently this test makes it clear that it's unsafe for a GC to run, otherwise a deadlock could occur, so it calls SetGCPercent(-1). However, a GC may be actively in progress, and SetGCPercent is not going to end any in-progress GC. Call runtime.GC to block until at least the current GC is over. Updates #49680. Change-Id: Ibdc7d378e8cf7e05270910e92effcad8c6874e59 Reviewed-on: https://go-review.googlesource.com/c/go/+/366534 Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> Reviewed-by: Than McIntosh <[email protected]> Reviewed-by: Cherry Mui <[email protected]>
Change https://golang.org/cl/367874 mentions this issue: |
As suggested by #49680, a GC could be in-progress when we disable GC. Force a GC after we pause to ensure we don't hang in this case. For #49695 Change-Id: I4fc4c06ef2ac174217c3dcf7d58c7669226e2d24 Reviewed-on: https://go-review.googlesource.com/c/go/+/367874 Run-TryBot: Paul Murphy <[email protected]> TryBot-Result: Go Bot <[email protected]> Reviewed-by: Ian Lance Taylor <[email protected]> Trust: Paul Murphy <[email protected]>
Update on the root cause here: the fixes for #49680, #49695, and #45867 all assumed that |
Change https://golang.org/cl/369815 mentions this issue: |
Fixes for #49680, #49695, #45867, and #49370 all assumed that SetGCPercent(-1) doesn't block until the GC's mark phase is done, but it actually does. The cause of 3 of those 4 failures comes from the fact that at the beginning of the sweep phase, the GC does try to preempt every P once, and this may run concurrently with test code. In the fourth case, the issue was likely that only *one* of the debug_test.go tests was missing a call to SetGCPercent(-1). Just to be safe, leave a TODO there for now to remove the extraneous runtime.GC calls, but leave the calls in. Updates #49680, #49695, #45867, and #49370. Change-Id: Ibf4e64addfba18312526968bcf40f1f5d54eb3f1 Reviewed-on: https://go-review.googlesource.com/c/go/+/369815 Reviewed-by: Austin Clements <[email protected]> Trust: Michael Knyszek <[email protected]> Run-TryBot: Michael Knyszek <[email protected]> TryBot-Result: Gopher Robot <[email protected]>
greplogs --dashboard -md -l -e '(?ms)\ngoroutine \d+ \[run.*\]:\n(?:.+\n\t.*\n)*runtime_test\.doTestParallelReaders'
2021-11-18T06:05:29-f6647f2/linux-ppc64le-buildlet
2021-11-18T02:17:23-ce7e501/linux-ppc64-buildlet
2021-11-17T21:26:25-0440fb8/linux-ppc64le-power9osu
2021-11-17T17:36:36-0555ea3/linux-ppc64le-power9osu
2021-11-17T04:31:40-489f587/linux-ppc64le-power9osu
2021-11-17T04:31:22-f384c70/linux-ppc64le-power9osu
2021-11-16T18:36:59-6c36c33/linux-ppc64-buildlet
2021-11-16T07:47:08-313cae3/linux-ppc64le-power9osu
2021-11-15T21:22:14-42fa03a/linux-ppc64le-power9osu
2021-11-15T21:22:09-184ca3c/linux-ppc64le-power9osu
2021-11-13T00:26:24-787708a/linux-ppc64le-power9osu
2021-11-12T22:20:50-9150c16/linux-ppc64le-power9osu
2021-11-11T13:58:28-d76b1ac/linux-ppc64-buildlet
2021-11-11T05:16:39-99fa49e/linux-ppc64le-buildlet
2021-11-09T20:47:54-a096316/linux-ppc64le-power9osu
2021-11-09T19:01:06-81f37a7/linux-ppc64le-buildlet
2021-11-09T19:01:06-81f37a7/linux-ppc64le-power9osu
2021-11-08T21:52:47-955f9f5/linux-ppc64le-power9osu
2021-11-08T21:28:37-ccea0b2/linux-ppc64le-power9osu
2021-11-06T19:41:12-7ca772a/linux-ppc64le-buildlet
2021-11-05T20:59:43-dbd3cf8/linux-ppc64le-buildlet
2021-11-05T18:20:07-53bab19/linux-ppc64le-power9osu
2021-11-05T17:46:27-4f543b5/linux-ppc64-buildlet
2021-11-05T15:52:40-a0d661a/android-amd64-emu
2021-11-04T23:56:29-0e5f287/linux-ppc64le-buildlet
2021-11-04T21:50:21-bfd74fd/linux-ppc64le-buildlet
2021-11-04T20:01:22-6d1fffa/android-amd64-emu
2021-11-04T20:00:54-961aab2/illumos-amd64
2020-01-31T23:17:00-8390c47/android-amd64-emu
2019-11-01T05:38:51-e96fd13/netbsd-amd64-8_0
The text was updated successfully, but these errors were encountered: