Skip to content

net: TestDialParallelSpuriousConnection occasionally times out #34495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bcmills opened this issue Sep 24, 2019 · 11 comments
Closed

net: TestDialParallelSpuriousConnection occasionally times out #34495

bcmills opened this issue Sep 24, 2019 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Sep 24, 2019

From the darwin-amd64-race builder (https://build.golang.org/log/96aa6d382799b872cbedeb56e3b9d2076acc6546):

panic: test timed out after 3m0s

goroutine 15 [running]:
testing.(*M).startAlarm.func1()
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:1377 +0x11c
created by time.goFunc
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/time/sleep.go:168 +0x52

goroutine 1 [chan receive]:
testing.(*T).Run(0xc000102000, 0x13ceb09, 0x22, 0x13d56f0, 0x1)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:961 +0x68a
testing.runTests.func1(0xc000102000)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:1202 +0xa7
testing.tRunner(0xc000102000, 0xc000095d08)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:909 +0x19a
testing.runTests(0xc0000d6120, 0x1609be0, 0xe4, 0xe4, 0x0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:1200 +0x522
testing.(*M).Run(0xc0000e6000, 0x0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:1117 +0x300
net.TestMain(0xc0000e6000)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/main_test.go:52 +0x47
main.main()
	_testmain.go:564 +0x224

goroutine 29 [semacquire]:
sync.runtime_Semacquire(0xc0000b1908)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/runtime/sema.go:56 +0x42
sync.(*WaitGroup).Wait(0xc0000b1900)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/sync/waitgroup.go:130 +0xb1
net.TestDialParallelSpuriousConnection(0xc000102400)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/dial_test.go:502 +0x605
testing.tRunner(0xc000102400, 0x13d56f0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:909 +0x19a
created by testing.(*T).Run
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/testing/testing.go:960 +0x652

goroutine 31 [IO wait]:
internal/poll.runtime_pollWait(0x2495f08, 0x72, 0x0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/runtime/netpoll.go:184 +0x55
internal/poll.(*pollDesc).wait(0xc0000e6598, 0x72, 0x0, 0x0, 0x13c32f2)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/internal/poll/fd_poll_runtime.go:87 +0xe4
internal/poll.(*pollDesc).waitRead(...)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc0000e6580, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/internal/poll/fd_unix.go:384 +0x2cc
net.(*netFD).accept(0xc0000e6580, 0xc0000b1830, 0xf, 0xf)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/fd_unix.go:238 +0x56
net.(*TCPListener).accept(0xc0000d65e0, 0x5, 0xf, 0x5)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/tcpsock_posix.go:139 +0x50
net.(*TCPListener).Accept(0xc0000d65e0, 0xbf5a84fb15435818, 0x12b4d9c54, 0x160d240, 0xc0001e37b0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/tcpsock.go:261 +0x50
net.TestDialParallelSpuriousConnection.func1(0xc000256000, 0x1432d80, 0xc0000d65e0)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/dial_test.go:439 +0x7c
net.(*dualStackServer).buildup.func1(0xc0000d65a0, 0xc000256000, 0x1)
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/mockserver_test.go:148 +0xb3
created by net.(*dualStackServer).buildup
	/var/folders/dx/k53rs1s93538b4x20g46cj_w0000gn/T/workdir/go/src/net/mockserver_test.go:147 +0x86
FAIL	net	180.026s

It's not obvious to me whether the test was deadlocked or just ran out of time.

If the latter, perhaps some of its tests should be skipped in short mode (and/or when the race detector is enabled)?

CC @mikioh @bradfitz @ianlancetaylor

@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Sep 24, 2019
@bcmills bcmills added this to the Go1.14 milestone Sep 24, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Sep 24, 2019

Compare #32534, #32919.

@bcmills
Copy link
Contributor Author

bcmills commented Oct 2, 2019

Same failure mode on solaris-amd64-oraclerel: https://build.golang.org/log/05176237c5a06f48100a31a8b05997879c9f8ea1

@bcmills bcmills changed the title net: TestDialParallelSpuriousConnection timeout on darwin-amd64-race net: TestDialParallelSpuriousConnection occasionally times out Oct 2, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Oct 8, 2019

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Oct 31, 2019

@bcmills
Copy link
Contributor Author

bcmills commented Mar 2, 2020

A different (but likely related) failure mode:

--- FAIL: TestDialParallelSpuriousConnection (1.49s)
    dial_test.go:456: got read tcp6 [::1]:59891->[::1]:59893: i/o timeout; want EOF
FAIL
FAIL	net	4.495s

2020-02-29T17:02:40-74f8983/darwin-arm64-corellium
2020-02-24T16:39:52-3093959/darwin-arm64-corellium
2020-01-31T20:18:54-1b7fefc/darwin-arm64-corellium
2020-01-23T21:01:12-ace25f8/darwin-arm-mg912baios
2020-01-19T14:04:09-8e0be05/darwin-arm-mg912baios

@bcmills
Copy link
Contributor Author

bcmills commented Mar 11, 2020

@bcmills bcmills changed the title net: TestDialParallelSpuriousConnection occasionally times out net: occasional "i/o timeout" in TestDialParallelSpuriousConnection Mar 11, 2020
@bcmills bcmills changed the title net: occasional "i/o timeout" in TestDialParallelSpuriousConnection net: TestDialParallelSpuriousConnection occasionally times out Mar 11, 2020
@bcmills
Copy link
Contributor Author

bcmills commented Mar 11, 2020

Actually, those new failures seem Corellium-specific. Opening a new issue.

@bcmills bcmills closed this as completed Mar 11, 2020
@golang golang locked and limited conversation to collaborators Mar 11, 2021
@golang golang unlocked this conversation Jan 5, 2022
@bcmills bcmills reopened this Jan 5, 2022
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/375694 mentions this issue: net: synchronize instead of sleeping in TestDialParallelspuriousconnection

@bcmills bcmills modified the milestones: Backlog, Go1.18 Jan 5, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Jan 5, 2022

This is a recurring test failure and does not seem to be port-specific, so marking as release-blocker. I've mailed what I believe to be a fix.

@bcmills bcmills self-assigned this Jan 6, 2022
jproberts pushed a commit to jproberts/go that referenced this issue Jun 21, 2022
…ction

The arbitrary sleep in this test is empirically not always long enough
on slower builders. However, we know the exact number of connections
that should be dialed: we can wait on that number in the dial hook
instead.

Fixes golang#34495

Change-Id: I538244ceb75a80271a724304b993309482bd5b41
Reviewed-on: https://go-review.googlesource.com/c/go/+/375694
Trust: Bryan Mills <[email protected]>
Run-TryBot: Bryan Mills <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@rsc rsc unassigned bcmills Jun 23, 2022
@golang golang locked and limited conversation to collaborators Jun 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

3 participants