Skip to content

x/time/rate: TestWaitCancel failures due to mismatched delays #43055

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bcmills opened this issue Dec 7, 2020 · 8 comments
Closed

x/time/rate: TestWaitCancel failures due to mismatched delays #43055

bcmills opened this issue Dec 7, 2020 · 8 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2020

2020-06-30T17:30:20-3af7569/android-amd64-emu

--- FAIL: TestWaitCancel (0.10s)
    rate_test.go:409: lim.WaitN(will-cancel, lim, 3) = context canceled with delay 96.22425ms ; want <non-nil error> with delay 100ms
    rate_test.go:438: tokens:2.0118626 last:2020-11-05 18:07:51.77802614 +0000 UTC m=+0.318446581 lastEvent:2020-11-05 18:07:51.87644761 +0000 UTC m=+0.417260321
FAIL
exitcode=1FAIL	golang.org/x/time/rate	85.825s

2020-06-30T17:30:20-3af7569/darwin-amd64-nocgo

--- FAIL: TestWaitCancel (0.41s)
    rate_test.go:409: lim.WaitN(will-cancel, lim, 3) = <nil> with delay 262.927081ms ; want <non-nil error> with delay 100ms
    rate_test.go:438: tokens:-1.99996442 last:2020-12-07 13:16:57.669299 -0500 EST m=+0.205996473 lastEvent:2020-12-07 13:16:57.869295442 -0500 EST m=+0.405992915
    rate_test.go:409: lim.WaitN(act-now-after-cancel, lim, 2) = <nil> with delay 146.745851ms ; want <nil> with delay 0s
FAIL
FAIL	golang.org/x/time/rate	0.870s

CC @Sajmani

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 7, 2020
@bcmills bcmills modified the milestones: Unplanned, Backlog Dec 7, 2020
@Sajmani Sajmani self-assigned this Jan 14, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Feb 24, 2022

Looks like we need to extend the more generous timing slop to android and plan9:

greplogs --dashboard -md -l -e 'FAIL: .*\n rate_test\.go:\d+: ' --since=2022-02-05

2022-02-24T00:04:18-90d013b-78e9976/android-386-emu
2022-02-18T20:32:59-90d013b-61b5c86/android-386-emu
2022-02-18T00:51:15-90d013b-a222963/android-386-emu
2022-02-11T22:01:05-90d013b-badba35/plan9-arm

@bcmills bcmills added NeedsFix The path to resolution is known, but the work has not been done. Testing An issue that has been verified to require only test changes, not just a test failure. labels Feb 24, 2022
@bcmills bcmills modified the milestones: Backlog, Go1.19 Feb 24, 2022
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 24, 2022
@bcmills bcmills self-assigned this Feb 24, 2022
@bcmills
Copy link
Contributor Author

bcmills commented Apr 8, 2022

greplogs --dashboard -md -l -e 'FAIL: .*\n rate_test\.go:\d+: ' --since=2022-02-25

2022-04-07T14:42:13-0e9765c-063f403/linux-ppc64le-buildlet
2022-04-02T17:45:45-0e9765c-01c83be/solaris-amd64-oraclerel

@bcmills
Copy link
Contributor Author

bcmills commented May 5, 2022

greplogs -l -e 'FAIL: .*\n[ ]{4}rate_test\.go:\d+: ' --since=2022-04-08
2022-05-04T18:54:31-583f2d6-037b209/darwin-amd64-10_15

--- FAIL: TestWaitCancel (0.40s)
    rate_test.go:475: lim.WaitN(will-cancel, lim, 3) = <nil> with delay 221.618409ms; want <non-nil error> with delay 100ms (±50ms)
    rate_test.go:477: tokens:-1.99995866 last:2022-05-04 15:34:43.084733 -0400 EDT m=+0.211122342 lastEvent:2022-05-04 15:34:43.284728866 -0400 EDT m=+0.411118208
    rate_test.go:478: lim.WaitN(act-now-after-cancel, lim, 2) = <nil> with delay 178.065362ms; want <nil> with delay 0s (±50ms)
FAIL

@bcmills
Copy link
Contributor Author

bcmills commented May 5, 2022

(Marking as release-blocker via #11811darwin/amd64 is a first class port.)

@mknyszek
Copy link
Contributor

Ping because this is a release blocker.

@dmitshur dmitshur added the okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 label Jun 1, 2022
@cherrymui
Copy link
Member

Checking in as this is a release blocker. @bcmills are you looking into this, or you need someone else to take a look? Thanks.

@ianlancetaylor
Copy link
Contributor

The TestWaitCancel test is timing dependent. It starts a goroutine that sleeps for 100 ms and then cancels a context. The test goroutine will acquire more tokens, which will require it to sleep for 200 ms. The test passes if the context is cancelled before the test goroutine sleep completes. This is clearly going to be flaky in general, although in practice it fails rarely.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/411314 mentions this issue: rate: use fake time for testing

@bcmills bcmills assigned ianlancetaylor and unassigned bcmills and Sajmani Jun 17, 2022
@golang golang locked and limited conversation to collaborators Jun 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

7 participants