-
Notifications
You must be signed in to change notification settings - Fork 21k
cmd/devp2p: fix flakey tests in CI #22757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/devp2p: fix flakey tests in CI #22757
Conversation
598fa5f
to
ef36620
Compare
Done, but now it needs a rebase |
…test chain only after block is received
ef36620
to
79e82fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally I'm ok with it as it is, if it solves the problems, but would prefer if we could get rid of the random timeout
@@ -52,6 +52,7 @@ func TestEthSuite(t *testing.T) { | |||
t.Fatal() | |||
} | |||
}) | |||
time.Sleep(100 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit indicative of it still being flaky. I'm guessing you added it to make the spurious packets from earlier tests to interfere with later tests, but is it still needed?
If it's only once in a while, maybe we can either figure out what spurious packets those are, and just 'continue' on them, or maybe somehow drain the deliver buffer between test executions?
Just having a random 100ms timout like that means it'll work "more often", but still spuriously fail on e.g. appveyor.
Not a blocker, but I would prefer if we solved it without having that in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree.
LGTM! |
Unfortunately, still flaky on Appveyor:
|
@@ -319,10 +319,10 @@ func (s *Suite) sendNextBlock66(t *utesting.T) { | |||
} | |||
// send announcement and wait for node to request the header | |||
s.testAnnounce66(t, sendConn, receiveConn, blockAnnouncement) | |||
// update test suite chain | |||
s.chain.blocks = append(s.chain.blocks, s.fullChain.blocks[nextBlock]) | |||
// wait for client to update its chain | |||
if err := receiveConn.waitForBlock66(s.chain.Head()); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be ?
if err := receiveConn.waitForBlock66(s.chain.Head()); err != nil { | |
if err := receiveConn.waitForBlock66(s.fullchain.blocks[nextBlock]); err != nil { |
I really don't know, just a thought..?
i think you also need to apply this: https://github.com/renaynay/go-ethereum/blob/b5eb4d1195849a8ca75fbdeb5976fe7234d0e142/cmd/devp2p/internal/ethtest/types.go#L345 here:
Or, well, maybe it doesn't matter, since we have request ids. And it won't make it less flaky, so feel free to ignore it |
Ah, so the error is in the status exchange, where the 'other side' has a lower block than we expected. Isn't it because the previous broacast test already progressed 'us' one block? |
This PR fixes a couple of issues in the eth test suite that caused flakiness when run in the CI.
This PR fixes a couple of issues in the eth test suite that caused flakiness when run in the CI.
It depends on #22754 getting merged.