cmd/devp2p: fix flakey tests in CI #22757

Merged
holiman merged 9 commits into ethereum:master from ethtest-fix-flakeyOldAnnounce
Apr 30, 2021

renaynay
Contributor

@renaynay renaynay commented Apr 28, 2021

This PR fixes a couple of issues in the eth test suite that caused flakiness when run in the CI.

It depends on #22754 getting merged.

@renaynay renaynay force-pushed the ethtest-fix-flakeyOldAnnounce branch from 598fa5f to ef36620 Compare April 28, 2021 11:07
@renaynay renaynay changed the title [WIP] cmd/devp2p: fix flakey tests in CI cmd/devp2p: fix flakey tests in CI Apr 28, 2021
@renaynay renaynay marked this pull request as ready for review April 28, 2021 12:20
@holiman
Contributor

holiman commented Apr 28, 2021

> It depends on #22754 getting merged.

Done, but now it needs a rebase

@renaynay renaynay force-pushed the ethtest-fix-flakeyOldAnnounce branch from ef36620 to 79e82fa Compare April 28, 2021 20:27
Contributor

@holiman holiman left a comment

Generally I'm ok with it as it is, if it solves the problems, but would prefer if we could get rid of the random timeout

@@ -52,6 +52,7 @@ func TestEthSuite(t *testing.T) {
t.Fatal()
}
})
time.Sleep(100 * time.Millisecond)
Contributor

This is a bit indicative of it still being flaky. I'm guessing you added it to keep spurious packets from earlier tests from interfering with later tests, but is it still needed?
If it's only once in a while, maybe we can either figure out what spurious packets those are, and just 'continue' on them, or maybe somehow drain the deliver buffer between test executions?

Just having a random 100ms timeout like that means it'll work "more often", but still spuriously fail on e.g. appveyor.

Not a blocker, but I would prefer if we solved it without having that in
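
The "drain the deliver buffer" idea above can be sketched as a non-blocking loop over a buffered channel. This is a minimal illustration only: `drain` and the `deliver` channel are hypothetical names, and the sketch assumes the pending messages sit in a buffered Go channel, which may not match how the ethtest connection actually stores them.

```go
package main

import "fmt"

// drain discards every message currently queued on ch without blocking
// and reports how many were dropped. Hypothetical helper, not part of
// the ethtest package.
func drain(ch <-chan string) int {
	dropped := 0
	for {
		select {
		case <-ch:
			dropped++
		default:
			// Nothing queued right now, so stop instead of waiting.
			return dropped
		}
	}
}

func main() {
	// Stand-in for a connection's buffer of received messages.
	deliver := make(chan string, 8)
	deliver <- "stale announcement"
	deliver <- "stale header request"

	fmt.Println("dropped:", drain(deliver))
	fmt.Println("still queued:", len(deliver))
}
```

A drain like this only removes what has already been queued. A packet that is still in flight when the next test starts would arrive afterwards, so it narrows the window rather than closing it.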

Contributor Author

I agree.

@holiman
Contributor

holiman commented Apr 29, 2021

LGTM!

@holiman
Contributor

holiman commented Apr 29, 2021

Unfortunately, still flaky on Appveyor:

not ok 1 TestBroadcast_66
# wrong head block in status, want:  0x8c795a2497f393359fd66bfd5696442d12a81ccb1110dffd636604bfc9af4df3 (block 1000) have 0x0e70e01064023f70f047dbf0b97e7109e4aa5df3d643d45f8e91e47d3d67a424

@@ -319,10 +319,10 @@ func (s *Suite) sendNextBlock66(t *utesting.T) {
}
// send announcement and wait for node to request the header
s.testAnnounce66(t, sendConn, receiveConn, blockAnnouncement)
// update test suite chain
s.chain.blocks = append(s.chain.blocks, s.fullChain.blocks[nextBlock])
// wait for client to update its chain
if err := receiveConn.waitForBlock66(s.chain.Head()); err != nil {
Contributor

Shouldn't this be ?

Suggested change
if err := receiveConn.waitForBlock66(s.chain.Head()); err != nil {
if err := receiveConn.waitForBlock66(s.fullChain.blocks[nextBlock]); err != nil {

I really don't know, just a thought..?
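
Whether the suggested change makes a difference depends on what `Head()` returns. The sketch below uses simplified stand-ins for the suite's types and assumes that `Head()` returns the last element of `blocks`; `Block`, `Chain` and `appendNext` are illustrative names, not the real ethtest definitions. Under that assumption, once the block has been appended, both expressions refer to the same block.

```go
package main

import "fmt"

// Block and Chain are simplified stand-ins for the suite's types.
type Block struct{ Number int }

type Chain struct{ blocks []*Block }

// Head is ASSUMED here to return the last block in the slice.
func (c *Chain) Head() *Block { return c.blocks[len(c.blocks)-1] }

// appendNext mirrors the "update test suite chain" step in the diff:
// it appends the next block of full to chain and returns its index.
func appendNext(chain, full *Chain) int {
	nextBlock := len(chain.blocks)
	chain.blocks = append(chain.blocks, full.blocks[nextBlock])
	return nextBlock
}

func main() {
	full := &Chain{blocks: []*Block{{Number: 0}, {Number: 1}, {Number: 2}}}
	// The suite's view starts one block behind the full chain.
	chain := &Chain{blocks: []*Block{full.blocks[0], full.blocks[1]}}

	nextBlock := appendNext(chain, full)

	// After the append, Head() and full.blocks[nextBlock] are the same pointer.
	fmt.Println(chain.Head() == full.blocks[nextBlock])
}
```

If that assumption holds in the real code, the two forms would be equivalent at this point in the function, and the choice would mostly be one of readability.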

@holiman
Contributor

holiman commented Apr 29, 2021

I think you also need to apply this: https://github.com/renaynay/go-ethereum/blob/b5eb4d1195849a8ca75fbdeb5976fe7234d0e142/cmd/devp2p/internal/ethtest/types.go#L345 here:

Or, well, maybe it doesn't matter, since we have request ids. And it won't make it less flaky, so feel free to ignore it

@holiman
Contributor

holiman commented Apr 29, 2021

Ah, so the error is in the status exchange, where the 'other side' has a lower block than we expected. Isn't it because the previous broadcast test already progressed 'us' one block?
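
The hypothesis above can be pictured with a toy model of state shared between tests. Everything here is hypothetical (`node`, `suite`, `broadcast`, `checkStatus`) and none of it is taken from the ethtest code. It only illustrates how a test that advances the suite's expected head, without the node having imported the block yet, would make the next status check fail.

```go
package main

import "fmt"

// node is a toy stand-in for the client under test.
type node struct{ head int }

// suite is a toy stand-in for the test suite; expectedHead is state
// shared by every test that runs against the same suite.
type suite struct{ expectedHead int }

// broadcast models a test that advances the suite's view by one block.
// imported says whether the node finished importing that block before
// the test returned.
func (s *suite) broadcast(n *node, imported bool) {
	s.expectedHead++
	if imported {
		n.head = s.expectedHead
	}
}

// checkStatus models the status exchange at the start of the next test.
func (s *suite) checkStatus(n *node) error {
	if n.head != s.expectedHead {
		return fmt.Errorf("wrong head block in status, want block %d, have block %d",
			s.expectedHead, n.head)
	}
	return nil
}

func main() {
	n := &node{head: 999}
	s := &suite{expectedHead: 999}

	// The broadcast test returns before the node has imported the block.
	s.broadcast(n, false)

	// The next test's status exchange now sees a node that is one block behind.
	fmt.Println(s.checkStatus(n))
}
```

In this model the failure goes away either when the earlier test waits for the import to finish or when the later test tolerates a node that is one block behind. Which of these, if either, applies to the real suite is not settled in this thread.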

@holiman holiman added this to the 1.10.3 milestone Apr 30, 2021
@holiman holiman merged commit 8ff9810 into ethereum:master Apr 30, 2021
@renaynay renaynay deleted the ethtest-fix-flakeyOldAnnounce branch May 1, 2021 10:37
atif-konasl pushed a commit to frozeman/pandora-execution-engine that referenced this pull request Oct 15, 2021