gh-130474: Create and implement flaky resource #130489


Closed
wants to merge 17 commits

Conversation


@floor-licker floor-licker commented Feb 23, 2025

I've created the resource and implemented it in some obvious places where comments pointed to flaky tests. There were certain tests I left alone: although they were marked flaky by inline comments, other decorators already address the flakiness. For example, certain tests that were marked as flaky only when running without the GIL are already marked with the @support.requires_gil_enabled decorator, which I think makes more sense.
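(For context, the sketch below shows roughly how a GIL-gated skip decorator like test.support.requires_gil_enabled behaves. It is a simplified, self-contained approximation, not CPython's actual implementation.)

```python
import sys
import unittest

# Simplified sketch (not CPython's real code) of a decorator like
# test.support.requires_gil_enabled: skip the test on free-threaded
# (GIL-disabled) builds, run it unchanged everywhere else.
def requires_gil_enabled(func):
    # sys._is_gil_enabled() exists on Python 3.13+; older builds
    # always have the GIL, so default to True there.
    gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    if gil_enabled:
        return func
    return unittest.skip("test requires the GIL to be enabled")(func)

@requires_gil_enabled
def test_needs_gil():
    return "ran"
```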

As with other resources, the use-case for decorating tests is just:

from test.support import requires_resource

@requires_resource('flaky')
def test_flaky_test():
    ...

With this set-up, CI environments can be configured to run flaky tests separately and to automatically retry failed flaky tests:

python -m test --use=flaky --rerun

@floor-licker floor-licker requested review from gpshead and a team as code owners February 23, 2025 21:03
@ghost

ghost commented Feb 23, 2025

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-app

bedevere-app bot commented Feb 23, 2025

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@tomasr8 tomasr8 self-requested a review February 23, 2025 21:31
@picnixz picnixz self-requested a review February 23, 2025 22:16

@picnixz picnixz left a comment


Some tests are not necessarily flaky. You should also update the command-line help of libregrtest (python -m test --help).

@picnixz

picnixz commented Feb 23, 2025

Oh, the docs are failing. I thought that we had documented it but no. My bad.


@picnixz picnixz left a comment


Ok, so I'm not sure we should skip testSend for now, because while it's marked as flaky, I've never seen it fail before.

OTOH, test_stress_modifying_handlers is only flaky on the free-threaded build, so, if possible, we should not mark it as flaky at all (on the regular build it would not be flaky at all!). Sorry for missing the point before, but in my defense it's 1 AM here!

As for test_selflink, it's the only test that is actually really flaky and skipped (it's usually not flaky on an idle system, but it is flaky on a stressed system). I would leave marking flaky tests to a separate PR (except for test_selflink).

@picnixz picnixz removed request for a team and gpshead February 24, 2025 01:43
@AA-Turner AA-Turner requested a review from vstinner February 24, 2025 01:46
@AA-Turner

Looks ok, but Victor is probably the best one to review changes to libregrtest.

A

@rossburton

As for test_selflink, it's the only test that is actually really flaky and skipped (it's usually not flaky on an idle system, but it is flaky on a stressed system). I would leave marking flaky tests to a separate PR (except for test_selflink).

Reading the linked issue in the comment for test_selflink, this doesn't appear to be an idle-vs-stressed issue at all (and it's potentially fixed by other improvements).

@picnixz

picnixz commented Feb 24, 2025

It is, though? At least according to my reading of #109959 (comment). Anyway, the thing is that it's still flaky, as it randomly fails.


@vstinner vstinner left a comment


I dislike this change. If a test is known to be flaky, it should be either fixed or removed.

I suggest to remove test_selflink() in test_glob.

@vstinner

I suggest to remove test_selflink() in test_glob.

I created #130551 to remove the flaky test.


@vstinner vstinner left a comment


I don't think that this "flaky" resource is a good idea.

I created #130551 to remove the flaky test.

I merged my PR.

@bedevere-app

bedevere-app bot commented Feb 27, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests, along with any other requests in other reviews from core developers, that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@picnixz

picnixz commented Mar 2, 2025

In light of #130474 (comment), I think it was a bad idea (my bad idea). At first it seemed great, but hiding a flaky test behind a decorator wouldn't help solve the problem; we could even forget about it!

So I'll close the issue and this PR, as the tests I thought were flaky are being fixed (most of the failures happen on the free-threaded build and seem to be real issues; namely, we should have written the tests differently). Thanks for the work though, and sorry again!

@picnixz picnixz closed this Mar 2, 2025