how to express that a large number of tests are expected to fail? #1832
For now I'm working around this by maintaining an 'expected fail' status file (https://github.com/devoncarew/wasmd/blob/main/tool/tests.properties) and wrapping the generated tests to consult it.
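Roughly, that kind of wrapper might look like the sketch below; the helper names and the exact file format here are illustrative, not the actual wasmd code.

```dart
import 'dart:io';

import 'package:test/test.dart';

/// Reads a `tests.properties`-style status file and returns the names of the
/// tests currently expected to fail. (Illustrative format: one `name=fail`
/// entry per line; the real file may differ.)
Set<String> loadExpectedFailures(String path) => File(path)
    .readAsLinesSync()
    .where((line) => line.contains('=') && !line.startsWith('#'))
    .where((line) => line.split('=')[1].trim() == 'fail')
    .map((line) => line.split('=')[0].trim())
    .toSet();

/// Registers a generated test, inverting the expectation when the test is
/// listed in the status file: an expected failure that passes fails the run.
void registerGeneratedTest(
    String name, Set<String> expectedFailures, Future<void> Function() body) {
  test(name, () async {
    if (expectedFailures.contains(name)) {
      await expectLater(body(), throwsA(anything),
          reason: '$name passed but is marked as failing in the status file');
    } else {
      await body();
    }
  });
}
```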
Today we only really support skipping tests, either by the `skip` parameter or the `@Skip` annotation. I do think there is a missing feature here that actually expects a failure; I remember some discussion about it in the past, it just hasn't been prioritized (or requested).
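For context, the two existing mechanisms look roughly like this (the test names and reasons are made up):

```dart
// The whole-suite form is an annotation on the library:
//
//   @Skip('suite not ready yet')
//   library;
//
// The per-test/per-group form is the `skip` parameter, which takes either a
// bool or a reason string:
import 'package:test/test.dart';

void main() {
  test('parses the spec fixture', () {
    // ...
  }, skip: 'blocked on unimplemented opcode support');

  group('windows-specific behavior', () {
    // ...
  }, skip: true);
}
```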
Fwiw I personally am not a fan of the status file or DB approach at all - it separates the status of the test too far from the test itself. But I do think a general feature that requires certain tests to fail is a good idea; I just like the configuration to live with the actual test, similar to how `skip` does today.

One idea could just be a flag to run skipped tests, but expect them all to fail. If any pass, that is treated as a failure. This could be a command line flag, and so we could avoid increasing the API surface area. We could also add it as an option to `dart_test.yaml` so that it would always run all skipped tests and expect them to fail? The immediate issue I see with that would be flaky tests, but we could allow the configuration to be per preset, so you could add a preset that leaves the flaky ones skipped.

It might still be better to just add a more explicit expected-failure option, but I think probably >90% of the current skipped tests are meant to fail.
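To make the preset idea concrete, a hypothetical `dart_test.yaml` might look like the sketch below; `presets` and `exclude_tags` exist today, but the two `*_skipped` keys are invented purely for illustration.

```yaml
# Hypothetical sketch - the two *_skipped keys below do not exist today.
presets:
  expect-skips-to-fail:
    run_skipped: true              # invented: actually run tests marked skip
    skipped_tests_must_fail: true  # invented: a skipped test that passes fails the run
    exclude_tags: flaky            # real option: keep known-flaky tests out of this mode
```

CI would then opt in with something like `dart test -P expect-skips-to-fail`, while local runs keep the normal skip behavior.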
That makes sense. In my case, the tests are generated, so I would need to keep the 'expected fail' status in a sidecar file in any case. But my use case is probably rare so we wouldn't need to optimize for it.
I like this - as you say, it doesn't increase the API surface area. It does mean that there are a few different concerns layered into the `skip` parameter.
The skip parameters are typed dynamic; I think it's just the annotation that only accepts a string, due to limitations around our understanding of annotations. But we could probably support a specific list of constants, or as you say have some specialized versions of the annotation.
Fwiw, I think that for the expected-failure case, skip really isn't the right word (you want the test to run, just fail). So maybe it is worth adding some API surface area here instead? We also surface the number of skipped tests in a yellow color, indicating basically a warning status. But expected failures should probably just be green (passing).
It does seem useful to be able to declare that a test is expected to fail, and then fail the suite if the test passes instead. (I thought I recalled a previous issue from @jakemac53 discussing this, but now I can't find it.) It seems like this is something that would be better built as a new tool on top of the test runner, but we don't yet have any of the utilities you'd need to build it.
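As one illustration of what such an external tool could look like (this is a sketch of one possible approach, not a design anyone has signed off on), a script could drive `dart test --reporter json`, which already streams machine-readable events, and diff the failures against an expected-failures list:

```dart
// Sketch only: field names follow the package:test JSON reporter protocol
// (testStart / testDone events); expected_failures.txt is a hypothetical
// input file with one expected-to-fail test name per line.
import 'dart:convert';
import 'dart:io';

Future<void> main() async {
  final expected = File('expected_failures.txt').readAsLinesSync().toSet();

  final proc = await Process.start('dart', ['test', '--reporter', 'json']);
  final names = <int, String>{}; // testID -> test name
  var unexpected = 0;

  await proc.stdout
      .transform(utf8.decoder)
      .transform(const LineSplitter())
      .forEach((line) {
    final event = jsonDecode(line) as Map<String, dynamic>;
    if (event['type'] == 'testStart') {
      final test = event['test'] as Map<String, dynamic>;
      names[test['id'] as int] = test['name'] as String;
    } else if (event['type'] == 'testDone' &&
        event['hidden'] != true &&
        event['skipped'] != true) {
      final name = names[event['testID']] ?? '<unknown>';
      final failed = event['result'] != 'success';
      if (failed != expected.contains(name)) {
        unexpected++;
        stderr.writeln('unexpected ${failed ? 'failure' : 'pass'}: $name');
      }
    }
  });

  await proc.exitCode;
  exitCode = unexpected == 0 ? 0 : 1;
}
```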
I agree that at least some support for matching a reason for the failure would be ideal. But it is still better to assert some sort of failure rather than skipping entirely, which is the only current option. At least you will know if it starts passing :).
I'm currently working on a project where most of the tests are code-gen'd from a specification. Many of the tests are currently failing: of ~7,590 tests, 545 are failing; as I fix things, the pass percentage increases.
Because some tests currently fail, I'm not running tests on my CI (otherwise it's just a continuous red signal). This does mean that unfortunately I have effectively no regression coverage - I don't know when I break tests that had previously been passing.
One solution other systems w/ lots of tests have landed on is the notion of status files - encoding expected failures in a data file. The CI would then pass when the test results matched what was described in the status file.
Have you thought about how to support this use case - having a setup with lots of tests, where you can't reasonably expect them all to pass (but you still want CI + test coverage)? There are a few possible solutions I've thought of:
**encode 'failure expected' in the test definition**

This would be having a new `bool fails` flag on the `test()` method. This would invert the normal test expectation. This technique would avoid the complexity of a new file format + aux. file. It has the drawback that people would need to modify the test definition (for me, in ~500 different source locations). It also wouldn't be able to express things like 'only expected to fail on windows'.

**parameterize `test()`**

This would be an `expectedStatus(String testId)` closure param for `test()`.
**support a status file**

This is defining a status file format for `package:test` (or re-using `dart_test.yaml`?). You'd minimally want to be able to express the `n` tests that were expected to fail, and may want to push it as far as allowing different sections in the file (using `package:boolean_selector`?), each with separate test expectations for different platforms.
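For what it's worth, the first option can be approximated in user space today with a small wrapper; `failingTest` below is a hypothetical helper written for illustration, not proposed `package:test` API.

```dart
import 'package:test/test.dart';

/// Hypothetical stand-in for a built-in `fails:` flag on `test()`: the body is
/// required to throw. If it completes normally, the test fails, surfacing the
/// unexpected pass so the test can be promoted back to a plain `test()`.
void failingTest(String name, dynamic Function() body, {String? reason}) {
  test(name, () async {
    try {
      await Future.sync(body);
    } catch (_) {
      return; // failed as expected
    }
    fail('"$name" was expected to fail'
        '${reason == null ? '' : ' ($reason)'} but passed');
  });
}

void main() {
  // Made-up generated test, wrong on purpose to show the inverted expectation.
  failingTest('spec case: unimplemented opcode', () {
    expect(1 + 1, equals(3));
  }, reason: 'codegen for this opcode not implemented yet');
}
```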