Skip to content

Make coverage collection more resilient to test failure #26324

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jab
Copy link

@jab jab commented Jun 17, 2025

Without these changes, bazel coverage foo does not collect coverage for any of the tests in the foo target if even a single test in that target fails. A failing test in foo should not defeat coverage collection for all the tests in that target.

This is easily fixed by deferring the exit $TEST_STATUS call to run after $LCOV_MERGER_CMD is run.

@jab jab requested a review from lberki as a code owner June 17, 2025 20:04
@github-actions github-actions bot added the awaiting-review PR is awaiting review from an assigned reviewer label Jun 17, 2025
@iancha1992 iancha1992 added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Jun 17, 2025
@c-mita c-mita self-assigned this Jun 18, 2025
@c-mita
Copy link
Member

c-mita commented Jun 18, 2025

I don't think this would be a good idea, at least not without some more thought. Something similar was tried internally a while back and had to be reverted.

There are a couple of issues that come to mind:

  • This would record the lines covered by a failing test which is arguably a mistake; a failing test indicates the code is "doing the wrong thing" so that coverage record should probably not be included in the report.
  • What happens with flaky tests that are automatically re-tried? Especially when the code-paths differ between runs?

bazel coverage foo does not collect coverage for any of the tests in the foo target if even a single test in that target fails.

Annoying though it may be, this is "Working as intended".

@jab
Copy link
Author

jab commented Jun 19, 2025

Some more context, fwiw: This change was requested by the developers at my firm years ago, and they have been using it in our Bazel fork happily ever since. I was only just now able to upstream the patch.

This patch is intended to improve the local development experience. For example, when you are refactoring a class, while running something like ibazel coverage <target>, this patch enables you to actually see coverage data as you're making progress, which can help make it more obvious what you should work on next. Without this patch, you have to either get every test passing first before you can see any coverage data, or you have to temporarily comment out the failing tests in <target>. Neither option is a good development experience.

Note also that with this patch, in the case that any test fails, the runner still tells you this in the output, and critically, Bazel still exits nonzero. So this should not affect anything that actually checks for the success of the bazel coverage command that was run. For example, a CI job could/should check the exit status before uploading the resulting coverage data and presenting it as though it corresponded to a successful run.

I think the above mitigates the following concern:

This would record the lines covered by a failing test which is arguably a mistake; a failing test indicates the code is "doing the wrong thing" so that coverage record should probably not be included in the report.

As well as this one:

What happens with flaky tests that are automatically re-tried? Especially when the code-paths differ between runs?

During local development, the developer sees which tests failed, and can treat the resulting coverage data appropriately. This is especially useful when using Bazel with a coverage tool that shows you which tests covered which lines. Without this patch, the coverage data isn't even collected, so the developer doesn't even have the chance to decide. Outside local development, e.g. in CI, the failure of a test means the corresponding patch should not be merged, so the coverage data should be ignored in the first place.

I don't think this would be a good idea, at least not without some more thought. Something similar was tried internally a while back and had to be reverted.

I would be curious to hear more of that story (I don't think it overlapped with my time at Google), and appreciate your openness to giving this more thought. Hope the context I gave above is helpful.

`bazel coverage foo` runs the tests in target `foo` and collects
resulting code coverage. Note there may be arbitrarily many tests
defined in target `foo`.

Without the changes in this commit, if only one of the tests in `foo`
fails, it defeats coverage collection for all other tests in that
target.

This is fixed by deferring the `exit $TEST_STATUS` call to run after
`$LCOV_MERGER_CMD` is run.
@jab jab force-pushed the fix-coverage-early-exit branch from 42f6e87 to 7333bef Compare June 19, 2025 21:11
@fmeum
Copy link
Collaborator

fmeum commented Jun 20, 2025

I see how this can be useful, but we need to be careful with the combined report (I believe that's what @c-mita is worried about). The report should not merge on coverage from failing tests. Since it's generated only after all tests have completed, this should be doable though.

@meteorcloudy meteorcloudy added coverage team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts and removed team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website labels Jul 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer coverage team-Rules-API API for writing rules/aspects: providers, runfiles, actions, artifacts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants