Skip to content

Presubmit testing failed to catch basic clang lit test breakages #71532

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
zmodem opened this issue Nov 7, 2023 · 16 comments
Open

Presubmit testing failed to catch basic clang lit test breakages #71532

zmodem opened this issue Nov 7, 2023 · 16 comments
Assignees
Labels
infrastructure Bugs about LLVM infrastructure

Comments

@zmodem
Copy link
Collaborator

zmodem commented Nov 7, 2023

(Breaking this discussion out of #71322)

ace4489 broke a bunch of Clang lit tests (e.g. https://lab.llvm.org/buildbot/#/builders/188/builds/37563).

The breakages did not require any exotic platform, operating system, or build configuration -- it broke everywhere.

The patch did go through a pull request, so presubmit testing should have caught it.

How come it wasn't caught? What can we do to fix it?

@metaflow @tstellar @joker-eph

@zmodem zmodem added the infrastructure Bugs about LLVM infrastructure label Nov 7, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 7, 2023

@llvm/issue-subscribers-infrastructure

Author: Hans (zmodem)

(Breaking this discussion out of https://github.com//pull/71322)

ace4489 broke a bunch of Clang lit tests (e.g. https://lab.llvm.org/buildbot/#/builders/188/builds/37563).

The breakages did not require any exotic platform, operating system, or build configuration -- it broke everywhere.

The patch did go through a pull request, so presubmit testing should have caught it.

How come it wasn't caught? What can we do to fix it?

@metaflow @tstellar @joker-eph

@joker-eph
Copy link
Collaborator

This PR touches clang, but the change-detection logic seems to have considered that check-clang wasn't in scope somehow. I don't know why:

./.ci/monolithic-linux.sh "clang;clang-tools-extra;libc;llvm" "check-clang-tools check-libc"

It only triggers clang-tools and libc tests here.

@metaflow metaflow self-assigned this Nov 7, 2023
@metaflow
Copy link
Contributor

metaflow commented Nov 8, 2023

I will try to look into this tomorrow.

@metaflow
Copy link
Contributor

metaflow commented Nov 9, 2023

@joker-eph yes, "clang" is not included in the project list as clang/ modifications trigger a separate pipeline "clang-ci" that does some additional checks with libcxx (probably @ldionne or @mstorsjo know better).

And this pipeline was triggered and failed for th PR
image

Link to buildkite: https://buildkite.com/llvm-project/github-pull-requests/builds/12931

So for me it seems that CI worked fine and author simply ignored the test results.

@zmodem
Copy link
Collaborator Author

zmodem commented Nov 9, 2023

And this pipeline was triggered and failed for th PR
Link to buildkite: https://buildkite.com/llvm-project/github-pull-requests/builds/12931
So for me it seems that CI worked fine and author simply ignored the test results.

That run (I looked at the Linux one) only flagged 3 unrelated clang-tidy test failures. It didn't flag the 50 failing clang tests, though, which is the issue.

@joker-eph
Copy link
Collaborator

That buildkite run is exactly the one I copy/pasted the command above:

./.ci/monolithic-linux.sh "clang;clang-tools-extra;libc;llvm" "check-clang-tools check-libc"

Where is check-clang supposed to happen? I wouldn't expect it to happen elsewhere, especially since we're already building most of it here anyway to be able to test LLVM and clang-tools...

@metaflow
Copy link
Contributor

metaflow commented Nov 9, 2023 via email

@metaflow
Copy link
Contributor

metaflow commented Nov 9, 2023

(I agree that it might have been presented better and aggregated in some way)

@ldionne
Copy link
Member

ldionne commented Nov 9, 2023

I agree with @metaflow, it looks like the CI results were simply ignored. @metaflow I would not expect the monolithic-linux and monolithic-windows tests to trigger at all though, do you know why they were triggered? It seems like this caused part of the confusion here. In other words, I would have expected that only the clang-ci BuildKite pipeline gets triggered, which would have made it the only source of failures. And then if you ignore that, it's unambiguously your fault.

@metaflow
Copy link
Contributor

metaflow commented Nov 9, 2023

there were triggered as clang-tools-extra was touched

@metaflow
Copy link
Contributor

metaflow commented Nov 9, 2023

image

@ldionne
Copy link
Member

ldionne commented Nov 9, 2023

Ahh, I see. One thing we could do here is potentially trigger the appropriate jobs directly in this pipeline instead of triggering the clang-ci pipeline indirectly. I think this was necessary to make it work on both GitHub PRs and Phabricator, but if we are willing to drop Phabricator support now, I think we might be able to remove one level of indirection. That way, the following failures would be on the same page as the other ones:
Screenshot 2023-11-09 at 08 53 19

@metaflow
Copy link
Contributor

I like the idea of removing this indirection. One thing to think about is how to organize "dependent builds" - right now some of checks for libcxx will not start if basic linux / windows have not passed. We can either make them dependent on all "first layer" checks, or specific steps.

@zmodem
Copy link
Collaborator Author

zmodem commented Nov 13, 2023

Have you checked clang ci pipeline results?
https://buildkite.com/llvm-project/clang-ci/builds/6211#018ba4b7-5c43-4209-a282-e7cbbaabc074
there are 47 failures
(I agree that it might have been presented better and aggregated in some way)

Given that the author of #71322, Aaron, Mehdi, and I all failed to spot these failures, yes that strongly suggests the presubmit testing UI could be better :-)

I like the idea of removing this indirection. One thing to think about is how to organize "dependent builds" - right now some of checks for libcxx will not start if basic linux / windows have not passed. We can either make them dependent on all "first layer" checks, or specific steps.

I don't know anything about this, but if you want to "gate" some tests on others, couldn't you just do ninja check-clang check-llvm && ninja check-cxx?

@metaflow
Copy link
Contributor

Agree. I think we should change the pipeline. I think that that will be easier to do after we drop support for phab workflow. My plan is to do that the next week (20 November) https://discourse.llvm.org/t/stop-pre-merge-checks-for-phabricator/74871

@ldionne
Copy link
Member

ldionne commented Nov 15, 2023

Actually, in light of the recent progress made by @EricWF with Github Actions, I think we shouldn't change anything about the BuildKite indirections here. We are going to migrate our pre-commit CI checks to Github Actions, which should resolve this issue entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure Bugs about LLVM infrastructure
Projects
None yet
Development

No branches or pull requests

5 participants