-
Notifications
You must be signed in to change notification settings - Fork 108
Support marking unstable jobs from other repos #5122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
clee2000
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preview for pytorch/pytorch hud page has a bunch of builds in the unstable category but I can't find unstable issues for them
Oh, @ZainRizvi explicitly added them into https://github.com/pytorch/pytorch/blob/main/.github/workflows/unstable.yml as part of ARC rollout in pytorch/pytorch#124753. So, they should be there. That's the old way to mark a job as unstable, it's working in the context of ARC rollout. |
|
I can't remember if this is a thing, but do unstable jobs get grouped together on PR/commit pages? |
Good point! They should be listed under their own section, i.e. https://torchci-git-support-unstable-job-other-repos-fbopensource.vercel.app/pytorch/pytorch/commit/977105466fac8caec5f3b8ae9eaf3b977f59fbfb. It looks like I miss a thing or two here. At least, there is no regression of existing functionality from what I see |
|
https://torchci-git-support-unstable-job-other-repos-fbopensource.vercel.app/pytorch/executorch/commit/66a350b1bbf7a199b33778be17fef6847ffc37c0 is fixed now and correctly shows the failed unstable jobs section. The miss is that I forgot to pass the list of unstable issues to |
This is the next part of #5122 where Dr.CI also uses the list of unstable issues to check if a job is an unstable job. The same function `isUnstableJob` that powers HUD from #5122 is used. ### Testing Mark a job on ET as unstable pytorch/executorch#3344 as shown on HUD https://hud.pytorch.org/hud/pytorch/executorch/main and Dr.CI applies the same logic on a PR pytorch/executorch#3318 to show the job as unstable. ``` curl --request POST \ --url "http://localhost:3000/api/drci/drci?prNumber=3318" \ --header "Authorization: TOKEN" \ --data 'repo=executorch' ``` <!-- drci-comment-start --> ## 🔗 Helpful Links ### 🧪 See artifacts and rendered test results at [hud.pytorch.org/pr/pytorch/executorch/3318](https://hud.pytorch.org/pr/pytorch/executorch/3318) * 📄 Preview [Python docs built from this PR](https://docs-preview.pytorch.org/pytorch/executorch/3318/index.html) Note: Links to docs will display an error until the docs builds have been completed. ## ✅ You can merge normally! (1 Unrelated Failure) As of commit f712e381c161901b733baa6b1fe7d85dc25404d3 with merge base b669056c1cff5f7fe3786df9e68a14447cd5410b (<sub><sub><img alt="image" width=70 src="https://img.shields.io/date/1713981882?label=&color=FFFFFF&style=flat-square"></sub></sub>): <details ><summary><b>UNSTABLE</b> - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:</summary><p> * [Android / test-llama-app / mobile-job (android)](https://hud.pytorch.org/pr/pytorch/executorch/3318#24225451316) ([gh](https://github.com/pytorch/executorch/actions/runs/8821313480/job/24225451316)) `Credentials could not be loaded, please check your action inputs: Could not load credentials from any providers` </p></details> This comment was automatically generated by Dr. CI and updates every 15 minutes. <!-- drci-comment-end -->
…R pages (#5343) This is a miss from #5122 where I failed to check that the repo field in JobData is optional and wasn't set in HUD commit and PR pages. So, marking an unstable job link works correctly in main HUD page like https://hud.pytorch.org/hud/pytorch/executorch/main where the data comes from `fetchHud`, but it uses the default link pointing to `pytorch/pytorch` in: * Commit page https://hud.pytorch.org/pytorch/executorch/commit/7c0f4c2b6f87369ee20544aada33dc55078c6bc6 * and PR page https://hud.pytorch.org/pr/pytorch/executorch/3972 The fix here is to populate the repo field if it's not there ### Testing Verify this locally (need to login to HUD) * http://localhost:3000/pytorch/executorch/commit/7c0f4c2b6f87369ee20544aada33dc55078c6bc6 * http://localhost:3000/pr/pytorch/executorch/3972
Fixes pytorch/executorch#1759
Before this, a job can only be seen as unstable when it has the unstable keyword in its name. The trick of renaming the job can be done with PyTorch test jobs by manipulating the test matrix. However, that doesn't work for PyTorch build job or Nova workflow without moving the filter script to a separate job to run before them. That approach doesn't scale.
So, this PR implements a different approach to query the list of open unstable issues from Rockset from HUD, then group all jobs with an open unstable issues to the unstable group.
The PR is big because the following top level pages need to query the list and pass it around to different React components. The main update is in the function
isUnstableJobwhich callshasOpenUnstableIssue. The important call paths are as follows:getGroupingData→classifyGroup→hasOpenUnstableIssueCommitStatus→FilteredJobList→FailedJobInfo→JobSummary→isUnstableJobCommitStatus→WorkflowsContainer->WorkflowBox->WorkflowJobSummary->JobSummary→isUnstableJobThere are some misc pages like HUD flaky test pages that also needs to be updated, but they don't need to have the list of unstable issues, so I will just pass an empty array there.
The second part of this change is on Dr.CI.
Testing
trunk / test-coreml-delegate / macos-jobfrom ET is marked as unstable by pytorch/executorch#3264, and it seems to show up correctly https://torchci-git-support-unstable-job-other-repos-fbopensource.vercel.app/hud/pytorch/executorch/main.An UX caveat is that all the invocations of the job will appear in the unstable group because there is no
unstablekeyword to distinguish them.