Wait for docker build #6013
Closing in favor of pytorch/pytorch#142109
After chatting with @malfet, let's try this one instead, because pytorch/pytorch#142109 (review) adds a few more minutes to the workflow TTS.
LGTM, thank you for the fix
Hi @huydhn, I noticed some failures in the calculate-docker-image step of the xpu CI test jobs, for example https://github.com/pytorch/pytorch/actions/runs/12198235184/job/34036392093?pr=140664#step:6:160. I suspect these failures are related to the changes in this PR. Could you please double-check?
@chuanqi129 Thank you for the fix in pytorch/pytorch#142298! It's the correct fix. The failure you see actually highlights a problem that was hidden before. Unless a new Docker image is added to the docker build workflow, the image is rebuilt in every build and test job that depends on it, which is a huge waste of time. Let me take an action item to write a linter check for this, to make sure that adding a new Docker image requires a corresponding update to the docker build workflow.
Add the missing new xpu docker image name to adapt to the new mechanism introduced by pytorch/test-infra#6013. Works for #114850. Pull Request resolved: #142298. Approved by: https://github.com/huydhn
Some lint jobs use the default 30-minute timeout, but jobs can now wait up to 90 minutes for the Docker image to become available after pytorch/test-infra#6013. Pull Request resolved: #142444. Approved by: https://github.com/wdvr
This is a short-term mitigation for pytorch/pytorch#141885, in which any change touching `.ci/docker` would cause all the builds to fail until the docker build workflow finishes building the images.

At the moment, we don't have a good way to tell the build workflow to wait for the new docker image, so my fix here attempts to inject a delay when the action is called by `_linux_build`. It will wait up to 90 minutes for the Docker build to finish.

Testing
pytorch/pytorch#142177
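
For illustration only, the "wait up to 90 minutes" behavior described in this PR can be sketched as a bounded polling loop. This is a minimal sketch and not the code in the action itself: `wait_for_image`, the `image_exists` callback, and the 60-second polling interval are all hypothetical names and values chosen for the example. Only the 90-minute ceiling comes from the description above.

```python
import time
from typing import Callable


def wait_for_image(
    image_exists: Callable[[], bool],  # hypothetical probe: True once the image is available
    timeout_s: float = 90 * 60,        # 90-minute ceiling, as stated in the PR description
    poll_interval_s: float = 60,       # assumed polling interval, not taken from the PR
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the image exists or the timeout elapses.

    Returns True if the image became available in time, False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if image_exists():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, remaining))
```

A caller that gets `False` back would then decide what to do next, for example failing the job or building the image locally. Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting.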