Upgrade nightly wheels to ROCm5.3 #6955

Merged

jithunnair-amd
Contributor

@jithunnair-amd jithunnair-amd commented Nov 16, 2022

@jithunnair-amd jithunnair-amd marked this pull request as ready for review December 1, 2022 22:42
Contributor

@datumbox datumbox left a comment

The changes look good to me and the tests pass. Nevertheless I observed that some of the jobs such as Build Linux Wheels / pytorch/vision / manywheel-py3_10-rocm5_1_1 (pull_request) still run for 5.1.1 which means there are more places that need to be modified. I think this is related to @osalpekar's work of migrating from CircleCI to GA.

I've added the RelEng team to confirm.

@jithunnair-amd
Contributor Author

> The changes look good to me and the tests pass. Nevertheless I observed that some of the jobs such as Build Linux Wheels / pytorch/vision / manywheel-py3_10-rocm5_1_1 (pull_request) still run for 5.1.1 which means there are more places that need to be modified. I think this is related to @osalpekar's work of migrating from CircleCI to GA.
>
> I've added the RelEng team to confirm.

@datumbox Yes, I noticed that too. I filed a PR pytorch/test-infra#1219 for upgrading to ROCm5.3. cc @osalpekar

@datumbox
Contributor

datumbox commented Dec 2, 2022

@jithunnair-amd Thanks for confirming!

@osalpekar Do you advise merging this PR or waiting for the other fix? My understanding is that it can be merged, but I could be wrong. Thanks!

@osalpekar
Member

Thanks @jithunnair-amd and @datumbox. We should be good to merge both in any order! Note that the test-infra PR will be the one that actually changes the ROCm versions in the nightly wheels, since those are built via GHA. The remaining py3.7 CircleCI jobs that are changed by this PR are just for the docs build.

osalpekar pushed a commit to pytorch/test-infra that referenced this pull request Dec 2, 2022
@jithunnair-amd
Contributor Author

@datumbox The test-infra PR is merged. Is this PR good to go now?

Contributor

@datumbox datumbox left a comment

LGTM, thanks!

@datumbox datumbox merged commit c093b9c into pytorch:main Dec 5, 2022
@github-actions

github-actions bot commented Dec 5, 2022

Hey @datumbox!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@jithunnair-amd
Contributor Author

@datumbox @osalpekar Although the PR pytorch/test-infra#1219 was merged on Dec 2, I see that rocm5.3 wheels haven't started being published yet for torchvision (https://download.pytorch.org/whl/nightly/rocm5.3/torchvision/). Even the latest rocm5.2 wheels for torchvision only go up to Nov 21 (https://download.pytorch.org/whl/nightly/rocm5.3/torchvision/). Would you know why?

@datumbox
Contributor

datumbox commented Dec 5, 2022

@atalman Any thoughts?

@osalpekar
Member

@jithunnair-amd I see the Dec 5 rocm5.3 wheels at that page. They're closer to the top of the page for some reason (the ordering of the jobs on the page is a little strange). Here's the name and link for reference:

[torchvision-0.15.0.dev20221205+cpu-cp310-cp310-linux_x86_64.whl](https://download.pytorch.org/whl/nightly/rocm5.3/torchvision-0.15.0.dev20221205%2Bcpu-cp310-cp310-linux_x86_64.whl)

And successful build job: https://github.com/pytorch/vision/actions/runs/3620045031/jobs/6101821737.

I see the Dec 5 rocm5.2 wheels on the nightly page as well (this time near the middle of the page): https://download.pytorch.org/whl/nightly/rocm5.2/torchvision/
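
Because the index page is not ordered by date, the newest nightly is easier to find by reading the `.devYYYYMMDD` stamp out of each wheel filename than by scanning the page. The following is a minimal sketch under that assumption; the helper names `nightly_date` and `newest_nightly` are hypothetical, and the list of filenames is assumed to have been collected from the index page by some other means.

```python
import re
from datetime import date, datetime
from typing import Iterable, Optional

# Nightly versions carry a ".devYYYYMMDD" segment, e.g. "0.15.0.dev20221205".
_DEV_DATE = re.compile(r"\.dev(\d{8})")


def nightly_date(wheel_name: str) -> Optional[date]:
    """Return the nightly build date encoded in a wheel filename, if any."""
    match = _DEV_DATE.search(wheel_name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def newest_nightly(wheel_names: Iterable[str]) -> Optional[str]:
    """Return the filename with the latest nightly date, ignoring page order."""
    dated = [(nightly_date(name), name) for name in wheel_names]
    dated = [(d, name) for d, name in dated if d is not None]
    if not dated:
        return None
    return max(dated)[1]


names = [
    # Hypothetical older entry, used only to illustrate the ordering.
    "torchvision-0.15.0.dev20221121+rocm5.2-cp310-cp310-linux_x86_64.whl",
    # Filename quoted in this thread.
    "torchvision-0.15.0.dev20221205+cpu-cp310-cp310-linux_x86_64.whl",
]
print(newest_nightly(names))
```

Sorting on the parsed date rather than on the raw string keeps the result independent of whatever suffix follows the version.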

@malfet
Contributor

malfet commented Dec 5, 2022

> [torchvision-0.15.0.dev20221205+cpu-cp310-cp310-linux_x86_64.whl](https://download.pytorch.org/whl/nightly/rocm5.3/torchvision-0.15.0.dev20221205%2Bcpu-cp310-cp310-linux_x86_64.whl)

@osalpekar perhaps I'm reading the package suffix incorrectly, but this looks like a CPU build rather than a ROCm-accelerated one, doesn't it?
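
The suffix in question is the local version label, the part of the version after `+` (URL-encoded as `%2B` in the link). A minimal sketch of checking it programmatically, using only the standard library, might look like the following. The helper name `local_version_label` is hypothetical, and the expected label `rocm5.3` is an assumption about what a correctly built wheel would carry, not something confirmed in this thread.

```python
from typing import Optional
from urllib.parse import unquote


def local_version_label(wheel: str) -> Optional[str]:
    """Return the local version label (text after '+') of a wheel name or URL."""
    # Keep only the filename and decode "%2B" back to "+".
    name = unquote(wheel.rsplit("/", 1)[-1])
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {name!r}")
    # Wheel names are name-version[-build]-python-abi-platform.
    parts = name[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {name!r}")
    _, sep, label = parts[1].partition("+")
    return label if sep else None


url = (
    "https://download.pytorch.org/whl/nightly/rocm5.3/"
    "torchvision-0.15.0.dev20221205%2Bcpu-cp310-cp310-linux_x86_64.whl"
)
label = local_version_label(url)
print(label)  # -> cpu

# Assumption: a ROCm 5.3 build would be labelled "rocm5.3".
print(label == "rocm5.3")  # -> False
```

For the wheel linked above the label is `cpu`, which matches the observation that the file sits under the `rocm5.3` index but does not look like a ROCm build.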

@osalpekar
Member

@malfet You're right, I see that issue and am working on a fix now.

@osalpekar
Member

Just merged pytorch/test-infra#1230 in test-infra. Tests suggest that PR successfully fixes the issue @jithunnair-amd mentioned. I'll monitor the nightlies tomorrow to ensure they've been published correctly.

facebook-github-bot pushed a commit that referenced this pull request Dec 12, 2022
Summary:
* Update to ROCm 5.3

* Regenerate config.yml

Reviewed By: datumbox

Differential Revision: D41836896

fbshipit-source-id: f317897c7fc314698f8d42c00f09ff5d859cd9f0