Qualcomm AI Engine Direct - Meta CI for Mobilebert , W2L, and Llama #8616

Merged

Conversation

winskuo-quic
Collaborator

Summary

  • Some fixes for Llama
  • Mobilebert CI enablement
  • Wav2Letter CI enablement


pytorch-bot bot commented Feb 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8616

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 275164d with merge base 24671a9 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 21, 2025
@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/meta_ci_mobilebert_w2l branch 8 times, most recently from 6bf844c to ca36c16 Compare February 25, 2025 14:53
@winskuo-quic winskuo-quic marked this pull request as ready for review February 25, 2025 14:56
@winskuo-quic winskuo-quic changed the title from "Qualcomm AI Engine Direct - Meta CI for Mobilebert and W2L" to "Qualcomm AI Engine Direct - Meta CI for Mobilebert , W2L, and Llama" Feb 25, 2025
@winskuo-quic
Collaborator Author

winskuo-quic commented Feb 25, 2025

Hi @guangy10, @cccclai,
This PR should fix some of the CI issues, including those for Mobilebert, W2L, and Llama.
For now, I am using 1 epoch for Mobilebert to speed up the training process.
For W2L, I am using empty weights since we are only compiling the model. However, if the weights can cause regressions during inference, we should probably use real weights in the future.
Also, I noticed that most of the example files (e.g., ic3, mv3, etc.) use only 1 random input when args.compile_only is set. This is probably not straightforward for community users, and I think we should either create a new flag for CI or use an actual dataset when compiling the model.
Please let me know your thoughts. Thanks
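
To make the "new flag for CI" option concrete, here is a minimal sketch. It assumes the example scripts parse their options with argparse; the `--ci` flag and the `build_parser` and `pick_inputs` helpers are hypothetical names used only for illustration and are not part of the existing examples. Only `args.compile_only` comes from the comment above.

```python
import argparse
import random


def build_parser():
    parser = argparse.ArgumentParser()
    # Existing behavior referenced in the comment: compile without running.
    parser.add_argument("--compile_only", action="store_true")
    # Hypothetical flag: request the cheap single-random-input path explicitly.
    parser.add_argument("--ci", action="store_true")
    return parser


def pick_inputs(args, dataset, input_len=4):
    """Return one seeded random input for CI runs, else the full dataset."""
    if args.ci:
        rng = random.Random(0)  # fixed seed keeps CI runs reproducible
        return [[rng.random() for _ in range(input_len)]]
    return list(dataset)


if __name__ == "__main__":
    dataset = [[float(i)] * 4 for i in range(10)]  # stand-in for real data
    ci_args = build_parser().parse_args(["--compile_only", "--ci"])
    user_args = build_parser().parse_args(["--compile_only"])
    print(len(pick_inputs(ci_args, dataset)), len(pick_inputs(user_args, dataset)))
```

With this split, `--compile_only` alone would keep using the real dataset, and only CI jobs that pass the extra flag would fall back to a single random input.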

@cccclai
Contributor

cccclai commented Feb 25, 2025

There is a CI error, can you fix it?

@winskuo-quic
Collaborator Author

winskuo-quic commented Feb 26, 2025

There is a CI error, can you fix it?

Thanks for the catch.
Another thing we noticed is that we develop in a nightly environment, while CI does not seem to use nightly libraries. I think this could sometimes cause CI to fail while our environment works fine. For example, #7634 (comment) could be caused by a different torchaudio library version.

@cccclai
Contributor

cccclai commented Feb 26, 2025

There is a CI error, can you fix it?

Thanks for the catch. Another thing we noticed is that we develop in a nightly environment, while CI does not seem to use nightly libraries. I think this could sometimes cause CI to fail while our environment works fine. For example, #7634 (comment) could be caused by a different torchaudio library version.

Hmm I think CI is using this version

NIGHTLY_VERSION = "dev20250131"
maybe we need to bump the version more regularly.
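
To put a number on how stale the pin is, here is a small sketch. It assumes the nightly string always has the form `dev` followed by a `YYYYMMDD` date; the `nightly_age_days` helper is a hypothetical name for illustration, not something that exists in the repository.

```python
from datetime import date, datetime

NIGHTLY_VERSION = "dev20250131"  # value quoted in the comment above


def nightly_age_days(version, today):
    """Days between the date encoded in a 'devYYYYMMDD' string and `today`."""
    if not version.startswith("dev"):
        raise ValueError(f"unexpected nightly version format: {version!r}")
    pinned = datetime.strptime(version[len("dev"):], "%Y%m%d").date()
    return (today - pinned).days


# Age of the pin on the date of this comment (Feb 26, 2025).
print(nightly_age_days(NIGHTLY_VERSION, date(2025, 2, 26)))  # 26
```

A check along these lines could warn once the pin is older than some agreed threshold; the threshold itself would be a project decision.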

@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@winskuo-quic
Collaborator Author

There is a CI error, can you fix it?

Thanks for the catch. Another thing we noticed is that we develop in a nightly environment, while CI does not seem to use nightly libraries. I think this could sometimes cause CI to fail while our environment works fine. For example, #7634 (comment) could be caused by a different torchaudio library version.

Hmm I think CI is using this version

NIGHTLY_VERSION = "dev20250131"

maybe we need to bump the version more regularly.

@cccclai,
I'm not sure I traced this correctly, but it seems the CI starts from setup-linux.sh, which calls install_executorch "use-pt-pinned-commit".
Inside install_executorch.py, I saw the following code block, which sets use_pytorch_nightly = False if args.use_pt_pinned_commit:
[image: screenshot of the code block]
Therefore, it will use the released version rather than the nightly version.
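
Since the screenshot is not reproduced here, the behavior described above can be sketched roughly as follows. This is an illustration and not the actual contents of install_executorch.py: only the names `use_pt_pinned_commit` and `use_pytorch_nightly` come from the comment, while the `--use-pt-pinned-commit` argparse spelling and the `resolve_use_nightly` helper are assumptions.

```python
import argparse


def resolve_use_nightly(argv):
    """Return whether the install would use the PyTorch nightly build.

    Hypothetical helper mirroring the logic described in the comment:
    passing the pinned-commit option turns nightly usage off.
    """
    parser = argparse.ArgumentParser()
    # Assumed flag spelling; the real script may parse this differently.
    parser.add_argument("--use-pt-pinned-commit", action="store_true")
    args = parser.parse_args(argv)

    use_pytorch_nightly = True
    if args.use_pt_pinned_commit:
        use_pytorch_nightly = False
    return use_pytorch_nightly


if __name__ == "__main__":
    print(resolve_use_nightly([]))
    print(resolve_use_nightly(["--use-pt-pinned-commit"]))
```

Under these assumptions, a CI job that always passes the pinned-commit option would never take the nightly path, which matches the mismatch described above.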

@cccclai
Contributor

cccclai commented Feb 26, 2025

@guangy10 @kirklandsign Do you know which version we're running?

@cccclai
Contributor

cccclai commented Feb 26, 2025

This PR needs to rebase...

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/meta_ci_mobilebert_w2l branch from 5415acb to 30effdb Compare February 26, 2025 23:58
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai
Contributor

cccclai commented Mar 1, 2025

Still failing... maybe it's related to #8642. Can you rebase again?

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/meta_ci_mobilebert_w2l branch from 30effdb to 9318778 Compare March 3, 2025 01:54
@cccclai
Contributor

cccclai commented Mar 3, 2025

Hmm looks like CI is failing..

@winskuo-quic winskuo-quic force-pushed the dev1/winskuo/meta_ci_mobilebert_w2l branch from 9318778 to 275164d Compare March 4, 2025 01:37
@facebook-github-bot
Contributor

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@cccclai cccclai merged commit d92384b into pytorch:main Mar 4, 2025
84 of 85 checks passed
zonglinpeng pushed a commit that referenced this pull request Mar 6, 2025
…8616)

* Qualcomm AI Engine Direct - Meta CI for Mobilebert and W2L

* variable update