Register split_table_batched_embeddings_benchmark (TBE) with AI Bench. #796

Closed

rweyrauch
Contributor

Summary: Add the TBE benchmark to the AI Bench platform.

Reviewed By: jianyuh

Differential Revision: D32777968

fbshipit-source-id: 64829f2b973b0bc6e245f531d08fd4b79859ba30
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D32777968

q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
Summary:
X-link: pytorch#3715

Pull Request resolved: facebookresearch/FBGEMM#796

This diff implements `generate_vbe_metadata` for CPU, so that the function returns the same output on CPU, CUDA, and MTIA.

To support VBE on CPU with the existing fixed-batch-size CPU kernel, the offsets must be recomputed, which was previously done in Python. This diff implements the offsets recomputation in C++ so that all manipulations are done in C++.

Note that reshaping offsets and grad_input to work with the existing fixed-batch-size CPU kernels is done in Autograd rather than in the wrapper, to avoid redundant computation.
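
For readers unfamiliar with the idea, here is a minimal Python sketch of what recomputing offsets for a fixed-batch-size kernel could look like. It is not the C++ code from this diff. The function name `pad_vbe_offsets` is hypothetical, and the sketch assumes a feature-major, CSR-style offsets list with one batch size per feature, where shorter features are padded with empty bags up to the largest batch size. The real VBE metadata may be laid out differently (for example, with batch sizes per feature and per rank).

```python
from typing import List


def pad_vbe_offsets(offsets: List[int], batch_sizes: List[int]) -> List[int]:
    """Hypothetical helper: expand variable-batch offsets to a fixed batch size.

    offsets:     CSR-style bag boundaries, length sum(batch_sizes) + 1 (assumed layout).
    batch_sizes: number of bags (samples) for each feature.
    Returns offsets with max(batch_sizes) bags per feature; padded bags are empty.
    """
    if not batch_sizes:
        return list(offsets)
    if len(offsets) != sum(batch_sizes) + 1:
        raise ValueError("offsets must have sum(batch_sizes) + 1 entries")

    max_b = max(batch_sizes)
    padded: List[int] = []
    pos = 0
    for b in batch_sizes:
        # Start offsets of the feature's real bags.
        padded.extend(offsets[pos:pos + b])
        # Padded bags start and end where the feature's last real bag ends,
        # so they contain no indices.
        padded.extend([offsets[pos + b]] * (max_b - b))
        pos += b
    # Closing boundary of the final bag.
    padded.append(offsets[-1])
    return padded


# Feature 0 has 2 bags, feature 1 has 1 bag; feature 1 gains one empty bag.
print(pad_vbe_offsets([0, 2, 3, 5], [2, 1]))
```

Under these assumptions the result always has `len(batch_sizes) * max(batch_sizes) + 1` entries, which is the shape a fixed-batch-size kernel would expect. The outputs for the padded bags would then need to be dropped or ignored by the caller.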

VBE CPU tests are in the next diff.

Reviewed By: sryap, nautsimon

Differential Revision: D69162870

fbshipit-source-id: 08c6e45b8f0d319b96371acaba0d9a27570a1bd7
2 participants