
replace device param with bounds_check_warning of inputs_to_device function #3831


Closed

Conversation

tiankongdeguiji (Contributor) commented Mar 17, 2025

When using fbgemm-gpu version 1.1.0 or later, the FX-traced model is bound to the device cuda:0. As a result, the model cannot be deployed to a CPU or to a different device such as cuda:1. This can be reproduced with the case in #3830.

This PR fixes it: `fbgemm_gpu_split_table_batched_embeddings_ops_inference_inputs_to_device` no longer takes `device(type='cuda', index=0)` as an input.

···
    getitem_1 = _fx_trec_unwrap_kjt[0]
    getitem_2 = _fx_trec_unwrap_kjt[1];  _fx_trec_unwrap_kjt = None
    _tensor_constant0 = self._tensor_constant0
    inputs_to_device = fbgemm_gpu_split_table_batched_embeddings_ops_inference_inputs_to_device(getitem_1, getitem_2, None, _tensor_constant0);  getitem_1 = getitem_2 = _tensor_constant0 = None
    getitem_3 = inputs_to_device[0]
    getitem_4 = inputs_to_device[1]
    getitem_5 = inputs_to_device[2];  inputs_to_device = None
    _tensor_constant1 = self._tensor_constant1
    _tensor_constant0_1 = self._tensor_constant0
    bounds_check_indices = torch.ops.fbgemm.bounds_check_indices(_tensor_constant1, getitem_3, getitem_4, 1, _tensor_constant0_1, getitem_5);  _tensor_constant1 = _tensor_constant0_1 = bounds_check_indices = None
    _tensor_constant2 = self._tensor_constant2
    _tensor_constant3 = self._tensor_constant3
···
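For background on why a baked-in device breaks relocation: FX tracing embeds non-tensor constants, including `torch.device` objects, directly into the generated graph code. The sketch below (a hypothetical toy module, not the FBGEMM code) contrasts the device-bound pattern with a device-agnostic one that derives the target from an input tensor:

```python
import torch
import torch.fx


class DeviceBound(torch.nn.Module):
    """Stores a fixed device; tracing bakes cuda:0 into the graph code."""

    def __init__(self):
        super().__init__()
        self.target = torch.device("cuda:0")

    def forward(self, x):
        return x.to(self.target)


class DeviceAgnostic(torch.nn.Module):
    """Derives the target device from another input tensor, so the
    traced graph contains no literal device constant."""

    def forward(self, x, ref):
        return x.to(ref.device)


# Tracing is symbolic, so no CUDA device is needed to run this.
bound = torch.fx.symbolic_trace(DeviceBound())
agnostic = torch.fx.symbolic_trace(DeviceAgnostic())

print("cuda" in bound.code)     # True: device(type='cuda', index=0) appears in the code
print("cuda" in agnostic.code)  # False: the device is resolved at call time
```

Calling `bound.to("cpu")` would move parameters and buffers but leave the `device(type='cuda', index=0)` literal in the graph, which is the failure mode described above; deriving the device from an input (or passing `None`, as this PR does) keeps the traced module relocatable.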

netlify bot commented Mar 17, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

🔨 Latest commit: 0a3db61
🔍 Latest deploy log: https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/67d81fd9ee95a000082b8493
😎 Deploy Preview: https://deploy-preview-3831--pytorch-fbgemm-docs.netlify.app

tiankongdeguiji (Contributor, Author):

Hi @842974287 @q10 @jiayisuse @aporialiao, could you take a look?

facebook-github-bot (Contributor):

@q10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

q10 (Contributor) commented Mar 18, 2025

Hi @tiankongdeguiji, thanks for opening this PR. We will look into this and get back to you if we have questions; if there are none, we will merge the PR.

tiankongdeguiji (Contributor, Author):

> Hi @tiankongdeguiji, thanks for opening this PR. We will look into this and get back to you if we have questions; if there are none, we will merge the PR.

thx!

facebook-github-bot (Contributor):

@q10 merged this pull request in c01a227.

liligwu pushed a commit to ROCm/FBGEMM that referenced this pull request Mar 19, 2025
…nction (pytorch#3831)

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/930

When using fbgemm-gpu version 1.1.0 or later, the FX-traced model is bound to the device cuda:0. As a result, the model cannot be deployed to a CPU or to a different device such as cuda:1. This can be reproduced with the case in pytorch#3830.

This fixes it: `fbgemm_gpu_split_table_batched_embeddings_ops_inference_inputs_to_device` no longer takes `device(type='cuda', index=0)` as an input.
```
···
    getitem_1 = _fx_trec_unwrap_kjt[0]
    getitem_2 = _fx_trec_unwrap_kjt[1];  _fx_trec_unwrap_kjt = None
    _tensor_constant0 = self._tensor_constant0
    inputs_to_device = fbgemm_gpu_split_table_batched_embeddings_ops_inference_inputs_to_device(getitem_1, getitem_2, None, _tensor_constant0);  getitem_1 = getitem_2 = _tensor_constant0 = None
    getitem_3 = inputs_to_device[0]
    getitem_4 = inputs_to_device[1]
    getitem_5 = inputs_to_device[2];  inputs_to_device = None
    _tensor_constant1 = self._tensor_constant1
    _tensor_constant0_1 = self._tensor_constant0
    bounds_check_indices = torch.ops.fbgemm.bounds_check_indices(_tensor_constant1, getitem_3, getitem_4, 1, _tensor_constant0_1, getitem_5);  _tensor_constant1 = _tensor_constant0_1 = bounds_check_indices = None
    _tensor_constant2 = self._tensor_constant2
    _tensor_constant3 = self._tensor_constant3
···
```

Pull Request resolved: pytorch#3831

Reviewed By: sryap

Differential Revision: D71370666

Pulled By: q10

fbshipit-source-id: e8f65a534bf8235534ff861d1f135497f4660820
q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
…nction (pytorch#930)

Summary:
Pull Request resolved: facebookresearch/FBGEMM#930


X-link: pytorch#3831

Reviewed By: sryap

Differential Revision: D71370666

Pulled By: q10

fbshipit-source-id: e8f65a534bf8235534ff861d1f135497f4660820