Use F14Map instead of std::map in CodeCache.h, and reuse Ctile buffer in ExecuteU8S8.cc #670

dskhudia · 2021-08-23T16:56:02Z

Differential Revision: D30466385

… in ExecuteU8S8.cc Differential Revision: D30466385 fbshipit-source-id: a2d29c13d1c7ef9aaae5b99102fdddcea958aad8
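The diff itself is not shown here, so the following is only a minimal sketch of the idea in the title: a code cache whose container type is a template parameter, so an ordered `std::map` can be swapped for a hash map. `std::unordered_map` stands in for folly's F14 map to keep the example dependency-free. `KernelSig`, `KernelSigHash`, `CodeCacheSketch`, and `getOrCreate` are hypothetical names for illustration, not the contents of CodeCache.h.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <tuple>
#include <unordered_map>

// Hypothetical kernel signature used as the cache key (an assumption, not FBGEMM's key type).
using KernelSig = std::tuple<int, int, bool>;

// An ordered map only needs operator< on the key; a hash map needs a hash functor like this.
struct KernelSigHash {
  std::size_t operator()(const KernelSig& k) const {
    std::size_t h = std::hash<int>{}(std::get<0>(k));
    h = h * 31 + std::hash<int>{}(std::get<1>(k));
    h = h * 31 + std::hash<bool>{}(std::get<2>(k));
    return h;
  }
};

// Cache templated on the map type so the container can be swapped without touching callers.
template <typename Value, typename Map>
class CodeCacheSketch {
 public:
  // Returns the cached value for `key`; calls `generate` only on a cache miss.
  Value getOrCreate(const KernelSig& key, const std::function<Value()>& generate) {
    assert(generate && "generator must be callable");
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = map_.find(key);
    if (it != map_.end()) {
      return it->second;
    }
    Value v = generate();
    map_.emplace(key, v);
    return v;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return map_.size();
  }

 private:
  Map map_;
  mutable std::mutex mutex_;
};

// Ordered lookup (O(log n) comparisons) versus hashed lookup (O(1) on average).
using OrderedCache = CodeCacheSketch<int, std::map<KernelSig, int>>;
using HashedCache =
    CodeCacheSketch<int, std::unordered_map<KernelSig, int, KernelSigHash>>;
```

Both aliases expose the same `getOrCreate` interface, so only the alias changes when the container is replaced. In a codebase that already depends on folly, an F14 map would presumably take the place of `std::unordered_map` here.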

facebook-github-bot · 2021-08-23T16:56:21Z

This pull request was exported from Phabricator. Differential Revision: D30466385

facebook-github-bot · 2021-08-26T02:31:10Z

This pull request has been merged in d5e66bf.

Summary:
Pull Request resolved: facebookresearch/FBGEMM#670
X-link: pytorch#3584

Seq INT4 -> INT4 STBE lookup is supported in the diff stack: https://www.internalfb.com/diff/D61305978. This diff:
1. Supports the dequantization of INT4 -> INT4 STBE lookup on CUDA for all float types
2. Extends the dequantization of INT4 -> INT4 STBE lookup on CPU to BF16

The main gap is handling the dequantization when the scale and bias of the INT4 quantized tensor are in front. For CPU, only the BF16 dequantization based on dtype needs to be added. This will enable us to reduce the network overhead to the remote embedding server as well as the D2H data transfer from GPU to host.

Reviewed By: jiayisuse
Differential Revision: D68187234
fbshipit-source-id: 2c082775a711f1738eb8e5b7ee9319c4d70d7240
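To make the "scale and bias in front" case concrete, here is a rough CPU sketch of row-wise INT4 dequantization. The row layout (a `float` scale, then a `float` bias, then packed nibbles with the low nibble first) and the function name `dequantizeInt4RowScaleBiasFront` are assumptions chosen for illustration. The actual format, scale and bias width, and output dtypes (such as BF16) used by the referenced diff are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed row layout (illustrative only):
//   [float scale][float bias][packed 4-bit values, low nibble first]
// Dequantizes `numElements` 4-bit values to float as: scale * q + bias.
std::vector<float> dequantizeInt4RowScaleBiasFront(
    const std::uint8_t* row, std::size_t numElements) {
  assert(row != nullptr);

  // Read scale and bias from the front of the row; memcpy avoids unaligned loads.
  float scale = 0.0f;
  float bias = 0.0f;
  std::memcpy(&scale, row, sizeof(float));
  std::memcpy(&bias, row + sizeof(float), sizeof(float));

  // The packed payload starts right after the two header values.
  const std::uint8_t* packed = row + 2 * sizeof(float);

  std::vector<float> out(numElements);
  for (std::size_t i = 0; i < numElements; ++i) {
    const std::uint8_t byte = packed[i / 2];
    // Even indices take the low nibble, odd indices take the high nibble.
    const std::uint8_t q = (i % 2 == 0) ? (byte & 0x0F) : (byte >> 4);
    out[i] = scale * static_cast<float>(q) + bias;
  }
  return out;
}
```

With the scale and bias stored at the end of the row instead, only the two header offsets and the payload pointer would change. The per-element arithmetic stays the same.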

Use F14Map instead of std::map in CodeCache.h, and reuse Ctile buffer…

a531a44

… in ExecuteU8S8.cc Differential Revision: D30466385 fbshipit-source-id: a2d29c13d1c7ef9aaae5b99102fdddcea958aad8

facebook-github-bot added the cla signed and fb-exported labels Aug 23, 2021

facebook-github-bot closed this in d5e66bf Aug 26, 2021

facebook-github-bot added the Merged label Aug 26, 2021
