Set accumulate type to bf16 in activation quant #152

lsy323 · 2024-07-19T02:34:32Z

XLA generates a more optimized graph this way.

Before this change, some collective ops operated on int32 matmul results; with this change they operate on bf16 (as expected). Latency improves by ~5% compared with the per-channel int8 weight-only quant baseline, on llama2 70B with BS=96.
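For readers without the diff at hand, here is a minimal sketch of what "accumulate type" means for an int8 x int8 matmul in JAX. It is not the code from this PR: the helper names `quantize_per_row` and `int8_matmul`, the shapes, and the per-row scaling scheme are assumptions made for illustration. The only library feature relied on is the `preferred_element_type` argument of `jax.lax.dot_general`, which selects the output (accumulation) dtype of the dot.

```python
import jax
import jax.numpy as jnp


def quantize_per_row(x):
    """Symmetric int8 quantization with one scale per row (hypothetical helper)."""
    max_abs = jnp.max(jnp.abs(x), axis=-1, keepdims=True)
    scale = jnp.maximum(max_abs, 1e-6) / 127.0  # guard against all-zero rows
    q = jnp.clip(jnp.round(x / scale), -127, 127).astype(jnp.int8)
    return q, scale


def int8_matmul(x_q, w_q, accum_dtype):
    """dot(int8, int8) contracting the last axis of both operands."""
    dims = (((1,), (1,)), ((), ()))  # (contracting dims, batch dims)
    return jax.lax.dot_general(x_q, w_q, dims, preferred_element_type=accum_dtype)


key_x, key_w = jax.random.split(jax.random.PRNGKey(0))
x = jax.random.normal(key_x, (4, 64), dtype=jnp.float32)  # activations [tokens, in]
w = jax.random.normal(key_w, (8, 64), dtype=jnp.float32)  # weights [out, in]

x_q, x_scale = quantize_per_row(x)  # one scale per token
w_q, w_scale = quantize_per_row(w)  # one scale per output channel

out_int32 = int8_matmul(x_q, w_q, jnp.int32)     # result dtype is int32
out_bf16 = int8_matmul(x_q, w_q, jnp.bfloat16)   # result dtype is bf16

# Undo both scales to get back to real-valued outputs.
y_int32 = out_int32.astype(jnp.float32) * x_scale * w_scale.T
y_bf16 = out_bf16.astype(jnp.float32) * x_scale * w_scale.T
```

With `jnp.bfloat16` the dot result is already in the dtype the rest of the model uses, so any op that consumes it (including a collective inserted by sharding) sees bf16 rather than int32. The trade-off is that bf16 carries only 8 significant bits, so the result is rounded relative to the exact int32 sum.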

FanhaiLu1

Can you update the comment on line 142: "# We have to call jax because we need to do dot(int8, int8)->int32."

lsy323 · 2024-07-19T03:39:55Z

> Can you update the comment on line 142: "# We have to call jax because we need to do dot(int8, int8)->int32."

Updated, thanks for reminding!

set accumulate type to bf16

036e251

lsy323 marked this pull request as ready for review July 19, 2024 02:34

lsy323 requested review from qihqi, wang2yn84 and FanhaiLu1 July 19, 2024 02:35

FanhaiLu1 approved these changes Jul 19, 2024

FanhaiLu1 reviewed Jul 19, 2024

fix comment

df9fd38

lsy323 merged commit 60c2fa5 into AI-Hypercomputer:main Jul 19, 2024
4 checks passed

lsy323 deleted the lsiyuan/fix-act-quant branch July 19, 2024 04:45
