
batched dense vector x jagged 2D multiplication #997

Closed
wants to merge 7 commits into from

Conversation

jspark1105
Contributor

Differential Revision: D34876009

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D34876009

Differential Revision: D34804180

fbshipit-source-id: dd54ceb8c2ee5f5e2b191ddb4e40604fae229b87

Differential Revision: D34812899

fbshipit-source-id: 34e1649a6a1303532f5d3f82f808fff63d3a84b7

Differential Revision: D34840551

fbshipit-source-id: 1453db8ad12889a3cb15c396943c4f9d4c08a9ec

Differential Revision: D34840552

fbshipit-source-id: 54599136bc514acbbe0f548107b9c4165df06b17

Differential Revision: D34844686

fbshipit-source-id: fe7af0b3c67e41fc7cbb0e852969bb801722ed6e

Differential Revision: D34845894

fbshipit-source-id: 16a50531b15d1b65eb20579d79d00c36ef35bf18

Summary: Pull Request resolved: pytorch#997

Differential Revision: D34876009

fbshipit-source-id: 74740a81a2c627622af5b0398e783fb4106b64af

q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
Summary:
X-link: pytorch#3908

Pull Request resolved: facebookresearch/FBGEMM#997

This diff works around a stochastic rounding issue on AMD GPUs.

Problem:

`quantize_store` calls `nearest_rounding_vector` instead of
`stochastic_rounding_vector` even when stochastic rounding is enabled,
because the `StochasticRoundingRNGState` pointer it receives is a
nullptr. (https://fburl.com/code/kna14icj)
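
To make the failure mode concrete, here is a minimal host-side C++ sketch
of a rounding dispatch keyed on a nullable RNG state pointer. It is not
the FBGEMM code: `RngStateSketch`, `next_uniform`, `round_nearest`,
`round_stochastic`, and `quantize_sketch` are hypothetical names, the
xorshift generator is a stand-in chosen only for illustration, and the
real routines operate on vectors inside a GPU kernel. The point is only
that a null state selects nearest rounding without any error.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for an RNG state (not StochasticRoundingRNGState).
// The seed must be nonzero for xorshift to produce a useful sequence.
struct RngStateSketch {
  uint64_t s;
};

// One xorshift64 step; returns a float in [0, 1) built from the top 24 bits.
inline float next_uniform(RngStateSketch* st) {
  uint64_t x = st->s;
  x ^= x << 13;
  x ^= x >> 7;
  x ^= x << 17;
  st->s = x;
  return static_cast<float>(x >> 40) * (1.0f / 16777216.0f);
}

// Deterministic rounding to the nearest integer.
inline int round_nearest(float v) {
  return static_cast<int>(std::nearbyint(v));
}

// Rounds up with probability equal to the fractional part of v,
// so the result is unbiased in expectation.
inline int round_stochastic(float v, RngStateSketch* st) {
  const float lo = std::floor(v);
  const float frac = v - lo;
  return static_cast<int>(lo) + (next_uniform(st) < frac ? 1 : 0);
}

// Dispatch in the shape described above: a null state pointer
// silently falls back to nearest rounding.
inline int quantize_sketch(float v, RngStateSketch* st) {
  return st != nullptr ? round_stochastic(v, st) : round_nearest(v);
}
```

With a valid state, `quantize_sketch(2.25f, &st)` returns 2 or 3; with a
null state it always returns 2, which is why an unexpectedly null pointer
can go unnoticed.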

We found that the `WeightRow` constructor also receives a null
`StochasticRoundingRNGState` pointer. (https://fburl.com/code/vyq53lia)

We confirmed that `stochastic_rounding` is true at the point where
`WeightRow` is instantiated, so `WeightRow` should receive `&state`.
Instead, it receives a nullptr. (https://fburl.com/code/o3kxgt4z)

We suspect that the compiler optimized away the
`StochasticRoundingRNGState` object, since it is only passed to
`WeightRow` and is not used anywhere else in the caller kernel.

Workaround:

We move the `StochasticRoundingRNGState` storage inside the
`WeightRow` struct and pass a boolean to the `WeightRow` constructor
instead.
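
The change in ownership can be sketched in plain C++ as follows. This is
an illustration under stated assumptions, not the actual `WeightRow`
definition: `OwnedRngSketch`, `RowWithPointerSketch`, and
`RowWithOwnedStateSketch` are hypothetical names, and the `seed`
constructor parameter is an assumption, since the description above does
not say how the embedded state is initialized.

```cpp
#include <cstdint>

// Hypothetical RNG state; stands in for StochasticRoundingRNGState.
struct OwnedRngSketch {
  uint64_t s;
};

// "Before" shape: the helper keeps a pointer to state owned by the caller.
// A null pointer means "use nearest rounding".
struct RowWithPointerSketch {
  OwnedRngSketch* state;

  explicit RowWithPointerSketch(OwnedRngSketch* st) : state(st) {}

  bool uses_stochastic() const { return state != nullptr; }
};

// "After" shape: the helper stores the state itself and the caller passes
// only a boolean, so the rounding mode no longer depends on a pointer to
// an object that lives in the caller.
struct RowWithOwnedStateSketch {
  bool stochastic;
  OwnedRngSketch state;

  // `seed` is an assumed parameter used only to initialize this sketch.
  RowWithOwnedStateSketch(bool stochastic_rounding, uint64_t seed)
      : stochastic(stochastic_rounding),
        state{stochastic_rounding ? seed : 0} {}

  bool uses_stochastic() const { return stochastic; }
};
```

In this shape the rounding mode is carried by the flag rather than by
pointer nullness, which is the property the workaround relies on.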

Reviewed By: q10, yinbinm, jianyuh, xw285cornell, yoyoyocmu, joebos

Differential Revision: D72201618

fbshipit-source-id: a2bc7f004ac5183c84eb0501ada6d848ebca17e1
2 participants