Add generic IEEE754 truncation code #3820

Closed
wants to merge 1 commit into from

MatzeB (Contributor) commented Mar 14, 2025

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/905

This adds a generic implementation of IEEE 754 floating-point truncation.

This prepares for conversion to the FP8 E5M2 and FP8 E4M3FN formats. For consistency, it also replaces the existing float2half conversion functions.

Reviewed By: r-barnes

Differential Revision: D69941314
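
For readers without access to the diff, here is a minimal sketch of what a generic float32 narrowing routine can look like. It is not the code from this PR: the name `truncate_float`, the template parameters, and the choice of round-to-nearest-even are all assumptions made for illustration. The sketch only covers targets that keep IEEE-style infinities and NaNs (such as FP16 and FP8 E5M2). FP8 E4M3FN has no infinities and a different NaN encoding, so it would need extra handling that is not shown.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not FBGEMM's API): narrow a float32 to an IEEE-style
// format with EXP_BITS exponent bits and MAN_BITS mantissa bits, using
// round-to-nearest-even. The result is returned in the low bits of a uint32_t.
template <int EXP_BITS, int MAN_BITS>
uint32_t truncate_float(float value) {
  static_assert(EXP_BITS >= 2 && EXP_BITS < 8, "exponent must be narrower than float32's");
  static_assert(MAN_BITS >= 1 && MAN_BITS < 23, "mantissa must be narrower than float32's");

  uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));  // well-defined way to read the bit pattern

  const uint32_t sign = (bits >> 31) << (EXP_BITS + MAN_BITS);
  const uint32_t src_exp = (bits >> 23) & 0xFFu;
  const uint32_t src_man = bits & 0x7FFFFFu;
  const uint32_t inf = ((1u << EXP_BITS) - 1u) << MAN_BITS;  // all-ones exponent
  const int dst_bias = (1 << (EXP_BITS - 1)) - 1;

  if (src_exp == 0xFFu) {  // infinity or NaN
    uint32_t man = src_man >> (23 - MAN_BITS);
    if (src_man != 0 && man == 0) {
      man = 1;  // keep the mantissa nonzero so a NaN stays a NaN
    }
    return sign | inf | man;
  }
  if (src_exp == 0) {
    // Zero or float32 subnormal: always less than half of the smallest
    // target subnormal for the widths allowed above, so it rounds to zero.
    return sign;
  }

  // Rebias the exponent for the target format.
  const int dst_exp = static_cast<int>(src_exp) - 127 + dst_bias;
  int shift = 23 - MAN_BITS;  // mantissa bits to drop
  uint32_t exp_base = 0;
  if (dst_exp >= 1) {
    // Normal result. Subtract one because the implicit leading bit kept in
    // `kept` below adds one back into the exponent field.
    exp_base = static_cast<uint32_t>(dst_exp - 1) << MAN_BITS;
  } else {
    // Subnormal result: shift the significand further right.
    shift += 1 - dst_exp;
    if (shift > 24) {
      return sign;  // below half of the smallest subnormal
    }
  }

  const uint32_t sig = src_man | 0x800000u;  // 24-bit significand with implicit bit
  uint32_t kept = sig >> shift;
  const uint32_t rem = sig & ((1u << shift) - 1u);
  const uint32_t half = 1u << (shift - 1);
  if (rem > half || (rem == half && (kept & 1u))) {
    ++kept;  // round to nearest, ties to even; a carry bumps the exponent
  }

  uint32_t result = exp_base + kept;
  if (result > inf) {
    result = inf;  // overflow saturates to infinity
  }
  return sign | result;
}
```

Under these assumptions, `truncate_float<5, 10>(1.0f)` gives the FP16 pattern `0x3C00`, and `truncate_float<5, 2>(1.0f)` gives the E5M2 pattern `0x3C`. Adding the rounded significand to the exponent base, instead of OR-ing the two, lets a mantissa carry move into the exponent without a separate branch.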

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D69941314

netlify bot commented Mar 14, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit e84baf0
🔍 Latest deploy log https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/67d4abcda3f0d90008b81e05
😎 Deploy Preview https://deploy-preview-3820--pytorch-fbgemm-docs.netlify.app

@MatzeB MatzeB force-pushed the export-D69941314 branch from 8340f69 to e84baf0 Compare March 14, 2025 22:20

@facebook-github-bot (Contributor)

This pull request has been merged in 49d6314.

liligwu pushed a commit to ROCm/FBGEMM that referenced this pull request Mar 19, 2025
Summary:
Pull Request resolved: pytorch#3820

X-link: https://github.com/facebookresearch/FBGEMM/pull/905

This adds a generic implementation of IEEE 754 floating-point truncation.

This prepares for conversion to the FP8 E5M2 and FP8 E4M3FN formats. For consistency, it also replaces the existing float2half conversion functions.

Reviewed By: r-barnes

Differential Revision: D69941314

fbshipit-source-id: 1fb3d34ef3d6bc4613fbc1d522bf7c3eca53d568
q10 pushed a commit to q10/FBGEMM that referenced this pull request Apr 10, 2025
Summary:
X-link: pytorch#3820

Pull Request resolved: facebookresearch/FBGEMM#905

This adds a generic implementation of IEEE 754 floating-point truncation.

This prepares for conversion to the FP8 E5M2 and FP8 E4M3FN formats. For consistency, it also replaces the existing float2half conversion functions.

Reviewed By: r-barnes

Differential Revision: D69941314

fbshipit-source-id: 1fb3d34ef3d6bc4613fbc1d522bf7c3eca53d568