Implement IEEE 754 rounding conditions for fp32 to fp16 conversion in host #74044
Your PR was submitted successfully. Thank you for contributing to the open source project!
/re-run coverage build |
/re-run Infer |
/re-run inference build |
/re-run all-failed |
LGTM
/re-run all-failed |
LGTM
PR Category
Operator Mechanism
PR Types
Bug fixes
Description
Problem Summary
When converting fp32 tensors to fp16 on the host, the current implementation lacks IEEE 754-compliant rounding logic, which leads to precision errors.
Example
Solution Approach
Implemented IEEE 754 round-to-nearest-even rounding (ties go to the value with an even mantissa) for the host-side fp32 to fp16 conversion.
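For readers unfamiliar with the rounding rule, here is a minimal Python sketch of round-to-nearest-even when narrowing fp32 to fp16. It is illustrative only and is not the code in this PR: the function name `float32_to_half_bits` is hypothetical, and NaN payloads are collapsed to a single quiet NaN for brevity.

```python
import struct


def float32_to_half_bits(value: float) -> int:
    """Return the fp16 bit pattern for `value` (hypothetical helper).

    The input is first stored as fp32, then narrowed to fp16 using
    IEEE 754 round-to-nearest, ties-to-even.
    """
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    sign = (bits >> 16) & 0x8000      # sign moved to bit 15
    exp = (bits >> 23) & 0xFF         # biased fp32 exponent
    mant = bits & 0x7FFFFF            # 23-bit fp32 mantissa

    if exp == 0xFF:                   # Inf or NaN
        return sign | 0x7C00 | (0x0200 if mant else 0)

    half_exp = exp - 127 + 15         # re-bias for fp16
    if half_exp >= 0x1F:              # too large: overflow to infinity
        return sign | 0x7C00

    if half_exp <= 0:                 # result is subnormal or zero
        if half_exp < -10:            # below half of the smallest subnormal
            return sign
        mant |= 0x800000              # make the implicit leading 1 explicit
        shift = 14 - half_exp         # number of bits dropped (14..24)
        half_mant = mant >> shift
        remainder = mant & ((1 << shift) - 1)
        halfway = 1 << (shift - 1)
        # Round up if above halfway, or exactly halfway with an odd result.
        if remainder > halfway or (remainder == halfway and (half_mant & 1)):
            half_mant += 1
        return sign | half_mant

    # Normal result: keep the top 10 mantissa bits, inspect the 13 dropped.
    half = (half_exp << 10) | (mant >> 13)
    remainder = mant & 0x1FFF
    if remainder > 0x1000 or (remainder == 0x1000 and (half & 1)):
        half += 1                     # a carry into the exponent is intended
    return sign | half


# 1 + 2**-11 is exactly halfway between two fp16 values; the tie goes to
# the neighbor with an even mantissa, which is 1.0 (0x3C00).
print(hex(float32_to_half_bits(1.0 + 2.0 ** -11)))
```

The key point is the tie case: simple truncation or round-half-up would pick the wrong neighbor for exactly-halfway inputs, while ties-to-even picks the value whose lowest mantissa bit is 0.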
Validation
Others
This code is hard to follow, so I have added comments to clarify it.
Pcard-67164