Adding support for float16 conversion for GPUs #273

Merged
philschmid merged 6 commits into main from add-fp-16-optimization on Jul 10, 2022

@philschmid
Contributor

What does this PR do?

This PR adds support for converting model weights from fp32 to fp16 by adding a new optimization parameter. If the fp16 argument is set in the OptimizationConfig, the weights are converted. I also added a test to make sure the optimized model does not contain any fp32 weights.
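
A minimal sketch of the kind of check described above, assuming the optimized model has been loaded with `onnx.load` so that its weights are exposed as `model.graph.initializer`. The helper name `find_fp32_initializers` and the stand-in model are hypothetical and not taken from this PR. The numeric codes mirror the ONNX `TensorProto.FLOAT` (1) and `TensorProto.FLOAT16` (10) enum values and are hard-coded so the snippet runs without onnx installed.

```python
from types import SimpleNamespace

# ONNX TensorProto.DataType enum values (FLOAT = 1, FLOAT16 = 10).
ONNX_FLOAT = 1
ONNX_FLOAT16 = 10


def find_fp32_initializers(model):
    """Return the names of all weight tensors still stored as fp32."""
    return [
        init.name
        for init in model.graph.initializer
        if init.data_type == ONNX_FLOAT
    ]


# Stand-in for the object returned by onnx.load(...); only the fields
# the helper reads are modelled here (an assumption for illustration).
fake_model = SimpleNamespace(
    graph=SimpleNamespace(
        initializer=[
            SimpleNamespace(name="encoder.weight", data_type=ONNX_FLOAT16),
            SimpleNamespace(name="encoder.bias", data_type=ONNX_FLOAT16),
        ]
    )
)

# An empty list means no fp32 weights are left after conversion.
leftover = find_fp32_initializers(fake_model)
```

With a real model, the same helper could be called on the result of `onnx.load(...)`, and a test would assert that the returned list is empty.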

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@philschmid philschmid requested a review from echarlaix July 8, 2022 16:45
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Contributor

@regisss regisss left a comment

LGTM @philschmid !!
I just left two minor comments.

Comment thread optimum/onnxruntime/configuration.py Outdated
Comment thread tests/onnxruntime/test_optimization.py
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Collaborator

@echarlaix echarlaix left a comment

LGTM, thanks for this addition @philschmid

onnx_optimized_model_output_path=optimized_model_path,
optimization_config=optimization_config,
)
model = onnx.load(optimized_model_path.as_posix())
Member

Can you add a test about the relative difference in the output? I'm not a huge fan of changing dtype for all the layers like this.
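
One way such a relative-difference test might look, as a sketch using only the Python standard library. The sample values, the helper names `to_fp16` and `max_relative_difference`, and the `1e-3` tolerance are assumptions for illustration. In a real test the two lists would come from running the fp32 and fp16 models on the same inputs, whereas here the fp16 values are simulated by rounding.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def max_relative_difference(reference, candidate, eps=1e-12):
    """Largest element-wise |ref - cand| / max(|ref|, eps)."""
    return max(
        abs(r - c) / max(abs(r), eps)
        for r, c in zip(reference, candidate)
    )


# Hypothetical fp32 outputs (e.g. logits) and their fp16-rounded counterparts.
fp32_outputs = [0.1234567, -2.7182818, 3.1415927, 100.125]
fp16_outputs = [to_fp16(v) for v in fp32_outputs]

rel_diff = max_relative_difference(fp32_outputs, fp16_outputs)

# The tolerance is an assumption: half precision keeps roughly three
# significant decimal digits.
assert rel_diff < 1e-3
```

Rounding a single value to half precision bounds the error well below this tolerance, but a full model accumulates error across layers, so the threshold in an actual test would need to be chosen empirically.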

Contributor Author

Not really sure why, since we are not applying any custom logic.
What if the difference is larger? What do you expect from the test?

Contributor Author

Added the test, but I'm not sure what the benefit is.

@philschmid philschmid merged commit 65ad733 into main Jul 10, 2022
@philschmid philschmid deleted the add-fp-16-optimization branch July 10, 2022 19:46

5 participants