How to implement PTQ in this paper? #1220
Replies: 2 comments
-
There is no planned support for tflite at the moment. If you can convert the produced model to any of the frontends supported by hls4ml (keras v2/v3, torch, or onnx), it may be consumed by hls4ml. Though I'm not familiar with tflite, conversion to onnx seems doable. However, note that you will need to handle all metadata (bit widths, scaling factors, etc.) correctly. Depending on the quantization scheme, you may not be able to represent the model correctly in hardware, as tflite may use (dynamic) float scaling factors, which are not supported by hls4ml. May I ask what your use case is? In general, we strongly recommend QAT over PTQ when it is feasible, which is almost always the case for models deployed on FPGAs, as they are relatively small.
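To make the scaling-factor point concrete, here is a minimal plain-Python sketch (not the TFLite or hls4ml API; the function names and the example scale `0.013` are illustrative): tflite-style affine quantization uses an arbitrary float scale, whereas fixed-point hardware types like hls4ml's `ap_fixed` effectively restrict the scale to a power of two, i.e. a pure bit shift.

```python
import math

def affine_quantize(x, scale, zero_point, bits=8):
    """TFLite-style affine quantization: q = round(x / scale) + zero_point."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))

def affine_dequantize(q, scale, zero_point):
    """Map the integer code back to a real value."""
    return (q - zero_point) * scale

def nearest_pow2(scale):
    """Snap a float scale to the nearest power of two; a power-of-two
    scale is just a bit shift, so it maps directly onto ap_fixed."""
    return 2.0 ** round(math.log2(scale))

x = 0.37
float_scale = 0.013                   # arbitrary float scale, as tflite may produce
p2_scale = nearest_pow2(float_scale)  # 2**-6 = 0.015625, hardware-friendly

q_float = affine_quantize(x, float_scale, 0)  # needs a float multiply to dequantize
q_p2 = affine_quantize(x, p2_scale, 0)        # dequantizes with a shift

print(q_float, affine_dequantize(q_float, float_scale, 0))
print(q_p2, affine_dequantize(q_p2, p2_scale, 0))
```

A model quantized with per-tensor power-of-two scales can be expressed exactly with `ap_fixed` types; a model with arbitrary (or dynamic, per-batch) float scales cannot, which is the representability problem mentioned above.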
-
I used QKeras to build AlexNet and performed QAT on CIFAR-10. However, I would like to compare two approaches after conversion to HLS: training a plain Keras model followed by PTQ versus QAT, in terms of accuracy, parameter count, and hardware resource usage (such as LUTs and DSPs).
-
In this paper: https://cds.cern.ch/record/2754189/files/2103.05579.pdf?version=2
It mentions that PTQ has lower accuracy than QAT. I am using TensorFlow Lite for PTQ, but hls4ml does not support the TFLite format.
What method can I use to implement PTQ and ensure compatibility with hls4ml?
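One hls4ml-compatible route, sketched below in plain Python (this is an assumption-laden illustration, not an official hls4ml or TFLite recipe; `po2_ptq` and the example weight list are hypothetical): do PTQ yourself on the trained Keras weights, but restrict every per-tensor scale to a power of two so the result maps onto an `ap_fixed<total, integer>` precision string that hls4ml understands.

```python
import math

def po2_ptq(weights, bits=8):
    """Per-tensor PTQ with a power-of-two scale.

    Picks integer bits from the largest magnitude (plus a sign bit),
    spends the rest on fraction bits, and rounds the weights onto the
    resulting fixed-point grid. Returns the quantized weights and the
    matching ap_fixed type string.
    """
    max_abs = max(abs(w) for w in weights)
    int_bits = max(1, math.ceil(math.log2(max_abs)) + 1)  # +1 for the sign bit
    frac_bits = bits - int_bits
    scale = 2.0 ** (-frac_bits)                            # power-of-two step size
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = [max(qmin, min(qmax, round(w / scale))) for w in weights]
    return [v * scale for v in q], f"ap_fixed<{bits},{int_bits}>"

# Stand-in for one trained layer's weight tensor.
weights = [0.8, -1.3, 0.05, 2.4, -0.6]
qw, hls_type = po2_ptq(weights)
print(hls_type)  # ap_fixed<8,3> for these weights
print(qw)
```

The derived type string can then be set as that layer's weight `Precision` in the hls4ml configuration dict before `convert_from_keras_model`, giving a PTQ baseline to compare against the QKeras QAT model's accuracy and resource usage.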