Commit c8df1ab

Switch the order of the to_dtype function and source transform
Quantization runs during the source transform, and some of the quantization infrastructure does not support bf16 yet. Move to_dtype one stage earlier so that the dtype can be set to fp32 before the quantization transform runs.

Differential Revision: [D57883363](https://our.internmc.facebook.com/intern/diff/D57883363/)
ghstack-source-id: 228003406
Pull Request resolved: #3757
1 parent a425741 commit c8df1ab

1 file changed: +1 -1 lines changed
examples/models/llama2/export_llama_lib.py

@@ -366,8 +366,8 @@ def _prepare_for_llama_export(modelname: str, args) -> LlamaEdgeManager:
         )
         .set_output_dir(output_dir_path)
         .set_metadata(args.metadata)
-        .source_transform(transforms)
         .to_dtype(dtype_override)
+        .source_transform(transforms)
     )
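
Why the call order matters can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and for illustration only: `ToyModel`, `ToyManager`, and `toy_quantize` are stand-ins, not the real `LlamaEdgeManager` or quantization code, and the sketch assumes each chained stage runs eagerly at the point it is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ToyModel:
    """Hypothetical stand-in for the model being exported."""

    dtype: str = "bf16"  # assume the checkpoint was loaded in bf16
    applied: List[str] = field(default_factory=list)


def toy_quantize(model: ToyModel) -> ToyModel:
    """Hypothetical quantization transform that cannot handle bf16."""
    if model.dtype == "bf16":
        raise TypeError("toy_quantize does not support bf16")
    model.applied.append("quantize")
    return model


class ToyManager:
    """Hypothetical builder; each stage runs eagerly, so call order matters."""

    def __init__(self, model: ToyModel) -> None:
        self.model = model

    def to_dtype(self, dtype_override: Optional[str]) -> "ToyManager":
        # Only cast when an override is given.
        if dtype_override is not None:
            self.model.dtype = dtype_override
        return self

    def source_transform(
        self, transforms: List[Callable[[ToyModel], ToyModel]]
    ) -> "ToyManager":
        # Apply each transform in sequence to the current model.
        for transform in transforms:
            self.model = transform(self.model)
        return self


# New order: cast first, so the quantization transform sees fp32.
new_order = ToyManager(ToyModel()).to_dtype("fp32").source_transform([toy_quantize])

# Old order: the transform runs while the model is still bf16 and fails.
try:
    ToyManager(ToyModel()).source_transform([toy_quantize]).to_dtype("fp32")
    old_order_failed = False
except TypeError:
    old_order_failed = True
```

In this toy setup the reordered chain completes with the model in fp32, while the original order raises before `to_dtype` is ever reached.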
