Update QAT recipe to match full finetune recipe (5/12/25) #2721
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/2721. Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit f0cefe6 with merge base a6db644. This comment was automatically generated by Dr. CI and updates every 15 minutes.
**Summary:** Similar to meta-pytorch#1854. Update `qat_distributed` recipe to mirror `full_finetune_distributed` up until a6db644. The new major feature that is excluded from `qat_distributed` is FP8 finetuning (meta-pytorch#2546), since QAT FP8 is not supported in torchao yet.

Diff between full finetune and QAT recipes: P1809370361
```
diff --color recipes/full_finetune_distributed.py recipes/qat_distributed.py
```

**Test Plan:**

Finetune:
```
tune run --nnodes 1 --nproc_per_node 4 qat_distributed --config llama3_2/3B_qat_full \
  epochs=1 \
  batch_size=16 \
  dataset._component_=torchtune.datasets.alpaca_cleaned_dataset \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat \
  output_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat/metrics \
  metric_logger.log_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat/metrics \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQATQuantizer \
  quantizer.groupsize=32
```

Quantize:
```
tune run quantize --config quantization \
  model._component_=torchtune.models.llama3_2.llama3_2_3b \
  checkpointer._component_=torchtune.training.FullModelHFCheckpointer \
  checkpointer.checkpoint_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat/epoch_0 \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat/epoch_0_out \
  'checkpointer.checkpoint_files=[model-00001-of-00002.safetensors,model-00002-of-00002.safetensors]' \
  checkpointer.model_type=LLAMA3 \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
  quantizer.groupsize=32
```

Eval:
```
tune run eleuther_eval --config eleuther_evaluation \
  batch_size=1 \
  'tasks=[wikitext]' \
  model._component_=torchtune.models.llama3_2.llama3_2_3b \
  checkpointer._component_=torchtune.training.FullModelTorchTuneCheckpointer \
  checkpointer.checkpoint_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat/epoch_0 \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Llama3.2-3B_alpaca_qat/epoch_0_out \
  'checkpointer.checkpoint_files=[model-00001-of-00002-8da4w.ckpt]' \
  checkpointer.model_type=LLAMA3 \
  tokenizer._component_=torchtune.models.llama3.llama3_tokenizer \
  tokenizer.path=/tmp/Meta-Llama-3-8B-Instruct/original/tokenizer.model \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
  quantizer.groupsize=32
```

Results:
```
experiment_name          tok/s                peak_mem_active   peak_mem_alloc    peak_mem_reserved
-----------------------  -------------------  ----------------  ----------------  -------------------
Llama3.2-3B_alpaca_full  4677.163 (+0.000%)   12.261 (+0.000%)  12.261 (+0.000%)  15.778 (+0.000%)
Llama3.2-3B_alpaca_qat   1873.316 (-59.948%)  13.047 (+6.409%)  13.047 (+6.409%)  17.226 (+9.176%)

experiment_name          hellaswag_acc                   wikitext_word_perplexity
-----------------------  ------------------------------  -------------------------------
Llama3.2-3B_alpaca_full  0.470 quant, 0.534 float        18.563 quant, 12.364 float
Llama3.2-3B_alpaca_qat   0.511 quant, recovered 63.043%  13.792 quant, recovered 76.962%
```
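For context on what the `quantizer._component_` flag does in the finetune command above: QAT recipes fake-quantize the linear layers during training and only materialize real low-bit weights afterwards, which is why a separate `tune run quantize` step follows. Below is a minimal sketch of that prepare/convert flow using torchao's QAT quantizer (which the torchtune component is expected to resolve to); the toy model, the training loop, and the exact import path are illustrative assumptions, not the actual recipe code.

```python
import torch
import torch.nn as nn
# Import path may differ across torchao versions (older releases used
# torchao.quantization.prototype.qat); torchtune's quantizer component
# is assumed to resolve to this class.
from torchao.quantization.qat import Int8DynActInt4WeightQATQuantizer

# Toy stand-in for the finetuned model; in the recipe this is the Llama/Qwen model.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64))

# 1) Swap nn.Linear modules for fake-quantized versions before finetuning.
quantizer = Int8DynActInt4WeightQATQuantizer(groupsize=32)
model = quantizer.prepare(model)

# 2) Train as usual; forward passes simulate int8-activation / int4-weight rounding.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
for _ in range(3):
    x = torch.randn(8, 64)
    loss = model(x).pow(2).mean()  # dummy loss for the sketch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# 3) Convert fake-quant modules into real quantized weights; in the test plan
#    this corresponds to the separate `tune run quantize` step.
model = quantizer.convert(model)
```

Because the prepare step happens at model setup time, the rest of the training loop can stay close to `full_finetune_distributed`, which is why keeping the two recipe files in sync is mostly a mechanical diff.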
Force-pushed from 0bb8049 to f0cefe6.
joecummings left a comment:
You're the best, thank you!
Description of a follow-up PR applying the same sync to the LoRA QAT recipe:

**Summary:** Similar to meta-pytorch#2721. Update `qat_lora_finetune_distributed` recipe to mirror `lora_finetune_distributed` up until 371bb0b.

Diff between lora finetune and QAT lora recipes:
```
diff --color recipes/lora_finetune_distributed.py recipes/qat_lora_finetune_distributed.py
```

**Test Plan:**

Fine-tune:
```
tune run --nnodes 1 --nproc_per_node 4 qat_lora_finetune_distributed --config qwen3/1.7B_qat_lora \
  epochs=1 \
  batch_size=16 \
  dataset._component_=torchtune.datasets.alpaca_cleaned_dataset \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat \
  output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/metrics \
  metric_logger.log_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/metrics \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQATQuantizer \
  quantizer.groupsize=256
```

Quantize:
```
tune run quantize --config quantization \
  model._component_=torchtune.models.qwen3.lora_qwen3_1_7b_instruct \
  checkpointer._component_=torchtune.training.FullModelHFCheckpointer \
  checkpointer.checkpoint_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0 \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0_out \
  'checkpointer.checkpoint_files=[model-00001-of-00002.safetensors,model-00002-of-00002.safetensors]' \
  checkpointer.model_type=QWEN3 \
  tokenizer._component_=torchtune.models.qwen3.qwen3_tokenizer \
  tokenizer.path=/tmp/Qwen3-1.7B/vocab.json \
  tokenizer.merges_file=/tmp/Qwen3-1.7B/merges.txt \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
  quantizer.groupsize=256
```

Eval:
```
tune run eleuther_eval --config eleuther_evaluation \
  batch_size=1 \
  'tasks=[hellaswag,wikitext]' \
  model._component_=torchtune.models.qwen3.lora_qwen3_1_7b_instruct \
  checkpointer._component_=torchtune.training.FullModelTorchTuneCheckpointer \
  checkpointer.checkpoint_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0 \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0_out \
  'checkpointer.checkpoint_files=[model-00001-of-00002-8da4w.ckpt]' \
  checkpointer.model_type=QWEN3 \
  tokenizer._component_=torchtune.models.qwen3.qwen3_tokenizer \
  tokenizer.path=/tmp/Qwen3-1.7B/vocab.json \
  tokenizer.merges_file=/tmp/Qwen3-1.7B/merges.txt \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
  quantizer.groupsize=256
```

Results:
```
experiment_name      tok/s                peak_mem_active   peak_mem_alloc    peak_mem_reserved
-------------------  -------------------  ----------------  ----------------  -------------------
Qwen3-1.7B_full      5687.638 (+0.000%)   7.009 (+0.000%)   7.009 (+0.000%)   11.075 (+0.000%)
Qwen3-1.7B_qat_lora  2812.026 (-50.559%)  5.945 (-15.177%)  5.945 (-15.177%)  10.146 (-8.390%)

experiment_name      hellaswag_acc                   wikitext_word_perplexity
-------------------  ------------------------------  -------------------------------
Qwen3-1.7B_full      0.370 quant, 0.449 float        140.294 quant, 29.461 float
Qwen3-1.7B_qat_lora  0.421 quant, recovered 64.602%  46.755 quant, recovered 84.396%
```
Description of a follow-up PR syncing `qat_distributed` further, up to 371bb0b:

**Summary:** Similar to meta-pytorch#2721. Update `qat_distributed` recipe to mirror `full_finetune_distributed` up until 371bb0b.

Diff between full finetune and QAT recipes:
```
diff --color recipes/full_finetune_distributed.py recipes/qat_distributed.py
```

**Test Plan:**

Fine-tune:
```
tune run --nnodes 1 --nproc_per_node 4 qat_distributed --config qwen3/1.7B_qat_full \
  epochs=1 \
  batch_size=16 \
  dataset._component_=torchtune.datasets.alpaca_cleaned_dataset \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat \
  output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/metrics \
  metric_logger.log_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/metrics \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQATQuantizer \
  quantizer.groupsize=256
```

Quantize:
```
tune run quantize --config quantization \
  model._component_=torchtune.models.qwen3.qwen3_1_7b_instruct \
  checkpointer._component_=torchtune.training.FullModelHFCheckpointer \
  checkpointer.checkpoint_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0 \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0_out \
  'checkpointer.checkpoint_files=[model-00001-of-00002.safetensors,model-00002-of-00002.safetensors]' \
  checkpointer.model_type=QWEN3 \
  tokenizer._component_=torchtune.models.qwen3.qwen3_tokenizer \
  tokenizer.path=/tmp/Qwen3-1.7B/vocab.json \
  tokenizer.merges_file=/tmp/Qwen3-1.7B/merges.txt \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
  quantizer.groupsize=256
```

Eval:
```
tune run eleuther_eval --config eleuther_evaluation \
  batch_size=1 \
  'tasks=[hellaswag,wikitext]' \
  model._component_=torchtune.models.qwen3.qwen3_1_7b_instruct \
  checkpointer._component_=torchtune.training.FullModelTorchTuneCheckpointer \
  checkpointer.checkpoint_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0 \
  checkpointer.output_dir=/home/andrewor/local/logs/tune/Qwen3-1.7B_alpaca_qat/epoch_0_out \
  'checkpointer.checkpoint_files=[model-00001-of-00002-8da4w.ckpt]' \
  checkpointer.model_type=QWEN3 \
  tokenizer._component_=torchtune.models.qwen3.qwen3_tokenizer \
  tokenizer.path=/tmp/Qwen3-1.7B/vocab.json \
  tokenizer.merges_file=/tmp/Qwen3-1.7B/merges.txt \
  quantizer._component_=torchtune.training.quantization.Int8DynActInt4WeightQuantizer \
  quantizer.groupsize=256
```

Results:
```
experiment_name      tok/s                peak_mem_active   peak_mem_alloc    peak_mem_reserved
-------------------  -------------------  ----------------  ----------------  -------------------
Qwen3-1.7B_full      5687.638 (+0.000%)   7.009 (+0.000%)   7.009 (+0.000%)   11.075 (+0.000%)
Qwen3-1.7B_qat       2569.197 (-54.828%)  7.394 (+5.496%)   7.394 (+5.496%)   12.559 (+13.398%)

experiment_name      hellaswag_acc                   wikitext_word_perplexity
-------------------  ------------------------------  -------------------------------
Qwen3-1.7B_full      0.370 quant, 0.449 float        140.294 quant, 29.461 float
Qwen3-1.7B_qat       0.406 quant, recovered 44.753%  48.768 quant, recovered 82.580%
```
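A note on reading the `recovered X%` columns in these Results tables: they appear to express how much of the post-training-quantization gap QAT closes relative to the float baseline, computed from unrounded metric values (which is why re-deriving them from the rounded numbers above can be off by a fraction of a percent for the accuracy rows). A quick check against the Qwen3 wikitext perplexity row, where the rounded values reproduce the table exactly:

```python
# "Recovered" fraction of the quantization gap, using the Qwen3-1.7B wikitext
# word-perplexity row above (lower perplexity is better).
ptq_baseline = 140.294   # Qwen3-1.7B_full, quantized without QAT
qat = 48.768             # Qwen3-1.7B_qat, quantized after QAT
float_baseline = 29.461  # Qwen3-1.7B_full, unquantized (float)

recovered = (ptq_baseline - qat) / (ptq_baseline - float_baseline)
print(f"{recovered:.3%}")  # 82.580%, matching the table
```

For accuracy metrics, where higher is better, the same idea applies with the differences flipped: (QAT quantized − PTQ baseline) / (float − PTQ baseline).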