
Support HF LLaMA ckpt conversion #118


Merged
4 commits merged into AI-Hypercomputer:main from lsiyuan/hf-ckpt on Jun 7, 2024

Conversation

@lsy323 (Collaborator) commented on Jun 7, 2024

Added a --from_hf option to convert_checkpoints.py for converting HF checkpoints. Only LLaMA is supported for now. Quantization is not supported when converting from an HF checkpoint.

Enable converting an HF LLaMA checkpoint with:

python -m convert_checkpoints --model_name=llama-2 \
    --input_checkpoint_dir=$input_ckpt_dir \
    --output_checkpoint_dir=$output_ckpt_dir \
    --from_hf=True

A guide on adding support for HF checkpoints will be added in a follow-up PR.

Only tested with the HF 7B model; 70B is not tested yet.

@FanhaiLu1 (Collaborator) left a comment

Thanks for supporting HF LLaMA ckpt conversion! Can you save the HF weight names as a file in the repo?

"self_attn.k_proj": "attention.wk",
"self_attn.v_proj": "attention.wv",
"self_attn.o_proj": "attention.wo",
"mlp.gate_proj": "feed_forward.w1",
Collaborator

I feel [gate|down|up]_proj is more readable than w1, w2, and w3. @qihqi Shall we consider renaming to proj-style names in the default checkpoint conversion?

Collaborator Author

Yeah, this makes sense. Also want to note that the original LLaMA weights use w1/2/3: https://github.com/meta-llama/llama3/blob/main/llama/model.py#L219. If we change it, we need to do the name mapping for the original LLaMA weights as well.
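
For illustration, renaming in the default (Meta) conversion path would amount to a reverse mapping along these lines; this is a hypothetical sketch, not part of this PR.

# Hypothetical rename map for the default (Meta) checkpoint conversion if the
# output standardized on proj-style names; not part of this PR.
META_TO_PROJ_STYLE = {
    "feed_forward.w1": "mlp.gate_proj",
    "feed_forward.w2": "mlp.down_proj",
    "feed_forward.w3": "mlp.up_proj",
}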

assert (
    not FLAGS.quantize_weights
), "Quantization not supported for HF checkpoint."
return _load_hf_llama_weight(input_ckpt_dir)
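
For context, a minimal sketch of what a loader like _load_hf_llama_weight could do, assuming the HF checkpoint is stored as safetensors shards; the actual loader in this PR may differ (e.g. read .bin shards or apply the name remapping here as well).

import glob
import os

from safetensors import safe_open


def _load_hf_llama_weight(input_ckpt_dir):
    """Sketch: load every tensor from HF safetensors shards into one dict.

    Assumes *.safetensors shards as written by save_pretrained(); the real
    implementation in this PR may differ.
    """
    state_dict = {}
    for shard in sorted(glob.glob(os.path.join(input_ckpt_dir, "*.safetensors"))):
        with safe_open(shard, framework="pt") as f:
            for key in f.keys():
                state_dict[key] = f.get_tensor(key)
    return state_dict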
Collaborator

Did you test the llama2-70B model?

Collaborator Author

No, I didn't, since I haven't set up multi-host yet. But I will do that later.

@qihqi merged commit 94b576c into AI-Hypercomputer:main on Jun 7, 2024
4 checks passed
@lsy323 deleted the lsiyuan/hf-ckpt branch on June 7, 2024 04:24
@lsy323 restored the lsiyuan/hf-ckpt branch on June 7, 2024 04:24