
PI0 policy suffers from weight/key mismatch on main when using transformers ≥ 4.52.0 #1406


Description

@yongjincho

Summary

The current main branch only runs with transformers ≥ 4.52.0 (older versions crash with an AttributeError), but with 4.52.0 the pretrained PI0 checkpoint keys no longer match the model's parameter names. As a result, training starts with an abnormally high loss (≈ 4.5 vs ≈ 0.5 on a healthy run).

Failing configuration

# 1. Check out the current main commit
git checkout 69901b9b6a2300914ca3de0ea14b6fa6e0203bd4

# 2. Install an older Transformers version
pip install transformers==4.51.3

# 3. Run a short training job
python lerobot/scripts/train.py --policy.path=lerobot/pi0 --policy.push_to_hub=false --dataset.repo_id=lerobot/aloha_sim_insertion_human --steps=2 --log_freq=1

Observed error

INFO 2025-06-30 18:19:27 ts/train.py:202 Start offline training on a fixed dataset
Traceback (most recent call last):
  ...
AttributeError: 'GemmaForCausalLM' object has no attribute 'embed_tokens'
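
For context, the attribute path to the Gemma embedding table is what moved between transformers releases. The sketch below is a hypothetical probe (not part of the repro; the model id is just an example and may be gated) showing how to check where the embedding actually lives, and that get_input_embeddings() is the version-stable accessor:

# Hypothetical probe: inspect where the input embedding table lives on a
# GemmaForCausalLM across transformers versions. "google/gemma-2b" is only an
# example checkpoint and may require access approval on the Hub.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Version-stable accessor: returns the nn.Embedding regardless of whether it
# is exposed as `model.embed_tokens` or `model.model.embed_tokens`.
emb = model.get_input_embeddings()
print(type(emb), tuple(emb.weight.shape))

# Attribute-path access is what breaks across versions:
print(hasattr(model, "embed_tokens"))        # depends on the transformers version
print(hasattr(model.model, "embed_tokens"))  # the inner GemmaModel usually holds it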

Current main with transformers == 4.52.0

# 1. Check out the current main commit
git checkout 69901b9b6a2300914ca3de0ea14b6fa6e0203bd4

# 2. Install transformers 4.52.0
pip install transformers==4.52.0

# 3. Run a short training job
python lerobot/scripts/train.py --policy.path=lerobot/pi0 --policy.push_to_hub=false --dataset.repo_id=lerobot/aloha_sim_insertion_human --steps=2 --log_freq=1

Observed output (low performance)

INFO 2025-06-30 18:24:46 ts/train.py:202 Start offline training on a fixed dataset
INFO 2025-06-30 18:24:51 ts/train.py:232 step:1 smpl:8 ep:0 epch:0.00 loss:4.314 grdn:246.739 lr:5.0e-08 updt_s:1.388 data_s:3.667
INFO 2025-06-30 18:24:52 ts/train.py:232 step:2 smpl:16 ep:0 epch:0.00 loss:4.609 grdn:235.238 lr:7.5e-08 updt_s:0.348 data_s:0.000
INFO 2025-06-30 18:24:52 ts/train.py:241 Checkpoint policy after step 2

Training runs, but the initial loss is above 4.0, far higher than the ≈ 0.5 seen on a healthy run.

Expected behaviour

# checkout just before "Fixing `PI0` Policy (#1297)"
git checkout 697c76f75e12a6e1a2ba09911c93e1e22c9b8f5c

pip install transformers==4.51.3

python lerobot/scripts/train.py --policy.path=lerobot/pi0 --dataset.repo_id=lerobot/aloha_sim_insertion_human --steps=2 --log_freq=1

Output

INFO 2025-06-30 18:30:17 ts/train.py:202 Start offline training on a fixed dataset
INFO 2025-06-30 18:30:25 ts/train.py:232 step:1 smpl:8 ep:0 epch:0.00 loss:0.521 grdn:7.218 lr:5.0e-08 updt_s:1.244 data_s:7.370
INFO 2025-06-30 18:30:26 ts/train.py:232 step:2 smpl:16 ep:0 epch:0.00 loss:0.385 grdn:7.831 lr:7.5e-08 updt_s:0.375 data_s:0.000
INFO 2025-06-30 18:30:26 ts/train.py:241 Checkpoint policy after step 2

The initial loss starts around 0.5, as expected for a correctly loaded checkpoint.

Diagnosis

The high loss appears to stem from a key mismatch between the pretrained PI0 checkpoint and the parameter names expected by transformers ≥ 4.52.0: tensors whose names no longer match are presumably skipped during loading, leaving those parameters at their random initialization, which would explain the ≈ 4.5 initial loss. Earlier commits (and transformers 4.51.x) still line up correctly.
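
One way to confirm this is to diff the checkpoint's tensor names against the keys the freshly constructed policy expects. The sketch below is rough and makes assumptions: that the lerobot layout at this commit exposes PI0Policy under lerobot.common.policies.pi0.modeling_pi0, that PI0Policy supports from_pretrained, and that the hub repo stores its weights in model.safetensors.

# Rough sketch to surface the key mismatch; module path, class name, and the
# model.safetensors filename are assumptions about the repo layout.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

from lerobot.common.policies.pi0.modeling_pi0 import PI0Policy  # assumed path

# Tensor names stored in the pretrained checkpoint.
ckpt_path = hf_hub_download("lerobot/pi0", "model.safetensors")
with safe_open(ckpt_path, framework="pt") as f:
    ckpt_keys = set(f.keys())

# Parameter names the current code (with the installed transformers) expects.
policy = PI0Policy.from_pretrained("lerobot/pi0")
model_keys = set(policy.state_dict().keys())

print("keys only in checkpoint:", sorted(ckpt_keys - model_keys)[:10])
print("keys only in model:     ", sorted(model_keys - ckpt_keys)[:10])

Keys that appear only in the model are the ones that would be left at random initialization after a non-strict load, which is consistent with the ≈ 4.5 loss seen above.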
