
PI0 policy suffers from weight/key mismatch on main when using transformers ≥ 4.52.0 #1406


Description

@yongjincho

Summary

The current main branch only runs with transformers ≥ 4.52.0 (older versions crash with an AttributeError), but with 4.52.0 the pretrained PI0 checkpoint keys no longer match the model's parameter names. As a result, training starts with an abnormally high loss (≈ 4.5 vs ≈ 0.5 on a healthy run).

Failing configuration

# 1. Check out the current main commit
git checkout 69901b9b6a2300914ca3de0ea14b6fa6e0203bd4

# 2. Install an older Transformers version
pip install transformers==4.51.3

# 3. Run a short training job
python lerobot/scripts/train.py --policy.path=lerobot/pi0 --policy.push_to_hub=false --dataset.repo_id=lerobot/aloha_sim_insertion_human --steps=2 --log_freq=1

Observed error

INFO 2025-06-30 18:19:27 ts/train.py:202 Start offline training on a fixed dataset
Traceback (most recent call last):
  ...
AttributeError: 'GemmaForCausalLM' object has no attribute 'embed_tokens'
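
For context, the attribute path to the Gemma embedding table is what moved between transformers releases. The sketch below is a hypothetical probe (not part of the repro; the model id is just an example and may be gated) showing how to check where the embedding actually lives, and that get_input_embeddings() is the version-stable accessor:

# Hypothetical probe: inspect where the input embedding table lives on a
# GemmaForCausalLM across transformers versions. "google/gemma-2b" is only an
# example checkpoint and may require access approval on the Hub.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Version-stable accessor: returns the nn.Embedding regardless of whether it
# is exposed as `model.embed_tokens` or `model.model.embed_tokens`.
emb = model.get_input_embeddings()
print(type(emb), tuple(emb.weight.shape))

# Attribute-path access is what breaks across versions:
print(hasattr(model, "embed_tokens"))        # depends on the transformers version
print(hasattr(model.model, "embed_tokens"))  # the inner GemmaModel usually holds it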

Current main with transformers == 4.52.0

# 1. Check out the current main commit
git checkout 69901b9b6a2300914ca3de0ea14b6fa6e0203bd4

# 2. Install transformers 4.52.0
pip install transformers==4.52.0

# 3. Run a short training job
python lerobot/scripts/train.py --policy.path=lerobot/pi0 --policy.push_to_hub=false --dataset.repo_id=lerobot/aloha_sim_insertion_human --steps=2 --log_freq=1

Observed output (low performance)

INFO 2025-06-30 18:24:46 ts/train.py:202 Start offline training on a fixed dataset
INFO 2025-06-30 18:24:51 ts/train.py:232 step:1 smpl:8 ep:0 epch:0.00 loss:4.314 grdn:246.739 lr:5.0e-08 updt_s:1.388 data_s:3.667
INFO 2025-06-30 18:24:52 ts/train.py:232 step:2 smpl:16 ep:0 epch:0.00 loss:4.609 grdn:235.238 lr:7.5e-08 updt_s:0.348 data_s:0.000
INFO 2025-06-30 18:24:52 ts/train.py:241 Checkpoint policy after step 2

Training runs, but the initial loss is above 4.0, far higher than the ≈ 0.5 seen on a healthy run.

Expected behaviour

# checkout just before "Fixing `PI0` Policy (#1297)"
git checkout 697c76f75e12a6e1a2ba09911c93e1e22c9b8f5c

pip install transformers==4.51.3

python lerobot/scripts/train.py --policy.path=lerobot/pi0 --dataset.repo_id=lerobot/aloha_sim_insertion_human --steps=2 --log_freq=1

Output

INFO 2025-06-30 18:30:17 ts/train.py:202 Start offline training on a fixed dataset
INFO 2025-06-30 18:30:25 ts/train.py:232 step:1 smpl:8 ep:0 epch:0.00 loss:0.521 grdn:7.218 lr:5.0e-08 updt_s:1.244 data_s:7.370
INFO 2025-06-30 18:30:26 ts/train.py:232 step:2 smpl:16 ep:0 epch:0.00 loss:0.385 grdn:7.831 lr:7.5e-08 updt_s:0.375 data_s:0.000
INFO 2025-06-30 18:30:26 ts/train.py:241 Checkpoint policy after step 2

The initial loss starts around 0.5, as expected for a correctly loaded checkpoint.

Diagnosis

The high loss appears to stem from a key mismatch between the pretrained PI0 checkpoint and the parameter names expected by transformers ≥ 4.52.0: tensors whose names no longer match are presumably skipped during loading, leaving those parameters at their random initialization, which would explain the ≈ 4.5 initial loss. Earlier commits (and transformers 4.51.x) still line up correctly.
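
One way to confirm this is to diff the checkpoint's tensor names against the keys the freshly constructed policy expects. The sketch below is rough and makes assumptions: that the lerobot layout at this commit exposes PI0Policy under lerobot.common.policies.pi0.modeling_pi0, that PI0Policy supports from_pretrained, and that the hub repo stores its weights in model.safetensors.

# Rough sketch to surface the key mismatch; module path, class name, and the
# model.safetensors filename are assumptions about the repo layout.
from huggingface_hub import hf_hub_download
from safetensors import safe_open

from lerobot.common.policies.pi0.modeling_pi0 import PI0Policy  # assumed path

# Tensor names stored in the pretrained checkpoint.
ckpt_path = hf_hub_download("lerobot/pi0", "model.safetensors")
with safe_open(ckpt_path, framework="pt") as f:
    ckpt_keys = set(f.keys())

# Parameter names the current code (with the installed transformers) expects.
policy = PI0Policy.from_pretrained("lerobot/pi0")
model_keys = set(policy.state_dict().keys())

print("keys only in checkpoint:", sorted(ckpt_keys - model_keys)[:10])
print("keys only in model:     ", sorted(model_keys - ckpt_keys)[:10])

Keys that appear only in the model are the ones that would be left at random initialization after a non-strict load, which is consistent with the ≈ 4.5 loss seen above.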
