Cannot reproduce  #17

@recordmp3

Dear author,

Could you please provide the complete command for running RAD on DMC (for example, for "CartPole-SwingUp")?

I cannot reproduce the CartPole-SwingUp results from the paper by running the command in script/run.sh.

The command in run.sh does not seem to match the hyperparameters listed in the paper (for example, the batch size is 512 in the paper but 128 in run.sh). I changed them accordingly, but I still cannot reproduce the paper's results.
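
To make this kind of mismatch easier to spot, here is a minimal sketch of a flag check. It assumes every hyperparameter is passed as a `--name value` pair. The helper names `parse_flags` and `mismatches` are hypothetical and not part of the repository, the only paper value filled in is the batch size mentioned above, and the command string is an illustrative fragment rather than the real contents of run.sh.

```python
import shlex


def parse_flags(command: str) -> dict:
    """Collect `--name value` pairs from a command line into a dict."""
    tokens = shlex.split(command)
    flags = {}
    for i, tok in enumerate(tokens):
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
        if tok.startswith("--") and has_value:
            flags[tok[2:]] = tokens[i + 1]
    return flags


def mismatches(command: str, expected: dict) -> dict:
    """Return {flag: (value_in_command, expected_value)} for every difference."""
    flags = parse_flags(command)
    return {
        name: (flags.get(name), want)
        for name, want in expected.items()
        if flags.get(name) != want
    }


# Only value stated in this issue: batch size 512 in the paper, 128 in run.sh.
paper_values = {"batch_size": "512"}

# Hypothetical fragment for illustration, NOT the full command from run.sh.
run_sh_fragment = "python train.py --batch_size 128"

print(mismatches(run_sh_fragment, paper_values))
# → {'batch_size': ('128', '512')}
```

Adding the paper's other hyperparameters to `paper_values` would extend the same check to the full command.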

Here are the commands I ran for these experiments:

  1. SAC-pixel

    It should reach a reward of ≈200 after 100k environment steps (12.5k policy steps, since action_repeat = 8), but I get a higher value (around 250 to 300).

    CUDA_VISIBLE_DEVICES=0 python train.py \
    --domain_name cartpole \
    --task_name swingup \
    --encoder_type pixel --work_dir ./tmp \
    --action_repeat 8 --num_eval_episodes 10 \
    --pre_transform_image_size 100 --image_size 84 \
    --agent rad_sac --frame_stack 3 --data_augs no_aug \
    --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 --batch_size 512 --num_train_steps 12500 --latent_dim 50

  2. RAD(translate)

    It should reach a reward of ≈828 after 100k environment steps (12.5k policy steps), but I get a much lower value (around 50).

    CUDA_VISIBLE_DEVICES=0 python train.py \
    --domain_name cartpole \
    --task_name swingup \
    --encoder_type pixel --work_dir ./tmp \
    --action_repeat 8 --num_eval_episodes 10 \
    --pre_transform_image_size 100 --image_size 84 \
    --agent rad_sac --frame_stack 3 --data_augs translate \
    --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 --batch_size 512 --num_train_steps 12500 --latent_dim 50

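For reference, the step arithmetic behind `--num_train_steps 12500` can be sketched as follows. This is only a sketch under two assumptions that I have not verified against train.py: that `--num_train_steps` and `--eval_freq` are counted in policy steps, and that an evaluation runs every `eval_freq` policy steps. The function names are hypothetical.

```python
def policy_steps(env_steps: int, action_repeat: int) -> int:
    """Convert environment steps to policy steps for a given action repeat."""
    if env_steps % action_repeat != 0:
        raise ValueError("env_steps must be a multiple of action_repeat")
    return env_steps // action_repeat


def env_steps_at(policy_step: int, action_repeat: int) -> int:
    """Convert a policy-step count back to environment steps."""
    return policy_step * action_repeat


action_repeat = 8   # --action_repeat 8
eval_freq = 2500    # --eval_freq 2500 (assumed to be in policy steps)

# 100k environment steps, as reported in the paper
num_train_steps = policy_steps(100_000, action_repeat)
print(num_train_steps)
# → 12500

# Environment-step counts at which evaluations would fall (assumption:
# one evaluation every `eval_freq` policy steps).
eval_points = [
    env_steps_at(step, action_repeat)
    for step in range(eval_freq, num_train_steps + 1, eval_freq)
]
print(eval_points)
# → [20000, 40000, 60000, 80000, 100000]
```

If the step counter in train.py is instead counted in environment steps, these numbers would be off by a factor of 8.
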
I sincerely look forward to your reply!