Dear author,
Could you please provide a complete command for running RAD on DMC (for example, for "CartPole-SwingUp")?
I cannot reproduce the CartPole-SwingUp results from the paper by running the command in script/run.sh.
The command in run.sh does not seem to match the hyperparameters listed in the paper (for example, the batch size is 512 in the paper but 128 in run.sh). I changed them to the paper's values but still cannot get the results reported in the paper.
Here are the commands I ran for these experiments:
- SAC-pixel
It should reach reward ≈ 200 after 100k environment steps (12.5k policy steps, since action_repeat = 8), but I get a higher reward (around 250 to 300).
CUDA_VISIBLE_DEVICES=0 python train.py \
  --domain_name cartpole \
  --task_name swingup \
  --encoder_type pixel --work_dir ./tmp \
  --action_repeat 8 --num_eval_episodes 10 \
  --pre_transform_image_size 100 --image_size 84 \
  --agent rad_sac --frame_stack 3 --data_augs no_aug \
  --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 --batch_size 512 --num_train_steps 12500 --latent_dim 50
- RAD(translate)
It should reach reward ≈ 828 after 100k environment steps (12.5k policy steps), but I get a much lower reward (around 50).
CUDA_VISIBLE_DEVICES=0 python train.py \
  --domain_name cartpole \
  --task_name swingup \
  --encoder_type pixel --work_dir ./tmp \
  --action_repeat 8 --num_eval_episodes 10 \
  --pre_transform_image_size 100 --image_size 84 \
  --agent rad_sac --frame_stack 3 --data_augs translate \
  --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 --batch_size 512 --num_train_steps 12500 --latent_dim 50
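To make the comparison easier to check, here is a minimal sketch of how the two commands above relate to each other. It is only an illustration: the helper names `policy_steps` and `build_command` are hypothetical and not part of the repository, and it assumes (as I did above) that `--num_train_steps` counts policy steps, so that 100k environment steps with `action_repeat = 8` correspond to 12500. The flag names and values are copied from the commands above.

```python
import shlex

ENV_STEPS = 100_000   # environment-step budget quoted above
ACTION_REPEAT = 8     # value passed to --action_repeat in both runs

# Flags shared by both runs, copied verbatim from the commands above.
# Values are kept as strings so that e.g. "1e-3" is not reformatted.
SHARED_FLAGS = {
    "domain_name": "cartpole",
    "task_name": "swingup",
    "encoder_type": "pixel",
    "work_dir": "./tmp",
    "action_repeat": str(ACTION_REPEAT),
    "num_eval_episodes": "10",
    "pre_transform_image_size": "100",
    "image_size": "84",
    "agent": "rad_sac",
    "frame_stack": "3",
    "seed": "234567",
    "critic_lr": "1e-3",
    "actor_lr": "1e-3",
    "eval_freq": "2500",
    "batch_size": "512",
    "latent_dim": "50",
}


def policy_steps(env_steps: int, action_repeat: int) -> int:
    """Convert environment steps to policy steps (assumed meaning of --num_train_steps)."""
    if env_steps % action_repeat != 0:
        raise ValueError("env_steps must be a multiple of action_repeat")
    return env_steps // action_repeat


def build_command(data_augs: str) -> list[str]:
    """Build the train.py argument list; only --data_augs differs between runs."""
    flags = dict(SHARED_FLAGS)
    flags["data_augs"] = data_augs
    flags["num_train_steps"] = str(policy_steps(ENV_STEPS, ACTION_REPEAT))
    args = ["python", "train.py"]
    for name, value in flags.items():
        args += [f"--{name}", value]
    return args


if __name__ == "__main__":
    for augs in ("no_aug", "translate"):
        print(shlex.join(build_command(augs)))
```

Building both commands from one shared dictionary makes it explicit that the SAC-pixel and RAD(translate) runs differ only in the `--data_augs` value.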
I sincerely look forward to your reply!