Aero-Hand/logs_r38_stdout.txt at main · luck2shi/Aero-Hand · GitHub

nohup: ignoring input
Environment Config:
action_repeat: 1
action_scale:
- 1.4
- 0.8
- 1.4
- 1.4
- 1.4
- 1.4
ctrl_dt: 0.05
early_termination: true
episode_length: 800
force_history_len: 1
history_len: 1
impl: jax
noise_config:
  force_ema_alpha: 0.85
  level: 0.3
  scales:
    hw_force: 0.05
    hw_pos: 0.005
reset_config:
  hand_qpos_noise_scale: 0.02
reward_config:
  finger_active_threshold: 0.08
  force_contact_saturation: 1.0
  force_contact_threshold: 0.06
  force_overload_soft_width: 1.5
  force_overload_threshold: 2.8
  scales:
    action_accel: -0.005
    action_rate: -0.01
    approach: 20.0
    closure: 35.0
    contact: 15.0
    drop_risk: -5.0
    finger_participation: 5.0
    force_balance: 0.0
    force_contact: 25.0
    force_overload: 0.0
    grip_force: 15.0
    height: 3.0
    hold_position: 30.0
    human_pose: 3.0
    idle_follow: 0.0
    pip_closure: 15.0
    progressive_hold: 0.0
    soft_contact: 0.0
    stable_hold: 10.0
    survival: 1.0
    termination: -80.0
    thumb_engage: 15.0
    torques: -3.0e-05
  soft_contact_fmax: 2.5
  soft_contact_fmin: 0.1
sim_dt: 0.01
spawn_config:
  cube_jitter:
  - 0.003
  - 0.005
  - 0.003
  cube_pos:
  - 0.005
  - -0.028
  - 0.115
  support_enabled: true
support_config:
  min_release_active_fingers: 3
  min_release_force: 0.2
  release_after_sec: 3.0
  release_ramp_sec: 2.0
  support_hidden_pos:
  - 0.0
  - 0.0
  - -10.0
  support_pos:
  - 0.005
  - -0.028
  - 0.1
tactile_config:
  force_saturation_n: 3.0
  obs_force_ema_alpha: 0.7
  taxel_weights:
  - 0.7
  - 1.0
  - 1.0
  - 0.7
  - 1.0
  - 1.4
  - 1.4
  - 1.0
  - 1.0
  - 1.4
  - 1.4
  - 1.0
  - 0.7
  - 1.0
  - 1.0
  - 0.7
  use_pooled_obs: true
  use_real_tactile: true

PPO Training Parameters:
action_repeat: 1
batch_size: 256
discounting: 0.97
entropy_cost: 0.01
episode_length: 800
learning_rate: 0.0003
network_factory:
  policy_hidden_layer_sizes: &id001 !!python/tuple
  - 512
  - 256
  - 128
  policy_obs_key: state
  value_hidden_layer_sizes: *id001
  value_obs_key: privileged_state
normalize_observations: true
num_envs: 8192
num_evals: 10
num_minibatches: 32
num_resets_per_eval: 1
num_timesteps: 20000000
num_updates_per_batch: 4
reward_scaling: 1.0
unroll_length: 40

Experiment name: AeroCubeGraspV2Force-20260415-015650
Logs are being stored in: /home/ll/SRTP/Aero-Hand/sim_rl/mujoco_playground/logs/AeroCubeGraspV2Force-20260415-015650
No checkpoint path provided, not restoring from checkpoint
Checkpoint path: /home/ll/SRTP/Aero-Hand/sim_rl/mujoco_playground/logs/AeroCubeGraspV2Force-20260415-015650/checkpoints
0: reward=38.538
2293760: reward=39.656
4587520: reward=38.831
6881280: reward=38.497
Terminated