Commit 083a8ad
Refine rollout worker health check and recovery lifecycle (#1877)
* Refactor rollout server launch specs
* Refactor rollout health manager
* Restore rollout worker session URLs
* fix
* fix
* fix
* fix comments
* Delay rollout health checks after resume and log LMDeploy health errors
* fix ci
* delete rank2info and worker_server_urls_map
* Require fail-fast health checks before rollout offload
* Align rollout generate concurrency with request entrypoint topology
* Refactor colocate rollout recovery with fail-fast shutdown and pre-sync restart
* simplify code
* fix ci
* fix ci and comments
* Refactor rollout worker state into registry
1 parent b2c8c6c commit 083a8ad
12 files changed
Lines changed: 1780 additions & 746 deletions
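Two items in the commit message, delaying health checks after resume and requiring fail-fast health checks before offload, can be illustrated with a small sketch. This is not the code from this commit: `WorkerHealthGate`, `UnhealthyWorkerError`, `grace_seconds`, and the `probe` callable are all hypothetical names chosen for illustration, and the choice to ignore the grace period in the pre-offload check is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


class UnhealthyWorkerError(RuntimeError):
    """Raised when a worker fails the pre-offload health check (hypothetical)."""


@dataclass
class WorkerHealthGate:
    # probe(worker_id) -> True if the worker answers its health endpoint.
    probe: Callable[[str], bool]
    # Assumed grace window during which periodic checks are skipped after resume.
    grace_seconds: float = 30.0
    # Injectable clock so the behavior can be tested without sleeping.
    clock: Callable[[], float] = time.monotonic
    _resumed_at: Dict[str, float] = field(default_factory=dict, init=False)

    def mark_resumed(self, worker_id: str) -> None:
        """Record the resume time so periodic checks can be delayed."""
        self._resumed_at[worker_id] = self.clock()

    def in_grace_period(self, worker_id: str) -> bool:
        resumed = self._resumed_at.get(worker_id)
        return resumed is not None and self.clock() - resumed < self.grace_seconds

    def check(self, worker_id: str) -> Optional[bool]:
        """Periodic check: returns None (skipped) while the worker is warming up."""
        if self.in_grace_period(worker_id):
            return None
        return self.probe(worker_id)

    def require_healthy_before_offload(self, worker_ids: Iterable[str]) -> None:
        """Fail fast: raise on the first unhealthy worker, ignoring the grace period."""
        for worker_id in worker_ids:
            if not self.probe(worker_id):
                raise UnhealthyWorkerError(f"worker {worker_id} is unhealthy")
```

In this sketch a skipped check returns `None` rather than `True`, so callers can tell "not checked yet" apart from "healthy", while the pre-offload path raises immediately instead of proceeding with a worker in an unknown state.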