Commit 8a0efa8
[Feature] V4 KVCompressor: compressed-kv RoPE + per-ratio cu_seq_lens_out reuse
Two changes that co-evolve the KVCompressor.forward signature (compressed-rope
table + precomputed boundaries are added side by side), so they land together.
1. Compressed-kv RoPE. After the chunk softmax + norm, rotate each compressed
chunk's rope tail at its window-center position, mirroring HF
DeepseekV4{CSA,HCA}Compressor.forward. ``qk_rope_head_dim`` is wired from the
DSA/Indexer configs into the internal KVCompressor, and DSA now forwards
``position_embeddings_compressed`` to the compressor (required for
compress_ratio > 0, not just == 4). The chunk->sample map uses
``searchsorted(cu_seq_lens_out, ., right=True) - 1`` — right=True is
load-bearing: a chunk on a sample boundary is the first chunk of the next
sample, and mapping it to the previous one overruns ``first_token_per_chunk``
and indexes the rope table out of bounds.
2. Hoist cu_seq_lens_out. ``KVCompressor.build_cu_seq_lens_out`` computes the
per-sample compressed boundaries once; DeepSeekV4 forward builds one per
distinct compress_ratio and caches it on
``SequenceContext.compressed_cu_seq_lens``, so every decoder layer of that
ratio reuses a single cumsum + H2D instead of recomputing it. ``total_c``
stays derived in the compressor from the CPU mirror (it must remain a Python
int and would force a recompile if threaded through the compiled attn graph).
Standalone callers (no cache on seq_ctx) fall back to building it in place.
1 parent 97e922b commit 8a0efa8
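The boundary handling and the per-ratio reuse described in the commit message can be illustrated with a small pure-Python sketch. It is not the xtuner implementation: ``bisect_right`` / ``bisect_left`` stand in for ``searchsorted`` with ``right=True`` / ``right=False``, the rule that a sample of ``n`` tokens yields ``n // compress_ratio`` chunks is an assumption, and ``chunk_to_sample``, ``layer_ratios`` and the plain ``cache`` dict are hypothetical names used only for illustration.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def build_cu_seq_lens_out(seq_lens, compress_ratio):
    """Cumulative compressed-chunk boundaries, starting at 0 (illustrative helper)."""
    # Assumption: a sample of n tokens yields n // compress_ratio chunks.
    chunks_per_sample = [n // compress_ratio for n in seq_lens]
    return [0] + list(accumulate(chunks_per_sample))


def chunk_to_sample(cu_seq_lens_out, chunk_idx, right=True):
    """Sample index owning chunk_idx, via a searchsorted-style lookup minus one."""
    search = bisect_right if right else bisect_left
    return search(cu_seq_lens_out, chunk_idx) - 1


seq_lens = [8, 12, 4]
cu = build_cu_seq_lens_out(seq_lens, compress_ratio=4)
total_c = cu[-1]  # stays a plain Python int

# right=True: a chunk sitting exactly on a boundary belongs to the next sample.
mapping = [chunk_to_sample(cu, c) for c in range(total_c)]

# right=False: boundary chunks fall back to the previous sample (or to -1),
# so their local index runs past that sample's own chunk count.
wrong = [chunk_to_sample(cu, c, right=False) for c in range(total_c)]

# Per-ratio reuse: build the boundaries once per distinct ratio, then let every
# layer with that ratio read the cached list (a dict stands in for the cache).
layer_ratios = [4, 4, 2, 4, 2]  # hypothetical per-layer compress ratios
cache = {}
for ratio in layer_ratios:
    if ratio > 0 and ratio not in cache:
        cache[ratio] = build_cu_seq_lens_out(seq_lens, ratio)

print(cu, mapping, wrong, sorted(cache))
```

With ``right=False``, chunk 2 (the first chunk of the second sample) is assigned to sample 0, which only owns two chunks, so its local index is already out of range. This is the overrun that the ``right=True`` lookup avoids.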
5 files changed
Lines changed: 181 additions & 27 deletions
File tree
- xtuner/v1
- data_proto
- model/moe
- module/attention