Improve prefix handling, attention mask compatibility, and KV cache control#11

Open

YangYangGirl wants to merge 2 commits into QwenLM:main from YangYangGirl:main

Conversation

@YangYangGirl

  • Prefix mask fix
    Changed prefix mask values from 0 to 1 so virtual prefix tokens can properly participate in attention.
  • Prefix K/V concatenation during training
    Added logic to concatenate prefix key/value states during training.
  • 2D attention mask support
    Enabled attention_mask in [B, T] format in addition to 4D [B, H, T_q, T_k].
  • Configurable KV cache usage
    Added use_cache parameter; KV cache is disabled during training.
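The first two items can be sketched together. The snippet below is a minimal illustration, not the repository's actual code: the function name `apply_prefix` and its argument names are hypothetical, and it assumes PyTorch tensors in the usual `[B, H, T, D]` layout for key/value states and `[B, T]` for the padding mask.

```python
import torch

def apply_prefix(key_states, value_states, prefix_key, prefix_value, attention_mask):
    # Hypothetical helper illustrating the PR's two prefix fixes.
    # 1) Concatenate virtual prefix K/V ahead of the sequence K/V along the
    #    time axis -- also during training, which the original code skipped.
    key_states = torch.cat([prefix_key, key_states], dim=2)      # [B, H, P+T, D]
    value_states = torch.cat([prefix_value, value_states], dim=2)

    # 2) Prefix positions must be *visible* in attention, so their mask
    #    value is 1 (attendable), not 0 (masked out).
    prefix_len = prefix_key.shape[2]
    prefix_mask = torch.ones(
        attention_mask.shape[0], prefix_len,
        dtype=attention_mask.dtype, device=attention_mask.device,
    )
    attention_mask = torch.cat([prefix_mask, attention_mask], dim=1)  # [B, P+T]
    return key_states, value_states, attention_mask
```

With the old mask value of 0, the softmax would assign the prefix positions (near-)zero weight, making the learned virtual tokens useless.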
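Accepting a 2D mask typically means expanding it to the 4D additive form before the attention scores are computed. A minimal sketch, assuming a `[B, T_k]` padding mask (1 = keep, 0 = masked) converted to a large-negative additive bias; `expand_attention_mask` is an illustrative name, not the repo's API:

```python
import torch

def expand_attention_mask(attention_mask, num_heads, query_len, dtype=torch.float32):
    # Hypothetical sketch: accept either a [B, T_k] padding mask or a
    # prebuilt 4D [B, H, T_q, T_k] additive mask.
    if attention_mask.dim() == 2:
        # [B, T_k] -> [B, 1, 1, T_k], broadcastable over heads and query
        # positions, with 0 (masked) mapped to a large negative bias so the
        # softmax effectively ignores those keys.
        mask = attention_mask[:, None, None, :].to(dtype)
        mask = (1.0 - mask) * torch.finfo(dtype).min
        return mask.expand(-1, num_heads, query_len, -1)
    return attention_mask  # already [B, H, T_q, T_k]
```

This keeps callers that already build a full 4D mask working unchanged, while letting the common case pass the plain padding mask produced by a tokenizer.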
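The `use_cache` gating could look like the following; the helper name `maybe_cache` and the `(key, value)` tuple layout for `past_key_value` are assumptions for illustration:

```python
import torch

def maybe_cache(key_states, value_states, past_key_value, use_cache, training):
    # KV caching only helps incremental autoregressive decoding; during
    # training the full sequence is processed at once, so force it off.
    use_cache = use_cache and not training
    if past_key_value is not None:
        # Prepend cached keys/values from previous decoding steps.
        key_states = torch.cat([past_key_value[0], key_states], dim=2)
        value_states = torch.cat([past_key_value[1], value_states], dim=2)
    present = (key_states, value_states) if use_cache else None
    return key_states, value_states, present
```

Disabling the cache in training also avoids holding stale K/V tensors alive across optimizer steps, which would waste memory for no benefit.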

