Add page attention manager and kvcache manager #167

FanhaiLu1 · 2024-08-06T21:05:25Z

This PR adds two classes that are fundamental to page attention in JetStream:

PageAttentionManager:

This class manages and frees page resources, calculates page metadata, and supports cache insertion.
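As a rough illustration of the reserve/free bookkeeping described above, here is a minimal single-threaded sketch. `ToyPageManager`, `reserve_pages`, `free_pages`, and `page_table` are hypothetical names chosen for this example and are not taken from the PR; only `page_size`, `max_pages_per_sequence`, and the `queue.Queue` of unused pages appear in the diff excerpt.

```python
import queue


class ToyPageManager:
  """Hypothetical stand-in, not the PageAttentionManager from this PR."""

  def __init__(self, total_num_pages, page_size, max_pages_per_sequence):
    self.page_size = page_size
    self.max_pages_per_sequence = max_pages_per_sequence
    # Pool of free page indices, mirroring the queue.Queue in the diff.
    self.unused_pages = queue.Queue()
    for page in range(total_num_pages):
      self.unused_pages.put(page)
    # slot -> list of page indices currently owned by that slot
    self.page_table = {}

  def reserve_pages(self, slot, seq_len):
    """Give `slot` enough pages to hold `seq_len` tokens."""
    self.free_pages(slot)  # release anything the slot held before
    num_pages = -(-seq_len // self.page_size)  # ceiling division
    if num_pages > self.max_pages_per_sequence:
      raise ValueError("sequence needs more pages than allowed per sequence")
    # This size check is only reliable without concurrent callers.
    if num_pages > self.unused_pages.qsize():
      raise RuntimeError("not enough free pages")
    pages = [self.unused_pages.get_nowait() for _ in range(num_pages)]
    self.page_table[slot] = pages
    return pages

  def free_pages(self, slot):
    """Return all pages owned by `slot` to the free pool."""
    for page in self.page_table.pop(slot, []):
      self.unused_pages.put(page)


manager = ToyPageManager(total_num_pages=8, page_size=4, max_pages_per_sequence=4)
pages = manager.reserve_pages(slot=0, seq_len=10)  # 10 tokens need 3 pages of size 4
manager.free_pages(slot=0)
```

The ceiling division is the only page metadata computed here; the class in the PR also supports cache insertion, which this sketch does not attempt to model.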

PageKVCacheGenerate:

This class updates decode caches in a page-attention format. Unlike the standard LLM KV cache shape ([batch_size, num_heads, seq_len, head_dim]), PageKVCacheGenerate uses the shape [num_heads, total_num_pages, page_size, head_dim].
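To make the difference between the two layouts concrete, the sketch below maps a token position in a sequence to a location in the paged layout. It assumes a per-slot page table (a list of physical page indices in logical order); the helper names `paged_cache_shape` and `locate_token` and the page table contents are hypothetical and are not part of the PR.

```python
def paged_cache_shape(num_heads, total_num_pages, page_size, head_dim):
  """Shape of the paged cache as described in the PR text."""
  return (num_heads, total_num_pages, page_size, head_dim)


def locate_token(page_table, slot, position, page_size):
  """Return (physical page index, offset inside the page) for a token."""
  logical_page, offset = divmod(position, page_size)
  return page_table[slot][logical_page], offset


# Assumed example: slot 0 owns physical pages 5, 2 and 7, in that order.
page_table = {0: [5, 2, 7]}
shape = paged_cache_shape(num_heads=8, total_num_pages=16, page_size=4, head_dim=128)
page, offset = locate_token(page_table, slot=0, position=9, page_size=4)
# Token 9 lies in the third logical page (index 2) at offset 1,
# so for any head h it would live at cache[h, 7, 1, :].
```

In this layout there is no batch dimension: sequences share one pool of `total_num_pages` pages, and the page table records which pages belong to which sequence.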

jetstream_pt/cache_manager.py

qihqi · 2024-08-06T21:25:42Z

jetstream_pt/page_attention_manager.py

+      page_size: int,
+      max_pages_per_sequence: int,
+  ):
+    self.unused_pages = queue.Queue()


use deque

In the current JetStream implementation, detokenize_threads and _generate_threads are different threads, and both need to access this queue. The queue therefore has to be thread safe, but deque is not thread safe.
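The access pattern described in this reply can be sketched with two threads sharing one `queue.Queue`, which the Python standard library documents as safe for multiple producers and consumers. The worker functions below are hypothetical and only loosely mirror the generate and detokenize threads named above; they are not JetStream code.

```python
import queue
import threading

NUM_PAGES = 64

# Shared pool of free page indices.
unused_pages = queue.Queue()
for page in range(NUM_PAGES):
  unused_pages.put(page)

# Pages handed from the generate side to the detokenize side.
in_flight = queue.Queue()


def generate_worker():
  # Takes pages from the pool, as a decode step would when reserving them.
  for _ in range(NUM_PAGES):
    in_flight.put(unused_pages.get())
  in_flight.put(None)  # sentinel: no more pages will follow


def detokenize_worker():
  # Returns each page to the pool once the sequence using it is finished.
  while True:
    page = in_flight.get()
    if page is None:
      break
    unused_pages.put(page)


threads = [
    threading.Thread(target=generate_worker),
    threading.Thread(target=detokenize_worker),
]
for t in threads:
  t.start()
for t in threads:
  t.join()

# Every page should be back in the pool exactly once.
returned = sorted(unused_pages.get_nowait() for _ in range(NUM_PAGES))
```

Both threads call `put` and `get` on `unused_pages` concurrently without any extra locking, and the final check confirms that no page was lost or duplicated.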

tests/test_page_attention.py

FanhaiLu1 added 2 commits August 6, 2024 21:04

Add page attention manager and kvcache manager

cf05b09

adapt prefill update layer new api

ed97a81

FanhaiLu1 requested review from qihqi, wang2yn84, lsy323, sixiang-google and bhavya01 August 6, 2024 21:16

qihqi approved these changes Aug 6, 2024

FanhaiLu1 added 2 commits August 6, 2024 22:12

use tensor indices

c70929d

lint format

d710d37

FanhaiLu1 merged commit eb360ee into AI-Hypercomputer:main Aug 6, 2024
4 checks passed

FanhaiLu1 deleted the pa_decode_checkin_2 branch August 9, 2024 18:00

FanhaiLu1 mentioned this pull request Sep 6, 2024

Support End To End PagedAttention in JetStream #180

Merged
