[feat] support prefix cache clearing when /clear_load_weight is called #4008

Merged
Jiang-Jia-Jun merged 35 commits into PaddlePaddle:develop from liyonghua0910:develop+clear_prefix_cache
Sep 28, 2025

@liyonghua0910 (Collaborator) commented Sep 9, 2025

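Requirements

In RL scenarios, training and inference alternate. After each inference phase, /clear_load_weight must be called to clear the weights and free space for training; after training finishes, /update_model_weight must be called to reload the weights for inference. RL now needs prefix caching enabled to accelerate inference, so the KV cache must be cleared and reloaded together with the model weights through these same endpoints. This PR mainly implements that capability.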

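Main changes

  • Added the operator unset_data_ipc, which releases the reference that set_data_ipc holds on the kv cache tensor
  • Added the prefix_tree_status_signal and kv_cache_status_signal signals, which notify the clearing of the cache tree index and of the cache itself, respectively
  • Changed when the kv cache is initialized. There are now two cases: (1) cache_transfer_manager creates the cache and gpu_model_runner attaches to it, (2) gpu_model_runner creates the cache and cache_transfer_manager attaches to it. Only PD-disaggregated deployment in non-profile mode falls under case 1
  • Extended the use of cache_ready_signal. It was originally used by prefix_cache_manager to wait for the cache_transfer_manager process to create the cache tensor. Because the cache tensor is no longer necessarily created by cache_transfer_manager, either cache_transfer_manager or gpu_model_runner may now be the one that sets this signal to 1, or the one that waits for the other side to set it to 1
  • Added the swap_space_ready_signal signal, used by prefix_cache_manager to wait for cache_transfer_manager to finish allocating the CPU cache
  • Added name mappings for the status constants of some IPC signals, so that development code no longer represents states with bare numbers such as 0, 1, -1, which is easy to confuse
  • Added a mutex around the clear/update operations in engine_client to prevent weights from being cleared and loaded at the same time
  • Added the environment variable FD_ENABLE_SWAP_SPACE_CLEARING, which controls whether the CPU cache is also cleared when the prefix cache is cleared. It defaults to 0, meaning the CPU cache is not cleared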
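
The last three items can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the class `WeightController`, the enum `SignalStatus` and its value-to-name mapping, the helper `swap_space_clearing_enabled`, and the way the environment variable is parsed are all assumptions made for illustration. Only the names FD_ENABLE_SWAP_SPACE_CLEARING, clear_load_weight, and update_model_weight come from the description above.

```python
import os
import threading
from enum import IntEnum


class SignalStatus(IntEnum):
    # Hypothetical names and values: the PR only states that bare
    # numbers such as 0, 1, -1 were given named constants.
    CLEARED = -1
    CLEARING = 0
    NORMAL = 1


def swap_space_clearing_enabled(environ) -> bool:
    # Assumed parsing: the variable defaults to "0" (CPU cache is kept).
    return environ.get("FD_ENABLE_SWAP_SPACE_CLEARING", "0") == "1"


class WeightController:
    """Hypothetical stand-in for the clear/update path in engine_client."""

    def __init__(self, environ=None):
        self._environ = os.environ if environ is None else environ
        # One lock shared by both operations, so a clear and an update
        # can never run at the same time.
        self._lock = threading.Lock()
        self.weights_status = SignalStatus.NORMAL
        self.prefix_tree_status = SignalStatus.NORMAL
        self.kv_cache_status = SignalStatus.NORMAL
        self.cpu_cache_cleared = False

    def clear_load_weight(self):
        with self._lock:
            self.weights_status = SignalStatus.CLEARED
            self.prefix_tree_status = SignalStatus.CLEARED
            self.kv_cache_status = SignalStatus.CLEARED
            # The CPU cache is only cleared when explicitly enabled.
            self.cpu_cache_cleared = swap_space_clearing_enabled(self._environ)

    def update_model_weight(self):
        with self._lock:
            self.weights_status = SignalStatus.NORMAL
            self.prefix_tree_status = SignalStatus.NORMAL
            self.kv_cache_status = SignalStatus.NORMAL
            self.cpu_cache_cleared = False
```

Passing the environment in as a plain mapping keeps the sketch testable without mutating the real process environment. In the real system the status values live in IPC signals shared between processes rather than in instance attributes.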

@paddle-bot (bot) commented Sep 9, 2025

Thanks for your contribution!

@ltd0924 force-pushed the develop+clear_prefix_cache branch from e3fb87f to 2a47216 on September 10, 2025 04:05
@liyonghua0910 force-pushed the develop+clear_prefix_cache branch from 2a47216 to 94a55fc on September 12, 2025 13:44
@liyonghua0910 marked this pull request as ready for review on September 12, 2025 13:44
liyonghua0910 and others added 23 commits September 15, 2025 10:54
This reverts commit 0bc6d55.
@Jiang-Jia-Jun merged commit 6265f43 into PaddlePaddle:develop on Sep 28, 2025
29 of 38 checks passed