
[FDConfig] Remove max_num_batched_tokens/max_num_seqs in parallel config #4116

Merged
Jiang-Jia-Jun merged 7 commits into PaddlePaddle:develop from YuanRisheng:remove_duplicate_field on Sep 17, 2025

Conversation

@YuanRisheng (Collaborator) commented Sep 15, 2025

Remove max_num_batched_tokens and max_num_seqs from parallel_config and FDConfig, and move them into SchedulerConfig.

@YuanRisheng changed the title from "[FDConfig] Remove max_num_batched_tokens in parallel config" to "[FDConfig] Remove max_num_batched_tokens/max_num_seqs in parallel config" on Sep 15, 2025
@paddle-bot bot commented Sep 15, 2025

Thanks for your contribution!

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR moves max_num_batched_tokens and max_num_seqs configuration parameters from ParallelConfig and FDConfig to a dedicated SchedulerConfig class. This refactoring improves the separation of concerns by placing scheduler-related configuration parameters in their appropriate config class.

  • Introduces SchedulerConfig import and instantiation across test files and worker components
  • Updates all references to use scheduler_config.max_num_batched_tokens and scheduler_config.max_num_seqs instead of parallel_config equivalents
  • Modifies config initialization in args_utils to properly pass parameters to SchedulerConfig
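The structure of the refactoring can be sketched as follows. This is a simplified, hypothetical illustration based on the PR description, not FastDeploy's actual class definitions; the default values shown are placeholders:

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerConfig:
    # Scheduler-owned limits, moved here out of ParallelConfig/FDConfig.
    # Defaults are illustrative placeholders, not FastDeploy's real values.
    max_num_batched_tokens: int = 2048
    max_num_seqs: int = 128


@dataclass
class FDConfig:
    # FDConfig no longer holds the limits directly; it composes SchedulerConfig.
    scheduler_config: SchedulerConfig = field(default_factory=SchedulerConfig)


cfg = FDConfig()
# Callers now read the limits through scheduler_config:
print(cfg.scheduler_config.max_num_batched_tokens, cfg.scheduler_config.max_num_seqs)
```

Consumers that previously read `parallel_config.max_num_seqs` switch to `scheduler_config.max_num_seqs`, which keeps scheduler-related knobs in one place.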

Reviewed Changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.

File summary:
  tests/v1/test_schedule_output.py: Updates the test to use SchedulerConfig and the new config structure
  tests/v1/test_prefix_cache.py: Updates the test configuration to use SchedulerConfig
  tests/utils/test_config.py: Updates test assertions to access scheduler config properties
  tests/utils.py: Moves max_num_seqs from parallel_config to scheduler_config
  tests/graph_optimization/*.py: Updates multiple graph optimization tests to use SchedulerConfig
  fastdeploy/worker/*.py: Updates worker components to access scheduler config properties
  fastdeploy/spec_decode/base.py: Updates to use scheduler_config for max_num_seqs
  fastdeploy/scheduler/config.py: Adds max_num_batched_tokens and max_num_seqs to SchedulerConfig
  fastdeploy/model_executor/models/deepseek_v3.py: Updates buffer allocation to use scheduler config
  fastdeploy/model_executor/layers/sample/sampler.py: Updates early stopper initialization
  fastdeploy/model_executor/layers/backends/gcu/attention/*.py: Updates GCU attention backends to use scheduler config
  fastdeploy/model_executor/layers/attention/*.py: Updates attention components to use scheduler config
  fastdeploy/model_executor/guided_decoding/xgrammar_backend.py: Updates batch size configuration
  fastdeploy/engine/*.py: Updates engine components to use scheduler config
  fastdeploy/config.py: Removes max_num_batched_tokens and max_num_seqs from ParallelConfig and FDConfig, adds SchedulerConfig integration

self.long_prefill_token_threshold = int(self.max_model_len * 0.04)

- self.cache_config.postprocess(self.max_num_batched_tokens, self.max_num_seqs)
+ self.cache_config.postprocess(self.scheduler_config.max_num_batched_tokens, self.max_num_seqs)
Copilot AI commented Sep 16, 2025

This should use self.scheduler_config.max_num_seqs instead of self.max_num_seqs for consistency and to ensure the scheduler config values are used throughout.

Suggested change:
- self.cache_config.postprocess(self.scheduler_config.max_num_batched_tokens, self.max_num_seqs)
+ self.cache_config.postprocess(self.scheduler_config.max_num_batched_tokens, self.scheduler_config.max_num_seqs)
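The reviewer's point can be illustrated with a minimal sketch. The class bodies below are simplified stand-ins, not FastDeploy's real implementations, and the limit values are placeholders; only the call shape mirrors the diff. Once the fields are removed from FDConfig itself, both arguments to postprocess must be read from scheduler_config:

```python
class SchedulerConfig:
    def __init__(self, max_num_batched_tokens=2048, max_num_seqs=128):
        self.max_num_batched_tokens = max_num_batched_tokens
        self.max_num_seqs = max_num_seqs


class CacheConfig:
    def postprocess(self, max_num_batched_tokens, max_num_seqs):
        # Simplified stand-in: just record the scheduler limits used for cache sizing.
        self.limits = (max_num_batched_tokens, max_num_seqs)


class FDConfig:
    def __init__(self):
        # max_num_batched_tokens/max_num_seqs no longer live on FDConfig,
        # so reading self.max_num_seqs here would raise AttributeError.
        self.scheduler_config = SchedulerConfig()
        self.cache_config = CacheConfig()
        # After the suggested fix, both values come from scheduler_config:
        self.cache_config.postprocess(
            self.scheduler_config.max_num_batched_tokens,
            self.scheduler_config.max_num_seqs,
        )


cfg = FDConfig()
print(cfg.cache_config.limits)
```

Consistently routing both values through scheduler_config avoids a dangling reference to a field this PR deletes.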

@YuanRisheng (Collaborator, Author) replied:

done

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit 2e9e53f into PaddlePaddle:develop Sep 17, 2025
15 of 17 checks passed

3 participants