[Serve] log deployment config in controller logs#59222
[Serve] log deployment config in controller logs#59222abrarsheikh merged 4 commits intoray-project:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces logging for deployment configurations in Ray Serve, which is a valuable addition for debugging. The new _log_deployment_configs function logs declarative, imperative, and merged configurations. My review focuses on improving the implementation of this new function for better maintainability and readability by reducing code duplication.
| def _log_deployment_configs( | ||
| app_name: str, | ||
| declarative_config: ServeApplicationSchema, | ||
| imperative_configs: Dict[str, DeploymentInfo], | ||
| merged_configs: Dict[str, DeploymentInfo], | ||
| ) -> None: | ||
| """Log the declarative, imperative, and merged deployment configs. | ||
|
|
||
| This function logs the deployment configuration after each update to help | ||
| with debugging and understanding how declarative (from YAML/config) and | ||
| imperative (from code) configurations are merged. | ||
|
|
||
| Args: | ||
| app_name: Name of the application. | ||
| declarative_config: The declarative config from the YAML/config file. | ||
| imperative_configs: The imperative configs from the application code. | ||
| merged_configs: The final merged configs after applying overrides. | ||
| """ | ||
| try: | ||
| # Format declarative config | ||
| declarative_dict = {} | ||
| if declarative_config.deployments: | ||
| for dep in declarative_config.deployments: | ||
| dep_dict = dep.dict(exclude_unset=True) if hasattr(dep, "dict") else {} | ||
| declarative_dict[dep.name] = dep_dict | ||
|
|
||
| # Format imperative configs | ||
| imperative_dict = {} | ||
| for dep_name, dep_info in imperative_configs.items(): | ||
| imperative_dict[dep_name] = { | ||
| "deployment_config": dep_info.deployment_config.dict(), | ||
| "replica_config": { | ||
| "ray_actor_options": dep_info.replica_config.ray_actor_options, | ||
| "placement_group_bundles": dep_info.replica_config.placement_group_bundles, | ||
| "placement_group_strategy": dep_info.replica_config.placement_group_strategy, | ||
| "max_replicas_per_node": dep_info.replica_config.max_replicas_per_node, | ||
| }, | ||
| "route_prefix": dep_info.route_prefix, | ||
| "version": dep_info.version, | ||
| } | ||
|
|
||
| # Format merged configs | ||
| merged_dict = {} | ||
| for dep_name, dep_info in merged_configs.items(): | ||
| merged_dict[dep_name] = { | ||
| "deployment_config": dep_info.deployment_config.dict(), | ||
| "replica_config": { | ||
| "ray_actor_options": dep_info.replica_config.ray_actor_options, | ||
| "placement_group_bundles": dep_info.replica_config.placement_group_bundles, | ||
| "placement_group_strategy": dep_info.replica_config.placement_group_strategy, | ||
| "max_replicas_per_node": dep_info.replica_config.max_replicas_per_node, | ||
| }, | ||
| "route_prefix": dep_info.route_prefix, | ||
| "version": dep_info.version, | ||
| } | ||
|
|
||
| # Log the configurations | ||
| logger.info( | ||
| f"Application '{app_name}' deployment configs after update:\n" | ||
| f"Declarative config (from YAML/config): {json.dumps(declarative_dict, indent=2, default=str)}\n" | ||
| f"Imperative config (from code): {json.dumps(imperative_dict, indent=2, default=str)}\n" | ||
| f"Merged config (final): {json.dumps(merged_dict, indent=2, default=str)}" | ||
| ) | ||
| except Exception as e: | ||
| # Don't fail the deployment if logging fails | ||
| logger.warning( | ||
| f"Failed to log deployment configs for application '{app_name}': {e}" | ||
| ) |
There was a problem hiding this comment.
The _log_deployment_configs function can be refactored for better readability and to reduce code duplication.
- The logic for formatting
imperative_configsandmerged_configsis identical. This can be extracted into a nested helper function,_format_deployment_infos, to avoid repetition. - The creation of
declarative_dictcan be simplified using a dictionary comprehension. Thehasattr(dep, "dict")check is also redundant becausedeclarative_config.deploymentsis a list ofDeploymentSchemapydantic models, which are guaranteed to have a.dict()method.
Here is the suggested refactoring:
def _log_deployment_configs(
app_name: str,
declarative_config: ServeApplicationSchema,
imperative_configs: Dict[str, DeploymentInfo],
merged_configs: Dict[str, DeploymentInfo],
) -> None:
"""Log the declarative, imperative, and merged deployment configs.
This function logs the deployment configuration after each update to help
with debugging and understanding how declarative (from YAML/config) and
imperative (from code) configurations are merged.
Args:
app_name: Name of the application.
declarative_config: The declarative config from the YAML/config file.
imperative_configs: The imperative configs from the application code.
merged_configs: The final merged configs after applying overrides.
"""
try:
def _format_deployment_infos(infos: Dict[str, DeploymentInfo]) -> Dict[str, Dict]:
"""Helper to format DeploymentInfo objects for logging."""
return {
dep_name: {
"deployment_config": dep_info.deployment_config.dict(),
"replica_config": {
"ray_actor_options": dep_info.replica_config.ray_actor_options,
"placement_group_bundles": dep_info.replica_config.placement_group_bundles,
"placement_group_strategy": dep_info.replica_config.placement_group_strategy,
"max_replicas_per_node": dep_info.replica_config.max_replicas_per_node,
},
"route_prefix": dep_info.route_prefix,
"version": dep_info.version,
}
for dep_name, dep_info in infos.items()
}
# Format declarative config
declarative_dict = {
dep.name: dep.dict(exclude_unset=True)
for dep in declarative_config.deployments
} if declarative_config.deployments else {}
# Format imperative and merged configs
imperative_dict = _format_deployment_infos(imperative_configs)
merged_dict = _format_deployment_infos(merged_configs)
# Log the configurations
logger.info(
f"Application '{app_name}' deployment configs after update:\n"
f"Declarative config (from YAML/config): {json.dumps(declarative_dict, indent=2, default=str)}\n"
f"Imperative config (from code): {json.dumps(imperative_dict, indent=2, default=str)}\n"
f"Merged config (final): {json.dumps(merged_dict, indent=2, default=str)}"
)
except Exception as e:
# Don't fail the deployment if logging fails
logger.warning(
f"Failed to log deployment configs for application '{app_name}': {e}"
)
abrarsheikh
left a comment
There was a problem hiding this comment.
These changes seem overly complex for a simple task like logging config. Is there a architectural reason why logging just the deployment config is difficult?
Thanks for the feedback. Indeed it looks complex, but this is caused by architectural limitations. |
right, i think you are capturing the important detail in your comment. I am wondering why the change is not as simple as diff --git a/python/ray/serve/_private/config.py b/python/ray/serve/_private/config.py
index 22546f6188..efc8e6c468 100644
--- a/python/ray/serve/_private/config.py
+++ b/python/ray/serve/_private/config.py
@@ -806,7 +806,16 @@ class ReplicaConfig:
def to_proto_bytes(self):
return self.to_proto().SerializeToString()
-
+
+ def to_dict(self):
+ # only use for logging purposes
+ return {
+ "deployment_def_name": self.deployment_def_name,
+ "ray_actor_options": json.loads(self.ray_actor_options),
+ "placement_group_bundles": json.loads(self.placement_group_bundles),
+ "placement_group_strategy": self.placement_group_strategy,
+ "max_replicas_per_node": self.max_replicas_per_node,
+ }
def prepare_imperative_http_options(
proxy_location: Union[None, str, ProxyLocation],
diff --git a/python/ray/serve/_private/deployment_info.py b/python/ray/serve/_private/deployment_info.py
index 5413c7878a..1e4e824368 100644
--- a/python/ray/serve/_private/deployment_info.py
+++ b/python/ray/serve/_private/deployment_info.py
@@ -167,3 +167,14 @@ class DeploymentInfo:
else:
data["target_capacity_direction"] = self.target_capacity_direction.name
return DeploymentInfoProto(**data)
+
+ def to_dict(self):
+ # only use for logging purposes
+ return {
+ "deployment_config": self.deployment_config.model_dump(),
+ "replica_config": self.replica_config.to_dict(),
+ "start_time_ms": self.start_time_ms,
+ "actor_name": self.actor_name,
+ "version": self.version,
+ "end_time_ms": self.end_time_ms,
+ }
\ No newline at end of file
diff --git a/python/ray/serve/_private/deployment_state.py b/python/ray/serve/_private/deployment_state.py
index d1e29e9f08..d507799987 100644
--- a/python/ray/serve/_private/deployment_state.py
+++ b/python/ray/serve/_private/deployment_state.py
@@ -2340,6 +2340,9 @@ class DeploymentState:
bool: Whether the target state has changed.
"""
+ logger.info(f"Deploying deployment {deployment_info.to_dict()}")
+ logger.info(f"Current target state: {self._target_state.info.to_dict() if self._target_state.info is not None else None}")
+
curr_deployment_info = self._target_state.info
if curr_deployment_info is not None:
# Redeploying should not reset the deployment's start time. |
Signed-off-by: JianZhang <keepromise@apache.org>
Hi @abrarsheikh Thank you for taking the time to review. PTAL. |
|
tests are failing. |
Signed-off-by: JianZhang <keepromise@apache.org>
| # Log the resulting declarative + imperative version of the deployment config | ||
| # after each update to the application | ||
| if deployment_infos is not None and not deleting: | ||
| deployment_configs = { | ||
| name: info.to_dict() for name, info in deployment_infos.items() | ||
| } | ||
| logger.info( | ||
| f"Application '{self._name}' updated. Deployment configs (declarative + imperative): " | ||
| f"{json.dumps(deployment_configs, indent=2, default=str)}" | ||
| ) |
There was a problem hiding this comment.
i think the log in deploy is sufficient. this one is going to be too noisy since it get called on every autoscaling decision.
Signed-off-by: JianZhang <keepromise@apache.org>
|
please add screenshot of manual test after this change. |
## Description [Serve] log deployment config in controller logs ## Related issues #59167 ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: JianZhang <keepromise@apache.org>
|
@abrarsheikh Thank you for reviewing and merging this PR. |
|
@KeeProMise message me on slack if you are interested in making other contributions :) |
## Description [Serve] log deployment config in controller logs ## Related issues ray-project#59167 ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: JianZhang <keepromise@apache.org>
## Description [Serve] log deployment config in controller logs ## Related issues ray-project#59167 ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: JianZhang <keepromise@apache.org>
## Description [Serve] log deployment config in controller logs ## Related issues ray-project#59167 ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: JianZhang <keepromise@apache.org> Signed-off-by: peterxcli <peterxcli@gmail.com>

Description
[Serve] log deployment config in controller logs
Related issues
#59167
Additional information