[core][observability] Report idle node information in status and dashboard#39638
[core][observability] Report idle node information in status and dashboard#39638rickyyx merged 13 commits intoray-project:masterfrom
Conversation
rickyyx
left a comment
There was a problem hiding this comment.
looks pretty good to me! some nits and comments.
So i guess the actual printing or showing on the dashboard part will be in another PR?
Signed-off-by: vitsai <vitsai@cs.stanford.edu>
Signed-off-by: vitsai <vitsai@cs.stanford.edu>
Signed-off-by: vitsai <vitsai@cs.stanford.edu>
Signed-off-by: vitsai <vitsai@cs.stanford.edu>
rickyyx
left a comment
There was a problem hiding this comment.
Looks pretty good to me!
The only thing I wanna double check is just the Snapshot status setting logic (if the idle one will get overridden), and if we could have a test that tests the new node activity thing e2e? I think so far it's either being mocked. If e2e is non trivial, then some manual tests with ray status -v is also fine, and some unit testing.
And a couple of nits, including
- Write in the PR description for what's being changed here in terms of output?
jjyao
left a comment
There was a problem hiding this comment.
Could you add the PR description?
Ideally also show some example outputs of status and dashboard after your change.
Signed-off-by: vitsai <vitsai@cs.stanford.edu>
rickyyx
left a comment
There was a problem hiding this comment.
Is the idle time thing being done outside of this PR? If so, let's document it in a TODO.
| switch (std::get<WorkFootprint>(iter.first)) { | ||
| case WorkFootprint::NODE_WORKERS: | ||
| node_activity << " Node currently has leased workers." << std::endl; | ||
| resources_data.add_node_activity("Busy workers on node."); |
There was a problem hiding this comment.
| resources_data.add_node_activity("Busy workers on node."); | |
| resources_data.add_node_activity("Active workers."); |
?
Signed-off-by: vitsai <vitsai@cs.stanford.edu>
…board (ray-project#39638) --------- Signed-off-by: vitsai <vitsai@cs.stanford.edu>
…and dashboard (ray-project#39638)" This reverts commit a2dedf1.
…board (ray-project#39638) --------- Signed-off-by: vitsai <vitsai@cs.stanford.edu> Signed-off-by: Victor <vctr.y.m@example.com>



Plumbs through idle node information to be reflected in ray status both in the CLI and on the dashboard. Does not include additional changes in the cluster tab of the dashboard UI, but does plumb the status field through to the datasource for dashboard consumption.
Why are these changes needed?
Related issue number
#35411
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.