tgross (Member) commented Jul 2, 2025

The output of the reconciler stage of scheduling is only visible via debug-level logs, which are typically accessible only to the cluster admin. We can help job authors understand what's happening to their jobs by exposing this information in the `eval status` command.

Add the reconciler's desired updates to the evaluation struct so it can be exposed in the API. This increases the size of evals by roughly 15% in the state store, or a bit more when there are preemptions (but we expect this will be a small minority of evals).
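
A minimal sketch of the idea, not the code from this PR: the types below (`DesiredUpdates`, `Annotations`, `Evaluation`) and the `encodedSize` helper are hypothetical stand-ins that only mirror the columns of the `eval status` table. It assumes JSON encoding purely for illustration, so the byte counts it prints say nothing about the roughly 15% figure measured for the real state store.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DesiredUpdates is a hypothetical per-task-group summary of what the
// reconciler decided to do. The field names mirror the table columns.
type DesiredUpdates struct {
	Ignore, Place, Stop, Migrate        uint64
	InPlaceUpdate, DestructiveUpdate    uint64
	Canary, Preemptions                 uint64
}

// Annotations is a hypothetical container for the reconciler output
// that gets attached to an evaluation.
type Annotations struct {
	DesiredTGUpdates  map[string]*DesiredUpdates
	PreemptedAllocIDs []string
}

// Evaluation is a heavily trimmed, hypothetical stand-in for the real struct.
type Evaluation struct {
	ID          string
	JobID       string
	Annotations *Annotations `json:"Annotations,omitempty"`
}

// exampleEval builds an evaluation matching the example with one placement
// and one preemption.
func exampleEval() *Evaluation {
	return &Evaluation{
		ID:    "4d832430",
		JobID: "example2",
		Annotations: &Annotations{
			DesiredTGUpdates: map[string]*DesiredUpdates{
				"group": {Place: 1, Preemptions: 1},
			},
			PreemptedAllocIDs: []string{"116e9046"},
		},
	}
}

// encodedSize returns the JSON-encoded size of an evaluation in bytes.
func encodedSize(e *Evaluation) int {
	b, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	return len(b)
}

func main() {
	bare := &Evaluation{ID: "4d832430", JobID: "example2"}
	fmt.Println("without annotations:", encodedSize(bare), "bytes")
	fmt.Println("with annotations:   ", encodedSize(exampleEval()), "bytes")
}
```

The point of the sketch is the trade-off described above: carrying the annotations on the evaluation makes them available to anything that can read the eval, at the cost of a larger stored object, and the preempted allocation list grows that cost further.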

Ref: https://hashicorp.atlassian.net/browse/NMD-818
Fixes: #15564

Contributor Checklist

  • Changelog Entry: If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing: Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation: If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad website documentation to reflect this. Refer to
    the website README for docs guidelines. Please also consider whether the
    change requires notes within the upgrade guide.
    • More comprehensive eval status and alloc status docs changes are coming in a separate PR.

Reviewer Checklist

  • Backport Labels: Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type: Ensure the correct merge method is selected, which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs: If this is an enterprise-only PR, please add any required changelog entry
    within the public repository.

tgross (Member, Author) commented Jul 2, 2025

Example output of the updated eval status command with preemptions:

$ nomad eval status 4d832430
ID                 = 4d832430
Create Time        = 34s ago
Modify Time        = 34s ago
Status             = complete
Status Description = complete
Type               = service
TriggeredBy        = job-register
Job ID             = example2
Namespace          = default
Priority           = 100
Placement Failures = false
Previous Eval      = <none>
Next Eval          = <none>
Blocked Eval       = <none>

Reconciler Annotations
Task Group  Ignore  Place  Stop  Migrate  InPlace  Destructive  Canary  Preemptions
group       0       1      0     0        0        0            0       1

Preempted Allocations
ID        Job ID    Node ID   Task Group  Version  Desired  Status   Created   Modified
116e9046  example1  fb4cacda  group       0        run      running  1m9s ago  59s ago

Placed Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
a61edb08  fb4cacda  group       0        run      running  34s ago  18s ago

tgross force-pushed the NMD818-reconciler-annotations branch from c008acb to 0c27588 on July 2, 2025 19:19
tgross added this to the 1.11.0 milestone on Jul 2, 2025
tgross marked this pull request as ready for review on July 2, 2025 20:29
tgross requested review from a team as code owners on July 2, 2025 20:29

jrasell (Member) left a comment

LGTM!

tgross merged commit 5c90921 into main on Jul 7, 2025
48 checks passed
tgross deleted the NMD818-reconciler-annotations branch on July 7, 2025 13:40
Successfully merging this pull request may close these issues.

persist reconciler metrics in raft