Conversation

@mismithhisler (Member) commented Aug 29, 2025

Description

The preferred node is used when a task group has an ephemeral disk, so that we ideally stay on the same node. However, if the job's node pool changes, we should not select the current node as the preferred node, and should instead let the scheduler decide which node to pick from the correct node pool.
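
The core idea, as a minimal sketch with hypothetical types (this is not the actual Nomad scheduler code): keep the previous node as the preferred node only while it still belongs to the job's node pool.

    // Minimal sketch of the intended behavior; the types and function are
    // hypothetical and only illustrate the check described above.
    package sketch

    type Node struct {
        ID       string
        NodePool string
    }

    type Job struct {
        NodePool string
    }

    // preferredNode returns the previous node to prefer for a task group with
    // an ephemeral disk, or nil when the scheduler should pick freely.
    func preferredNode(job *Job, prev *Node) *Node {
        if prev == nil {
            return nil
        }
        // If the job's node pool changed, the previous node may no longer be
        // eligible, so let the scheduler choose from the new pool instead.
        if prev.NodePool != job.NodePool {
            return nil
        }
        return prev
    }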

Testing & Reproduction steps

Links

Fixes GH #26600

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad website documentation to reflect this. Refer to
    the website README for docs guidelines. Please also consider whether the
    change requires notes within the upgrade guide.

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected, which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise-only PR, please add any required changelog entry
    within the public repository.
  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

@jrasell (Member) left a comment

The change looks good to me, but seeing as @pkazmierczak has been deep in this code, I'd like for him to take a quick look.

It would be nice if we could tighten up the use of must in the added test cases, for example:

    // Instead of:
    if err := h.Process(NewServiceScheduler, eval); err != nil {
        t.Fatalf("err: %v", err)
    }
    // prefer a single assertion:
    must.NoError(t, h.Process(NewServiceScheduler, eval))
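
For reference, a self-contained sketch of that style using the shoenig/test/must package; doWork here is a hypothetical stand-in for the h.Process(NewServiceScheduler, eval) call above.

    package sketch

    import (
        "testing"

        "github.com/shoenig/test/must"
    )

    // doWork is a hypothetical stand-in for the scheduler call in the snippet.
    func doWork() error { return nil }

    func TestMustStyle(t *testing.T) {
        // Instead of manual error handling with t.Fatalf, a single assertion
        // fails the test with a useful message when the error is non-nil.
        must.NoError(t, doWork())
    }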

The PR also needs a changelog entry.

pkazmierczak previously approved these changes Sep 2, 2025

@pkazmierczak (Contributor) left a comment

Great work, @mismithhisler, LGTM. As noted by James, a small refactoring of the tests and a changelog entry would be nice.

@mismithhisler added the backport/ent/1.8.x+ent, backport/ent/1.9.x+ent, and backport/1.10.x labels on Sep 2, 2025
jrasell previously approved these changes Sep 3, 2025
tgross previously approved these changes Sep 3, 2025
@tgross (Member) left a comment

LGTM!

@tgross (Member) commented Sep 3, 2025

I'm now realizing this bug probably happens with datacenters too (if they aren't overlapping sets between job versions), because we use both the node pool and the datacenters to get the set of eligible nodes.
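
A rough sketch of that point, with hypothetical types rather than Nomad's actual feasibility code (and ignoring wildcard datacenters): if the eligible set is the intersection of the job's node pool and its datacenter list, the previous node can drop out of that set when either one changes.

    package sketch

    import "slices"

    type Node struct {
        ID         string
        NodePool   string
        Datacenter string
    }

    type Job struct {
        NodePool    string
        Datacenters []string
    }

    // eligible reports whether a node is in the set the scheduler may pick from.
    func eligible(job *Job, n *Node) bool {
        return n.NodePool == job.NodePool &&
            slices.Contains(job.Datacenters, n.Datacenter)
    }

    // A previous node should be preferred only while it is still eligible.
    func preferPrevious(job *Job, prev *Node) bool {
        return prev != nil && eligible(job, prev)
    }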

Labels
  • backport/ent/1.8.x+ent: Changes are backported to 1.8.x+ent
  • backport/ent/1.9.x+ent: Changes are backported to 1.9.x+ent
  • backport/1.10.x: backport to 1.10.x release line
Development

Successfully merging this pull request may close these issues.

Changing just the node_pool of a job will not result in allocation moving if there is an ephemeral disk