script checks: use check ID from group service hook #27453
Merged
Script checks are registered in Consul by the group service hook, then executed and heartbeated by the Nomad client in the script check hook. But not all fields in the task environment are available for interpolation when the group service hook runs, and the hash we use for the check ID includes all fields post-interpolation. The script check hook intends to use the original uninterpolated check ID, but by that point the service it reads this value from has already been interpolated with the full task environment. Using the uninterpolated check from the task group would be incorrect anyway, because the group service hook has already interpolated the check before creating the ID used to register it.

Ideally we'd interpolate values available at submit-time and generate an immutable service ID and check ID based on that. But for backwards compatibility with existing registered services, we'll need to fix this at the script check instead.

Have the group service hook record every check ID it creates in the alloc hook resources. Thread these down to the script check taskrunner hook and use the stored values as an override of the check ID we use for TTL updates.

Fixes: #26952
Ref: https://hashicorp.atlassian.net/browse/NMD-1054
jrasell
previously approved these changes
Feb 4, 2026
Member
jrasell
left a comment
LGTM. I've left a couple of inline comments, but I don't see those as blocking.
jrasell
approved these changes
Feb 4, 2026
tgross
added a commit
that referenced
this pull request
Feb 5, 2026
In #27453 we fixed a bug in script check hook interpolation by recording the check IDs for each check in the group service hook and then passing that to the script check hook via alloc hook resources. But this change did not account for checks being updated in-place, so the script check hook reads the old check IDs and fails. This was caught by nightly E2E testing.

Record the updated check IDs in the group service hook as well. Expand the group service tests to include updating the checks.

Ref: #27453
Ref: https://hashicorp.atlassian.net/browse/NMD-1054
Ref: https://github.com/hashicorp/nomad-e2e/actions/runs/21699934682/job/62586402991
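The follow-up fix can be pictured with a minimal Go sketch in which the hook records check IDs both on first registration and on in-place updates. The `checkIDStore` and `groupServiceHook` types and their `Prerun`, `Update`, `Set`, and `Get` methods are hypothetical names for this illustration, not Nomad's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// checkIDStore is a hypothetical stand-in for the alloc hook resources that
// carry check IDs from the group service hook to the script check hook.
type checkIDStore struct {
	mu  sync.Mutex
	ids map[string]string // check name -> ID registered in Consul
}

// Set replaces the recorded IDs with a copy of ids.
func (s *checkIDStore) Set(ids map[string]string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ids = make(map[string]string, len(ids))
	for name, id := range ids {
		s.ids[name] = id
	}
}

// Get returns the recorded ID for a check name, if any.
func (s *checkIDStore) Get(name string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	id, ok := s.ids[name]
	return id, ok
}

// groupServiceHook is a hypothetical hook that registers checks.
type groupServiceHook struct{ store *checkIDStore }

// Prerun records the IDs created at first registration.
func (h *groupServiceHook) Prerun(registered map[string]string) { h.store.Set(registered) }

// Update records the IDs again after an in-place update. Skipping this step
// would leave readers of the store with stale IDs.
func (h *groupServiceHook) Update(registered map[string]string) { h.store.Set(registered) }

func main() {
	store := &checkIDStore{}
	hook := &groupServiceHook{store: store}

	hook.Prerun(map[string]string{"alive": "check-v1"})
	hook.Update(map[string]string{"alive": "check-v2"})

	id, _ := store.Get("alive")
	fmt.Println("script check hook would use:", id)
}
```

Replacing the whole map on each call, rather than merging, is a design choice of this sketch so that IDs for checks removed by an update do not linger in the store.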
tgross
added a commit
that referenced
this pull request
Feb 6, 2026
Merged
Testing & Reproduction steps
Run Consul and a dev server, and do the usual `nomad setup consul -y` configuration. Deploy the following jobspec and observe that both script checks and TCP checks work as expected. Note the `NOMAD_ALLOC_IP_www` value is key here; just using the `NOMAD_JOB_NAME` doesn't trigger the bug because that's interpolated at the group level.
jobspec