identity: The initial implementation code for node identity. #26291

Merged
jrasell merged 10 commits into main from f-NMD-763-identity
Aug 5, 2025

@jrasell
Member

@jrasell jrasell commented Jul 17, 2025

This PR comprises the initial node identity code. It has been previously reviewed in stages on a PR by PR basis. Merging this code into main now allows us to get it into our nightly testing loop and makes it easier for engineers to test it when they want to.

This initial PR does not include changelog or documentation updates which will come at a later date.

Links

internal jira: https://hashicorp.atlassian.net/issues/NMD-763
internal design doc: https://docs.google.com/document/d/1MYjlFlOAmGHmWGC3VsrIMUL_VSgKwjLAq6GUymDY38M/edit?tab=t.0

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad website documentation to reflect this. Refer to
    the website README for docs guidelines. Please also consider whether the
    change requires notes within the upgrade guide.

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
    within the public repository.

jrasell and others added 10 commits June 18, 2025 07:43
…ion to support node identities. (#26052)

When Nomad generates an identity for a node, the root key used to
sign the JWT will be stored as a field on the node object and
written to state. To provide fast lookup of nodes by their
signing key, the node table schema has been modified to include
the keyID as an index.

In order to ensure a root key is not deleted while identities are
still actively signed by it, the Nomad state has an in-use check.
This check has been extended to cover node identities.
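
A minimal sketch of how a key ID index can back the in-use check, using a toy in-memory store rather than Nomad's real state store. The `Node.SigningKeyID` field name, the `nodeStore` type, and its methods are all hypothetical and exist only for illustration.

```go
package main

import "fmt"

// Node is a trimmed-down stand-in for the node object.
// SigningKeyID is a hypothetical field name for the stored root key ID.
type Node struct {
	ID           string
	SigningKeyID string
}

// nodeStore is a toy table with a secondary index on the signing key ID.
type nodeStore struct {
	byID    map[string]*Node
	byKeyID map[string]map[string]struct{} // key ID -> set of node IDs
}

func newNodeStore() *nodeStore {
	return &nodeStore{
		byID:    make(map[string]*Node),
		byKeyID: make(map[string]map[string]struct{}),
	}
}

// Upsert writes the node and keeps the key ID index in step with it.
func (s *nodeStore) Upsert(n *Node) {
	if old, ok := s.byID[n.ID]; ok && old.SigningKeyID != "" {
		delete(s.byKeyID[old.SigningKeyID], old.ID)
	}
	cp := *n // store a copy so later caller mutations cannot skew the index
	s.byID[n.ID] = &cp
	if n.SigningKeyID == "" {
		return
	}
	if s.byKeyID[n.SigningKeyID] == nil {
		s.byKeyID[n.SigningKeyID] = make(map[string]struct{})
	}
	s.byKeyID[n.SigningKeyID][n.ID] = struct{}{}
}

// KeyInUse reports whether any node identity is still signed by keyID.
func (s *nodeStore) KeyInUse(keyID string) bool {
	return len(s.byKeyID[keyID]) > 0
}

func main() {
	s := newNodeStore()
	s.Upsert(&Node{ID: "node-1", SigningKeyID: "key-a"})
	fmt.Println("key-a in use:", s.KeyInUse("key-a"))

	// A renewed identity signed by a newer key releases the old one.
	s.Upsert(&Node{ID: "node-1", SigningKeyID: "key-b"})
	fmt.Println("key-a in use:", s.KeyInUse("key-a"))
}
```

The point of the index is that the in-use check becomes a lookup by key ID instead of a scan over every node.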

Nomad node identities will have an expiration. The expiration will
be defined by a TTL configured within the node pool specification
as a time duration. When not supplied by the operator, a default
value of 24hr is applied.

On cluster upgrades, a Nomad server will restore from snapshot
and/or replay logs. The FSM has therefore been modified to ensure
restored node pool objects include the default value. The builtin
"all" and "default" pools have also been updated to include this
default value.
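
The defaulting described above could look roughly like the following. This is a sketch only: the `NodeIdentityTTL` field name and the `Canonicalize` method are assumptions for illustration, not the names used in the Nomad code base.

```go
package main

import (
	"fmt"
	"time"
)

// defaultNodeIdentityTTL mirrors the 24h default described above.
const defaultNodeIdentityTTL = 24 * time.Hour

// NodePool is a trimmed-down stand-in for the node pool specification.
// NodeIdentityTTL is a hypothetical field name.
type NodePool struct {
	Name            string
	NodeIdentityTTL time.Duration
}

// Canonicalize applies the default TTL when the operator left it unset,
// which is also the case for pools restored from a pre-upgrade snapshot.
func (p *NodePool) Canonicalize() {
	if p.NodeIdentityTTL == 0 {
		p.NodeIdentityTTL = defaultNodeIdentityTTL
	}
}

func main() {
	restored := &NodePool{Name: "default"} // no TTL in the old snapshot
	restored.Canonicalize()
	fmt.Println(restored.Name, restored.NodeIdentityTTL)
}
```

Running the same function on the restore path and on the builtin pools keeps every pool on one default, however it entered state.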

Nomad node identities will be a new identity concept in Nomad and
will exist alongside workload identities. This change introduces a
new envelope identity claim which contains generic public claims
as well as either node or workload identity claims. This allows
us to use a single encryption and decryption path, no matter what
the underlying identity. Where possible node and workload
identities will use common functions for identity claim
generation.

The new node identity has the following claims:

* "nomad_node_id" - the node ID which is typically generated on
  the first boot of the Nomad client as a UUID within the
  "ensureNodeID" function.

* "nomad_node_pool" - the node pool is a client configuration
  parameter which provides logical grouping of Nomad clients.

* "nomad_node_class" - the node class is a client configuration
  parameter which provides scheduling constraints for Nomad clients.

* "nomad_node_datacenter" - the node datacenter is a client
  configuration parameter which provides scheduling constraints
  for Nomad clients and a logical grouping method.
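
As a rough illustration of the envelope idea, the claims can be modelled as one struct that embeds either the node or the workload claims. The four `nomad_node_*` JSON names come from the list above; the Go type names, the `Validate` rule, and the single workload field shown are assumptions made for this sketch.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// NodeIdentityClaims carries the four node claims listed above.
type NodeIdentityClaims struct {
	NodeID         string `json:"nomad_node_id"`
	NodePool       string `json:"nomad_node_pool"`
	NodeClass      string `json:"nomad_node_class"`
	NodeDatacenter string `json:"nomad_node_datacenter"`
}

// WorkloadIdentityClaims is a placeholder for the existing workload claims;
// only one illustrative field is shown.
type WorkloadIdentityClaims struct {
	AllocationID string `json:"nomad_allocation_id"`
}

// IdentityClaims is the envelope: generic public claims plus exactly one
// of the embedded identity types. A nil embedded pointer is omitted from
// the encoded JSON.
type IdentityClaims struct {
	Subject  string `json:"sub"`
	IssuedAt int64  `json:"iat"`
	Expiry   int64  `json:"exp"`

	*NodeIdentityClaims
	*WorkloadIdentityClaims
}

func (c IdentityClaims) IsNode() bool     { return c.NodeIdentityClaims != nil }
func (c IdentityClaims) IsWorkload() bool { return c.WorkloadIdentityClaims != nil }

// Validate enforces the "either node or workload" rule (an assumption here).
func (c IdentityClaims) Validate() error {
	if c.IsNode() == c.IsWorkload() {
		return errors.New("claims must carry exactly one of node or workload identity")
	}
	return nil
}

func main() {
	claims := IdentityClaims{
		Subject:  "node-1",
		IssuedAt: 1752735600,
		Expiry:   1752822000,
		NodeIdentityClaims: &NodeIdentityClaims{
			NodeID:         "node-1",
			NodePool:       "default",
			NodeDatacenter: "dc1",
		},
	}
	if err := claims.Validate(); err != nil {
		fmt.Println("invalid claims:", err)
		return
	}
	out, err := json.Marshal(claims)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because both identity types flatten into the same envelope, one signing and verification path can serve either.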

When a node heartbeats, the RPC handler will optionally generate
an identity to return to the caller. The identity key ID will be
stored in the node object, so we have tracking of keys in use.

The state store has been updated to handle node status update
requests that include a signing key ID. Rather than add another
parameter into the function signature, the FSM function now takes
the entire request object.

The authenticator process which performs RPC authentication has
been modified to support node identities. Node identities are
verified by ensuring the node ID as claimed has a node written
to Nomad state.

The client-only and generic authenticate methods now support
both node secret IDs and node identities. They use UUID checking
to decide which of the two options to parse.
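
One way to picture the UUID check: a secret ID is a UUID, whereas a signed identity is a JWT made of three dot-separated segments. The sketch below classifies a presented credential on that basis. The function and type names are hypothetical, and the real authenticator may order or implement the checks differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidRe matches the canonical 8-4-4-4-12 hexadecimal UUID layout.
var uuidRe = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

type credentialKind int

const (
	kindUnknown credentialKind = iota
	kindSecretID
	kindIdentity
)

// classify decides how a presented credential should be parsed. It only
// inspects the shape of the string; verification happens afterwards.
func classify(token string) credentialKind {
	switch {
	case uuidRe.MatchString(token):
		return kindSecretID
	case strings.Count(token, ".") == 2:
		// header.payload.signature, the compact JWT serialization.
		return kindIdentity
	default:
		return kindUnknown
	}
}

func main() {
	fmt.Println(classify("6f3b2c1a-9d4e-4f7a-8b2c-1e5d7a9c0b3f") == kindSecretID)
	fmt.Println(classify("aaa.bbb.ccc") == kindIdentity)
	fmt.Println(classify("not-a-credential") == kindUnknown)
}
```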

A new method has also been added to handle the specific RPCs that
will optionally generate node identities. While a new
authenticator method is not ideal, it is better than the
alternative option for these RPCs to perform complex additional
RPC context work in order to understand whether an identity
should be generated.

The TLS verification functionality has been pulled into its own
method to avoid further code duplication.
…6165)

When a Nomad client registers or re-registers, the RPC handler will
generate and return a node identity if required. When an identity
is generated, the signing key ID will be stored within the node
object, to ensure a root key is not deleted while it is still in use.

During normal operation, the client periodically heartbeats to
the Nomad servers to indicate aliveness. The RPC handler that
is used for this action has also been updated to conditionally
perform identity generation. Performing it here means no extra RPC
handlers are required and we inherit the jitter in identity
generation from the heartbeat mechanism.

The identity generation check methods are performed from the RPC
request arguments, so they are scoped to the required behaviour and
can handle the nuance of each RPC. Failure to generate an identity
is considered terminal to the RPC call. The client will include
behaviour to retry this error, which is always caused by the
encrypter not being ready unless the servers' keyring has been
corrupted.
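
To make the conditional generation concrete, here is a sketch of the kind of check a heartbeat handler might run. Everything in it is an assumption: the `identityState` fields are hypothetical, and the "renew once less than a third of the TTL remains" threshold is an arbitrary value chosen for the example, not one taken from this change.

```go
package main

import (
	"fmt"
	"time"
)

// defaultTTL mirrors the 24h default node identity TTL.
const defaultTTL = 24 * time.Hour

// identityState is a hypothetical summary of what the RPC arguments tell
// the server about the caller's current identity.
type identityState struct {
	HasIdentity  bool          // false when the node authenticated with a secret ID
	Remaining    time.Duration // time left before the current identity expires
	ForceRenewal bool          // set when the node was told to renew early
}

// shouldGenerateIdentity reports whether the handler should mint a new
// identity. The ttl/3 threshold is illustrative only.
func shouldGenerateIdentity(s identityState, ttl time.Duration) bool {
	if !s.HasIdentity || s.ForceRenewal {
		return true
	}
	return s.Remaining < ttl/3
}

func main() {
	fresh := identityState{HasIdentity: true, Remaining: 20 * time.Hour}
	aging := identityState{HasIdentity: true, Remaining: 2 * time.Hour}
	fmt.Println(shouldGenerateIdentity(fresh, defaultTTL))
	fmt.Println(shouldGenerateIdentity(aging, defaultTTL))
}
```

Since the check runs on every heartbeat, renewals spread out with the heartbeat jitter instead of clustering at a fixed time.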
…26184)

The Nomad client will persist its own identity within its state
store for restart persistence. The added benefit of using it over
the filesystem is that it supports transactions. This is useful
when considering the identity will be renewed periodically.

Nomad servers, if upgraded, can return node identities as part of
the register and update/heartbeat response objects. The Nomad
client will now handle this and store it as appropriate within its
memory and statedb.

The client will now use any stored identity for RPC authentication
with a fallback to the secretID. This supports upgrade paths where
the Nomad clients are updated before the Nomad servers.
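
The fallback can be sketched as a small accessor that prefers the identity and otherwise returns the secret ID. The `clientAuth` type and its method names are hypothetical; the real client also persists the identity to its state store, which is omitted here.

```go
package main

import (
	"fmt"
	"sync"
)

// clientAuth holds the credentials a client can present on an RPC.
type clientAuth struct {
	mu       sync.RWMutex
	secretID string
	identity string // signed JWT; empty until a server returns one
}

// setIdentity records an identity returned in a register or heartbeat
// response.
func (c *clientAuth) setIdentity(jwt string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.identity = jwt
}

// authToken prefers the node identity and falls back to the secret ID, so
// a client talking to servers that never issue identities keeps working.
func (c *clientAuth) authToken() string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	if c.identity != "" {
		return c.identity
	}
	return c.secretID
}

func main() {
	auth := &clientAuth{secretID: "6f3b2c1a-9d4e-4f7a-8b2c-1e5d7a9c0b3f"}
	fmt.Println(auth.authToken()) // secret ID: no identity issued yet

	auth.setIdentity("aaa.bbb.ccc")
	fmt.Println(auth.authToken()) // identity once a server has returned one
}
```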
)

The Nomad client will have its identity renewed according to the
TTL which defaults to 24h. In certain situations such as root
keyring rotation, operators may want to force clients to renew
their identities before the TTL threshold is met. This change
introduces a client HTTP and RPC endpoint which will instruct the
node to request a new identity at its next heartbeat. This can be
used via the API or a new command.

While this is a manual intervention step on top of any keyring
rotation, it dramatically reduces the initial feature complexity
as it provides an asynchronous and efficient method of renewal that
utilises existing functionality.
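
The "renew at the next heartbeat" behaviour amounts to a flag that the endpoint sets and the heartbeat loop consumes. A sketch under that assumption follows; the `renewalFlag` type and its methods are hypothetical, and the real endpoint and heartbeat wiring are not shown.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// renewalFlag records that an operator asked for an early identity renewal.
type renewalFlag struct {
	pending atomic.Bool
}

// Request is what the client HTTP or RPC endpoint handler would call.
func (r *renewalFlag) Request() { r.pending.Store(true) }

// Consume is called when building the next heartbeat. It returns true at
// most once per request, so a single renewal is asked for.
func (r *renewalFlag) Consume() bool { return r.pending.Swap(false) }

func main() {
	var flag renewalFlag

	fmt.Println("heartbeat 1 forces renewal:", flag.Consume())

	flag.Request() // operator triggers renewal after a keyring rotation

	fmt.Println("heartbeat 2 forces renewal:", flag.Consume())
	fmt.Println("heartbeat 3 forces renewal:", flag.Consume())
}
```

The endpoint returns immediately and the renewal itself travels on the existing heartbeat path, which is where the asynchronous behaviour comes from.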

@jrasell jrasell self-assigned this Jul 17, 2025
@jrasell jrasell marked this pull request as ready for review July 17, 2025 07:10
@jrasell jrasell requested review from a team as code owners July 17, 2025 07:10
Member

@tgross tgross left a comment

:shipit:

@jrasell jrasell merged commit 350662c into main Aug 5, 2025
40 checks passed
@jrasell jrasell deleted the f-NMD-763-identity branch August 5, 2025 07:52
@github-actions

github-actions Bot commented Dec 4, 2025

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Dec 4, 2025