Contributor

@tehut tehut commented Jul 1, 2025

Description

The nomad monitor export command lets Nomad export the logs that a given agent has written to journald or to the Nomad log file. Journald logs can be requested for a specific period of time, whereas the agent's Nomad log file is always returned in its entirety. Introducing this RPC is a prerequisite for adding journald logs to the Nomad support bundle.

Pasting a link to a comment deep in the thread regarding fixes for inconsistencies related to ending the stream.
#26178 (comment)

Testing & Reproduction steps

Links

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad website documentation to reflect this. Refer to
    the website README for docs guidelines. Please also consider whether the
    change requires notes within the upgrade guide.

I pulled this from the contributor checklist for new CLI commands. I struck out a few items that didn't seem to fit this use case, but I'm happy to revisit them if I was wrong there.

CLI Command Checklist

  • Consider similar commands in Consul, Vault, and other tools. Is there
    prior art we should match? Arguments, flags, env vars, etc?
  • New file in command/ or in an existing file if a subcommand
  • Test new command in command/ package
  • Implement autocomplete
  • Update help text
  • Register new command in command/commands.go
  • Implement and test new HTTP endpoint in command/agent/<command>_endpoint.go
  • Register new URL paths in command/agent/http.go
  • Implement and test new RPC endpoint in nomad/<command>_endpoint.go
  • Implement and test new Client RPC endpoint in
    client/<command>_endpoint.go (For client endpoints like Filesystem only)
  • Implement and test new nomad/structs/ package Request and Response structs (cstructs, but still)
  • Implement and test new api/ package helper methods
    I added a monitorHelper function to share the existing monitor code in /api/agent.go more cleanly. I did not add a test for the helper or for the MonitorExport command, mostly because I'd have to update the Client Config structs to be able to set the LogFile value on the api Client. The only difference between the Monitor and MonitorExport commands in api/agent.go is which path is passed to monitorHelper, and the helper is already exercised by the existing monitor tests.

* [ ] For nested commands make sure all intermediary subcommands exist (for example, nomad acl, nomad acl policy, and nomad acl policy apply must all be valid commands)
* [ ] If the command has a status subcommand consider adding a search context in nomad/search_endpoint.go and update command/status.go

* [ ] Implement -json (returns raw API response)
* [ ] Implement -t (format API response using gotemplate)
* [ ] Implement -verbose (expands truncated UUIDs, adds other detail)
* [ ] Implement and test new api/ package Request and Response structs
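
The shared-helper arrangement described in the api/ item above can be sketched roughly as follows. This is an illustration, not the code from the PR: the Agent type, its open field, the two path constants, and the line-based channel are all stand-ins invented for this example, and the real api package streams frames over HTTP rather than lines from an in-memory reader.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// Placeholder paths for illustration only; they are not claimed to be
// the routes Nomad registers.
const (
	monitorPath       = "/monitor"
	monitorExportPath = "/monitor/export"
)

// Agent is a hypothetical client. open stands in for the HTTP transport:
// given a path it returns the response stream.
type Agent struct {
	open func(path string) (io.ReadCloser, error)
}

// Monitor and MonitorExport differ only in the path they hand to the helper.
func (a *Agent) Monitor(stopCh <-chan struct{}) (<-chan string, <-chan error) {
	return a.monitorHelper(stopCh, monitorPath)
}

func (a *Agent) MonitorExport(stopCh <-chan struct{}) (<-chan string, <-chan error) {
	return a.monitorHelper(stopCh, monitorExportPath)
}

// monitorHelper streams lines from the given path until EOF or until
// stopCh is closed. The error channel is buffered so the goroutine
// never blocks on reporting a failure.
func (a *Agent) monitorHelper(stopCh <-chan struct{}, path string) (<-chan string, <-chan error) {
	lines := make(chan string)
	errCh := make(chan error, 1)
	go func() {
		defer close(lines)
		body, err := a.open(path)
		if err != nil {
			errCh <- err
			return
		}
		defer body.Close()
		sc := bufio.NewScanner(body)
		for sc.Scan() {
			select {
			case lines <- sc.Text():
			case <-stopCh:
				return
			}
		}
		if err := sc.Err(); err != nil {
			errCh <- err
		}
	}()
	return lines, errCh
}

// newFakeAgent returns an Agent whose transport echoes the requested path,
// which makes it easy to see which path each public method used.
func newFakeAgent() *Agent {
	return &Agent{open: func(path string) (io.ReadCloser, error) {
		return io.NopCloser(strings.NewReader("path=" + path + "\nline two\n")), nil
	}}
}

func main() {
	lines, errCh := newFakeAgent().MonitorExport(nil)
	for l := range lines {
		fmt.Println(l)
	}
	select {
	case err := <-errCh:
		fmt.Println("error:", err)
	default:
	}
}
```

Because both public methods are one-line wrappers, a test that drives the helper through either of them covers the shared streaming logic, which is the reasoning given above for not adding a separate test.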

Docs

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
    within the public repository.

@tehut tehut mentioned this pull request Jul 1, 2025
@tehut tehut force-pushed the f-NMD-855/monitor_external branch from 3a0e003 to 4a608b0 July 1, 2025 17:42
@tehut tehut changed the title from "F nmd 855/monitor external" to "Add nomad monitor export command" Jul 1, 2025
@tehut tehut force-pushed the f-NMD-855/monitor_external branch from b105697 to e430e42 July 2, 2025 00:25
@tehut tehut force-pushed the f-NMD-855/monitor_external branch from 68fdc38 to 2ddf97c July 2, 2025 03:07
@tehut tehut force-pushed the f-NMD-855/monitor_external branch from 2ddf97c to 2c0148a July 2, 2025 03:17
@tehut tehut force-pushed the f-NMD-855/monitor_external branch from 2c0148a to 8863e01 July 2, 2025 03:21
@tehut tehut force-pushed the f-NMD-855/monitor_external branch from 8863e01 to 71a05f0 July 2, 2025 03:25
Member

@tgross tgross left a comment

I know you're not quite done @tehut but I made a first pass over this.

Member

@tgross tgross left a comment

This is looking good @tehut! I've also checked this out locally and ran it against a multinode cluster to check things like goroutine leaks, truncated reads, etc. and it looks great!

I've left a little pile of comments and questions but most of it is small stuff at this point.

Member

@tgross tgross left a comment

Looks great! I've left a small number of suggestions and one last-triple-check question, but once those are resolved we can squash this and merge it!

// Context passed from client to close the cmd and exit the function
Context context.Context

bufSize int
Member

This is also unused, looks like. The callers are in other packages, so they can't see this field anyways. But we're not using it in test either.

Contributor Author

@tehut tehut Jul 31, 2025

I'm not super committed to keeping this in, but I wanted to try it as an exported field in light of the point you raised yesterday about the StreamFramer and StreamReader also having a configurable window size. I like how it's currently configured, where the producer pushes half as much data as each consumer is ready to process, but it seems like a dial folks might want to use down the road? Or does that just introduce another thing to keep track of, test, and validate?

Member

Works for me. 👍

This whole project really points to a need to revisit some of the plumbing for how uni-directional and bi-directional streaming is designed, as it was super painful to add a new endpoint. So having configuration knobs handy will be useful later if we ever find the time to set aside for that.

Member

@tgross tgross left a comment

LGTM! :shipit:

@tehut tehut merged commit d709acc into main Aug 1, 2025
39 checks passed
@tehut tehut deleted the f-NMD-855/monitor_external branch August 1, 2025 17:27
Labels: backport/1.10.x (backport to 1.10.x release line), theme/cli, theme/docs (Documentation issues and enhancements), type/enhancement