Add Disk Throttle Resource Limit Support #26392
Hi @kyounghunJang - I'm the tech writer who works with the Nomad team. Thank you very much for including excellent documentation with your feature code. I left some suggestions so that the new content follows our documentation style guide. Please feel free to tag me with any documentation questions.
@@ -0,0 +1,113 @@ | |||
--- | |||
layout: docs | |||
page_title: disk_throttles block in the job specification |
page_title: disk_throttles block in the job specification | |
page_title: disk_throttle block in the job specification |
changed to match the block name
layout: docs | ||
page_title: disk_throttles block in the job specification | ||
description: |- | ||
Configure disk I/O throttling limits in the `disk_throttles` block of the Nomad job specification. |
Configure disk I/O throttling limits in the `disk_throttles` block of the Nomad job specification. | |
Configure disk I/O throttling limits in the `disk_throttle` block of the Nomad job specification. |
|
||
<Placement groups={['job', 'group', 'task', 'resources', 'disk_throttle']} /> | ||
|
||
The disk_throttle block is used to set limits on the disk I/O (Input/Output) a task can perform on a specific block device. |
The disk_throttle block is used to set limits on the disk I/O (Input/Output) a task can perform on a specific block device. | |
Use the `disk_throttle` block to set limits on the disk I/O (Input/Output) a task can perform on a specific block device. |
- Block name in code backticks
- Use active voice
<Placement groups={['job', 'group', 'task', 'resources', 'disk_throttle']} /> | ||
|
||
The disk_throttle block is used to set limits on the disk I/O (Input/Output) a task can perform on a specific block device. | ||
This block helps mitigate the "noisy neighbor" problem, where a single task consuming excessive disk bandwidth can negatively impact other tasks running on the same node. |
This block helps mitigate the "noisy neighbor" problem, where a single task consuming excessive disk bandwidth can negatively impact other tasks running on the same node. | |
This block helps mitigate the noisy neighbor problem, where a single task consuming excessive disk bandwidth can negatively impact other tasks running on the same node. |
Nit: remove quotes around noisy neighbor
The disk_throttle block is used to set limits on the disk I/O (Input/Output) a task can perform on a specific block device. | ||
This block helps mitigate the "noisy neighbor" problem, where a single task consuming excessive disk bandwidth can negatively impact other tasks running on the same node. | ||
|
||
When a disk_throttle block is added, Nomad will limit the task's I/O throughput in bytes per second (BPS) or I/O operations per second (IOPS). |
When a disk_throttle block is added, Nomad will limit the task's I/O throughput in bytes per second (BPS) or I/O operations per second (IOPS). | |
When you add a `disk_throttle` block, Nomad limits the task's I/O throughput in bytes per second (BPS) or I/O operations per second (IOPS). |
} | ||
``` | ||
|
||
### Throttling Multiple Devices |
### Throttling Multiple Devices | |
### Throttling multiple devices |
|
||
```hcl | ||
resources { | ||
disk_throttles { |
disk_throttles { | |
disk_throttle { |
This block should be `disk_throttle`, correct?
@@ -109,6 +113,22 @@ resources { | |||
} | |||
} | |||
``` | |||
|
|||
### DiskThrottles |
### DiskThrottles | |
### Disk throttle |
@@ -167,3 +187,4 @@ resource utilization and considering the following suggestions: | |||
[numa]: /nomad/docs/job-specification/numa 'Nomad NUMA Job Specification' | |||
[`secrets/`]: /nomad/docs/reference/runtime-environment-settings#secrets | |||
[concepts-cpu]: /nomad/docs/architecture/cpu | |||
[disk_throttle]: /nomad/docs/job-specification/disk_throttle 'Nomad Disk_Throttle Job Specification' |
[disk_throttle]: /nomad/docs/job-specification/disk_throttle 'Nomad Disk_Throttle Job Specification' | |
[disk_throttle]: /nomad/docs/job-specification/disk_throttle |
You don't need to add a page title after the link. That pattern is from a much earlier version of the docs.
Remember that the `disk_throttle` block is only valid in the placements listed above. | ||
|
||
### Limiting Bandwidth (BPS) | ||
This example limits the read and write bandwidth of a specific device to 50 MB/s. |
Thank you for adding an explanation to each of the examples!
Hi @kyounghunJang and thanks for the PR! But I don't think we can accept this as-is.
As you can see from the resource structs you've edited, Nomad has legacy fields for managing disk IOPs and size, but they aren't currently in use. Disk IOPs tracking was never fully supported, and the field was deprecated in Nomad 0.9.0 (ref #4970). Disk capacity (`DiskMB`) was removed all the way back in Nomad 0.5.0 (ref #1679) and moved to the `ephemeral_disk` field. So there's a long history of not-quite-complete/working disk-related resource tracking. 😁
The core challenge with anything to do with disk resources is that they're platform dependent. The cgroups-based approach you have here for throttling doesn't work on non-Linux operating systems. Disk resources are also dependent on the task driver. Setting cgroups won't work for the `qemu` driver or the `libvirt` driver. Or if in the future someone wanted to add disk space constraints, it's infeasibly expensive to do that except on filesystems that support quotas (e.g., ZFS). We'll likely need to have a discussion about how to handle host volumes, dynamic volumes, and the alloc directory, all of which are bind-mounted (but only on supported task drivers!).
Also, the `resource` block generally represents schedulable resources. That is, resources that the scheduler should be comparing against the available resources on the node. Your implementation sets cgroups flags when the workload is placed, but there's no way for the scheduler to tell whether it's "using up" too much IOPs for a given node. And the per-device throttling option you've got only works if all the hosts have the same set of major:minor numbers on disks. We could start fingerprinting the IOPs per device and then accounting for that resource, but then that's only useful if every alloc is assigned a disk IO slice as well.
All of which is to say that this feature has a lot of complexity, and I'm not sure it's a good idea to try to work out the design incrementally in a pull request, rather than having a design discussion in the original #26295 issue. For a feature of this complexity, we'd typically have an internal Request for Comments (RFC) document and product management involvement as well.
We'd love to have your enthusiasm as part of that discussion in #26295 though!
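To make the scheduler-accounting concern above concrete, here is a minimal Go sketch of the kind of feasibility check a schedulable IOPs resource would need. The function name `fitsIOPS` and its parameters are hypothetical and are not part of Nomad or of this PR; the sketch assumes the node's IOPs capacity has been fingerprinted and that every existing alloc declares its share.

```go
package main

import "fmt"

// fitsIOPS is a hypothetical helper (not Nomad code). It reports whether a new
// request for `ask` IOPs fits on a node with `capacity` IOPs, given the IOPs
// already reserved by existing allocs.
func fitsIOPS(capacity uint64, allocated []uint64, ask uint64) bool {
	var used uint64
	for _, a := range allocated {
		used += a
	}
	// An alloc that declares no IOPs contributes 0 here, so it is invisible
	// to this check even if it saturates the disk at runtime.
	return used+ask <= capacity
}

func main() {
	existing := []uint64{4000, 5000}
	fmt.Println(fitsIOPS(10000, existing, 1000)) // fits: 9000 + 1000 <= 10000
	fmt.Println(fitsIOPS(10000, existing, 2000)) // does not fit: 11000 > 10000
}
```

The check is only meaningful when every alloc carries an IOPs value, which is the limitation the comment points out.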
Description
This PR adds support for disk I/O throttling using cgroups in Nomad task drivers.
The implementation lets users define per-device I/O bandwidth limits in the job specification.
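As a rough illustration of what a per-device limit looks like at the cgroup level, the sketch below renders one entry in the cgroup v2 `io.max` format (`MAJ:MIN rbps=... wbps=... riops=... wiops=...`). The `DiskThrottle` struct and `IOMaxLine` method are hypothetical names chosen for this example and may not match the PR's implementation, which could also target cgroup v1 files instead.

```go
package main

import (
	"fmt"
	"strconv"
)

// DiskThrottle is a hypothetical type for illustration only; it is not the
// struct added by this PR. A zero limit means "no limit".
type DiskThrottle struct {
	Major, Minor        uint32
	ReadBps, WriteBps   uint64
	ReadIops, WriteIops uint64
}

// IOMaxLine renders the throttle as one line in the cgroup v2 io.max format,
// writing "max" for any limit that is unset.
func (d DiskThrottle) IOMaxLine() string {
	val := func(v uint64) string {
		if v == 0 {
			return "max"
		}
		return strconv.FormatUint(v, 10)
	}
	return fmt.Sprintf("%d:%d rbps=%s wbps=%s riops=%s wiops=%s",
		d.Major, d.Minor,
		val(d.ReadBps), val(d.WriteBps), val(d.ReadIops), val(d.WriteIops))
}

func main() {
	// 50 MiB/s read and write on device 8:0 (assumed here to be the target disk).
	t := DiskThrottle{Major: 8, Minor: 0, ReadBps: 50 * 1024 * 1024, WriteBps: 50 * 1024 * 1024}
	fmt.Println(t.IOMaxLine())
	// prints: 8:0 rbps=52428800 wbps=52428800 riops=max wiops=max
}
```

Because the entry is keyed by major:minor number, the same job specification only applies cleanly on hosts where the target disk has the same device numbers.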
Testing & Reproduction steps
Links
#26295