Description
Nomad version
Nomad v1.9.5
BuildDate 2025-01-14T18:35:12Z
Revision 0b7bb8b60758981dae2a78a0946742e09f8316f5+CHANGES
Issue
I am not entirely sure whether this is a legitimate limitation of Nomad + the nomad-device-nvidia plugin or an actual bug. According to the documentation at https://developer.hashicorp.com/nomad/docs/job-specification/device#multiple-nvidia-gpu, multiple GPUs are supported, but it is not specified whether those GPUs must be the same model and located on the same node, or whether the models can differ and only same-node placement matters. In our case we have 2 NVIDIA GPUs installed and available on one node. Requesting them specifically, like:
device "nvidia/gpu/NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition" {
count = 1
}
or
device "nvidia/gpu/NVIDIA RTX 5000 Ada Generation" {
count = 1
}
works without any issues, as does running a container manually on the node: nvidia-smi reports that both cards are visible and can be utilised. But requesting them as:
device "nvidia/gpu" {
count = 2
}
results in a placement failure.
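(For reference, these device blocks sit inside the task's resources stanza; the surrounding layout below is just the standard placement, only the device stanza itself is taken from our job.)
resources {
  device "nvidia/gpu/NVIDIA RTX 5000 Ada Generation" {
    count = 1
  }
}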
Reproduction steps
Have 2 NVIDIA GPUs that are correctly fingerprinted:
❯ nomad node status -json fc9077e8 | jq '.NodeResources.Devices'
[
{
"Attributes": {
"cores_clock": {
"Int": 210,
"Unit": "MHz"
},
"pci_bandwidth": {
"Int": 32768,
"Unit": "MB/s"
},
"driver_version": {
"String": "580.65.06",
"Unit": ""
},
"memory": {
"Int": 32760,
"Unit": "MiB"
},
"bar1": {
"Int": 256,
"Unit": "MiB"
},
"display_state": {
"String": "0",
"Unit": ""
},
"power": {
"Int": 14,
"Unit": "W"
},
"memory_clock": {
"Int": 405,
"Unit": "MHz"
},
"persistence_mode": {
"String": "0",
"Unit": ""
}
},
"Instances": [
{
"HealthDescription": "",
"Healthy": true,
"ID": "GPU-45bc2781-22da-689e-59d5-f3778161164f",
"Locality": {
"PciBusID": "00000000:00:1B.0\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
}
}
],
"Name": "NVIDIA RTX 5000 Ada Generation",
"Type": "gpu",
"Vendor": "nvidia"
},
{
"Attributes": {
"memory": {
"Int": 97887,
"Unit": "MiB"
},
"memory_clock": {
"Int": 405,
"Unit": "MHz"
},
"power": {
"Int": 8,
"Unit": "W"
},
"pci_bandwidth": {
"Int": 49152,
"Unit": "MB/s"
},
"cores_clock": {
"Int": 180,
"Unit": "MHz"
},
"bar1": {
"Int": 256,
"Unit": "MiB"
},
"persistence_mode": {
"String": "0",
"Unit": ""
},
"driver_version": {
"String": "580.65.06",
"Unit": ""
},
"display_state": {
"String": "0",
"Unit": ""
}
},
"Instances": [
{
"HealthDescription": "",
"Healthy": true,
"ID": "GPU-fb44165c-1a4f-a9dd-aa1b-30fd8c5658e3",
"Locality": {
"PciBusID": "00000000:00:10.0\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000"
}
}
],
"Name": "NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition",
"Type": "gpu",
"Vendor": "nvidia"
}
]
Create a job that has a device block set to just <vendor>/<type>, or just <vendor>, or just <type>, with count = 2 (a minimal full job sketch follows below):
device "nvidia/gpu" {
count = 2
}
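For completeness, a minimal job that reproduces the failure for us looks roughly like this (the job/group/task names and the image are illustrative placeholders, not our exact production job):
job "gpu-test" {
  group "gpu" {
    task "smi" {
      driver = "docker"

      config {
        image   = "nvidia/cuda:12.4.1-base-ubuntu22.04"
        command = "nvidia-smi"
      }

      resources {
        device "nvidia/gpu" {
          count = 2
        }
      }
    }
  }
}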
Expected Result
The job is evaluated, the scheduler sees that there are 2 GPUs on the node matching the <vendor>/<type>, <vendor>, or <type> combination, reserves those GPUs, and creates the container with the correct NVIDIA_VISIBLE_DEVICES environment variable.
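Concretely, given the two instances fingerprinted above, we would expect the container to receive something along the lines of:
NVIDIA_VISIBLE_DEVICES=GPU-45bc2781-22da-689e-59d5-f3778161164f,GPU-fb44165c-1a4f-a9dd-aa1b-30fd8c5658e3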
Actual Result
Evaluation + placement failure
I tried adding constraints pointing to one of the device IDs listed above, to the model, or even to resource attributes, but all of those resulted in a placement failure (see the sketch below for the kind of constraint we tried). As soon as count is reduced from 2 to 1, the job is evaluated without any problem onto one of those cards. I checked whether some process was locking a card and making it inaccessible, but no, nothing. We are on a recent version of nomad-device-nvidia (1.1.0), and we also compiled the code from the master branch (version reported as 1.2.0), but got pretty much the same result.
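To illustrate, this is the kind of constraint we experimented with inside the device block (the attribute and value here are an example, not our exact job):
device "nvidia/gpu" {
  count = 2

  constraint {
    attribute = "${device.attr.memory}"
    operator  = ">="
    value     = "30 GiB"
  }
}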
The question is: is this a legitimate limitation, with the documentation simply not stating that "multiple GPUs" means "multiple GPUs of the same model", or is this a bug? In case anyone wonders why we put different GPU models in the same container: we work heavily with ollama. A couple of our servers are configured with both more and less power-hungry cards on the same node. Since different models have different sizes, ollama can automatically choose which cards it can currently use and whether they are enough for the current model/task.