[Bug]: VFIO discovery advertises non-vfio GPUs and Unconfigure rebinds pre-bound GPUs #1089

@johnahull

Component: gpu-kubelet-plugin

Bug Description:

Four related bugs in the VFIO passthrough lifecycle prevent GPU passthrough to VMs on multi-GPU and NVLink systems:

  1. CDI spec missing /dev/vfio/vfio: GetCommonEdits only includes /dev/vfio/vfio when enableAPIDevice=true. Libvirt requires it to detect VFIO support regardless of the API device setting.

  2. VFIO discovery advertises non-vfio GPUs: enumerateGpuVfioDevices treats any GPU not on the nvidia driver as a VFIO candidate, including driverless GPUs (stuck after a failed unbind). The scheduler allocates them, and prepare fails or hangs.

  3. Unconfigure rebinds pre-bound GPUs: On H100 SXM5 with NVLink, Unconfigure tries to rebind vfio-pci GPUs back to nvidia, which hangs indefinitely during NVLink fabric reconfiguration. GPUs pre-bound to vfio-pci at boot (via vfio-pci.ids kernel cmdline) should stay on vfio-pci.

  4. Sysfs checks fail inside containers: checkVfioPCIModuleLoaded and checkIommuEnabled check /host-root/sys/, which doesn't expose host sysfs inside containers. The VfioPciManager fails to initialize even though vfio_pci and IOMMU are working on the host.

Steps to Reproduce:

  1. System with H100 SXM5 GPUs (NVLink), GPUs pre-bound to vfio-pci via vfio-pci.ids=10de:2330
  2. Deploy NVIDIA DRA driver with PassthroughSupport=true
  3. Create a ResourceClaim requesting a VFIO GPU
  4. Observe one of the failures described above, depending on which bug is hit

Expected Behavior:

VFIO prepare should succeed for GPUs already bound to vfio-pci. CDI spec should always include /dev/vfio/vfio. Only GPUs actually on vfio-pci should be advertised. Sysfs checks should work inside containers.

DRA Driver Version: v25.12.0
Kubernetes Version: v1.36.0
GPU Model: NVIDIA H100 SXM5 80GB HBM3
NVIDIA Driver Version: 595.58
OS / Kernel: Fedora 44, kernel 6.19.14
Container Runtime: containerd 2.2.3

Feature Gates: PassthroughSupport=true, DeviceMetadata=true

Status: In-Review