Add PCI hotplug support#5786

Merged
ilstam merged 15 commits into
firecracker-microvm:mainfrom
ilstam:hotplug
May 1, 2026
@ilstam ilstam commented Mar 23, 2026

Contributor

Add PCI hotplug support. No hotplug notification mechanism is implemented yet, so the guest needs to rescan the PCI "bus" manually in order to see new attachments.
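
For illustration, a Linux guest can trigger such a manual rescan through sysfs (`echo 1 > /sys/bus/pci/rescan`, run as root). A minimal Python sketch of the same step follows; the helper name `rescan_pci_bus` is hypothetical and not part of this PR.

```python
from pathlib import Path

# Standard Linux sysfs trigger for re-enumerating all PCI buses.
PCI_RESCAN_PATH = Path("/sys/bus/pci/rescan")


def rescan_pci_bus(rescan_path: Path = PCI_RESCAN_PATH) -> None:
    """Ask the guest kernel to rescan PCI (hypothetical helper).

    Writing "1" to the rescan file requires root inside the guest. The
    path is a parameter only so the helper can be exercised against a
    scratch file.
    """
    rescan_path.write_text("1")
```

After the write returns, newly attached devices should be visible to the guest, for example under `/sys/bus/pci/devices`.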

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

codecov Bot commented Mar 23, 2026

Codecov Report

❌ Patch coverage is 69.12442% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.80%. Comparing base (42066df) to head (fec4905).
⚠️ Report is 15 commits behind head on main.

Files with missing lines                                Patch %   Lines
src/vmm/src/rpc_interface.rs                            19.35%    25 Missing ⚠️
src/firecracker/src/api_server_adapter.rs               0.00%     23 Missing ⚠️
.../firecracker/src/api_server/request/hotplug/mod.rs   0.00%     10 Missing ⚠️
src/firecracker/src/api_server/parsed_request.rs        0.00%     7 Missing ⚠️
src/vmm/src/device_manager/mod.rs                       98.16%    2 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5786      +/-   ##
==========================================
- Coverage   82.82%   82.80%   -0.02%     
==========================================
  Files         276      277       +1     
  Lines       29691    29889     +198     
==========================================
+ Hits        24591    24751     +160     
- Misses       5100     5138      +38     

Flag                   Coverage            Δ
5.10-m5n.metal         83.10% <69.12%>     (-0.01%) ⬇️
5.10-m6a.metal         82.43% <69.12%>     (-0.02%) ⬇️
5.10-m6g.metal         79.73% <69.12%>     (+<0.01%) ⬆️
5.10-m6i.metal         83.09% <69.12%>     (-0.02%) ⬇️
5.10-m7a.metal-48xl    82.42% <69.12%>     (-0.02%) ⬇️
5.10-m7g.metal         79.73% <69.12%>     (+<0.01%) ⬆️
5.10-m7i.metal-24xl    83.07% <69.12%>     (-0.02%) ⬇️
5.10-m7i.metal-48xl    83.07% <69.12%>     (-0.02%) ⬇️
5.10-m8g.metal-24xl    79.73% <69.12%>     (+0.01%) ⬆️
5.10-m8g.metal-48xl    79.73% <69.12%>     (+<0.01%) ⬆️
5.10-m8i.metal-48xl    83.07% <69.12%>     (-0.02%) ⬇️
5.10-m8i.metal-96xl    83.07% <69.12%>     (-0.03%) ⬇️
6.1-m5n.metal          83.13% <69.12%>     (-0.01%) ⬇️
6.1-m6a.metal          82.46% <69.12%>     (-0.01%) ⬇️
6.1-m6g.metal          79.73% <69.12%>     (+<0.01%) ⬆️
6.1-m6i.metal          83.13% <69.12%>     (-0.01%) ⬇️
6.1-m7a.metal-48xl     82.45% <69.12%>     (-0.01%) ⬇️
6.1-m7g.metal          79.73% <69.12%>     (+<0.01%) ⬆️
6.1-m7i.metal-24xl     83.14% <69.12%>     (-0.02%) ⬇️
6.1-m7i.metal-48xl     83.14% <69.12%>     (-0.02%) ⬇️
6.1-m8g.metal-24xl     79.73% <69.12%>     (+<0.01%) ⬆️
6.1-m8g.metal-48xl     79.73% <69.12%>     (+<0.01%) ⬆️
6.1-m8i.metal-48xl     83.14% <69.12%>     (-0.02%) ⬇️
6.1-m8i.metal-96xl     83.15% <69.12%>     (-0.01%) ⬇️


@ilstam ilstam force-pushed the hotplug branch 2 times, most recently from 0fcdd09 to cdcf610 Compare March 31, 2026 18:06

@ShadowCurse ShadowCurse left a comment

Contributor

Overall LGTM

Comment thread src/vmm/src/devices/virtio/pmem/device.rs Outdated
Comment thread src/vmm/src/device_manager/pci_mngr.rs Outdated

@Manciukic Manciukic left a comment

Contributor

mostly LGTM, just a few things here and there. I'm happy to take some of the items (like extending test coverage) to a separate PR. We should also update documentation, but happy to take that on another PR.

Comment thread tests/integration_tests/functional/test_api.py
Comment thread src/vmm/src/device_manager/pci_mngr.rs Outdated
Comment thread resources/seccomp/aarch64-unknown-linux-musl.json
Comment thread tests/integration_tests/functional/test_hotplug.py
Comment thread src/vmm/src/device_manager/pci_mngr.rs Outdated
Comment thread src/vmm/src/device_manager/pci_mngr.rs Outdated
Comment thread resources/seccomp/aarch64-unknown-linux-musl.json
Comment thread tests/integration_tests/functional/test_hotplug.py Outdated
Comment thread tests/integration_tests/functional/test_hotplug.py Outdated
@ilstam ilstam force-pushed the hotplug branch 3 times, most recently from edfdb0d to b878e82 Compare April 29, 2026 15:48
ilstam commented Apr 29, 2026

Contributor Author

> I'm happy to take some of the items (like extending test coverage) to a separate PR. We should also update documentation, but happy to take that on another PR.

I extended the coverage in this PR, but will update documentation and CHANGELOG in a different PR after we get this merged.

@ilstam ilstam force-pushed the hotplug branch 2 times, most recently from 33c4989 to 70ad97e Compare April 29, 2026 17:59
@ilstam ilstam added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Apr 29, 2026
Manciukic previously approved these changes Apr 30, 2026

@Manciukic Manciukic left a comment

Contributor

LGTM!

Comment thread tests/integration_tests/functional/test_hotplug.py
Comment thread tests/integration_tests/functional/test_hotplug.py
ShadowCurse previously approved these changes Apr 30, 2026

@ShadowCurse ShadowCurse left a comment

Contributor

Overall LGTM

ilstam added 13 commits May 1, 2026 14:39
ApiServerAdapter has a handle_request() method for dispatching requests
received from the API thread to the right handler. This method is called
from inside EventManager::run() which takes &mut self as an argument.
This can be problematic if we want to modify the EventManager object
from inside handle_request().

Work around that by having ApiServerAdapter::process() store the request
but not call handle_request(). EventManager::run() returns after
handling each event. Call handle_request() after EventManager::run() in
the event loop.

In a subsequent patch, handle_request() will take a &mut EventManager as
an argument.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

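The store-then-handle control flow described in this commit message can be sketched as follows. This is a toy Python model, not the Rust code from the patch: the class and method names mirror the commit message, but `add_subscriber`, `pending_events`, `HotpluggedDevice` and `event_loop_iteration` are hypothetical, and `handle_request()` already takes the event manager as an argument, as the following patch proposes.

```python
class EventManager:
    """Toy stand-in for the real EventManager (hypothetical API)."""

    def __init__(self):
        self.subscribers = []
        self.pending_events = []

    def add_subscriber(self, subscriber):
        self.subscribers.append(subscriber)

    def run(self):
        # Dispatch at most one event, then return to the caller.
        if not self.pending_events:
            return
        event = self.pending_events.pop(0)
        for subscriber in self.subscribers:
            subscriber.process(event)


class HotpluggedDevice:
    def __init__(self, name):
        self.name = name

    def process(self, event):
        pass  # a real device would react to its own events here


class ApiServerAdapter:
    def __init__(self):
        self.stored_request = None

    def process(self, event):
        # Runs inside EventManager.run(): only remember the request.
        self.stored_request = event

    def handle_request(self, event_manager):
        # Runs after run() has returned, so it is free to change the
        # event manager, e.g. to register a newly created device.
        request, self.stored_request = self.stored_request, None
        if request is not None:
            event_manager.add_subscriber(HotpluggedDevice(request))


def event_loop_iteration(event_manager, adapter):
    event_manager.run()
    adapter.handle_request(event_manager)
```

Python has no borrow checker, so the sketch only shows the ordering: the request is recorded during dispatch and acted on once `run()` is no longer holding the event manager.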
The handle_request() function will soon need access to the EventManager
in order to add and remove objects to support device hot-plugging.
Pass a mutable reference to the EventManager to handle_request() in
preparation for that.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

The EntropyDevice and PmemDevice error names are inconsistent with the
error names used for all other devices, all of which use the Config
suffix. Rename them to EntropyConfig and PmemConfig for consistency.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

Define VirtioDeviceId as (VirtioDeviceType, String) and use it in the
PCI and MMIO device manager HashMaps instead of the raw tuple. This will
be used in more places in subsequent patches and having an ID type
increases readability.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

Until now, Firecracker has rejected device attach API requests once the
VM has started.

Add hot-plugging for Block, Pmem and Net PCIe devices. This enables the
relevant API calls and attaches the device to the PCIe "bus".

No notification is currently delivered to the guest when a new device is
added, so the guest has to rescan the bus manually in order to detect
it.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

PciBus::add_device() silently overwrites the previous device if called
with a device_id that is already occupied. Add an assert to catch this
as a programming error.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

add_device() inserts a device into the PCI bus but does not mark the
corresponding slot in the device_ids bitmap. This is ok without device
hotplugging support because next_device_id() is never called after
restore. However, with the upcoming device hotplugging support this is
going to be a problem.

The device_ids vector is set to all False after restore. Therefore if we
try to hotplug a device after a snapshot restore then next_device_id()
will return a slot that is not really free.

Fix this by marking the slot as non-free in add_device(). After snapshot
restore add_device() will be called for every device saved in the
snapshot and the device_ids vector will be updated accordingly.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

Wire up the HTTP DELETE method for device hot-unplug. Add the
HotUnplugDevice VmmAction variant, route it through the pre-boot/runtime
controllers, and define the hot_unplug_device() stub which will be
implemented in a subsequent patch.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

Implement the actual device teardown: remove the device from the virtio
devices map, the MMIO bus, the PCI bus, and the event manager. Also add
PciBus::remove_device() to free the PCI device slot.

No notification is currently delivered to the guest when a device is
removed.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

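The PciBus slot bookkeeping described in the last few commit messages (assert on an occupied slot, mark the slot in add_device(), free it in remove_device()) can be modelled roughly as below. This is a Python sketch under stated assumptions, not the Rust implementation: the 32-slot limit, the dictionary storage and the return value of `next_device_id()` are assumptions made for illustration.

```python
class PciBus:
    """Toy model of the slot bookkeeping; not Firecracker's Rust code."""

    NUM_SLOTS = 32  # assumption: one PCI bus exposes 32 device slots

    def __init__(self):
        self.devices = {}  # device_id -> device
        self.device_ids = [False] * self.NUM_SLOTS  # True means occupied

    def next_device_id(self):
        """Return the lowest free slot, or None when the bus is full."""
        for device_id, used in enumerate(self.device_ids):
            if not used:
                return device_id
        return None

    def add_device(self, device_id, device):
        # Overwriting an occupied slot is treated as a programming error.
        assert device_id not in self.devices, "slot already occupied"
        self.devices[device_id] = device
        # Mark the slot so next_device_id() skips it, including after a
        # snapshot restore that re-adds devices through this method.
        self.device_ids[device_id] = True

    def remove_device(self, device_id):
        """Drop the device and free its slot for a later hotplug."""
        device = self.devices.pop(device_id)
        self.device_ids[device_id] = False
        return device
```

In this model, re-adding the saved devices after a restore is enough to keep `next_device_id()` from handing out a slot that is already in use, which is the failure mode the add_device() fix describes.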
The test generates bad syscall arguments by adding 1000000 to the
allowed value, assuming this produces a value that will be rejected.
This breaks for masked_eq operations: e.g. with mask=4 and val=0
(PROT_EXEC must not be set), (0 + 1000000) & 4 is still 0, so the
bad value passes the check.

Use XOR with the mask instead, which flips the relevant bit and
guarantees a violation regardless of whether the rule expects the
bit set or clear.

A subsequent patch will add seccomp rules that use the 'masked_eq'
operation.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

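The arithmetic behind this fix can be checked with a short Python sketch. The function names are hypothetical and do not come from the test file; the only assumption about `masked_eq` is the semantics stated in the commit message, namely that an argument passes when `(arg & mask) == val`.

```python
PROT_EXEC = 0x4  # the mask=4 example from the commit message


def masked_eq_allows(arg: int, mask: int, val: int) -> bool:
    """A masked_eq rule accepts arg when (arg & mask) == val."""
    return (arg & mask) == val


def bad_arg_by_offset(val: int) -> int:
    # Old approach: shift the allowed value by a large constant.
    return val + 1000000


def bad_arg_by_xor(val: int, mask: int) -> int:
    # New approach: flip every bit that the mask inspects.
    return val ^ mask


# 1000000 is 0xF4240, whose bit 2 is clear, so the "bad" value still
# satisfies a rule that requires PROT_EXEC to be unset.
offset_still_passes = masked_eq_allows(bad_arg_by_offset(0), PROT_EXEC, 0)

# XOR breaks the rule whether it expects the bit clear or set.
xor_breaks_clear = not masked_eq_allows(bad_arg_by_xor(0, PROT_EXEC), PROT_EXEC, 0)
xor_breaks_set = not masked_eq_allows(
    bad_arg_by_xor(PROT_EXEC, PROT_EXEC), PROT_EXEC, PROT_EXEC
)
```

The XOR variant works for any non-zero mask because `(val ^ mask) & mask` differs from `val & mask` in every masked bit.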
The device hotplug path creates devices after boot, when seccomp
filters are already active. This requires allowing syscalls that were
previously only called during boot before seccomp was installed.

Add the following to the vmm thread filter:

- timerfd_create: RateLimiter creates a TimerFd for each new device
- ioctl(KVM_IOEVENTFD): registering ioeventfds for virtqueue
  notification
- ioctl(TUNSETIFF): opening tap device for net hotplug
- ioctl(TUNSETOFFLOAD): configuring tap offload for net hotplug
- ioctl(TUNSETVNETHDRSZ): setting vnet header size for net hotplug
- mmap(MAP_SHARED|MAP_NORESERVE, !PROT_EXEC): pmem backing file mapping
- mmap(MAP_PRIVATE|MAP_NORESERVE|MAP_ANONYMOUS, !PROT_EXEC): pmem
  aligned region
- mmap(MAP_SHARED|MAP_NORESERVE|MAP_FIXED, !PROT_EXEC): pmem file
  overlay mapping
- mmap(MAP_SHARED|MAP_FIXED, PROT_READ|PROT_WRITE): IovDeque ring
  buffer for net device
- memfd_create(MFD_CLOEXEC|MFD_ALLOW_SEALING): IovDeque ring buffer
  for net device
- fcntl(F_ADD_SEALS): IovDeque memfd sealing for net device

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

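As a rough reading aid for the mmap entries in the list above, the Python sketch below models one way such rules could be evaluated. It is an assumption-laden illustration: it treats the flags argument as an exact match and "!PROT_EXEC" as a masked comparison on the protection argument, which the commit message does not spell out, and it does not reproduce the JSON filter format. The constants are the Linux values for x86_64 and aarch64.

```python
# Linux mmap constants (x86_64 and aarch64 share these values).
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
MAP_SHARED, MAP_PRIVATE = 0x01, 0x02
MAP_FIXED, MAP_ANONYMOUS, MAP_NORESERVE = 0x10, 0x20, 0x4000

ALL_BITS = 0xFFFFFFFF  # mask that turns a masked comparison into equality

# Each rule is (flags, prot_mask, prot_val). A call is allowed when the
# flags match exactly and (prot & prot_mask) == prot_val.
MMAP_RULES = [
    (MAP_SHARED | MAP_NORESERVE, PROT_EXEC, 0),
    (MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS, PROT_EXEC, 0),
    (MAP_SHARED | MAP_NORESERVE | MAP_FIXED, PROT_EXEC, 0),
    (MAP_SHARED | MAP_FIXED, ALL_BITS, PROT_READ | PROT_WRITE),
]


def mmap_allowed(prot: int, flags: int) -> bool:
    """Return True if any modelled rule accepts this prot/flags pair."""
    return any(
        flags == rule_flags and (prot & mask) == val
        for rule_flags, mask, val in MMAP_RULES
    )
```

Under this model a read-write pmem mapping with `MAP_SHARED | MAP_NORESERVE` is accepted, while the same flags with `PROT_EXEC` set are rejected.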
Add unit tests for testing the possible failure modes of hotplugging
different classes of devices.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

Extend the existing hotplug unit tests to also verify device hot-unplug
and add new tests to verify that hot-unplugging root devices is
rejected.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

ilstam added 2 commits May 1, 2026 14:39
Add integration tests for block, pmem and net hotplugging.

The tests currently require a manual PCI bus rescan since no hotplug
notification mechanism is implemented yet.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

Add a delete() method to the test framework HTTP API client and extend
the hotplug tests to verify device hot-unplug. Also extend the
test_hotplug_max_devices() test to unplug all devices, verify that the
slots are freed, and plug them again.

Signed-off-by: Ilias Stamatis <ilstam@amazon.com>

@ilstam ilstam dismissed stale reviews from ShadowCurse and Manciukic via fec4905 May 1, 2026 13:39
@ilstam ilstam merged commit 3cf3a58 into firecracker-microvm:main May 1, 2026
6 of 7 checks passed
@ilstam ilstam deleted the hotplug branch May 1, 2026 15:30
4 participants