fix(aarch64): override fabricated CLIDR_EL1 to match host cache topology #5780

Merged
kalyazin merged 1 commit into firecracker-microvm:main from kalyazin:arm_topo on Mar 24, 2026
Conversation

@kalyazin kalyazin commented Mar 20, 2026

Contributor

Since host kernel 6.3 (commit 7af0c2534f4c), KVM fabricates CLIDR_EL1 instead of passing through the host's real value. On hosts with IDC=1 and DIC=0 (e.g. Neoverse V1), the fabricated CLIDR exposes only L1=Unified when the host actually has separate L1d+L1i, L2, and L3.
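For reference, the fields mentioned here can be decoded as in the sketch below. The bit positions follow the Arm architecture definition of CLIDR_EL1 (Ctype<n> in bits [3n-1:3n-3], LoC in bits [26:24]). The enum, function, and constant names are illustrative and are not taken from Firecracker or KVM, and both sample values are hand-built to resemble the two topologies described above rather than captured from a real host.

```rust
/// Cache type encodings of a CLIDR_EL1 Ctype<n> field.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Ctype {
    NoCache,
    Instruction,
    Data,
    Separate,
    Unified,
    Reserved,
}

/// Decode Ctype<level> for level in 1..=7 (bits [3*level-1 : 3*level-3]).
pub fn ctype(clidr: u64, level: u32) -> Ctype {
    assert!((1..=7).contains(&level), "CLIDR_EL1 describes levels 1..=7");
    match (clidr >> (3 * (level - 1))) & 0b111 {
        0b000 => Ctype::NoCache,
        0b001 => Ctype::Instruction,
        0b010 => Ctype::Data,
        0b011 => Ctype::Separate,
        0b100 => Ctype::Unified,
        _ => Ctype::Reserved,
    }
}

/// Level of Coherence, bits [26:24].
pub fn loc(clidr: u64) -> u64 {
    (clidr >> 24) & 0b111
}

/// Hand-built example: only L1 present, unified, LoC = 1.
pub const L1_UNIFIED_ONLY: u64 = 0b100 | (1 << 24);

/// Hand-built example: separate L1d+L1i, unified L2 and L3, LoC = 3.
pub const SEPARATE_L1_L2_L3: u64 = 0b011 | (0b100 << 3) | (0b100 << 6) | (3 << 24);
```

Decoding `L1_UNIFIED_ONLY` yields a single unified level, while `SEPARATE_L1_L2_L3` yields three levels with a split L1, which is the gap between the fabricated and the real topology that this PR addresses.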

Guest kernels >= 6.1.156 carry a backport of init_of_cache_level(), which counts cache leaves from the DT, while populate_cache_leaves() uses CLIDR_EL1. When the DT (built from host sysfs) describes more cache entries than CLIDR_EL1 does, the mismatch prevents the cache sysfs entries from being created, breaking /sys/devices/system/cpu/cpu*/cache/* in the guest.
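The mismatch can be illustrated with a small sketch that counts leaves both ways. This is not the guest kernel's code: `leaves_from_clidr` is only loosely modeled on how a CLIDR-based walk counts leaves (a level with separate instruction and data caches counts as two), and `DtCache`, `leaves_from_dt`, and `topology_consistent` are hypothetical stand-ins for the device-tree side.

```rust
/// Count the cache leaves implied by a CLIDR_EL1 value: walk Ctype<n> from
/// L1 upwards, stop at the first level with no cache, and count a level
/// with separate instruction and data caches as two leaves.
pub fn leaves_from_clidr(clidr: u64) -> usize {
    let mut leaves = 0;
    for level in 0..7 {
        match (clidr >> (3 * level)) & 0b111 {
            0b000 => break,       // no cache at this level: stop walking
            0b011 => leaves += 2, // separate I + D caches
            _ => leaves += 1,     // instruction-only, data-only or unified
        }
    }
    leaves
}

/// Hypothetical stand-in for one cache entry described in the device tree.
pub struct DtCache {
    pub level: u8,
    pub kind: &'static str,
}

/// In this sketch every device-tree cache entry is one leaf.
pub fn leaves_from_dt(caches: &[DtCache]) -> usize {
    caches.len()
}

/// True when both sources agree on the number of cache leaves.
pub fn topology_consistent(clidr: u64, caches: &[DtCache]) -> bool {
    leaves_from_clidr(clidr) == leaves_from_dt(caches)
}
```

With a CLIDR_EL1 that reports only a unified L1 (one leaf) and a DT that lists L1d, L1i, L2, and L3 (four leaves), `topology_consistent` returns false, which corresponds to the situation described above.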

Fix this by reading the current CLIDR_EL1 from vCPU 0, merging in the ctype and LoC fields derived from the host's sysfs cache topology, and writing the result back to each vCPU via KVM_SET_ONE_REG. Fields that cannot be derived from sysfs (LoUU, LoUIS, ICB, Ttype) are preserved from the original CLIDR_EL1. This makes CLIDR_EL1 consistent with the FDT, which already describes the real host caches.
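A minimal sketch of the merge step, under stated assumptions, might look as follows. It is not the code added by this PR: `merge_clidr` and the `(level, type)` tuple input are hypothetical, the type strings are the ones Linux exposes in the sysfs cache `type` attribute ("Data", "Instruction", "Unified"), and taking LoC as the highest cache level present is an assumption made for illustration.

```rust
/// Ctype1..Ctype7 occupy bits [20:0] of CLIDR_EL1.
const CTYPE_MASK: u64 = (1 << 21) - 1;
/// LoC occupies bits [26:24].
const LOC_SHIFT: u32 = 24;
const LOC_MASK: u64 = 0b111 << LOC_SHIFT;

/// Merge host cache entries into `current`, replacing only the Ctype and
/// LoC fields. All other bits (LoUIS, LoUU, ICB, Ttype) are preserved.
///
/// `entries` holds hypothetical `(level, type)` pairs as read from sysfs.
/// ASSUMPTION: LoC is set to the highest cache level present.
pub fn merge_clidr(current: u64, entries: &[(u8, &str)]) -> u64 {
    let mut ctypes = 0u64;
    let mut loc = 0u64;
    for &(level, kind) in entries {
        if !(1..=7).contains(&level) {
            continue; // CLIDR_EL1 can only describe levels 1..=7
        }
        // Instruction (0b001) OR Data (0b010) at one level gives Separate (0b011).
        let bits: u64 = match kind {
            "Instruction" => 0b001,
            "Data" => 0b010,
            "Unified" => 0b100,
            _ => continue, // unknown type: ignore the entry
        };
        ctypes |= bits << (3 * (u32::from(level) - 1));
        loc = loc.max(u64::from(level));
    }
    (current & !(CTYPE_MASK | LOC_MASK)) | ctypes | (loc << LOC_SHIFT)
}
```

For example, merging the entries `[(1, "Data"), (1, "Instruction"), (2, "Unified"), (3, "Unified")]` into a value that reports only a unified L1 produces Ctype1=Separate, Ctype2=Ctype3=Unified, and LoC=3, and leaves every bit outside those two fields as it was. The result would then be written to each vCPU with KVM_SET_ONE_REG, as the description says.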

On pre-6.3 kernels, KVM passes through the real host CLIDR rather than fabricating one. Since the sysfs cache topology already matches the real CLIDR, the merge produces the same value, the write is skipped, and the override is effectively a no-op.
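The skip-when-equal behaviour can be sketched as below. `write_if_changed` is a hypothetical helper, and the closure stands in for the per-vCPU register write (KVM_SET_ONE_REG in the description); the PR's actual control flow may differ.

```rust
/// Issue `write(desired)` only when `desired` differs from `current`.
/// Returns Ok(true) if a write was issued and Ok(false) if it was skipped.
pub fn write_if_changed<E>(
    current: u64,
    desired: u64,
    mut write: impl FnMut(u64) -> Result<(), E>,
) -> Result<bool, E> {
    if current == desired {
        // The merged value already matches the register: nothing to do.
        return Ok(false);
    }
    write(desired)?;
    Ok(true)
}
```

When the register already holds the real host value, the merged value is identical and the closure is never called. When it holds a fabricated value that differs, the closure is called once with the merged value.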

This approach preserves the full host cache information for the guest rather than stripping the FDT to match the fabricated CLIDR.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkbuild --all to verify that the PR passes
    build checks on all supported architectures.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

codecov Bot commented Mar 20, 2026


Codecov Report

❌ Patch coverage is 84.61538% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.02%. Comparing base (dc84e40) to head (587e8f1).
⚠️ Report is 1 commit behind head on main.

Files with missing lines                  Patch %   Lines
src/vmm/src/arch/aarch64/mod.rs           77.41%    7 Missing ⚠️
src/vmm/src/arch/aarch64/cache_info.rs    89.36%    5 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5780   +/-   ##
=======================================
  Coverage   83.02%   83.02%           
=======================================
  Files         276      276           
  Lines       29340    29418   +78     
=======================================
+ Hits        24359    24425   +66     
- Misses       4981     4993   +12     
Flag Coverage Δ
5.10-m5n.metal 83.30% <ø> (+<0.01%) ⬆️
5.10-m6a.metal 82.62% <ø> (ø)
5.10-m6g.metal 80.03% <84.61%> (+0.01%) ⬆️
5.10-m6i.metal 83.30% <ø> (ø)
5.10-m7a.metal-48xl 82.62% <ø> (-0.01%) ⬇️
5.10-m7g.metal 80.03% <84.61%> (+0.01%) ⬆️
5.10-m7i.metal-24xl 83.27% <ø> (-0.01%) ⬇️
5.10-m7i.metal-48xl 83.27% <ø> (-0.01%) ⬇️
5.10-m8g.metal-24xl 80.03% <84.61%> (+0.01%) ⬆️
5.10-m8g.metal-48xl 80.03% <84.61%> (+0.01%) ⬆️
5.10-m8i.metal-48xl 83.27% <ø> (ø)
5.10-m8i.metal-96xl 83.27% <ø> (ø)
6.1-m5n.metal 83.33% <ø> (+<0.01%) ⬆️
6.1-m6a.metal 82.65% <ø> (+<0.01%) ⬆️
6.1-m6g.metal 80.03% <84.61%> (+0.01%) ⬆️
6.1-m6i.metal 83.32% <ø> (ø)
6.1-m7a.metal-48xl 82.64% <ø> (ø)
6.1-m7g.metal 80.03% <84.61%> (+0.01%) ⬆️
6.1-m7i.metal-24xl 83.34% <ø> (+<0.01%) ⬆️
6.1-m7i.metal-48xl 83.34% <ø> (-0.01%) ⬇️
6.1-m8g.metal-24xl 80.03% <84.61%> (+0.01%) ⬆️
6.1-m8g.metal-48xl 80.03% <84.61%> (+0.01%) ⬆️
6.1-m8i.metal-48xl 83.34% <ø> (ø)
6.1-m8i.metal-96xl 83.34% <ø> (-0.01%) ⬇️


@kalyazin kalyazin force-pushed the arm_topo branch 2 times, most recently from 3f6742b to 4184864 Compare March 20, 2026 17:32
@kalyazin kalyazin marked this pull request as ready for review March 20, 2026 18:00
@kalyazin kalyazin requested review from Manciukic and pb8o as code owners March 20, 2026 18:00
@kalyazin kalyazin self-assigned this Mar 20, 2026
@kalyazin kalyazin added the "Status: Awaiting review" label (indicates that a pull request is ready to be reviewed) Mar 20, 2026
Comment thread src/vmm/src/arch/aarch64/cache_info.rs
Comment thread src/vmm/src/arch/aarch64/cache_info.rs
Comment thread src/vmm/src/arch/aarch64/mod.rs
@kalyazin kalyazin force-pushed the arm_topo branch 2 times, most recently from af73c49 to 1f2e62b Compare March 23, 2026 12:20
Comment thread src/vmm/src/arch/aarch64/cache_info.rs
Comment thread src/vmm/src/arch/aarch64/mod.rs Outdated
Comment thread src/vmm/src/arch/aarch64/mod.rs Outdated
Comment thread src/vmm/src/arch/aarch64/mod.rs Outdated
ShadowCurse previously approved these changes Mar 23, 2026
JamesC1305 previously approved these changes Mar 23, 2026

@Manciukic Manciukic left a comment

Contributor

LGTM, just one nit: we could print a warning if we fail to read the host cache topology.

Comment thread src/vmm/src/arch/aarch64/mod.rs
Signed-off-by: Nikita Kalyazin <kalyazin@amazon.com>
@kalyazin kalyazin enabled auto-merge (rebase) March 24, 2026 15:54
@kalyazin kalyazin merged commit b5ac3a6 into firecracker-microvm:main Mar 24, 2026
7 checks passed
@kalyazin kalyazin deleted the arm_topo branch March 24, 2026 15:54
Labels

Status: Awaiting review Indicates that a pull request is ready to be reviewed

4 participants