Add ARM64 (AArch64) benchmark coverage (ASV runner) to monitor perf regressions and gaps (incl. Kunpeng) #64454

@113xiaoji

Research

  • I have searched the [pandas] tag on StackOverflow for similar questions.

  • I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

None

Question about pandas

Hi pandas maintainers,

I’m working on improving Python data ecosystem performance on ARM64 servers (Huawei Kunpeng, also Ampere/Neoverse/Graviton-class). I’d like to ask about pandas’ approach to ongoing ARM64 performance monitoring and whether the project would welcome additional ARM64 benchmarking coverage.

Why this matters

Performance changes can affect architectures differently. If benchmarks run regularly only on x86, ARM64 regressions or widening ARM64-vs-x86 gaps can slip in as the codebase evolves, even when changes are well-intentioned.

pandas already uses ASV and has runner automation, so adding ARM64 could be a natural extension to help:

  • catch ARM64 regressions early (nightly/periodic),
  • understand cross-arch impact (x86 improves, ARM64 regresses, or vice versa),
  • provide reproducible data for optimization work and PR reviews.

Questions for maintainers

  1. Does pandas currently run ASV benchmarks on ARM64 on a regular schedule? If yes, where are the results published?
  2. If not, would you be open to adding an ARM64 runner (or accepting externally-run ARM64 results)?
  3. What is the preferred workflow for adding a new benchmark environment?
    • via existing runner infrastructure (if applicable),
    • or via a separate community-managed runner that publishes a dashboard?
  4. Would maintainers value tracking ARM64 vs x86 ratios (to detect widening gaps), or just per-arch regressions?
  5. Are there priority benchmark groups you’d like to ensure are covered on ARM64 (groupby, merge/join, indexing, IO, strings, datetime, etc.)?
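
To make question 4 concrete, here is a minimal sketch of what ratio tracking could look like. It assumes per-benchmark timings are available as plain name-to-seconds mappings; the function names, the benchmark names, the numbers, and the 5% tolerance are all hypothetical, and this is not ASV's own result format or API.

```python
def arch_ratios(arm64_times, x86_times):
    """Return {benchmark: arm64_time / x86_time} for benchmarks present on both."""
    common = arm64_times.keys() & x86_times.keys()
    return {
        name: arm64_times[name] / x86_times[name]
        for name in sorted(common)
        if x86_times[name] > 0  # skip benchmarks with no usable x86 timing
    }


def widening_gaps(prev_ratios, curr_ratios, tolerance=0.05):
    """Return benchmarks whose ARM64/x86 ratio grew by more than `tolerance`.

    `tolerance` is relative (0.05 means the ratio grew by more than 5%).
    """
    return {
        name: (prev_ratios[name], curr_ratios[name])
        for name in curr_ratios
        if name in prev_ratios
        and curr_ratios[name] > prev_ratios[name] * (1 + tolerance)
    }


# Made-up timings (seconds) for two hypothetical benchmarks on two nights.
prev = arch_ratios(
    {"groupby.sum": 1.2, "join.merge": 2.0},
    {"groupby.sum": 1.0, "join.merge": 2.0},
)
curr = arch_ratios(
    {"groupby.sum": 1.5, "join.merge": 2.0},
    {"groupby.sum": 1.0, "join.merge": 2.0},
)
flagged = widening_gaps(prev, curr)
print(flagged)
```

A ratio view like this would flag a benchmark even when both architectures got faster, as long as x86 improved more than ARM64; per-arch regression tracking alone would not show that.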

What I can contribute (long-term, bare-metal)

  • Access to stable bare-metal ARM64 machines (Kunpeng), dedicated to benchmarking (not VMs, not shared workloads).
  • Reproducible environment controls (CPU frequency/power policy, BIOS settings, kernel, pinned toolchain), so results are stable and comparable.
  • I’m willing to take long-term ownership of runner operations (automation, maintenance, upgrades, incident handling), so this won’t become a temporary/unmaintained runner.
  • Investigation and actionable reports for regressions, with potential fixes upstream.
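
As a rough illustration of the environment controls mentioned above, the following sketch records a few machine facts alongside each run. It uses only the Python standard library and the standard Linux cpufreq sysfs path; the function names and the choice of fields are my own assumptions, not something ASV or pandas prescribes, and BIOS or toolchain details would need separate tooling.

```python
import json
import platform
from pathlib import Path


def read_governor(cpu=0):
    """Return the cpufreq governor for `cpu`, or None if it cannot be read.

    The sysfs file only exists on Linux systems with cpufreq enabled.
    """
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_governor")
    try:
        return path.read_text().strip()
    except OSError:
        return None


def collect_machine_metadata():
    """Gather a small, JSON-serializable snapshot of the benchmark host."""
    return {
        "arch": platform.machine(),        # e.g. "aarch64" on ARM64 Linux
        "system": platform.system(),
        "kernel": platform.release(),
        "python": platform.python_version(),
        "cpu_governor": read_governor(),
    }


metadata = collect_machine_metadata()
print(json.dumps(metadata, indent=2))
```

Storing this snapshot next to each result set would make it easier to rule out an environment change (kernel upgrade, governor reset) before treating a slowdown as a pandas regression.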

Operational commitments (so maintainers can trust sustainability)

  • Cadence
    • Nightly run on main (latest SHA each night).
    • Weekly run on the latest maintenance/release branch(es) if useful.
    • Optional: monthly “deep run” (more repetitions / expanded suite) to confirm trends.
  • Reproducibility
    • Dedicated bare-metal host, minimal background noise.
    • Pinned OS/kernel/toolchain; documented dependency versions (NumPy/Arrow/optional deps as applicable).
    • Fixed CPU governor / power mode and recorded machine metadata each run.
  • Publishing
    • Publish an ASV dashboard (or whichever mechanism pandas prefers) with a clearly named environment (e.g., aarch64-kunpeng-baremetal).
    • Keep logs/metadata for debugging (dependency versions, build flags, ASV machine info).
  • Retention
    • Keep raw results for ≥ 2 years to support trend analysis across releases.
  • Regression policy
    • Use a triage threshold (example: a sustained regression of more than 3–5% on key benchmark groups) to open actionable follow-ups.
    • Confirm each candidate regression with a rerun before reporting it, to reduce noise.
  • Failure handling
    • Automatic retry on transient failures; alert on repeated failures with an owner/contact.
  • Continuity
    • If I can’t maintain the runner, I’ll provide advance notice and help transfer scripts/docs to another maintainer/volunteer.
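
To illustrate the regression policy above, here is a minimal sketch of the "threshold plus rerun" check. The 5% default is just the upper end of the example range given above, and the function names and timings are hypothetical; a real setup would more likely build on ASV's own comparison tooling.

```python
def relative_change(baseline, current):
    """Fractional slowdown of `current` versus `baseline` (positive = slower)."""
    return (current - baseline) / baseline


def is_confirmed_regression(baseline, first_run, rerun, threshold=0.05):
    """Return True only if both the first run and the rerun exceed `threshold`.

    Requiring the rerun to agree filters out one-off noisy measurements.
    """
    return (
        relative_change(baseline, first_run) > threshold
        and relative_change(baseline, rerun) > threshold
    )


# Made-up timings in milliseconds for one benchmark.
sustained = is_confirmed_regression(baseline=100.0, first_run=108.0, rerun=107.0)
noisy = is_confirmed_regression(baseline=100.0, first_run=108.0, rerun=101.0)
print(sustained, noisy)
```

Only the first case would result in a follow-up issue; the second would be logged and dropped because the rerun did not reproduce the slowdown.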

Happy to follow whatever process you prefer. If this belongs in a different repo (benchmark infra/runner), please redirect me.

Thanks!
<Your Name / Org>
<Contact / GitHub handle>
(ARM64 HW summary: Kunpeng model, cores, memory, OS/kernel; optional)

Metadata

    Labels

    Performance (Memory or execution speed performance)
