
feat(neon): add AArch32 compatibility for FMA intrinsics in neon_mathfun.h#6393

Merged
nihui merged 1 commit intoTencent:masterfrom
Abandon-ht:master
Nov 6, 2025

Conversation

@Abandon-ht
Contributor

The vfmaq_f32/vfmsq_f32 intrinsics are always available on AArch64, but on
ARM 32-bit (AArch32) they require FMA (VFPv4) support. To support AArch32
targets without it, replace direct usage with portable macros that fall back
to vmlaq_f32/vmlsq_f32 on 32-bit NEON.

This enables successful compilation on armeabi-v7a while preserving
FMA performance on AArch64. All math functions (log, exp, sin, cos, etc.)
retain identical behavior and accuracy.
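
For readers who want to see the shape of such a fallback, here is a minimal sketch. It is not the code from this PR: the macro names `SKETCH_FMA`/`SKETCH_FMS`, the `vec4f` type, and the helper functions are hypothetical, and the scalar branch exists only so the example also compiles on a non-ARM host. It assumes the 32-bit target does not provide the fused intrinsics, so it selects purely on `__aarch64__`.

```c
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

typedef float32x4_t vec4f;

#if defined(__aarch64__)
/* AArch64: fused multiply-add/subtract, acc + a*b and acc - a*b. */
#define SKETCH_FMA(acc, a, b) vfmaq_f32((acc), (a), (b))
#define SKETCH_FMS(acc, a, b) vfmsq_f32((acc), (a), (b))
#else
/* AArch32: same argument order, but multiply and add are separate steps. */
#define SKETCH_FMA(acc, a, b) vmlaq_f32((acc), (a), (b))
#define SKETCH_FMS(acc, a, b) vmlsq_f32((acc), (a), (b))
#endif

static vec4f sketch_load(const float* p) { return vld1q_f32(p); }
static void sketch_store(float* p, vec4f v) { vst1q_f32(p, v); }

#else
/* Non-ARM host: scalar stand-in (an assumption of this sketch only). */
typedef struct { float lane[4]; } vec4f;

static vec4f sketch_load(const float* p)
{
    vec4f v;
    for (int i = 0; i < 4; i++) v.lane[i] = p[i];
    return v;
}

static void sketch_store(float* p, vec4f v)
{
    for (int i = 0; i < 4; i++) p[i] = v.lane[i];
}

static vec4f sketch_mla(vec4f acc, vec4f a, vec4f b)
{
    for (int i = 0; i < 4; i++) acc.lane[i] += a.lane[i] * b.lane[i];
    return acc;
}

static vec4f sketch_mls(vec4f acc, vec4f a, vec4f b)
{
    for (int i = 0; i < 4; i++) acc.lane[i] -= a.lane[i] * b.lane[i];
    return acc;
}

#define SKETCH_FMA(acc, a, b) sketch_mla((acc), (a), (b))
#define SKETCH_FMS(acc, a, b) sketch_mls((acc), (a), (b))
#endif

/* out[i] = acc[i] + a[i] * b[i] for four lanes. */
static void fma_step4(float* out, const float* acc, const float* a, const float* b)
{
    sketch_store(out, SKETCH_FMA(sketch_load(acc), sketch_load(a), sketch_load(b)));
}

/* out[i] = acc[i] - a[i] * b[i] for four lanes. */
static void fms_step4(float* out, const float* acc, const float* a, const float* b)
{
    sketch_store(out, SKETCH_FMS(sketch_load(acc), sketch_load(a), sketch_load(b)));
}
```

Call sites then use only the macros, so the math routines do not need per-architecture branches. Note that a fused operation rounds once while a separate multiply and add round twice, so the two NEON paths can differ in the last bits for inputs that are not exactly representable.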

@github-actions github-actions bot added the arm label Nov 5, 2025
@tencent-adm
Member

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@codecov-commenter

codecov-commenter commented Nov 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.88%. Comparing base (3ab4b58) to head (5b93449).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6393      +/-   ##
==========================================
- Coverage   95.89%   95.88%   -0.01%     
==========================================
  Files         841      841              
  Lines      266338   266338              
==========================================
- Hits       255402   255379      -23     
- Misses      10936    10959      +23     


@github-actions

github-actions bot commented Nov 5, 2025

The binary size change of libncnn.so (bytes)

architecture  base size  pr size   difference
x86_64        15212648   15212648  0 😘
armhf         6206656    6206656   0 😘
aarch64       9524368    9524560   +192 ⚠️

@nihui nihui requested a review from Copilot November 5, 2025 15:59
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@nihui nihui requested a review from Copilot November 6, 2025 02:05
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.



@nihui nihui merged commit 0cc23b6 into Tencent:master Nov 6, 2025
68 of 71 checks passed
@nihui
Member

nihui commented Nov 6, 2025

Thanks for your contribution !

