
arm unified elempack optimization for groupnorm #4080

Merged
nihui merged 12 commits into Tencent:master from mmyyy22:groupnorm_arm on Jun 4, 2025

Conversation

mmyyy22 (Contributor) commented on Jul 23, 2022

  • add groupnorm_arm optimization (a brief recap of the operation follows below)
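
For context (standard background on the operation, not text from the PR): group normalization splits the channels into groups and normalizes each group independently over its channels and spatial positions,

$$y = \gamma_c \cdot \frac{x - \mu_g}{\sqrt{\sigma_g^2 + \epsilon}} + \beta_c$$

where $\mu_g$ and $\sigma_g^2$ are the mean and variance over all elements of group $g$, and the per-channel affine parameters $\gamma_c$, $\beta_c$ are applied only when the layer's affine flag is set.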

codecov-commenter commented on Jul 24, 2022

Codecov Report

Attention: Patch coverage is 99.29245% with 3 lines in your changes missing coverage. Please review.

Project coverage is 95.70%. Comparing base (1d84e98) to head (e7e770e).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/layer/arm/groupnorm_arm.cpp | 99.20% | 2 Missing ⚠️ |
| src/layer/arm/groupnorm_arm_asimdhp.cpp | 99.41% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #4080    +/-   ##
========================================
  Coverage   95.70%   95.70%            
========================================
  Files         827      829     +2     
  Lines      269904   270328   +424     
========================================
+ Hits       258301   258722   +421     
- Misses      11603    11606     +3     

☔ View full report in Codecov by Sentry.

nihui (Member) commented on Jul 25, 2022

Hi, the test failure indicates that your code produces a wrong result under some conditions. Please investigate.

github-actions bot added the arm label on Jun 4, 2025
tencent-adm (Member)

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ nihui
❌ mmyyy22

github-actions bot added the test label on Jun 4, 2025
nihui changed the title from Groupnorm_arm to arm unified elempack optimization for groupnorm on Jun 4, 2025
nihui requested a review from Copilot on June 4, 2025 11:06
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces an optimized group normalization implementation for ARM with unified elempack support, enhancing performance in FP16 calculations.

  • Updated test logging to include additional dimensions and the affine flag.
  • Added a new implementation (groupnorm_arm_asimdhp.cpp) using ARM NEON intrinsics for FP16 arithmetic (see the elempack background note after this list).
  • Introduced header declarations in groupnorm_arm.h for the new methods.
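
Background, an assumption about ncnn's layout conventions rather than text from this PR: elempack is ncnn's SIMD packing factor (elempack=4 stores four channels interleaved as float32x4 lanes). Since group-norm statistics are computed over an entire group, a group's packed data can be reduced as one flat run of floats, which is what lets a single "unified" kernel serve pack1/pack4/pack8. A minimal sketch of such a packing-agnostic reduction (the `group_mean` helper and its contiguity assumption are hypothetical):

```cpp
#include <arm_neon.h>

// Minimal sketch (not the PR's actual code): mean over one group's data,
// valid for any elempack because the group's floats are reduced as one
// contiguous run -- lane membership does not matter for a group-wide mean.
// Assumes the group's channels are stored contiguously.
static float group_mean(const float* ptr, int total /* channels * size * elempack */)
{
    float32x4_t vsum = vdupq_n_f32(0.f);
    int i = 0;
    for (; i + 3 < total; i += 4)
        vsum = vaddq_f32(vsum, vld1q_f32(ptr + i));
    float sum = vaddvq_f32(vsum); // horizontal add (AArch64)
    for (; i < total; i++)
        sum += ptr[i];
    return sum / total;
}
```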

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| tests/test_groupnorm.cpp | Updated test log formatting to print additional tensor dimensions and the affine flag. |
| src/layer/arm/groupnorm_arm_asimdhp.cpp | New optimized FP16 group normalization implementation using ARM NEON intrinsics. |
| src/layer/arm/groupnorm_arm.h | Header declarations for the ARM-optimized groupnorm functions. |
Comments suppressed due to low confidence (1)

src/layer/arm/groupnorm_arm_asimdhp.cpp:25

  • Consider adding a function-level comment for groupnorm_fp16s to explain its purpose, algorithm, and parameter assumptions. This will improve code readability and maintainability.
`static void groupnorm_fp16s(__fp16* ptr, const float* gamma_ptr, const float* beta_ptr, float eps, int channels, int size, int elempack, size_t cstep)`
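
For illustration, here is what a documented scalar reference for this signature might look like. This is a sketch only, assuming fp16 storage with fp32 accumulation and statistics over the whole group; the loop structure and the elempack==1 restriction are assumptions, and the PR's real code vectorizes this with NEON:

```cpp
// Scalar reference sketch (an illustration, NOT the PR's NEON code):
// fp16 storage, fp32 accumulation. Requires an ARM toolchain with __fp16.
#include <math.h>
#include <stddef.h>

static void groupnorm_fp16s_ref(__fp16* ptr, const float* gamma_ptr, const float* beta_ptr,
                                float eps, int channels, int size, int elempack, size_t cstep)
{
    (void)elempack; // sketch assumes elempack == 1; packed gamma/beta indexing differs

    // pass 1: mean over the whole group (all channels, all elements)
    float sum = 0.f;
    for (int q = 0; q < channels; q++)
    {
        const __fp16* p = ptr + q * cstep;
        for (int i = 0; i < size; i++)
            sum += (float)p[i];
    }
    const float mean = sum / (channels * size);

    // pass 2: variance over the whole group
    float sqsum = 0.f;
    for (int q = 0; q < channels; q++)
    {
        const __fp16* p = ptr + q * cstep;
        for (int i = 0; i < size; i++)
        {
            const float v = (float)p[i] - mean;
            sqsum += v * v;
        }
    }
    const float scale = 1.f / sqrtf(sqsum / (channels * size) + eps);

    // pass 3: normalize in place, folding the optional per-channel affine
    // transform into a single multiply-add: y = x * a + b
    for (int q = 0; q < channels; q++)
    {
        __fp16* p = ptr + q * cstep;
        float a = scale;
        float b = -mean * scale;
        if (gamma_ptr && beta_ptr)
        {
            a = gamma_ptr[q] * scale;
            b = beta_ptr[q] - gamma_ptr[q] * mean * scale;
        }
        for (int i = 0; i < size; i++)
            p[i] = (__fp16)((float)p[i] * a + b);
    }
}
```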

nihui merged commit 78b2e68 into Tencent:master on Jun 4, 2025
94 of 98 checks passed
nihui (Member) commented on Jun 4, 2025

Thanks for your contribution!

github-actions bot commented on Jun 4, 2025

The binary size change of libncnn.so (bytes)

| architecture | base size | pr size | difference |
|---|---|---|---|
| x86_64 | 16495480 | 16495480 | 0 😘 |
| armhf | 7369732 | 7380016 | +10284 ⚠️ |
| aarch64 | 10775832 | 10777936 | +2104 ⚠️ |
