@PeterPtroc
Contributor

Description of PR

  • Introduces a riscv64 native implementation path for CRC32 (CRC32C not optimized).
  • Adds runtime CPU feature detection on linux-riscv64 to enable hardware-accelerated CRC32 when available; falls back to the existing implementation if native is unavailable or disabled.
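
The detect-then-fall-back flow described above can be illustrated with a minimal Python sketch. The thread does not show how the patch detects the feature, so everything here is an assumption: that the relevant extension is Zbc (which provides `clmul`/`clmulh`), that it is read from the `isa` line of `/proc/cpuinfo`, and that the names `parse_isa_extensions`, `read_isa_from_cpuinfo` and `choose_crc32_impl` are hypothetical helpers, not functions from the patch.

```python
def parse_isa_extensions(isa):
    """Split a RISC-V ISA string such as 'rv64imafdc_zba_zbc' into extension names."""
    parts = isa.strip().lower().split("_")
    base, multi = parts[0], parts[1:]
    # Single-letter extensions follow the 'rv32'/'rv64' prefix of the first component.
    single = set(base[4:]) if base.startswith(("rv32", "rv64")) else set()
    return single | {m for m in multi if m}


def read_isa_from_cpuinfo(path="/proc/cpuinfo"):
    """Return the first 'isa' value found in cpuinfo, or None if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                key, _, value = line.partition(":")
                if key.strip() == "isa":
                    return value.strip()
    except OSError:
        pass
    return None


def choose_crc32_impl(isa, native_disabled=False):
    """Pick the accelerated path only when Zbc is advertised (assumed criterion)."""
    if native_disabled or isa is None:
        return "fallback"
    return "clmul" if "zbc" in parse_isa_extensions(isa) else "fallback"


if __name__ == "__main__":
    print(choose_crc32_impl(read_isa_from_cpuinfo()))
```

On a machine without an `isa` line in `/proc/cpuinfo` (for example x86), the sketch simply reports the fallback path.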

Below are the performance changes observed using the built-in CRC32 benchmark. Performance regresses when bpc <= 64 (roughly 24-30% slower at bpc = 32 and about 60% slower at bpc = 64), but improves substantially when bpc > 64 (between +40% and +338% in these runs). To keep the codebase simple and maintainable, I did not add bpc-size-specific handling.

| bpc | #T | Native (origin), MB/s | Native (new), MB/s | Δ (MB/s) | Δ% |
|---|---|---|---|---|---|
| 32 | 1 | 661.5 | 463.5 | -198.0 | -29.9% |
| 32 | 2 | 642.6 | 491.4 | -151.2 | -23.5% |
| 32 | 4 | 663.7 | 480.5 | -183.2 | -27.6% |
| 32 | 8 | 653.0 | 472.0 | -181.0 | -27.7% |
| 32 | 16 | 656.1 | 473.4 | -182.7 | -27.8% |
| 64 | 1 | 793.9 | 318.0 | -475.9 | -59.9% |
| 64 | 2 | 771.3 | 322.1 | -449.2 | -58.2% |
| 64 | 4 | 787.3 | 315.0 | -472.3 | -60.0% |
| 64 | 8 | 778.0 | 309.3 | -468.7 | -60.2% |
| 64 | 16 | 773.5 | 308.1 | -465.4 | -60.2% |
| 128 | 1 | 878.8 | 2398.8 | +1520.0 | +173.0% |
| 128 | 2 | 846.8 | 1723.9 | +877.1 | +103.6% |
| 128 | 4 | 861.2 | 1690.0 | +828.8 | +96.2% |
| 128 | 8 | 857.8 | 1373.3 | +515.5 | +60.1% |
| 128 | 16 | 853.8 | 1361.3 | +507.5 | +59.4% |
| 256 | 1 | 783.9 | 2752.5 | +1968.6 | +251.1% |
| 256 | 2 | 810.0 | 2053.3 | +1243.3 | +153.5% |
| 256 | 4 | 835.2 | 1966.5 | +1131.3 | +135.5% |
| 256 | 8 | 812.4 | 1756.3 | +943.9 | +116.2% |
| 256 | 16 | 811.8 | 1524.7 | +712.9 | +87.8% |
| 512 | 1 | 923.6 | 3328.9 | +2405.3 | +260.4% |
| 512 | 2 | 886.5 | 3295.1 | +2408.6 | +271.7% |
| 512 | 4 | 910.5 | 2359.9 | +1449.4 | +159.2% |
| 512 | 8 | 888.1 | 1637.4 | +749.3 | +84.4% |
| 512 | 16 | 897.0 | 1840.1 | +943.1 | +105.1% |
| 1024 | 1 | 950.4 | 3045.0 | +2094.6 | +220.4% |
| 1024 | 2 | 918.0 | 2202.9 | +1284.9 | +140.0% |
| 1024 | 4 | 937.6 | 2040.4 | +1102.8 | +117.6% |
| 1024 | 8 | 916.5 | 1961.5 | +1045.0 | +114.0% |
| 1024 | 16 | 927.4 | 2003.9 | +1076.5 | +116.1% |
| 2048 | 1 | 962.3 | 3189.1 | +2226.8 | +231.4% |
| 2048 | 2 | 970.1 | 3192.3 | +2222.2 | +229.1% |
| 2048 | 4 | 943.4 | 2411.2 | +1467.8 | +155.6% |
| 2048 | 8 | 937.6 | 1837.7 | +900.1 | +96.0% |
| 2048 | 16 | 933.1 | 1864.0 | +930.9 | +99.8% |
| 4096 | 1 | 969.9 | 3654.5 | +2684.6 | +276.8% |
| 4096 | 2 | 972.0 | 2798.0 | +1826.0 | +187.9% |
| 4096 | 4 | 960.1 | 2307.0 | +1346.9 | +140.3% |
| 4096 | 8 | 948.2 | 2753.1 | +1804.9 | +190.4% |
| 4096 | 16 | 938.7 | 2170.5 | +1231.8 | +131.2% |
| 8192 | 1 | 973.6 | 4008.1 | +3034.5 | +311.7% |
| 8192 | 2 | 922.5 | 3018.2 | +2095.7 | +227.2% |
| 8192 | 4 | 955.6 | 2968.7 | +2013.1 | +210.7% |
| 8192 | 8 | 943.4 | 2077.9 | +1134.5 | +120.3% |
| 8192 | 16 | 944.9 | 2191.7 | +1246.8 | +132.0% |
| 16384 | 1 | 974.4 | 4090.3 | +3115.9 | +319.8% |
| 16384 | 2 | 978.3 | 2999.6 | +2021.3 | +206.6% |
| 16384 | 4 | 956.6 | 3248.9 | +2292.3 | +239.6% |
| 16384 | 8 | 950.8 | 3228.0 | +2277.2 | +239.5% |
| 16384 | 16 | 941.2 | 2832.1 | +1890.9 | +200.9% |
| 32768 | 1 | 972.2 | 4205.7 | +3233.5 | +332.6% |
| 32768 | 2 | 938.6 | 4115.2 | +3176.6 | +338.4% |
| 32768 | 4 | 957.4 | 2508.9 | +1551.5 | +162.1% |
| 32768 | 8 | 952.8 | 2319.8 | +1367.0 | +143.5% |
| 32768 | 16 | 944.5 | 1657.7 | +713.2 | +75.5% |
| 65536 | 1 | 976.3 | 4226.6 | +3250.3 | +332.9% |
| 65536 | 2 | 940.0 | 3075.8 | +2135.8 | +227.2% |
| 65536 | 4 | 958.5 | 1345.2 | +386.7 | +40.3% |
| 65536 | 8 | 950.2 | 1954.7 | +1004.5 | +105.7% |
| 65536 | 16 | 945.8 | 2414.0 | +1468.2 | +155.2% |
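
For readers checking the Δ columns, here is a small Python sketch of how they appear to be derived from the two Native throughput columns (difference in MB/s, then the difference relative to the origin value). The helper name `delta` is hypothetical, and the three sample rows are copied from the table above.

```python
def delta(origin_mbps, new_mbps):
    """Return (absolute change in MB/s, relative change in %), both rounded to 0.1."""
    diff = new_mbps - origin_mbps
    pct = 100.0 * diff / origin_mbps
    return round(diff, 1), round(pct, 1)


# (bpc, threads, Native origin MB/s, Native new MB/s), taken from the table above.
rows = [
    (32, 1, 661.5, 463.5),
    (128, 1, 878.8, 2398.8),
    (65536, 4, 958.5, 1345.2),
]

for bpc, threads, origin, new in rows:
    d, p = delta(origin, new)
    print(f"{bpc:>6} {threads:>2} {d:+.1f} MB/s {p:+.1f}%")
```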

How was this patch tested?

Built hadoop-common with the native profile on riscv64 and verified its functionality with TestNativeCrc32.
Ran Hadoop’s CRC32 benchmark on riscv64 (OpenEuler/EulixOS) with JDK 17.
Here are the commands and results:

Command:

mvn -Pnative \
  -Dtest=org.apache.hadoop.util.TestNativeCrc32 \
  -Djava.library.path="$HADOOP_COMMON_LIB_NATIVE_DIR" \
  test

Results

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.util.TestNativeCrc32
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.017 s -- in org.apache.hadoop.util.TestNativeCrc32
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0
------------------------------------------------------------------------
[INFO] BUILD SUCCESS

Command:

export HADOOP_COMMON_LIB_NATIVE_DIR="$PWD/hadoop-common-project/hadoop-common/target/native/target/usr/local/lib"

export LD_LIBRARY_PATH="$HADOOP_COMMON_LIB_NATIVE_DIR:$LD_LIBRARY_PATH"

mvn -Pnative -DskipTests     -Dexec.classpathScope=test     \
-Dexec.mainClass=org.apache.hadoop.util.Crc32PerformanceTest     \
-Djava.library.path="$HADOOP_COMMON_LIB_NATIVE_DIR" exec:java

Results (Origin)

[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Detecting the operating system and CPU architecture
[INFO] ------------------------------------------------------------------------
[INFO] os.detected.name: linux
[INFO] os.detected.arch: riscv64
[INFO] os.detected.bitness: 64
[INFO] os.detected.version: 6.12
[INFO] os.detected.version.major: 6
[INFO] os.detected.version.minor: 12
[INFO] os.detected.release: EulixOS
[INFO] os.detected.release.version: 3.0
[INFO] os.detected.release.like.EulixOS: true
[INFO] os.detected.classifier: linux-riscv64
[INFO] 
[INFO] ------------------< org.apache.hadoop:hadoop-common >-------------------
[INFO] Building Apache Hadoop Common 3.5.0-SNAPSHOT
[INFO]   from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- exec:1.3.1:java (default-cli) @ hadoop-common ---
[WARNING] Warning: killAfter is now deprecated. Do you need it ? Please comment on MEXEC-6.
                 java.version = 17.0.11
            java.runtime.name = OpenJDK Runtime Environment
         java.runtime.version = 17.0.11+9
              java.vm.version = 17.0.11+9
               java.vm.vendor = BiSheng
                 java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.version = 17
   java.specification.version = 17
                      os.arch = riscv64
                      os.name = Linux
                   os.version = 6.12.35.eos30.riscv64+
Data Length = 64 MB
Trials      = 5

Direct Buffer Performance Table (bpc: byte-per-crc in MB/sec; #T: #Theads)
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|    32 |  1 |      49.8 |     161.9 | 225.0% |     186.8 |  15.4% |      172.8 |  -7.5% |     661.5 | 282.9% |     659.8 |  -0.3% |
|    32 |  2 |      49.4 |     142.9 | 189.2% |     184.0 |  28.7% |      165.1 | -10.3% |     642.6 | 289.3% |     639.2 |  -0.5% |
|    32 |  4 |      49.4 |     144.9 | 193.2% |     186.8 |  28.9% |      183.1 |  -2.0% |     663.7 | 262.4% |     659.7 |  -0.6% |
|    32 |  8 |      48.7 |     146.4 | 200.7% |     183.5 |  25.3% |      182.3 |  -0.7% |     653.0 | 258.2% |     650.2 |  -0.4% |
|    32 | 16 |      47.6 |     142.5 | 199.3% |     185.1 |  30.0% |      182.4 |  -1.5% |     656.1 | 259.8% |     653.7 |  -0.4% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|    64 |  1 |      94.6 |     271.8 | 187.4% |     294.0 |   8.2% |      282.6 |  -3.9% |     793.9 | 181.0% |     793.9 |  -0.0% |
|    64 |  2 |      93.8 |     268.1 | 185.9% |     291.3 |   8.7% |      292.0 |   0.3% |     771.3 | 164.1% |     765.3 |  -0.8% |
|    64 |  4 |      93.0 |     268.5 | 188.6% |     284.9 |   6.1% |      294.2 |   3.3% |     787.3 | 167.6% |     781.0 |  -0.8% |
|    64 |  8 |      91.6 |     267.3 | 192.0% |     286.6 |   7.2% |      291.5 |   1.7% |     778.0 | 166.9% |     773.1 |  -0.6% |
|    64 | 16 |      91.3 |     265.8 | 191.1% |     290.0 |   9.1% |      291.1 |   0.4% |     773.5 | 165.7% |     772.5 |  -0.1% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|   128 |  1 |     161.9 |     406.2 | 150.9% |     417.7 |   2.8% |      421.9 |   1.0% |     878.8 | 108.3% |     874.7 |  -0.5% |
|   128 |  2 |     156.2 |     398.5 | 155.2% |     419.4 |   5.2% |      404.4 |  -3.6% |     846.8 | 109.4% |     845.6 |  -0.1% |
|   128 |  4 |     160.3 |     382.8 | 138.8% |     401.7 |   4.9% |      419.4 |   4.4% |     861.2 | 105.3% |     860.1 |  -0.1% |
|   128 |  8 |     151.7 |     395.5 | 160.6% |     400.1 |   1.2% |      417.0 |   4.2% |     857.8 | 105.7% |     853.8 |  -0.5% |
|   128 | 16 |     152.9 |     391.4 | 156.0% |     408.7 |   4.4% |      415.5 |   1.7% |     853.8 | 105.5% |     849.0 |  -0.6% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|   256 |  1 |     254.8 |     512.3 | 101.1% |     522.7 |   2.0% |      537.2 |   2.8% |     783.9 |  45.9% |     847.9 |   8.2% |
|   256 |  2 |     248.2 |     512.0 | 106.3% |     512.7 |   0.1% |      509.2 |  -0.7% |     810.0 |  59.1% |     838.7 |   3.5% |
|   256 |  4 |     248.1 |     514.7 | 107.5% |     505.3 |  -1.8% |      523.3 |   3.6% |     835.2 |  59.6% |     854.0 |   2.2% |
|   256 |  8 |     246.1 |     508.1 | 106.5% |     501.1 |  -1.4% |      522.4 |   4.2% |     812.4 |  55.5% |     840.3 |   3.4% |
|   256 | 16 |     242.0 |     505.1 | 108.7% |     503.8 |  -0.2% |      520.9 |   3.4% |     811.8 |  55.8% |     836.1 |   3.0% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|   512 |  1 |     368.7 |     640.3 |  73.6% |     613.7 |  -4.2% |      628.9 |   2.5% |     923.6 |  46.9% |     929.1 |   0.6% |
|   512 |  2 |     354.3 |     564.3 |  59.3% |     598.4 |   6.1% |      598.8 |   0.1% |     886.5 |  48.1% |     884.6 |  -0.2% |
|   512 |  4 |     357.0 |     597.0 |  67.2% |     589.0 |  -1.3% |      615.4 |   4.5% |     910.5 |  47.9% |     914.8 |   0.5% |
|   512 |  8 |     353.0 |     609.0 |  72.5% |     591.8 |  -2.8% |      617.0 |   4.3% |     888.1 |  43.9% |     896.0 |   0.9% |
|   512 | 16 |     349.2 |     607.0 |  73.8% |     593.9 |  -2.2% |      606.9 |   2.2% |     897.0 |  47.8% |     893.9 |  -0.3% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  1024 |  1 |     473.2 |     684.2 |  44.6% |     653.3 |  -4.5% |      663.8 |   1.6% |     950.4 |  43.2% |     943.5 |  -0.7% |
|  1024 |  2 |     431.2 |     669.8 |  55.3% |     631.6 |  -5.7% |      632.0 |   0.1% |     918.0 |  45.3% |     949.6 |   3.4% |
|  1024 |  4 |     443.3 |     681.9 |  53.8% |     620.0 |  -9.1% |      639.7 |   3.2% |     937.6 |  46.6% |     934.0 |  -0.4% |
|  1024 |  8 |     435.6 |     653.8 |  50.1% |     602.6 |  -7.8% |      630.4 |   4.6% |     916.5 |  45.4% |     897.8 |  -2.0% |
|  1024 | 16 |     423.7 |     640.9 |  51.2% |     610.0 |  -4.8% |      628.3 |   3.0% |     927.4 |  47.6% |     926.5 |  -0.1% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  2048 |  1 |     518.3 |     719.7 |  38.8% |     660.8 |  -8.2% |      677.9 |   2.6% |     962.3 |  42.0% |     957.0 |  -0.5% |
|  2048 |  2 |     516.9 |     678.7 |  31.3% |     634.2 |  -6.6% |      652.0 |   2.8% |     970.1 |  48.8% |     964.7 |  -0.6% |
|  2048 |  4 |     504.9 |     681.2 |  34.9% |     635.0 |  -6.8% |      658.0 |   3.6% |     943.4 |  43.4% |     946.7 |   0.4% |
|  2048 |  8 |     502.3 |     662.5 |  31.9% |     625.6 |  -5.6% |      624.4 |  -0.2% |     937.6 |  50.2% |     932.6 |  -0.5% |
|  2048 | 16 |     490.1 |     663.8 |  35.4% |     631.0 |  -4.9% |      652.9 |   3.5% |     933.1 |  42.9% |     931.1 |  -0.2% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  4096 |  1 |     524.8 |     743.4 |  41.7% |     676.6 |  -9.0% |      686.9 |   1.5% |     969.9 |  41.2% |     964.1 |  -0.6% |
|  4096 |  2 |     560.0 |     746.9 |  33.4% |     681.0 |  -8.8% |      698.5 |   2.6% |     972.0 |  39.2% |     933.7 |  -3.9% |
|  4096 |  4 |     529.5 |     651.7 |  23.1% |     664.7 |   2.0% |      683.6 |   2.8% |     960.1 |  40.4% |     968.4 |   0.9% |
|  4096 |  8 |     543.4 |     692.3 |  27.4% |     649.0 |  -6.3% |      669.6 |   3.2% |     948.2 |  41.6% |     947.2 |  -0.1% |
|  4096 | 16 |     533.6 |     696.7 |  30.6% |     648.5 |  -6.9% |      662.0 |   2.1% |     938.7 |  41.8% |     940.0 |   0.1% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  8192 |  1 |     570.8 |     751.8 |  31.7% |     682.5 |  -9.2% |      679.1 |  -0.5% |     973.6 |  43.4% |     951.5 |  -2.3% |
|  8192 |  2 |     581.1 |     682.4 |  17.4% |     629.2 |  -7.8% |      644.7 |   2.5% |     922.5 |  43.1% |     974.2 |   5.6% |
|  8192 |  4 |     560.8 |     682.9 |  21.8% |     648.3 |  -5.1% |      671.2 |   3.5% |     955.6 |  42.4% |     951.5 |  -0.4% |
|  8192 |  8 |     570.1 |     690.3 |  21.1% |     632.0 |  -8.4% |      642.4 |   1.6% |     943.4 |  46.9% |     948.6 |   0.6% |
|  8192 | 16 |     560.6 |     691.2 |  23.3% |     645.3 |  -6.6% |      646.0 |   0.1% |     944.9 |  46.3% |     940.4 |  -0.5% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
| 16384 |  1 |     631.0 |     750.6 |  18.9% |     681.1 |  -9.3% |      702.8 |   3.2% |     974.4 |  38.6% |     967.5 |  -0.7% |
| 16384 |  2 |     603.7 |     754.5 |  25.0% |     688.7 |  -8.7% |      704.2 |   2.3% |     978.3 |  38.9% |     974.1 |  -0.4% |
| 16384 |  4 |     585.9 |     728.4 |  24.3% |     630.7 | -13.4% |      683.6 |   8.4% |     956.6 |  39.9% |     951.1 |  -0.6% |
| 16384 |  8 |     567.8 |     659.9 |  16.2% |     612.5 |  -7.2% |      633.0 |   3.3% |     950.8 |  50.2% |     943.3 |  -0.8% |
| 16384 | 16 |     578.0 |     692.6 |  19.8% |     599.7 | -13.4% |      633.3 |   5.6% |     941.2 |  48.6% |     935.8 |  -0.6% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
| 32768 |  1 |     625.6 |     744.7 |  19.0% |     680.3 |  -8.6% |      701.3 |   3.1% |     972.2 |  38.6% |     966.4 |  -0.6% |
| 32768 |  2 |     630.3 |     757.7 |  20.2% |     686.6 |  -9.4% |      643.5 |  -6.3% |     938.6 |  45.9% |     976.8 |   4.1% |
| 32768 |  4 |     585.1 |     702.9 |  20.1% |     644.8 |  -8.3% |      640.1 |  -0.7% |     957.4 |  49.6% |     957.0 |  -0.0% |
| 32768 |  8 |     555.8 |     650.8 |  17.1% |     608.9 |  -6.4% |      640.7 |   5.2% |     952.8 |  48.7% |     945.4 |  -0.8% |
| 32768 | 16 |     554.5 |     664.4 |  19.8% |     600.8 |  -9.6% |      603.9 |   0.5% |     944.5 |  56.4% |     942.3 |  -0.2% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
| 65536 |  1 |     601.3 |     749.4 |  24.6% |     681.2 |  -9.1% |      703.8 |   3.3% |     976.3 |  38.7% |     970.8 |  -0.6% |
| 65536 |  2 |     556.5 |     677.6 |  21.8% |     625.0 |  -7.8% |      637.5 |   2.0% |     940.0 |  47.5% |     937.3 |  -0.3% |
| 65536 |  4 |     546.2 |     678.5 |  24.2% |     632.9 |  -6.7% |      642.6 |   1.5% |     958.5 |  49.2% |     957.2 |  -0.1% |
| 65536 |  8 |     561.6 |     656.8 |  17.0% |     589.6 | -10.2% |      602.7 |   2.2% |     950.2 |  57.7% |     947.6 |  -0.3% |
| 65536 | 16 |     564.9 |     631.9 |  11.8% |     646.2 |   2.3% |      580.0 | -10.2% |     945.8 |  63.1% |     952.4 |   0.7% |
Elapsed 523.5s
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:53 min

Results (With this commit)

[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Detecting the operating system and CPU architecture
[INFO] ------------------------------------------------------------------------
[INFO] os.detected.name: linux
[INFO] os.detected.arch: riscv64
[INFO] os.detected.bitness: 64
[INFO] os.detected.version: 6.12
[INFO] os.detected.version.major: 6
[INFO] os.detected.version.minor: 12
[INFO] os.detected.release: EulixOS
[INFO] os.detected.release.version: 3.0
[INFO] os.detected.release.like.EulixOS: true
[INFO] os.detected.classifier: linux-riscv64
[INFO] 
[INFO] ------------------< org.apache.hadoop:hadoop-common >-------------------
[INFO] Building Apache Hadoop Common 3.5.0-SNAPSHOT
[INFO]   from pom.xml
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- exec:1.3.1:java (default-cli) @ hadoop-common ---
[WARNING] Warning: killAfter is now deprecated. Do you need it ? Please comment on MEXEC-6.
                 java.version = 17.0.11
            java.runtime.name = OpenJDK Runtime Environment
         java.runtime.version = 17.0.11+9
              java.vm.version = 17.0.11+9
               java.vm.vendor = BiSheng
                 java.vm.name = OpenJDK 64-Bit Server VM
java.vm.specification.version = 17
   java.specification.version = 17
                      os.arch = riscv64
                      os.name = Linux
                   os.version = 6.12.35.eos30.riscv64+
Data Length = 64 MB
Trials      = 5

Direct Buffer Performance Table (bpc: byte-per-crc in MB/sec; #T: #Theads)
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|    32 |  1 |      50.8 |     159.1 | 213.2% |     187.0 |  17.5% |      184.3 |  -1.5% |     463.5 | 151.5% |     667.0 |  43.9% |
|    32 |  2 |      51.2 |     161.1 | 214.9% |     186.2 |  15.6% |      183.7 |  -1.3% |     491.4 | 167.5% |     649.1 |  32.1% |
|    32 |  4 |      51.0 |     162.9 | 219.4% |     177.9 |   9.2% |      183.7 |   3.3% |     480.5 | 161.5% |     661.3 |  37.6% |
|    32 |  8 |      46.4 |     162.6 | 250.4% |     185.1 |  13.8% |      181.9 |  -1.7% |     472.0 | 159.5% |     655.7 |  38.9% |
|    32 | 16 |      48.1 |     162.5 | 238.0% |     185.8 |  14.4% |      177.8 |  -4.3% |     473.4 | 166.2% |     649.6 |  37.2% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|    64 |  1 |      87.8 |     272.5 | 210.2% |     293.0 |   7.5% |      295.4 |   0.8% |     318.0 |   7.7% |     783.2 | 146.3% |
|    64 |  2 |      90.8 |     269.6 | 196.8% |     270.3 |   0.2% |      279.7 |   3.5% |     322.1 |  15.2% |     788.0 | 144.7% |
|    64 |  4 |      90.9 |     272.2 | 199.6% |     293.3 |   7.7% |      288.4 |  -1.7% |     315.0 |   9.2% |     787.1 | 149.9% |
|    64 |  8 |      83.8 |     262.5 | 213.1% |     291.1 |  10.9% |      281.3 |  -3.4% |     309.3 |  10.0% |     763.4 | 146.8% |
|    64 | 16 |      84.5 |     268.2 | 217.2% |     289.1 |   7.8% |      279.6 |  -3.3% |     308.1 |  10.2% |     770.7 | 150.2% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|   128 |  1 |     168.3 |     407.0 | 141.9% |     419.8 |   3.2% |      422.7 |   0.7% |    2398.8 | 467.5% |     866.5 | -63.9% |
|   128 |  2 |     167.2 |     407.5 | 143.8% |     403.4 |  -1.0% |      405.6 |   0.5% |    1723.9 | 325.0% |     818.5 | -52.5% |
|   128 |  4 |     156.9 |     398.9 | 154.3% |     411.6 |   3.2% |      405.1 |  -1.6% |    1690.0 | 317.2% |     848.3 | -49.8% |
|   128 |  8 |     156.4 |     397.5 | 154.1% |     406.9 |   2.4% |      395.8 |  -2.7% |    1373.3 | 247.0% |     842.8 | -38.6% |
|   128 | 16 |     151.1 |     399.5 | 164.4% |     410.6 |   2.8% |      399.2 |  -2.8% |    1361.3 | 241.0% |     848.1 | -37.7% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|   256 |  1 |     258.1 |     531.3 | 105.8% |     530.8 |  -0.1% |      531.4 |   0.1% |    2752.5 | 418.0% |     612.1 | -77.8% |
|   256 |  2 |     256.8 |     520.0 | 102.5% |     523.4 |   0.7% |      525.3 |   0.4% |    2053.3 | 290.9% |     714.1 | -65.2% |
|   256 |  4 |     241.9 |     526.4 | 117.6% |     527.3 |   0.2% |      508.5 |  -3.6% |    1966.5 | 286.7% |     745.4 | -62.1% |
|   256 |  8 |     247.1 |     514.9 | 108.4% |     507.4 |  -1.5% |      511.5 |   0.8% |    1756.3 | 243.4% |     595.4 | -66.1% |
|   256 | 16 |     240.5 |     515.0 | 114.2% |     509.3 |  -1.1% |      496.3 |  -2.6% |    1524.7 | 207.2% |     724.8 | -52.5% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|   512 |  1 |     377.5 |     642.6 |  70.2% |     619.1 |  -3.7% |      629.9 |   1.7% |    3328.9 | 428.5% |     919.9 | -72.4% |
|   512 |  2 |     376.7 |     630.7 |  67.4% |     620.5 |  -1.6% |      633.7 |   2.1% |    3295.1 | 420.0% |     883.2 | -73.2% |
|   512 |  4 |     359.4 |     634.7 |  76.6% |     608.4 |  -4.1% |      605.9 |  -0.4% |    2359.9 | 289.5% |     877.2 | -62.8% |
|   512 |  8 |     347.4 |     610.1 |  75.6% |     591.7 |  -3.0% |      571.9 |  -3.3% |    1637.4 | 186.3% |     881.4 | -46.2% |
|   512 | 16 |     341.6 |     609.5 |  78.5% |     599.0 |  -1.7% |      577.6 |  -3.6% |    1840.1 | 218.6% |     840.5 | -54.3% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  1024 |  1 |     443.5 |     698.6 |  57.5% |     657.1 |  -5.9% |      633.0 |  -3.7% |    3045.0 | 381.0% |     875.9 | -71.2% |
|  1024 |  2 |     426.0 |     672.1 |  57.8% |     624.4 |  -7.1% |      633.7 |   1.5% |    2202.9 | 247.6% |     832.9 | -62.2% |
|  1024 |  4 |     422.2 |     692.4 |  64.0% |     639.2 |  -7.7% |      609.9 |  -4.6% |    2040.4 | 234.5% |     883.2 | -56.7% |
|  1024 |  8 |     430.3 |     661.6 |  53.8% |     620.6 |  -6.2% |      620.7 |   0.0% |    1961.5 | 216.0% |     877.6 | -55.3% |
|  1024 | 16 |     425.6 |     659.2 |  54.9% |     615.0 |  -6.7% |      602.1 |  -2.1% |    2003.9 | 232.8% |     870.5 | -56.6% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  2048 |  1 |     511.7 |     701.7 |  37.1% |     658.0 |  -6.2% |      677.7 |   3.0% |    3189.1 | 370.6% |     958.6 | -69.9% |
|  2048 |  2 |     514.8 |     719.8 |  39.8% |     662.7 |  -7.9% |      679.4 |   2.5% |    3192.3 | 369.9% |     966.7 | -69.7% |
|  2048 |  4 |     495.6 |     691.9 |  39.6% |     646.1 |  -6.6% |      653.6 |   1.2% |    2411.2 | 268.9% |     956.5 | -60.3% |
|  2048 |  8 |     509.0 |     686.8 |  34.9% |     635.7 |  -7.4% |      639.5 |   0.6% |    1837.7 | 187.3% |     939.4 | -48.9% |
|  2048 | 16 |     495.8 |     672.9 |  35.7% |     630.4 |  -6.3% |      632.0 |   0.2% |    1864.0 | 195.0% |     915.5 | -50.9% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  4096 |  1 |     552.6 |     733.7 |  32.8% |     678.6 |  -7.5% |      692.8 |   2.1% |    3654.5 | 427.5% |     964.6 | -73.6% |
|  4096 |  2 |     536.5 |     708.8 |  32.1% |     683.7 |  -3.5% |      665.5 |  -2.7% |    2798.0 | 320.4% |     924.6 | -67.0% |
|  4096 |  4 |     552.4 |     708.2 |  28.2% |     670.7 |  -5.3% |      672.7 |   0.3% |    2307.0 | 242.9% |     953.5 | -58.7% |
|  4096 |  8 |     558.5 |     707.2 |  26.6% |     654.6 |  -7.4% |      659.2 |   0.7% |    2753.1 | 317.7% |     949.0 | -65.5% |
|  4096 | 16 |     550.5 |     693.8 |  26.0% |     640.5 |  -7.7% |      651.3 |   1.7% |    2170.5 | 233.3% |     924.8 | -57.4% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
|  8192 |  1 |     598.5 |     757.5 |  26.6% |     685.5 |  -9.5% |      689.7 |   0.6% |    4008.1 | 481.1% |     968.1 | -75.8% |
|  8192 |  2 |     596.6 |     703.7 |  17.9% |     641.0 |  -8.9% |      650.9 |   1.6% |    3018.2 | 363.7% |     977.6 | -67.6% |
|  8192 |  4 |     566.3 |     702.0 |  24.0% |     671.3 |  -4.4% |      667.6 |  -0.5% |    2968.7 | 344.7% |     958.2 | -67.7% |
|  8192 |  8 |     570.5 |     697.2 |  22.2% |     636.0 |  -8.8% |      652.9 |   2.7% |    2077.9 | 218.2% |     928.9 | -55.3% |
|  8192 | 16 |     559.7 |     649.9 |  16.1% |     642.7 |  -1.1% |      636.4 |  -1.0% |    2191.7 | 244.4% |     900.0 | -58.9% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
| 16384 |  1 |     555.1 |     753.0 |  35.6% |     685.3 |  -9.0% |      682.8 |  -0.4% |    4090.3 | 499.1% |     941.5 | -77.0% |
| 16384 |  2 |     594.7 |     736.3 |  23.8% |     641.7 | -12.8% |      646.7 |   0.8% |    2999.6 | 363.8% |     936.4 | -68.8% |
| 16384 |  4 |     588.0 |     699.8 |  19.0% |     656.6 |  -6.2% |      654.9 |  -0.3% |    3248.9 | 396.1% |     889.1 | -72.6% |
| 16384 |  8 |     567.7 |     655.4 |  15.5% |     663.6 |   1.2% |      687.4 |   3.6% |    3228.0 | 369.6% |     944.7 | -70.7% |
| 16384 | 16 |     580.7 |     688.4 |  18.6% |     637.8 |  -7.3% |      659.5 |   3.4% |    2832.1 | 329.4% |     905.7 | -68.0% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
| 32768 |  1 |     608.5 |     756.5 |  24.3% |     688.3 |  -9.0% |      696.2 |   1.1% |    4205.7 | 504.1% |     970.2 | -76.9% |
| 32768 |  2 |     572.3 |     673.8 |  17.7% |     608.9 |  -9.6% |      636.8 |   4.6% |    4115.2 | 546.3% |     936.2 | -77.3% |
| 32768 |  4 |     596.3 |     661.1 |  10.9% |     610.2 |  -7.7% |      667.4 |   9.4% |    2508.9 | 275.9% |     952.5 | -62.0% |
| 32768 |  8 |     605.5 |     687.2 |  13.5% |     624.8 |  -9.1% |      674.0 |   7.9% |    2319.8 | 244.2% |     940.4 | -59.5% |
| 32768 | 16 |     578.2 |     601.1 |   4.0% |     571.4 |  -4.9% |      582.4 |   1.9% |    1657.7 | 184.7% |     925.0 | -44.2% |
|  bpc  | #T ||      Zip ||     ZipC | % diff || PureJava | % diff || PureJavaC | % diff ||   Native | % diff ||  NativeC | % diff |
| 65536 |  1 |     625.1 |     761.1 |  21.7% |     690.4 |  -9.3% |      702.1 |   1.7% |    4226.6 | 502.0% |     973.7 | -77.0% |
| 65536 |  2 |     582.1 |     691.2 |  18.7% |     623.4 |  -9.8% |      640.3 |   2.7% |    3075.8 | 380.4% |     934.2 | -69.6% |
| 65536 |  4 |     539.3 |     669.7 |  24.2% |     635.1 |  -5.2% |      567.9 | -10.6% |    1345.2 | 136.9% |     887.6 | -34.0% |
| 65536 |  8 |     588.4 |     633.4 |   7.7% |     533.5 | -15.8% |      587.8 |  10.2% |    1954.7 | 232.5% |     934.8 | -52.2% |
| 65536 | 16 |     598.0 |     642.9 |   7.5% |     591.6 |  -8.0% |      558.0 |  -5.7% |    2414.0 | 332.6% |     926.3 | -61.6% |
Elapsed 509.6s
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  08:39 min

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 23m 29s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 47m 25s | | trunk passed |
| +1 💚 | compile | 14m 16s | | trunk passed |
| +1 💚 | mvnsite | 2m 32s | | trunk passed |
| +1 💚 | shadedclient | 101m 41s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 10s | | the patch passed |
| +1 💚 | compile | 13m 15s | | the patch passed |
| +1 💚 | cc | 13m 15s | | the patch passed |
| +1 💚 | golang | 13m 15s | | the patch passed |
| +1 💚 | javac | 13m 15s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | mvnsite | 2m 29s | | the patch passed |
| +1 💚 | shadedclient | 39m 47s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 23m 25s | | hadoop-common in the patch passed. |
| +1 💚 | asflicense | 1m 57s | | The patch does not generate ASF License warnings. |
| | | 206m 55s | | |
| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8031/1/artifact/out/Dockerfile |
| GITHUB PR | #8031 |
| Optional Tests | dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang |
| uname | Linux cf2b3cead534 5.15.0-156-generic #166-Ubuntu SMP Sat Aug 9 00:02:46 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 0e62b04 |
| Default Java | Red Hat, Inc.-1.8.0_462-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8031/1/testReport/ |
| Max. process+thread count | 1376 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8031/1/console |
| versions | git=2.43.7 maven=3.9.11 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@PeterPtroc
Contributor Author

@steveloughran could you please review this PR if you have time? thanks!

@PeterPtroc
Contributor Author

Hi @pan3793 @slfan1989, could you please take a look when you have a moment? Happy to address any feedback. Thanks!

@steveloughran
Contributor

Is anyone with a RISC-V setup able to test this?

@slfan1989
Contributor

> Hi @pan3793 @slfan1989, could you please take a look when you have a moment? Happy to address any feedback. Thanks!

@PeterPtroc Thank you for your contribution! However, RISC-V is beyond my current knowledge, and I’m sorry I’m unable to assist with reviewing this part of the code. I recommend reaching out to other team members for assistance with the review.

@pan3793
Member

pan3793 commented Nov 10, 2025

> Is anyone with a RISC-V setup able to test this?

To reviewers: #7924 may help you set up a dev box on an x86 or aarch64 platform by using Docker and QEMU to simulate a riscv environment, but it is extremely slow and offers no means of performance evaluation.

Contributor

@steveloughran steveloughran left a comment

OK, I've tried to review this and asked Google Gemini to look at specific aspects (alignment, assembly, pointers). It's happy: "Yes, the RISC-V assembly code in this pull request looks excellent. It is correct, safe, and follows modern best practices for inline assembly." If I hadn't been arguing with it and LaTeX citations all afternoon I'd treat its opinions as valid. But here they are as good as my judgement.

I propose adding a comment on each method so that whoever reads this code next understands what it tries to do. Same for the processing of the misaligned data at the start of an operation, and any leftovers.
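
To make the request concrete, the head / aligned body / leftover split mentioned above could be documented along the lines of the following Python sketch. It is not the patch's code: the 16-byte block size, the `address` parameter and the helper names are assumptions for illustration, and the "fast" loop is only a byte-wise stand-in for whatever accelerated routine the native code uses. The result is checked against `zlib.crc32`, which implements the same CRC32 polynomial.

```python
import zlib


def _make_table():
    """Build the standard reflected CRC32 lookup table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


_TABLE = _make_table()


def crc32_bytes(crc, data):
    """Byte-at-a-time update of the raw (pre-inverted) CRC state."""
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc


def split_for_alignment(address, length, block=16):
    """Return (head, body, tail) byte counts for a buffer starting at `address`."""
    head = min((-address) % block, length)       # bytes before the first aligned block
    body = (length - head) // block * block      # whole aligned blocks
    tail = length - head - body                  # leftovers after the last block
    return head, body, tail


def crc32_split(data, address=0, block=16):
    """CRC32 computed in three phases: misaligned head, aligned body, leftover tail."""
    head, body, tail = split_for_alignment(address, len(data), block)
    crc = 0xFFFFFFFF
    crc = crc32_bytes(crc, data[:head])                  # slow path: misaligned prefix
    for off in range(head, head + body, block):          # stand-in for the accelerated loop
        crc = crc32_bytes(crc, data[off:off + block])
    crc = crc32_bytes(crc, data[head + body:])           # slow path: leftovers
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    payload = bytes(range(256)) * 3 + b"xyz"
    assert crc32_split(payload, address=5) == zlib.crc32(payload)
```

Whatever the starting alignment, the three phases must produce the same checksum as a single pass, which is the property a comment in the native code could state explicitly.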

that's all: explain things for future developers.

+1 pending these changes

@leiwen2025
Contributor

Hi @PeterPtroc ,
I received a suggestion on my PR to collaborate with you on optimizing the CRC32-related design for better performance.
I’m happy to work together — maybe we can review each other’s approaches and align on a combined solution that provides the best CRC32 performance.

@steveloughran
Contributor

@PeterPtroc as noted, @leiwen2025 can help here.

@leiwen2025, can you look at this PR as is and review it? Ideally, check it out and do a -Pnative build, running the native tests.

If you two are using different instructions, how do they differ?

Having just looked at what clmul/clmulh does, I can see why it offers benefits.

  • how common is the instruction?
  • @leiwen2025 how does your vectorized version compare? The opcode is intended to be pipelined and is designed for these kinds of encryption/checksum algorithms.

Looking at #7912, it's calling vclmul.vv; this is generally going to be faster, isn't it?

Which means that while the code is more complex, ultimately it's going to be the best option on cores with the right feature flags.

This makes me think that this one can go in but the vector one goes in as the followup, with the choice of operation dependent on feature, with priority of: vclmul, clmul, classic.
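
As background for the clmul/clmulh discussion above, here is a portable software model of what the two scalar instructions compute on a 64-bit core. It is an illustrative sketch, not code from either PR; `clmul64` and `clmulh64` are hypothetical helper names. A carry-less multiply is polynomial multiplication over GF(2) (XOR instead of add), which is the arithmetic CRC is defined in.

```c
#include <stdint.h>

/* Low 64 bits of the 128-bit carry-less product a * b. */
uint64_t clmul64(uint64_t a, uint64_t b) {
  uint64_t r = 0;
  for (int i = 0; i < 64; i++) {
    if ((b >> i) & 1u) {
      r ^= a << i;          /* XOR replaces the add of ordinary multiply */
    }
  }
  return r;
}

/* High 64 bits of the 128-bit carry-less product a * b. */
uint64_t clmulh64(uint64_t a, uint64_t b) {
  uint64_t r = 0;
  for (int i = 1; i < 64; i++) {   /* i = 0 contributes nothing above bit 63 */
    if ((b >> i) & 1u) {
      r ^= a >> (64 - i);
    }
  }
  return r;
}
```

For example, `clmul64(3, 3)` is 5, since (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2. The hardware instruction does the whole 64-step loop in one operation, which is where the benefit comes from.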

@leiwen2025
Contributor

@steveloughran Thanks! I’m happy to help. I’ll check out the PR as-is and run a -Pnative build with the native tests to verify. Will report back once I have the results.

@leiwen2025
Contributor

Hi @steveloughran, I have completed the native tests, and the results are consistent with the data shown in the PR. Should I post the data here?

Co-authored-by: gong-flying <[email protected]>
@PeterPtroc
Contributor Author

Sorry for the late reply.
I completely agree with the suggested approach: merging the Zbc/Zbkc scalar-based CRC32 acceleration first, followed by the vector (V + Zvbc) version. This aligns perfectly with my current plan.

I am also excited to collaborate with @leiwen2025 in the next phase to integrate the vectorized solution. Our goal is to implement a multi-tiered optimization strategy: vclmul > clmul > software.

The decision to prioritize the scalar (Zbc/Zbkc) implementation is based on several key factors:

  • Broader Hardware Availability: Zbc is one of the RISC-V bit-manipulation extensions, and its clmul/clmulh instructions are also available through the scalar cryptography extension Zbkc. It is already supported by many mainstream chips (e.g., SiFive P/U series, T-Head C9xx). The Vector extension (V) and Zvbc are currently limited to newer high-end or experimental hardware, so the scalar version provides immediate benefits to a larger user base.
  • Ease of Verification and Maintenance: The scalar implementation is logically simpler, making it easier to review, test, and debug. Vectorization introduces complexities like memory alignment and register scheduling, which are better handled as a dedicated follow-up optimization.
  • Progressive Evolution & Robust Fallback: Establishing a solid scalar path first allows us to build a robust fallback mechanism (vclmul → clmul → software). This ensures that Hadoop can run efficiently across the diverse RISC-V hardware ecosystem.
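
The fallback order described above (vclmul, then clmul, then software) could be sketched roughly as follows. This is an assumption-laden illustration, not the detection code in this PR: it assumes the caller has already read an ISA string such as the `isa` field of `/proc/cpuinfo` on linux-riscv64, that multi-letter extensions appear as lowercase underscore-separated tokens without version suffixes, and all names (`crc_tier_t`, `isa_has_ext`, `isa_has_base_letter`, `choose_crc_tier`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
  CRC_TIER_SOFTWARE = 0,  /* existing table-driven implementation */
  CRC_TIER_CLMUL = 1,     /* scalar carry-less multiply (Zbc or Zbkc) */
  CRC_TIER_VCLMUL = 2     /* vector carry-less multiply (V + Zvbc) */
} crc_tier_t;

/* True if multi-letter extension `ext` is an underscore-delimited token. */
static bool isa_has_ext(const char *isa, const char *ext) {
  size_t n = strlen(ext);
  const char *p = isa;
  while ((p = strchr(p, '_')) != NULL) {
    p++;  /* step past '_' to the start of the token */
    if (strncmp(p, ext, n) == 0 &&
        (p[n] == '_' || p[n] == '\0' || p[n] == '\n')) {
      return true;
    }
  }
  return false;
}

/* True if single-letter extension `letter` is in the base part,
 * i.e. after "rv<XLEN>" and before the first '_'. */
static bool isa_has_base_letter(const char *isa, char letter) {
  if (strncmp(isa, "rv", 2) != 0) {
    return false;
  }
  const char *p = isa + 2;
  while (*p >= '0' && *p <= '9') {
    p++;  /* skip the XLEN digits, e.g. "64" */
  }
  for (; *p != '\0' && *p != '_'; p++) {
    if (*p == letter) {
      return true;
    }
  }
  return false;
}

/* Pick the best available tier; highest priority is checked first. */
crc_tier_t choose_crc_tier(const char *isa) {
  if (isa_has_base_letter(isa, 'v') && isa_has_ext(isa, "zvbc")) {
    return CRC_TIER_VCLMUL;
  }
  if (isa_has_ext(isa, "zbc") || isa_has_ext(isa, "zbkc")) {
    return CRC_TIER_CLMUL;
  }
  return CRC_TIER_SOFTWARE;
}
```

Running the selection once at library load and caching the result (for instance in a function pointer) would keep the per-call cost at zero, and an unknown or unparsable ISA string simply lands on the software path.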

I will update the PR with the requested comments and documentation from @steveloughran shortly to ensure the implementation is well-explained for future developers.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|---|---|---|---|---|
| +0 🆗 | reexec | 24m 33s | | Docker mode activated. |
| | | | _ Prechecks _ | |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ | |
| +1 💚 | mvninstall | 47m 17s | | trunk passed |
| +1 💚 | compile | 16m 57s | | trunk passed |
| +1 💚 | mvnsite | 2m 45s | | trunk passed |
| +1 💚 | shadedclient | 105m 9s | | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ | |
| +1 💚 | mvninstall | 1m 13s | | the patch passed |
| +1 💚 | compile | 16m 8s | | the patch passed |
| +1 💚 | cc | 16m 8s | | the patch passed |
| +1 💚 | golang | 16m 8s | | the patch passed |
| +1 💚 | javac | 16m 8s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | mvnsite | 2m 42s | | the patch passed |
| +1 💚 | shadedclient | 41m 1s | | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ | |
| +1 💚 | unit | 23m 45s | | hadoop-common in the patch passed. |
| +1 💚 | asflicense | 2m 9s | | The patch does not generate ASF License warnings. |
| | | 216m 46s | | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8031/2/artifact/out/Dockerfile |
| GITHUB PR | #8031 |
| Optional Tests | dupname asflicense compile cc mvnsite javac unit codespell detsecrets golang |
| uname | Linux df771ca26052 5.15.0-164-generic #174-Ubuntu SMP Fri Nov 14 20:25:16 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / f34e1b6 |
| Default Java | Red Hat, Inc.-1.8.0_472-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8031/2/testReport/ |
| Max. process+thread count | 3150 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8031/2/console |
| versions | git=2.43.7 maven=3.9.11 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.
