@bwesterb

Somewhat surprisingly, removing the SHA-3 assembly leads to a small speedup on several benchmarks. Results will vary per platform, but unless assembly gives a dramatic and clear speedup, it isn't worth maintaining.

Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz

name                   old time/op   new time/op   delta
PermutationFunction-8    378ns ± 1%    355ns ± 3%  -6.12%  (p=0.000 n=10+9)
Sha3_512_MTU-8          7.73µs ± 1%   8.45µs ±23%  +9.30%  (p=0.003 n=9+10)
Sha3_384_MTU-8          5.61µs ± 4%   5.65µs ±12%    ~     (p=0.853 n=10+10)
Sha3_256_MTU-8          4.47µs ± 6%   4.50µs ±13%    ~     (p=0.579 n=10+10)
Sha3_224_MTU-8          4.21µs ± 4%   4.06µs ± 5%  -3.67%  (p=0.001 n=10+10)
Shake128_MTU-8          3.62µs ± 2%   3.43µs ± 2%  -5.30%  (p=0.000 n=9+10)
Shake256_MTU-8          3.93µs ± 2%   3.77µs ± 4%  -4.06%  (p=0.000 n=10+10)
Shake256_16x-8          55.3µs ± 1%   54.8µs ± 5%    ~     (p=0.315 n=9+10)
Shake256_1MiB-8         3.03ms ± 3%   3.08ms ± 8%    ~     (p=0.353 n=10+10)
Sha3_512_1MiB-8         5.61ms ± 3%   5.37ms ± 2%  -4.20%  (p=0.000 n=10+10)

name                   old speed     new speed     delta
PermutationFunction-8  530MB/s ± 1%  564MB/s ± 3%  +6.53%  (p=0.000 n=10+9)
Sha3_512_MTU-8         175MB/s ± 1%  161MB/s ±19%  -7.76%  (p=0.003 n=9+10)
Sha3_384_MTU-8         241MB/s ± 4%  240MB/s ±11%    ~     (p=0.853 n=10+10)
Sha3_256_MTU-8         302MB/s ± 5%  301MB/s ±12%    ~     (p=0.579 n=10+10)
Sha3_224_MTU-8         321MB/s ± 4%  333MB/s ± 5%  +3.83%  (p=0.001 n=10+10)
Shake128_MTU-8         373MB/s ± 2%  394MB/s ± 2%  +5.62%  (p=0.000 n=9+10)
Shake256_MTU-8         343MB/s ± 3%  358MB/s ± 4%  +4.24%  (p=0.000 n=10+10)
Shake256_16x-8         296MB/s ± 1%  299MB/s ± 5%    ~     (p=0.315 n=9+10)
Shake256_1MiB-8        347MB/s ± 3%  341MB/s ± 7%    ~     (p=0.353 n=10+10)
Sha3_512_1MiB-8        187MB/s ± 3%  195MB/s ± 2%  +4.38%  (p=0.000 n=10+10)
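
The two tables above are two views of the same measurements: the speed column is bytes per op divided by time per op, and delta is the relative change from old to new. Here is a minimal sketch of that arithmetic, assuming the permutation benchmark processes one 200-byte Keccak-f[1600] state per op (an assumption, but one consistent with 378ns at 530MB/s). The helper names are hypothetical and not part of the repository. Results land close to the tabulated figures rather than exactly on them, presumably because benchstat works from the unrounded samples.

```go
package main

import "fmt"

// throughputMBps converts time/op into MB/s, with 1 MB = 1e6 bytes
// (the unit Go's testing package uses when reporting speed).
func throughputMBps(bytesPerOp int, nsPerOp float64) float64 {
	// bytes per nanosecond equals GB/s; multiply by 1e3 to get MB/s.
	return float64(bytesPerOp) / nsPerOp * 1e3
}

// relativeDelta returns the change from oldV to newV as a percentage of oldV.
// Negative is an improvement for time/op, positive for speed.
func relativeDelta(oldV, newV float64) float64 {
	return (newV - oldV) / oldV * 100
}

func main() {
	// Assumed: one Keccak-f[1600] state (1600 bits = 200 bytes) per op.
	const stateBytes = 200

	oldNs, newNs := 378.0, 355.0 // PermutationFunction-8 time/op from the table

	fmt.Printf("old speed:  %.1f MB/s\n", throughputMBps(stateBytes, oldNs))
	fmt.Printf("new speed:  %.1f MB/s\n", throughputMBps(stateBytes, newNs))
	fmt.Printf("time delta: %+.2f%%\n", relativeDelta(oldNs, newNs))

	// Same check for a 1 MiB input: Sha3_512_1MiB-8 at 5.61ms/op.
	fmt.Printf("1MiB speed: %.1f MB/s\n", throughputMBps(1<<20, 5.61e6))
}
```

The same conversion explains why a row with a wide error bar, such as Sha3_512_MTU-8 at ±23%, shows a correspondingly noisy speed figure: the speed is derived from the timing, not measured independently.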

@bwesterb bwesterb requested a review from armfazh April 18, 2023 11:13
@armfazh armfazh merged commit 7955403 into main Apr 18, 2023
@armfazh armfazh deleted the bas/remove-sha3-asm branch April 18, 2023 22:06