Skip to content

Conversation

@mdcornu
Copy link
Contributor

@mdcornu mdcornu commented Jun 16, 2025

This PR introduces new low-level implementations of the SHA3 and SHAKE algorithms using AVX512VL extensions as proposed in this discussion.

Summary of Changes:

  • Added two AVX512VL-based implementations:
    • SHA3-256/384/512 and SHAKE128/256 single buffer implementations
    • SHAKE128 and SHAKE256 multi-buffer implementations
  • Implementations use 256-bit AVX512VL instructions for broader compatibility (including AVX10 in the future).
  • Compile-time flag: OQS_USE_SHA3_AVX512VL (enabled by default on x86 platforms)
  • Runtime Selection Logic (slight change from original proposal):
    • Instead of the AVX512VL module checking for AVX512 support and falling back to XKCP, the XKCP module now performs the runtime check.
    • If AVX512VL is available, the XKCP module sets the default callbacks to use the AVX512VL implementations.
    • This approach minimizes changes to the XKCP module and in common source modules in general.
  • Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
  • Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider will also need to be ready for review and merge by the time this is merged.)

Thank you for reviewing this contribution. We’re happy to make adjustments based on your feedback.

mdcornu and others added 4 commits June 11, 2025 15:23
Co-authored-by: Tomasz Kantecki <[email protected]>
Co-authored-by: Erdinc Ozturk <[email protected]>
Signed-off-by: Marcel Cornu <[email protected]>
Signed-off-by: Tomasz Kantecki <[email protected]>
Co-authored-by: Marcel Cornu <[email protected]>
Signed-off-by: Tomasz Kantecki <[email protected]>
Signed-off-by: Marcel Cornu <[email protected]>
Signed-off-by: Marcel Cornu <[email protected]>
@SWilson4
Copy link
Member

Thanks for the contribution! Could you please add a CI test for this feature?

@dstebila
Copy link
Member

Thanks for the contribution! Could you please add a CI test for this feature?

What do you specifically mean by that? Wouldn't this code get activated automatically on any supported platform? If so then it would be a matter of ensuring the CI matrix does include such a platform. Are we able to do that?

@SWilson4
Copy link
Member

Thanks for the contribution! Could you please add a CI test for this feature?

What do you specifically mean by that? Wouldn't this code get activated automatically on any supported platform? If so then it would be a matter of ensuring the CI matrix does include such a platform. Are we able to do that?

Ah, of course, it's enabled by default—my mistake. In that case, I think it would be good to add a CI config that explicitly switches the new option off in order to ensure that the AVX2 implementation still gets tested. I expect that the GitHub runners will support both.

@mdcornu
Copy link
Contributor Author

mdcornu commented Jun 18, 2025

Ah, of course, it's enabled by default—my mistake. In that case, I think it would be good to add a CI config that explicitly switches the new option off in order to ensure that the AVX2 implementation still gets tested. I expect that the GitHub runners will support both.

Sure. That can be done with OQS_USE_SHA3_AVX512VL=OFF. I'll add a config for that 👍

@mdcornu mdcornu requested a review from SWilson4 as a code owner June 18, 2025 20:20
@SWilson4
Copy link
Member

If I'm reading this correctly, it looks like we actually can't count on the GitHub default Ubuntu runner supporting the AVX512 extensions. I wonder if one of the larger (paid) runners would? Any suggestions @mdcornu?

@mdcornu
Copy link
Contributor Author

mdcornu commented Jun 19, 2025

If I'm reading this correctly, it looks like we actually can't count on the GitHub default Ubuntu runner supporting the AVX512 extensions. I wonder if one of the larger (paid) runners would? Any suggestions @mdcornu?

I looked into this and you're right that the free runners only support AVX2. I tested adding a CI test that uses Intel SDE to emulate running on Skylake and that works to run the AVX512 code path. However, it's quite slow and with a minimal build (ML-KEM & ML-DSA) it's timing out after an hour. I'll keep experimenting with that to try reduce the testing time. Let me know if there are any other suggestions.

@SWilson4
Copy link
Member

If I'm reading this correctly, it looks like we actually can't count on the GitHub default Ubuntu runner supporting the AVX512 extensions. I wonder if one of the larger (paid) runners would? Any suggestions @mdcornu?

I looked into this and you're right that the free runners only support AVX2. I tested adding a CI test that uses Intel SDE to emulate running on Skylake and that works to run the AVX512 code path. However, it's quite slow and with a minimal build (ML-KEM & ML-DSA) it's timing out after an hour. I'll keep experimenting with that to try reduce the testing time. Let me know if there are any other suggestions.

If you're not already doing this, what about building the library without the emulator and only using the emulator at the test stage? We also don't need to run all of the tests.

@tkanteck
Copy link
Contributor

Thanks @mdcornu. Newly added tests, limited in scope, on emulated Skylake server platform passed - see more here.
Skylake was the first platform to introduce AVX512 and this was the rationale behind this choice.

@SWilson4 are you OK with the test scope and the emulated platform?

@SWilson4
Copy link
Member

Thanks @mdcornu. Newly added tests, limited in scope, on emulated Skylake server platform passed - see more here. Skylake was the first platform to introduce AVX512 and this was the rationale behind this choice.

@SWilson4 are you OK with the test scope and the emulated platform?

Yes, that looks fine to me. Thank you!

I see that the Travis build on s390x is failing, which is confusing to me since the GitHub UI says that it passed on a previous commit with no diff except for GitHub workflow files. I've re-triggered the Travis build and also triggered the Travis build on liboqs main as well in case it's an environment issue. Once that failure is resolved, this will be good to merge from my point of view.

Copy link
Member

@SWilson4 SWilson4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging as the same Travis failure is happening on main. Maybe an out of memory issue—pinging @bhess to take a look.

@SWilson4 SWilson4 merged commit 8f92606 into open-quantum-safe:main Jun 20, 2025
87 of 88 checks passed
@tkanteck
Copy link
Contributor

Thanks 👍

@mdcornu
Copy link
Contributor Author

mdcornu commented Jun 23, 2025

Thanks for the quick turnaround on this. Should I close the discussion now that this is merged?

@SWilson4
Copy link
Member

Thanks for the quick turnaround on this. Should I close the discussion now that this is merged?

Yes please, and thanks again for contributing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants