Skip to content

gh-130213: update hacl_star_rev to 322f6d58290e0ed7f4ecb84fcce12917aa0f594b #130960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 15, 2025

Conversation

chris-eibl
Copy link
Member

@chris-eibl chris-eibl commented Mar 7, 2025

Vendor hacl_star from rev hacl-star/hacl-star@322f6d5 which integrated hacl-star/hacl-star#1025 on main.

How to blurb it?

If the abstraction (PR-1025) is the only change worth noting here in Python since the last sync (hacl-star/hacl-star@f218923ef2417d963d7efc7951593ae6aef613f7=), then maybe

"update to latest hacl-star to fix build issues with older clang compilers"

but where would it belong? In Misc/NEWS.d/next/Build?

# Adjust the include path to reflect the local directory structure
$sed -i 's!#include.*types.h"!#include "krml/types.h"!g' "${all_files[@]}"
$sed -i 's!#include.*compat.h"!!g' "${all_files[@]}"
$sed -i 's!#include "fstar_uint128_msvc.h"!#include "krml/fstar_uint128_msvc.h"!g' include/krml/internal/types.h
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the current directory layout, it seemed best to me to put these two headers from krmllib/dist/minimal into include/krml, like all the other lib_files. But then I need to patch here.

BTW: I am not a sed expert - quite the contrary. For sure this two replacements can/shall be done in one line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do not need fstar_uint128_msvc.h, see below #130960 (comment)

@@ -707,12 +707,12 @@ _blake2_blake2b_copy_impl(Blake2Object *self)
switch (self->impl) {
#if HACL_CAN_COMPILE_SIMD256
case Blake2b_256:
cpy->blake2b_256_state = Hacl_Hash_Blake2b_Simd256_copy(self->blake2b_256_state);
cpy->blake2b_256_state = Hacl_Hash_Blake2b_Simd256_copy0(self->blake2b_256_state);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is interesting. I thought about copying/using dist\portable-gcc-compatible\clients\krmlrenamings.h, but there I'd only get

#define Hacl_Streaming_Blake2b_256_copy Hacl_Hash_Blake2b_Simd256_copy0

which was a too far stretch for me to replace for Hacl_Hash_Blake2b_Simd256_copy, or

#define Hacl_Hash_Blake2b_256_copy Hacl_Hash_Blake2b_Simd256_copy

which does not help, because Hacl_Hash_Blake2b_Simd256_copy does not exist.
Maybe the above line should read

#define Hacl_Hash_Blake2b_256_copy Hacl_Hash_Blake2b_Simd256_copy0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bug in krmlrenamings.h let me look into this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed upstream

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed upstream

break;
#endif
#if HACL_CAN_COMPILE_SIMD128
case Blake2s_128:
cpy->blake2s_128_state = Hacl_Hash_Blake2s_Simd128_copy(self->blake2s_128_state);
cpy->blake2s_128_state = Hacl_Hash_Blake2s_Simd128_copy0(self->blake2s_128_state);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Additionally, just to reassure I still get an error from clang-cl 18.1.8, I temporarily placed

typedef __m256i bla;

which did result in an error. So I am quite confident, that for (older) clang-cl the visibility of the intrinsincs is no longer an issue!

Thanks a lot @msprotz!


#include "FStar_UInt128.h"
#include "FStar_UInt_8_16_32_64.h"
#include "LowStar_Endianness.h"
Copy link
Member Author

@chris-eibl chris-eibl Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I only get a warning from clang-cl on Windows:

  In file included from e:\cpython_clang\Modules\_hacl\include\krml/internal/types.h:99:
e:\cpython_clang\Modules\_hacl\include\krml/fstar_uint128_gcc64.h(28,10): warning : non-portable path to file '"lowstar_endianness.h"'; specified path differs in case from file name on disk [-Wnonportable-include-path] [e:\cpython_clang
\PCbuild\pythoncore.vcxproj]

The problem might be, that there are:

  • hacl-star\dist\karamel\krmllib\dist\minimal\LowStar_Endianness.h
  • hacl-star\dist\karamel\include\krml\lowstar_endianness.h

Up to now only the latter was copied. Two same names (for Windows, since case insensitive, but clang warns) in different folders. Which one is needed here? If we can change the include to lower case #include "lowstar_endianness.h", then we'd be fine.

If not, where to put the one with capital letters in the current layout and how to patch here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously, IS_MSVC64 is not defined in case of Windows, since clang-cl reports the warning in fstar_uint128_gcc64.h:

#if !defined(KRML_VERIFIED_UINT128) && defined(IS_MSVC64)
#include "krml/fstar_uint128_msvc.h"
#elif !defined(KRML_VERIFIED_UINT128) && defined(HAS_INT128)
#include "krml/fstar_uint128_gcc64.h"

We anyway only vendored gcc-compatible and not msvc-compatible, so maybe I'll drop fstar_uint128_msvc.h"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

msvc-compatible replaces the usage of VLAs with calls to alloca for the sake of MSVC compatibility -- but for the code that you care about, there should be no such VLAs in the code, so I think we should be ok

uint128 is another, separate problem: the HACL* code assumes the existence of a certain FStar_UInt128_uint128 type, so we need to define it by hand depending on platform and toolchain... there are three options

  • gcc builtins for 64-bit platforms (fstar_uint128_gcc64.h)
  • MSVC builtins for 64-bit platforms (fstar_uint128_msvc.h)
  • portable C implementation, verified (FStar_UInt128_Verified.h)

I think what the previous hand-written types.h did was always pick the portable C implementation on the basis that uint128 was not used in a deep, performance-critical way, and so it was not worth the packaging burden

if you want to retain this behavior but use the same types.h as upstream, you should be able to -DKRML_VERIFIED_UINT128 and then get rid of fstar_uint128_{gcc64,msvc.h} which would simplify the packaging story and would resolve the LowStar_Endianness.h / lowstar_endianness.h issue

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding the endianness issue, this is an original design mistake: lowercase means hand-written, and uppercase means auto-generated:

  • LowStar_Endianness.h contains the prototypes expected by the generated code, but no implementation
  • lowstar_endianness.h contains the hand-written implementation that provides those prototypes

I agree that this is definitely worth fixing, but if we can ditch fstar_uint128_gcc64.h then maybe we can push this to a separate issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want to retain

I am no expert here and fully trust your expertise. Maybe others wanna chime in.
I think this PR is still flying under the radar - I just wanted to see how it goes and whether it helps you getting your PR hacl-star/hacl-star#1025 upstream.

Let me know when you think my PR is ready for others to review or who from Python should be pinged.

This currently fails due to missing make regen-sbom. I'll update and keep this PR draft. Will need another sync anyway when your PR gets upstream.

get rid of fstar_uint128_{gcc64,msvc.h}
Will definitely simplify things.

IIUC, since previously the hand written types.h did

typedef struct FStar_UInt128_uint128_s {
  uint64_t low;
  uint64_t high;
} FStar_UInt128_uint128, uint128_t;

we won't be any slower and do the same as previously.
But we could be slightly faster with the special implementations.

Maybe keep that for another PR when the renamings are done?

I added commits addressing this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe keep that for another PR when the renamings are done?

This sounds excellent!

remove fstar_uint128_gcc64.h and fstar_uint128_msvc.h
make regen-sbom
#include resolved using non-portable Microsoft search rules
@@ -143,7 +131,7 @@ $sed -i -z 's!\(extern\|typedef\)[^;]*;\n\n!!g' include/krml/FStar_UInt_8_16_32_
$sed -i 's!#include.*Hacl_Krmllib.h"!!g' "${all_files[@]}"

# Use globally unique names for the Hacl_ C APIs to avoid linkage conflicts.
$sed -i -z 's!#include <string.h>\n!#include <string.h>\n#include "python_hacl_namespaces.h"\n!' Hacl_Hash_*.h
$sed -i -z 's!#include <string.h>!#include <string.h>\n#include "python_hacl_namespaces.h"!' Hacl_Hash_*.h
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without removing the \n, the python_hacl_namespaces.h got nowhere inserted.
No glue why. Maybe because I am running this in WSL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might have the evil core.autocrlf setting set to true, in which case git will rewrite the files to have windows-style line endings -- in any case, good to change the sed 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought so, too, but in WSL I have Unix-style endings (there I did run refreh.sh) and on Windows I have Windows-style endings. But that shouldn't be a problem, since I was running refresh.sh on WSL?

So this change is good either way? Great :)

$sed -i 's!#include "fstar_uint128_struct_endianness.h"!#include "krml/fstar_uint128_struct_endianness.h"!g' include/krml/internal/types.h

# use KRML_VERIFIED_UINT128
$sed -i -z 's!#define KRML_TYPES_H!#define KRML_TYPES_H\n#define KRML_VERIFIED_UINT128!g' include/krml/internal/types.h
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested py @msprotz, let's go here with KRML_VERIFIED_UINT128 and do not use fstar_uint128_{gcc64,msvc.h} for now.

@msprotz
Copy link
Contributor

msprotz commented Mar 7, 2025

Ok so I need to debug the krmlrenamings bug and review this PR. I should be able to tackle this early next week. Let me know if this is more pressing and I'll see what I can do. Thanks!

@chris-eibl
Copy link
Member Author

Oh for me this is absolutely not pressing.

Have a great weekend!

Copy link
Contributor

@msprotz msprotz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks great to me -- one more round of updating from upstream and you'll be good (the copy0, malloc_with_key0 will be fixed).

Just want to double-check that you're ok with slightly more verbose headers, in exchange for a simpler process for refreshing HACL*.

Finally, FStar_UInt128.h can go I think.

Thanks so much!

Makefile.pre.in Outdated
@@ -666,12 +666,14 @@ LIBHACL_BLAKE2_OBJS= \
$(LIBHACL_SIMD256_OBJS)

LIBHACL_HEADERS= \
Modules/_hacl/include/krml/FStar_UInt128.h \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this file needed? git grep mentions no reference to it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great spot: this is a leftover from fstar_uint128_gcc64.h which did a #include "FStar_UInt128.h". Now that we do not use fstar_uint128_gcc64.h anymore, I will remove FStar_UInt128.h, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like this file should go

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same deal, this file could probably go, and you could sed its inclusion away -- up to you, this is really for stuff that is irrelevant to Python

@@ -195,6 +221,24 @@
} \
} while (0)

#if defined(_MSC_VER) && _MSC_VER < 1900
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same deal -- this all used to be sed'ed away

remove FStar_UInt128.h
revert interim fixes in blake2module.c
regen sbom
@chris-eibl chris-eibl changed the title gh-130213: update hacl_star_rev to 809c320227eecc61a744953f1ee574b4f24aabe3 gh-130213: update hacl_star_rev to 322f6d58290e0ed7f4ecb84fcce12917aa0f594b Mar 11, 2025
@chris-eibl chris-eibl marked this pull request as ready for review March 11, 2025 07:33
@chris-eibl
Copy link
Member Author

@msprotz:

  • sync with hacl-star/hacl-star@322f6d5
  • remove FStar_UInt128.h (not needed anymore because fstar_uint128_{gcc64,msvc.h} are gone)
  • revert interim fixes in blake2module.c
  • regen sbom

(I personally don't mind it, and I think it's better to make the script simple for consuming new versions of HACL*, but just want to check you're aware of this)

Regarding the "extra cruft": when I started, I could not use the "uncrufting" out of the box and it was easier for me to "just go with what comes from hacl-star".

As you mention, it is easier for people like me who are no hacl-star experts to sync with upstream if there is less "uncrufting" happening. At least I'd like to keep the folder structure closer - creating a "public" stub types.h and everywhere patching it in instead of #include "krml/internal/types.h" seems too much to me.

But that's just my take on it - happy to adopt all changes you and the other reviewers suggest in refresh.sh.

@gpshead gpshead added 🔨 test-with-buildbots Test PR w/ buildbots; report in status section 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Mar 11, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gpshead for commit 427dfab 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F130960%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @gpshead for commit 427dfab 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F130960%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed 🔨 test-with-buildbots Test PR w/ buildbots; report in status section 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Mar 11, 2025
@gpshead
Copy link
Member

gpshead commented Mar 11, 2025

re: blurb to add the news:

"update to latest hacl-star to fix build issues with older clang compilers"
but where would it belong? In Misc/NEWS.d/next/Build?

yep, Build seems fine. as would Library. don't overthink it. the important thing is that we have a news entry. some thing cross categories.

@msprotz
Copy link
Contributor

msprotz commented Mar 11, 2025

@chris-eibl no worries! I'm already super grateful with your huge help so far... let me recap -- to be clear, I think all of this can be dealt with later:

For Hacl_Streaming_HMAC:

  • malloc may return OutOfMemory, InvalidAlgorithm (e.g. requesting Blake2b_256 on an ARM machine), or Success
  • reset may return InvalidLength (if trying to reset the state with a key of different length, this is not supported), or Success
  • update: MaximumLengthExceeded or Success
  • digest: OutOfMemory or Success
  • copy: may return NULL (indicates out of memory)

For all hash algorithms (NEW with this PR):

  • malloc, malloc_with_params_and_key, malloc_with_key: may return NULL (out of memory)
  • copy: may return NULL (out of memory)

For SHA3/Keccak only:

  • digest may return InvalidAlgorithm (if the algorithm is shake)
  • squeeze may return InvalidAlgorithm (if the algorithm is not shake)

I think all of these can be handled as a followup, I just thought it would be good to have it in writing here so that you can decide which of these are worth checking for. The reason I brought up other hash algorithms is that, since you requested (or maybe @picnixz ?) proper out of memory handling in HACL*, we now may return NULL for other algorithms (like hash algorithms), meaning that this PR will introduce new possibly-NULL return values as a side-effect of updating the vendored copy of HACL*.

For the record, Python ignores MaximumLengthExceeded on the basis that this cannot happen in practice.

@picnixz
Copy link
Member

picnixz commented Mar 12, 2025

(I requested it and in my HMAC PR I'm handling and converting possible return codes from HACL*)

$sed -i 's!#include "fstar_uint128_struct_endianness.h"!#include "krml/fstar_uint128_struct_endianness.h"!g' include/krml/internal/types.h

# use KRML_VERIFIED_UINT128
$sed -i -z 's!#define KRML_TYPES_H!#define KRML_TYPES_H\n#define KRML_VERIFIED_UINT128!g' include/krml/internal/types.h

# FStar_UInt_8_16_32_64 contains definitions useful in the general case, but not
# for us; trim!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The big "sed" you were thiking of is probably this one @msprotz but it appears that it now does nothing?

Copy link
Member Author

@chris-eibl chris-eibl Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted. This line indeed does no longer change the vendored files.

I found some more and removed them, too.

Maybe again an issue with \n like #130960 (comment)?

But if this is related to the evil core.autocrlf git setting, then ISTM that the CI step Check if generated files are up to date should fail?

On my machine core.autocrlf is unfortunately active, I didn't opt in for it, seems I got that "for free on Windows"?
I am now sure, that this is the root of the problem like @msprotz already hinted above. Still no glue why the CI did not fire ...

TBH, since the GHA runners are happy with the almost unmodified vendored files, I'd opt for "the less sed, the better".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, because

echo -e 'extern a();\n\ntypedef abc;\n\n' | sed -z 's!\(extern\|typedef\)[^;]*;\n\n!oh!g'

prints "ohoh" on https://cocalc.com/features/terminal so I guess there's something happening here.

then ISTM that the CI step should fail

I'm not sure we're actually running refresh.sh (I couldn't find it in the logs). And maybe the CI step is also configured with an evil core.autocrlf (note that to check the diffs, we are doing git add -u.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with both of you that these seds are too fragile and we're better off without them

@chris-eibl
Copy link
Member Author

Is there anything left do be done in this PR?
E.g. #130960 (comment)?
Seems this would be handled in #130960 (comment)?

Is this feasible for alpha 7 (2025-04-08, the last alpha)?
If not, shall we revisit the "workarounds" #130332 and #130447 or is there anyway no pressure because still possible after alpha 7, e.g. because treated as "bug, because failing to build"?

@picnixz
Copy link
Member

picnixz commented Mar 15, 2025

Seems this would be handled in #130960 (comment)?

Yes.

Is this feasible for alpha 7 (2025-04-08, the last alpha)?

I think we can still ship something just before the beta (so until May). But the HMAC PR is ready technically, and everything is completed on my side (tests pass on my machine and everything builds correctly).

is there anyway no pressure because still possible after alpha 7, e.g. because treated as "bug, because failing to build"?

It would be treated as a bugfix and not a new feature so no worries here I think.

@gpshead gpshead merged commit 0ce056d into python:main Mar 15, 2025
46 checks passed
@picnixz
Copy link
Member

picnixz commented Mar 15, 2025

I've actually just observed that:

For all hash algorithms (NEW with this PR):
malloc, malloc_with_params_and_key, malloc_with_key: may return NULL (out of memory)
copy: may return NULL (out of memory)

A follow-up PR could be done to raise/check that we raise PyErr_NoMemory() in existing hash functions.

@chris-eibl chris-eibl deleted the hacl-abstraction2 branch March 15, 2025 19:22
@chris-eibl
Copy link
Member Author

A follow-up PR

Shall we create an issue so that this does not get lost? What is the common approach here in such cases?

@picnixz
Copy link
Member

picnixz commented Mar 15, 2025

Shall we create an issue so that this does not get lost? What is the common approach here in such cases?

Yes, create a new issue from my comment for instance. The title can be like "Handle NULL returned by HACL* allocation functions"

plashchynski pushed a commit to plashchynski/cpython that referenced this pull request Mar 17, 2025
…2917aa0f594b (pythonGH-130960)

Updates the HACL* implementation used by hashlib from upstream sources.
seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025
…2917aa0f594b (pythonGH-130960)

Updates the HACL* implementation used by hashlib from upstream sources.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants