Skip to content

On the size_t / uintptr_t guard #329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
geofft opened this issue Jun 3, 2021 · 7 comments
Closed

On the size_t / uintptr_t guard #329

geofft opened this issue Jun 3, 2021 · 7 comments
Labels
• arch Related to a particular arch, `arch/` support in general...

Comments

@geofft
Copy link

geofft commented Jun 3, 2021

In #196 it was found that the __builtin_types_compatible_p(size_t, uintptr_t) check fails onCONFIG_ARM, and that PR skipped the check.

It turns out that the types are in fact compatible at the ABI level - they're both 32-bit integers. It turns out that they're just not compatible as C types. You can get a better error message if you try to assign a size_t pointer to a uintptr_t * or something:

/root/foo.c: In function 'init_module':
/root/foo.c:6:13: error: initialization of 'size_t *' {aka 'unsigned int *'} from incompatible pointer type 'uintptr_t *' {aka 'long unsigned int *'} [-Werror=incompatible-pointer-types]
 size_t *b = &a;

C doesn't believe that unsigned int and long unsigned int are compatible types in terms of the abstract C idea of int and long - but on 32-bit ARM, they're not actually different (what else could they be?).

So I think it should be safe to change this test to something else, like sizeof(size_t) == sizeof(uintptr_t) && __alignof__(size_t) == __alignof__(uintptr_t), which does pass on ARM, and make it not-platform-specific.


I added this check (without thinking too hard about how _builtin_types_compatible_p works) back in fishinabarrel/linux-kernel-module-rust@1e8ca1dc because, at least in the way we're using it, the way that rust-lang/rust-bindgen#1671 was fixed effectively adds a new bug. The bug it adds is that it translates the C size_t type to u32 or whatever instead of to usize, which means you can't straightforwardly use it in contexts where Rust expects a usize, e.g., slice indexing.

We are working around that (new) bug by passing --size_t-is-usize to bindgen (in rust/Makefile). But that simply assumes that the two types are compatible. I think in every practical place they are (and in particular, I'm pretty sure they are everywhere mainline Linux and/or Rust currently runs), but the bug originally reported in 1671 is that this isn't guaranteed by the C standard, and there might be places (like 16-bit x86 with segmentation, I think) where it's not.

In case we ever end up on such a platform, somehow, we don't want to be binding C functions that use size_t with a Rust usize because that's either ABI-incompatible or prone to integer overflow bugs. We either want to decide to not support that platform or to very carefully fix our uses of usize. (Or, possibly, try to get the kernel's ABI on that platform to change to make those types the same.)

That is, the fact that issue 1671 is closed doesn't mean we no longer need the guard. We very much need the guard on all platforms because we use --size_t-is-usize.


I'll send in a patch to fix the test and also improve the comment / assertion message.

I'm going to see if I can get a change into bindgen where bindgen itself does the compatibility check when it generates bindings and also defaults to --size_t-is-usize, and if the compatibility check fails (and something uses size_t), you need to specify an option saying that you're okay with binding size_t to u32/u64/whatever. If that lands, then we can remove the guard.

@ojeda ojeda added prio: normal • arch Related to a particular arch, `arch/` support in general... labels Jun 3, 2021
@nbdd0121
Copy link
Member

nbdd0121 commented Jun 3, 2021

Kernel defines uintptr_t to be unsigned long, and size_t to be unsigned int in 32-bit arch and unsigned long in 64-bit arch. Technically according to C spec, int and long are not compatible, and so are uintptr_t and size_t, but in kernel they are guaranteed to be of the same size and align. Are there 32-bit archs where int and long are not ABI-compatible?

@alex
Copy link
Member

alex commented Jun 3, 2021 via email

@geofft
Copy link
Author

geofft commented Jun 3, 2021

I was briefly afraid that ARM's uintptr_t was larger than 32 bits because of LPAE and we had an actual problem, but it turns out it's jut the int vs. long thing :)

I don't know for sure but I expect there aren't any 32-bit arches where size_t and uintptr_t are going to be different and Linux can run. I suppose a wacky arch might choose to make int 16-bit or something, but I think getting Linux to work there is probably going to involve making the kernel compile with a 32-bit int even if userspace programs use a 16-bit one, or at least changing the definition of size_t on that platform.

(btw - no need for Windows, on plain old x86_64 Linux, int is 32-bit and long is 64-bit. In fact, on 64-bit Windows, both int and long are 32-bit, presumably for compatibility, and you need long long for 64-bit! See https://docs.microsoft.com/en-us/cpp/cpp/data-type-ranges)

@nbdd0121
Copy link
Member

nbdd0121 commented Jun 3, 2021

BTW there are platforms where size_t and uintptr_t are different, e.g. CHERI. CHERI uses 64-bit size_t with 128-bit uintptr_t. There is no Linux support for such platforms though, and Rust currently do not support differently-sized size_t and uintptr_t either, see rust-lang/rust#65473.

@geofft
Copy link
Author

geofft commented Jun 3, 2021

Filed rust-lang/rust-bindgen#2062.

Thanks for the link to that Rust issue - I wholeheartedly agree with

To me, the ideal solution is to change usize to be in line with size_t and not uintptr_t. ... I claim that this is only problematic on architectures where uintptr_t != size_t. As such, code breakage from changing this assumption is constrained to targets where the code was already broken.

If we ever run into an architecture (like CHERI) where size_t isn't compatible with usize-as-Rust-defines-it-right-now, the right answer is going to be to push Rust to redefine usize IMO.

@nbdd0121
Copy link
Member

nbdd0121 commented Jun 3, 2021

Thanks for the link to that Rust issue - I wholeheartedly agree with

To me, the ideal solution is to change usize to be in line with size_t and not uintptr_t. ... I claim that this is only problematic on architectures where uintptr_t != size_t. As such, code breakage from changing this assumption is constrained to targets where the code was already broken.

If we ever run into an architecture (like CHERI) where size_t isn't compatible with usize-as-Rust-defines-it-right-now, the right answer is going to be to push Rust to redefine usize IMO.

Well, the issue is that usize is defined to be uintptr_t now, and people are relying on this to do arithmetic on pointers, so this is not a feasible change. Rust could split usize into usize and uptr but this isn't a backward compatible change. On the other hand, tweak bindings for such platforms so that size_t isn't mapped to usize is simpler than breaking the language or the entire ecosystem.

And this isn't an issue for us TBH. Many kernel code pretty much assumed that sizeof(size_t) = sizeof(ptrdiff_t) = sizeof(long) = sizeof(uintptr_t) = sizeof(void*), just like Rust does. And as you mentioned in the PR, unlike userspace code, kernel can select pretty much any data model it prefers. So if kernel is ever going to support these platforms, it is more likely a compatible data model is picked rather than finding and fixing all code that relies on this assumption.

@ojeda
Copy link
Member

ojeda commented Jun 3, 2021

Closed by #330.

@ojeda ojeda closed this as completed Jun 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
• arch Related to a particular arch, `arch/` support in general...
Development

No branches or pull requests

4 participants