RFC: Make *T not nullable #10571
Comments
I feel a little weird having Rust types like |
We need to discuss this further, there are interesting issues on both sides of this. P-backcompat-lang. |
Some comments:
Upside of the change:
Downside of the change:
|
See some discussion in #9788 |
I like this a lot. (It's similar to what I tried proposing a few months ago.) I think the mentioned downside is an upside. It should be explicit whether nulls are possible or not. Why should it be different for If we do this then the |
My issue with this is that I consider the current enum-pointer optimization to be just that, an optimization, and relying on that behaviour doesn't sit right with me. Currently, I am firmly against anything that makes This also, for some reason, singles out 0 as a bad value. The whole point of the raw pointers is that Rust cannot determine if they point to any valid location. Why should we treat 0 as special in this case? What about other bogus pointers? What about tagged pointers? They aren't safe either and I don't see how dubiously preventing I can see this only producing misleading code and faulty assumptions. Canonicalising If |
I extremely strongly agree with @Aatch. Making |
I also agree with @Aatch. |
I approve of this change. It will ensure that writers of C bindings document their assumptions and that their code is consistent with their assumptions. In debug builds, Rust could automatically insert assertions to test these assumptions (e.g. that certain C functions never return null).
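The debug-build assertion idea above could be sketched roughly as follows in present-day Rust. This is only an illustration: `c_get_buffer` is a hypothetical stand-in for a C function assumed never to return null, `expect_non_null` is an invented helper name, and `std::ptr::NonNull` is a library type that postdates this thread, not the language change under discussion.

```rust
use std::ptr::NonNull;

/// Hypothetical stand-in for a C function documented never to return null.
pub fn c_get_buffer(storage: &mut [u8; 4]) -> *mut u8 {
    storage.as_mut_ptr()
}

/// Records the binding's assumption in the type and checks it in debug builds.
pub fn expect_non_null(ptr: *mut u8) -> NonNull<u8> {
    // Compiled out in release builds, active in debug builds.
    debug_assert!(!ptr.is_null(), "binding assumed a non-null return value");
    // SAFETY: relies on the documented assumption that `ptr` is never null.
    // If that assumption is wrong in a release build, this is undefined behavior.
    unsafe { NonNull::new_unchecked(ptr) }
}
```

A binding author who cannot rule out null would use `NonNull::new(ptr)` instead, which returns an `Option` and forces the caller to handle the null case.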
They are not pointers! They should be represented by corresponding Rust types depending on how they differ from pointers (both 0 and 1 are special; low bits are meant to be lopped off; etc). In the worst case, they can be represented in Rust as (newtyped) uints. |
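As a rough sketch of the "newtyped uint" suggestion, written in present-day Rust where `uint` has become `usize`: the names `Tagged` and `TAG_MASK` are invented for illustration, and the layout (two tag bits in the low end of a 4-byte-aligned address) is an assumption, not something specified in this thread.

```rust
/// Hypothetical newtype for a tagged pointer: an address whose low bits carry a tag.
/// Assumes the pointee is at least 4-byte aligned, so the low two bits are free.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Tagged(usize);

/// Mask selecting the (assumed) two tag bits.
pub const TAG_MASK: usize = 0b11;

impl Tagged {
    /// Packs an aligned address and a small tag into one word.
    pub fn new(addr: usize, tag: usize) -> Tagged {
        assert!(addr & TAG_MASK == 0, "address must be 4-byte aligned");
        assert!(tag <= TAG_MASK, "tag must fit in the low two bits");
        Tagged(addr | tag)
    }

    /// Returns only the tag bits.
    pub fn tag(self) -> usize {
        self.0 & TAG_MASK
    }

    /// Returns the address with the tag bits lopped off.
    pub fn addr(self) -> usize {
        self.0 & !TAG_MASK
    }
}
```

Because the value is an integer rather than a pointer type, nothing about nullability is implied; the address only becomes a pointer at the point where the code casts `addr()` back and takes responsibility for it.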
This could be really annoying in kernel code where |
If this is referring to
How then would you distinguish I don't get the attachment to C semantics, either. We have no trouble learning from C's mistakes in other parts of the language. Where else in the semantics (not syntax) of the language do we say "we have to do it this way because C does it this way, period", even if another way might be better? Why here? Finally, do (plural) you feel the same way about function pointers, where the same things (non-nullable, need Things I agree with are that writing the null pointer optimization in stone and assuming One further concern I would raise from an ergonomics perspective, overlapping with but not quite the same as things above, is that if it looks like a C pointer, and it sure does, then people are going to expect that it behaves like a C pointer (to some degree this might already be apparent). They're going to think it's the same thing as in C and use it in their |
@glehel I don't like the idea of pushing systems concerns further to the side in a systems programming language. There needs to be some type that you can use to directly deal with the hardware without Rust sticking its nose in. We have pointers that don't follow C semantics. We do need one that does. This isn't something we can change; we have to deal with C code. As for other values being passed to extern code being made into illegal values, yes, you're right, but in those cases either something has gone terribly wrong or you are making a terrible mistake. My issue is that it gives the illusion of safety where it's blatantly false. It's basically saying "these pointers are never null, except when they are", because it's not like the compiler will catch the little things like forgetting to wrap some type in an Option. Lastly, what, in light of these criticisms, would this change gain you? It doesn't make raw pointers any more safe. It doesn't make writing FFI bindings any easier. It could help catch a class of errors, but only by introducing a new class. All it does, to me, is make things more complicated for no good reason. |
@Aatch Agreed. Furthermore, all current FFI bindings can be created pretty easily by looking purely at the types involved in the C declaration. This absolutely is not the case with a non-nullable |
@huonw Sure, but almost no code ever uses it. And note that it's an opt-in to being non-nullable, whereas the proposal here makes it an opt-out. |
@nikomatsakis you listed as one upside of the change: "More accurate types"; I think that would be more correctly stated as "more precise types"; all doing this can buy you (I think) is the ability to directly express in a Rust-ic fashion a distinction between a nonnullable and a nullable pointer. But as they said in my high school chemistry class, precision is not accuracy. |
This comment from @Aatch, "If There are potentially three or more options here, not two, and I do not know which niko intended from the original description.
From the debates on this ticket, it seems like a lot of people have been assuming that some variant of (3) is what is being proposed. (There are also subvariants of (2.), for example where we remove So, @nikomatsakis: can you clarify which (sub)variant were you proposing? |
@pnkfelix I was proposing option 3, though I had considered Option 2 for a while. I find the "hands off my The main motivator here is the fact that we permit casts from random integers to |
Oh, the other motivator is that I think it's genuinely surprising that |
And one parting thought -- I agree that in general enum repr should remain undefined, but I am ok with specifying the pointer optimization. I suspect it'll be a de facto standard whatever we do. |
Hmm, I was thinking more about the question of kernel code. My first thought was since one could still construct a |
On Wed, Nov 27, 2013 at 01:12:01AM -0800, Felix S Klock II wrote:
True. But then, ain't that always the case? But I guess it's |
My original instinct was to go with option 1: "Keep things as they are". I think the main reason why I'm considering alternatives is that I too find it genuinely surprising that @nikomatsakis what about an option 2 subvariant where we also remove (This is basically my way of trying to deal with my option 3 probably being a weak version of option 2.) |
By the way, in case it is unclear, I do find it distasteful (emphatically "not clever") that a consequence of option 2 as I described it (and I think in expected practice, option 3 as well) an application of |
Automation tools would/should stick to
Here too, if the story is that the syntax of nullable pointers is changing from There are two wrinkles. One is that The other wrinkle is that, as mentioned earlier, Another direction you could approach it from. What's going unmentioned is the other major use case for unsafe pointers: as building blocks for Rustic smart pointers, and in data structures with invariants the type system can't express. In these cases you almost never want implicit nullability, and in fact it's a hindrance because it means that @pnkfelix: I agree that I don't like Option 2 either. :-) |
On Wed, Nov 27, 2013 at 03:00:58AM -0800, Felix S Klock II wrote:
Yes, clearly this is surprising too. |
@nikomatsakis: On second thought, kernels wanting to use 0 pointers for something real will probably need a project-wide attribute like |
You don't even need a keyword, you could just use something like |
Keep in mind that when you say "*T can point anywhere at anything", you're invoking undefined behavior in LLVM. |
@cmr As far as I'm aware, it's only undefined if you dereference the |
I agreed with this at first, but I don't anymore. I was working with some embedded code where 0 was a valid and used pointer. With this change, I wouldn't be able to use Rust for that project. What does this change actually win us, anyway? I don't find the upsides particularly compelling. |
@cmr: as @thestinger pointed out, this is just as relevant for other pointer types, i.e. you can't have |
I'm not worried about those other types. On Wed, Dec 4, 2013 at 2:35 PM, Gábor Lehel [email protected]:
|
I'm wondering what the precise meaning of What I have bouncing around in my head is that maybe there should be two types. One that shares all of the same properties and type system invariants as the safe pointer types, including non-nullability, except that it's up to the programmer, rather than the compiler, to uphold them, potentially gets followed by the GC, and so forth. This would mainly be used for things like smart pointers and data structures. And one that's truly just a raw memory address "like in C" (or perhaps instead asm?), with nothing at all assumed about it: the programmer can dereference it if she wants to, or not, and otherwise it's just as inert as an |
I would not expect the garbage collector to ever follow a |
In C, you're not allowed to do pointer arithmetic outside of the bounds of an object (with a special case allowing one-byte-past-the-end), you're not allowed to make arbitrary casts between pointers and you're definitely not allowed to dereference a null/dangling pointer. They're not just an address at all, and it's not possible to write something like an XOR-linked-list without hitting undefined behaviour due to the aliasing/derived pointer rules you must respect. LLVM inherits almost all of these semantics from C and |
@kballard what about @thestinger I know, which is why I said "or perhaps instead asm?". I don't personally care how C-like versus uint-like it is or isn't. |
@glehel: Hrm, I hadn't considered the fact that |
C++ programmers aren't going to be willing to make compromises for garbage collection, so it can't dictate the design of the language. If it's intended to be a fully optional feature, it's entirely a library/compiler issue and doesn't belong in language design. The standard library can use as many attributes as needed to support it. |
My assumption has been that when we add a proper Gc, we will probably have to also add some way for 3rd party libraries providing smart pointers and/or allocators to properly interoperate with it. (And if a library does not or cannot interoperate with it, then a task won't be able to compose that library with the Gc -- though I hope that attempts to perform erroneous compositions would at least be statically detected rather than dynamic failures.) The design is still quite fuzzy in my head, but this may involve any/all of:
These topics remain to be worked out. (I was about to say "I don't know what bearing the above has on the issue of making |
@pnkfelix: I expected it would be something like adding an attribute to fields with raw pointers the garbage collector should trace through. Anyway, adding lots of pain to low-level code is an incentive to maintain another library ecosystem. |
@thestinger hmm, I admit that my definition of "some other protocol" had not included that option. But I'm going to take the liberty of reinterpreting my own comment to now include that option (though I'm still not sure if it's what I would go with). |
I withdraw this suggestion. |
I think the fact that *T is nullable is an anachronism (or will be once #10570 is fixed). We should just use Option<*T> for nullable pointers. Anybody else have an opinion?

Nominating.
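For readers coming to this later, the shape of the proposal can be approximated in present-day Rust with the library type `std::ptr::NonNull<T>`, which did not exist when this issue was filed. The sketch below is an analogy under that assumption, not the `*T` change proposed here; the helper names `from_raw`, `to_raw`, and `same_size` are invented for illustration.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Converts a possibly-null raw pointer into the explicit form:
/// `None` for null, `Some` for everything else.
pub fn from_raw(p: *mut i32) -> Option<NonNull<i32>> {
    NonNull::new(p)
}

/// Converts back to a raw pointer, mapping `None` to null.
pub fn to_raw(p: Option<NonNull<i32>>) -> *mut i32 {
    p.map_or(std::ptr::null_mut(), |nn| nn.as_ptr())
}

/// Checks that the optional non-null pointer occupies the same space as a raw one.
pub fn same_size() -> bool {
    size_of::<Option<NonNull<i32>>>() == size_of::<*mut i32>()
}
```

The round trip through `from_raw` and `to_raw` loses no information, which is the property the "null pointer optimization" debated in the comments provides. A non-null result still says nothing about whether the pointer is valid to dereference.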