Description
What should the const interner do when encountering a pointer that doesn't point anywhere? Right now it does throw_ub_format!, but inside a const that is actually a lint that can be allowed. Also, the error is not very informative, as it doesn't say where in the const the dangling pointer is. With #71665 it does tcx.sess.span_err, which at least guarantees a hard error, but the error is still unspecific, and in the case of a dangling reference it is a duplicate, since validation also complains (with a much better error message).
Ideally we wouldn't have an error at all. We could either (1) make Miri+codegen not ICE on dangling AllocId, or (2) make the AllocId not dangling by creating "fake" allocations, or (3) adjust the interned allocation to not have a dangling pointer any more. There might be more options.
Is (1) realistic? I am not sure. It might also not be great to put that burden on the rest of the compiler. (Note though that since we "just" show a hard error diagnostic on a dangling AllocId, all parts of the compiler that still run after this must anyway be prepared to handle them properly.)
(2) is honestly my personal favorite. The "fake" allocations would have the right alignment but size 1, and undefined content, and we could send them off to LLVM like that. However, we'd have to make sure validation does not consider these "fake" allocations dereferenceable (not even for size 0). So in some sense it might be easiest to create them only during codegen when getting None for a given AllocId... which kind of makes this a variant of (1), maybe? However, we would have to make sure that alignment information is still available in codegen.
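To make the shape of (2) a bit more concrete, here is a toy model of the "create them only during codegen" idea. This is not rustc code: `Allocation`, `AllocMap`, `get_for_codegen`, and `is_dereferenceable` are all hypothetical names, and the separate `aligns` table is just one assumed way of keeping alignment information around after the real allocation is gone.

```rust
use std::collections::HashMap;

// Toy model, not rustc code: every type and function here is hypothetical.

#[derive(Clone, Debug)]
struct Allocation {
    align: u64,
    /// `None` models an undefined byte.
    bytes: Vec<Option<u8>>,
    /// Marks allocations that were made up to stand in for a dangling id.
    fake: bool,
}

struct AllocMap {
    /// Allocations that actually got interned.
    live: HashMap<u32, Allocation>,
    /// Assumption: alignment is recorded per id and survives the allocation.
    aligns: HashMap<u32, u64>,
}

impl AllocMap {
    /// What codegen would call: the real allocation if there is one,
    /// otherwise a "fake" one with the right alignment, size 1, undefined content.
    fn get_for_codegen(&self, id: u32) -> Allocation {
        match self.live.get(&id) {
            Some(alloc) => alloc.clone(),
            None => Allocation {
                // Panics if no alignment was recorded for this id.
                align: self.aligns[&id],
                bytes: vec![None],
                fake: true,
            },
        }
    }

    /// What validation would ask: a dangling id is never dereferenceable,
    /// not even for size 0.
    fn is_dereferenceable(&self, id: u32, size: usize) -> bool {
        match self.live.get(&id) {
            Some(alloc) => size <= alloc.bytes.len(),
            None => false,
        }
    }
}
```

In this sketch validation never sees the fake allocation at all, since it only exists as the return value of `get_for_codegen`; that is the part that makes it look like a variant of (1).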
For (3), the question is what exactly to replace them with. The easiest thing to do is to turn a pointer allocN+x into the integer align_of(allocN)+x, which is at least a possible value for that pointer. @oli-obk expressed a preference for this option. However, I do not think this is a great idea. Consider the following program, which might actually run one day once we allow more raw pointer stuff (currently it shouldn't run even with Miri unleashed):
const NASTY: (bool, *const i32, *const i32) = {
    let x = 0; let y = 1;
    // Both pointers dangle once the block ends and the locals die.
    let x = &x as *const _; let y = &y as *const _;
    (x == y, x, y)
};
If we follow this variant of (3), the result will be (false, 4, 4). IOW, it looks as if CTFE concluded that 4 != 4... this is why I prefer (2).
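For illustration, here is a toy model of how that result would come about. It is not rustc code: `Pointer`, `dangling_to_int`, and `eval_nasty` are hypothetical names, and it assumes a future CTFE that compares pointers to different allocations as unequal.

```rust
// Toy model, not rustc code: all names here are hypothetical.

/// A CTFE-time pointer: allocation id plus byte offset.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Pointer {
    alloc_id: u32,
    offset: u64,
}

/// This variant of (3): replace a dangling pointer `allocN+x`
/// by the integer `align_of(allocN)+x`.
fn dangling_to_int(ptr: Pointer, align: u64) -> u64 {
    align + ptr.offset
}

/// Models evaluating `NASTY` and then interning its result.
fn eval_nasty() -> (bool, u64, u64) {
    // Two distinct `i32` locals (alignment 4), each in its own allocation.
    let x = Pointer { alloc_id: 0, offset: 0 };
    let y = Pointer { alloc_id: 1, offset: 0 };
    // During evaluation the pointers are still symbolic:
    // different allocations, so they compare unequal.
    let equal = x == y;
    // At interning time both locals are dead, so both pointers get rewritten.
    (equal, dangling_to_int(x, 4), dangling_to_int(y, 4))
}
```

Calling `eval_nasty()` in this model gives `(false, 4, 4)`: the comparison happened before the rewrite, so the final value records "not equal" next to two identical integers.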
Cc @rust-lang/wg-const-eval