RFC: Assume bounds for generic functions #3802

Open. DasLixou wants to merge 3 commits into base: master

Conversation

@DasLixou DasLixou commented Apr 21, 2025

This proposal adds support for #[unsafe(assume)]-ing conditions in where clauses to help with complex generic call stacks and hinting for higher-ranked bounds.

Rendered

@jieyouxu jieyouxu added the T-types Relevant to the types team, which will review and decide on the RFC. label Apr 21, 2025
@programmerjake
Member

this sounds like just adding late-checked bounds, which isn't necessarily unsafe since the compiler could in theory still do all bounds checking at monomorphization time (more or less exactly how C++ templates work), but it does lead to the almost totally unreadable error messages that C++ templates are infamous for.
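For readers unfamiliar with post-monomorphization checks, stable Rust already has one narrow form of them: an inline `const` block inside a generic function is evaluated per instantiation. The following is only a minimal sketch of that existing mechanism, not of the RFC's proposal; the function name `size_if_small` and the 8-byte limit are made up for illustration, and inline `const` blocks need Rust 1.79 or later.

```rust
use std::mem::size_of;

/// Returns the size of `T`, but only builds for types of at most 8 bytes.
/// (Hypothetical helper, used only to illustrate a late check.)
fn size_if_small<T>() -> usize {
    // An inline `const` block in a generic function is evaluated once per
    // instantiation, so this assertion behaves like a late-checked condition.
    const { assert!(size_of::<T>() <= 8, "T must be at most 8 bytes") };
    size_of::<T>()
}

fn main() {
    println!("{}", size_if_small::<u32>());
    // Uncommenting the next line makes the build fail, but only because this
    // particular instantiation violates the assertion:
    // size_if_small::<[u8; 64]>();
}
```

The error for a bad instantiation is reported at the use site's instantiation rather than at the function's signature, which is the readability cost mentioned above.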

@DasLixou
Author

> this sounds like just adding late-checked bounds, which isn't necessarily unsafe since the compiler could in theory still do all bounds checking at monomorphization time (more or less exactly how C++ templates work), but it does lead to the almost totally unreadable error messages that C++ templates are infamous for.

As mentioned in the RFC, it could help catch some errors, but not all, and it definitely shouldn't be used lightly, so unreadable errors aren't really that much of a problem.
The RFC also explains why it is unsafe: at monomorphization, the compiler isn't aware of lifetimes anymore, which means something like

fn test<T>()
where
    #[unsafe(assume)] T: Cool<'static>
{}

can't be proven or disproven anymore.
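A rough way to see the lifetime erasure in today's Rust is to compare what a generic function observes for a `&'static str` and for a shorter-lived `&str`. This is only an illustrative sketch: `name_of` is a hypothetical helper, and `std::any::type_name` is a diagnostic API whose exact output is not guaranteed, so the example only compares the two results with each other.

```rust
use std::any::type_name;

/// Hypothetical helper: reports the name of the type it was instantiated with.
fn name_of<T>(_val: &T) -> &'static str {
    type_name::<T>()
}

fn main() {
    let static_str: &'static str = "static";
    let owned = String::from("local");
    let local_str: &str = owned.as_str(); // borrow tied to `owned`, not 'static

    // Both calls go through the same instantiation of `name_of`: by the time
    // generic code is monomorphized, the two lifetimes are indistinguishable.
    println!("{}", name_of(&static_str));
    println!("{}", name_of(&local_str));
}
```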


`assume`d bounds are just skipped during bounds check and we trust the user.

Later, the compiler could assist with some wrong conditions: if, for example, I passed in something which doesn't implement `Debug`, the compiler could tell me post-monomorphization that this assumed trait bound is not fulfilled for that _specific_ type. But you shouldn't depend on this 100%: lifetimes, for example, aren't preserved up to that stage, so any lifetime-dependent condition is completely unchecked, which is what makes it `unsafe`.
Contributor

Is this just a lint, or might it become a hard error? (Hard errors could cause problems if people are using this feature in dead code.)

Author

Wait, is it a problem because it might error without being used, or because it is eliminated before the error is reported, so that there's no error?

Contributor

@Jules-Bertholet Jules-Bertholet Apr 22, 2025

> because it might error without being used

This one. E.g., code like this:

if check_that_its_really_debug() {
    unsafe { assume_t_implements_debug::<T>() };
}
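The same shape of code exists in today's Rust whenever a runtime check guards an `unsafe` operation that is only valid for some instantiations. The sketch below is an analogy rather than the RFC's feature: `describe` is a hypothetical function, and it uses a `TypeId` comparison where the snippet above uses `check_that_its_really_debug()`.

```rust
use std::any::TypeId;

/// Hypothetical example: formats the value if `T` happens to be `u32`.
fn describe<T: 'static>(val: &T) -> String {
    if TypeId::of::<T>() == TypeId::of::<u32>() {
        // SAFETY: the check above guarantees that `T` is exactly `u32`,
        // so reinterpreting the reference is sound in this branch.
        let n = unsafe { &*(val as *const T as *const u32) };
        format!("u32: {n}")
    } else {
        String::from("opaque")
    }
}

fn main() {
    println!("{}", describe(&7u32));
    println!("{}", describe(&"text"));
}
```

For every `T` other than `u32` the first branch is dead, yet it still has to compile. A hard post-monomorphization error on the guarded call would reject exactly this kind of code.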

Author

Ohhh, that's what you mean by dead code. Yeah, then a hint is probably better, if not something even weaker.

@programmerjake
Member

> It also explains why it is unsafe, because at monomorphization, the compiler isn't aware of lifetimes anymore, which means something like

yes, hence why I said theoretically, since, to do full checking, the compiler would have to be rearchitected to keep lifetimes around until monomorphization (which is very unlikely to happen).

@Jules-Bertholet
Contributor

> this sounds like just adding late-checked bounds, which isn't necessarily unsafe since the compiler could in theory still do all bounds checking at monomorphization time

It's not, because not having a post-mono check means you are not restricted in what you can do in dead code.

@kennytm
Member

kennytm commented Apr 22, 2025

this is already possible using (full) specialization, and I'd argue it is better to evaluate this under a feature with which we already have experience.

#![feature(specialization)]

use std::fmt::Display;

fn print<T: Display>(val: T) {
    println!("good! {val}");
}

fn less_restricted<T>(val: T) {
    trait PrintAssumed {
        fn print_assumed(self);
    }
    impl<X> PrintAssumed for X {
        default fn print_assumed(self) {
            unsafe extern "C" {
                #[link_name = "\n\n[ERROR] less_restricted() called without satisfying T: Display\n\n"]
                fn error();
            }
            unsafe {
                error();
            }
        }
    }
    impl<X: Display> PrintAssumed for X {
        fn print_assumed(self) {
            print(self)
        }
    }

    PrintAssumed::print_assumed(val)
}

fn main() {
    print(5);
    less_restricted(5);
    print("hello");
    less_restricted("hello");
    // print(Some("not display")); // compile error
    // less_restricted(Some("not display")); // linker error

    struct OnlyDisplayIfStatic<'a>(&'a str);
    impl Display for OnlyDisplayIfStatic<'static> {
        fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
            write!(f, "static: {}", self.0)
        }
    }

    print(OnlyDisplayIfStatic("static"));
    less_restricted(OnlyDisplayIfStatic("static"));
    let bad = "bad".to_string();
    // print(OnlyDisplayIfStatic(&bad)); // compile error
    less_restricted(OnlyDisplayIfStatic(&bad)); // pass, UB.
}

@DasLixou
Author

DasLixou commented Apr 22, 2025

Interesting idea, but especially with your 'static example below you imply that specialization will get lifetime support, somehow.. and I don't think there are that many people who want to make this with the current compiler (if I'm up to date with the debate)

Edit: oh yeah sorry you even wrote UB there, skipped that

@kennytm
Member

kennytm commented Apr 22, 2025

> Interesting idea, but especially with your 'static example below you imply that specialization will get lifetime support, somehow.. and I don't think there are that many people who want to make this with the current compiler (if I'm up to date with the debate)

It is no different from this RFC itself, under which you can't prevent anyone from calling print_assumed(OnlyDisplayIfStatic(&bad)). Isn't this also the rationale for why the attribute is unsafe:

  • ..., as for example lifetimes aren't preserved up to that [post-monomorphization] stage, so any lifetime-dependent condition is completely unchecked, thus making it unsafe.

So an #[unsafe(assume)] bound should unconditionally force the function print_assumed to be declared unsafe fn. But a proc-macro generating the specialization above can put the same annotation on the generated unsafe fn PrintAssumed::print_assumed.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When implementing a function with a `where` clause, like this one:
Member

what about non-functions

trait Foo<U> where #[unsafe(assume)] U: Iterator {
    type Item where #[unsafe(assume)] <U as Iterator>::Item: Into<u16>;
}

impl<U> Foo<U> for U where #[unsafe(assume)] U: Future {
    type Item = U::Output;
}

Author

:0 interesting idea, should probably also work on those..

Contributor

That seems like it would be much more difficult to implement/develop coherent semantics for.

@DasLixou
Author

@kennytm yeah, it isn't very different from the RFC, but the problem with specialization is that the UB there doesn't come from something marked unsafe; it comes from the feature itself being unsound. So how this works out relies solely on how specialization progresses.

@clarfonthey

I'm a bit confused on the benefit of these bounds at all. Like, in what circumstances would it be useful to have them either over a regular bound, or no bound at all?

@DasLixou
Author

> I'm a bit confused on the benefit of these bounds at all. Like, in what circumstances would it be useful to have them either over a regular bound, or no bound at all?

Only on rare occasions, like where the compiler can't verify it itself because of too many indirections or when you really don't want to go through 30 layers of generic functions

@Jules-Bertholet
Contributor

> > I'm a bit confused on the benefit of these bounds at all. Like, in what circumstances would it be useful to have them either over a regular bound, or no bound at all?
>
> Only on rare occasions, like where the compiler can't verify it itself because of too many indirections or when you really don't want to go through 30 layers of generic functions

It would be nice to see a concrete, real-world example.

@Noratrieb
Member

This is a feature with very major impact on the type system of the Rust language, and such features are not added lightly. The RFC is very short, containing only a short motivation with very few details. With this, it's hard to extract what exactly the problem is that you're having, and what other solutions there can be.
Then, about these other solutions: the RFC makes no attempt to think of other solutions to this problem at all. Especially for a large change to the type system, there is a very high chance that many other approaches to this problem exist, and they should all be laid out explicitly and evaluated against each other to determine the best solution. It seems very unlikely to me that this approach here would be the winner.

One major feature that relates to this is "implied bounds", RFC Tracking Issue. It's not guaranteed that this will ever land either, but if you do want to solve your problem, that direction seems a lot more promising, so it's probably better to invest your time there instead of pursuing this RFC, which is likely a dead-end (I am not on a relevant team to make the final call about this, but I can't imagine a world in which this is accepted as-is today).

This is an area with many hidden complexities, so working on it will not yield immediate returns as things like this take time, but if you want to work on this, I really recommend looking into alternative approaches like implied bounds, or entirely different directions you may come up with.

@Jules-Bertholet
Contributor

One possible use case for this is expressing bounds that the compiler can’t understand yet. For example:

unsafe fn foo<T>(param: T)
where
    // We actually only need `T: for<'a, 'b: 'a> Trait<'a, 'b>`,
    // but rustc can’t understand that atm
    #[unsafe(assume)] T: for<'a, 'b> Trait<'a, 'b>
{
    ... 
}
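For comparison, here is a minimal sketch of a higher-ranked bound that rustc does understand today, with a single quantified lifetime and no relation between lifetimes. The names `map_words` and `pick` are invented for this illustration and do not come from the RFC.

```rust
/// Applies `pick` to every input and collects the selected pieces.
/// (Hypothetical helper, for illustration only.)
fn map_words<F>(inputs: &[String], pick: F) -> Vec<String>
where
    // Higher-ranked bound: `pick` must work for *every* lifetime `'a`,
    // returning a slice that borrows from whatever it was given.
    F: for<'a> Fn(&'a str) -> &'a str,
{
    inputs
        .iter()
        .map(|s| pick(s.as_str()).to_owned())
        .collect()
}

fn main() {
    let inputs = vec![String::from("hello world"), String::from("  padded ")];
    // The closure returns a sub-slice of its argument, so it satisfies the bound.
    let firsts = map_words(&inputs, |s| s.split_whitespace().next().unwrap_or(""));
    println!("{firsts:?}");
}
```

The bound in the comment above, `for<'a, 'b: 'a>`, adds a constraint between the quantified lifetimes, which is the part that cannot be written directly at the moment.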

```rs
pub fn print<T>(val: T)
where
    #[unsafe(assume)] T: Debug
```
Contributor

Instead of making the attribute unsafe, would it not make more sense to require the function it is applied to to be unsafe?

Author

Well, not really. The user already vouches for safety by writing unsafe in the attribute. If we also allow this for bounds in e.g. struct definitions, there is no other way of marking those as unsafe. Also, not every function using it must be unsafe: e.g. a type_id_of_static, which gets the TypeId the type would have if it were 'static, would be completely safe.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

`assume`d bounds are just skipped during bounds check and we trust the user.
Contributor

Does this work?

unsafe fn foo<T>(param: T) -> impl Debug 
where
    #[unsafe(assume)] T: Debug,
{
    param
}

Author

Yes, it should. Is this something I should explicitly provide as an example?

@ahicks92

I think that whether or not the attribute should be unsafe is less important than first defining what unsafe means here.

In practice if the point is complex bounds--especially if the point is complex higher-ranked bounds for lifetimes--there's no way you'd be able to spot whether it's safe by reading the code. "you, the human, are now the compiler" isn't a good idea (source: I know C++ and was the human compiler checking lifetimes there...).

For things which aren't higher-ranked you can usually get named centralized bounds:

trait NamedBound where Self: Bound {}
impl<T: ?Sized> NamedBound for T where T: Bound {}

// Use it
fn example<T: NamedBound>(val: T) {
    // code.
}

Which has the added benefit of not requiring you to repeat yourself and also plays nice with feature flags. It does work with lifetimes too. It may work to some limited extent with HRTBs but I've never been crazy enough to try. It can even go in the public API of your crate. For the non-HRTB cases this works and prevents marching up and down the call graph if you need to change a bound.
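A self-contained version of the named-bound pattern might look like the sketch below. The trait name `Loggable`, the choice of `Debug + Clone` as the underlying bounds, and the function `log_twice` are all assumptions made for the example.

```rust
use std::fmt::Debug;

// One named bound that stands for a set of underlying bounds.
trait Loggable: Debug + Clone {}

// Blanket impl: every type meeting the underlying bounds gets the named one.
impl<T: Debug + Clone> Loggable for T {}

// Callers only mention `Loggable`; changing the set of underlying bounds
// later means editing the two lines above, not every signature.
fn log_twice<T: Loggable>(val: T) -> String {
    let copy = val.clone();
    format!("{val:?} {copy:?}")
}

fn main() {
    println!("{}", log_twice(5));
    println!("{}", log_twice("a"));
}
```

Because the bounds are supertraits of `Loggable`, code that is generic over `T: Loggable` can use `Debug` and `Clone` methods without restating them.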

@DasLixou
Author

> This is a feature with very major impact on the type system of the Rust language, and such features are not added lightly. The RFC is very short, containing only a short motivation with very few details. With this, it's hard to extract what exactly the problem is that you're having, and what other solutions there can be. Then, about these other solutions: the RFC makes no attempt to think of other solutions to this problem at all. Especially for a large change to the type system, there is a very high chance that many other approaches to this problem exist, and they should all be laid out explicitly and evaluated against each other to determine the best solution. It seems very unlikely to me that this approach here would be the winner.
>
> One major feature that relates to this is "implied bounds", RFC Tracking Issue. It's not guaranteed that this will ever land either, but if you do want to solve your problem, that direction seems a lot more promising, so it's probably better to invest your time there instead of pursuing this RFC, which is likely a dead-end (I am not on a relevant team to make the final call about this, but I can't imagine a world in which this is accepted as-is today).
>
> This is an area with many hidden complexities, so working on it will not yield immediate returns as things like this take time, but if you want to work on this, I really recommend looking into alternative approaches like implied bounds, or entirely different directions you may come up with.

Implied bounds do look interesting, and I might be able to bend them to my use case.

As for the real-world example, I want the user to have many nested functions with a signature looking something like `fn _<T>() -> impl FnOnce(T)`, and then there might be an item `impl Key` which can be created from that `T` and assures that `T` has a special trait bound. So in short, for e.g. 20 nested functions it means either expanding each one to `fn _<K: Key, T>(k: K) -> impl FnOnce(T) where T: Has<K::Provided>` or just passing down an `impl Key`.
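To make the "expanded" variant concrete, here is a rough two-layer sketch of how the bound has to be repeated on every nested function. Everything about the trait shapes is an assumption: the thread only names `Key`, `Has` and `K::Provided`, so the `get` method, the closure's return value, and the types `CountKey` and `State` are invented to keep the example runnable.

```rust
// Assumed trait shapes; the thread does not define them.
trait Key {
    type Provided;
}
trait Has<P> {
    fn get(&self) -> P;
}

// Hypothetical key and state types for the demonstration.
struct CountKey;
impl Key for CountKey {
    type Provided = u32;
}
struct State {
    count: u32,
}
impl Has<u32> for State {
    fn get(&self) -> u32 {
        self.count
    }
}

// Innermost layer: actually needs `T: Has<K::Provided>`.
fn inner<K: Key, T>(_k: K) -> impl FnOnce(T) -> K::Provided
where
    T: Has<K::Provided>,
{
    |t: T| t.get()
}

// Every outer layer must repeat the same bound just to forward the call.
fn outer<K: Key, T>(k: K) -> impl FnOnce(T) -> K::Provided
where
    T: Has<K::Provided>,
{
    inner::<K, T>(k)
}

fn main() {
    let run = outer::<CountKey, State>(CountKey);
    println!("{}", run(State { count: 3 }));
}
```

With 20 layers instead of two, the `where` clause on `outer` would be duplicated on each of them, which is the repetition the comment describes.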

8 participants