Decoupling new type mappings from the cxx crate #251

anforowicz opened this issue Aug 10, 2020 · 7 comments

Comments

@anforowicz
Contributor

Hello,

Thank you for all the work on the cxx crate. I apologize that this issue is neither very specific nor very constructive, but I was wondering what it would take to decouple individual type mappings (e.g. std::vector<T> to CxxVector<T>, std::map<K, V> to TBD<K, V>, etc.) from the cxx crate itself. In particular, some projects have their own container implementations (e.g. Boost's flat_map, Chromium's small_map, or WebKit's WTF::Vector) or string implementations (e.g. WebKit's 16-bit String). It would be great if such projects could extend cxx with the extra mappings without having to modify the cxx crate itself. AFAIU cxx already supports translating std::string and std::vector and plans to support std::map and std::unordered_map, but it would probably prefer not to have knowledge of Boost's, Chromium's, or WebKit's types.

Could you please offer any feedback / thoughts on the above?

  • It is unclear to me how one would declare that a given C++ type should be recognized and wrapped in a specific way in Rust. Specifying the C++ type via its namespace-qualified name is definitely one possible way, but maybe some form of duck-typing is also desirable (handwaving: if a C++ type has begin and end methods, then it is wrapped as std::iter::Iterator or as IntoIterator; if it quacks like a drop-in replacement for std::map, then it is wrapped as CxxMap; etc);

  • It is unclear to me exactly which C++ => Rust type mappings would need to remain in the core cxx (even if they don't need to be in cxx and could be decoupled for design/hygiene reasons, we might still want to keep all the std library types close to cxx, to promote one canonical std => Rust type mapping).

    • I guess UniquePtr would remain in the core.
    • I am not sure about CxxString and CxxVector.
  • It is unclear to me how extensible type mappings interact with the safety guarantees of cxx and Rust.

/cc @adetaylor

@adetaylor
Collaborator

I was thinking that one day, syntax::Type might need to change from an enum to a trait, such that new type support can be added by extra crates.
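
To make the enum-versus-trait idea concrete, here is a minimal sketch. It is not cxx's actual syntax::Type; the names `BuiltinType`, `TypeMapping`, `FlatMapMapping`, and `Registry` are all hypothetical and exist only to contrast a closed enum with a trait that outside crates could implement.

```rust
use std::collections::HashMap;

// Closed set: adding a mapping means editing this enum (and every match on it)
// inside the crate that owns it.
#[allow(dead_code)]
enum BuiltinType {
    CxxString,
    CxxVector,
}

// Open set: a hypothetical trait that an external crate could implement.
trait TypeMapping {
    /// Namespace-qualified C++ name of the mapped type.
    fn cxx_name(&self) -> &'static str;
    /// Name of the Rust wrapper type a generator would emit.
    fn rust_name(&self) -> &'static str;
}

// Would live in a separate (hypothetical) extension crate.
struct FlatMapMapping;

impl TypeMapping for FlatMapMapping {
    fn cxx_name(&self) -> &'static str {
        "boost::container::flat_map"
    }
    fn rust_name(&self) -> &'static str {
        "BoostFlatMap"
    }
}

/// Lookup table a code generator could consult instead of matching on an enum.
#[derive(Default)]
struct Registry {
    by_cxx_name: HashMap<&'static str, Box<dyn TypeMapping>>,
}

impl Registry {
    fn register(&mut self, mapping: Box<dyn TypeMapping>) {
        self.by_cxx_name.insert(mapping.cxx_name(), mapping);
    }

    fn rust_name_for(&self, cxx_name: &str) -> Option<&'static str> {
        self.by_cxx_name.get(cxx_name).map(|m| m.rust_name())
    }
}
```

With this shape, registering `FlatMapMapping` makes `rust_name_for("boost::container::flat_map")` resolve without touching the core crate, while unregistered names simply return `None`.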

@dtolnay
Owner

dtolnay commented Aug 10, 2020

We support something like this already, which would work for WTF::String. My codebase uses it for Folly strings. https://docs.rs/cxx/0.3.4/cxx/trait.ExternType.html

unsafe impl ExternType for WtfString {
    type Id = type_id!("WTF::String");
}

It doesn't handle generic types at the moment but I am hopeful that it would be possible to make that work and be equivalently easy to use.

unsafe impl<K: ??, V: ??> ExternType for BoostFlatMap<K, V> {
    type Id = type_id!("boost::container::flat_map<K, V>");
}

It also doesn't handle types without a compatible Rust definition yet, i.e. some kind of bindgen-based definition of the type in Rust is required. We'll want to make some alternative approach available there.

#[repr(transparent)]
pub struct WtfString(cxx::OpaqueType);

unsafe impl ExternType for WtfString {
    type Id = type_id!("WTF::String");
    const OPAQUE: bool = true;
}

I think the above would be able to cover all the types you named. Roughly, those are types for which the goal is just shuffling values or references from language A to B. There is another tier of types that require some deeper integration with the code generator. So far the only ones I know of are Result-style types (outcome<T>, leaf::result<T>, StatusOr<T>; see #16), so maybe all we need to do is support that as a category. Otherwise we'd need to look into plugging logic into the code generator -- #216 was one attempt toward defining a trait-based API that abstracts the code generator's interaction with individual types, which could then be exposed for implementations outside the cxx crate, but sadly I haven't been able to find time to iterate on that design so far.

@dtolnay
Owner

dtolnay commented Aug 10, 2020

To respond to your questions more specifically:

  • It is unclear to me how one would declare that a given C++ type should be recognized and wrapped in a specific way in Rust.

    For the "just shuffle a value or reference from one language to the other" category of types I think all we need is the namespace-qualified name and whether to restrict holding an instance of the type by value in Rust. The ExternType trait can support these; see previous comment.

  • It is unclear to me exactly which C++ => Rust type mappings would need to remain in the core cxx.

    I agree that there is value in decoupling std types that don't strictly need to be built into cxx. Largely, it would help ensure that our API for integrating types is sufficiently flexible/powerful. I think it would take some more experimentation and design work to make a call on this, but it could really go either way -- none of unique_ptr or string or vector being built in, or all of them remaining built in.

  • It is unclear to me how extensible type mappings interact with the safety guarantees of cxx and Rust.

    Yes, this is a big concern. The escape hatch we have is unsafe impls, as you see in the previous comment. Without an unsafe impl block it wouldn't be correct to let the programmer claim that some Rust type is compatible with a particular namespace-qualified type in C++.
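
The role of the unsafe impl can be illustrated with a self-contained stand-in. This sketch does not use the cxx crate: `ExternTypeLike`, `Point`, `geo::Point`, and `describe` are hypothetical names chosen for illustration, and the trait only imitates the idea of tying a Rust type to a namespace-qualified C++ name.

```rust
use std::mem::size_of;

/// Stand-in for an ExternType-like trait (NOT the cxx API).
///
/// # Safety
/// The implementer promises that `Self` is layout-compatible with the C++
/// type named by `CXX_NAME`. The compiler cannot check this promise.
unsafe trait ExternTypeLike {
    const CXX_NAME: &'static str;
}

/// Hypothetical Rust mirror of a C++ struct `geo::Point { int32_t x, y; }`.
#[allow(dead_code)]
#[repr(C)]
struct Point {
    x: i32,
    y: i32,
}

// The `unsafe` keyword marks the exact spot where the programmer, not the
// compiler, vouches for the correspondence between the two definitions.
unsafe impl ExternTypeLike for Point {
    const CXX_NAME: &'static str = "geo::Point";
}

/// Generic code can rely on that promise without being `unsafe` itself.
fn describe<T: ExternTypeLike>() -> String {
    format!("{} ({} bytes)", T::CXX_NAME, size_of::<T>())
}
```

Here `describe::<Point>()` yields "geo::Point (8 bytes)". If the C++ definition changed and the Rust mirror did not, nothing in safe code would notice, which is why the claim has to sit behind `unsafe impl`.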

@vlovich

vlovich commented Nov 28, 2020

Is it possible today to provide a UniquePtr-like implementation of a custom smart pointer (e.g. for boost::scoped_ptr)?

I understand you mentioned that template support is needed (is there an issue tracking that?), but is there anything else beyond that required to properly communicate resource ownership (i.e. to implement the Clone trait to do the right thing if relevant for that type)?

@dtolnay
Owner

dtolnay commented Nov 28, 2020

@vlovich that should be possible already with the feature set of cxx 1.0. See #524 for runnable example code.

Generic ExternType impls are nice-to-have and on the roadmap but not a blocker for this use case, since you can work around relatively simply by naming individual instantiations:

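// C++: give the concrete instantiation its own name
using ScopedClass = boost::scoped_ptr<Class>;

// Rust (in the cxx bridge): refer to the instantiation by that name
type ScopedClass = crate::ScopedPtr<Class>;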
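
The workaround of naming individual instantiations can be sketched without the cxx crate. Everything here is a stand-in: `ExternTypeLike` imitates an ExternType-style trait, `ScopedPtr<T>` is a hypothetical Rust wrapper, and `Other` / `ScopedOther` are made-up names added to show a second instantiation.

```rust
use std::marker::PhantomData;

/// Stand-in for an ExternType-like trait (NOT the cxx API).
///
/// # Safety
/// `Self` must be layout-compatible with the C++ type named by `CXX_NAME`.
unsafe trait ExternTypeLike {
    const CXX_NAME: &'static str;
}

/// Hypothetical generic wrapper standing in for boost::scoped_ptr<T>.
#[allow(dead_code)]
#[repr(transparent)]
struct ScopedPtr<T> {
    raw: *mut T,
    _owns: PhantomData<T>,
}

#[allow(dead_code)]
struct Class;
#[allow(dead_code)]
struct Other;

// One alias per instantiation actually used, mirroring the C++ `using` lines.
type ScopedClass = ScopedPtr<Class>;
type ScopedOther = ScopedPtr<Other>;

// No generic impl is needed: each concrete instantiation gets its own impl,
// keyed by the non-template name that C++ gave it.
unsafe impl ExternTypeLike for ScopedClass {
    const CXX_NAME: &'static str = "ScopedClass";
}
unsafe impl ExternTypeLike for ScopedOther {
    const CXX_NAME: &'static str = "ScopedOther";
}

fn cxx_name_of<T: ExternTypeLike>() -> &'static str {
    T::CXX_NAME
}
```

The cost is one alias and one impl per instantiation, which is the boilerplate a generic impl would eventually remove.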

@vlovich

vlovich commented Nov 30, 2020

It's not clear to me that's sufficient for the use case I'm thinking of. Specifically, I'm thinking of using it to wrap the Cap'n Proto RPC C++ library rather than trying to reimplement it in Rust (while the capnp-rpc reimplementation is neat, it's a very incomplete reimplementation).

So the goal would be to codegen the C++ interfaces for use with the underlying library, but to come up with an efficient way to access the C++ library idiomatically and safely. Most things there deal with a custom kj::Own<T> structure that's like std::unique_ptr but lacks a release method (also, the underlying object stored in kj::Own isn't necessarily allocated on the heap, but usually is). The T here is going to be an instance of the underlying Cap'n Proto interface (i.e. build-time generated code). It would be unfortunate to have to retype all the types that are part of a schema just to do this.

It would seem that in this situation one would be expected to move kj::Own onto the heap, so accesses from Rust would typically involve an extra indirection. Is that correct? Or is there a simpler way to map this stuff?
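
For readers unfamiliar with the shape being described, here is a rough Rust-only model of an owning pointer that carries its own disposer and offers no release method. It is an illustration under assumptions, not kj's API and not a cxx binding: `Own`, `from_raw`, `dispose_boxed`, and `own_boxed` are hypothetical names, and the sketch does not settle the indirection question raised above.

```rust
use std::ops::Deref;
use std::ptr::NonNull;

/// Hypothetical model of an owning pointer with a custom disposer.
/// There is deliberately no `release`: ownership ends only in `Drop`.
struct Own<T> {
    ptr: NonNull<T>,
    dispose: unsafe fn(NonNull<T>),
}

impl<T> Own<T> {
    /// # Safety
    /// `ptr` must stay valid until `dispose` runs, and `dispose` must be the
    /// correct way to destroy the pointee (heap-allocated or not).
    unsafe fn from_raw(ptr: NonNull<T>, dispose: unsafe fn(NonNull<T>)) -> Self {
        Own { ptr, dispose }
    }
}

impl<T> Deref for Own<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // Sound as long as the `from_raw` contract was upheld.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for Own<T> {
    fn drop(&mut self) {
        // Hand the pointee back to whoever knows how to destroy it.
        unsafe { (self.dispose)(self.ptr) }
    }
}

/// Disposer for pointees that were created with `Box`.
unsafe fn dispose_boxed<T>(ptr: NonNull<T>) {
    unsafe { drop(Box::from_raw(ptr.as_ptr())) }
}

/// Convenience constructor for the heap-allocated case.
fn own_boxed<T>(value: T) -> Own<T> {
    let ptr = NonNull::from(Box::leak(Box::new(value)));
    unsafe { Own::from_raw(ptr, dispose_boxed::<T>) }
}
```

Because the disposer is stored next to the pointer, the same handle type can cover pointees that are not heap-allocated by supplying a different disposer.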

@daira

daira commented Aug 6, 2022

"It doesn't handle generic types at the moment but I am hopeful that it would be possible to make that work and be equivalently easy to use." (quoting @dtolnay)

For Zcash we really really need, in the relatively near term, to have good interoperation between some Result-style type on the C++ side (e.g. tl::expected) and Rust's Result<T, E>, that does not rely on mapping to exceptions. It needs to support arbitrary T and E (well, maybe not completely arbitrary). What's the best path to getting this working now? (I'm aware of #16.)

Our issue for this is zcash/zcash#4816 (comment)

Edit: oh I guess we could name every instantiation with different T, E that we actually use? I'll have to look at how much boilerplate that adds in practice. We might try doing that for now with a view to migrating it when generics are supported (since we won't be exposing these type aliases in any public-facing APIs).

5 participants