-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Proposed TcpStream::open and TcpStream::open_timeout #13919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposed TcpStream::open and TcpStream::open_timeout #13919
Conversation
I've avoided adding this to the standard library because I think that doing it may require a significant amount of code. I think the implementation here is fine, it's just not what I would quite expect. I would expect this to be a pure convenience method which only takes a string argument, which is the address to open up (or bind to). Taking a string and a number for a port I'm not sure is getting us much closer to the goal. Additionally, the semantics around DNS resolution are a little tricky here, especially in the timeout case. The timeout being supplied is not an overall timeout, but rather only for one connect operation, not the DNS lookup, nor the whole batch of connects (if there are many). It's close to the best that we can do, but I think that filling this out more will require a good deal more code (just guessing, though). All in all, I've avoided doing this because I think doing it as taking one string argument and handling weird edge cases always seemed like a good chunk of code that probably didn't belong in the standard library itself. That being said, this is incredibly useful, I hate having to type out
|
IMO (and as you later observe) the most annoying part is setting up the SocketAddr jazz. I find the string-only API a little surprising. With the exception of Go, most mainstream languages I've seen tend to separate host+port for this sort of convenience method: Ruby, Python (the host+port is a tuple here), [Java](http://docs.oracle.com/javase/7/docs/api/java/net/Socket.html#Socket%28java.lang.String, int%29), C# Not that we need to blindly follow that convention, I guess I just don't see the gap between the convention-friendly Happy to move things to liburl if that seems more appropriate -- I don't feel that strongly about it -- but given we're talking about potentially moving it out of libstd to satisfy the proposed interface I thought I might offer up this alternative. Anyway, let me know if you're still set on the single string arg for open/bind & a move to liburl -- easy enough either way. I assume you don't want to move the entirety of TcpStream into liburl -- imagine something like
I totally agree with all this. We could subtract the DNS lookup time from the remaining connect timeout, but then if the first connect fails & we fall back to a second A record we're left with ambiguity wrt what the "correct" behavior should be. Options for dealing with this in other languages vary -- Ruby offers no timeout support out of the box & seems to rely on things like The more I think about it, the more I'd be just as happy to ditch the open_timeout implementation for now & maybe look at our options there another time. |
You have convinced me, a string + port doesn't sound that bad at all! Perhaps in that case this could live in libstd for now, seeing how the implementation is small enough. I think it would be cool to have something like Could you add some tests for opening a stream with hosts that look like I also agree with removing |
Glad you came around :) Assuming there are no problems with the Travis CI builds, I think this is what Any thoughts on what the |
Guess we could rename the more-difficult-to-use e.g.
|
/// `host` can be a hostname or IP address string. | ||
#[experimental = "this function may eventually move to another module"] | ||
pub fn open(host: &str, port: u16) -> IoResult<TcpStream> { | ||
match get_host_addresses(host) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be better expresses (less indentation) via:
let addreses = try!(get_host_addresses(host));
How about |
No worse than
Instead, maybe something like this?
Though then you can't get at TcpListener::socket_name(), which is a problem if you pass in zero for the port number ... hrm. |
Hm that is a good point, One route we could go is to add Thinking out loud here for a second, perhaps we could only provide support for |
Oh to have overloading eh? :) It'd be I can kind of imagine the struct All that said, I feel like Another random idea: |
Ugh, then |
Huh, I actually don't mind the
Imagine that would break tests everywhere, but y'know :) |
I think that the one downside of the |
Sure. So to be clear, I should proceed with replacing the |
@thomaslee, I discussed a bit with some other folks, and we're confident this is the best way to go! As part of this change, can you make sure the commit message outlines the changes made as well? See https://mail.mozilla.org/pipermail/rust-dev/2014-April/009543.html for more info. |
Sweet, see you on the other side. Shall I squash the commits from earlier in this PR, or do you want the history? |
Squashing is ok, the comments should have enough history. Thanks again, and let me know if you need any help! |
Making good progress, but what should I do with |
Let's leave it taking |
Ack, looks like a rebase got messed up. See if I can address that ... |
@alexcrichton here you go, let me know how you feel about this. There's similar stuff in udp:: that we should probably look at too, but maybe that's a separate PR. |
/// `host` can be a hostname or IP address string. If no error is | ||
/// encountered, then `Ok(stream)` is returned. | ||
pub fn connect(host: &str, port: u16) -> IoResult<TcpStream> { | ||
let addresses = try!(get_host_addresses(host)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to avoid the syscall, could this first try from_str
into an IpAddr
? If that succeeds, then we know we don't need to go call getaddrinfo
.
This looks fantastic, thanks @thomaslee! I agree that the same treatment needs to happen with the UDP bindings, but if you're running low on time, I can take care of that as well! |
Oh I'm happy to do it unless you're particularly keen to jump on it. Standard libs + compiler is my best avenue for learning the language a little more thoroughly. :) Thanks for the comments, I'll get that sorted out. |
@alexcrichton here you go |
@alexcrichton curious -- looks like src/test/run-pass/tcp-stress.rs is failing on mac-64. Any reason why this would run on mac-64 but not on linux-64? I'm not seeing any failures locally, though I think I can see why it would fail if it did try to build & run that test. I'll see if I can fix that, but I imagine bors will need another nudge. |
Could you rebase the fix into the original commit as well? It looks like it'll fix the problem, though! |
@alexcrichton done -- let's see if this one does any better :) |
…en, r=alexcrichton Been meaning to try my hand at something like this for a while, and noticed something similar mentioned as part of #13537. The suggestion on the original ticket is to use `TcpStream::open(&str)` to pass in a host + port string, but seems a little cleaner to pass in host and port separately -- so a signature like `TcpStream::open(&str, u16)`. Also means we can use std::io::net::addrinfo directly instead of using e.g. liburl to parse the host+port pair from a string. One outstanding issue in this PR that I'm not entirely sure how to address: in open_timeout, the timeout_ms will apply for every A record we find associated with a hostname -- probably not the intended behavior, but I didn't want to waste my time on elaborate alternatives until the general idea was a-OKed. :) Anyway, perhaps there are other reasons for us to prefer the original proposed syntax, but thought I'd get some thoughts on this. Maybe there are some solid reasons to prefer using liburl to do this stuff.
Let's see if it was a fluke. |
I suspect that |
Aha! It looks like the definition of diff --git a/src/liblibc/lib.rs b/src/liblibc/lib.rs
index e2b647f..446f753 100644
--- a/src/liblibc/lib.rs
+++ b/src/liblibc/lib.rs
@@ -431,8 +431,17 @@ pub mod types {
pub ai_socktype: c_int,
pub ai_protocol: c_int,
pub ai_addrlen: socklen_t,
+
+ #[cfg(target_os = "linux")]
pub ai_addr: *sockaddr,
+ #[cfg(target_os = "linux")]
+ pub ai_canonname: *c_char,
+
+ #[cfg(target_os = "android")]
pub ai_canonname: *c_char,
+ #[cfg(target_os = "android")]
+ pub ai_addr: *sockaddr,
+
pub ai_next: *addrinfo,
}
pub struct sockaddr_un { |
Prior to this commit, TcpStream::connect and TcpListener::bind took a single SocketAddr argument. This worked well enough, but the API felt a little too "low level" for most simple use cases. A great example is connecting to rust-lang.org on port 80. Rust users would need to: 1. resolve the IP address of rust-lang.org using io::net::addrinfo::get_host_addresses. 2. check for errors 3. if all went well, use the returned IP address and the port number to construct a SocketAddr 4. pass this SocketAddr to TcpStream::connect. I'm modifying the type signature of TcpStream::connect and TcpListener::bind so that the API is a little easier to use. TcpStream::connect now accepts two arguments: a string describing the host/IP of the host we wish to connect to, and a u16 representing the remote port number. Similarly, TcpListener::bind has been modified to take two arguments: a string describing the local interface address (e.g. "0.0.0.0" or "127.0.0.1") and a u16 port number. Here's how to port your Rust code to use the new TcpStream::connect API: // old ::connect API let addr = SocketAddr{ip: Ipv4Addr{127, 0, 0, 1}, port: 8080}; let stream = TcpStream::connect(addr).unwrap() // new ::connect API (minimal change) let addr = SocketAddr{ip: Ipv4Addr{127, 0, 0, 1}, port: 8080}; let stream = TcpStream::connect(addr.ip.to_str(), addr.port()).unwrap() // new ::connect API (more compact) let stream = TcpStream::connect("127.0.0.1", 8080).unwrap() // new ::connect API (hostname) let stream = TcpStream::connect("rust-lang.org", 80) Similarly, for TcpListener::bind: // old ::bind API let addr = SocketAddr{ip: Ipv4Addr{0, 0, 0, 0}, port: 8080}; let mut acceptor = TcpListener::bind(addr).listen(); // new ::bind API (minimal change) let addr = SocketAddr{ip: Ipv4Addr{0, 0, 0, 0}, port: 8080}; let mut acceptor = TcpListener::bind(addr.ip.to_str(), addr.port()).listen() // new ::bind API (more compact) let mut acceptor = TcpListener::bind("0.0.0.0", 8080).listen() [breaking-change]
Fall back to get_host_addresses to try a DNS lookup if we can't parse it as an IP address.
@alexcrichton I assume you intend for me to apply that patch to this PR? Rebased from master & done in 218d01e. |
…en, r=alexcrichton Been meaning to try my hand at something like this for a while, and noticed something similar mentioned as part of #13537. The suggestion on the original ticket is to use `TcpStream::open(&str)` to pass in a host + port string, but seems a little cleaner to pass in host and port separately -- so a signature like `TcpStream::open(&str, u16)`. Also means we can use std::io::net::addrinfo directly instead of using e.g. liburl to parse the host+port pair from a string. One outstanding issue in this PR that I'm not entirely sure how to address: in open_timeout, the timeout_ms will apply for every A record we find associated with a hostname -- probably not the intended behavior, but I didn't want to waste my time on elaborate alternatives until the general idea was a-OKed. :) Anyway, perhaps there are other reasons for us to prefer the original proposed syntax, but thought I'd get some thoughts on this. Maybe there are some solid reasons to prefer using liburl to do this stuff.
A bit late to the party, but this change seems pretty bad to me. The addr version was removed entirely, and now you have to keep slinging strings around, which has much more cognitive overload with lifetimes than a simple copyable SocketAddr. How often do you really connect to a hardcoded remote endpoint that optimizing the string-directly-in-the-call case is important? (To be clear, I'm mostly advocating that the version that takes SockAddr be brought back, probably as |
When a function is taking It's not that connecting to a hardcoded endpoint is uncommon, I found it much easier to deal with the string representations of ips rather than the |
I also feel strong about keeping I understand that strings are convenient to use (especially with user interactions) and I like the idea of a method that parses a string, does dns lookups and connects to one address - but imo this method shouldn't hide the real connect method using a native type. E.g. in a lib I'm currently working on, there's some p2p stuff that keeps list of peers, exchanges them, connects them and so on. After this PR, I changed 9 I would very welcome if Rust would keep methods using
|
I agree that this isn't necessarily perfect, but I believe this situation to be the best of both worlds. Duplication in the API is unfortunate because you'll constantly forget whether to call open/connect or bind/open with what you have, and it also just leads to a general explosion of related methods. I generally find that reducing the API is one of the most important concerns. The other downside you pointed out, allocating strings, I don't consider that large of a problem. When compared with opening a TCP socket or binding a TCP socket, I don't think that the allocation will show up that high in the profiles. |
Could we use a trait implemented for (I have little context for this, so this may be a complete nonsense suggestion.) |
A trait for overloading is a possible route to take, but it's unclear how much better in terms of documentation that would be (a one-off trait for one method with unknown implementors when you first see it). I would, however, prefer overloading via a trait to method duplication. |
Documentation would be at a single place, no method/functionality duplication. The generic parameter of |
Happy to make another pass using traits if nobody else is running with it. Does seem like the best of both worlds. |
I'd tend to agree that either the APIs should be explicit or they should take in various inputs (if technically feasible with that ToSocketAddr trait). I started working on a universal open method a while back though it didn't get very far due to lack of time, but you can see it at https://gist.github.com/Seldaek/73aefd28ca5ff5655bac - I think having explicit internal libs + a "magic" open() function wrapping them for ease of use with strings is at least a better solution than just supporting string inputs. |
Been meaning to try my hand at something like this for a while, and noticed something similar mentioned as part of #13537. The suggestion on the original ticket is to use
TcpStream::open(&str)
to pass in a host + port string, but seems a little cleaner to pass in host and port separately -- so a signature likeTcpStream::open(&str, u16)
.Also means we can use std::io::net::addrinfo directly instead of using e.g. liburl to parse the host+port pair from a string.
One outstanding issue in this PR that I'm not entirely sure how to address: in open_timeout, the timeout_ms will apply for every A record we find associated with a hostname -- probably not the intended behavior, but I didn't want to waste my time on elaborate alternatives until the general idea was a-OKed. :)
Anyway, perhaps there are other reasons for us to prefer the original proposed syntax, but thought I'd get some thoughts on this. Maybe there are some solid reasons to prefer using liburl to do this stuff.