Optimize creation of buffered readers/writers #10424


Merged: 1 commit into rust-lang:master on Nov 12, 2013

Conversation

alexcrichton (Member)

I was benchmarking rust-http recently, and I saw that 50% of its time was spent
creating buffered readers/writers. Admittedly, rust-http wasn't using
std::rt::io::buffered, but the same idea applies here. It's much cheaper to
malloc a large region and not initialize it than to set it all to 0. Buffered
readers/writers never use uninitialized data, and their internal buffers are
encapsulated, so any use of an uninitialized slot is an implementation bug in
the readers/writers.

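A minimal sketch of the technique in modern Rust (the PR itself predates this API and used `std::rt::io`; `BufReaderSketch` is a hypothetical name):

```rust
use std::io::Read;

// Sketch: reserve the internal buffer's capacity without zeroing it, and keep
// it encapsulated so callers can never observe uninitialized bytes.
struct BufReaderSketch<R> {
    inner: R,
    buf: Vec<u8>, // capacity reserved up front, no bytes written yet
}

impl<R: Read> BufReaderSketch<R> {
    fn with_capacity(cap: usize, inner: R) -> Self {
        // `Vec::with_capacity` is a plain allocation: unlike `vec![0u8; cap]`,
        // it never writes to the region, which is the saving described above.
        BufReaderSketch { inner, buf: Vec::with_capacity(cap) }
    }
}
```
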
zargony (Contributor) commented Nov 11, 2013

I was wondering if this also applies in general to every reader's read method. Currently, Reader::read requires an initialized vector to be passed. Wouldn't it be better to pass an empty vector and read data up to its capacity? I did something similar in rust-fuse when reading commands from a fd that's connected to the kernel driver: channel.rs, method receive().
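
A sketch of that pattern in modern Rust (`read_to_capacity` is a hypothetical helper, not a real API; the soundness caveat in the comment is exactly what the rest of this thread goes on to debate):

```rust
use std::io::{self, Read};

// Hypothetical helper: the caller passes an empty Vec, and we read up to its
// capacity, leaving the length set to the number of bytes actually filled.
fn read_to_capacity<R: Read>(r: &mut R, buf: &mut Vec<u8>) -> io::Result<usize> {
    debug_assert!(buf.is_empty());
    let cap = buf.capacity();
    // Unsound in general: this exposes uninitialized memory as &mut [u8] and
    // trusts `read` to only write to it, never read from it.
    unsafe { buf.set_len(cap) };
    match r.read(&mut buf[..]) {
        Ok(n) => { buf.truncate(n); Ok(n) }
        Err(e) => { buf.truncate(0); Err(e) }
    }
}
```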

alexcrichton (Member, Author)

I agree that right now we don't really have a safe way of specifying "read into this unallocated data". It's tough because you can only set the length on an owned vector, and I would much rather have fn read(&mut [u8]) than fn read(&mut ~[u8]). I think that we can live with this for now as long as we design things well, although we may want some sort of utility which wraps this unsafe behavior for us.
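
For reference, the shape such a utility eventually took in modern Rust (an editor's aside, not part of this PR) is `Vec::spare_capacity_mut`, which types the unused capacity as `MaybeUninit` bytes instead of relying on an ad-hoc unsafe convention:

```rust
use std::mem::MaybeUninit;

fn push_one(buf: &mut Vec<u8>) {
    // The spare capacity is handed out as `&mut [MaybeUninit<u8>]`, so the
    // type system tracks "allocated but uninitialized".
    let spare: &mut [MaybeUninit<u8>] = buf.spare_capacity_mut();
    if let Some(slot) = spare.first_mut() {
        slot.write(0xFF);
        // SAFETY: exactly one more byte of the capacity is now initialized.
        unsafe { buf.set_len(buf.len() + 1) };
    }
}
```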

brson (Contributor) commented Nov 11, 2013

This is a common enough pattern that it would be nice to have a function that gives you uninitialized u8s. There are security implications to handing out uninitialized buffers, though, so we'd probably want to talk over whether this is a 'safe' operation.

sfackler (Member)

One option that may be a bit safer is to provide a method that returns a zeroed vector instead of an uninitialized one. I'm not sure how smart the calloc implementation Rust is using is, but large allocations are presumably backed by anonymous pages, which are automatically zeroed, so large allocations of zeros should be just as fast as large allocations of uninitialized memory. It wouldn't be safe to provide it for arbitrary ~[T], but adding it to vec::bytes and only providing it for ~[u8] would probably cover the majority of the use cases.
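
A sketch of the suggested helper in modern terms (editor's note: today `vec![0u8; len]` already specializes to the calloc-like `alloc_zeroed` path, so whether it matches an uninitialized allocation's speed is purely an allocator question, as the numbers below show):

```rust
// Sketch of the proposed zeroed-byte-vector helper. In modern Rust this is
// just `vec![0u8; len]`, which specializes to `alloc_zeroed` (calloc-like);
// its cost relative to an uninitialized allocation depends on the allocator.
fn zeroed_byte_vec(len: usize) -> Vec<u8> {
    vec![0u8; len]
}
```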

huonw (Member) commented Nov 12, 2013

@brson: our "behaviours considered unsafe" section includes "Reads of undef (uninitialized) memory", so presumably uninit_vec(length: uint) -> ~[u8] would have to be unsafe.
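
A sketch of that signature in modern syntax (hypothetical function; `uint` and `~[u8]` correspond to today's `usize` and `Vec<u8>`):

```rust
/// Hypothetical `uninit_vec`, carrying the `unsafe` contract huonw describes:
/// the caller must write each byte before reading it.
unsafe fn uninit_vec(len: usize) -> Vec<u8> {
    let mut v = Vec::with_capacity(len);
    // SAFETY: deferred to the caller. Every byte below `len` is uninitialized
    // until overwritten; reading one first is undefined behavior.
    unsafe { v.set_len(len) };
    v
}
```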

bors added a commit that referenced this pull request Nov 12, 2013
alexcrichton (Member, Author)

@sfackler, sadly, at least on OS X, calloc is not nearly as fast as malloc. In a small #[bench] test doing just free(alloc(64K)), I got:

```
running 2 tests
test calloc ... bench:      1938 ns/iter (+/- 99)
test malloc ... bench:        47 ns/iter (+/- 3)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
```
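
A comparable microbenchmark against today's `std::alloc` (a sketch; the original used the nightly `#[bench]` harness, and the numbers vary by platform and allocator):

```rust
use std::alloc::{alloc, alloc_zeroed, dealloc, Layout};
use std::hint::black_box;
use std::time::Instant;

fn main() {
    let layout = Layout::from_size_align(64 * 1024, 8).unwrap();
    const ITERS: u32 = 100_000;

    for (name, zeroed) in [("calloc-like", true), ("malloc-like", false)] {
        let start = Instant::now();
        for _ in 0..ITERS {
            unsafe {
                let p = if zeroed { alloc_zeroed(layout) } else { alloc(layout) };
                assert!(!p.is_null());
                black_box(p); // keep the allocation from being optimized away
                dealloc(p, layout);
            }
        }
        println!("{name}: {:?}/iter", start.elapsed() / ITERS);
    }
}
```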

sfackler (Member)

Bummer. I wonder if jemalloc does any better. Are there any plans to switch back to that?

alexcrichton (Member, Author)

We don't have any concrete plans, but I also don't think that we'd turn it down at all. It would be nice to have some numbers to back the change, however.

sfackler (Member)

Based on some quick checks, it looks like calloc(64 * 1024, 1) runs at about the same speed as malloc(64 * 1024) with jemalloc.
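
To reproduce that check today (assuming the third-party `tikv-jemallocator` crate as a Cargo dependency; any jemalloc binding would do), the global allocator can be swapped before rerunning the benchmark above:

```rust
// Assumes `tikv-jemallocator` in Cargo.toml. With this in the crate root, all
// Rust allocations (including the benchmark above) go through jemalloc.
#[global_allocator]
static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;
```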

alexcrichton (Member, Author)

Interesting! That's not a bad reason to start looking into jemalloc again.

bors closed this Nov 12, 2013
bors merged commit cdf7d63 into rust-lang:master Nov 12, 2013
bors added a commit that referenced this pull request Nov 13, 2013
#10424 did optimizations without adding a way to measure their effect and ensure no regressions. This fixes that.
alexcrichton deleted the optimize-buffered branch on November 13, 2013
flip1995 pushed a commit to flip1995/rust that referenced this pull request Jun 30, 2023
…shearth

manual_let_else: support struct patterns

This builds on the improvements of rust-lang#10797 and:

* Only prints `()` around `Or` patterns at the top level (fixing a regression of rust-lang#10797)
* Supports multi-binding patterns: `let (u, v) = if let (Some(u_i), Ok(v_i)) = ex { (u_i, v_i) } else ...`
* Traverses through tuple patterns: `let v = if let (Some(v), None) = ex { v } else ...`
* Supports struct patterns: `let v = if let S { v, w, } = ex { (v, w) } else ...`

```
changelog: [`manual_let_else`]: improve pattern printing to support struct patterns
```

fixes rust-lang#10708
fixes rust-lang#10424
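
For illustration (example mine, not from the Clippy PR), the struct-pattern case from the list above, before and after the lint's suggestion:

```rust
struct S { v: u32, w: u32 }

fn demo(ex: Option<S>) -> u32 {
    // Before: a manual let-else spelled with if-let, e.g.
    // let (v, w) = if let Some(S { v, w }) = ex { (v, w) } else { return 0 };

    // After: the `let ... else` form that `manual_let_else` now suggests.
    let Some(S { v, w }) = ex else { return 0 };
    v + w
}
```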