-
Notifications
You must be signed in to change notification settings - Fork 27
fix: copy_io is a large future, this will result in a large number of memory copies #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Actually, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes the copy_io function by moving buffer allocations from the stack to the heap using boxed slices. The change addresses memory efficiency concerns when handling a large number of concurrent connections, reducing the future size and associated memory copying overhead.
Key Changes:
- Modified buffer allocation strategy from stack arrays to heap-allocated boxed slices
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tuic-server/src/io.rs
Outdated
| let mut a2b = vec![0u8; BUFFER_SIZE].into_boxed_slice(); | ||
| let mut b2a = vec![0u8; BUFFER_SIZE].into_boxed_slice(); |
Copilot
AI
Nov 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The allocation pattern vec![0u8; BUFFER_SIZE].into_boxed_slice() performs unnecessary zero-initialization. Consider using Box::new_uninit_slice(BUFFER_SIZE) followed by assume_init() after the first read, or allocate with Vec::with_capacity(BUFFER_SIZE) and manually set the length, to avoid the overhead of zeroing memory that will be immediately overwritten by the read operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about the bytes crate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary to use the bytes crate here, since the buffer is used in a very simple way.
What's more worth optimizing is what copilot mentioned there's no need to initialize the slice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result after using uninit is as follows:
[Stack] 1000000 concurrent connections completed in 1.846773625s
[Box] 1000000 concurrent connections completed in 603.313875ms
[Box uninit] 1000000 concurrent connections completed in 566.827834ms
|
Could you add several simple unit tests to |
I wrote a simple test, but I won't put it in the source code because the benchmark framework hasn't been introduced yet.
result: