-
Notifications
You must be signed in to change notification settings - Fork 72
Removed unnecessary intermediate Vec<Bytes> allocation in HTTP request body #192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes the body stream creation for PUT and POST requests by eliminating an unnecessary intermediate vector allocation. The change removes the TODO comment about the inefficiency and directly converts the Arc<SegmentedBytes> into an iterator.
Key Changes:
- Removed intermediate
Vec<Bytes>collection that caused unnecessary memory allocation - Used
Arc::unwrap_or_clone()to efficiently extract or clone theSegmentedBytesbefore converting to iterator - Simplified stream creation by directly mapping the iterator
003bbd9 to
ae0edaf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
| if (*method == Method::PUT) || (*method == Method::POST) { | ||
| //TODO: why-oh-why first collect into a vector and then iterate to a stream? | ||
| let bytes_vec: Vec<Bytes> = match body { | ||
| Some(v) => v.iter().collect(), | ||
| None => Vec::new(), | ||
| let iter = match body { | ||
| Some(v) => BodyIterator::Segmented(Arc::unwrap_or_clone(v).into_iter()), | ||
| None => BodyIterator::Empty(std::iter::empty()), | ||
| }; | ||
| let stream = futures_util::stream::iter( | ||
| bytes_vec.into_iter().map(|b| -> Result<_, Error> { Ok(b) }), | ||
| ); | ||
| let stream = futures_util::stream::iter(iter.map(|b| -> Result<_, Error> { Ok(b) })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something, we're still loading the whole body into memory within SegmentedBytes, wouldn't it make sense to have that already as a Stream so that e.g. uploading a file would not have to load the whole thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing implementation is already fairly efficient, since nothing actually gets copied. The bytes are split into a vector of byte segments, it might look like data is being copied, but that is not the case, because Bytes is reference-counted. Still, if there are 100+ segments, all those reference counts need to be updated, and that is what is being changed here. And I didn't want to change SegmentedBytes. But yes, maybe there are better optimizations to get rid of the segments in the first place.
cb97a7e to
19d7c3a
Compare
…est body preparation - Removed unnecessary intermediate `Vec<Bytes>` allocation in HTTP request body preparation - Directly consume `SegmentedBytes` iterator into stream using `Arc::unwrap_or_clone()` - Implemented zero-cost `BodyIterator` enum to avoid heap allocation and dynamic dispatch - Reduces memory overhead and eliminates Vec growth reallocations during request preparation
19d7c3a to
6ccbfe1
Compare
Summary
Vec<Bytes>allocation in HTTP request body preparationSegmentedBytesiterator into stream usingArc::unwrap_or_clone()BodyIteratorenum to avoid heap allocation and dynamic dispatchMemory Impact Analysis
Important: Bytes is reference-counted, so cloning it never copies the actual byte data - just increments a reference count. The 1GB of actual data is
never duplicated in either implementation. The difference is in metadata overhead - the collection of Bytes pointers.
Scenario 1: Multipart Upload (5MB parts)
For a 1GB file with MIN_PART_SIZE = 5 MiB:
Original Implementation:
New Implementation:
Savings: ~10-13 KB
Scenario 2: Streaming Read (smaller chunks)
If the 1GB is read from a stream in smaller buffers (e.g., 64KB chunks common in I/O):
Original Implementation:
New Implementation:
Savings: ~750 KB - 1 MB
Additional Benefits
Technical Details
Arc::unwrap_or_clone() efficiently handles the Arc:
The BodyIterator enum provides type unification without runtime overhead, as the enum variants are resolved at compile time.