-
Notifications
You must be signed in to change notification settings - Fork 5.5k
chore: Add http2 data compression for cpp worker #26382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR adds support for ZSTD-based decompression of incoming Thrift requests and enables HTTP/2 response compression (including Thrift content types) in the Presto native execution engine, by updating request handlers, utilities, and server configuration. Sequence diagram for handling compressed Thrift requestssequenceDiagram
participant Client
participant "HTTPServer"
participant "TaskResource"
participant "Utils (decompressMessageBody)"
participant "TaskManager"
Client->>HTTPServer: Send Thrift request (Content-Encoding: zstd)
HTTPServer->>TaskResource: Forward request with headers
TaskResource->>Utils (decompressMessageBody): Decompress request body if isCompressed
Utils (decompressMessageBody)-->>TaskResource: Return decompressed body
TaskResource->>TaskManager: Process decompressed request
TaskManager-->>TaskResource: Return response
TaskResource->>HTTPServer: Send response
HTTPServer->>Client: Return compressed response (if applicable)
Class diagram for updated Utils functionsclassDiagram
class Utils {
+std::string extractMessageBody(const std::vector<std::unique_ptr<folly::IOBuf>>& body)
+std::string decompressMessageBody(const std::vector<std::unique_ptr<folly::IOBuf>>& body, const std::string& contentEncoding)
}
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- decompressMessageBody uses prependChain which reverses buffer order and may incur extra copies—consider using appendChain or folly's coalescing utilities to preserve ordering and reduce overhead.
- the contentEncoding check is exact-match and case-sensitive for "zstd" only—normalize the header and support common variants (e.g. uppercase, "x-zstd", or even gzip) to be more robust.
- response compression is enabled only for gzip and thrift mime types—if you want end-to-end zstd support, consider adding zstd to the HTTP/2 response compression codecs or exposing negotiation of the response codec.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- decompressMessageBody uses prependChain which reverses buffer order and may incur extra copies—consider using appendChain or folly's coalescing utilities to preserve ordering and reduce overhead.
- the contentEncoding check is exact-match and case-sensitive for "zstd" only—normalize the header and support common variants (e.g. uppercase, "x-zstd", or even gzip) to be more robust.
- response compression is enabled only for gzip and thrift mime types—if you want end-to-end zstd support, consider adding zstd to the HTTP/2 response compression codecs or exposing negotiation of the response codec.
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/common/Utils.cpp:97-101` </location>
<code_context>
+ try {
+ // Combine all IOBufs into a single chain
+ std::unique_ptr<folly::IOBuf> combined;
+ for (const auto& buf : body) {
+ if (!combined) {
+ combined = buf->clone();
+ } else {
+ combined->prependChain(buf->clone());
+ }
+ }
</code_context>
<issue_to_address>
**issue (bug_risk):** Prepending chains may reverse buffer order, potentially altering request body.
Using prependChain here reverses the buffer order. If maintaining the original order is important, use appendChain instead.
</issue_to_address>
### Comment 2
<location> `presto-native-execution/presto_cpp/main/common/Utils.cpp:116` </location>
<code_context>
+ }
+
+ // Decompress the data
+ auto codec = folly::compression::getCodec(codecType);
+ auto decompressed = codec->uncompress(combined.get());
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Potential null pointer dereference if getCodec returns nullptr.
Add a nullptr check for codec before calling uncompress to prevent a crash if no codec is found.
</issue_to_address>
### Comment 3
<location> `presto-native-execution/presto_cpp/main/common/Utils.cpp:128-134` </location>
<code_context>
+
+ return ret;
+ } catch (const std::exception& e) {
+ LOG(ERROR) << "Failed to decompress request body with " << contentEncoding
+ << ": " << e.what() << ". Treating as uncompressed.";
+ // Fall back to treating it as uncompressed
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Falling back to uncompressed extraction on decompression failure may mask client errors.
Returning an error response instead of accepting uncompressed data may prevent processing invalid input when decompression fails.
```suggestion
} catch (const std::exception& e) {
LOG(ERROR) << "Failed to decompress request body with " << contentEncoding
<< ": " << e.what() << ". Returning error response.";
throw std::runtime_error(
"Failed to decompress request body: " + std::string(e.what()));
}
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| for (const auto& buf : body) { | ||
| if (!combined) { | ||
| combined = buf->clone(); | ||
| } else { | ||
| combined->prependChain(buf->clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Prepending chains may reverse buffer order, potentially altering request body.
Using prependChain here reverses the buffer order. If maintaining the original order is important, use appendChain instead.
| } | ||
|
|
||
| // Decompress the data | ||
| auto codec = folly::compression::getCodec(codecType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Potential null pointer dereference if getCodec returns nullptr.
Add a nullptr check for codec before calling uncompress to prevent a crash if no codec is found.
Summary: 1. add support to decompress thrift request with zstd and compress thrift response for http server Differential Revision: D85150330
7dfecd1 to
40890ad
Compare
Summary: 1. add support to decompress thrift request with zstd and compress thrift response for http server Differential Revision: D85150330
40890ad to
ec380de
Compare
Summary: 1. add support to decompress thrift request with zstd and compress thrift response for http server Differential Revision: D85150330
ec380de to
5e8756b
Compare
Summary: 1. add support to decompress thrift request with zstd and compress thrift response for http server Differential Revision: D85150330
5e8756b to
06f28bc
Compare
Summary: 1. add support to decompress thrift request with zstd and compress thrift response for http server Differential Revision: D85150330
06f28bc to
b93d88e
Compare
Description
Motivation and Context
Differential Revision: D85150330