-
Notifications
You must be signed in to change notification settings - Fork 161
Streaming support for JsonRpcServer #4879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streaming support for JsonRpcServer #4879
Conversation
| def encode(self) -> bytes: | ||
| return json.dumps(self.to_json()).encode('ascii') | ||
| def encode(self) -> Iterator[bytes]: | ||
| yield json.dumps(self.wrap_response()).encode('utf-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This yields the whole bytes value, if you want to yield each byte, you can do
data = json.dumps(self.wrap_response()).encode('utf-8')
yield from (bytes([b]) for b in data)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we need this fine granularity. The purpose of splitting the output stream into chunks is to keep the memory usage low. For example, kontrol-node writes a large (>1GB) JSON response to disk. The current design requires reading these files in full into memory before sending them over the network. The new design allows us to read and transfer these files in chunks. Additionally, the new design also avoids one circle of JSON parsing and serializing the files.
| id: str | int | None | ||
|
|
||
| def to_json(self) -> dict[str, Any]: | ||
| def wrap_response(self) -> dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description states
This PR extends the JsonRpcServer with support for streaming.
but it introduces what look like breaking changes. Are there other clients to this library other than kontrol-node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I cannot think of a use case where a client library actually uses any JsonRpcResult subclasses directly. They were introduced as an internal abstraction layer when I implemented batch requests. Also, I'm not aware of any other user than kontrol-node. Therefore, the risk is very low here and can be easily addressed in downstream repositories.
| version_encoded = json.dumps(JsonRpcServer.JSONRPC_VERSION) | ||
| yield f'{{"jsonrpc": {version_encoded}, "id": {id_encoded}, "result": '.encode() | ||
| if isinstance(self.payload, Iterator): | ||
| yield from self.payload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is self.payload a bytes now? If so, the type hint can be strengthened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily. payload can still be any JSON serializable Python value, or it can be a Iterator[bytes] object. This is to keep backwards compatibility.
| if isinstance(self.payload, Iterator): | ||
| yield from self.payload | ||
| else: | ||
| yield json.dumps(self.payload).encode('utf-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also yields the whole bytes.
2d83de4
into
develop
This PR extends the JsonRpcServer with support for streaming.
JsonRpcMethods can now return a stream of bytes (Iterator[bytes]) instead of a JSON-serializable value.
The method must ensure that the produced stream is already JSON-encoded, as no extra validation is performed.
This allows us to transfer large responses without buffering the entire response in memory.
This is needed by the kontrol-node, when transferring large execution traces.