[BUG] Unable to configure gRPC keepalive/HTTP2 ping options – idle connections closed #813

@zgxkbtl

Expected Behavior

I should be able to configure gRPC channel options—especially HTTP/2 keepalive and ping parameters—directly in the Python SDK, for example:

client = DaprClient(
  address="127.0.0.1:3500",
  grpc_channel_options=[
    ('grpc.keepalive_time_ms', 10000),
    ('grpc.keepalive_timeout_ms', 5000),
    ('grpc.keepalive_permit_without_calls', 1),
    ('grpc.http2.min_time_between_pings_ms', 1000),
    ('grpc.http2.min_recv_ping_interval_without_data_ms', 0),
  ],
)
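A minimal sketch of how such an option list could be assembled and handed to a raw gRPC channel today, assuming `grpcio` is installed. The helper names `keepalive_channel_options` and `open_channel` are hypothetical, and `grpc_channel_options` on `DaprClient` remains the requested feature rather than an existing parameter. `grpc.http2.max_pings_without_data` is not part of the list above; it is included here as an assumption, since it is the gRPC core argument that limits pings sent while no data is flowing.

```python
def keepalive_channel_options(time_ms=10_000, timeout_ms=5_000,
                              permit_without_calls=True):
    """Build gRPC channel arguments for client-side keepalive pings."""
    return [
        # Send a keepalive PING after this much transport inactivity.
        ("grpc.keepalive_time_ms", time_ms),
        # Treat the connection as dead if the PING is not acknowledged in time.
        ("grpc.keepalive_timeout_ms", timeout_ms),
        # Keep pinging even when no RPC is in flight.
        ("grpc.keepalive_permit_without_calls", int(permit_without_calls)),
        # Assumption, not from the issue: 0 lifts the cap on data-less pings.
        ("grpc.http2.max_pings_without_data", 0),
    ]


def open_channel(address, options):
    """Open an insecure channel with the given options (needs grpcio)."""
    import grpc  # imported lazily so the option builder works without grpcio

    return grpc.insecure_channel(address, options=options)


if __name__ == "__main__":
    for name, value in keepalive_channel_options():
        print(name, value)
```

Keeping the option builder separate from channel creation means the same list could be passed to `DaprClient` if the SDK later accepts channel options.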

Actual Behavior

When using the Dapr Python SDK over gRPC, long‐lived HTTP/2 connections are torn down after ~20 seconds of inactivity:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"

Pseudocode:

client = DaprClient()
client.invoke_binding(...)  # first call succeeds
time.sleep(20)              # connection sits idle
client.invoke_binding(...)  # second call fails with "Socket closed"

On the Dapr side you see:

2025/07/01 16:09:35 INFO: [transport] … Closing: keepalive ping not acked within timeout 5s
2025/07/01 16:09:35 INFO: [transport] … loopyWriter exiting with error: transport closed by client

By default the Python SDK calls grpc.insecure_channel(...) without any keepalive or HTTP/2 ping tuning, so:

The client never sends PINGs (server sees idle for 20 s and pings)
The client’s C‐Core enforces a minimum ping‐interval, so it does not ACK server PINGs in time
The Dapr (Go‐gRPC) runtime kills the connection
Subsequent RPCs fail with “Socket closed”
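The sequence above is a hypothesis, and one way to check it from the client side is gRPC core's own tracing. A minimal sketch, assuming the `GRPC_VERBOSITY` and `GRPC_TRACE` environment variables and the `http_keepalive`, `http2_ping` and `connectivity_state` tracers are supported by the installed `grpcio` build (tracer availability varies between releases). The helper name `enable_grpc_keepalive_tracing` is hypothetical.

```python
import os


def enable_grpc_keepalive_tracing(env=os.environ):
    """Turn on gRPC core tracing for keepalive and ping activity.

    Must run before `import grpc` (or any Dapr import that pulls it in),
    because the core library reads these variables when it initialises.
    """
    env["GRPC_VERBOSITY"] = "debug"
    env["GRPC_TRACE"] = "http_keepalive,http2_ping,connectivity_state"
    return env


if __name__ == "__main__":
    enable_grpc_keepalive_tracing()
    # Import the SDK and run the reproduction only after this point, e.g.:
    #   from dapr.clients import DaprClient
    print(os.environ["GRPC_TRACE"])
```

With tracing enabled, the client log should show whether PING frames from the sidecar arrive and whether they are acknowledged before the connection is closed.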

Steps to Reproduce the Problem

import json, time
from dapr.clients import DaprClient

def main():
    metadata = {
        "rpc-version": "1.0.0.DAILY",
        "rpc-group": "HSF",
        "rpc-interface-name": "HelloService",
        "rpc-method-name": "sayHello",
        "rpc-method-parameter-types": "java.lang.String",
        "serialization-type": "application/json"
    }
    client = DaprClient()
    # First invocation succeeds
    resp1 = client.invoke_binding(
        binding_name="hsf.consumer",
        operation="invoke",
        data=json.dumps(["zhangsan"]),
        binding_metadata=metadata
    )
    print("Response 1:", resp1.data.decode())

    # Sleep longer than the 20s idle‐timeout on the server/Proxy
    time.sleep(20)

    # Second invocation fails: Socket closed
    resp2 = client.invoke_binding(
        binding_name="hsf.consumer",
        operation="invoke",
        data=json.dumps(["zhangsan"]),
        binding_metadata=metadata
    )
    print("Response 2:", resp2.data.decode())

if __name__ == "__main__":
    main()

Console showed:

Response 1: "Hello, zhangsan"
Traceback (most recent call last):
  File "/Users/hy/Documents/VscodeProjects/dapr-demo-python/main.py", line 120, in <module>
    main()
  File "/Users/hy/Documents/VscodeProjects/dapr-demo-python/main.py", line 75, in main
    response = client.invoke_binding(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hy/Documents/VscodeProjects/dapr-demo-python/.venv/lib/python3.12/site-packages/dapr/clients/grpc/client.py", line 409, in invoke_binding
    response, call = self.retry_policy.run_rpc(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hy/Documents/VscodeProjects/dapr-demo-python/.venv/lib/python3.12/site-packages/dapr/clients/retry.py", line 75, in run_rpc
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hy/Documents/VscodeProjects/dapr-demo-python/.venv/lib/python3.12/site-packages/grpc/_channel.py", line 1198, in with_call
    return _end_unary_response_blocking(state, call, True, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hy/Documents/VscodeProjects/dapr-demo-python/.venv/lib/python3.12/site-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "Socket closed"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_status:14, grpc_message:"Socket closed"}"
>

What I’ve Tried

Subclassing DaprClient / DaprGrpcClient

Close the default channel, rebuild with:

grpc.insecure_channel(
  address,
  options=[
    ('grpc.keepalive_time_ms', 10_000),
    ('grpc.keepalive_timeout_ms', 5_000),
    ('grpc.keepalive_permit_without_calls', 1),
    ('grpc.http2.min_time_between_pings_ms', 1_000),
    ('grpc.http2.min_recv_ping_interval_without_data_ms', 0),
  ],
)

Re-wire all the stubs (DaprStub, ActorStub, etc) to use the new channel.
Still see “keepalive ping not acked within timeout 5s” and socket closes.

Recreating DaprClient() before every RPC

Works around the idle issue but imposes a TCP/TLS handshake per call and hurts performance.
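A cheaper variant of the second workaround is to rebuild the client only after a call fails on a dead connection. The sketch below is an illustration under stated assumptions, not part of the SDK: `ReconnectingClient` is a hypothetical wrapper, `make_client` is any zero-argument factory (for example `DaprClient`), and the wrapped client is assumed to expose `close()`. It retries the call once, which is only safe if the invoked operation tolerates being sent twice.

```python
class ReconnectingClient:
    """Hold one client and rebuild it when a call hits a lost connection."""

    def __init__(self, make_client, is_connection_lost):
        self._make_client = make_client
        self._is_connection_lost = is_connection_lost
        self._client = make_client()

    def call(self, do_call):
        """Run do_call(client); on a lost connection, reconnect and retry once."""
        try:
            return do_call(self._client)
        except Exception as exc:
            if not self._is_connection_lost(exc):
                raise
        # The old connection is gone: drop it and retry on a fresh client.
        self._client.close()
        self._client = self._make_client()
        return do_call(self._client)


# Possible wiring with the Dapr SDK (assumes dapr and grpcio are installed):
#   import grpc
#   from dapr.clients import DaprClient
#   lost = lambda e: (isinstance(e, grpc.RpcError)
#                     and e.code() == grpc.StatusCode.UNAVAILABLE)
#   wrapper = ReconnectingClient(DaprClient, lost)
#   resp = wrapper.call(lambda c: c.invoke_binding(
#       binding_name="hsf.consumer", operation="invoke", data="[]"))
```

This pays for a new connection only after an idle disconnect instead of on every call, but it hides the underlying keepalive problem rather than fixing it.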

Release Note

RELEASE NOTE:
