Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Health checks no longer works with legacy compression options. #8162

Open
s-matyukevich opened this issue Mar 11, 2025 · 6 comments
Open

Health checks no longer works with legacy compression options. #8162

s-matyukevich opened this issue Mar 11, 2025 · 6 comments
Labels
Area: RPC Features Includes Compression, Encoding, Attributes/Metadata, Interceptors. Type: Bug

Comments

@s-matyukevich
Copy link
Contributor

What version of gRPC are you using?

Found the bug in v1.70.0 and confirmed it is wtill there in the HEAD of master.

What version of Go are you using (go version)?

go version go1.24.0 darwin/arm64

What operating system (Linux, Windows, …) and version?

Mac OS

What did you do?

Modify health example in the following way:

  • Add the following 2 options on the client
    grpc.WithCompressor(grpc.NewGZIPCompressor()),
    grpc.WithDecompressor(grpc.NewGZIPDecompressor()),
    
  • Add the following 2 options on the server
    grpc.RPCCompressor(grpc.NewGZIPCompressor()),
    grpc.RPCDecompressor(grpc.NewGZIPDecompressor()),
    

What did you expect to see?

Example working correctly

What did you see instead?

Health check failing. Client can't connect to the server.

I think I traced the issue to its root cause. If I modify this code to include decompressor provided in the options, e.g

as := &addrConnStream{
		...
                decompressorV0:   ac.cc.dopts.dc,
	}

everything is working as expected. This is not a proper fix as you also need to handle cases when compression is configured using the new API, but it should be easy to handle that.

@kkhor-datadog
Copy link

We encountered this issue when upgrading from v1.64.0 to v1.70.0, and were able to reproduce following the steps provided above.

With the reproduction steps provided (using legacy compression options on the client and server in the healthcheck example), we observe the following output from the healthcheck example on the v1.64.x branch:

go run client/main.go
2025/03/11 15:45:25 ERROR: [core] [Channel #1 SubChannel #3]Subchannel health check is unimplemented at server side, thus health check is disabled
2025/03/11 15:45:25 ERROR: [core] [Channel #1 SubChannel #2]Subchannel health check is unimplemented at server side, thus health check is disabled
UnaryEcho:  hello from localhost:50052
UnaryEcho:  hello from localhost:50052

The error log comes from here, and seems to occur because the client does not have a decompressor registered, which is recorded as an Unimplemented error, and ends up being interpreted as a server with a disabled health check, which is considered healthy.

On the v1.70.x branch, following the reproduction steps, we observe the following output from the healthcheck example:

UnaryEcho: _,  rpc error: code = Unavailable desc = last connection error: connection active but received health check RPC error: rpc error: code = Internal desc = grpc: Decompressor is not installed for grpc-encoding "gzip"
UnaryEcho: _,  rpc error: code = Unavailable desc = last connection error: connection active but received health check RPC error: rpc error: code = Internal desc = grpc: Decompressor is not installed for grpc-encoding "gzip"
UnaryEcho: _,  rpc error: code = Unavailable desc = last connection error: connection active but received health check RPC error: rpc error: code = Internal desc = grpc: Decompressor is not installed for grpc-encoding "gzip"

We believe this behavior was exposed by #7461, though as described in the original post, the fix may be to register the legacy decompressor in the health check stream.

@atollena atollena added the Area: RPC Features Includes Compression, Encoding, Attributes/Metadata, Interceptors. label Mar 12, 2025
@atollena atollena added the P1 label Mar 12, 2025
@atollena
Copy link
Collaborator

Checked with @s-matyukevich and he will provide a patch for this.

@purnesh42H
Copy link
Contributor

I see. So, #7461 changed to return INTERNAL status instead of UNIMPLEMENTED and the upstream code is expecting UNIMPLEMENTED to be considered healthy with server health check disabled. However, I am not sure "register the legacy decompressor in the health check stream" is the appropriate fix here. Shouldn't the server with a disabled health check be updated? @dfawley @easwars

@atollena
Copy link
Collaborator

atollena commented Mar 13, 2025

I see. So, #7461 changed to return INTERNAL status instead of UNIMPLEMENTED and the upstream code is expecting UNIMPLEMENTED to be considered healthy with server health check disabled. However, I am not sure "register the legacy decompressor in the health check stream" is the appropriate fix here. Shouldn't the server with a disabled health check be updated? @dfawley @easwars

There's a cross-language design question, I think. Are there cases where health checks should use compression? cc @ejona86 any hints on whether java ever enable compression on health checks? There's nothing explicit on the design.

Since, at least in Go, there's no way to control per-CallOptions on health checks. So it's not possible to enable compression without legacy options. A correct client side fix is probably to disable compression on health checks even when the legacy WithCompressor option is used. On the server side returning Unimplemented as before like @purnesh42H suggests, when health checks are requested with compression, is also probably a better trade off (it prevents breaking old clients while they get updated to use a non-compressed health check, since Unimplemented disables health checks whereas Internal fails it).

@ejona86
Copy link
Member

ejona86 commented Mar 17, 2025

I don't understand why the INTERNAL behavior matters. The client should never see that error message; the server did something wrong. Does grpc-go allow the server to respond with a client-unsupported algorithm? If so, that seems broken.

grpc-java doesn't use compression for health, unless you do something from an interceptor or such.

@atollena
Copy link
Collaborator

I tried to reproduce the issue on master today and the server properly responds to health check requests with an unsupported compression algorithm with UNIMPLEMENTED:

2025/03/21 10:21:33 ERROR: [health_service] Subchannel health check is unimplemented at server side, thus health check is disabled for SubConn 0x140001d6730

This can be verified by adding grpc.WithCompressor(grpc.NewGZIPCompressor()), to the list of DialOptions in the health check example at

, and running the example.

It's not clear to me what happened in this issue then which triggered the Internal error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: RPC Features Includes Compression, Encoding, Attributes/Metadata, Interceptors. Type: Bug
Projects
None yet
Development

No branches or pull requests

5 participants