GRPC metric exporter doesn't reconnect #4435
Comments
I have the same problem with the gRPC exporter and reconnecting. The HTTP exporter works fine and survives the OTel Collector going up and down.
There is a repro repo for a similar case.
I think this might be related to #4429.
I am facing the same issue. I do not think that fixing #4429 solves it, because I am able to reproduce my issues even when running with I am also able to reproduce the same systems if the metrics endpoint (in this case the I suspect that something is not right with the retry logic or usage of gRPC channels in
I'm also seeing this when using auto instrumentation with FastAPI. My application seems unable to reconnect to the local gRPC sink as described by @rmelick-muon.
I see, same issue. with
I believe the issue may be due to: grpc/grpc#38290
This is a big issue. I cannot afford to lose all metrics from running pods just because of a Grafana Alloy upgrade.
Describe your environment
OS: Ubuntu
Python version: Python 3.8
SDK version: 1.27.0
API version: 1.27.0
Exporter: 1.27.0
Endpoint:
Telegraf, docker, 1.28, OpenTelemetry input(https://github.com/influxdata/telegraf/tree/master/plugins/inputs/opentelemetry)
What happened?
If the metric endpoint is not available when a "PeriodicExportingMetricReader" with an "OTLPMetricExporter" starts, the exporter cannot connect to it, even after the endpoint comes up.
It keeps retrying the export, but without success:
Transient error StatusCode.UNAVAILABLE encountered while exporting metrics to localhost:4317, retrying in 1s.
...
Steps to Reproduce
Config file (/tmp/tele.conf)
Start telegraf:
Expected Result
The exporter should try to rebuild the connection to the endpoint in case of "StatusCode.UNAVAILABLE".
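To make the expected behaviour concrete, here is a minimal, library-independent sketch of "rebuild the connection on UNAVAILABLE". It is not the SDK's retry logic. `Unavailable`, `export_with_reconnect`, and the `connect` factory are hypothetical names standing in for a gRPC `StatusCode.UNAVAILABLE` error and for creating a fresh channel and stub, and the doubling backoff (1s, 2s, 4s) is an assumption suggested only by the "retrying in 1s" log line above.

```python
import time
from typing import Callable, Optional


class Unavailable(Exception):
    """Hypothetical stand-in for a gRPC StatusCode.UNAVAILABLE failure."""


def export_with_reconnect(
    connect: Callable[[], Callable[[bytes], None]],
    payload: bytes,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Send `payload`, rebuilding the connection after every UNAVAILABLE.

    `connect` is a hypothetical factory returning a fresh `send` callable;
    in a real exporter it would create a new gRPC channel and stub.
    Returns True on success, False once `max_attempts` is exhausted.
    """
    send: Optional[Callable[[bytes], None]] = None
    for attempt in range(max_attempts):
        try:
            if send is None:
                send = connect()  # (re)build the connection lazily
            send(payload)
            return True
        except Unavailable:
            # Drop the stale connection so the next attempt reconnects
            # instead of reusing a channel that may be stuck.
            send = None
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # assumed backoff: 1s, 2s, 4s, ...
    return False
```

The point of the sketch is only that the connection object is discarded and recreated between attempts, so an endpoint that appears later is picked up on the next retry. Whether the real exporter should recreate its channel or rely on gRPC's own reconnect handling is for the maintainers to decide.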
Actual Result
The exporter gets stuck in "StatusCode.UNAVAILABLE" status.
Additional context
No response
Would you like to implement a fix?
None