Remote read API not performant in v2.13 #9764
Comments
I saw fixes for the remote read API not honoring hints (ref), but I am seeing this performance issue during instant queries, so it is a different issue from the hinting fixes.
The first thing that comes to my mind is that remote read supports two response types (specs):
Fetching samples is much slower than fetching the encoded chunks. Could you make sure that Thanos requests
Could you also share the full trace
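One rough way to check which of the two response types a remote read endpoint actually returned is to look at the `Content-Type` header of the response. Below is a minimal Python sketch, assuming the media types used by Prometheus remote read (`application/x-streamed-protobuf` for streamed XOR chunks, `application/x-protobuf` for samples). The function name is hypothetical and not part of Thanos or Mimir.

```python
# Assumed media types for the two remote read response types.
STREAMED_CHUNKS_TYPE = "application/x-streamed-protobuf"
SAMPLES_TYPE = "application/x-protobuf"


def remote_read_response_kind(content_type: str) -> str:
    """Classify a remote read response by its Content-Type header value.

    Hypothetical helper: returns "STREAMED_XOR_CHUNKS", "SAMPLES", or "unknown".
    """
    # Drop parameters such as "; proto=prometheus.ChunkedReadResponse".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == STREAMED_CHUNKS_TYPE:
        return "STREAMED_XOR_CHUNKS"
    if media_type == SAMPLES_TYPE:
        return "SAMPLES"
    return "unknown"
```

If this reports `SAMPLES` for the responses Thanos receives, the slower response type is in use and is worth ruling out before looking elsewhere.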
@pracucci Regarding encoded chunks, I confirmed that encoded chunks are being used as the response type; support for this was introduced in Thanos a few years back (reference). Regarding the full trace, I am still figuring out how to export it as full JSON. Also, we are issuing a federated query for 100s of tenants, which is far slower with remote read than with the range APIs.
Trace-4f3442-2024-11-07 15_12_26.json Attaching a trace of an instant query that took 20+ seconds.
Thanks. I tried to load it in the Jaeger UI but it doesn't work (apparently it's an invalid format for Jaeger). What format is the trace? Which application did you use to export it? Sorry for this ping-pong, but it would be great if you could just give me a trace that loads in the Jaeger UI. To test it in Jaeger you can run it with:
Then upload the
@pracucci I downloaded it from the Grafana UI. Can you try visualizing it in Grafana?
Describe the bug
Hello Team,
We have been testing the Thanos querier with the Mimir remote read API and are seeing a significant performance difference between range queries and remote read. One of the first things I noticed was that queries weren't sharded, which might be contributing to the majority of
To Reproduce
Steps to reproduce the behavior:
count(services_platform_service_request_count{namespace=~".*-staging$"}) by (namespace)
Expected behavior
This query takes ~2 seconds to execute when the query range API is used, and it should take approximately the same time with the remote read API. However, it took 15+ seconds.
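To make the comparison reproducible, the same instant query can be timed against both paths (Mimir's own query API, and the Thanos querier that goes through remote read). Below is a minimal Python sketch under stated assumptions: the `/prometheus` prefix and the `X-Scope-OrgID` header with `|`-separated tenant IDs are assumed from a default Mimir setup with tenant federation, and the host names and function names are placeholders, not taken from this issue.

```python
import time
import urllib.parse
import urllib.request


def build_instant_query(base_url, tenants, promql, api_prefix="/prometheus"):
    """Return (url, headers) for an instant query.

    api_prefix is an assumption: "/prometheus" for a default Mimir setup,
    "" for a Thanos querier serving /api/v1/query directly.
    """
    query = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}{api_prefix}/api/v1/query?{query}"
    # Assumed federation convention: tenant IDs joined with "|".
    headers = {"X-Scope-OrgID": "|".join(tenants)}
    return url, headers


def time_call(fn, repeats=3):
    """Run fn() `repeats` times; return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def run_instant_query(base_url, tenants, promql, api_prefix="/prometheus", timeout=60):
    """Execute the instant query and return the raw response body."""
    url, headers = build_instant_query(base_url, tenants, promql, api_prefix)
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

For example, `time_call(lambda: run_instant_query("http://mimir.example:8080", ["team-a", "team-b"], promql))` against Mimir, and the same call with the Thanos querier address and `api_prefix=""`, gives two numbers that can be attached to the issue alongside the traces.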
Environment
Additional Context
NA