[BUG] Duplicated requests on refreshing the overview #105

Open · ansjcy opened this issue Feb 7, 2025 · 4 comments
Labels: bug (Something isn't working), good first issue (Good for newcomers)

@ansjcy (Member) commented Feb 7, 2025

What is the bug?

On refreshing the overview page, multiple duplicate top N queries requests are sent for every metric. See the screenshot below.

[Screenshot: duplicated top N queries requests]

How can one reproduce the bug?

  • Run the Query Insights Dashboards (QID) with all metrics enabled.
  • Hit the refresh button and check the network requests.
[Screenshot: network requests after hitting refresh]

What is the expected behavior?

Ideally, only one request per metric should be sent to the backend on refresh.
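
One common way to get this behavior is to deduplicate in-flight requests per metric on the client side. The sketch below is illustrative only, not the plugin's actual code; the `fetchTopQueries` helper, the metric names, and the API path are assumptions based on the request URLs visible in the logs further down.

```typescript
// Minimal sketch of per-metric request deduplication (illustrative only).
// `fetchTopQueries` is a hypothetical stand-in for the dashboard's HTTP call.
type Metric = 'latency' | 'cpu' | 'memory';

const inFlight = new Map<Metric, Promise<unknown>>();

async function fetchTopQueries(metric: Metric, from: string, to: string): Promise<unknown> {
  const res = await fetch(`/api/top_queries/${metric}?from=${from}&to=${to}`);
  return res.json();
}

// Returns the already-running request for a metric instead of issuing a new one,
// so a refresh handler that fires multiple times still sends exactly one
// backend request per metric.
function fetchTopQueriesOnce(metric: Metric, from: string, to: string): Promise<unknown> {
  const pending = inFlight.get(metric);
  if (pending) return pending;

  const request = fetchTopQueries(metric, from, to).finally(() => inFlight.delete(metric));
  inFlight.set(metric, request);
  return request;
}
```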

What is your host/environment?

Operating system, version.

Do you have any screenshots?

If applicable, add screenshots to help explain your problem.

Do you have any additional context?

Add any other context about the problem.

@ansjcy added the bug and untriaged labels on Feb 7, 2025
@ansjcy (Member, Author) commented Feb 12, 2025

This could be a good first issue for query insights dashboards.

@ansjcy added the good first issue label and removed the untriaged label on Feb 12, 2025
@brucejxz commented Mar 8, 2025

Seeing this as well, and it's killing a pretty much empty two-node test cluster I set up on a t4g.medium from this Docker Compose file: https://opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/#deploy-an-opensearch-cluster-using-docker-compose.

When I first load the page, the requests look like:

[Screenshot: network requests on first page load]

But if I refresh it again:

[Screenshot: duplicated network requests after refresh]

This is accompanied by these logs:

opensearch-node1       | [2025-03-08T19:38:45,121][INFO ][o.o.c.c.FollowersChecker ] [opensearch-node1] FollowerChecker{discoveryNode={opensearch-node2}{GGrs_d1ARSqSkU6HWg1bPA}{5DCrvZfAR_eiH9wWHD6muA}{172.18.0.4}{172.18.0.4:9300}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=1, [cluster.fault_detection.follower_check.retry_count]=3} failed, retrying
opensearch-node1       | org.opensearch.transport.ReceiveTimeoutTransportException: [opensearch-node2][172.18.0.4:9300][internal:coordination/fault_detection/follower_check] request_id [1056] timed out after [10014ms]
opensearch-node1       |        at org.opensearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1421) [opensearch-2.19.0.jar:2.19.0]
opensearch-node1       |        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:955) [opensearch-2.19.0.jar:2.19.0]
opensearch-node1       |        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
opensearch-node1       |        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
opensearch-node1       |        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
opensearch-node1       | [2025-03-08T19:38:54,064][WARN ][o.o.c.InternalClusterInfoService] [opensearch-node1] Failed to update node information for ClusterInfoUpdateJob within 15s timeout
opensearch-node1       | [2025-03-08T19:38:56,123][INFO ][o.o.c.c.FollowersChecker ] [opensearch-node1] FollowerChecker{discoveryNode={opensearch-node2}{GGrs_d1ARSqSkU6HWg1bPA}{5DCrvZfAR_eiH9wWHD6muA}{172.18.0.4}{172.18.0.4:9300}{dimr}{shard_indexing_pressure_enabled=true}, failureCountSinceLastSuccess=2, [cluster.fault_detection.follower_check.retry_count]=3} failed, retrying
opensearch-node1       | org.opensearch.transport.ReceiveTimeoutTransportException: [opensearch-node2][172.18.0.4:9300][internal:coordination/fault_detection/follower_check] request_id [1093] timed out after [10006ms]
opensearch-node1       |        at org.opensearch.transport.TransportService$TimeoutHandler.run(TransportService.java:1421) [opensearch-2.19.0.jar:2.19.0]
opensearch-node1       |        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:955) [opensearch-2.19.0.jar:2.19.0]
opensearch-node1       |        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
opensearch-node1       |        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
opensearch-node1       |        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
opensearch-dashboards  | Unable to get top queries (cpu):  StatusCodeError: Request Timeout after 30000ms
opensearch-dashboards  |     at /usr/share/opensearch-dashboards/node_modules/elasticsearch/src/lib/transport.js:397:9
opensearch-dashboards  |     at Timeout.<anonymous> (/usr/share/opensearch-dashboards/node_modules/elasticsearch/src/lib/transport.js:429:7)
opensearch-dashboards  |     at listOnTimeout (node:internal/timers:569:17)
opensearch-dashboards  |     at processTimers (node:internal/timers:512:7) {
opensearch-dashboards  |   status: undefined,
opensearch-dashboards  |   displayName: 'RequestTimeout',
opensearch-dashboards  |   body: undefined
opensearch-dashboards  | }
opensearch-dashboards  | {"type":"response","@timestamp":"2025-03-08T19:38:33Z","tags":[],"pid":1,"method":"get","statusCode":200,"req":{"url":"/api/top_queries/cpu?from=2025-03-07T19%3A38%3
A33.633Z&to=2025-03-08T19%3A38%3A33.633Z","method":"get","headers":{"host":"localhost:5601","connection":"keep-alive","osd-version":"2.19.0","sec-ch-ua-platform":"\"macOS\"","user-agent":"Mo
zilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36","sec-ch-ua":"\"Chromium\";v=\"134\", \"Not:A-Brand\";v=\"24\", \"Brave\";
v=\"134\"","content-type":"application/json","sec-ch-ua-mobile":"?0","osd-xsrf":"osd-fetch","accept":"*/*","sec-gpc":"1","accept-language":"en-GB,en;q=0.7","sec-fetch-site":"same-origin","se
c-fetch-mode":"cors","sec-fetch-dest":"empty","referer":"http://localhost:5601/app/query-insights-dashboards","accept-encoding":"gzip, deflate, br, zstd","securitytenant":""},"remoteAddress"
:"172.18.0.1","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/134.0.0.0 Safari/537.36","referer":"http://localhost:5601/app/query-i
nsights-dashboards"},"res":{"statusCode":200,"responseTime":30157,"contentLength":9},"message":"GET /api/top_queries/cpu?from=2025-03-07T19%3A38%3A33.633Z&to=2025-03-08T19%3A38%3A33.633Z 200
 30157ms - 9.0B"}
opensearch-dashboards  | Unable to get top queries (cpu):  StatusCodeError: Request Timeout after 30000ms
opensearch-dashboards  |     at /usr/share/opensearch-dashboards/node_modules/elasticsearch/src/lib/transport.js:397:9
opensearch-dashboards  |     at Timeout.<anonymous> (/usr/share/opensearch-dashboards/node_modules/elasticsearch/src/lib/transport.js:429:7)
opensearch-dashboards  |     at listOnTimeout (node:internal/timers:569:17)
opensearch-dashboards  |     at processTimers (node:internal/timers:512:7) {
opensearch-dashboards  |   status: undefined,
opensearch-dashboards  |   displayName: 'RequestTimeout',
opensearch-dashboards  |   body: undefined
opensearch-dashboards  | }

@rishabh6788 commented
I migrated my self-managed OS 2.12 cluster to 2.19.1, and upon clicking the query-insights tab on the dashboards the cluster became unresponsive and data nodes started dropping out. I restarted and clicked the query-insights tab again to confirm, and it happened again. I had to disable the plugin to stabilize the cluster.

@ansjcy (Member, Author) commented Mar 18, 2025

@rishabh6788 there are two main issues: one is the duplicated requests mentioned here; the other is that, by default, we fetch far too many records from the index. We are fixing the latter by setting the default reader size to 50.
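
For the second issue, the fix amounts to capping how many records a single top N queries call may pull back. Below is a rough sketch of that idea on the dashboards side; the `size` query parameter and the `DEFAULT_READER_SIZE` constant are assumptions for illustration, not the plugin's actual setting names.

```typescript
// Illustrative only: cap the number of records requested per metric.
// `size` and DEFAULT_READER_SIZE are hypothetical names, not the
// plugin's real parameter or setting.
const DEFAULT_READER_SIZE = 50;

function topQueriesUrl(metric: string, from: string, to: string, size = DEFAULT_READER_SIZE): string {
  const params = new URLSearchParams({ from, to, size: String(size) });
  return `/api/top_queries/${metric}?${params.toString()}`;
}

// Example: topQueriesUrl('cpu', '2025-03-07T19:38:33.633Z', '2025-03-08T19:38:33.633Z')
// never requests more than 50 records unless a caller explicitly opts in.
```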
