Skip to content

Cancel stale ClickHouse queries server-side on navigation#150

Open
trieloff wants to merge 3 commits intomainfrom
fix/cancel-refinement-queries
Open

Cancel stale ClickHouse queries server-side on navigation#150
trieloff wants to merge 3 commits intomainfrom
fix/cancel-refinement-queries

Conversation

@trieloff
Copy link
Copy Markdown
Contributor

Summary

  • When users switch time ranges (e.g., from 7-day to 15-minute view), refinement pass queries (full table scans without sampling) continued running on the ClickHouse server even after the browser aborted the fetch
  • These stale queries consumed server resources and starved new queries — David observed 37s dashboard loads on a 15-minute time range because 7-day refinement queries were still running
  • Each request context now tags its queries with a unique query_id prefix, and when a new context replaces the old one, it sends KILL QUERY to cancel still-running server queries

Testing Done

  • Verified KILL QUERY works for read-only users (tested against production ClickHouse Cloud instance)
  • All 774 tests pass (npm test)
  • Zero lint errors (npm run lint)
  • Updated test helpers in index.test.js and approx-top.test.js to filter out KILL QUERY POST calls from assertions

Checklist

  • Tests pass (npm test)
  • Lint passes (npm run lint)
  • Documentation updated (if applicable)

🤖 Generated with Claude Code

When users switch time ranges, the refinement pass queries (full table
scans without sampling) continued running on the ClickHouse server even
after the browser aborted the fetch. These queries consumed server
resources and starved new queries, causing perceived slowness.

This adds server-side query cancellation via KILL QUERY. Each request
context tags its queries with a unique query_id prefix, and when a new
context replaces the old one, it sends KILL QUERY to cancel the
still-running server queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 12, 2026

Preview deployment

Preview is live at: https://klickhaus.aemstatus.net/preview/pr-150/dashboard.html

Updated for commit 383fa85

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds server-side cancellation for stale ClickHouse queries when a request context is replaced (e.g., when users change dashboard time ranges), preventing long-running refinement queries from continuing to consume resources after the browser aborts the request.

Changes:

  • Tag ClickHouse queries with a context-specific query_id prefix to enable targeted cancellation.
  • On starting a new request context, send KILL QUERY for the previous context’s query group.
  • Update breakdown tests to ignore control-plane KILL QUERY POST calls when asserting query POST behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
js/request-context.js Adds per-context query group prefixing and triggers server-side cancellation of the previous context’s queries.
js/api.js Stores per-signal query-group metadata and applies query_id tagging; adds killQueryGroup() to issue KILL QUERY.
js/breakdowns/index.test.js Filters mock fetch call assertions to exclude KILL QUERY control requests.
js/breakdowns/approx-top.test.js Same as above for approx-top refinement tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +39 to +48
export function killQueryGroup(prefix) {
if (!prefix || !state.credentials?.user) return;
const safePfx = prefix.replace(/'/g, "\\'");
fetch(CLICKHOUSE_URL, {
method: 'POST',
headers: {
Authorization: `Basic ${btoa(`${state.credentials.user}:${state.credentials.password}`)}`,
},
body: `KILL QUERY WHERE query_id LIKE '${safePfx}-%' ASYNC`,
}).catch(() => { /* ignore */ });
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

killQueryGroup() builds a ClickHouse statement using LIKE '${safePfx}-%', but safePfx only escapes single quotes. If the prefix can contain % or _ (e.g., via dynamic scopes like facet:${breakdown.id}), the LIKE will treat them as wildcards and may cancel queries outside the intended group. Consider either sanitizing the prefix to a strict safe charset before storing it, or escaping %/_ for LIKE (and also escaping backslashes) so the kill pattern only matches the exact prefix.

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +47

// Filter helper: extract only ClickHouse data query POSTs (exclude KILL QUERY calls)
const queryPosts = (calls) => calls.filter((c) => c.options?.method === 'POST' && c.options?.body?.includes('FORMAT JSON'));

Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests introduce two slightly different queryPosts helpers (const arrow in index.test.js vs function in approx-top.test.js) with the same filtering logic. To reduce duplication and keep future changes consistent (e.g., if the ClickHouse POST body format changes), consider extracting a shared helper (or importing from a common test utility) rather than duplicating the filter in multiple test files.

Copilot uses AI. Check for mistakes.
Comment on lines +19 to +23
// Filter helper: extract only ClickHouse query POSTs (exclude KILL QUERY and other control calls)
function queryPosts(calls) {
return calls.filter((c) => c.options?.method === 'POST' && c.options?.body?.includes('FORMAT JSON'));
}

Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file adds a queryPosts() helper that duplicates the same filtering logic added in js/breakdowns/index.test.js. To avoid divergence, consider factoring this into a shared test helper so both test suites stay aligned when control-plane ClickHouse POSTs (like KILL QUERY) evolve.

Copilot uses AI. Check for mistakes.
Comment on lines +32 to +39
// Cancel any still-running server-side queries from the previous context
if (ctx.queryGroupPrefix) {
killQueryGroup(ctx.queryGroupPrefix);
}
ctx.controller = new AbortController();
ctx.requestId += 1;
ctx.queryGroupPrefix = `klick-${scope || 'dashboard'}-${ctx.requestId}`;
setSignalQueryGroup(ctx.controller.signal, ctx.queryGroupPrefix);
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryGroupPrefix is derived only from scope + an in-memory incrementing requestId. This can collide across browser tabs/windows (each tab starts at requestId=1), so starting a new context in one tab can KILL QUERY for another tab that happens to share the same scope/prefix. Consider including a per-tab/session random component (e.g., a UUID generated once per page load) in the prefix, and/or avoid embedding scope directly so the prefix is both unique and limited to safe characters for ClickHouse query_id/LIKE matching.

Copilot uses AI. Check for mistakes.
The KILL QUERY command requires SELECT(query_id, user, query) on
system.processes. Without this grant, server-side query cancellation
silently fails with ACCESS_DENIED.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
AbortSignal.any() returns a new native signal that loses WeakMap
associations. The merged signal used in breakdown queries was untagged,
so KILL QUERY couldn't cancel them. Explicitly propagate the query
group after merging signals in createRequestStatus.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants