Per Query Stats #80

We should extend observability beyond global cache stats.

ClickHouse exposes:
- cache contents via system.query_condition_cache
- per-query hits/misses via system.query_log ProfileEvents
- trace-level pruning details like granules dropped / marks read

Iceberg's ScanReport is also a good model: it reports planning duration and scanned/skipped manifests/files. For our future Parquet/Iceberg work, the metrics should generalize over:
table -> data_unit -> data_range -> predicate -> bitmask/result.
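
One way to read that hierarchy is as a composite key with per-range results attached. The sketch below is only an illustration: `ConditionCacheKey`, `RangeResult`, and `VectorsPrunedForTable` are hypothetical names, and a `data_unit` is assumed to be a row group for native tables and a file for Parquet/Iceberg.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key for one cached predicate result. Field names mirror the
// hierarchy in the text; none of them exist in the code base yet.
struct ConditionCacheKey {
	std::string table;       // table identifier
	std::string data_unit;   // row group id (native) or file path (Parquet/Iceberg)
	uint64_t range_begin;    // first row of the data range, inclusive
	uint64_t range_end;      // one past the last row of the data range
	uint64_t predicate_hash; // hash of the filter expression

	bool operator<(const ConditionCacheKey &other) const {
		return std::tie(table, data_unit, range_begin, range_end, predicate_hash) <
		       std::tie(other.table, other.data_unit, other.range_begin, other.range_end,
		                other.predicate_hash);
	}
};

// Hypothetical per-range result: how many vectors exist and how many may match.
struct RangeResult {
	uint64_t vectors_total;
	uint64_t vectors_qualifying;
};

// Roll per-range results up to the table level.
uint64_t VectorsPrunedForTable(const std::map<ConditionCacheKey, RangeResult> &results,
                               const std::string &table) {
	uint64_t pruned = 0;
	for (const auto &kv : results) {
		if (kv.first.table != table) {
			continue;
		}
		assert(kv.second.vectors_qualifying <= kv.second.vectors_total);
		pruned += kv.second.vectors_total - kv.second.vectors_qualifying;
	}
	return pruned;
}
```

Because the key does not care whether the data unit is a row group or a file, the same roll-up would work for both storage formats.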

Proposal:
1. Keep condition_cache_stats() for global counters:
   entry_count, table_count, memory, lookup_hit/miss, build_count, applied_count,
   effective_hit_count, invalidation/eviction counters.

2. Add condition_cache_entries():
   one row per cache entry, similar to ClickHouse system.query_condition_cache:
   table, predicate_hash, optional predicate_text, memory, cached row groups,
   qualifying vectors, total vectors, selectivity, created/last-used timestamps.
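
A rough sketch of what one row could carry, and how selectivity might be derived from the two vector counts. `ConditionCacheEntryRow` and `Selectivity` are hypothetical names, timestamps are left out for brevity, and returning 1.0 for an entry with no vectors is an assumption (such an entry cannot prune anything).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical row shape for condition_cache_entries().
struct ConditionCacheEntryRow {
	std::string table;
	uint64_t predicate_hash;
	std::string predicate_text; // empty when the text is not retained
	uint64_t memory_bytes;
	uint64_t cached_row_groups;
	uint64_t qualifying_vectors;
	uint64_t total_vectors;
};

// Fraction of cached vectors that may still contain matching rows.
// Lower values mean the entry can prune more.
double Selectivity(const ConditionCacheEntryRow &row) {
	assert(row.qualifying_vectors <= row.total_vectors);
	if (row.total_vectors == 0) {
		return 1.0; // assumption: an empty entry is reported as "prunes nothing"
	}
	return static_cast<double>(row.qualifying_vectors) / static_cast<double>(row.total_vectors);
}
```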

3. Add condition_cache_last_query() and condition_cache_query_history(n):
   one row per scan/probe in recent queries:
   query_seq, scan_id, table, predicate_hash, status, skip_reason, lookup_hit,
   built_inline, build_time_us, applied, row_groups/vectors pruned,
   approx rows pruned, uncached pass-through vectors.
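
The history could be backed by a small bounded buffer. The sketch below makes several assumptions: `ScanProbeRecord` and `ConditionCacheHistory` are hypothetical names, only a subset of the listed columns is shown, records arrive in non-decreasing `query_seq` order, and the bound is on rows rather than on queries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical record: one row per scan/probe.
struct ScanProbeRecord {
	uint64_t query_seq;
	uint64_t scan_id;
	std::string table;
	uint64_t predicate_hash;
	bool lookup_hit;
	bool built_inline;
	uint64_t vectors_pruned;
};

class ConditionCacheHistory {
public:
	explicit ConditionCacheHistory(size_t max_records_p) : max_records(max_records_p) {
	}

	void Record(const ScanProbeRecord &rec) {
		// Assumed invariant: query sequence numbers never go backwards.
		assert(records.empty() || records.back().query_seq <= rec.query_seq);
		if (max_records == 0) {
			return;
		}
		if (records.size() == max_records) {
			records.pop_front(); // drop the oldest row
		}
		records.push_back(rec);
	}

	// Rows of the n most recent queries, oldest row first.
	std::vector<ScanProbeRecord> History(size_t n) const {
		size_t queries_seen = 0;
		size_t start = records.size();
		for (size_t i = records.size(); i > 0; i--) {
			bool new_query = i == records.size() || records[i - 1].query_seq != records[i].query_seq;
			if (new_query) {
				if (queries_seen == n) {
					break;
				}
				queries_seen++;
			}
			start = i - 1;
		}
		return std::vector<ScanProbeRecord>(records.begin() + static_cast<std::ptrdiff_t>(start),
		                                    records.end());
	}

	std::vector<ScanProbeRecord> LastQuery() const {
		return History(1);
	}

private:
	size_t max_records;
	std::deque<ScanProbeRecord> records;
};
```

With this shape, condition_cache_last_query() is just the n = 1 case of condition_cache_query_history(n).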

4. Add EXPLAIN ANALYZE / JSON profiling extra_info for cached scans:
   Query Condition Cache: status, predicate_hash, cached_row_groups,
   qualifying_vectors, runtime_vectors_pruned, approx_rows_pruned.

Important: distinguish lookup hit from effective hit.
A lookup hit that prunes zero vectors is useful for debugging but should not be
counted as an effective cache benefit.
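
A minimal sketch of that distinction, assuming the counters are bumped once per probe. `ProbeOutcome`, `ProbeCounters`, `ClassifyProbe`, and `RecordProbe` are hypothetical names; only the counter names come from the proposal above.

```cpp
#include <cassert>
#include <cstdint>

enum class ProbeOutcome { MISS, LOOKUP_HIT, EFFECTIVE_HIT };

struct ProbeCounters {
	uint64_t lookup_hit = 0;
	uint64_t lookup_miss = 0;
	uint64_t effective_hit_count = 0;
};

// A hit only counts as effective when it pruned at least one vector.
inline ProbeOutcome ClassifyProbe(bool lookup_hit, uint64_t vectors_pruned) {
	// Pruning without a cache entry would indicate a bookkeeping bug.
	assert(lookup_hit || vectors_pruned == 0);
	if (!lookup_hit) {
		return ProbeOutcome::MISS;
	}
	return vectors_pruned > 0 ? ProbeOutcome::EFFECTIVE_HIT : ProbeOutcome::LOOKUP_HIT;
}

inline void RecordProbe(ProbeCounters &counters, bool lookup_hit, uint64_t vectors_pruned) {
	switch (ClassifyProbe(lookup_hit, vectors_pruned)) {
	case ProbeOutcome::MISS:
		counters.lookup_miss++;
		break;
	case ProbeOutcome::EFFECTIVE_HIT:
		counters.effective_hit_count++;
		counters.lookup_hit++; // an effective hit is also a lookup hit
		break;
	case ProbeOutcome::LOOKUP_HIT:
		counters.lookup_hit++;
		break;
	}
}
```

Under this scheme, the gap between lookup_hit and effective_hit_count shows how many entries are cached but not selective enough to help.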

---

- create a query-local profile object, e.g. shared_ptr<ConditionCacheProfileInfo>
- attach it to the scan bind data
- set get.function.dynamic_to_string = ConditionCacheDynamicToString

The practical place to store it is a custom bind-data type derived from TableScanBindData, because dynamic_to_string receives only bind_data, local_state, and global_state.

```cpp
// Query-local counters. They are atomics because scan threads may update
// them concurrently.
struct ConditionCacheProfileInfo {
	atomic<bool> initial_hit {false};      // the initial cache lookup found an entry
	atomic<bool> built_this_query {false}; // the entry was built during this query
	atomic<idx_t> vectors_checked {0};
	atomic<idx_t> vectors_pruned {0};
	atomic<idx_t> rows_pruned {0};
};

struct ConditionCacheTableScanBindData : public TableScanBindData {
	ConditionCacheTableScanBindData(TableCatalogEntry &table, shared_ptr<ConditionCacheProfileInfo> profile_p)
	    : TableScanBindData(table), profile(std::move(profile_p)) {
	}

	shared_ptr<ConditionCacheProfileInfo> profile;

	// Copies share the same profile object, so every copy reports into one
	// set of counters.
	unique_ptr<FunctionData> Copy() const override {
		auto result = make_uniq<ConditionCacheTableScanBindData>(table, profile);
		result->is_index_scan = is_index_scan;
		result->is_create_index = is_create_index;
		result->column_ids = column_ids;
		result->order_options = order_options ? make_uniq<RowGroupOrderOptions>(*order_options) : nullptr;
		return std::move(result);
	}
};

static InsertionOrderPreservingMap<string>
ConditionCacheDynamicToString(TableFunctionDynamicToStringInput &input) {
	InsertionOrderPreservingMap<string> result;
	auto &bind = input.bind_data->Cast<ConditionCacheTableScanBindData>();
	if (!bind.profile) {
		// No profile attached: report nothing instead of dereferencing null.
		return result;
	}
	auto &p = *bind.profile;

	// "MISS -> BUILT" takes precedence: the entry did not exist at lookup
	// time but was built while this query ran.
	string status = "MISS";
	if (p.built_this_query.load()) {
		status = "MISS -> BUILT";
	} else if (p.initial_hit.load()) {
		status = "HIT";
	}
	result["Condition Cache"] = status;
	result["Cache Vectors Checked"] = to_string(p.vectors_checked.load());
	result["Cache Vectors Pruned"] = to_string(p.vectors_pruned.load());
	result["Cache Rows Pruned"] = to_string(p.rows_pruned.load());
	return result;
}
```

Then update that shared object from:

- ConditionCacheFilterFn(...) on the hit/apply path: query_condition_cache_filter.cpp (line 42)
- PhysicalCacheRecorder::Execute(...) and OperatorFinalize(...) on the miss/build path if you take PR #78’s approach
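
The update calls on both paths could look roughly like the standalone sketch below. It uses a simplified stand-in for ConditionCacheProfileInfo built on std::atomic so that it compiles outside DuckDB. `ProfileInfo`, `RecordLookup`, `RecordVectorChecked`, and `RecordBuilt` are hypothetical names, and relaxed ordering assumes the counters are only read for display after the scan has finished.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>

// Simplified stand-in for ConditionCacheProfileInfo.
struct ProfileInfo {
	std::atomic<bool> initial_hit {false};
	std::atomic<bool> built_this_query {false};
	std::atomic<uint64_t> vectors_checked {0};
	std::atomic<uint64_t> vectors_pruned {0};
	std::atomic<uint64_t> rows_pruned {0};
};

// Lookup: remember whether the cache already had an entry for this predicate.
inline void RecordLookup(ProfileInfo &p, bool hit) {
	p.initial_hit.store(hit, std::memory_order_relaxed);
}

// Hit/apply path: called once per vector the filter inspects.
inline void RecordVectorChecked(ProfileInfo &p, bool pruned, uint64_t row_count) {
	assert(row_count > 0); // precondition of this sketch: empty vectors are never checked
	p.vectors_checked.fetch_add(1, std::memory_order_relaxed);
	if (pruned) {
		p.vectors_pruned.fetch_add(1, std::memory_order_relaxed);
		p.rows_pruned.fetch_add(row_count, std::memory_order_relaxed);
	}
}

// Miss/build path: called once the entry has been built for this query.
inline void RecordBuilt(ProfileInfo &p) {
	p.built_this_query.store(true, std::memory_order_relaxed);
}
```

Since fetch_add is atomic, several scan threads can call these helpers on the same shared object without extra locking.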
