Open
Description
Describe the bug
When aggregating with array_agg
using a limited memory pool we hit ResourceExhausted errors in the array_agg
accumulator.
After investigation this seems to come from over-accounting memory from shared arrow buffers. We might keep a single value from a 8k rows arrow array but will account the memory taken by the whole arrow array rather than just the value we kept.
To Reproduce
Run datafusion-cli
with a memory limit datafusion-cli -m 100m
Create a table:
CREATE table logs AS
SELECT
('Chrome-' || (random() * 1000)::int) as user_agent,
(n.value % 40000 )::text as client_id,
(n.value % 21 ==0) as is_internal
FROM
generate_series(1, 300000) n;
Try running a simple array_agg
query:
WITH
user_agents AS (
SELECT
DISTINCT user_agent, client_id
FROM
logs
WHERE
NOT is_internal
)
SELECT
client_id,
ARRAY_AGG(user_agent) AS user_agent
FROM
user_agents
GROUP BY
client_id
Fails with:
Resources exhausted: Additional allocation failed with top memory consumers (across reservations) as:
GroupedHashAggregateStream[6] (array_agg(user_agents.user_agent))#44(can spill: true) consumed 2.9 MB,
GroupedHashAggregateStream[2] (array_agg(user_agents.user_agent))#36(can spill: true) consumed 2.2 MB,
GroupedHashAggregateStream[3] (array_agg(user_agents.user_agent))#38(can spill: true) consumed 2.2 MB.
Error: Failed to allocate additional 609.0 MB for GroupedHashAggregateStream[0] (array_agg(user_agents.user_agent)) with 0.0 B already allocated for this reservation - 75.3 MB remain available
for the total pool
Further inspection (adding debug prints) shows that we over-report memory in the array_agg
merge_batch
function:
Addr:0x418911c18b0 Data type:Utf8 Length: 1, Size of first value: 9, Mem size: 162362
[X][Utf8] 1 Size before: 56 + mem_size:162362 -> 9
[X][Utf8] 1 Size after: 162482
Expected behavior
Query should succeed within the memory limits
Additional context
Similar to: #16055