Skip to content

Incorrect memory accounting in array_agg function #16517

Open
@sfluor

Description

@sfluor

Describe the bug

When aggregating with array_agg using a limited memory pool we hit ResourceExhausted errors in the array_agg accumulator.

After investigation this seems to come from over-accounting memory from shared arrow buffers. We might keep a single value from a 8k rows arrow array but will account the memory taken by the whole arrow array rather than just the value we kept.

To Reproduce

Run datafusion-cli with a memory limit datafusion-cli -m 100m

Create a table:

CREATE table logs AS

SELECT 
('Chrome-' || (random() * 1000)::int) as user_agent,
    (n.value % 40000 )::text as client_id,
    (n.value % 21 ==0) as is_internal
FROM
generate_series(1, 300000) n;

Try running a simple array_agg query:

WITH
  user_agents AS (
    SELECT 
    DISTINCT user_agent, client_id
    FROM
     logs
    WHERE
      NOT is_internal
  )
SELECT
  client_id,
  ARRAY_AGG(user_agent) AS user_agent
FROM
  user_agents
GROUP BY
  client_id

Fails with:

Resources exhausted: Additional allocation failed with top memory consumers (across reservations) as:
  GroupedHashAggregateStream[6] (array_agg(user_agents.user_agent))#44(can spill: true) consumed 2.9 MB,
  GroupedHashAggregateStream[2] (array_agg(user_agents.user_agent))#36(can spill: true) consumed 2.2 MB,
  GroupedHashAggregateStream[3] (array_agg(user_agents.user_agent))#38(can spill: true) consumed 2.2 MB.
Error: Failed to allocate additional 609.0 MB for GroupedHashAggregateStream[0] (array_agg(user_agents.user_agent)) with 0.0 B already allocated for this reservation - 75.3 MB remain available 
for the total pool

Further inspection (adding debug prints) shows that we over-report memory in the array_agg merge_batch function:

Addr:0x418911c18b0 Data type:Utf8 Length: 1, Size of first value: 9, Mem size: 162362
[X][Utf8] 1 Size before: 56 + mem_size:162362 -> 9
[X][Utf8] 1 Size after: 162482

Expected behavior

Query should succeed within the memory limits

Additional context

Similar to: #16055

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions