KAFKA-19767 - Send Share-Fetch one-node at a time for record_limit mode #20855

ShivsundarR · 2025-11-10T14:57:54Z

What

KIP-1206 introduced record_limit mode, in which the client should
ideally receive no more records than the maxRecords field of the
ShareFetchRequest allows.
Currently, the client broadcasts share fetch requests to all nodes
that host the leaders of the partitions it is subscribed to.
The application thread is woken up when the first response arrives,
but meanwhile the responses from the other nodes can each bring in up
to that many records again, and these wait in the buffer. This means
we waste the acquisition locks for the records that are
waiting.
Instead, we want to send the next request only when the application
polls again.
This PR sends the request to only one node at a time in record_limit
mode.
Partitions are rotated on each poll so that no partition is
starved.
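
As a rough illustration of the idea above (not the code in this PR), the sketch below picks a single node per poll by rotating the partition list and taking the leader of the partition at the head of the rotation. The class name `OneNodeFetchSketch`, the method `nextTarget`, and the use of plain strings for nodes and partitions are all hypothetical. How the real client chooses the node is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these are not the Kafka client's actual classes.
final class OneNodeFetchSketch {
    // Advances by one on every poll so a different partition leads each time.
    private int pollCount = 0;

    /**
     * Chooses the single node to fetch from on this poll.
     * The partition list is rotated by the poll count; the leader of the first
     * partition in the rotated order is selected, and every subscribed
     * partition led by that node is included in the request.
     * Returns null when there is nothing to fetch.
     */
    Map.Entry<String, List<String>> nextTarget(List<String> partitions,
                                               Map<String, String> leaderOf) {
        if (partitions.isEmpty()) {
            return null;
        }
        int size = partitions.size();
        int start = Math.floorMod(pollCount++, size);
        String node = leaderOf.get(partitions.get(start));

        List<String> onNode = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            String partition = partitions.get((start + i) % size);
            if (node.equals(leaderOf.get(partition))) {
                onNode.add(partition);
            }
        }
        return Map.entry(node, onNode);
    }
}
```

With partitions `t-0`, `t-1`, `t-2` led by nodes `A`, `B`, `A` respectively, successive polls in this sketch target `A`, then `B`, then `A` again, so the partition on `B` is not starved even though `A` hosts more leaders.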

There were NCSS checkstyle errors in ShareConsumeRequestManagerTest,
so this PR also refactors parts of that test to reduce its length.

Performance

When there are more consumers than partitions (i.e. when data in a
partition is really shared), performance is almost the same as with
the current approach. But when there are fewer consumers than
partitions, we see a performance regression because the client waits
for a node to return a response before it can send the next request.
Hence this behaviour is introduced only for record_limit mode for now;
future work will improve this area.

ShivsundarR added 2 commits November 10, 2025 19:48

Implement one-node logic for record_limit mode

e40e488

Remove local file

de59e7a

github-actions bot added triage PRs from the community consumer clients labels Nov 10, 2025

AndrewJSchofield added KIP-932 Queues for Kafka ci-approved and removed triage PRs from the community labels Nov 10, 2025

AndrewJSchofield self-requested a review November 10, 2025 16:04

NCSS check fix

5dc4720

AndrewJSchofield mentioned this pull request Nov 10, 2025

[WIP] KAFKA-19767: Share fetch from one node at a time #20667

Closed
