feat: Support prefetch in index lookup join #12611

xiaoxmeng · 2025-03-11T21:00:39Z

Summary:
This PR adds prefetch support to the index lookup join, controlled by a query config. The index join
operator can prefetch up to the configured prefetch limit, which enables: (1) parallel lookups at the
backend, either (1-1) executed in parallel across multiple backend shards or (1-2) batched by the backend
to improve throughput; (2) pipelining of the table scan and the index lookup within the same driver
pipeline. The table scan executes synchronously while the index lookup is asynchronous. This achieves
pipelining without relying on an exchange, which might cause non-deterministic execution; a Meta internal
use case needs deterministic execution for checkpointing. In Meta internal testing, this achieves a 2x
throughput improvement (measured in rows per second) with 33% memory overhead when prefetching up to
4 batches.
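
For readers unfamiliar with the pattern, the bounded prefetch described above can be sketched roughly as follows. This is a minimal standalone illustration, not the Velox implementation: `Batch`, `lookupAsync`, and `runWithPrefetch` are hypothetical names, and the "index lookup" just doubles each key on a background thread. The probe side is produced synchronously, up to `prefetchLimit` lookups are kept in flight, and results are consumed in FIFO order so the output order stays deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for a batch of probe keys or lookup results.
using Batch = std::vector<int>;

// Hypothetical asynchronous index lookup: doubles every key on another thread.
std::future<Batch> lookupAsync(Batch probe) {
  return std::async(std::launch::async, [p = std::move(probe)]() {
    Batch out;
    out.reserve(p.size());
    for (int key : p) {
      out.push_back(key * 2);
    }
    return out;
  });
}

// Iterates over the probe batches synchronously while keeping at most
// `prefetchLimit` lookups in flight. Results are drained oldest-first, so the
// output order matches the input order regardless of lookup completion order.
// If `maxInFlight` is non-null, it receives the peak number of pending lookups.
std::vector<Batch> runWithPrefetch(
    const std::vector<Batch>& probeBatches,
    std::size_t prefetchLimit,
    std::size_t* maxInFlight = nullptr) {
  assert(prefetchLimit > 0);
  std::deque<std::future<Batch>> inFlight;
  std::vector<Batch> results;
  std::size_t peak = 0;

  for (const Batch& probe : probeBatches) {
    // Window full: block on the oldest lookup before issuing another one.
    if (inFlight.size() == prefetchLimit) {
      results.push_back(inFlight.front().get());
      inFlight.pop_front();
    }
    inFlight.push_back(lookupAsync(probe));
    peak = std::max(peak, inFlight.size());
  }

  // No more probe input: drain the remaining lookups in order.
  while (!inFlight.empty()) {
    results.push_back(inFlight.front().get());
    inFlight.pop_front();
  }

  if (maxInFlight != nullptr) {
    *maxInFlight = peak;
  }
  return results;
}
```

Calling `runWithPrefetch(probes, 4)` mirrors the "up to 4 batches" setting mentioned in the summary. The window size bounds how many result batches can be buffered at once, which is where the extra memory comes from.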

The follow-up is to add memory-based prefetch throttling to integrate with a Meta internal
ML use case, and memory pool wiring to ease performance (memory overhead) analysis.

Differential Revision: D70909786

facebook-github-bot · 2025-03-11T21:00:47Z

This pull request was exported from Phabricator. Differential Revision: D70909786

netlify · 2025-03-11T21:00:58Z

✅ Deploy Preview for meta-velox canceled.

Name	Link
🔨 Latest commit	`71a1913`
🔍 Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/67d12f53d5d6b3000831a16c

Summary: This PR adds prefetch support to the index lookup join, controlled by a query config. The index join operator can prefetch up to the configured prefetch limit, which enables: (1) parallel lookups at the backend, either (1-1) executed in parallel across multiple backend shards or (1-2) batched by the backend to improve throughput; (2) pipelining of the table scan and the index lookup within the same driver pipeline. The table scan executes synchronously while the index lookup is asynchronous. This achieves pipelining without relying on an exchange, which might cause non-deterministic execution; a Meta internal use case needs deterministic execution for checkpointing. In Meta internal testing, this achieves a 2x throughput improvement (measured in rows per second) with 33% memory overhead when prefetching up to 4 batches. The follow-up is to add memory-based prefetch throttling to integrate with a Meta internal ML use case, and memory pool wiring to ease performance (memory overhead) analysis. Reviewed By: wenqiwooo Differential Revision: D70909786

facebook-github-bot · 2025-03-12T06:16:09Z

This pull request was exported from Phabricator. Differential Revision: D70909786


facebook-github-bot · 2025-03-12T06:53:14Z

This pull request was exported from Phabricator. Differential Revision: D70909786

facebook-github-bot added the CLA Signed label Mar 11, 2025

facebook-github-bot added the fb-exported label Mar 11, 2025

xiaoxmeng force-pushed the export-D70909786 branch from a6ae1ec to 8798a2f March 12, 2025 06:16

xiaoxmeng force-pushed the export-D70909786 branch from 8798a2f to 71a1913 March 12, 2025 06:53
