-
Notifications
You must be signed in to change notification settings - Fork 2k
feat: reorder row groups by statistics during sort pushdown #21580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
a013bf6
c9a6c26
5018882
556b590
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -16,7 +16,12 @@ | |||||||||||||||||||
| // under the License. | ||||||||||||||||||||
|
|
||||||||||||||||||||
| use crate::sort::reverse_row_selection; | ||||||||||||||||||||
| use arrow::datatypes::Schema; | ||||||||||||||||||||
| use datafusion_common::{Result, assert_eq_or_internal_err}; | ||||||||||||||||||||
| use datafusion_physical_expr::expressions::Column; | ||||||||||||||||||||
| use datafusion_physical_expr_common::sort_expr::LexOrdering; | ||||||||||||||||||||
| use log::debug; | ||||||||||||||||||||
| use parquet::arrow::arrow_reader::statistics::StatisticsConverter; | ||||||||||||||||||||
| use parquet::arrow::arrow_reader::{RowSelection, RowSelector}; | ||||||||||||||||||||
| use parquet::file::metadata::{ParquetMetaData, RowGroupMetaData}; | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
|
@@ -377,6 +382,120 @@ impl PreparedAccessPlan { | |||||||||||||||||||
| }) | ||||||||||||||||||||
| } | ||||||||||||||||||||
|
|
||||||||||||||||||||
| /// Reorder row groups by their min statistics for the given sort order. | ||||||||||||||||||||
| /// | ||||||||||||||||||||
| /// This helps TopK queries find optimal values first. For ASC sort, | ||||||||||||||||||||
| /// row groups with the smallest min values come first. For DESC sort, | ||||||||||||||||||||
| /// row groups with the largest min values come first. | ||||||||||||||||||||
| /// | ||||||||||||||||||||
| /// Gracefully skips reordering when: | ||||||||||||||||||||
| /// - There is a row_selection (too complex to remap) | ||||||||||||||||||||
| /// - 0 or 1 row groups (nothing to reorder) | ||||||||||||||||||||
| /// - Sort expression is not a simple column reference | ||||||||||||||||||||
| /// - Statistics are unavailable | ||||||||||||||||||||
| pub(crate) fn reorder_by_statistics( | ||||||||||||||||||||
| mut self, | ||||||||||||||||||||
| sort_order: &LexOrdering, | ||||||||||||||||||||
| file_metadata: &ParquetMetaData, | ||||||||||||||||||||
| arrow_schema: &Schema, | ||||||||||||||||||||
| ) -> Result<Self> { | ||||||||||||||||||||
| // Skip if row_selection present (too complex to remap) | ||||||||||||||||||||
| if self.row_selection.is_some() { | ||||||||||||||||||||
| debug!("Skipping RG reorder: row_selection present"); | ||||||||||||||||||||
| return Ok(self); | ||||||||||||||||||||
| } | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Nothing to reorder | ||||||||||||||||||||
| if self.row_group_indexes.len() <= 1 { | ||||||||||||||||||||
| return Ok(self); | ||||||||||||||||||||
| } | ||||||||||||||||||||
|
|
||||||||||||||||||||
| // Get the first sort expression | ||||||||||||||||||||
| // LexOrdering is guaranteed non-empty, so first() returns &PhysicalSortExpr | ||||||||||||||||||||
| let first_sort_expr = sort_order.first(); | ||||||||||||||||||||
|
Comment on lines
+414
to
+415
|
||||||||||||||||||||
| // LexOrdering is guaranteed non-empty, so first() returns &PhysicalSortExpr | |
| let first_sort_expr = sort_order.first(); | |
| let first_sort_expr = match sort_order.iter().next() { | |
| Some(expr) => expr, | |
| None => { | |
| debug!("Skipping RG reorder: empty sort order"); | |
| return Ok(self); | |
| } | |
| }; |
Copilot
AI
Apr 13, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For DESC ordering, reordering by min values is often a poor proxy for “row group likely contains the largest values first”; typically you want to sort by max when descending == true (and by min when ascending). This can significantly reduce the intended TopK benefit (and can even choose a worse first row group when ranges overlap). Consider switching to row_group_maxs(...) for descending order, and update the doc comment (currently mentions “min/max”) and the DESC unit test accordingly.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @adriangb had the great idea to also order by grouping keys which can
Doesn't have to be in this PR but perhaps we can think about how it fits in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dandandan for review! That's a great extension. The reorder_by_statistics method is generic enough to take any LexOrdering — it doesn't need to be tied to TopK specifically. So extending this for GROUP BY should be a matter of:
Happy to track this as a follow-up issue. Will open one after this PR lands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dandandan! Created #21581 to track this. The existing infrastructure from this PR should be directly reusable — mainly needs the aggregate planner to populate
sort_order_for_reorderfrom grouping keys.