@mhaseeb123 mhaseeb123 (Member) commented Sep 30, 2025

Description

Closes #20125

This PR allows evaluating IS_NULL(col) and, by extension, NOT(IS_NULL(col)) expressions while filtering Parquet row groups and data pages using the corresponding statistics (page-level filtering requires the page index). The PR also includes optimizations for the case where the host columns containing page statistics don't contain any nulls.
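
To illustrate the idea, here is a minimal sketch of how null-count statistics can decide whether a row group or page could satisfy IS_NULL(col) or NOT(IS_NULL(col)). It is not the libcudf implementation: `ChunkStats`, `may_match_is_null`, `may_match_is_not_null`, and `surviving_chunks` are hypothetical names, and the sketch assumes each chunk exposes a row count and an optional null count. Chunks with missing statistics are kept, since pruning must stay conservative.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ChunkStats:
    """Hypothetical per-row-group (or per-page) statistics for one column."""
    num_rows: int
    null_count: Optional[int]  # None when the writer recorded no null count


def may_match_is_null(stats: ChunkStats) -> bool:
    """IS_NULL(col) can only be true somewhere if the chunk holds a null."""
    if stats.null_count is None:
        return True  # unknown statistics: keep the chunk
    return stats.null_count > 0


def may_match_is_not_null(stats: ChunkStats) -> bool:
    """NOT(IS_NULL(col)) can only be true if some row is non-null."""
    if stats.null_count is None:
        return True  # unknown statistics: keep the chunk
    return stats.null_count < stats.num_rows


def surviving_chunks(chunks: List[ChunkStats],
                     predicate: Callable[[ChunkStats], bool]) -> List[int]:
    """Return indices of chunks that cannot be pruned for the predicate."""
    return [i for i, stats in enumerate(chunks) if predicate(stats)]


chunks = [
    ChunkStats(num_rows=100, null_count=0),     # no nulls
    ChunkStats(num_rows=100, null_count=100),   # all nulls
    ChunkStats(num_rows=100, null_count=7),     # mixed
    ChunkStats(num_rows=100, null_count=None),  # statistics missing
]

is_null_survivors = surviving_chunks(chunks, may_match_is_null)
is_not_null_survivors = surviving_chunks(chunks, may_match_is_not_null)
```

In this sketch the first chunk is pruned for IS_NULL(col) and the second for NOT(IS_NULL(col)); the mixed chunk and the chunk without statistics survive both filters and would still need to be read and evaluated row by row.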

Checklist

  • Optimize page_stats_caster and row_group_stats_caster to use has_is_null_operator
  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot copy-pr-bot bot commented Sep 30, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added the "libcudf" label Sep 30, 2025
@mhaseeb123 mhaseeb123 added the "feature request", "2 - In Progress", "cuIO", "non-breaking", and "libcudf" labels and removed the "libcudf" label Sep 30, 2025
@GregoryKimball GregoryKimball moved this to Burndown in libcudf Oct 17, 2025
@mhaseeb123 mhaseeb123 added the "4 - Needs Review" label and removed the "3 - Ready for Review" label Oct 21, 2025
@karthikeyann karthikeyann requested a review from devavret October 22, 2025 19:38
@devavret devavret (Contributor) left a comment

Thanks for this. I cannot find anything to complain about.

@mhaseeb123 mhaseeb123 added the "5 - Ready to Merge" label and removed the "4 - Needs Review" label Oct 27, 2025
@mhaseeb123 (Member, Author) commented

/merge

@rapids-bot rapids-bot bot merged commit 3435ee7 into rapidsai:main Oct 27, 2025
378 of 384 checks passed
@mhaseeb123 mhaseeb123 deleted the fea/evaluate-is-null-using-stats branch October 27, 2025 23:07
@karthikeyann karthikeyann moved this from Burndown to Slip in libcudf Nov 12, 2025
@karthikeyann karthikeyann moved this from Slip to Landed in libcudf Nov 12, 2025
Labels

  • 5 - Ready to Merge: Testing and reviews complete, ready to merge
  • cuIO: cuIO issue
  • feature request: New feature or request
  • libcudf: Affects libcudf (C++/CUDA) code.
  • non-breaking: Non-breaking change

Projects

Status: Landed

Development

Successfully merging this pull request may close these issues.

[FEA] Support unary operators in Parquet row group and page filtering

3 participants