[Parquet]Optimize the performance in record reader #8607
base: main
Thank you for this contribution @hhhizzz. I have kicked off the benchmarks for this PR as well.
🤖: Benchmark completed
Thank you so much @hhhizzz -- this is a really nice contribution with very nice performance results.
I had a few comments and I would like to get another pair of eyes on the packed decoder changes, but otherwise 👍 from me.
Thanks again
Thank you @hhhizzz -- assuming the benchmarks are still good this one looks good to go from my perspective
This is a nice find
    Self(mask)
}

/// Mark the given [`Encoding`] as present in this mask.
👍
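For readers unfamiliar with the pattern in the reviewed snippet, here is a minimal sketch of a bitmask that records which encodings are present. It is only an illustration: the `Encoding` variants, the `MaskSketch` type, and its method names are hypothetical and are not taken from the PR.

```rust
/// Hypothetical subset of encodings, each assigned a distinct bit position.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Encoding {
    Plain = 0,
    RleDictionary = 1,
    DeltaBinaryPacked = 2,
}

/// Hypothetical mask type: bit `n` is set when the encoding with
/// discriminant `n` is present.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct MaskSketch(u32);

impl MaskSketch {
    /// Build a mask from a slice of encodings.
    fn from_encodings(encodings: &[Encoding]) -> Self {
        let mut mask = 0u32;
        for &e in encodings {
            mask |= 1 << (e as u32);
        }
        Self(mask)
    }

    /// Mark the given encoding as present in this mask.
    fn insert(&mut self, e: Encoding) {
        self.0 |= 1 << (e as u32);
    }

    /// Check whether the given encoding is present.
    fn contains(&self, e: Encoding) -> bool {
        self.0 & (1 << (e as u32)) != 0
    }
}

fn main() {
    let mut mask = MaskSketch::from_encodings(&[Encoding::Plain, Encoding::RleDictionary]);
    assert!(mask.contains(Encoding::RleDictionary));
    assert!(!mask.contains(Encoding::DeltaBinaryPacked));
    mask.insert(Encoding::DeltaBinaryPacked);
    assert!(mask.contains(Encoding::DeltaBinaryPacked));
}
```

A mask like this makes membership checks a single bitwise AND, with no allocation and no scan over a list of encodings.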
🤖: Benchmark completed
Which issue does this PR close?
Related to:
Rationale for this change
Improve the performance of ParquetRecordBatchReader, especially when the RowSelector is short.
What changes are included in this PR?
For parquet/src/arrow/array_reader/cached_array_reader.rs, update the hash function.
Are these changes tested?
The hashmaps are already covered by existing tests.
Also tested manually by reading Parquet files.
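The description does not say which hash function the cache switched to. Purely as an illustration of the general technique, the sketch below plugs a cheaper hasher into a standard `HashMap` keyed by integers. `IntHasher`, `FastMap`, and `new_cache` are hypothetical names, and the multiplicative hash is an assumed stand-in rather than the function used in this PR.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical hasher tuned for small integer keys (for example, batch ids).
/// It skips the DoS-resistant default hasher and uses one multiplication.
#[derive(Default)]
struct IntHasher(u64);

impl Hasher for IntHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    /// Fallback for non-integer keys: simple byte-wise multiply-xor mixing.
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = (self.0 ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    /// Fast path used by `usize` keys: multiply by a large odd constant.
    fn write_usize(&mut self, n: usize) {
        self.0 = (n as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    }
}

/// A `HashMap` keyed by `usize` that uses the cheap hasher above.
type FastMap<V> = HashMap<usize, V, BuildHasherDefault<IntHasher>>;

/// Create an empty cache (hypothetical helper).
fn new_cache<V>() -> FastMap<V> {
    FastMap::default()
}

fn main() {
    let mut cache = new_cache::<&str>();
    cache.insert(0, "batch 0");
    cache.insert(1, "batch 1");
    assert_eq!(cache.get(&1), Some(&"batch 1"));
    assert!(cache.get(&2).is_none());
}
```

The trade-off is that a hasher like this gives up resistance to adversarial keys, which is usually acceptable when the keys are internally generated indices.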
Are there any user-facing changes?
No
Performance results in arrow_reader_row_filter.rs on my 3950X