ORC File Reading with Predicate Pushdown Flow (with compression enabled) #11247
abdulwadood97 started this conversation in General
Suppose I read an ORC file that contains only one stripe, written with zlib compression enabled and a 256 KB compression block size, and I apply predicate pushdown. Does the reader select which compression blocks to decompress, or does it decompress all of them and only then use the row group statistics to select which rows to process? If it does select the blocks, what is the logic? How does it know which block contains the data?
Similarly, what about decoding? I would expect selecting which encoded runs to decode to be easier, because once you read a run's header you know how many values it holds and can skip over it. Does this actually happen?
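To make the decoding idea concrete, here is a minimal sketch of the kind of header-based skipping I have in mind. It uses ORC's byte run-length encoding as I understand it from the format specification: a control byte of 0 to 127 means a run of (control + 3) copies of the next byte, and a control byte of -1 to -128 means that many literal bytes follow. The function name `byte_rle_value_at` is my own and not part of any ORC library, and this is only an illustration of the idea, not a claim about what the real readers do.

```python
def byte_rle_value_at(buf: bytes, index: int) -> int:
    """Return the value at logical position `index` in a byte-RLE stream,
    hopping from header to header instead of expanding every run."""
    pos = 0   # byte offset of the next control byte in the encoded stream
    seen = 0  # number of logical values skipped so far
    while pos < len(buf):
        control = buf[pos]
        if control < 128:
            # Run: (control + 3) copies of the single byte that follows.
            count = control + 3
            if index < seen + count:
                return buf[pos + 1]
            pos += 2  # skip control byte + repeated value
        else:
            # Literal: control is a negative signed byte; -control bytes follow.
            count = 256 - control
            if index < seen + count:
                return buf[pos + 1 + (index - seen)]
            pos += 1 + count  # skip control byte + literal bytes
        seen += count
    raise IndexError("index is past the end of the encoded stream")


# 100 zeros as one run (0x61 = 97, and 97 + 3 = 100), then two literals.
encoded = bytes([0x61, 0x00, 0xFE, 0x44, 0x45])
print(hex(byte_rle_value_at(encoded, 101)))
```

Only the control bytes are inspected for the runs that get skipped, which is why I assume skipping at the decoding level is cheap. What I cannot tell is whether the same kind of skipping is possible one level down, at the compression block level.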