Replies: 5 comments
-
@zhouyuan Currently, Parquet decompression is separate from DWRF and ORC. I think it would be better to provide the same decompressors for all file formats, not just Parquet, and to have a central location where the decompressed input streams are created. I left a comment in your draft PR about this. What do you think? @Yuhta
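To make the "central location" idea concrete, here is a minimal sketch of a shared factory that every reader (Parquet, DWRF, ORC) could ask for a decompressor. All names here (`CompressionKind`, `Decompressor`, `IdentityDecompressor`, `DecompressorFactory`) are hypothetical and chosen for illustration; this is not Velox's actual API, and the pass-through codec exists only so the sketch compiles without zlib or ISA-L.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical names throughout; none of this is Velox's actual API.
enum class CompressionKind { kNone, kGzip, kZstd };

// Common interface that every file format reader would program against.
class Decompressor {
 public:
  virtual ~Decompressor() = default;
  virtual std::string decompress(const std::string& input) = 0;
};

// Pass-through codec, used only so the sketch runs without zlib or ISA-L.
class IdentityDecompressor : public Decompressor {
 public:
  std::string decompress(const std::string& input) override {
    return input;
  }
};

// Single place where decompressors are created. A gzip backend (zlib,
// ISA-L igzip, QPL, ...) would be registered here once and then be
// available to all formats.
class DecompressorFactory {
 public:
  using Creator = std::function<std::unique_ptr<Decompressor>()>;

  void registerCodec(CompressionKind kind, Creator creator) {
    creators_[kind] = std::move(creator);
  }

  // Throws std::invalid_argument if no backend was registered for `kind`.
  std::unique_ptr<Decompressor> create(CompressionKind kind) const {
    auto it = creators_.find(kind);
    if (it == creators_.end()) {
      throw std::invalid_argument("no decompressor registered for codec");
    }
    return it->second();
  }

 private:
  std::map<CompressionKind, Creator> creators_;
};
```

With something like this, swapping the gzip backend would be a single `registerCodec` call rather than a change in each reader; whether that matches the existing DWRF code structure is an open question for the discussion.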
-
This is related to #4411. Besides these two efforts, I'm also working on a customized decompressor. I think it's time to think about a unified solution for decompressor management. @Yuhta @zhouyuan Shall we discuss this in the Parquet reader meeting this Friday? (Velox native parquet reader project meeting)
-
CC: @oerling
-
@yingsu00 @mbasmanova thanks, -yuan
-
I think it's good to consolidate the decompressors. Only one thing to keep in mind is that We could merge QPL support first though, since the integration point is not large, only a few lines in
-
Description
Hi Velox community,
Parquet with GZIP compression is still widely used in some production environments because it provides a high compression ratio. However, gzip-based I/O throughput is slower than that of newer codecs such as lz4 and zstd.
ISA-L provides optimized low-level functions (CRC, compression, erasure coding) that are commonly used in storage applications. Its igzip implementation provides much better compression and decompression performance and is compatible with zlib-based algorithms. With this new API, gzip performance almost catches up with zstd [1].
I have made a draft patch (#4533) for the Parquet gzip decompression code path, and performance improved a lot.
[1] https://s3.us-east-2.amazonaws.com/intel-builders/day_2_intelligent_storage_acceleration_library.pdf
CC: @Yuhta @mbasmanova
Thanks, -yuan