Description
Which part is this question about
https://github.com/apache/arrow-rs/blob/55.2.0/parquet/src/bloom_filter/mod.rs
Describe your question
Hi team!
I’d love to reuse the existing Split Block Bloom Filter (SBBF) implementation in parquet to back a byte-level cache in one of our team's downstream services.
The problem is that the two helpers that convert between raw bitsets and Sbbf instances are pub(crate), so external crates can’t create an Sbbf from bytes or emit its bytes without re-implementing your logic.
Would you be open to relaxing the visibility and making the following methods pub (without changing their signatures)?
pub(crate) fn new(bitset: &[u8]) -> Self;
pub(crate) fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError>;
That would let us deserialize SBBFs straight from storage and re-serialize them after caching, while reusing the canonical implementation from this crate.
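To make the intended usage concrete, here is a minimal sketch of the round trip we have in mind. Since the real methods aren't reachable from outside the crate today, it uses a hypothetical stand-in type (`SbbfSketch`) that mirrors the two signatures above. The block layout (256-bit blocks made of eight little-endian 32-bit words) follows the Parquet bloom filter format; everything else, including the `round_trip` helper and the `io::Result` return type, is an assumption for illustration. The crate's actual `write` may also emit a header ahead of the bitset, which this sketch leaves out.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for `parquet::bloom_filter::Sbbf`.
/// This is NOT the crate's type; it only mirrors the two signatures in question.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SbbfSketch(Vec<[u32; 8]>);

impl SbbfSketch {
    /// Build the filter from a raw bitset: every 32 bytes become one block
    /// of eight little-endian u32 words. Trailing bytes that do not fill a
    /// whole block are ignored in this sketch.
    pub fn new(bitset: &[u8]) -> Self {
        let blocks = bitset
            .chunks_exact(32)
            .map(|chunk| {
                let mut block = [0u32; 8];
                for (word, bytes) in block.iter_mut().zip(chunk.chunks_exact(4)) {
                    *word = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
                }
                block
            })
            .collect();
        Self(blocks)
    }

    /// Emit the raw bitset only (no header), word by word in little-endian order.
    /// Returns `io::Result` here instead of the crate's `ParquetError`.
    pub fn write<W: Write>(&self, mut writer: W) -> io::Result<()> {
        for block in &self.0 {
            for word in block {
                writer.write_all(&word.to_le_bytes())?;
            }
        }
        Ok(())
    }
}

/// Hypothetical cache helper: deserialize a filter from stored bytes,
/// then serialize it back out after caching.
pub fn round_trip(stored: &[u8]) -> io::Result<Vec<u8>> {
    let filter = SbbfSketch::new(stored);
    let mut out = Vec::with_capacity(stored.len());
    filter.write(&mut out)?;
    Ok(out)
}
```

With the two methods public, our service would call the real `Sbbf` in place of the stand-in and would not need to duplicate this byte-level logic.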
Thanks,