-
Notifications
You must be signed in to change notification settings - Fork 1
Refactor to remove State
and make easier to use downstream
#1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+1,575
−1,288
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
The same as Cenc::encode/decode except it also returns the # of bytes encoded and decoded
ed96a38
to
4ec35b4
Compare
State
and make easier to use downstream
ttiurani
approved these changes
May 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
Feel free to merge and do the release procedure yourself!
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a major rewrite to simplify the API of this crate. We have been using
Vec<u8>
(orBox<[u8]>
) as a buffer plus aState
struct to track where in the buffer we are encoding or decoding. Instead, now we just use a buffer, and take slices of it to track where we are encoding or decoding. It also changes how theCompactEncoding
trait works. Instead of doingimpl CompactEncoding<T> for State
we now doimpl CompactEncoding for T
.This came about because while working downstream crates I found the usage awkward. See
HypercoreState
and thisEncoder
trait, needed an object to to be mutable to encode it here. These things are removed by PR's to their respective repos by issues fixed by this PR. hypercore pr and hypercore-protocol PR.Further changes are explained below:
Implementing
CompactEncoding
ForVec<T>
whereT: CompactEncoding
There was a discussion in discord about having the
Vec<T>
impl's be automatically derived. There are problems with this.First, we often don't want the same encoding of
T
withinVec<T>
which is the case when we encode u32 as a variable width number, but in Vec, we want to encode the u32 as a fixed width number.Second: when we auto derive
CompacEncoding
forVec<T>
we end up with the same generic implementation for all T.This causes a problem for the
preencode
/encoded_size
step.Consider When T is always encoded with a fixed width the result is just:
(preencode(vec.len()) + (preencode(T)*vec.len())
this is O(1) but for variable sized T, (like for
Vec<String>
) we must do something like(prencode(vec.len()) + vec.fold(0, |acc, x| acc + preencode(x))
basically we must calculate the size of everything element in the vec and sum them. this is O(N)
To have a correct generic preencode we'd have to use latter, and we'd miss out on getting O(1) preencode for things like Vec
And we would not implement
CompactEncoding
for specificVec<T>
without creating a duplicate implementation that would conflict with a blanket impl forVec<T>
A solution:
VecEncodable
Instead I've added
VecEncodable
.You implement:
VecEncodable
forMyType
and you automatically getimpl CompactEncoding for Vec<MyType>
.This also let's you avoid external traits on external types issues, since you are implementing the trait on
MyType
, notVec<MyType>
I believe this orphan trait issue is why in the hypercore crate we added a
HypercoreState(State)
struct.VecEncodable
has a default impl for encode/decode and just requires that you implementvec_encoded_size
.However for the cases like
u32
where we want a variable width encoding for a standaloneu32
but a fixedwidth encoding forVec<u32>
we do implement the
VecEncodable::vec_encode
andvec_decode
methods to overide the default variable width encoding foru32
Note: We use the
vec_
prefix on methods to avoid having disambiguate whichencode
/decode
we want on types that implementCompactEncodable
which hasencode
anddecode
methodsI've also added
BoxedSliceEncodable8
which works the same asVecEncodable
but gives youBox<[MyType]>
.name changes
CompactEncoding::preencode
is replaced withCompactEncoding::encoded_size
. I make this distinction because 'preencode' implies that it is doing some preperatory work, previously 'preencode' would update state by adding to the 'end' value. But the 'encoded_size' does not do this it just returns the calculated encoded size.I kept the function naming pattern: encode_, decode_, etc.for fixed width. And also encode_var, decode_var for variable width
methods on
BoxedSliceEncodable
/VecEncodable
etc are pefixed withboxed_slice_*
,vec_*
to avoid name collisions with theCompactEncoding
trait's methods.ergonomics
In my opinion, these changes make the library much easier to use. I've also included some macro's that make encoding and decoding more declarative, and concise. An example from the rewritten docs:
Here is a comparison between implementing
CompactingEncoding
before and after this PR:Now with this PR:
more notes
CompactEncoding
impls - I've only included the implementation that existed in the existingcompact_encoding
crate. I noticed there were some that would be nice to have, but were missing (and not currently used downstream). I left these out they could be added later.write_array
,take_array
, etc) which make implementingCompactEncoding
in downstream libraries easier.FixedWidthEncoding
trait for details.Note that these changes do add more code. But I think without the added and updated tests and documentation there would actually be a small decrease it lines-of-code.
I propose releasing this as a major version bump.