Skip to content

Conversation

@Semisol
Copy link
Contributor

@Semisol Semisol commented Aug 28, 2025

This PR adds an experimental tuple decoder that is optimized for lower allocations and higher performance, support for fixed-length byte strings, along with fixing some panics.

This should be kept experimental until a way to handle the potentially breaking changes is found (possibly combining it with other breaking binding changes in a Go bindings "v2")

Over the original decoder, the new decoder attempts to zero-copy strings, and uses a new Boxed type which requires zero allocations compared to the any type which requires its contents are on the heap.
A neat hack relating to on-stack slice allocation is also used to remove allocations from append that arise from tuples while not having to estimate the length of the tuple for capacity.

The tests are also improved to ensure unpacking works correctly, and benchmarks are added.

Benchmarking

To run benchmarks: go test -run=NONE -bench=^BenchmarkTupleUnpack

To-do list

  • Improve code documentation
  • Add fuzz tests that were used to discover panics that should have been errors when unpacking tuples
  • Handle the breaking change to google/uuid (we may be able to just alias tuple.UUID to it)
  • Make the V2 decoder the primary decoder, and remove the original decoder's code before merge.
  • Handle the fact that some users may expect Unpack to not depend on the input buffer. This could be "fixed" by copying the input buffer or by making a breaking v2 release (alongside possibly other binding improvements).
  • Add support for big.Int. The new decoder does not support this right now.
  • Add support for end-of-tuple type encoding and returning bytes after EOT
  • Consider adding support for 64-bit identifier type

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

This commit:
- Requires a minimum Go version 1.20 to allow using newer features
  in future commits
- Adds support for the length-prefixed bytes type
- Fixes certain panics detected by fuzzing on Unpack
- Switches to binary.BigEndian methods for decoding integers which
  eliminates allocations and is faster
- Switches to google/uuid for UUIDs to reduce conversion boilerplate
  as it seems to be the de-facto library for UUIDs
- Does some other refactors for a future tuple "builder" to reduce
  allocations and eliminate type-casting overhead
This commit adds an experimental tuple unpacker to the Go bindings
which reduces unnecessary allocations and tries to optimize decoding.

Tests for the FixedLen type are also added, alongside extending tests
to ensure the unpacked value is identical to the starting value and
adding benchmarks, and negative numbers.

- Zero-copy is used for strings and byte slices whenever possible to
  eliminate allocations.
- Suprisingly, a loop was faster than using standard library functions
  to locate the end of a string and check.
- A function table was used as it could allow extensibility in the
  future and may be faster on certain platforms.
- A custom Boxed type is used, which is larger than an interface{}
  but does not require contents to be allocated on the heap. This
  is effectively a tagged union.
- A neat hack is used in the unpacking loop to only do 1 allocation
  in the majority of tuple decoding cases.
@Semisol
Copy link
Contributor Author

Semisol commented Aug 28, 2025

I should also note that I have improvements lined up for the Directory Layer which is a breaking change and the futures API.

For the directory layer this is a breaking change and partly diverges from the API used by other bindings, as it no longer supports relative move/delete operations on directories. This can be a hazard when directories are attempted to be used across transactions, as it assumes the directory path has not changed.

@Semisol Semisol deleted the branch apple:main August 28, 2025 22:35
@Semisol Semisol closed this Aug 28, 2025
@Semisol Semisol deleted the main branch August 28, 2025 22:35
@Semisol
Copy link
Contributor Author

Semisol commented Aug 28, 2025

Moved to #12338

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant