Skip to content

Latest commit

 

History

History
218 lines (180 loc) · 8.74 KB

0047-versioning-changes-for-hard-fork.md

File metadata and controls

218 lines (180 loc) · 8.74 KB

Summary

Remove unneeded version tags from Bin_prot-serialized data produced by ppx_version. Generate version tags on an as-needed basis where they're useful. Detect versioning violations in the CI version linter in a more useful way.

Motivation

Versioned data types are built from other versioned types and primitive OCaml types. When generating Bin_prot-serialized data from instances of those type, a version tag is generated not only for the containing type, but also for each constituent type. That strategy allows determining the version of any piece of the serialized data.

The Jane Street Async versioned RPC mechanism used in Mina_networking already provides versioning of data sent between nodes. Therefore, version-tagging of the serialized data is not needed for that use case. Removing the version tags will reduce the amount of data sent over the network.

Some precomputed blocks from Mainnet show the amount of data currently used by version tags:

Bin_io size Number of tags Proportion


  9956                  930            9%

1682720 121776 7% 1826633 131088 7% 1803712 129340 7%

We may wish to save serialized data and be able to determine its version. For that use case, a single "top" tag for the containing type suffices.

Some user-visible data, such as public keys, are formatted using the current serialization mechanism, where all contained types have version tags. For those few cases, allow use of the existing mechanism.

The CI version linter currently compares changes to versioned types against the PR base branch. That's more fine-grained than needed, if the goal of preventing version changes is to allow interoperability with deployed software. Instead, compare the types against the branch of the latest released software, as well as the base branch. That strategy will reduce the amount of noise produced by the linter, and provide more useful results.

Detailed design

Version-tagging changes

Currently, tagging a Stable module with %versioned produces shadowing bin_read_t and bin_write_t functions that read and write version tags, positive integers, prepended to serialized data, via the ppx_version library. The proposal is not to shadow those functions, and instead use those generated by deriving bin_io, directly. It is still useful to use %versioned to assure that serializations do not change for a given versioned type.

For backwards-compatibility, we can add the annotation

 [@@@with_all_version_tags]

to the legacy Stable.Vn modules. For each such module, generate a module within it:

 module With_all_version_tags = struct
   type t = ...

   let bin_read_t = ... bin_read_t ...
   let bin_write_t = ... bin_write_t ...
   ...
 end

where the shadowing functions are generated as they are today. For this case, the constituent types must also have the same annotation, which is enforced by the construction of the type t: the constituent types of t are of the form M.Stable.Vn.With_all_version_tags.t.

Within Stable modules, we can have the annotation

 [@@@with_top_version_tag]

to generate:

 module With_top_version_tag = struct
   type t = ...

   let bin_read_t = ... bin_read_t ...
   let bin_write_t = ... bin_write_t ...
   ...
 end

where the shadowing functions handle a single version tag to the serializations of outermost type instances. The constituent types need not have the annotation: the type t will be identical to the type t in the containing Vn module.

The existing versioning system generates a function in Stable modules:

 val bin_read_to_latest_opt : Bin_prot.Common.buf -> pos_ref:(int ref) -> Stable.Latest.t option

That function deserializes data by dispatching on version tags. Serializations may or may not have version tags. If there are neither all-tagged or top-tagged serializations, generate nothing. If there are all-tagged serializations, generate:

 val bin_read_all_tagged_to_latest : Bin_prot.Common.buf -> pos_ref:(int ref) -> Stable.Latest.t Or_error.t

This function will know only about the version in the legacy Vn modules where we've maintained all-tagging.

If there are top-tagged serializations, generate:

 val bin_read_top_tagged_to_latest : Bin_prot.Common.buf -> pos_ref:(int ref) -> Stable.Latest.t Or_error.t

This function will know about all the versions within a Stable module.

Because the default serialization does not contain version tags, it no longer makes sense to generate such a function in Stable. Instead, generate it inside With_all_version_tags and With_top_version_tag, if they're created. The return type should be changed to an Or_error.t type, to describe errors, instead of using an option type.

Version linting changes

The CI version linter looks for changes to versioned types in a PR branch by comparing those types with counterparts in the base branch. The linter will complain if a new version is added, and later modified, even if there are no deployed nodes using the new version.

Instead, the linter should use both the PR base branch and a release branch for comparison, according to this table:

Base diff Release diff Significance Warn?


N N No version changes N Y Y Version changed Y N Y Previous force merge N Y N Modified new version N

Under this regime, a new version can be added and modified as needed, without triggering a linter warning. If a PR contains an innocuous change to a versioned type, such as a field renaming, that can be accomplished with a force-merge, without triggering warnings in later PRs.

It should be enough to change the script ci_diff_types.sh to compare the PR branch against the release branch, in addition to the comparison made with $BASE_BRANCH_NAME. There may need to be a mechanism to automate that change when releases become available, which is beyond the scope of this RFC.

The version linter had been using the text of type definitions for comparisons, and the "Binable" linter had been using the text of the uses of the Binable functors. We can instead use Bin_prot shape digests for comparisons in both cases. For a type t, the digest is available as a string:

  Bin_prot.Shape.eval_to_digest_string bin_shape_t

With this approach to comparisons, we can use a single linter task in CI, rather than separate version and Binable linters.

Using shape digests to detect changes to serializations also means we no longer need unit tests for version-asserted types. ppx_version was checking for a For_tests module in that case, to remind that such tests were needed; that check can be removed.

Shape digests will also allow detecting changes to serializations provided by hand-written bin_io code. We may wish to add a module annotation like %versioned_custom for that use case. If the bin_shape_t for a type with a custom serialization is built with Bin_prot.Shape.basetype, the shape should use a new UUID if the serialization changes (effectively, a version), for change detection to be effective.

Drawbacks

The existing versioning mechanism is working acceptably well. The only drawback of making changes to the version-tagging is the effort to update ppx_version. That should not be a major effort, since it's mostly reorganizing existing code.

Rationale and alternatives

If these changes are not made, network performance will be less than it could be.

We could omit the top-tagging mechanism, since there isn't a specific use case for it now. Adding it later is possible, if it's needed for some purpose. Then again, adding that feature now while working on other code changes may be easier than deferring it.

We could phase out the all-tagged mechanism, by allowing two syntaxes for public keys and other affected types, and eventually disallowing the old syntax. That strategy may be too drastic to impose upon users.

Prior art

The relevant prior art is the existing versioning mechanism.

Unresolved questions

  • Do we need top-tagging?
  • What types will require all-tagging? Any user-visible Base58Check-encoded data created from Bin_prot-serializations will need that. One way to enforce this would be to have Codable.Make_base58_check take a module containing a value all_tagged, which would be contained in the generated With_all_version_tags modules.