Skip to content

Latest commit

 

History

History
146 lines (117 loc) · 4.89 KB

0013-rpc-versioning.md

File metadata and controls

146 lines (117 loc) · 4.89 KB

Summary

We propose a mechanism to allow updating the versions of remote procedure calls (RPCs) between Coda protocol nodes.

Motivation

Coda uses a remote procedure call (RPC) mechanism for nodes to query other nodes, and to broadcast messages to the gossip network. As the codebase evolves, the structure of those RPC messages may evolve, and new kinds of messages may be added. Nodes running different versions of the software need to be able to communicate.

When querying, the caller and callee nodes may be using different versions of RPC calls. The caller can be running the newer version, and the callee the older version, or vice-versa. Both scenarios have to be accommodated.

Detailed design

The Jane Street Async library contains a module Versioned_rpc with the machinery to allow evolution of RPC call versions.

In the coda_network library, the Versioned_rpc library is used in two ways, for queries and for broadcasting.

Queries

For queries, the pattern is (simplifying somewhat):

  module Query = struct

    module T = struct
      let name = ...
      module T = struct
        type query = ...
        type response = ... option
      end
      module Caller = T
      module Callee = T
    end

    include Versioned_rpc.Both_convert.Plain.Make (T)

    module V1 = struct
      module T = struct
        type query = ...
        type response = ... option
        let version = 1
        let query_of_caller_model = Fn.id
        let callee_model_of_query = Fn.id
        let response_of_callee_model = Fn.id
        let caller_model_of_response = Fn.id
     end
     include Register (T)
   end

  end

The name identifies the particular RPC query. Calling the functor Plain.Make creates the other functor Register called within V1. The module T.T offers types for a query and response, and in this code, both the caller and callee agree on those types (they could differ, in theory).

The four functions implemented here with Fn.id, the identity function, are coercions between the query and response types in T.T and V1.T. In the existing RPC queries, those types are are the same, so we can use the identity function.

For a given query, if we wish to update the protocol, we'd add:

  module V2 = struct
    module T = struct
      type query = ...
      type response = ... option
      let version = 2
      let query_of_caller_model : T.Caller.query -> query = ...
      let callee_model_of_query : query -> T.Callee.query = ...
      let response_of_callee_model : T.Callee.response -> response = ...
      let caller_model_of_response : reponse -> T.Caller.response = ...
   end
   include Register (T)
 end

The types of the coercions are shown. For each coercion, the input and output types could differ.

There could be additional new modules for subsequent versions. Eventually, versions could be pruned from the code, to encourage nodes to upgrade their software. When a new query version is created, the Vn module for the previous version could have an annotation:

  [@@remove_after "20200702"]

where the annotation is implemented via a ppx. Compiling after the given date results in a warning. A year or so past the introduction of the new version might be a suitable date for removing the previous version.

The query modules are used in a list of "implementations". To define an implementation, we need a function of type:

  Host_and_port.t -> version:int -> T.Caller.query -> T.Callee.response option Deferred.t

which does the work within the node to respond to the query. The host and port represent the "connection state" of the TCP connection between the nodes, which is the host and ephemeral port of the caller. The version passed is the caller's. In theory, these functions could dispatch on the version. Instead, the version should be considered informative, and the real accomodation between versions should happen in the coercions. Therefore, the implementation functions do not need to change between versions.

Broadcasting

The RPC versioning mechanism for broadcasting is similar, except that instead of query and response types, there is a "msg" type. The versioning module defines coercions

  val msg_of_caller_model : Caller.msg -> msg
  val callee_model_of_msg : msg -> Callee.msg

In the V1 module, those are both Fn.id. For a new version, we'd create a new Vn module with a new version number and appropriate coercions. As for queries, we'd want to indicate a removal date for earlier-version modules.

Drawbacks

Nodes using a version implemented by a versioning module cannot communicate with nodes where that module has been removed. That's a feature, really, although perhaps a temporary inconvenience for nodes that haven't upgraded their software.

Prior art

The Jane Street version RPC library is already in the Coda codebase.

Unresolved questions

The versioning mechanism described here has not been tested locally.