Merkleize · bigspider · Feb 20, 2024 · Feb 19, 2024 · Feb 20, 2024
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,5 @@
+This folder contains high-level documentation on the MATT framework and on the core concepts used in the example smart contracts implemented in this repository.
+
+- [matt](./matt.md)<br>Start here for a general introduction to MATT.
+- [contracts](./contracts.md)<br>How to build smart contracts based on state machines.
+- [checkcontractverify](./checkcontractverify.md)<br>Draft specs for the `OP_CHECKCONTRACTVERIFY` opcode.
diff --git a/docs/checkcontractverify.md b/docs/checkcontractverify.md
@@ -0,0 +1,215 @@
+# Introduction
+
+This page describes the current semantics of `OP_CHECKCONTRACTVERIFY`.
+
+Together with `OP_CAT`, it is a straightforward implementation of MATT in Bitcoin.
+
+## P2TR and _augmented_ P2TR
+
+In a P2TR scriptPubKey, the output key is computed from an _internal pubkey_ `pk` and a `taptree`, as:
+
+```
+output_key = taproot_tweak(pk, taptree)
+```
+
+We call an _augmented_ P2TR any P2TR where the _internal pubkey_ is, in turn, computed from a _naked pubkey_ `naked_pk`, tweaked with some embedded `data`:
+
+```
+pk = tweak(naked_pk, data)
+```
+
+`OP_CHECKCONTRACTVERIFY` makes it possible to verify that the `scriptPubKey` of an input or an output is a certain P2TR Script, possibly _augmented_ with some embedded data.
+
+The embedded data is a 32-byte value.
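
The two tweaks can be sketched in pure Python. This is a minimal, unoptimized sketch and not the implementation used by the MATT node: `taproot_tweak` follows the `TapTweak` tagged hash of BIP-341, while the formula used here for the data tweak, `sha256(naked_pk || data)`, is an assumption made for illustration, since this page does not specify it. All helper names are hypothetical.

```python
import hashlib

# secp256k1 parameters (field prime, group order, generator)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None is the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and a[1] != b[1]:
        return None
    if a == b:
        lam = (3 * a[0] * a[0] * pow(2 * a[1], P - 2, P)) % P
    else:
        lam = ((b[1] - a[1]) * pow(b[0] - a[0], P - 2, P)) % P
    x3 = (lam * lam - a[0] - b[0]) % P
    return (x3, (lam * (a[0] - x3) - a[1]) % P)

def point_mul(pt, k):
    """Double-and-add scalar multiplication."""
    result = None
    for i in range(256):
        if (k >> i) & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
    return result

def lift_x(x):
    """Return the point with x-coordinate x and even y (BIP-340), or None."""
    if x >= P:
        return None
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)
    if pow(y, 2, P) != y_sq:
        return None
    return (x, y if y % 2 == 0 else P - y)

def tagged_hash(tag, msg):
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()

def add_tweak(xonly_pk, t):
    """Return the x-only key for lift_x(xonly_pk) + t*G."""
    point = lift_x(int.from_bytes(xonly_pk, "big"))
    k = int.from_bytes(t, "big")
    if point is None or k >= N:
        raise ValueError("invalid key or tweak")
    result = point_add(point, point_mul(G, k))
    if result is None:
        raise ValueError("tweak results in the point at infinity")
    return result[0].to_bytes(32, "big")

def tweak(naked_pk, data):
    # ASSUMPTION: the data tweak is sha256(naked_pk || data).
    return add_tweak(naked_pk, hashlib.sha256(naked_pk + data).digest())

def taproot_tweak(pk, taptree):
    # BIP-341: t = tagged_hash("TapTweak", internal_key || merkle_root)
    return add_tweak(pk, tagged_hash("TapTweak", pk + taptree))

# Example with placeholder values (not real contract data).
naked_pk = G[0].to_bytes(32, "big")
data = hashlib.sha256(b"example state").digest()
taptree = hashlib.sha256(b"placeholder taptree root").digest()
output_key = taproot_tweak(tweak(naked_pk, data), taptree)
```

The resulting `output_key` is the 32-byte x-only key that would appear in the P2TR `scriptPubKey` of an augmented output.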
+
+
+## `OP_CHECKCONTRACTVERIFY`
+
+This section describes the semantics of the `OP_CHECKCONTRACTVERIFY` opcode, as currently implemented in the [docker container for MATT](https://github.com/Merkleize/docker).
+
+### Description
+
+`OP_CHECKCONTRACTVERIFY` is only active for scripts spending a Segwit version 1 input.
+
+The opcode reads `data`, `index`, `pk`, `taptree`, `flags` from the stack (listed bottom to top).
+
+`OP_CHECKCONTRACTVERIFY` verifies that the `scriptPubKey` of the input/output with the given `index` is a P2TR script with a pubkey obtained from the x-only pubkey `pk`, optionally tweaked with `data`, optionally taptweaked with `taptree`. The `CCV_FLAG_CHECK_INPUT` flag determines whether `index` refers to an input or an output. Special values for the parameters are listed below.
+
+The `flags` parameter alters the behaviour of the opcode. If negative, the opcode checks the `scriptPubKey` of an input; otherwise, it checks the `scriptPubKey` of an output. The following value of `flags` is currently the only one defined for inputs:
+
+- `CCV_FLAG_CHECK_INPUT = -1`: makes the opcode check an input.
+
+Non-negative values make the opcode check an output, and different values have different behaviour in the way the output's amount (`nValue`) is checked. The following values for the `flags` are currently defined for checking an output:
+
+- `0`: default behavior, the (possibly residual) amount of this input must be present in the output. This amount
+- `CCV_FLAG_IGNORE_OUTPUT_AMOUNT = 1`: For outputs, disables the default deferred checks on amounts defined below. Undefined when `CCV_FLAG_CHECK_INPUT` is present.
+- `CCV_FLAG_DEDUCT_OUTPUT_AMOUNT = 2`: Fail if the amount of the output is larger than the amount of the input; otherwise, subtract the value of the output from the value of the current input in future calls to `OP_CHECKCONTRACTVERIFY`.
+
+The following parameter values are special:
+- If `pk` is empty, it is replaced with the NUMS x-only pubkey `0x50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0` defined in [BIP-0340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki).
+- If `pk` is `-1`, it is replaced with the current input's internal key.
+- If `index` is `-1`, it is replaced with the current input index.  
+- If `data` is empty, the data tweak is skipped.
+- If `taptree` is empty, the taptweak is skipped.
+- If `taptree` is `-1`, the taptree of the current input is used for the taptweak.
+
+The following additional deferred checks are performed after the validation of all inputs is completed:
+- The amount of each output must be at least equal to the sum of the amounts of all the inputs that have a `CCV` for that output with the default flag (equal to `0`).
+- No output that is a target of `CCV_FLAG_DEDUCT_OUTPUT_AMOUNT` can also be the target of another `OP_CHECKCONTRACTVERIFY`, unless that other call uses `CCV_FLAG_IGNORE_OUTPUT_AMOUNT`.
+
+### Pseudocode
+
+Semantics (initialization before input evaluation):
+```python
+for in_index in range(n_inputs):
+  in_ccv_amount[in_index] = inputs[in_index].amount
+
+for out_index in range(n_outputs):
+  out_min_amount[out_index] = 0
+```
+
+
+Semantics (per input):
+
+```python
+if flags < CCV_FLAG_CHECK_INPUT or flags > CCV_FLAG_DEDUCT_OUTPUT_AMOUNT:
+  return success()  # undefined flags are OP_SUCCESS
+
+if index == -1:
+  index = current_input_index
+
+if flags == CCV_FLAG_CHECK_INPUT:
+  if index < 0 or index >= n_inputs:
+    return fail()
+
+  target = inputs[index].scriptPubKey
+else:
+  if index < 0 or index >= n_outputs:
+    return fail()
+
+  target = outputs[index].scriptPubKey
+
+if taptree == <-1>:
+  taptree = current_input_taptree
+
+if pk == <0>:
+  result = BIP340_NUMS_KEY
+elif pk == <-1>:
+  result = current_input_internal_key 
+elif len(pk) == 32:
+  result = pk
+else:
+  return fail()
+
+if data != <0>:
+  if len(data) != 32:
+    return fail()
+
+  result = tweak(result, data)
+
+if len(taptree) != 0:
+  if len(taptree) != 32:
+    return fail()
+
+  result = taptweak(result, taptree)
+
+if target != P2TR(result):
+  return fail()
+
+if flags == 0:
+  out_min_amount[index] += in_ccv_amount[current_input_index]
+elif flags == CCV_FLAG_DEDUCT_OUTPUT_AMOUNT:
+  if outputs[index].amount > in_ccv_amount[current_input_index]:  # output larger than the residual input amount
+    return fail()
+  in_ccv_amount[current_input_index] -= outputs[index].amount
+
+stack.pop(5)  # drop all 5 stack elements
+```
+
+Semantics (deferred checks, performed after all inputs are validated successfully):
+
+```python
+
+  for out_index in range(n_outputs):
+    if outputs[out_index].amount < out_min_amount[out_index]:
+      return fail()
+
+  if an_output_was_used_both_with_default_behavior_and_with_DEDUCT_OUTPUT_AMOUNT_semantics():
+    return fail()
+```
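
The amount logic above can be exercised with a small runnable model. This is a sketch under stated assumptions: it ignores the `scriptPubKey` checks entirely, takes the `OP_CHECKCONTRACTVERIFY` calls on outputs as a list of hypothetical `(input_index, output_index, flags)` tuples in evaluation order, and implements the conflict rule as worded in the prose (an output targeted with `CCV_FLAG_DEDUCT_OUTPUT_AMOUNT` may not be targeted by any other call except with `CCV_FLAG_IGNORE_OUTPUT_AMOUNT`). The function name is illustrative.

```python
from collections import Counter

CCV_FLAG_IGNORE_OUTPUT_AMOUNT = 1
CCV_FLAG_DEDUCT_OUTPUT_AMOUNT = 2

def amounts_are_valid(input_amounts, output_amounts, calls):
    """Model only the amount checks of CCV calls that target outputs."""
    in_ccv_amount = list(input_amounts)          # residual amount per input
    out_min_amount = [0] * len(output_amounts)   # required minimum per output
    default_uses = Counter()
    deduct_uses = Counter()

    for in_index, out_index, flags in calls:
        if flags == 0:
            out_min_amount[out_index] += in_ccv_amount[in_index]
            default_uses[out_index] += 1
        elif flags == CCV_FLAG_DEDUCT_OUTPUT_AMOUNT:
            if output_amounts[out_index] > in_ccv_amount[in_index]:
                return False
            in_ccv_amount[in_index] -= output_amounts[out_index]
            deduct_uses[out_index] += 1
        elif flags != CCV_FLAG_IGNORE_OUTPUT_AMOUNT:
            raise ValueError("only output flags 0, 1 and 2 are modelled")

    # Deferred check 1: every output covers the amounts assigned to it.
    for out_index, amount in enumerate(output_amounts):
        if amount < out_min_amount[out_index]:
            return False

    # Deferred check 2: a DEDUCT target has no other amount-checking call.
    for out_index, uses in deduct_uses.items():
        if uses > 1 or default_uses[out_index] > 0:
            return False
    return True

# One input of 100: deduct output 0 (60), send the residual 40 to output 1.
ok = amounts_are_valid([100], [60, 40], [(0, 0, CCV_FLAG_DEDUCT_OUTPUT_AMOUNT), (0, 1, 0)])
# Same calls, but output 1 only holds 30: the residual is not preserved.
short = amounts_are_valid([100], [60, 30], [(0, 0, CCV_FLAG_DEDUCT_OUTPUT_AMOUNT), (0, 1, 0)])
```

Here `ok` is `True` and `short` is `False`, matching the intent that the residual value of an input must be accounted for by the outputs it is checked against.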
+
+## Common patterns
+
+Here are some examples of the most common combinations of parameters.
+
+### Check that some data is embedded in the current input
+
+This is typically used to check data that was committed to in the output that the current input is spending, when that output was produced by an earlier covenant-encumbered spend.
+
+```
+<data=data> <index=-1> <pk=naked_pk> <taptree=-1> <flags=CCV_FLAG_CHECK_INPUT> CCV
+```
+
+### Check that a certain output with index `out_i` is a certain contract with specified data, preserving input amount
+
+This might be used for a 1-to-1 or many-to-1 covenant-encumbered spend: one or several inputs are spent to an output with certain code and data.
+
+```
+<data=data> <index=out_i> <pk=output_naked_pk> <taptree=output_taptree> <flags=0> CCV 
+```
+
+### Check that the output with the same index as the current input is a certain contract with specified data, preserving input amount
+
+This is a common pattern for 1-input-1-output contracts, as it allows flexibility when creating the transaction. Typically, this would be done after checking the current input's data using the [standard pattern](#check-that-some-data-is-embedded-in-the-current-input).
+
+Many spends of this kind could easily be batched in the same transaction, possibly together with other unencumbered inputs/outputs.
+
+```
+<data=data> <index=-1> <pk=output_naked_pk> <taptree=output_taptree> <flags=0> CCV 
+```
+
+### Check that a certain output with index `out_i` is a P2TR with a pubkey `output_pk`, preserving amount
+
+A simpler case where we just want the output to be a certain P2TR output, without any embedded data.
+
+```
+<data=<>> <index=out_i> <pk=output_pk> <taptree=<>> <flags=0> CCV
+```
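
To make the patterns above concrete, the sketch below builds the five stack items for a call. It assumes that the special value `-1` and the numeric `index` and `flags` are pushed as minimally encoded Script numbers (so `-1` is the single byte `0x81`, and `0` is the empty byte string), and that "empty" means the empty byte string; these encodings follow standard Script conventions but are not spelled out on this page. The helper names are hypothetical, and in a real script some items would come from the witness rather than from the script itself.

```python
CCV_FLAG_CHECK_INPUT = -1

def encode_scriptnum(value):
    """Minimal little-endian sign-magnitude encoding used for Script numbers."""
    if value == 0:
        return b""
    negative = value < 0
    magnitude = abs(value)
    out = bytearray()
    while magnitude:
        out.append(magnitude & 0xFF)
        magnitude >>= 8
    if out[-1] & 0x80:
        # The top bit is the sign bit, so an extra byte is needed.
        out.append(0x80 if negative else 0x00)
    elif negative:
        out[-1] |= 0x80
    return bytes(out)

def ccv_stack_items(data, index, pk, taptree, flags):
    """Return [data, index, pk, taptree, flags] as byte strings, bottom to top.

    Integers are encoded as Script numbers; byte strings are used as-is.
    """
    def item(value):
        return encode_scriptnum(value) if isinstance(value, int) else bytes(value)
    return [item(v) for v in (data, index, pk, taptree, flags)]

# "Check that some data is embedded in the current input", with placeholder values.
example_data = bytes(32)
example_naked_pk = bytes.fromhex("50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0")
check_input_items = ccv_stack_items(example_data, -1, example_naked_pk, -1, CCV_FLAG_CHECK_INPUT)
```

In this example the `index`, `taptree` and `flags` items are all the single byte `0x81`, while `data` and `pk` are the 32-byte values.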
+
+
+## Advanced patterns
+
+The examples in this section cover some less common use cases of `OP_CHECKCONTRACTVERIFY`.
+
+### Check that some other input with index `in_i` is a specific contract with embedded data
+
+This makes it possible to "read" the data of another input.
+
+```
+<data=input_data> <index=in_i> <pk=input_i_naked_pk> <taptree=input_taptree> <flags=CCV_FLAG_CHECK_INPUT> CCV
+```
+
+### Subtract the amount of output `out_i` from the current input
+
+This checks the _data_ and _program_ of an output, and subtracts the value of this output from the value of the current input. The residual value of the current input will be used in further calls to `OP_CHECKCONTRACTVERIFY`.
+
+This enables the pattern of sending some amount to one or more specified destinations, and then separately deciding where to send any residual value.
+
+```
+<data=data> <index=out_i> <pk=output_naked_pk> <taptree=output_taptree> <flags=CCV_FLAG_DEDUCT_OUTPUT_AMOUNT> CCV 
+```
+
+### Check that a certain output with index `out_i` is a certain contract with specified data; don't check amount
+
+This could be used to check _data_ and _program_ of an output, but not its amount (which might be either irrelevant, or is checked via a different introspection opcode).
+
+```
+<data=data> <index=out_i> <pk=output_naked_pk> <taptree=output_taptree> <flags=CCV_FLAG_IGNORE_OUTPUT_AMOUNT> CCV 
+```
+
+### Check that the input is sent exactly to the same scriptPubKey
+
+This requires that the output with the same index as the current input has exactly the same `scriptPubKey` as the input, and carries at least the amount of the input (the default amount check).
+
+```
+<data=<>> <index=-1> <pk=-1> <taptree=-1> <flags=0> CCV
+```
diff --git a/docs/contracts.md b/docs/contracts.md
@@ -0,0 +1,155 @@
+# Introduction
+
+MATT makes it possible to define on-chain protocols in the UTXO model by using covenant encumbrances.
+
+In such protocols, the UTXOs themselves contain the _state_ of the contract. The state is updated by spending some UTXOs and producing new UTXOs with a different state - with rules that are encoded in the UTXO itself.
+
+This page documents the general framework that is used in the demos.
+
+See [checkcontractverify.md](checkcontractverify.md) for the semantics of `OP_CHECKCONTRACTVERIFY` and the concept of _augmented_ UTXOs.
+
+## Contracts, programs and clauses
+
+The internal pubkey (or the _naked_ pubkey for an augmented P2TR), together with the taptree, constitutes the ___program___ of the contract, which encodes all the spending conditions of the contract.
+
+An actual UTXO whose `scriptPubKey` is a program, possibly with some specified embedded _data_, is a ___contract instance___.
+
+We call ___clause___ each of the spending conditions in the taptree. Each clause might also specify the state transition rules, by defining the program of one or more of the outputs.<br>
+The keypath, if not a NUMS (Nothing-Up-My-Sleeve) point, can also be considered an additional special clause with no condition on the outputs.
+
+### Merkleized data
+
+While the embedded _data_ of the contract is a 32-byte value, it is always possible to represent arbitrary collections of values and use the 32-byte slot to store a _commitment_ to the data.
+
+We can think of a contract instance as a UTXO that _stores_ some arbitrary state. The exact representation of the state is an implementation detail, but here are some rules of thumb:
+
+- A single 32-byte value is stored as-is.
+- For a single value that is not 32 bytes long, its SHA256 hash is used as the embedded data.
+- If multiple values are part of the contract state, they can be encoded as the leaves of a [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree), and only the root hash is stored.
+
+___Remark___: other ways of committing to a collection of values are possible, and sometimes more efficient. For example, _the SHA256-hash of the concatenation of the SHA256-hashes of the elements_ is indeed more efficient than Merkle trees if all the values need to be revealed anyway. Care is necessary as __not all ways of concatenating/hashing a collection of elements are safe__, as some are prone to collisions.
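
The commitment options above can be compared with a short sketch. The Merkle tree layout used here (a `0x00` prefix for leaves, a `0x01` prefix for inner nodes, and promotion of an unpaired last node) is an assumption chosen for illustration and is not necessarily the encoding used by the examples in this repository. The function names are hypothetical; `unsafe_concat_commitment` is included only to show the kind of collision the remark warns about.

```python
import hashlib

def sha256(data):
    return hashlib.sha256(data).digest()

def merkle_root(values):
    """Root of a binary Merkle tree with domain-separated leaves and nodes."""
    if not values:
        raise ValueError("at least one value is required")
    level = [sha256(b"\x00" + v) for v in values]
    while len(level) > 1:
        paired = [sha256(b"\x01" + level[i] + level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])  # unpaired node is promoted unchanged
        level = paired
    return level[0]

def hash_of_hashes(values):
    """SHA256 of the concatenation of the SHA256 hashes of the elements."""
    return sha256(b"".join(sha256(v) for v in values))

def unsafe_concat_commitment(values):
    """UNSAFE: hashing the raw concatenation loses the element boundaries."""
    return sha256(b"".join(values))

state_a = [b"ab", b"c"]
state_b = [b"a", b"bc"]

collision = unsafe_concat_commitment(state_a) == unsafe_concat_commitment(state_b)
distinct = hash_of_hashes(state_a) != hash_of_hashes(state_b)
root = merkle_root([b"alice", b"bob", b"carol"])
```

Here `collision` is `True`, because both states concatenate to the same bytes, while `distinct` is `True`, because hashing each element first fixes the element boundaries.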
+
+## Smart contracts as finite state machines
+
+Clauses of a contract can specify that certain outputs must be instances of certain other contracts, and can also constrain their embedded data.
+
+This makes it possible to represent UTXO-based smart contract protocols as finite state machines, where each node represents a contract, and its clauses specify the transitions to 0 or more other contracts.
+
+For many constructions, spending the UTXO of a contract produces one or more pre-determined contracts as its output UTXOs. In this case, the resulting diagram is acyclic and is (or can be reduced to) a *Directed Acyclic Graph* (DAG).
+
+Some contracts might have an output with the _same_ contract as the input being spent. In that case, the diagram is not a DAG, but only loops from a node to itself are allowed; this avoids impossible hash cycles.
+
+Here's an example of a [vault](https://github.com/Merkleize/pymatt/tree/master/examples/vault):
+
+```mermaid
+graph LR
+    VAULT -->|recover| R[ ]
+    VAULT -->|"trigger\n(ctv_hash)"| UNVAULTING
+    VAULT ---|"trigger_and_revault\n(ctv_hash)"| X( )
+    X --> VAULT
+    X --> UNVAULTING
+    style X display: none;
+
+    UNVAULTING("UNVAULTING\n[ctv_hash]")
+    UNVAULTING -->|recover| R[ ]
+    UNVAULTING -->|withdraw| D[ ]
+
+    classDef contract stroke:#333,stroke-width:2px;
+    class VAULT,UNVAULTING contract
+```
+
+***Remark***: this diagram represents the possible states and transitions of each individual UTXO. For some constructions, the entire smart contract resides in a single UTXO; however, other constructions might require the existence of multiple UTXOs, which could interact in some spending conditions.
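
The structural rule stated earlier (the diagram is a DAG except for loops from a node to itself) can be checked mechanically. The sketch below encodes the vault diagram as a plain dictionary and verifies the rule; the dictionary layout and the function name are illustrative choices, not part of the framework.

```python
# contract -> clause -> list of destination contracts
VAULT_TRANSITIONS = {
    "VAULT": {
        "recover": ["P2TR{recover_pk}"],
        "trigger": ["UNVAULTING"],
        "trigger_and_revault": ["VAULT", "UNVAULTING"],
    },
    "UNVAULTING": {
        "recover": ["P2TR{recover_pk}"],
        "withdraw": [],  # destination is not encumbered by the covenant
    },
}

def only_self_loops(transitions):
    """Return True if the only cycles are edges from a contract to itself."""
    # Successors of each contract, with self-loops removed.
    successors = {
        contract: {dest for dests in clauses.values() for dest in dests
                   if dest != contract}
        for contract, clauses in transitions.items()
    }
    state = {}  # node -> "visiting" or "done"

    def has_cycle(node):
        if state.get(node) == "visiting":
            return True
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(has_cycle(nxt) for nxt in successors.get(node, ()))
        state[node] = "done"
        return found

    return not any(has_cycle(node) for node in successors)

vault_is_valid = only_self_loops(VAULT_TRANSITIONS)
two_node_cycle_is_valid = only_self_loops({"A": {"go": ["B"]}, "B": {"back": ["A"]}})
```

The vault passes the check because its only cycle is the `trigger_and_revault` edge from `VAULT` back to `VAULT`; the two-contract cycle does not, since each contract would need to commit to the other's program.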
+
+## Definitions
+
+In this section we will define a pseudocode notation to describe MATT contracts.
+
+Naming conventions:
+  - _parameters_: decided at contract creation time, hardcoded in the Script.
+  - _variables_: data stored in the UTXO instance, accessible to Script via `OP_CHECKCONTRACTVERIFY`.
+  - _arguments_: passed via the witness during transitions (script spending paths)
+
+
+We represent a contract with the following notation:
+
+```
+ContractName{params}[vars]
+```
+
+where:
+  - `ContractName`, in CamelCase, is the name of the contract
+  - `params`: the compile-time list of parameters of the contract
+  - `vars`: the list of variables (concretely stored in the data commitment of the covenant, aka the _state_ of the contract)
+
+`params` and `vars` should be omitted if empty. Moreover, for notational simplicity we prefer to omit (and list separately) the *global* parameters that are unchanged for all the contracts in the diagram.
+
+We call *clause* each spending condition of a contract. Each clause has a name (in lowercase, in snake_case if multiple words).
+
+Transition notation:
+```
+    clause_name(args) => out_i: Contract{contract_params}[contract_vars]
+```
+if only a single output contract is produced by this clause, or:
+
+```
+    clause_name(args) => [
+        out1_i: Contract1{contract1_params}[contract1_vars],
+        out2_i: Contract2{contract2_params}[contract2_vars]
+    ]
+```
+
+`out_i` is the index of the output that must match the contract. If omitted (allowed for at most one of the outputs), it must be equal to the input index being spent.
+
+
+where:
+ - `args`: the arguments of the clause, passed via the witness stack.
+ - `=> Contract...`: the destination contract of this clause. Omitted if the spending condition is not encumbered by the covenant. `contract_params` can only depend on the `params` of the current contract. `contract_vars` can depend on the `params` and the `vars` of the current contract, and also on the arguments `args`.
+
+The spending condition can be any predicate that can be expressed in Script, with access to all the `params`, `vars` and `args`.
+
+_Note_: this ignores the technical details of how to encode/decode the state variables to/from a single hash; that is an implementation detail that can safely be left out when discussing the semantics of a smart contract.
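
The notation above is regular enough to be generated from structured data. The following is a small sketch with hypothetical helper names, assuming that a clause without arguments is written without parentheses (as `recover` is in the vault example); it only produces the textual notation and has no connection to the Python classes used by the examples.

```python
def format_contract(name, params=(), variables=()):
    """Render ContractName{params}[vars], omitting empty parts."""
    text = name
    if params:
        text += "{" + ", ".join(params) + "}"
    if variables:
        text += "[" + ", ".join(variables) + "]"
    return text

def format_transition(clause, args=(), outputs=()):
    """Render a clause and its destination contracts.

    `outputs` is a list of (output_index_name, contract_text) pairs; the
    index name is None when the output index equals the input index.
    """
    def one(out_index, contract):
        return contract if out_index is None else f"{out_index}: {contract}"

    head = clause if not args else f"{clause}({', '.join(args)})"
    if not outputs:
        return head  # spending condition not encumbered by the covenant
    if len(outputs) == 1:
        return f"{head} => {one(*outputs[0])}"
    body = ",\n".join("    " + one(i, c) for i, c in outputs)
    return f"{head} => [\n{body}\n]"

unvaulting = format_contract("Unvaulting", variables=["ctv_hash"])
trigger = format_transition("trigger", ["ctv_hash", "out_i"], [("out_i", unvaulting)])
recover = format_transition("recover", outputs=[(None, format_contract("P2TR", params=["recover_pk"]))])
```

With these inputs, `trigger` is the string `trigger(ctv_hash, out_i) => out_i: Unvaulting[ctv_hash]` and `recover` is `recover => P2TR{recover_pk}`.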
+
+### Default contract
+
+The contract `P2TR{pk}` is equivalent to the output script descriptor `tr(pk)`.
+
+### Example: Vault
+
+With the above conventions, we can model the Vault contract drawn above as follows:
+
+Global parameters:
+  - `unvault_pk`: a public key that can trigger a withdrawal
+  - `spend_delay`: the number of blocks triggered coins have to wait before the final withdrawal
+  - `recover_pk`: a public key for a P2TR address that coins will be sent to if the *recover* clause is used.
+
+
+```
+global unvault_pk
+global recover_pk
+global spend_delay
+
+
+Vault:
+  trigger(ctv_hash, out_i) => [out_i: Unvaulting[ctv_hash]]:
+    checksig(unvault_pk)
+
+  trigger_and_revault(ctv_hash, revault_out_i, trigger_out_i) => [
+    deduct revault_out_i: Vault,
+    trigger_out_i: Unvaulting[ctv_hash]
+  ]:
+    checksig(unvault_pk)
+
+  recover => P2TR{recover_pk}:
+    pass
+
+
+Unvaulting[ctv_hash]:
+  withdraw:
+    older(spend_delay)
+    ctv(ctv_hash)
+
+  recover => P2TR{recover_pk}:
+    pass
+```
+
+A matching Python implementation can be found in [vault_contracts.py](../examples/vault/vault_contracts.py).