Skip to content

Added docs on CCV specs, and the general framework #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
This folder contains high level documentation on the MATT framework, and the core concepts used in the examples of smart contracts implemented in this repository.

- [matt](./matt.md)<br>Start here for a general introduction to MATT.
- [contracts](./contracts.md)<br>How to build smart contracts based on state machines
- [checkcontractverify](./checkcontractverify.md)<br>Draft specs for the `OP_CHECKCONTRACTVERIFY` opcode.
215 changes: 215 additions & 0 deletions docs/checkcontractverify.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,215 @@
# Introduction

The content of this page describes the current semantics of `OP_CHECKCONTRACTVERIFY`.

Together with `OP_CAT`, it is a straightforward implementation of MATT in bitcoin.

## P2TR and _augmented_ P2TR

In a P2TR scriptPubKey, the output key is computed from an _internal pubkey_ `pk` and a `taptree`, as:

```
output_key = taproot_tweak(pk, taptree)
```

We call an _augmented_ P2TR any P2TR where the _internal pubkey_ is, in turn, computed from a _naked pubkey_ `naked_pk`, tweaked with some embedded `data`:

```
pk = tweak(naked_pk, data)
```

`OP_CHECKCONTRACTVERIFY` allows to verify that the `scriptPubkey` of an input or an output is a certain P2TR Script, possibly _augmented_ with some embedded data.

The embedded data is a 32-byte value.


## `OP_CHECKCONTRACTVERIFY`

This section describes the semantics of the `OP_CHECKCONTRACTVERIFY` opcode, as currently implemented in the [docker container for MATT](https://github.com/Merkleize/docker).

### Description

`OP_CHECKCONTRACTVERIFY` is only active for scripts spending a Segwit version 1 input.

Get `data`, `index`, `pk`, `taptree`, `flags` from the stack (bottom-to-top).

`OP_CHECKCONTRACTVERIFY` verifies that the scriptPubKey of the input/output with the given `index` is a P2TR script with a pubkey obtained by the x-only pubkey `pk`, optionally tweaked with `data`, optionally taptweaked with `taptree`. The `CIOCV_FLAG_CHECK_INPUT` determines if the `index` refers to an input or an output. Special values for the parameters, are listed below.

The `flags` parameter alters the behaviour of the opcode. If negative, the opcode checks the `scriptPubkey` of an input; otherwise, it checks the `scriptPubkey` of an output. The following value for the `flags` is currently the only one defined for inputs:

- `CCV_FLAG_CHECK_INPUT = -1`: makes the opcode check an input.

Non-negative values make the opcode check an output, and different values have different behaviour in the way the output's amount (`nValue`) is checked. The following values for the `flags` are currently defined for checking an output:

- `0`: default behavior, the (possibly residual) amount of this input must be present in the output. This amount
- `CCV_FLAG_IGNORE_OUTPUT_AMOUNT = 1`: For outputs, disables the default deferred checks on amounts defined below. Undefined when `CCV_FLAG_CHECK_INPUT` is present.
- `CCV_FLAG_DEDUCT_OUTPUT_AMOUNT = 2`: Fail if the amount of the output is larger than the amount of the input; otherwise, subtracts the value of the output from the value of the current input in future calls top `OP_CHECKCONTRACTVERIFY`.

The following values of the parameters are special values:
- If `pk` is empty, it is replaced with the NUMS x-only pubkey `0x50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0` defined in [BIP-0340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki).
- If `pk` is `-1`, it is replaced with the current input's internal key.
- If `index` is `-1`, it is replaced with the current input index.
- If `data` is empty, the data tweak is skipped.
- If `taptree` is empty, the taptweak is skipped.
- If `taptree` is `-1`, the taptree of the current input is used for the taptweak.

The following additional deferred checks are performed after the validation of all inputs is completed:
- The amount of each output must be at least equal to the sum of the amount of all the inputs that have a `CCV` for that output with the default flag (equal to `0`).
- No output that is a target of `CCV_FLAG_DEDUCT_OUTPUT_AMOUNT` can also be the target of another `OP_CHECKCONTRACTVERIFY`, unless it's with the `CCV_FLAG_IGNORE_OUTPUT_AMOUNT`.

### Pseudocode

Semantics (initialization before input evaluation):
```python
for in_index in range(n_inputs)
in_ccv_amount[in_index] = inputs[in_index].amount

for out_index in range(n_outputs)
out_min_amount[out_index] = 0
```


Semantics (per input):

```python
if flags < CCV_FLAG_CHECK_INPUT or flags > CCV_FLAG_DEDUCT_OUTPUT_AMOUNT:
return success() # undefined flags are OP_SUCCESS

if index == -1:
index = current_input_index

if flags == CCV_FLAG_CHECK_INPUT:
if index < 0 or index >= n_inputs:
return fail()

target = inputs[index].scriptPubKey
else:
if index < 0 or index >= n_outputs:
return fail()

target = outputs[index].scriptPubKey

if taptree == <-1>:
taptree = current_input_taptree

if pk == <0>:
result = BIP340_NUMS_KEY
elif pk == <-1>:
result = current_input_internal_key
elif len(pk) == 32:
result = pk
else:
return fail()

if data != <0>:
if len(data) != 32:
return fail()

result = tweak(result, data)

if len(taptree) != 0:
if len(taptree) != 32:
return fail()

result = taptweak(result, taptree)

if target != P2TR(result)
return fail()

if flags == 0:
out_min_amount[index] += in_ccv_amount[current_input_index]
elif flags == CCV_FLAG_DEDUCT_OUTPUT_AMOUNT:
if in_ccv_amount[current_input_index] > outputs[index].amount:
return fail()
in_ccv_amount[current_input_index] -= outputs[index].amount

stack.pop(5) # drop all 5 stack elements
```

Semantics (deferred, checks after all inputs are validated successfully):

```python

for out_index in range(n_outputs):
if outputs[out_index].amount < out_min_amount[out_index]:
return fail()

if an_output_was_used_both_with_default_behavior_and_with_DEDUCT_OUTPUT_AMOUNT_semantics():
return fail()
```

## Common patterns

Here are some examples for the most common combination of parameters.

### Check that some data is embedded in the current input

This is used to check data that was typically committed to in an output from a covenant-encumbered spend that produced the current input.

```
<data=data> <index=-1> <pk=naked_pk> <taptree=-1> <flags=CCV_FLAG_CHECK_INPUT> CCV
```

### Check that a certain output with index `out_i` is a certain contract with specified data, preserving input amount

This might be used for a 1-to-1 or many-to-1 covenant-encumbered spend: one or several inputs are spent to an output with certain code and data.

```
<data=data> <index=out_i> <pk=output_naked_pk> <taptree=output_taptree> <flags=0> CCV
```

### Check that the output with the same index as the current input is a certain contract with specified data, preserving input amount

This is a common pattern for 1-input-1-output contracts, as it allows flexibility when creating the transaction. Typically, this would be one after checking the current input's data using the [standard pattern](#check-that-some-data-is-embedded-in-the-current-input).

Many spends of this kind could easily be batched in the same transaction, possibly together with other unencumbered inputs/outputs.

```
<data=data> <index=-1> <pk=output_naked_pk> <taptree=output_taptree> <flags=0> CCV
```

### Check that a certain output with index `out_i` is a a P2TR with a pubkey `output_pk`, preserving amount:

A simpler case where we just want the output to be a certain P2TR output, without any embedded data.

```
<data=<>> <index=out_i> <pk=output_pk> <taptree=<>> <flags=0> CCV
```


## Advanced patterns

The examples in this section use some less common use cases of `OP_CHECKCONTRACTVERIFY`.

### Check that some other input with index `in_i` is a specific contract with embedded data:

This allows to "read" the data of another input.

```
<data=input_data> <index=in_i> <pk=input_i_naked_pk> <taptree=input_taptree><flags=CCV_FLAG_CHECK_INPUT> CCV
```

### Subtract the amount of output `out_i` from the current input

This checks the _data_ and _program_ of an output, an subtracts the value of this output from the value of the current input. The residual value of the current input will be used in further calls to `OP_CHECKCONTRACTVERIFY`.

This allows the pattern of sending some amount to one or more specified destination, and then separately decide where to send any residual value.

```
<data=data> <index=out_i> <pk=output_naked_pk> <taptree=output_taptree> <flags=CCV_FLAG_DEDUCT_OUTPUT_AMOUNT> CCV
```

### Check that a certain output with index `out_i` is a certain contract with specified data; don't check amount

This could be used to check _data_ and _program_ of an output, but not its amount (which might be either irrelevant, or is checked via a different introspection opcode).

```
<data=data> <index=out_i> <pk=output_naked_pk> <taptree=output_taptree> <flags=CCV_FLAG_IGNORE_OUTPUT_AMOUNT> CCV
```

### Check that the input is sent exactly to the same scriptPubKey
This requires that the output with the same index as the current input is exactly the same script, and with the same amount.

```
<data=<>> <index=-1> <pk=-1> <taptree=-1> <flags=0> CCV
```
155 changes: 155 additions & 0 deletions docs/contracts.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,155 @@
# Introduction

MATT allows to define on-chain protocols in the UTXO model by using the covenant encumbrance.

In such protocols, the UTXOs themselves contain the _state_ of the contract. The state is updated by spending some UTXOs and producing new UTXOs with a different state - with rules that are encoded in the UTXO itself.

This page documents the general framework that is used in the demos.

See [checkcontractverify.md](checkcontractverify.md) for the semantics of `OP_CHECKCONTRACTVERIFY` and the concept of _augmented_ UTXOs.

## Contracts, programs and clauses

The internal pubkey (or the _naked_ pubkey for an augmented P2TR), together with the taptree, constitutes the ___program___ of the contract, which encodes all the spending conditions of the contract.

An actual UTXO whose `scriptPubKey` is a program, possibly with some specified embedded _data_, is a ___contract instance___.

We call ___clause___ each of the spending conditions in the taptree of. Each clause might also specify the state transition rules, by defining the program of one or more of the outputs.<br>
The keypath, if not a NUMS (Nothing-Up-My-Sleeve) point, can also be considered an additional special clause with no condition on the outputs.

### Merklelized data

While the embedded _data_ of the contract is a 32-byte value, it is always possible to represent arbitrary collections of value and use the 32-byte slot to store a _commitment_ to the data.

We can think as a contract instance as a UTXO that _stores_ some arbitrary state. The exact representation of the state is an implementation detail, but here are some rules of thumb:

- A single 32-byte value is stored as-is.
- For a single value that is not 32-bytes long, the SHA256 hash is the embedded data.
- If multiple values are part of the contract state, they can be encoded as the leaves of a [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree), and only the root hash is stored.

___Remark___: other ways of committing to a collection of values are possible, and sometimes more efficient. For example, _the SHA256-hash of the concatenation of the SHA256-hashes of the elements_ is indeed more efficient than Merkle trees if all the values need to be revealed anyway. Care is necessary as __not all ways of concatenating/hashing a collection of elements are safe__, as some are prone to collisions.

## Smart contracts as finite state machines

Clauses of a contract can specify that certain outputs must be certain other contracts, and their embedded data.

This allows to represent UTXO-based smart contract protocols as finite state machines, where each node represents a contract, and its clauses specify the transitions to 0 or more other contracts.

For many constructions, spending the UTXO of a contract produces one or more pre-determined contracts as its output UTXOs. In this case, the resulting diagram is acyclic and is (or can be reduced to) a *Directed Acyclic Graph* (DAG)

Some contracts might have an output with the _same_ contract as the input being spent. In that case, the diagram is not a DAG, but only loops from a node to itself are allowed; this avoids impossible hash cycles.

Here's an example of a [vault](https://github.com/Merkleize/pymatt/tree/master/examples/vault):

```mermaid
graph LR
VAULT -->|recover| R[ ]
VAULT -->|"trigger\n(ctv_hash)"| UNVAULTING
VAULT ---|"trigger_and_revault\n(ctv_hash)"| X( )
X --> VAULT
X --> UNVAULTING
style X display: none;

UNVAULTING("UNVAULTING\n[ctv_hash]")
UNVAULTING -->|recover| R[ ]
UNVAULTING -->|withdraw| D[ ]

classDef contract stroke:#333,stroke-width:2px;
class VAULT,UNVAULTING contract
```

***Remark***: this diagram represents the possible states and transitions of each individual UTXO. For some construction, the entire smart contract resides in a single UTXO; however, other constructions might require the existence of multiple UTXOs, which could interact in some spending conditions.

## Definitions

In this section we will define a pseudocode notation to describe MATT contracts.

Naming conventions:
- _parameters_: decided at contract creation time, hardcoded in the Script.
- _variables_: data stored in the UTXO instance, accessible to Script via `OP_CHECKCONTRACTVERIFY`.
- _arguments_: passed via the witness during transitions (script spending paths)


We represent a contract with the following notation:

```
ContractName{params}[vars]
```

where:
- `ContractName`, in camelcase, is the name of the contract
- `params`: the compile-time list of parameters of the contract
- `vars`: the list of variables (concretely stored in the data commitment of the covenant, aka the _state_ of the contract)

`params` and `vars` should be omitted if empty. Moreover, for notational simplicity we prefer to omit (and list separately) the *global* parameters that are unchanged for all the contract in the diagram.

We call *clause* each spending condition of a contract. Each clause has a name (in lowercase, in snake_case if multiple words)

Transition notation:
```
clause_name(args) => out_i: Contract{contract_params}[contract_vars]
```
if only a single output contract is produced by this clause, or:

```
clause_name(args) => [
out1_i: Contract1{contract1_params}[contract1_vars],
out2_i: Contract2{contract2_params}[contract2_vars]
]
```

`out_i` is the index of the output that must match the contract. If omitted (allowed for at most one of the outputs), it must be equal to the input index being spent.


where:
- `args`: the arguments of the clause, passed via the witness stack.
- `=> Contract...` the destination contract of this clause. Omitted if the spending condition is not encumbered by the covenant. `contract_params` can only depend on the `params` of the current contract. `contract_vars` can depend on the `params` and the `vars` of the current contract, and also on the argument `args`.

The spending condition can be any predicate that can be expressed in Script, with access to all the `params`, `vars` and `args`.

_Note_: this ignores the technical details of how to encode/decode the state variables to/from a single hash; that is an implementation detail that can safely be left out when discussing the semantic of a smart contract.

### Default contract

The contract `P2TR{pk}` is equal to the output script descriptor `tr(pk)`.

### Example: Vault

With the above conventions, we can model the Vault contract drawn above as follows:

Global parameters:
- `unvault_pk`: a public key that can start trigger a withdrawal
- `spend_delay`: the number of blocks triggered coins have to wait before the final withdrawal
- `recover_pk`: a public key for a P2TR address that coins will be sent to if the *recover* clause is used.


```
global unvault_pk
global recover_pk
global spend_delay


Vault:
trigger(ctv_hash, out_i) => [out_i: Unvaulting[ctv_hash]]:
checksig(unvault_pk)

trigger_and_revault(ctv_hash, revault_out_i, trigger_out_i) => [
deduct revault_out_i: Vault,
trigger_out_i: Unvaulting[ctv_hash]
]:
checksig(unvault_pk)

recover => P2TR{recover_pk}:
pass


Unvaulting[ctv_hash]:
withdraw:
older(spend_delay)
ctv(ctv_hash)

recover => P2TR{recover_pk}:
pass
```

A matching Python implementation can be found in [vault_contracts.py](../examples/vault/vault_contracts.py).
Loading