-
Just to clarify and drive home an important point: The memory used to represent the inferred types from Simplicity programs is linear in the size of the program, but only because of the natural sharing of type subexpressions induced by the type inference algorithm. The unshared size of inferred types can be exponential in the size of the program. So it is, in fact, the clever sharing that saves us here.
-
This post is a semi-structured dump of some thoughts I've had while trying to combinatorially generate Simplicity structures.
Simplicity types live in a set $\mathbb{T}$ which admits a very simple recursive definition: the unit type $\mathbf{1}$ is in $\mathbb{T}$, and whenever $A$ and $B$ are in $\mathbb{T}$, so are the sum $A + B$ and the product $A \times B$.
Then for each type there is a natural notion of size $t \mapsto |t|$ which expresses the bitlength of a value of type $t$ when encoded in the bit machine: $|\mathbf{1}| = 0$, $|A + B| = 1 + \max(|A|, |B|)$, and $|A \times B| = |A| + |B|$.
This has the curious effect that for any particular size $n$ there are infinitely many types whose size is $n$, because you can always take arbitrarily many products with the unit type, which does not increase the bitlength of a type.
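For instance, $|A| = |A \times \mathbf{1}| = |(A \times \mathbf{1}) \times \mathbf{1}| = \cdots$, since $|\mathbf{1}| = 0$ and products just add bitlengths.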
There are two other notions of size which may be relevant:
Naively, this looks alarming -- we have three notions of size which are decoupled from each other, and the "in-memory" size may be exponentially larger than the "on-chain" size reflected in the Simplicity cost model. In fact, there is no problem here -- it turns out that every Simplicity combinator introduces at most 1 free type and at most 4 total type bounds (each of which can, at most, "turn into" a single sum or product). And since every combinator requires some minimum number of bits to encode (I think 4), the in-memory footprint of the types inferred for a given program is actually linear in the encoded size of that program. Phew.
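For a concrete instance (sketching how I understand inference, so the exact bookkeeping may differ): for $\mathsf{pair}\ s\ t$, inference equates the sources of $s$, $t$, and the composite, and bounds the composite's target by the product of the targets of $s$ and $t$ -- that single product is the only new type node the combinator can create.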
This actually introduces yet another notion of the "size" of a type -- how many bounds were required to construct it. This is a murky definition, since you can't really assign a particular bound to a particular type. But when type-inferring a Simplicity program, the number of bounds is proportional to the amount of memory used during inference.
From a software testing point of view, we would like to generate a diverse set of types (and values which inhabit them). For simplicity's sake it would be nice if this generation looked like "pick a size $n$ and recursively generate a type of size $n$". To do this we need to pick a notion of "size". For the sake of not blowing up our test harness, the most natural choice would be the "in-memory size", although of all the sizes above, this is the only one that doesn't reflect some on-chain property of a Simplicity program or value.
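To make that concrete, here is a minimal Rust sketch (hypothetical -- the `Ty` enum and `gen_type` function aren't part of any existing library) that generates a type with exactly `n` sum/product nodes. Note that it is not uniform over types of that size; a real harness would probably want to count types of each size (or use something like a Boltzmann sampler) to fix that.

```rust
use rand::Rng;

/// Unshared (tree) representation of a Simplicity type.
#[derive(Debug, Clone)]
enum Ty {
    Unit,
    Sum(Box<Ty>, Box<Ty>),
    Product(Box<Ty>, Box<Ty>),
}

/// Generate a random type with exactly `n` Sum/Product nodes (the
/// "in-memory" size, ignoring sharing). Not uniform over types of
/// size `n`: the split point and constructor are chosen independently.
fn gen_type<R: Rng>(rng: &mut R, n: usize) -> Ty {
    if n == 0 {
        return Ty::Unit;
    }
    // Spend one node on this constructor, then split the remaining
    // n - 1 nodes between the two children.
    let left = rng.gen_range(0..n); // 0 ..= n - 1
    let right = n - 1 - left;
    let a = Box::new(gen_type(rng, left));
    let b = Box::new(gen_type(rng, right));
    if rng.gen_bool(0.5) {
        Ty::Sum(a, b)
    } else {
        Ty::Product(a, b)
    }
}

fn main() {
    let mut rng = rand::thread_rng();
    println!("{:?}", gen_type(&mut rng, 6));
}
```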
An interesting alternate strategy might be to combinatorially generate bounds, and from there produce types, and from there choose values (which is "easy" to do uniformly once you know the target type). This just occurred to me as I was typing this, and it might also resolve another problem: how can we combinatorially generate well-typed Simplicity programs? The problem with this is that while there's a simple recursive definition of "untyped Simplicity", there is no recursive definition (I'm pretty sure) of "well-typed Simplicity". So while there's a wealth of literature on efficiently generating recursive structures, it is inapplicable to the problem of generating well-typed Simplicity, so you wind up doing a lot of very inefficient backtracking and type inference and re-inference.
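And for the "choose values uniformly once you know the target type" step, a sketch of the obvious approach (again with hypothetical names, and with the caveat that the counts overflow quickly for wide types): count the inhabitants of each type and weight each sum branch by its count.

```rust
use rand::Rng;

/// Same `Ty` as in the previous sketch.
#[derive(Debug, Clone)]
enum Ty {
    Unit,
    Sum(Box<Ty>, Box<Ty>),
    Product(Box<Ty>, Box<Ty>),
}

/// Values inhabiting a `Ty`.
#[derive(Debug, Clone)]
enum Val {
    Unit,
    Left(Box<Val>),
    Right(Box<Val>),
    Pair(Box<Val>, Box<Val>),
}

/// Number of inhabitants: 1 for the unit type, count(A) + count(B) for
/// sums, count(A) * count(B) for products. A u128 overflows quickly for
/// wide types; a real generator would use bignums or sample bit-by-bit.
fn count(ty: &Ty) -> u128 {
    match ty {
        Ty::Unit => 1,
        Ty::Sum(a, b) => count(a) + count(b),
        Ty::Product(a, b) => count(a) * count(b),
    }
}

/// Sample a value of `ty` uniformly at random by weighting each sum
/// branch by how many values it contains.
fn gen_value<R: Rng>(rng: &mut R, ty: &Ty) -> Val {
    match ty {
        Ty::Unit => Val::Unit,
        Ty::Sum(a, b) => {
            if rng.gen_range(0..count(a) + count(b)) < count(a) {
                Val::Left(Box::new(gen_value(rng, a)))
            } else {
                Val::Right(Box::new(gen_value(rng, b)))
            }
        }
        Ty::Product(a, b) => {
            Val::Pair(Box::new(gen_value(rng, a)), Box::new(gen_value(rng, b)))
        }
    }
}

fn main() {
    let mut rng = rand::thread_rng();
    // A two-bit word: (1 + 1) x (1 + 1).
    let bit = || Ty::Sum(Box::new(Ty::Unit), Box::new(Ty::Unit));
    let two_bits = Ty::Product(Box::new(bit()), Box::new(bit()));
    println!("{:?}", gen_value(&mut rng, &two_bits));
}
```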
An alternate strategy is: