@mikeday mikeday commented Dec 18, 2023

The compilation of a format that ends with a union or a repeat depends on the format that follows it, because the following format may influence the match tree used for lookahead. Initially, therefore, we compiled each format into multiple decoders, one for each possible "next".

This pull request instead compiles each format to a single decoder, taking the union of all its "nexts". I think this is sound: if it's valid for F to be followed by A and valid for F to be followed by B, then it should be valid for F to be followed by (A|B).
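To make the difference concrete, here is a minimal sketch in Python rather than the project's own code. The use sites, the function names, and the string stand-ins for decoders are all hypothetical; it only illustrates how the number of decoders changes when the "nexts" are unioned.

```python
from collections import defaultdict

# Hypothetical use sites: (format name, name of the format that follows it).
# None of these names come from the pull request itself.
USES = [
    ("F", "A"),
    ("F", "B"),
    ("G", "A"),
]


def compile_per_next(uses):
    """Earlier scheme: one decoder per distinct (format, next) pair."""
    return {(fmt, nxt): f"decoder<{fmt} then {nxt}>" for fmt, nxt in uses}


def compile_with_union(uses):
    """Scheme in this PR: one decoder per format, built against the union of its nexts."""
    nexts = defaultdict(set)
    for fmt, nxt in uses:
        nexts[fmt].add(nxt)
    return {
        fmt: f"decoder<{fmt} then ({'|'.join(sorted(ns))})>"
        for fmt, ns in nexts.items()
    }


per_next = compile_per_next(USES)
unioned = compile_with_union(USES)
print(len(per_next), len(unioned))
print(unioned["F"])
```

Note that `compile_with_union` has to see every use site before it can emit the decoder for `F`, since the set of "nexts" is only known once all uses have been collected.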

It's nice to create exactly one decoder per format. However, this still requires "whole program analysis", in the sense that a format cannot be compiled independently of how it is used, as you would hope a function or module could be.

Also, the code feels slightly fragile because it relies on some subtle invariants on the decoder indices; that could probably be improved a little.
