Add Wireshark Lua dissector backend #264
Open
AaronWebster wants to merge 2 commits into
Adds a parallel back end at compiler/back_end/lua/ that turns an Emboss
.emb into a runnable Wireshark Lua dissector. Mirrors the C++ backend's
shape: a py_binary driver, a starlark rule (lua_emboss_library) exposed
from the root build_defs.bzl, a (wireshark)-qualified attribute set, and
golden tests parallel to cpp_golden_test.
Generator highlights:
* One Proto per .emb, one local function per struct/bits, one value
strings table per enum.
* Nested structs dissected via forward-declared dispatch.
* Bit-addressable (`bits`) blocks emitted as masked ProtoFields against
a single container read.
* `--` doc comments become the ProtoField description; `#` hash
comments are ignored.
* Endianness honored via `subtree:add` vs `subtree:add_le`.
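The masked-ProtoField approach for `bits` blocks comes down to one mask per field over the container integer. A minimal sketch in Python (the generator's language), assuming each field is described by a bit offset counted from the least-significant bit and a width; `field_mask`, `format_mask`, and the example layout are hypothetical and not taken from the generator:

```python
def field_mask(bit_offset: int, bit_width: int) -> int:
    """Mask selecting bit_width bits starting at bit_offset (LSB = bit 0)."""
    if bit_offset < 0 or bit_width <= 0:
        raise ValueError("offset must be >= 0 and width must be > 0")
    return ((1 << bit_width) - 1) << bit_offset


def format_mask(mask: int, container_bits: int) -> str:
    """Render a mask as zero-padded hex sized to the container."""
    return "0x{:0{}X}".format(mask, container_bits // 4)


# Hypothetical 8-bit container: a 3-bit field at bit 0, a 5-bit field at bit 3.
layout = [("kind", 0, 3), ("length", 3, 5)]
masks = {
    name: format_mask(field_mask(offset, width), 8)
    for name, offset, width in layout
}
print(masks)
```

With one mask per field, the container only has to be read once and each field is displayed from the same bytes.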
Module-level attributes:
* `[(wireshark) protocol: "name"]`: name of the generated Proto
* `[(wireshark) root: "Struct"]`: which struct dispatches the top
* `[(wireshark) register_on: "..."]`: Wireshark-display-filter-style
string of `<table> == <pattern>` terms separated by `or` / `||`.
Each term becomes a DissectorTable.get(...):add(...) call so
Wireshark routes packets from Ethernet/IP/UDP/TCP layers into the
generated dissector.
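The `register_on` mapping described above can be sketched roughly as follows. This is an illustrative Python sketch, not the generator's actual parser: `parse_register_on`, `registration_lines`, the regular expressions, and the `proto` variable name are all assumptions, and patterns are limited to decimal or `0x`-hex integers.

```python
import re

# One "<table> == <pattern>" term; the pattern is a decimal or 0x-hex integer.
_TERM = re.compile(r"^\s*([A-Za-z0-9_.]+)\s*==\s*(0[xX][0-9A-Fa-f]+|[0-9]+)\s*$")


def parse_register_on(value: str) -> list[tuple[str, int]]:
    """Split a register_on string into (table, pattern) pairs."""
    pairs = []
    for term in re.split(r"\s+or\s+|\s*\|\|\s*", value.strip()):
        match = _TERM.match(term)
        if match is None:
            raise ValueError(f"bad register_on term: {term!r}")
        table, pattern = match.groups()
        base = 16 if pattern.lower().startswith("0x") else 10
        pairs.append((table, int(pattern, base)))
    return pairs


def registration_lines(value: str, proto_var: str) -> list[str]:
    """Emit one DissectorTable registration line of Lua per term."""
    return [
        f'DissectorTable.get("{table}"):add({pattern}, {proto_var})'
        for table, pattern in parse_register_on(value)
    ]


lines = registration_lines("udp.port == 9000 || ethertype == 0x88B5", "proto")
print("\n".join(lines))
```

Rejecting malformed terms with an error, rather than skipping them, keeps a typo in the attribute from silently dropping a registration.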
Struct- and field-level:
* `[(wireshark) filter: "name"]` overrides the auto-generated
Wireshark filter-name segment.
Plumbing:
* New `emboss_lua_library` macro + `lua_emboss_library` rule + aspect
in the root build_defs.bzl, modelled on cc_emboss_library.
* `embossc --generate lua` (in addition to the existing `cc`).
* scripts/regenerate_goldens.py also refreshes the Lua goldens.
Tests:
* compiler/back_end/lua/dissector_generator_test.py — 27 unit tests
covering identifier sanitization, integer-width mapping, register_on
parsing, enum value-strings emission, filter composition, doc-text
extraction, attribute validation, root-struct selection, and nested
struct dispatch.
* lua_golden_test targets in compiler/back_end/lua/BUILD covering
enum, nested_structure, uint_sizes, int_sizes, and the new
wireshark.emb fixture.
514553c to c962481
…k Lua backend
Add an Emboss-expression -> Lua translator so the dissector backend can
handle constructs it previously skipped:
* Conditional (`if`) fields are emitted as `if <cond> then ... end`.
* Variable-length arrays (`T[n]`, with `n` a sibling field) and
dynamically-located fields are emitted with the length/offset
expression translated to Lua.
Sibling field values referenced by a condition, array length, or offset
are captured into `local val_*` reads. Fields whose governing expression
can't be translated are still skipped with a comment, so the generator
always emits valid Lua. Constant-only structs produce byte-identical
output to before.
Add testdata/wireshark_dynamic.emb (and its golden),
expression/conditional/array unit tests, and a TShark smoke test that
loads a generated dissector and checks both branches of a conditional,
length-prefixed message (the test skips itself when tshark is not
installed).
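As a rough illustration of the translate-or-skip behaviour described in this commit, here is a minimal Python sketch. It assumes a toy expression form (integer literals, sibling field names, and `(operator, left, right)` tuples); the real Emboss IR is richer. `to_lua`, `emit_conditional`, the `status` field, and the exact wording of the skip comment are hypothetical.

```python
from typing import Optional, Union

# Hypothetical mini-IR: an int literal, a sibling field name (str), or an
# (operator, left, right) tuple.
Expr = Union[int, str, tuple]

# Emboss binary operators and their Lua spellings.
_LUA_OPS = {
    "==": "==", "!=": "~=", "<": "<", "<=": "<=", ">": ">", ">=": ">=",
    "+": "+", "-": "-", "*": "*", "&&": "and", "||": "or",
}


def to_lua(expr: Expr) -> Optional[str]:
    """Translate expr to Lua source, or return None if unsupported."""
    if isinstance(expr, bool):
        return "true" if expr else "false"
    if isinstance(expr, int):
        return str(expr)
    if isinstance(expr, str):
        return "val_" + expr  # sibling value captured into a local
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] in _LUA_OPS:
        left, right = to_lua(expr[1]), to_lua(expr[2])
        if left is None or right is None:
            return None
        return f"({left} {_LUA_OPS[expr[0]]} {right})"
    return None


def emit_conditional(field: str, cond: Expr, body: str) -> str:
    """Wrap body in a Lua `if`, or emit a skip comment if cond is unsupported."""
    lua_cond = to_lua(cond)
    if lua_cond is None:
        return f"-- skipped {field}: untranslatable condition"
    return f"if {lua_cond} then\n  {body}\nend"


print(emit_conditional("error_code", ("!=", "status", 0),
                       "subtree:add(f_error_code, buf(2, 1))"))
```

Returning `None` for anything unrecognised is what lets the caller fall back to a comment, so the emitted file stays valid Lua whichever expressions appear.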
Summary
Adds a parallel back end at `compiler/back_end/lua/` that turns an Emboss `.emb` definition into a runnable Wireshark Lua dissector. Mirrors the C++ backend's shape (driver, Starlark rule, golden tests) so the new backend is invoked exactly like its C++ sibling.
How layered protocols work
Wireshark already dissects Ethernet → IP → UDP/TCP using its built-in dissectors. The user only needs to define their payload in `.emb`; declaring `[(wireshark) register_on: "..."]` plugs the generated dissector into the correct Wireshark dissector table at load time.
The `register_on` value uses Wireshark-display-filter syntax: one or more `<table> == <integer>` terms joined by `or` / `||`, with decimal or `0x`-hex patterns. Each term becomes a `DissectorTable.get("<table>"):add(<pattern>, <proto>)` call.
Generator features
* One `Proto` per `.emb`, one `local function` per `struct`/`bits`, one value-strings table per `enum`.
* Nested structs dissected via forward-declared dispatch, so any reference order works.
* Bit-addressable (`bits`) blocks emitted as masked `ProtoField`s against a single container read.
* Conditional (`if`) fields are wrapped in `if <condition> then … end`, with the Emboss condition translated to Lua.
* Variable-length arrays (`T[n]` where `n` is a sibling field) and dynamically-located fields, via an Emboss-expression → Lua translator. Sibling values referenced by a condition, array length, or offset are captured into `local val_*` reads ahead of use.
* `--` doc comments become each `ProtoField`'s description; `#` comments are ignored.
* Endianness honored (`subtree:add` vs `subtree:add_le`).
* Struct- and field-level `[(wireshark) filter: "..."]` attributes override the auto-generated filter-name segment.
Explicit non-goals (for this initial cut)
The generator emits valid Lua even when it can't fully describe a field: it emits a `-- skipped …` comment and moves on. Future work can extend coverage:
* `let` / virtual fields.
* … `bits` blocks (byte-level structs are supported).
* … `$max` (e.g. `?:`, `$present`).
Tests
* `compiler/back_end/lua/dissector_generator_test.py`: 44 unit tests covering sanitization, integer-width mapping, `register_on` parsing, enum tables, filter composition, doc extraction, attribute validation, root-struct selection, nested-struct dispatch, and now the expression translator, conditional fields, value capture, and variable-length arrays.
* `lua_golden_test` targets in `compiler/back_end/lua/BUILD` for `enum.emb`, `nested_structure.emb`, `uint_sizes.emb`, `int_sizes.emb`, the `testdata/wireshark.emb` smoke fixture, and the new `testdata/wireshark_dynamic.emb` fixture (conditional field + length-prefixed variable array).
* `compiler/back_end/lua/tshark_smoke_test.py`: loads a generated dissector into a real TShark and asserts the decoded tree for both branches of a conditional, length-prefixed message. Auto-skips when `tshark`/`text2pcap` aren't installed (e.g. in CI).
* `scripts/regenerate_goldens.py` also refreshes the Lua goldens.
Test plan
* `bazel test //compiler/back_end/lua:dissector_generator_test`
* `bazel test //compiler/back_end/lua/...` (golden + smoke tests)
* `bazel build //testdata:wireshark_lua_emboss //testdata:wireshark_dynamic_lua_emboss`
* … synthetic packets: automated in `tshark_smoke_test.py`, and verified by hand for `testdata/wireshark.emb` and the dynamic fixture (enum value-strings, big-endian decode, nested structs, the conditional `error_code`, and the variable-length `payload`).
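The auto-skip behaviour of the smoke test can be approximated with the standard library alone. A minimal sketch, assuming the check is a PATH lookup for `tshark` and `text2pcap`; the helper `missing_tools`, the class name, and the placeholder test body are hypothetical, not the contents of `tshark_smoke_test.py`:

```python
import shutil
import unittest

REQUIRED_TOOLS = ("tshark", "text2pcap")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


class TsharkSmokeTest(unittest.TestCase):
    def setUp(self):
        # Skip rather than fail when the external tools are unavailable.
        missing = missing_tools()
        if missing:
            self.skipTest("missing tools: " + ", ".join(missing))

    def test_tools_available(self):
        # Placeholder body; a real test would load a generated dissector
        # and compare the decoded tree against expectations.
        self.assertEqual(missing_tools(), [])
```

Run it with `python -m unittest <module>`; on a machine without the tools the test is reported as skipped, not failed.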