Commit 2612868

Add documentation for NAR spec in kaitai
1 parent 36ee38e commit 2612868

1 file changed

+180
-0
lines changed

doc/manual/source/protocols/nix-archive.md

Lines changed: 180 additions & 0 deletions
@@ -41,3 +41,183 @@ The `str` function / parameterized rule is defined as follows:
- `int(n)` = the 64-bit little endian representation of the number `n`

- `pad(s)` = the byte sequence `s`, padded with 0s to a multiple of 8 bytes
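
The two helper rules above can be sketched in a few lines of Python. This is only an illustration, not code from Nix: the function names `nar_int`, `nar_pad`, and `nar_str` are hypothetical, and `nar_str` assumes the length-prefixed layout that the Kaitai spec describes for `padded_str` (a 64-bit little-endian length followed by the padded bytes).

```python
import struct


def nar_int(n: int) -> bytes:
    """int(n): the 64-bit little-endian representation of n."""
    return struct.pack("<Q", n)


def nar_pad(s: bytes) -> bytes:
    """pad(s): s followed by zero bytes up to a multiple of 8 bytes."""
    return s + b"\0" * (-len(s) % 8)


def nar_str(s: bytes) -> bytes:
    """Assumed composition: the length as int(n), then the padded bytes."""
    return nar_int(len(s)) + nar_pad(s)


if __name__ == "__main__":
    # The 13-byte magic string is padded to 16 bytes after its 8-byte length.
    print(nar_str(b"nix-archive-1").hex())
```

A string whose length is already a multiple of 8 (including the empty string) receives no padding at all.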

## Kaitai Struct Specification

The Nix Archive (NAR) format is also formally described using [Kaitai Struct](https://kaitai.io/), an Interface Description Language (IDL) for defining binary data structures.

> Kaitai Struct provides a language-agnostic, machine-readable specification that can be compiled into parsers for various programming languages (e.g., C++, Python, Java, Rust).

```yaml
meta:
  id: nix_nar
  title: Nix Archive (NAR)
  file-extension: nar
  endian: le
doc: |
  Nix Archive (NAR) format. A simple, reproducible binary archive
  format used by the Nix package manager to serialize file system objects.
doc-ref: 'https://nixos.org/manual/nix/stable/command-ref/nix-store.html#nar-format'

seq:
  - id: magic
    type: padded_str
    doc: "Magic string, must be 'nix-archive-1'."
    valid:
      expr: _.body == 'nix-archive-1'
  - id: root_node
    type: node
    doc: "The root of the archive, which is always a single node."

types:
  padded_str:
    doc: |
      A string, prefixed with its length (u8le) and
      padded with null bytes to the next 8-byte boundary.
    seq:
      - id: len_str
        type: u8
      - id: body
        type: str
        size: len_str
        encoding: 'ascii'
      - id: padding
        size: (8 - (len_str % 8)) % 8

  node:
    doc: "A single filesystem node (file, directory, or symlink)."
    seq:
      - id: open_paren
        type: padded_str
        doc: "Must be '(', a token starting the node definition."
        valid:
          expr: _.body == '('
      - id: type_key
        type: padded_str
        doc: "Must be 'type'."
        valid:
          expr: _.body == 'type'
      - id: type_val
        type: padded_str
        doc: "The type of the node: 'regular', 'directory', or 'symlink'."
      - id: body
        type:
          switch-on: type_val.body
          cases:
            "'directory'": type_directory
            "'regular'": type_regular
            "'symlink'": type_symlink
      - id: close_paren
        type: padded_str
        valid:
          expr: _.body == ')'
        if: "type_val.body != 'directory'"
        doc: "Must be ')', a token ending the node definition."

  type_directory:
    doc: "A directory node, containing a list of entries."
    seq:
      - id: entries
        type: dir_entry
        repeat: until
        repeat-until: _.kind.body == ')'
    types:
      dir_entry:
        doc: "A single entry within a directory, or a terminator."
        seq:
          - id: kind
            type: padded_str
            valid:
              expr: _.body == 'entry' or _.body == ')'
            doc: "Must be 'entry' (for a child node) or ')' (for the terminator)."
          - id: open_paren
            type: padded_str
            valid:
              expr: _.body == '('
            if: 'kind.body == "entry"'
          - id: name_key
            type: padded_str
            valid:
              expr: _.body == 'name'
            if: 'kind.body == "entry"'
          - id: name
            type: padded_str
            if: 'kind.body == "entry"'
          - id: node_key
            type: padded_str
            valid:
              expr: _.body == 'node'
            if: 'kind.body == "entry"'
          - id: node
            type: node
            if: 'kind.body == "entry"'
            doc: "The child node, present only if kind is 'entry'."
          - id: close_paren
            type: padded_str
            valid:
              expr: _.body == ')'
            if: 'kind.body == "entry"'
        instances:
          is_terminator:
            value: kind.body == ')'

  type_regular:
    doc: "A regular file node."
    seq:
      # Read attributes (like 'executable') until we hit 'contents'
      - id: attributes
        type: reg_attribute
        repeat: until
        repeat-until: _.key.body == "contents"
      # After the 'contents' token, read the file data
      - id: file_data
        type: file_content
    instances:
      is_executable:
        value: 'attributes[0].key.body == "executable"'
        doc: "True if the file has the 'executable' attribute."
    types:
      reg_attribute:
        doc: "An attribute of the file, e.g., 'executable' or 'contents'."
        seq:
          - id: key
            type: padded_str
            doc: "Attribute key, e.g., 'executable' or 'contents'."
            valid:
              expr: _.body == 'executable' or _.body == 'contents'
          - id: value
            type: padded_str
            if: 'key.body == "executable"'
            valid:
              expr: _.body == ''
            doc: "Must be '' if key is 'executable'."
      file_content:
        doc: "The raw data of the file, prefixed by length."
        seq:
          - id: len_contents
            type: u8
          # This relies on the property of instances that they are lazily evaluated and cached.
          - size: 0
            if: nar_offset < 0
          - id: contents
            size: len_contents
          - id: padding
            size: (8 - (len_contents % 8)) % 8
        instances:
          nar_offset:
            value: _io.pos

  type_symlink:
    doc: "A symbolic link node."
    seq:
      - id: target_key
        type: padded_str
        doc: "Must be 'target'."
        valid:
          expr: _.body == 'target'
      - id: target_val
        type: padded_str
        doc: "The destination path of the symlink."
```

The source of the spec is available in the [nix-nar-kaitai-spec repository](https://github.com/fzakaria/nix-nar-kaitai-spec/blob/main/NAR.ksy). Contributions and improvements to the spec are welcome.
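
To make the token sequence in the spec concrete, here is a minimal hand-written reader in Python that follows the same order of fields. It is a sketch for illustration only: it does not use the Kaitai runtime or a generated parser, and every name in it (`padded`, `read_padded`, `expect`, `read_node`, `read_nar`) is hypothetical rather than part of Nix or Kaitai Struct.

```python
import io
import struct


def padded(s: bytes) -> bytes:
    """Encode one padded_str: u8le length, body, zero padding to 8 bytes."""
    return struct.pack("<Q", len(s)) + s + b"\0" * ((8 - len(s) % 8) % 8)


def read_padded(stream) -> bytes:
    """Decode one padded_str; file contents use the same layout."""
    raw = stream.read(8)
    if len(raw) != 8:
        raise ValueError("truncated length field")
    (length,) = struct.unpack("<Q", raw)
    pad_len = (8 - length % 8) % 8
    body = stream.read(length)
    padding = stream.read(pad_len)
    if len(body) != length or len(padding) != pad_len or any(padding):
        raise ValueError("truncated or badly padded string")
    return body


def expect(stream, token: bytes) -> None:
    """Mirror a `valid: expr` check: the next token must equal `token`."""
    got = read_padded(stream)
    if got != token:
        raise ValueError(f"expected {token!r}, got {got!r}")


def read_node(stream) -> dict:
    expect(stream, b"(")
    expect(stream, b"type")
    kind = read_padded(stream)
    if kind == b"regular":
        key = read_padded(stream)
        executable = key == b"executable"
        if executable:
            expect(stream, b"")  # the 'executable' value is always empty
            key = read_padded(stream)
        if key != b"contents":
            raise ValueError(f"unexpected attribute {key!r}")
        node = {"type": "regular", "executable": executable,
                "contents": read_padded(stream)}
        expect(stream, b")")
    elif kind == b"symlink":
        expect(stream, b"target")
        node = {"type": "symlink", "target": read_padded(stream)}
        expect(stream, b")")
    elif kind == b"directory":
        entries = {}
        while True:
            token = read_padded(stream)
            if token == b")":  # terminator doubles as the node's closing paren
                break
            if token != b"entry":
                raise ValueError(f"unexpected token {token!r}")
            expect(stream, b"(")
            expect(stream, b"name")
            name = read_padded(stream)
            expect(stream, b"node")
            entries[name] = read_node(stream)
            expect(stream, b")")
        node = {"type": "directory", "entries": entries}
    else:
        raise ValueError(f"unknown node type {kind!r}")
    return node


def read_nar(data: bytes) -> dict:
    stream = io.BytesIO(data)
    expect(stream, b"nix-archive-1")
    return read_node(stream)


if __name__ == "__main__":
    tokens = [b"nix-archive-1", b"(", b"type", b"regular",
              b"contents", b"hello\n", b")"]
    print(read_nar(b"".join(padded(t) for t in tokens)))
```

The directory branch shows why the spec makes `close_paren` conditional on the node type: the `)` that ends the entry list is consumed by the loop itself, so no separate closing token remains to be read.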
