Commit 034b266

rrudakov authored and bbatsov committed
Add "Syntax highlighting" section to the design documentation
1 parent c7c7550 commit 034b266

1 file changed: +67 -1 lines changed

doc/design.md

Lines changed: 67 additions & 1 deletion
@@ -183,7 +183,73 @@ changes in the grammar.

## Syntax Highlighting

To set up Tree-sitter fontification, `clojure-ts-mode` sets the
`treesit-font-lock-settings` variable to the value returned by
`clojure-ts--font-lock-settings`, and then calls `treesit-major-mode-setup`.

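The setup sequence described above can be sketched as follows. This is a minimal illustration, not the actual `clojure-ts-mode` definition: the mode name `my-clojure-ts-mode` and the helper `my-clojure-ts--font-lock-settings` are hypothetical, the `clojure` grammar is assumed to be installed, and the single `string` feature is only an example.

```emacs-lisp
(require 'treesit)

;; Hypothetical stand-in for `clojure-ts--font-lock-settings': it returns
;; a list of compiled queries built by `treesit-font-lock-rules'.
(defun my-clojure-ts--font-lock-settings ()
  (treesit-font-lock-rules
   :language 'clojure
   :feature 'string
   '((str_lit) @font-lock-string-face)))

(define-derived-mode my-clojure-ts-mode prog-mode "MyClojure[TS]"
  "Hypothetical major mode illustrating Tree-sitter font-lock setup."
  (when (treesit-ready-p 'clojure)
    ;; Create the host parser for the current buffer.
    (treesit-parser-create 'clojure)
    ;; Step 1: install the compiled queries.
    (setq-local treesit-font-lock-settings
                (my-clojure-ts--font-lock-settings))
    ;; A feature is only active if it is listed here.
    (setq-local treesit-font-lock-feature-list '((string)))
    ;; Step 2: let Emacs wire the settings into font-lock.
    (treesit-major-mode-setup)))
```

The settings are computed inside a function rather than a top-level `defvar` so that nothing touches the grammar until the mode is actually enabled.
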
`clojure-ts--font-lock-settings` returns a list of compiled queries. Each query
must have at least one capture name (a name that starts with `@`). If a capture
name matches an existing face name (e.g., `font-lock-keyword-face`), the
captured node is fontified with that face.

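To see what a query captures, it can help to run it by hand. The helper below is a hypothetical debugging aid (not part of `clojure-ts-mode`) that assumes the `clojure` grammar is installed and that the grammar exposes a `kwd_lit` node for keywords. It returns each capture name together with the text of the captured node.

```emacs-lisp
(require 'treesit)

(defun my-clojure-captures (code query)
  "Run QUERY against CODE and return (CAPTURE-NAME . TEXT) pairs.
Hypothetical helper; assumes the `clojure' grammar is installed."
  (with-temp-buffer
    (insert code)
    (let ((parser (treesit-parser-create 'clojure)))
      (mapcar (lambda (capture)
                ;; Each capture is (NAME . NODE); NAME has no leading `@'.
                (cons (car capture)
                      (treesit-node-text (cdr capture) t)))
              (treesit-query-capture parser query)))))

;; The capture name here is also the name of an existing face, so
;; font-lock would apply that face to every captured node.
(when (treesit-ready-p 'clojure t)
  (my-clojure-captures "(println :a :b)"
                       '((kwd_lit) @font-lock-constant-face)))
```

Every pair in the result carries the symbol `font-lock-constant-face` as its capture name, which is exactly the name font-lock looks up as a face.
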
A capture name can also be arbitrary and serve only to check the text of the
captured node. A single capture name can also do both: fontify the node and
check its text. For example, in the following query:

```emacs-lisp
`((list_lit :anchor [(comment) (meta_lit) (old_meta_lit)] :*
            :anchor (sym_lit !namespace name: (sym_name) @font-lock-keyword-face))
  (:match ,clojure-ts--builtin-symbol-regexp @font-lock-keyword-face))
```

This query matches any list whose first symbol (skipping any number of comments
and metadata nodes) does not have a namespace and matches the regex stored in
the `clojure-ts--builtin-symbol-regexp` variable. The matched symbol is
fontified using `font-lock-keyword-face`.

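The text check performed by `:match` can be tried without a parser. The sketch below uses a hypothetical regexp, `my-builtin-symbol-regexp`, built from a small made-up list of names; the real contents of `clojure-ts--builtin-symbol-regexp` are not shown here. The query `my-builtin-query` is a simplified variant that omits the comment and metadata skipping.

```emacs-lisp
;; Hypothetical subset of built-in names, anchored so that only whole
;; symbol names match.
(defconst my-builtin-symbol-regexp
  (concat "\\`" (regexp-opt '("do" "if" "let" "loop" "when")) "\\'"))

;; Simplified query: the same capture name is used for the face and
;; for the `:match' predicate.
(defvar my-builtin-query
  `((list_lit :anchor (sym_lit !namespace name: (sym_name) @font-lock-keyword-face))
    (:match ,my-builtin-symbol-regexp @font-lock-keyword-face)))

(string-match-p my-builtin-symbol-regexp "let")    ; non-nil, would be fontified
(string-match-p my-builtin-symbol-regexp "letfn")  ; nil, left alone
```

Anchoring the regexp matters: without the anchors, any symbol merely containing `let` or `if` would pass the predicate.
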
### Embedded parsers

The Clojure grammar in `clojure-ts-mode` is the main, or "host", grammar. Emacs
also supports the use of any number of "embedded" grammars. `clojure-ts-mode`
currently uses the `markdown-inline` grammar to highlight Markdown constructs in
docstrings and the `regex` grammar to highlight regular expression syntax.

To use an embedded parser, `clojure-ts-mode` must set an appropriate value for
the `treesit-range-settings` variable. The Clojure grammar provides convenient
nodes to capture only the content of strings and regexes, which makes defining
range settings for regexes quite simple:

```emacs-lisp
(treesit-range-rules
 :embed 'regex
 :host 'clojure
 :local t
 '((regex_content) @capture))
```

For docstrings, the query is more complex, so it is built by the function
`clojure-ts--docstring-query`, which is shared by syntax highlighting,
indentation rules, and range settings for the embedded Markdown parser:

```emacs-lisp
(treesit-range-rules
 :embed 'markdown-inline
 :host 'clojure
 :local t
 (clojure-ts--docstring-query '@capture))
```

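Range rules only take effect once they are stored in `treesit-range-settings`. The following sketch shows one way this could look for the regex rule alone. The function name `my-clojure-ts--setup-embedded-parsers` is hypothetical, and the code assumes an Emacs version whose `treesit-range-rules` accepts `:local`, with both the `clojure` and `regex` grammars installed.

```emacs-lisp
(require 'treesit)

(defun my-clojure-ts--setup-embedded-parsers ()
  "Hypothetical helper: enable the embedded `regex' parser in this buffer.
Does nothing unless both grammars are available."
  (when (and (treesit-ready-p 'clojure t)
             (treesit-ready-p 'regex t))
    (setq-local treesit-range-settings
                (treesit-range-rules
                 :embed 'regex
                 :host 'clojure
                 ;; One parser per captured node, limited to that node.
                 :local t
                 '((regex_content) @capture)))))
```

Checking both grammars first keeps the mode usable when the optional embedded grammar is missing: the host grammar still provides highlighting, only the regex-specific fontification is skipped.
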
It is important to use the `:local` option for embedded parsers; otherwise, the
range will not be restricted to the captured node, which will lead to broken
fontification (see bug [#77733](https://debbugs.gnu.org/cgi/bugreport.cgi?bug=77733)).

### Additional information

For more details, evaluate the following expression in Emacs:

```emacs-lisp
(info "(elisp) Parser-based Font Lock")
```

## Indentation
