* Added traversal steps from flatgraph repo
* Added more properties and discussed the `.start` step
* Added details about each step type
* Fixed `glimpse-of-a-simple-use-case` link
@@ -3,23 +3,195 @@ title = "Traversal steps"
3
3
weight = 3
4
4
+++
5
5
6
-
The most important basic traversal steps will be the ones generated for your domain as highlighted in [glimpse-of-a-simple-use-case](index.html#glimpse-of-a-simple-use-case).
6
+
The most important basic traversal steps will be the ones generated for your domain as highlighted in [glimpse-of-a-simple-use-case](../_index.html#glimpse-of-a-simple-use-case).
7
7
8
8
In addition to the generated domain-specific steps based on your schema, there are some basic traversal steps that you can use generically across domains, e.g. to traverse from a node (or an `Iterator[Node]`) to its neighbors, look up its properties, etc.
9
9
There are also more advanced steps like `repeat` and advanced features like path tracking which will be described further below.
10
10
11
11
{{% notice tip %}}
12
-
flatgraph traversals are based on Scala's Iterator, so you can also use all regular [collection methods](https://docs.scala-lang.org/scala3/book/collections-methods.html).
12
+
flatgraph traversals are based on Scala's Iterator, so you can also use all regular [collection methods](https://docs.scala-lang.org/scala3/book/collections-methods.html). If you want to begin a traversal from a given node, the `.start` method wraps that node in a traversal, making the traversal steps available.
13
13
{{% /notice %}}
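
As a rough illustration of the tip above, the following sketch models a traversal as a plain Scala `Iterator` using only the standard library. The `Method` case class and the sample data are hypothetical stand-ins for generated domain types, and `Iterator.single` only approximates what `.start` is described as doing; none of this is flatgraph's own code.

```scala
// Hypothetical stand-in for a generated domain node type; not part of flatgraph.
final case class Method(name: String, isExternal: Boolean)

// A traversal is modelled here as a plain Scala Iterator over nodes.
// A `def` is used because an Iterator can only be consumed once.
def methods: Iterator[Method] =
  Iterator(Method("main", false), Method("printf", true), Method("exit", true))

// Regular collection methods compose on the Iterator; `toList` runs the pipeline.
val externalNames: List[String] =
  methods.filter(_.isExternal).map(_.name).toList

// Assumption: wrapping one node in an Iterator approximates the described `.start` behaviour.
val fromSingleNode: List[String] =
  Iterator.single(Method("main", false)).map(_.name).toList
```

Because the pipeline is an ordinary `Iterator`, any method from the linked collections documentation (`take`, `drop`, `exists`, and so on) can be mixed in the same way.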
14
14
15
+
## Step Types
16
+
17
+
Traversal steps can be divided into four types: Filter, Map, Side Effect, and Terminal.
18
+
19
+
### Filter Steps
20
+
21
+
_Filter Steps_ are atomic traversals that filter nodes according to given criteria. The most common filter step is aptly-named `filter`, which continues the traversal in the step it suffixes for all nodes which pass its criterion. Its criterion is represented by a lambda function which has access to the node of the previous step and returns a boolean. Continuing with the previous example, let us execute a query which returns all `METHOD` nodes of the Code Property Graph for [`X42`](https://github.com/ShiftLeftSecurity/x42.git), but only if their `IS_EXTERNAL` property is set to `false`:
In the example above, we used the lambda function `_.isExternal == false` as the predicate for the filter.
31
+
The `_` is simply syntactic sugar referring to the parameter of the function, so this could be rewritten
32
+
as `method => method.isExternal == false`.
33
+
{{% /notice %}}
34
+
35
+
Dissecting this query, we have `cpg` as the root object, a node-type step `method` which returns all nodes of type `METHOD`, a filter step `filter(_.isExternal == false)` which continues the traversal only for nodes which have their `IS_EXTERNAL` property set to `false` (with `_` referencing the individual nodes, and `isExternal` a property directive which accesses their `IS_EXTERNAL` property), followed by a property directive `name` which returns the values of the `NAME` property of the nodes that passed the _Filter Step_, and finally an _Execution Directive_ `toList` which executes the traversal and returns the results in a list.
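
The equivalence between the `_` placeholder and a named lambda parameter can be checked with plain Scala collections. This is a minimal sketch using a hypothetical `Method` case class and made-up data in place of a real Code Property Graph; `filterNot` is included because it expresses the same predicate without the comparison to `false`.

```scala
// Hypothetical stand-in for a METHOD node; not the generated CPG type.
final case class Method(name: String, isExternal: Boolean)

def methods: Iterator[Method] =
  Iterator(Method("main", false), Method("printf", true))

// Placeholder syntax: `_` refers to the single parameter of the lambda.
val withPlaceholder: List[String] =
  methods.filter(_.isExternal == false).map(_.name).toList

// The same predicate with an explicitly named parameter.
val withNamedParam: List[String] =
  methods.filter(method => method.isExternal == false).map(_.name).toList

// `filterNot` keeps the elements for which the predicate is false.
val withFilterNot: List[String] =
  methods.filterNot(_.isExternal).map(_.name).toList
```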
36
+
37
+
A shorter version of a query which returns the same results as the one above can be written using a _Property-Filter Step_. Property-filter steps are _Filter Steps_ which continue the traversal only for nodes which have a specific value in the property the _Property-Filter Step_ refers to:
38
+
39
+
```java
40
+
joern> cpg.method.isExternal(false).name.toList
41
+
res11: List[String] = List("main")
42
+
```
43
+
44
+
Dissecting the query again, `cpg` is the root object, `method` is a node-type step, `isExternal(false)` is a property-filter step that filters for nodes which have `false` as the value of their `IS_EXTERNAL` property, `name` is a property directive, and `toList` is the execution directive you are already familiar with.
45
+
46
+
{{% notice tip %}}
47
+
Be careful not to mix up property directives with property-filter steps; they look very similar.
48
+
Consider that:
49
+
50
+
a) `cpg.method.isExternal(true).name.toList` returns the names of all `METHOD` nodes which have the `IS_EXTERNAL` property set to `true` (in this case, 10 results)
51
+
52
+
b) `cpg.method.isExternal.toList` returns the value of the `IS_EXTERNAL` property for all `METHOD` nodes in the graph (12 results)
53
+
54
+
c) `cpg.method.isExternal.name.toList` is an invalid query which will not execute
55
+
{{% /notice %}}
56
+
57
+
A final _Filter Step_ we will look at is named `where`. Unlike `filter`, this doesn't take a simple predicate `A => Boolean`, but instead takes a `Traversal[A] => Traversal[_]`, i.e. you supply a traversal which will be executed at the current position. The resulting traversal preserves an element only if the provided traversal has _at least one_ result for it. The previous query that used a _Property-Filter Step_ can be rewritten using `where` like so:
This may not seem particularly useful given this specific example, but keep it in the back of your head, because `where` is a handy tool to have in the toolbox. Next up, _Map Steps_.
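
To make the "at least one result" semantics concrete, here is a minimal sketch that re-implements the idea on top of a plain `Iterator`. The names `whereSketch` and `whereNotSketch` and the `Method` case class are hypothetical and exist only for illustration; flatgraph ships its own `where` and `whereNot`, whose implementation may differ.

```scala
// Hypothetical stand-in for a METHOD node; not the generated CPG type.
final case class Method(name: String, isExternal: Boolean)

// Illustrative re-implementation only: run the sub-traversal on each element
// in isolation and keep the element if that sub-traversal yields anything.
extension [A](traversal: Iterator[A])
  def whereSketch(sub: Iterator[A] => Iterator[?]): Iterator[A] =
    traversal.filter(element => sub(Iterator.single(element)).hasNext)
  def whereNotSketch(sub: Iterator[A] => Iterator[?]): Iterator[A] =
    traversal.filter(element => !sub(Iterator.single(element)).hasNext)

def methods: Iterator[Method] =
  Iterator(Method("main", false), Method("printf", true), Method("exit", true))

// Keep methods for which the sub-traversal "is not external" has a result.
val internalNames: List[String] =
  methods.whereSketch(_.filter(_.isExternal == false)).map(_.name).toList

// The negated variant keeps the remaining methods.
val externalNames: List[String] =
  methods.whereNotSketch(_.filter(_.isExternal == false)).map(_.name).toList
```

The sub-traversal only decides whether an element survives; its results are discarded, which is why the element type after the step is unchanged.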
65
+
66
+
### Map Steps
67
+
68
+
_Map Steps_ are traversals that map a set of nodes into a different form given a function. _Map Steps_ are a powerful mechanism when you need to transform results to fit your specifics. For example, say you'd like to return both the `IS_EXTERNAL` and the `NAME` properties of all `METHOD` nodes in `X42`'s Code Property Graph. You can achieve that with the following query:
Don't be intimidated by the syntax used in the `map` _Step_ above. If you examine `map(node => (node.isExternal, node.name))` for a bit, you might be able to infer that the first `node` simply defines the variable that represents the node which precedes the `map` _Step_, that the ASCII arrow `=>` is just syntax that precedes the body of a lambda function, and that `(node.isExternal, node.name)` means that the return value of the lambda is a tuple which contains the values of the `isExternal` and `name` _Property Directives_ for each of the nodes matched in the previous step and passed into the lambda. In most cases in which you need `map`, you can simply follow the pattern above. But should you ever feel constrained by the common pattern shown, remember that the function for the `map` step is written in the Scala programming language, a fact which opens up a wide range of possibilities if you invest a little time learning the language.
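
The tuple-producing lambda can be tried out on ordinary Scala collections. The sketch below uses a hypothetical `Method` case class and invented data rather than the `X42` graph, and shows one way to consume the resulting pairs.

```scala
// Hypothetical stand-in for a METHOD node; not the generated CPG type.
final case class Method(name: String, isExternal: Boolean)

def methods: Iterator[Method] =
  Iterator(Method("main", false), Method("printf", true))

// Each node is mapped to a tuple of two of its properties.
val pairs: List[(Boolean, String)] =
  methods.map(node => (node.isExternal, node.name)).toList

// Tuples can be taken apart again with a pattern-matching lambda.
val described: List[String] =
  pairs.map { case (external, name) => s"$name external=$external" }
```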
86
+
87
+
### Side Effect Steps
88
+
89
+
_Side Effect Steps_ are traversal steps that perform an action or modify the state of the traversal without altering the path of the traversal itself. They do not directly contribute to the results that are returned, but they might be used to store information, log data, or manipulate variables during traversal. These steps can be thought of as adding "side effects" to the traversal that can be useful for various purposes like counting, aggregating, or modifying data.
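
A small standard-library sketch of the idea: `tapEach` on a Scala `Iterator` runs an action for every element while passing the elements through unchanged. It is used here as an analogue of a side effect step, not as flatgraph's `sideEffect` itself; the buffer name `seen` and the sample names are made up.

```scala
import scala.collection.mutable.ListBuffer

// Collects every element that flows past the side effect.
val seen: ListBuffer[String] = ListBuffer.empty[String]

// The side effect records each name; the traversal result is unaffected by it.
val lengths: List[Int] =
  Iterator("main", "printf", "exit")
    .tapEach(name => seen += name)
    .map(_.length)
    .toList
```

The recorded names end up in `seen`, while `lengths` contains exactly what the pipeline would have produced without the `tapEach` call.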
90
+
91
+
### Terminal Steps
92
+
93
+
_Terminal Steps_ are steps that end the traversal and return the final result. Once a terminal step is reached, the traversal is considered complete, and it provides the output in some form (e.g., a list, a set, or a single element). Unlike intermediate steps that continue building the traversal, terminal steps execute the traversal and stop further processing. After a terminal step, the traversal cannot be continued or extended; it’s finished.
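
Since traversals are based on Scala's `Iterator`, the difference between building and executing a traversal can be observed with the standard library alone. In this sketch (all names are illustrative), a counter shows that nothing is evaluated until the terminal `toList` call.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Counts how many elements have actually been processed.
val evaluated = new AtomicInteger(0)

// Building the pipeline does not process any element yet.
val pending: Iterator[Int] =
  Iterator(1, 2, 3).map { x => evaluated.incrementAndGet(); x * 2 }

val evaluatedBeforeTerminal: Int = evaluated.get()

// The terminal call pulls every element through the pipeline.
val result: List[Int] = pending.toList

val evaluatedAfterTerminal: Int = evaluated.get()
```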
94
+
95
+
## Traversal Steps
96
+
97
+
The steps described below are available when called on an `Iterator`. For these to be available, the following package must be imported: `import flatgraph.traversal.language.*`.
15
98
16
99
#### Basic steps
17
100
Assuming you have an `Iterator[X]`, where `X` is typically a domain specific type, but could also be flatgraph's root type for nodes [`GNode`](https://github.com/joernio/flatgraph/blob/92f4cc4b84bf6b8315971128995a75872376dcff/core/src/main/java/flatgraph/GNode.java), here's a (non-exhaustive) list of basic traversal steps.
|**page**|_<empty>_| Mandatory reference to the page. |
22
-
|**onempty**|`disable`| Defines what to do with the button if the content overlay is empty:<br><br>- `disable`: The button is displayed in disabled state.<br>- `hide`: The button is removed. |
23
-
|**onwidths**|_<varying>_| The action, that should be executed if the site is displayed in the given width:<br><br>- `show`: The button is displayed in its given area<br>- `hide`: The button is removed.<br>- `area-XXX`: The button is moved from its given area into the area `XXX`. |
|**coalesce**| Filter | Evaluates the provided traversals in order and returns the first traversal that emits at least one element. |
108
+
|**collectAll[B]**| Filter | Collects all elements of the provided class `B` (beware of type-erasure). |
109
+
|**dedup**| Filter | Deduplicate elements of this traversal - a.k.a. distinct, unique. |
110
+
|**dedupBy**| Filter | Deduplicate elements of this traversal by a given function. |
111
+
|**discardPathTracking**| Side Effect | Disables path tracking and discards any paths tracked so far. |
112
+
|**enablePathTracking**| Side Effect | Enable path tracking - prerequisite for path/simplePath steps. |
113
+
|**filter**| Filter | Keeps every element for which the given predicate evaluates to _true_. |
114
+
|**filterNot**| Filter | Keeps every element for which the given predicate evaluates to _false_. |
115
+
|**groupCount**| Map | Group elements and count how often they appear. |
116
+
|**groupCount[B]**| Map | Group elements by a given transformation function and count how often the results appear. |
117
+
|**head**| Terminal | The first element of the traversal. |
118
+
|**is**| Filter | Filters in everything that _is_ the given value. |
119
+
|**or**| Filter | Only preserves elements for which _at least one of_ the given traversals has at least one result. |
120
+
|**l/toSet/toSeq**| Terminal | Executes the traversal and returns the result as a list, set, or indexed sequence respectively. |
121
+
|**last**| Terminal | The last element of the traversal. |
122
+
|**not**| Filter | Only preserves elements if the provided traversal does _not_ have any results. Alias for `whereNot`. |
123
+
|**path**| Terminal | Retrieve entire paths that have been traversed thus far. |
124
+
|**repeat**| Map | Repeat the given traversal. |
125
+
|**sideEffect**| Side Effect | Perform side effect without changing the contents of the traversal. |
126
+
|**simplePath**| Filter | Ensure the traversal does not include any paths that visit the same node more than once. |
127
+
|**size**| Terminal | Total size of elements in the traversal. |
128
+
|**sorted**| Map | Sort elements by their natural order. |
129
+
|**sortBy**| Map | Sort elements by the value of the given transformation function. |
130
+
|**union**| Filter | Union/sum/aggregate/join given traversals from the current point. |
131
+
|**within**| Filter | Filters out all elements that are _not_ in the provided set. |
132
+
|**without**| Filter | Filters out all elements that _are_ in the provided set. |
133
+
|**where**| Filter | Only preserves elements if the provided traversal has at least one result. |
134
+
|**whereNot**| Filter | Only preserves elements if the provided traversal does _not_ have any results. |
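
Several of the steps in the table have close counterparts in the Scala standard library, which can help build intuition for what they return. The sketch below uses `distinct`, `groupMapReduce`, and `sortBy` as stand-ins for `dedup`, `groupCount`, and `sortBy`; it is an analogy on invented data, not flatgraph's implementation, and the exact return types of the flatgraph steps may differ.

```scala
// Sample data with duplicates; a `def` yields a fresh Iterator on each use.
def names: Iterator[String] =
  Iterator("main", "printf", "main", "exit", "printf", "main")

// Analogue of `dedup`: keep the first occurrence of each element.
val deduplicated: List[String] = names.distinct.toList

// Analogue of `groupCount`: count how often each element appears.
val counts: Map[String, Int] =
  names.toList.groupMapReduce(identity)(_ => 1)(_ + _)

// Analogue of `sortBy`: order elements by a derived value (stable sort).
val sortedByLength: List[String] = names.distinct.toList.sortBy(_.length)
```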
135
+
136
+
#### Node Steps
137
+
138
+
These steps are available when the traversal starts from an `Iterator` of [`GNode`](https://github.com/joernio/flatgraph/blob/92f4cc4b84bf6b8315971128995a75872376dcff/core/src/main/java/flatgraph/GNode.java) nodes.
|**both**| Map/Filter | Follow both in and out-neighbours for a given node. Can be restricted by edge type. |
143
+
|**bothE**| Map/Filter | Follow both in and out-edges for a given node. Can be restricted by edge type. |
144
+
|**hasLabel**| Filter | Filters in nodes that match the given labels. Alias for `label`. |
145
+
|**id**| Map/Filter | Returns the unique identifier(s) of the node(s) in the traversal. Can filter by given IDs. |
146
+
|**in**| Map/Filter | In-neighbours for a given node. Can be restricted by edge type. |
147
+
|**inE**| Map/Filter | In-edges for a given node. Can be restricted by edge type. |
148
+
|**out**| Map/Filter | Out-neighbours for a given node. Can be restricted by edge type. |
149
+
|**outE**| Map/Filter | Out-edges for a given node. Can be restricted by edge type. |
150
+
|**property**| Map | Retrieve the value for a single property for the defined property name. |
151
+
|**propertiesMap**| Map | Retrieves all entity properties as a map. |
152
+
|**label**| Map/Filter | Node label. Can filter by given labels. |
153
+
|**labelNot**| Filter | Inverse of `label`. |
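
The neighbour steps can be pictured with a toy graph stored as a list of labelled edges. Everything in this sketch is hypothetical: `ToyEdge`, the edge data, and the `outSketch`/`inSketch` helpers only mimic the described behaviour of `out` and `in` (including the optional restriction by edge type) and are not flatgraph's API.

```scala
// Hypothetical toy model: nodes are plain strings, edges are labelled triples.
final case class ToyEdge(src: String, label: String, dst: String)

val edges: List[ToyEdge] = List(
  ToyEdge("main", "CALL", "printf"),
  ToyEdge("main", "CALL", "exit"),
  ToyEdge("main", "AST", "block"),
  ToyEdge("helper", "CALL", "printf")
)

extension (nodes: Iterator[String])
  // Follow outgoing edges; an empty label list means "any edge type".
  def outSketch(labels: String*): Iterator[String] =
    nodes.flatMap { node =>
      edges.iterator
        .filter(e => e.src == node && (labels.isEmpty || labels.contains(e.label)))
        .map(_.dst)
    }
  // Follow incoming edges; an empty label list means "any edge type".
  def inSketch(labels: String*): Iterator[String] =
    nodes.flatMap { node =>
      edges.iterator
        .filter(e => e.dst == node && (labels.isEmpty || labels.contains(e.label)))
        .map(_.src)
    }

val allOut: List[String] = Iterator("main").outSketch().toList
val callOut: List[String] = Iterator("main").outSketch("CALL").toList
val callIn: List[String] = Iterator("printf").inSketch("CALL").toList
```

A `both`-style step would simply concatenate the results of the two helpers for each node.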
154
+
155
+
#### Edge Steps
156
+
157
+
These steps are available when the traversal starts from an `Iterator` of [`Edge`](https://github.com/joernio/flatgraph/blob/92f4cc4b84bf6b8315971128995a75872376dcff/core/src/main/scala/flatgraph/Edge.scala) edges.
|**src**| Map | Traverse to the source node (out-going node). |
162
+
|**dst**| Map | Traverse to the destination node (incoming node). |
163
+
164
+
## Property Directives
165
+
166
+
The steps described below are available when called on the entity/object directly. These are available as methods or properties on the objects so no import is necessary.
167
+
168
+
#### Graph Steps
169
+
170
+
Steps available from an instance of [`Graph`](https://github.com/joernio/flatgraph/blob/92f4cc4b84bf6b8315971128995a75872376dcff/core/src/main/scala/flatgraph/Graph.scala).
|**edgeCount**| Terminal | Total number of edges in the graph. Can be restricted by a given label. |
177
+
|**nodes**| Filter | Create a traversal from the nodes of the graph that match the given IDs or labels. |
178
+
|**nodeCount**| Terminal | Total number of nodes in the graph. Can be restricted by a given label. |
179
+
180
+
#### Edge Steps
181
+
182
+
Steps available from an instance of [`Edge`](https://github.com/joernio/flatgraph/blob/92f4cc4b84bf6b8315971128995a75872376dcff/core/src/main/scala/flatgraph/Edge.scala).
|**propertyName**| Map | The property value of the edge, if one exists. |
188
+
189
+
#### Node Steps
190
+
191
+
Steps available from an instance of [`GNode`](https://github.com/joernio/flatgraph/blob/92f4cc4b84bf6b8315971128995a75872376dcff/core/src/main/java/flatgraph/GNode.java).