-# gyp (go-yara-parser)
-
-`gyp` is a Go library for manipulating YARA rulesets.
-It uses the same grammar and lexer files as the original libyara to ensure that lexing and parsing work exactly like YARA.
-The grammar and lexer files have been modified to fill protocol buffers (PB) messages for ruleset manipulation instead of compiling rulesets for data matching.
-
-Using `gyp`, one will be able to read YARA rulesets to programmatically change metadata, rule names, rule modifiers, tags, strings, conditions and more.
-
-Encoding rulesets as PB messages enables their manipulation in other languages.
-Additionally, the `y2j` tool is provided for serializing rulesets to JSON.
-Similarly, `j2y` provides JSON-to-YARA conversion, but do see __Limitations__ below.
-
-## `y2j` Usage
-
-Command line usage for `y2j` looks like the following:
-
-```
-$ y2j --help
-Usage of y2j: y2j [options] file.yar
-
-options:
-  -indent int
-        Set number of indent spaces (default 2)
-  -o string
-        JSON output file
-```
+[](https://godoc.org/github.com/VirusTotal/gyp)
+[](https://goreportcard.com/report/github.com/VirusTotal/gyp)
 
-In action, `y2j` would convert the following ruleset:
-
-```yara
-import "pe"
-import "cuckoo"
-
-include "other.yar"
-
-global rule demo : tag1 {
-meta:
-    description = "This is a demo rule"
-    version = 1
-    production = false
-    description = "because we can"
-strings:
-    $string = "this is a string" nocase wide
-    $regex = /this is a regex/i ascii fullword
-    $hex = { 01 23 45 67 89 ab cd ef [0-5] ?1 ?2 ?3 }
-condition:
-    $string or $regex or $hex
-}
-```
+# gyp (go-yara-parser)
 
-to this JSON output:
-
-```json
-{
-  "imports": [
-    "pe",
-    "cuckoo"
-  ],
-  "includes": [
-    "other.yar"
-  ],
-  "rules": [
-    {
-      "modifiers": {
-        "global": true,
-        "private": false
-      },
-      "identifier": "demo",
-      "tags": [
-        "tag1"
-      ],
-      "meta": [
-        {
-          "key": "description",
-          "text": "This is a demo rule"
-        },
-        {
-          "key": "version",
-          "number": "1"
-        },
-        {
-          "key": "production",
-          "boolean": false
-        },
-        {
-          "key": "description",
-          "text": "because we can"
-        }
-      ],
-      "strings": [
-        {
-          "id": "$string",
-          "text": {
-            "text": "this is a string",
-            "modifiers": {
-              "nocase": true,
-              "ascii": false,
-              "wide": true,
-              "fullword": false,
-              "xor": false
-            }
-          }
-        },
-        {
-          "id": "$regex",
-          "regexp": {
-            "text": "this is a regex",
-            "modifiers": {
-              "nocase": false,
-              "ascii": true,
-              "wide": false,
-              "fullword": true,
-              "xor": false,
-              "i": true
-            }
-          }
-        },
-        {
-          "id": "$hex",
-          "hex": {
-            "token": [
-              {
-                "sequence": {
-                  "value": "ASNFZ4mrze8=",
-                  "mask": "//////////8="
-                }
-              },
-              {
-                "jump": {
-                  "start": "0",
-                  "end": "5"
-                }
-              },
-              {
-                "sequence": {
-                  "value": "AQID",
-                  "mask": "Dw8P"
-                }
-              }
-            ]
-          }
-        }
-      ],
-      "condition": {
-        "orExpression": {
-          "terms": [
-            {
-              "stringIdentifier": "$string"
-            },
-            {
-              "stringIdentifier": "$regex"
-            },
-            {
-              "stringIdentifier": "$hex"
-            }
-          ]
-        }
-      }
-    }
-  ]
-}
-```
+`gyp` is a Go library for parsing YARA rules. It uses the same grammar and lexer files as the original libyara to ensure that lexing and parsing work exactly like YARA. This library produces an Abstract Syntax Tree (AST) for the parsed YARA rules. Additionally, the AST can be serialized as a Protocol Buffer, which facilitates its manipulation in other programming languages.
 
 ## Go Usage
 
-Sample usage for working with rulesets in Go looks like the following:
+The example below illustrates the usage of `gyp`. It is a simple program that reads a YARA source file from standard input, creates the corresponding AST, and writes the rules back to standard output. The output won't match the input exactly: during parsing and regeneration of the rules the text is reformatted and comments are lost.
 
 ```go
 package main
 
 import (
-	"fmt"
-	"log"
-	"os"
-	proto "github.com/golang/protobuf/proto"
+	"log"
+	"os"
 
-	"github.com/VirusTotal/gyp"
+	"github.com/VirusTotal/gyp"
 )
 
 func main() {
-	input, err := os.Open(os.Args[1]) // Single argument: path to your file
-	if err != nil {
-		log.Fatalf("Error: %s\n", err)
-	}
-
-	ruleset, err := gyp.Parse(input)
-	if err != nil {
-		log.Fatalf(`Parsing failed: "%s"`, err)
-	}
-
-	fmt.Printf("Ruleset:\n%v\n", ruleset)
-
-	// Manipulate the first rule
-	rule := ruleset.Rules[0]
-	rule.Identifier = proto.String("new_rule_name")
-	rule.Modifiers.Global = proto.Bool(true)
-	rule.Modifiers.Private = proto.Bool(false)
+	ruleset, err := gyp.Parse(os.Stdin)
+	if err != nil {
+		log.Fatalf(`Error parsing rules: %v`, err)
+	}
+	if err = ruleset.WriteSource(os.Stdout); err != nil {
+		log.Fatalf(`Error writing rules: %v`, err)
+	}
 }
 ```
 
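In the JSON output removed from the README in this hunk, each hex string `sequence` token carries `value` and `mask` fields as base64 text. The sketch below decodes the two sequences shown for `$hex` and renders them back in YARA hex notation. It uses only the Go standard library; the helper names `nibble` and `renderHexSequence` are hypothetical and not part of `gyp`, and the mask interpretation (a mask nibble of `F` keeps the digit, a mask nibble of `0` becomes `?`) is inferred from the example values rather than taken from the library's documentation.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"log"
	"strings"
)

// nibble renders one 4-bit value under its 4-bit mask: a full mask (0xF)
// keeps the hex digit, an empty mask (0x0) becomes the "?" wildcard.
// Other mask values are rejected because this sketch does not model them.
func nibble(value, mask byte) (string, error) {
	switch mask {
	case 0xF:
		return fmt.Sprintf("%x", value), nil
	case 0x0:
		return "?", nil
	default:
		return "", fmt.Errorf("unsupported partial nibble mask %#x", mask)
	}
}

// renderHexSequence (hypothetical helper) decodes the base64 value and mask
// of one sequence token and returns the bytes in YARA hex-string notation.
func renderHexSequence(value, mask string) (string, error) {
	v, err := base64.StdEncoding.DecodeString(value)
	if err != nil {
		return "", fmt.Errorf("decoding value: %w", err)
	}
	m, err := base64.StdEncoding.DecodeString(mask)
	if err != nil {
		return "", fmt.Errorf("decoding mask: %w", err)
	}
	if len(v) != len(m) {
		return "", fmt.Errorf("value has %d bytes but mask has %d", len(v), len(m))
	}
	parts := make([]string, 0, len(v))
	for i := range v {
		hi, err := nibble(v[i]>>4, m[i]>>4)
		if err != nil {
			return "", err
		}
		lo, err := nibble(v[i]&0xF, m[i]&0xF)
		if err != nil {
			return "", err
		}
		parts = append(parts, hi+lo)
	}
	return strings.Join(parts, " "), nil
}

func main() {
	// Value/mask pairs copied from the "$hex" string in the JSON output.
	tokens := [][2]string{
		{"ASNFZ4mrze8=", "//////////8="},
		{"AQID", "Dw8P"},
	}
	for _, t := range tokens {
		s, err := renderHexSequence(t[0], t[1])
		if err != nil {
			log.Fatalf("Error rendering sequence: %v", err)
		}
		fmt.Println(s)
	}
	// Prints:
	// 01 23 45 67 89 ab cd ef
	// ?1 ?2 ?3
}
```

The two printed lines match the byte runs on either side of the `[0-5]` jump in the original `$hex` definition, which is a quick way to sanity-check a value/mask pair by hand.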
@@ -231,14 +63,6 @@ The `Makefile` includes targets for quickly building the parser and lexer and th
 - Build `y2j` tool: `make y2j`
 - Build `j2y` tool: `make j2y`
 
-## Limitations
-
-Currently, there are no guarantees with the library that modified rules will serialize back into a valid YARA ruleset:
-
-1. you can set `rule.Identifier = "123"`, but this would be invalid YARA.
-2. Adding or removing strings may cause a condition to become invalid.
-3. Comments cannot be retained.
-4. Numbers are always serialized in decimal base.
 
 ## License and third party code
 
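The first item in the removed Limitations list notes that an identifier such as `"123"` can be assigned to a rule even though it is not valid YARA. A caller that renames rules could guard against this before writing the ruleset out. The sketch below is one possible check, assuming the lexical rules given in the YARA documentation (letters, digits and underscores, no leading digit, at most 128 characters). The function `isValidIdentifier` is hypothetical and not part of `gyp`, and it does not reject reserved YARA keywords.

```go
package main

import (
	"fmt"
	"regexp"
)

// identifierPattern encodes the assumed lexical rules for a YARA rule
// identifier: a letter or underscore, followed by up to 127 letters,
// digits, or underscores (128 characters in total).
var identifierPattern = regexp.MustCompile(`^[A-Za-z_][A-Za-z0-9_]{0,127}$`)

// isValidIdentifier (hypothetical helper) reports whether name is lexically
// acceptable as a rule identifier. Reserved keywords are not checked here.
func isValidIdentifier(name string) bool {
	return identifierPattern.MatchString(name)
}

func main() {
	for _, name := range []string{"new_rule_name", "123"} {
		fmt.Printf("%q valid: %t\n", name, isValidIdentifier(name))
	}
	// Prints:
	// "new_rule_name" valid: true
	// "123" valid: false
}
```

A check like this only covers the first limitation. Conditions that reference removed strings would still need their own validation.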