Commit f9803d3

consolidate/clarify parsers info (#2275)
* consolidate/clarify parsers info

Signed-off-by: Alexa Kreizinger <[email protected]>

* Apply suggestions from code review

Signed-off-by: Alexa Kreizinger <[email protected]>

* Apply suggestions from code review

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Alexa Kreizinger <[email protected]>

---------

Signed-off-by: Alexa Kreizinger <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

1 parent 91c369d commit f9803d3

12 files changed

+168
-132
lines changed

SUMMARY.md

Lines changed: 6 additions & 6 deletions
@@ -125,12 +125,12 @@
   * [Windows exporter metrics](pipeline/inputs/windows-exporter-metrics.md)
   * [Windows System Statistics (winstat)](pipeline/inputs/windows-system-statistics.md)
 * [Parsers](pipeline/parsers.md)
-  * [Configuring parsers](pipeline/parsers/configuring-parser.md)
-  * [Decoders](pipeline/parsers/decoders.md)
-  * [JSON](pipeline/parsers/json.md)
-  * [Logfmt](pipeline/parsers/logfmt.md)
-  * [LTSV](pipeline/parsers/ltsv.md)
-  * [Regular expression](pipeline/parsers/regular-expression.md)
+  * [Configuring custom parsers](pipeline/parsers/configuring-parser.md)
+  * [JSON format](pipeline/parsers/json.md)
+  * [Logfmt format](pipeline/parsers/logfmt.md)
+  * [LTSV format](pipeline/parsers/ltsv.md)
+  * [Regular expression format](pipeline/parsers/regular-expression.md)
+  * [Decoder settings](pipeline/parsers/decoders.md)
 * [Processors](pipeline/processors.md)
   * [Content modifier](pipeline/processors/content-modifier.md)
   * [Labels](pipeline/processors/labels.md)

administration/configuring-fluent-bit/yaml.md

Lines changed: 8 additions & 8 deletions
@@ -8,14 +8,14 @@ don't support, like processors.

 YAML configuration files support the following top-level sections:

-- `env`: Configures [environment variables](./yaml/environment-variables-section).
-- `includes`: Specifies additional YAML configuration files to [include as part of a parent file](./yaml/includes-section).
-- `service`: Configures global properties of the Fluent Bit [service](./yaml/service-section).
-- `pipeline`: Configures active [`inputs`, `filters`, and `outputs`](./yaml/pipeline-section).
-- `parsers`: Defines [custom parsers](./yaml/parsers-section).
-- `multiline_parsers`: Defines [custom multiline parsers](./yaml/multiline-parsers-section).
-- `plugins`: Defines paths for [custom plugins](./yaml/plugins-section).
-- `upstream_servers`: Defines [nodes](./yaml/upstream-servers-section) for output plugins.
+- `env`: Configures [environment variables](../administration/configuring-fluent-bit/yaml/environment-variables-section.md).
+- `includes`: Specifies additional YAML configuration files to [include as part of a parent file](../administration/configuring-fluent-bit/yaml/includes-section.md).
+- `service`: Configures global properties of the Fluent Bit [service](../administration/configuring-fluent-bit/yaml/service-section.md).
+- `pipeline`: Configures active [`inputs`, `filters`, and `outputs`](../administration/configuring-fluent-bit/yaml/pipeline-section.md).
+- `parsers`: Defines [custom parsers](../administration/configuring-fluent-bit/yaml/parsers-section.md).
+- `multiline_parsers`: Defines [custom multiline parsers](../administration/configuring-fluent-bit/yaml/multiline-parsers-section.md).
+- `plugins`: Defines paths for [custom plugins](../administration/configuring-fluent-bit/yaml/plugins-section.md).
+- `upstream_servers`: Defines [nodes](../administration/configuring-fluent-bit/yaml/upstream-servers-section.md) for output plugins.

 {% hint style="info" %}
 YAML configuration is used in the smoke tests for containers. An always-correct up-to-date example is here: <https://github.com/fluent/fluent-bit/blob/master/packaging/testing/smoke/container/fluent-bit.yaml>.
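
As a rough illustration of how some of the top-level sections listed in this hunk can sit together in one file, the following sketch combines `env`, `service`, `parsers`, and `pipeline`. It is not part of the commit: the variable name, the log path, and the choice of plugins are assumptions made for the example, and the remaining sections (`includes`, `multiline_parsers`, `plugins`, `upstream_servers`) are left out for brevity.

```yaml
# Illustrative skeleton only; names, values, and paths are assumptions.
env:
  flush_interval: 1

service:
  flush: ${flush_interval}
  log_level: info

parsers:
  - name: custom_parser1
    format: json
    time_key: time
    time_format: '%Y-%m-%dT%H:%M:%S.%L'

pipeline:
  inputs:
    - name: tail
      path: /var/log/app.log   # hypothetical path
      parser: custom_parser1
  outputs:
    - name: stdout
      match: '*'
```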
Lines changed: 72 additions & 9 deletions
@@ -1,23 +1,86 @@
 # Parsers

-Parsers enable Fluent Bit components to transform unstructured data into a structured internal representation. You can define YAML parsers either directly in the main configuration file or in separate external files for better organization.
+You can define custom [parsers](../pipeline/parsers.md) in the `parsers` section of YAML configuration files.

-This page provides a general overview of how to declare parsers.
+{% hint style="info" %}

-The main section name is `parsers`, and it lets you define a list of parser configurations. The following example demonstrates how to set up two basic parsers:
+To define custom multiline parsers, use [the `multiline_parsers` section](../administration/configuring-fluent-bit/yaml/multiline-parsers-section.md) of YAML configuration files.
+
+{% endhint %}
+
+## Syntax
+
+To define custom parsers in the `parsers` section of a YAML configuration file, use the following syntax.
+
+{% tabs %}
+{% tab title="fluent-bit.yaml" %}

 ```yaml
 parsers:
-  - name: json
+  - name: custom_parser1
     format: json
+    time_key: time
+    time_format: '%Y-%m-%dT%H:%M:%S.%L'
+    time_keep: on
+
+  - name: custom_parser2
+    format: regex
+    regex: '^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$'
+    time_key: time
+    time_format: '%Y-%m-%dT%H:%M:%S.%L'
+    time_keep: on
+    types: pid:integer
+```
+
+{% endtab %}
+{% endtabs %}
+
+For information about supported configuration options for custom parsers, see [configuring parsers](../pipeline/parsers/configuring-parser.md).
+
+## Standalone parsers files
+
+In addition to defining parsers in the `parsers` section of YAML configuration files, you can store parser definitions in standalone files. These standalone files require the same syntax as parsers defined in a standard YAML configuration file.
+
+To add a standalone parsers file to Fluent Bit, use the `parsers_file` parameter in the `service` section of your YAML configuration file.

-  - name: docker
+### Add a standalone parsers file to Fluent Bit
+
+To add a standalone parsers file to Fluent Bit, follow these steps.
+
+1. Define custom parsers in a standalone YAML file. For example, `my-parsers.yaml` defines two custom parsers:
+
+{% tabs %}
+{% tab title="my-parsers.yaml" %}
+
+```yaml
+parsers:
+  - name: custom_parser1
     format: json
     time_key: time
-    time_format: "%Y-%m-%dT%H:%M:%S.%L"
-    time_keep: true
+    time_format: '%Y-%m-%dT%H:%M:%S.%L'
+    time_keep: on
+
+  - name: custom_parser2
+    format: regex
+    regex: '^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*)\]|-)) (?<message>.+)$'
+    time_key: time
+    time_format: '%Y-%m-%dT%H:%M:%S.%L'
+    time_keep: on
+    types: pid:integer
 ```

-You can define multiple parsers sections, either within the main configuration file or distributed across included files.
+{% endtab %}
+{% endtabs %}
+
+1. Update the `parsers_file` parameter in the `service` section of your YAML configuration file:
+
+{% tabs %}
+{% tab title="fluent-bit.yaml" %}
+
+```yaml
+service:
+  parsers_file: my-parsers.yaml
+```

-For more detailed information on parser options and advanced configurations, refer to the [Configuring Parsers](../../../pipeline/parsers/configuring-parser.md) documentation.
+{% endtab %}
+{% endtabs %}
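
The two steps added in this file can be put together in one main configuration file. The following is a minimal sketch, not part of the commit: it assumes `my-parsers.yaml` is the standalone file from step 1, and the `tail` path and the `stdout` output are hypothetical choices made only to show a parser from the standalone file being referenced by name.

```yaml
# fluent-bit.yaml (sketch; the log path and output are assumptions)
service:
  # Loads the parser definitions from the standalone file.
  parsers_file: my-parsers.yaml

pipeline:
  inputs:
    - name: tail
      path: /var/log/app.log     # hypothetical path
      # Refers to a parser defined in my-parsers.yaml.
      parser: custom_parser1
  outputs:
    - name: stdout
      match: '*'
```

Keeping parser definitions in a separate file this way lets several configuration files share them, at the cost of one more file to deploy alongside the main configuration.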

administration/configuring-fluent-bit/yaml/pipeline-section.md

Lines changed: 3 additions & 3 deletions
@@ -65,7 +65,7 @@ pipeline:

 In the cases where each value in a list requires two values they must be separated by a space, such as in the `record` property for the `record_modifier` filter.

-### Input
+### Inputs

 An `input` section defines a source (related to an input plugin). Each section has a base configuration. Each input plugin can add its own configuration keys:

@@ -88,7 +88,7 @@ pipeline:
       tag: my_cpu
 ```

-### Filter
+### Filters

 A `filter` section defines a filter (related to a filter plugin). Each section has a base configuration and each filter plugin can add its own configuration keys:

@@ -113,7 +113,7 @@ pipeline:
       regex: log aa
 ```

-### Output
+### Outputs

 The `outputs` section specifies a destination that certain records should follow after a `Tag` match. Fluent Bit can route up to 256 `OUTPUT` plugins. The configuration supports the following keys:
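
To illustrate the tag matching that the `outputs` description refers to, here is a small sketch that is not part of the commit. The `my_mem` tag and the choice of the `mem` and `stdout` plugins are assumptions for the example; only the `cpu` input with the `my_cpu` tag appears in the surrounding file.

```yaml
pipeline:
  inputs:
    - name: cpu
      tag: my_cpu
    - name: mem
      tag: my_mem              # hypothetical second tag
  outputs:
    # Only records tagged my_cpu are routed to this output.
    # Records tagged my_mem match no output in this sketch.
    - name: stdout
      match: my_cpu
```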

administration/configuring-fluent-bit/yaml/service-section.md

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22

33
The `service` section defines global properties of the service. The available configuration keys are:
44

5-
65
| Key | Description | Default Value |
76
| --- | ----------- | ------------- |
87
| `flush` | Sets the flush time in `seconds.nanoseconds`. The engine loop uses a flush timeout to define when to flush the records ingested by input plugins through the defined output plugins. | `1` |
@@ -11,9 +10,9 @@ The `service` section defines global properties of the service. The available co
1110
| `dns.mode` | Sets the primary transport layer protocol used by the asynchronous DNS resolver. Can be overridden on a per-plugin basis. | `UDP` |
1211
| `log_file` | Absolute path for an optional log file. By default, all logs are redirected to the standard error interface (`stderr`). | _none_ |
1312
| `log_level` | Sets the logging verbosity level. Possible values: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are cumulative. For example, if `debug` is set, it will include `error`, `warning`, `info`, and `debug`. The `trace` mode is only available if Fluent Bit was built with the `WITH_TRACE` option enabled. | `info` |
14-
| `parsers_file` | Path for a parsers configuration file. Multiple `parsers_file` entries can be defined within the section. Parsers can be declared directly in the [`parsers` section](./parsers-section.md) of YAML configuration files. | _none_ |
15-
| `plugins_file` | Path for a `plugins` configuration file. This file specifies the paths to custom plugins (.so files) that Fluent Bit can load at runtime. Plugins can be declared directly in the [`plugins` section](./plugins-section.md) of YAML configuration files. | _none_ |
16-
| `streams_file` | Path for the [stream processor](../../../stream-processing/overview.md) configuration file. This file defines the rules and operations for stream processing in Fluent Bit. Stream processor configurations can also be defined directly in the `streams` section of YAML configuration files. | _none_ |
13+
| `parsers_file` | Path for [standalone parsers configuration files](../administration/configuring-fluent-bit/yaml/parsers-section.md#standalone-parsers-files). You can include one or more files. | _none_ |
14+
| `plugins_file` | Path for a `plugins` configuration file. This file specifies the paths to custom plugins (.so files) that Fluent Bit can load at runtime. Plugins can be declared directly in the [`plugins` section](../administration/configuring-fluent-bit/yaml/plugins-section.md) of YAML configuration files. | _none_ |
15+
| `streams_file` | Path for the [stream processor](../stream-processing/overview.md) configuration file. This file defines the rules and operations for stream processing in Fluent Bit. Stream processor configurations can also be defined directly in the `streams` section of YAML configuration files. | _none_ |
1716
| `http_server` | Enables the built-in HTTP server. | `off` |
1817
| `http_listen` | Sets the listening interface for the HTTP Server when it's enabled. | `0.0.0.0` |
1918
| `http_port` | Sets the TCP port for the HTTP server. | `2020` |
@@ -28,7 +27,7 @@ The `service` section defines global properties of the service. The available co
2827

2928
## Storage configuration
3029

31-
The following storage-related keys can be set in the `service` section:
30+
The following storage-related keys can be set as children to the `storage` key:
3231

3332
| Key | Description | Default Value |
3433
| --- | ----------- | ------------- |

pipeline/parsers.md

Lines changed: 45 additions & 6 deletions
@@ -1,8 +1,6 @@
 # Parsers

-Dealing with raw strings or unstructured messages is difficult. Having a structure makes data more usable. Set a structure for the incoming data by using input plugins as data is collected.
-
-Parsers are fully configurable and are independently and optionally handled by each input plugin.
+You can use parsers to transform unstructured log entries into structured log entries.

 ```mermaid
 graph LR
@@ -18,15 +16,15 @@ graph LR
 style B stroke:darkred,stroke-width:2px;
 ```

-The parser converts unstructured data to structured data. As an example, consider the following Apache (HTTP Server) log entry:
+For example, a parser can turn an unstructured log entry like this:

 ```text
 192.168.2.20 - - [28/Jul/2006:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
 ```

-This log line is a raw string without format. Structuring the log makes it easier to process the data later. If the [regular expression parser](./parsers/regular-expression.md) is used, the log entry could be converted to:
+...into a structured JSON object like this:

-```javascript
+```json
 {
   "host": "192.168.2.20",
   "user": "-",
@@ -38,3 +36,44 @@ This log line is a raw string without format. Structuring the log makes it easie
   "agent": ""
 }
 ```
+
+## How parsers work
+
+Parsers modify the data ingested by input plugins. This modification happens before Fluent Bit applies any [filters](../pipeline/filters.md) or [processors](../pipeline/processors.md) to that data.
+
+Each input plugin can have one active parser. Multiple plugins within the same Fluent Bit configuration file can use the same parser or use different parsers from each other.
+
+### Default parsers and custom parsers
+
+Fluent Bit includes a variety of [default parsers](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf) for parsing common data formats, like Apache and Docker logs. You can also [define custom parsers](../configuring-fluent-bit/yaml/parsers-section.md).
+
+## Add a parser to an input plugin
+
+To add a parser to an input plugin, follow these steps.
+
+1. Either identify the name of the [default parser](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf) you want to use, or [define a custom parser](../configuring-fluent-bit/yaml/parsers-section.md) with your desired [configuration settings](../pipeline/parsers/configuring-parser.md).
+
+1. Add a `parser` key to the plugin's settings in the [`inputs`](../administration/configuring-fluent-bit/yaml/pipeline-section.md#inputs) section of your YAML configuration file.
+
+For example, the following configuration file adds the default [`apache` parser](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf#L2) to one input plugin and a custom parser named `custom_parser1` to another input plugin:
+
+
+{% tabs %}
+{% tab title="fluent-bit.yaml" %}
+
+```yaml
+pipeline:
+  inputs:
+    - name: tail
+      path: /input/input.log
+      refresh_interval: 1
+      parser: apache
+
+    - name: http
+      listen: 0.0.0.0
+      port: 8888
+      parser: custom_parser1
+```
+
+{% endtab %}
+{% endtabs %}
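
To make the transformation described at the top of this file more concrete, the following sketch defines a simplified custom regex parser for the sample Apache log line and attaches it to a `tail` input. It is not part of the commit and is not the `apache` parser shipped in `conf/parsers.conf`: the parser name `simple_access_log` is hypothetical, and the pattern is an assumption written only for the sample line, capturing `host`, `user`, `time`, `method`, `path`, `code`, and `size`.

```yaml
parsers:
  # Hypothetical parser; the pattern targets the sample line only.
  - name: simple_access_log
    format: regex
    regex: '^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+) (?<path>\S+) \S+" (?<code>[^ ]*) (?<size>[^ ]*)$'
    # Use the captured time field as the record timestamp.
    time_key: time
    time_format: '%d/%b/%Y:%H:%M:%S %z'

pipeline:
  inputs:
    - name: tail
      path: /input/input.log
      parser: simple_access_log
```

Each named capture group in the regex becomes a key in the structured record, which is why the group names here mirror keys such as `host` and `user` in the JSON shown earlier in the file.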
