out_file: By symlink_path, some symlinks are not created with bulk inputs #5099

@tty061

Describe the bug

We are using the symlink_path setting of the out_file plugin with placeholders. However, symlinks are not created as expected for every chunk key (one symlink pointing to the latest buffer chunk for each key). In practice, when records that map to several different chunk metadata arrive at once (e.g., records with different myid values, where myid is one of the chunk keys), only one symlink is created, and none of the others are created afterwards.

To Reproduce

I used the Docker image fluentd:v1.19.0-1.0 for this experiment.

  1. Run fluentd with the configuration below.
  2. Input multiple records at once: echo -e '{"myid":"aa","value":"111"}\n{"myid":"bb","value":"222"}' >> /tmp/test.log
  3. Only data.bb.new.log (the symlink for bb) is created, although /fluentd/log/buffer/ contains two buffer files and two buffer meta files.
  4. Input another record with myid=aa: echo -e '{"myid":"aa","value":"333"}' >> /tmp/test.log
  5. data.aa.new.log (the symlink for aa) is still not created.

Configuration file:

<source>
  @type  tail
  path /tmp/test.log
  tag test
  <parse>
    @type json
  </parse>
</source>

<match test>
  @type file
  @id   output1
  path         /fluentd/log/data.${myid}.*.log
  symlink_path /fluentd/log/data.${myid}.new.log
  append       true
  <buffer myid,time>
    @type file
    timekey 1d
    timekey_wait 1m
    path /fluentd/log/buffer
  </buffer>
</match>

Expected behavior

A symlink is created for the latest chunk for each chunk key.
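
To make the expectation concrete, here is a minimal Ruby sketch of the mapping I would expect. It is not Fluentd code: the Chunk struct, the expected_symlinks helper, and the buffer file names are hypothetical and only illustrate "one symlink per chunk key, pointing at the chunk with the latest timekey".

```ruby
# Hypothetical chunk description; field names are illustrative only and do
# not correspond to Fluentd's real chunk objects.
Chunk = Struct.new(:myid, :timekey, :path)

# For each myid, pick the chunk with the greatest timekey and map the
# expanded symlink path to that chunk's path.
def expected_symlinks(chunks, symlink_template)
  chunks.group_by(&:myid).to_h do |myid, group|
    latest = group.max_by(&:timekey)
    [symlink_template.sub('${myid}', myid), latest.path]
  end
end

# Hypothetical buffer file names, with two days of data for "aa" and one for "bb".
chunks = [
  Chunk.new("aa", 86_400,  "/fluentd/log/buffer/aa-day1"),
  Chunk.new("aa", 172_800, "/fluentd/log/buffer/aa-day2"),
  Chunk.new("bb", 172_800, "/fluentd/log/buffer/bb-day2"),
]

links = expected_symlinks(chunks, '/fluentd/log/data.${myid}.new.log')
links.each { |link, target| puts "#{link} -> #{target}" }
```

With this input, both data.aa.new.log and data.bb.new.log should exist, each pointing at the newest chunk for its own myid.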

Your Environment

- Fluentd version: 1.19.0
- Package version: Docker image `fluentd:v1.19.0-1.0`
- Operating system:
- Kernel version:

(in actual operation: td-agent 4.3.2 fluentd 1.14.6 (c0f48a0080550eff6aa6fa19d269e480684e7a45))

Your Configuration

See the configuration in the “To Reproduce” section.

Your Error Log

N/A

Additional context

I tried to trace the relevant code.

SymlinkBufferMixin#metadata is implemented to hold a single @latest_metadata. It appears to be kept for use in the next generate_chunk call. I suspect that metadata may be called several times with different parameters before generate_chunk is called, so only the last value is retained.

def metadata(timekey: nil, tag: nil, variables: nil)
  metadata = super
  @latest_metadata ||= new_metadata(timekey: 0)
  if metadata.timekey && (metadata.timekey >= @latest_metadata.timekey)
    @latest_metadata = metadata
  end
  metadata
end
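
The suspicion can be illustrated with a small toy model in plain Ruby. This is not Fluentd's implementation: ToyBuffer, its write method, and the rule that generate_chunk creates a symlink only when the new chunk's metadata equals @latest_metadata are assumptions made for illustration. Only the metadata method mirrors the snippet quoted above.

```ruby
# Toy model only; none of these classes are Fluentd's real ones.
Metadata = Struct.new(:timekey, :variables)

class ToyBuffer
  attr_reader :symlinks

  def initialize
    @latest_metadata = nil
    @chunks = {}    # metadata => true once a chunk exists for it
    @symlinks = []  # myid values for which a symlink would be created
  end

  # Mirrors the quoted method: only ONE latest metadata is remembered.
  def metadata(timekey:, variables:)
    m = Metadata.new(timekey, variables)
    @latest_metadata ||= Metadata.new(0, nil)
    if m.timekey && m.timekey >= @latest_metadata.timekey
      @latest_metadata = m
    end
    m
  end

  # ASSUMPTION: a symlink is created only when the newly generated chunk's
  # metadata equals the single remembered @latest_metadata.
  def generate_chunk(m)
    @symlinks << m.variables[:myid] if m == @latest_metadata
    @chunks[m] = true
  end

  # ASSUMPTION: chunks are generated only for metadata that has no chunk yet,
  # and only after metadata was computed for every record in the batch.
  def write(metadata_list)
    metadata_list.each { |m| generate_chunk(m) unless @chunks.key?(m) }
  end
end

buffer = ToyBuffer.new
t = 1_700_000_000 # arbitrary timekey shared by all records

# Bulk input: metadata is computed for aa, then bb, before any chunk exists.
batch = [
  buffer.metadata(timekey: t, variables: { myid: "aa" }),
  buffer.metadata(timekey: t, variables: { myid: "bb" }),
]
buffer.write(batch)
symlinks_after_bulk = buffer.symlinks.dup

# Later input for aa: its chunk already exists, so generate_chunk is skipped.
buffer.write([buffer.metadata(timekey: t, variables: { myid: "aa" })])
symlinks_after_second = buffer.symlinks.dup

puts "after bulk input:   #{symlinks_after_bulk.inspect}"
puts "after second input: #{symlinks_after_second.inspect}"
```

Under these assumptions the model reproduces the symptoms from "To Reproduce": only the bb symlink is created by the bulk input, and the later aa record does not trigger a symlink because no new chunk is generated for it.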

Labels: bug
