log: Update log related metrics with a certian interval #11084

cosmo0920 · 2025-10-30T09:34:25Z

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

Example configuration file for the change

---
service:
  flush: 1
  daemon: Off
  log_level: debug
  # Enable/Disable the built-in HTTP Server for metrics
  http_server: Off
  http_listen: 127.0.0.1
  http_port: 2020

pipeline:
  inputs:
    - name: fluentbit_metrics
      tag: metrics_fluentbit
      scrape_interval: 60s

  outputs:
    - name: prometheus_remote_write
      match: 'metrics_*'
      host: localhost
      port: 9090
      uri: /api/v1/write
      retry_limit: 2
      log_response_payload: True
      tls: Off
      add_label: job fluentbit2

With the following Prometheus configuration:

scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: []

Launch Prometheus with:

$ prometheus --config.file=/path/to/prometheus.yaml --web.enable-remote-write-receiver

Debug log output from testing the change

Fluent Bit v4.2.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __  
|  ___| |                | |   | ___ (_) |           /   | /  | 
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| | 
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | | 
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/


[2025/10/30 18:27:48.943855000] [ info] Configuration:
[2025/10/30 18:27:48.943862000] [ info]  flush time     | 1.000000 seconds
[2025/10/30 18:27:48.943866000] [ info]  grace          | 5 seconds
[2025/10/30 18:27:48.943868000] [ info]  daemon         | 0
[2025/10/30 18:27:48.943870000] [ info] ___________
[2025/10/30 18:27:48.943872000] [ info]  inputs:
[2025/10/30 18:27:48.943874000] [ info]      fluentbit_metrics
[2025/10/30 18:27:48.943876000] [ info] ___________
[2025/10/30 18:27:48.943878000] [ info]  filters:
[2025/10/30 18:27:48.943880000] [ info] ___________
[2025/10/30 18:27:48.943882000] [ info]  outputs:
[2025/10/30 18:27:48.943884000] [ info]      prometheus_remote_write.0
[2025/10/30 18:27:48.943886000] [ info] ___________
[2025/10/30 18:27:48.943888000] [ info]  collectors:
[2025/10/30 18:27:48.945038000] [ info] [fluent bit] version=4.2.0, commit=03dada8a7b, pid=86872
[2025/10/30 18:27:48.945046000] [debug] [engine] coroutine stack size: 36864 bytes (36.0K)
[2025/10/30 18:27:48.945508000] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/10/30 18:27:48.945679000] [ info] [simd    ] NEON
[2025/10/30 18:27:48.945683000] [ info] [cmetrics] version=1.0.5
[2025/10/30 18:27:48.946101000] [ info] [ctraces ] version=0.6.6
[2025/10/30 18:27:48.946227000] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] initializing
[2025/10/30 18:27:48.946233000] [ info] [input:fluentbit_metrics:fluentbit_metrics.0] storage_strategy='memory' (memory only)
[2025/10/30 18:27:48.946240000] [debug] [fluentbit_metrics:fluentbit_metrics.0] created event channels: read=25 write=26
[2025/10/30 18:27:48.946411000] [debug] [prometheus_remote_write:prometheus_remote_write.0] created event channels: read=27 write=28
[2025/10/30 18:27:48.946687000] [ info] [output:prometheus_remote_write:prometheus_remote_write.0] worker #0 started
[2025/10/30 18:27:48.946695000] [ info] [output:prometheus_remote_write:prometheus_remote_write.0] worker #1 started
[2025/10/30 18:27:48.946936000] [ info] [sp] stream processor started
[2025/10/30 18:27:48.947026000] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
[2025/10/30 18:28:50.4113000] [debug] [task] created task=0x8c70106c0 id=0 OK
[2025/10/30 18:28:50.4326000] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] task_id=0 assigned to thread #0
[2025/10/30 18:28:50.4480000] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetrics msgpack size: 7550
[2025/10/30 18:28:50.7549000] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] cmetric_id=0 decoded 0-7550 payload_size=4065
[2025/10/30 18:28:50.7608000] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] final payload size: 4065
[2025/10/30 18:28:50.21820000] [debug] [upstream] KA connection #59 to localhost:9090 is connected
[2025/10/30 18:28:50.22244000] [debug] [http_client] not using http_proxy for header
[2025/10/30 18:28:50.24022000] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] localhost:9090, HTTP status=204
[2025/10/30 18:28:50.24054000] [debug] [upstream] KA connection #59 to localhost:9090 is now available
[2025/10/30 18:28:50.24067000] [debug] [output:prometheus_remote_write:prometheus_remote_write.0] http_post result FLB_OK
[2025/10/30 18:28:50.24090000] [debug] [out flush] cb_destroy coro_id=0
[2025/10/30 18:28:50.24144000] [debug] [task] destroy task=0x8c70106c0 (task_id=0)
<snip>

Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Run local packaging test showing all targets (including any new ones) build.
Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

Documentation required for this feature

Backporting

Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

New Features
- Introduced heartbeat-based logging metrics that periodically emit per-type log counters for improved observability.
- Added metrics lifecycle APIs to create and destroy logging metrics, and initialization of heartbeat timing so metrics are emitted at configured intervals.

Signed-off-by: Hiroshi Hatake <[email protected]>

coderabbitai · 2025-10-30T09:34:49Z

Walkthrough

Added heartbeat-driven metrics emission with cross-platform monotonic timing, new metrics lifecycle APIs, and heartbeat scheduling fields on the public log struct; heartbeat checks are invoked from the log print path to emit per-type counters at configured intervals.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Header field additions `include/fluent-bit/flb_log.h` | Added two public fields to `struct flb_log`: `uint64_t next_hb_ns` and `uint64_t hb_interval_ns` for heartbeat scheduling. |
| Heartbeat & monotonic time `src/flb_log.c` | Added cross-platform `monotonic_now_ns()` (Windows and POSIX variants) and heartbeat scheduling logic initialized in `flb_log_create()`. |
| Metrics lifecycle & emission `src/flb_log.c` | Added `flb_log_metrics_create()` and `flb_log_metrics_destroy()` APIs and `flb_log_metrics_heartbeat()` to emit per-type log counters; integrated heartbeat invocation into `flb_log_print()` before counter increments. |

Sequence Diagram

sequenceDiagram
    participant Print as flb_log_print()
    participant Time as monotonic_now_ns()
    participant HB as Heartbeat Check
    participant Metrics as flb_log_metrics_heartbeat()

    Print->>Time: get current monotonic ns
    Print->>HB: check (log->next_hb_ns, current_ns)
    alt current_ns >= log->next_hb_ns
        HB->>Metrics: emit per-type counters
        Metrics-->>HB: update metrics state
        HB->>Print: advance log->next_hb_ns
    else not yet
        HB-->>Print: no-op
    end
    Print->>Print: increment per-type counter
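
The flow in the diagram above can be sketched in C roughly as follows. This is a minimal sketch, not the PR's code: `struct hb_state` and `hb_due()` are hypothetical names that mirror the `next_hb_ns` and `hb_interval_ns` fields, and the caller is assumed to pass a monotonic timestamp in nanoseconds.

```c
#include <stdint.h>

/* Hypothetical stand-in for the heartbeat fields added to struct flb_log. */
struct hb_state {
    uint64_t next_hb_ns;      /* monotonic deadline of the next heartbeat */
    uint64_t hb_interval_ns;  /* heartbeat period in nanoseconds */
};

/*
 * Returns 1 when a heartbeat is due at now_ns and advances the deadline;
 * otherwise returns 0 and leaves the state untouched.
 */
static int hb_due(struct hb_state *hb, uint64_t now_ns)
{
    if (now_ns < hb->next_hb_ns) {
        return 0;   /* not yet: the hot path takes this early return */
    }
    hb->next_hb_ns = now_ns + hb->hb_interval_ns;
    return 1;
}
```

A caller in the print path would emit the per-type counters only when `hb_due()` returns 1, then increment the counter for the current message as usual.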

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Review focus:
- monotonic_now_ns() platform-specific implementations for correctness and edge cases.
- Initialization of hb_interval_ns / next_hb_ns in flb_log_create() and their thread-safety.
- Integration point in flb_log_print() (hot path) for performance impact and correctness.
- Public APIs flb_log_metrics_create() / flb_log_metrics_destroy() for allocation, cleanup, and null-safety.

Poem

🐇 A rabbit taps the monotonic drum,
Beats in nanoseconds, steady they come.
Counters hum, heartbeats sing,
Metrics dance on every ping.
Hooray — the logs keep time and run!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
| Linked Issues Check | ❓ Inconclusive | Issue #11082 requires that fluentbit_ metrics continue being included in prometheus_remote_write payloads indefinitely, without ceasing after approximately one hour. The PR introduces heartbeat mechanisms (`flb_log_metrics_create()`, `flb_log_metrics_destroy()`, `flb_log_metrics_heartbeat()`) that emit per-type log counters at configured intervals, and integrates heartbeat calls into `flb_log_print()` prior to counter increment. However, the provided summaries lack sufficient detail about how these changes specifically prevent the metric exclusion behavior or how they integrate with the fluentbit_metrics input and prometheus_remote_write output pipeline to definitively confirm the issue is fully resolved. | To conclusively validate compliance, the analysis would benefit from additional context showing: how the heartbeat mechanism specifically prevents metrics from being excluded after one hour, how the new metrics APIs integrate with the fluentbit_metrics input, and confirmation that only fluentbit_ metrics are affected as intended (not other metric sources). |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "log: Update log related metrics with a certian interval" directly relates to the main changes in this PR, which introduce heartbeat-based metric updating infrastructure. The changes add interval-based metric updating through the new `hb_interval_ns` and `next_hb_ns` fields, plus the `flb_log_metrics_heartbeat()` function that emits per-type log counters at configured intervals. While the title could be more specific and contains a typo ("certian" vs "certain"), it accurately captures the core purpose of the changeset. |
| Out of Scope Changes Check | ✅ Passed | All code changes in this PR are directly related to implementing interval-based metric updating infrastructure to address the fluentbit_metrics issue. The changes include adding heartbeat-related fields to the flb_log structure, implementing cross-platform monotonic timing, creating new metrics lifecycle APIs, and integrating heartbeat invocations into the logging pipeline. No changes appear to address unrelated concerns or introduce functionality outside the stated objectives of ensuring continuous metric emission. |


coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/flb_log.c (1)
667-674: Incomplete error cleanup leads to resource leaks.

If flb_log_metrics_create() fails, the error path at lines 670-673 destroys the event loop and frees the log structure, but doesn't clean up the ch_mng pipe created at line 644. This leaks the pipe file descriptors.

Apply this diff:
     log->metrics = flb_log_metrics_create();
     if (log->metrics == NULL) {
         fprintf(stderr, "[log] could not create log metrics\n");
+        flb_pipe_destroy(log->ch_mng);
         mk_event_loop_destroy(log->evl);
         flb_free(log);
         config->log = NULL;
         return NULL;
     }
Additionally, the error path at lines 694-700 is also incomplete: it should call flb_log_metrics_destroy(log->metrics) before freeing the log structure.

Apply this diff:
     ret = flb_log_worker_init(worker);
     if (ret == -1) {
         flb_errno();
+        flb_log_metrics_destroy(log->metrics);
+        flb_pipe_destroy(log->ch_mng);
         mk_event_loop_destroy(log->evl);
         flb_free(log);
         config->log = NULL;
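
One way to avoid missing a resource on an individual error path is to route every failure through a single cleanup routine. The following is a minimal sketch of that pattern only: `struct ctx`, `ctx_create()` and `ctx_destroy()` are hypothetical, and plain `malloc`/`free` stand in for the event loop, the `ch_mng` pipe, and the metrics context rather than the real Fluent Bit calls.

```c
#include <stdlib.h>

struct ctx {
    void *loop;     /* stands in for the event loop */
    void *pipe;     /* stands in for the ch_mng pipe */
    void *metrics;  /* stands in for the log metrics */
};

/* Releases whatever has been acquired so far; free(NULL) is a no-op. */
static void ctx_destroy(struct ctx *c)
{
    if (!c) {
        return;
    }
    free(c->metrics);
    free(c->pipe);
    free(c->loop);
    free(c);
}

/*
 * Acquires three resources in order. fail_at (1..3) forces the matching
 * step to fail so the shared cleanup path can be exercised; 0 means success.
 */
static struct ctx *ctx_create(int fail_at)
{
    struct ctx *c = calloc(1, sizeof(*c));

    if (!c) {
        return NULL;
    }
    c->loop = (fail_at == 1) ? NULL : malloc(1);
    if (!c->loop) {
        goto error;
    }
    c->pipe = (fail_at == 2) ? NULL : malloc(1);
    if (!c->pipe) {
        goto error;
    }
    c->metrics = (fail_at == 3) ? NULL : malloc(1);
    if (!c->metrics) {
        goto error;
    }
    return c;

error:
    ctx_destroy(c);   /* one exit path frees every earlier acquisition */
    return NULL;
}
```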

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6345fd1 and 57507e3.

📒 Files selected for processing (2)

include/fluent-bit/flb_log.h (1 hunks)
src/flb_log.c (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/flb_log.c (1)

lib/cmetrics/src/cmt_counter.c (1)

cmt_counter_add (119-135)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-centos-7
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-without-cxx (3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: PR - fuzzing test
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)

🔇 Additional comments (4)

src/flb_log.c (4)

544-553: LGTM!

Proper null checks and cleanup order. The function safely handles null inputs and destroys resources in the correct sequence.

559-609: LGTM!

The metrics creation logic is well-structured with proper error handling. Each failure path correctly calls flb_log_metrics_destroy() to clean up partial state. The use of cmt_counter_set() for initialization is appropriate.

Note: The loop at lines 592-606 has the same range limitation as the heartbeat function (excludes HELP and IDEBUG), which is consistent but worth confirming as intentional.

892-892: LGTM!

The heartbeat integration is correctly placed in the log print path. While this adds a small overhead to every log call, the heartbeat function has an early return (line 510 in flb_log_metrics_heartbeat) that minimizes impact when the heartbeat isn't due. The use of wall-clock time (cfl_time_now()) for metric timestamps while the heartbeat uses monotonic time for scheduling is the correct approach.

963-963: LGTM!

The metrics cleanup is properly integrated into the log destruction sequence, ensuring resources are released before the log structure is freed.

include/fluent-bit/flb_log.h

src/flb_log.c

coderabbitai · 2025-10-30T09:40:15Z

src/flb_log.c

+    log->hb_interval_ns = 60ULL * 1000000000ULL;
+    log->next_hb_ns     = 0;


⚠️ Potential issue | 🟠 Major

Hardcoded heartbeat interval should be configurable.

The heartbeat interval is hardcoded to 60 seconds (line 641), but the PR test configuration uses scrape_interval=60s in the fluentbit_metrics input. If users configure a different scrape interval, the hardcoded heartbeat won't align with the expected metric update frequency.

Consider:

1. Reading the scrape interval from the fluentbit_metrics input configuration
2. Making the heartbeat interval configurable via a parameter
3. Documenting why 60 seconds was chosen if it's intentional to be fixed

Initial heartbeat fires immediately.

Setting next_hb_ns = 0 (line 642) means the first heartbeat will trigger on the first log message, since monotonic_now_ns() returns a positive value. This may be intentional to ensure metrics are emitted early, but consider setting next_hb_ns = monotonic_now_ns() + hb_interval_ns if you want the first heartbeat to fire after one full interval.
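
The two initialization choices discussed here can be compared with a small sketch. The helper names below are hypothetical and only model the deadline arithmetic; they are not functions from the PR.

```c
#include <stdint.h>

/* Deadline that makes the first check fire immediately (next_hb_ns = 0). */
static uint64_t first_deadline_immediate(void)
{
    return 0;
}

/* Deadline that delays the first heartbeat by one full interval. */
static uint64_t first_deadline_delayed(uint64_t now_ns, uint64_t interval_ns)
{
    return now_ns + interval_ns;
}

/* A heartbeat is due once the monotonic clock reaches the deadline. */
static int deadline_reached(uint64_t now_ns, uint64_t deadline_ns)
{
    return now_ns >= deadline_ns;
}
```

With a zero deadline, any monotonic reading satisfies `deadline_reached()`, so the first log message triggers a heartbeat; with the delayed deadline, nothing is emitted until a full interval has passed.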

🤖 Prompt for AI Agents

In src/flb_log.c around lines 641-642, the heartbeat interval is hardcoded to 60s and next_hb_ns is set to 0; change this so the hb interval is configurable and the initial heartbeat is scheduled correctly: read the scrape interval from the fluentbit_metrics input (or expose a new parameter) and use that value (with a sensible default of 60s) to set log->hb_interval_ns, and set log->next_hb_ns = monotonic_now_ns() + log->hb_interval_ns so the first heartbeat fires after one interval (also add a short comment and update docs/config example to note the new parameter or why 60s is default).

This is a draft proposal, so we don't need to make the interval configurable for now.
So, we'll go with option 3 for now.

@cosmo0920, understood! Since this is a draft proposal, keeping the hardcoded 60-second interval is perfectly reasonable for now. You can make it configurable later if needed once the approach is validated. Thanks for clarifying! 👍

Signed-off-by: Hiroshi Hatake <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

src/flb_log.c (2)
517-521: Use continue instead of break for robustness.

Line 520 uses break when message_type_str is NULL, which stops processing all remaining log types. While NULL is not expected for the current loop range (FLB_LOG_ERROR through FLB_LOG_TRACE are all defined in the switch), using continue would be more defensive and allow processing of subsequent types if the loop range or type definitions change in the future.

Apply this diff:
         if (!message_type_str) {
-            break;
+            continue;
         }
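
The difference between the two keywords can be seen in a standalone sketch. The type numbers and `type_name()` lookup below are hypothetical and only simulate a gap in the mapping; they are not the Fluent Bit log type constants.

```c
#include <stddef.h>

/* Hypothetical lookup: returns NULL for type 2 to simulate an unmapped type. */
static const char *type_name(int type)
{
    switch (type) {
    case 1: return "error";
    case 3: return "info";
    default: return NULL;
    }
}

/* Counts the types in [first, last] that have a name, skipping gaps. */
static int count_named(int first, int last)
{
    int type;
    int n = 0;

    for (type = first; type <= last; type++) {
        if (!type_name(type)) {
            continue;   /* with break, type 3 would never be visited */
        }
        n++;
    }
    return n;
}
```
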
502-528: Consider thread-safety for heartbeat scheduling.

The next_hb_ns field (lines 513, 527) is read and written without synchronization. When flb_log_print() is called concurrently from multiple threads, multiple threads might observe an expired next_hb_ns and all emit heartbeats simultaneously. While cmt_counter_add() is atomic and prevents data corruption, this could result in redundant heartbeat emissions.

For improved accuracy, consider adding a compare-and-swap or mutex to ensure only one thread advances next_hb_ns per interval. However, since duplicate heartbeats are not critical, this optimization can be deferred.
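
A compare-and-swap variant could look roughly like the sketch below. It assumes C11 `<stdatomic.h>` is available and models the deadline as a file-scope atomic, which is an assumption made for illustration; in the PR `next_hb_ns` is a plain `uint64_t` field, and `hb_try_claim()` is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical atomic variant of the next_hb_ns deadline (starts at 0). */
static _Atomic uint64_t next_hb_ns;

/*
 * Returns 1 for exactly one caller per expired deadline: the thread whose
 * compare-and-swap moves the deadline forward. Other threads either see a
 * deadline that is not yet reached or lose the exchange, and return 0.
 */
static int hb_try_claim(uint64_t now_ns, uint64_t interval_ns)
{
    uint64_t expected = atomic_load(&next_hb_ns);

    if (now_ns < expected) {
        return 0;
    }
    return atomic_compare_exchange_strong(&next_hb_ns, &expected,
                                          now_ns + interval_ns);
}
```

Only the thread that wins the exchange would go on to emit the heartbeat, so concurrent callers of the print path would not produce duplicates for the same interval.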

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57507e3 and ca9dc71.

📒 Files selected for processing (1)

src/flb_log.c (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

src/flb_log.c (1)

lib/cmetrics/src/cmt_counter.c (1)

cmt_counter_add (119-135)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)

GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
GitHub Check: PR - fuzzing test
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-centos-7
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
GitHub Check: pr-compile-without-cxx (3.31.6)

🔇 Additional comments (4)

src/flb_log.c (4)

466-491: LGTM: Overflow mitigation is sound.

The Windows monotonic time calculation now splits the division to avoid overflow (lines 487-490), addressing the previous review concern. The approach is correct for all practical purposes.

Note: A theoretical overflow still exists if rem * 1000000000ULL exceeds UINT64_MAX (requires performance counter frequency > 18 billion Hz), but this is extremely unlikely in real-world systems.
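
The split-division idea can be illustrated with a small, platform-independent sketch. `ticks_to_ns()` is a hypothetical helper, not the PR's function, and the 10 MHz frequency used in the note below is only an example value.

```c
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Convert a tick count to nanoseconds without computing ticks * 1e9
 * directly, which would overflow uint64_t after roughly 18.4e9 ticks.
 * The remaining product rem * 1e9 can only overflow when freq_hz itself
 * exceeds about 18.4e9, since rem is always smaller than freq_hz.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
    uint64_t whole = ticks / freq_hz;   /* full seconds */
    uint64_t rem   = ticks % freq_hz;   /* leftover ticks, always < freq_hz */

    return whole * NS_PER_SEC + (rem * NS_PER_SEC) / freq_hz;
}
```

For a counter running at 10 MHz, the naive `ticks * 1000000000 / freq` form would wrap after about 1,845 seconds of uptime, whereas the split form stays exact well beyond that.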

493-500: LGTM: Standard POSIX monotonic clock implementation.

The non-Windows implementation correctly uses CLOCK_MONOTONIC with clock_gettime, which is the standard approach for obtaining monotonic timestamps.
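
For reference, a typical POSIX monotonic clock read looks like the sketch below. `monotonic_ns_sketch()` is a hypothetical name chosen to avoid implying it matches the PR's `monotonic_now_ns()`, and returning 0 on failure is an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Returns monotonic time in nanoseconds, or 0 if the clock cannot be read. */
static uint64_t monotonic_ns_sketch(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
        return 0;
    }
    return (uint64_t) ts.tv_sec * 1000000000ULL + (uint64_t) ts.tv_nsec;
}
```

Because `CLOCK_MONOTONIC` is not affected by wall-clock adjustments, consecutive readings never go backwards, which is what makes it suitable for scheduling the heartbeat deadline.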

644-645: LGTM: Heartbeat configuration is appropriate for draft proposal.

The 60-second interval aligns with the test configuration's scrape interval, and setting next_hb_ns = 0 ensures the first heartbeat emits on the first log message, which is sensible for initialization.

As discussed in previous reviews, making the interval configurable can be deferred until the draft is finalized.

895-895: LGTM: Heartbeat invocation placement is correct.

Calling flb_log_metrics_heartbeat() before incrementing the log counter ensures metrics are refreshed at the configured interval, which addresses the PR objective of keeping fluentbit_* metrics alive for prometheus_remote_write.

Note: This adds a monotonic_now_ns() call overhead to every log message, even when the heartbeat interval hasn't elapsed. For most workloads this overhead is negligible, but if profiling reveals performance issues with high-frequency logging, consider sampling or caching the monotonic time check.

cosmo0920 · 2025-10-31T08:54:28Z

This could be a first-aid approach, so we need to reevaluate how to mitigate this type of issue.

log: Update log related metrics with a certain interval

57507e3

Signed-off-by: Hiroshi Hatake <[email protected]>

cosmo0920 requested review from edsiper, fujimotos and koleini as code owners October 30, 2025 09:34

github-actions bot added the docs-required label Oct 30, 2025

cosmo0920 temporarily deployed to pr October 30, 2025 09:34 — with GitHub Actions Inactive

coderabbitai bot reviewed Oct 30, 2025

View reviewed changes

cosmo0920 temporarily deployed to pr October 30, 2025 09:53 — with GitHub Actions Inactive

cosmo0920 added this to the Fluent Bit v4.2 milestone Oct 30, 2025

cosmo0920 added backport to v4.0.x backport to v4.1.x labels Oct 30, 2025

log: Address a coderabitai's comment

ca9dc71

Signed-off-by: Hiroshi Hatake <[email protected]>

cosmo0920 temporarily deployed to pr October 30, 2025 10:57 — with GitHub Actions Inactive

coderabbitai bot reviewed Oct 30, 2025

View reviewed changes

cosmo0920 temporarily deployed to pr October 30, 2025 11:20 — with GitHub Actions Inactive

cosmo0920 closed this Oct 31, 2025
