Skip to content

Conversation

@tanmaya-panda1
Copy link
Contributor

@tanmaya-panda1 tanmaya-panda1 commented Sep 1, 2025

This pull request introduces support for streaming ingestion to Azure Kusto in the Fluent Bit output plugin. Streaming ingestion allows data to be sent directly to the Kusto engine, bypassing blob storage and ingestion queues, with a 4MB per-request payload limit and no local buffering. The changes include configuration options, endpoint management, ingestion logic, and resource cleanup.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Optional streaming ingestion: direct-send mode with per-request limits (4 MB uncompressed) and automatic fallback to queued ingestion when limits are exceeded.
  • Configuration

    • Added streaming_ingestion_enabled (default: false).
    • Streaming and local buffering are mutually exclusive; conflict disables streaming with a warning.
  • Stability

    • Improved cleanup of streaming resources and clearer failure/retry behavior; streaming enforced when enabled, with queued fallback.
  • Logging

    • Reduced upload-size verbosity and added warnings about ingestion mode and endpoint selection.

@coderabbitai
Copy link

coderabbitai bot commented Sep 1, 2025

Walkthrough

Adds optional streaming ingestion to the Azure Kusto output plugin: new config flag, derived cluster upstream, streaming flush path with 4MB uncompressed limit and tokenized HTTP POST, fallback to queued ingestion, mutual-exclusivity validation with buffering, and cleanup of the streaming upstream.

Changes

Cohort / File(s) Summary of Changes
Core plugin logic & lifecycle
plugins/out_azure_kusto/azure_kusto.c
Adds streaming_ingestion_enabled handling; derives/creates ctx->u_cluster from ingestion endpoint (strip ingest- prefix when present, warn if absent); routes flush to streaming path vs queued path, enforces streaming behavior and handles fallback/retries; destroys u_cluster on exit; lowers one info log to debug.
Public context struct
plugins/out_azure_kusto/azure_kusto.h
Adds public fields to struct flb_azure_kusto: int streaming_ingestion_enabled and struct flb_upstream *u_cluster.
Configuration & validation
plugins/out_azure_kusto/azure_kusto_conf.c
Validates mutual exclusivity between buffering and streaming; disables streaming and logs a warning if both set; logs selected ingestion mode and characteristics (streaming direct-send behavior, 4MB limit, buffering behavior).
Ingestion implementation
plugins/out_azure_kusto/azure_kusto_ingest.c
Adds azure_kusto_streaming_ingestion(...) implementing streaming ingestion (4MB uncompressed size validation, token acquisition, optional host mapping, headers including optional Content-Encoding, HTTP POST lifecycle and cleanup); oversize payloads fall back to queued ingestion; updates azure_kusto_queued_ingestion signature to accept struct azure_kusto_file *upload_file; adds generate_random_string.
Ingestion header/API
plugins/out_azure_kusto/azure_kusto_ingest.h
Adds macro KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE (4 * 1024 * 1024) and declares int azure_kusto_streaming_ingestion(...) with the new uncompressed_size parameter.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant FLB as Fluent Bit
  participant AK as Azure Kusto Plugin
  participant U_ing as Ingest Endpoint
  participant U_clu as Cluster (Streaming)
  participant Auth as OAuth2

  Note over AK: Initialization
  AK->>AK: parse config (buffering_enabled, streaming_ingestion_enabled)
  alt streaming enabled
    AK->>AK: derive cluster endpoint (strip "ingest-" if present)
    AK->>U_clu: create ctx->u_cluster
  else streaming disabled
    AK->>U_ing: use ingestion upstream
  end

  Note over FLB,AK: Flush
  FLB->>AK: flush(tag, payload)
  alt streaming_ingestion_enabled
    AK->>AK: enforce size limit (4MB uncompressed)
    AK->>Auth: request access token
    AK->>U_clu: HTTP POST streaming ingest (token + headers)
    alt success
      U_clu-->>AK: 200/204
    else error/oversize
      U_clu-->>AK: failure
      AK->>AK: fallback -> queued ingestion or retry path
    end
  else queued_ingestion
    AK->>AK: prepare upload resources
    AK->>U_ing: queued ingestion (blob/queue)
  end

  Note over AK: Shutdown
  AK->>U_clu: destroy ctx->u_cluster (if created)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Review focus:
    • upstream creation/destruction and hostname transformation in azure_kusto.c
    • new streaming HTTP flow, token handling, header construction, and error/fallback logic in azure_kusto_ingest.c
    • updated azure_kusto_queued_ingestion signature and its callsites
    • mutual-exclusivity logic and logging in azure_kusto_conf.c

Poem

I nibble bytes with eager nose,
A streaming tunnel softly grows.
Four meg hops, headers tight,
If queues call, I switch the light.
Hop on — small packets take flight. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "out_azure_kusto: added streaming ingestion support" directly and clearly summarizes the primary change across all modified files. The title identifies the specific plugin (out_azure_kusto) and the main feature being added (streaming ingestion support), which aligns perfectly with the changeset that adds configuration options, endpoint management, ingestion logic, and resource cleanup to enable streaming ingestion. The title is concise, specific, and avoids vague terminology, making it clear to developers scanning the history that streaming ingestion capability was added to the Azure Kusto output plugin.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (12)
plugins/out_azure_kusto/azure_kusto.h (3)

111-113: Use boolean type and document default.

Prefer flb_boolean (FLB_TRUE/FLB_FALSE) for flags; clarify default behavior in a brief comment (default: disabled).

-    /* streaming ingestion mode */
-    int streaming_ingestion_enabled;
+    /* streaming ingestion mode (default: disabled) */
+    int streaming_ingestion_enabled; /* consider flb_boolean */

173-175: Clarify ownership and configure timeouts for u_cluster.

Document lifecycle: who creates/destroys u_cluster; ensure it inherits TLS/timeout settings and is freed on exit/errors.

-    /* Upstream connection to the main Kusto cluster for streaming ingestion */
-    struct flb_upstream *u_cluster;
+    /* Upstream connection to the main Kusto cluster for streaming ingestion.
+     * Ownership: created by init; must be destroyed on exit/error. */
+    struct flb_upstream *u_cluster;

188-189: Header guard naming consistency.

Minor: this header uses FLB_OUT_AZURE_KUSTO while ingest.h uses FLB_OUT_AZURE_KUSTO_INGEST_H. If the repo prefers _H suffix, align.

plugins/out_azure_kusto/azure_kusto_conf.c (1)

806-815: Clarify mode logs and 4MB constraint location.

Good logging. Consider stating that the 4MB limit applies to the request body (post-compression) and will be pre-validated.

plugins/out_azure_kusto/azure_kusto_ingest.h (1)

29-31: API contract: document 4MB limit and return semantics.

Add a brief comment noting payload must be ≤ 4 MiB (per-request), whether size is before/after compression, and what return codes mean.

-int azure_kusto_streaming_ingestion(struct flb_azure_kusto *ctx, flb_sds_t tag,
-                                    size_t tag_len, flb_sds_t payload, size_t payload_size);
+/* Returns 0 on success, -1 on transport/HTTP error.
+ * Payload must respect Kusto streaming 4 MiB limit (effective on-the-wire size). */
+int azure_kusto_streaming_ingestion(struct flb_azure_kusto *ctx, flb_sds_t tag,
+                                    size_t tag_len, flb_sds_t payload, size_t payload_size);
plugins/out_azure_kusto/azure_kusto_ingest.c (4)

548-555: Honor configured timeouts for streaming upstream.

Align cluster upstream timeouts with ingestion settings to avoid long hangs.

-    u_conn = flb_upstream_conn_get(ctx->u_cluster);
+    ctx->u_cluster->base.net.connect_timeout = ctx->ingestion_endpoint_connect_timeout;
+    ctx->u_cluster->base.net.io_timeout = ctx->io_timeout;
+    u_conn = flb_upstream_conn_get(ctx->u_cluster);

575-588: URI building: guard against oversized database/table names.

flb_sds_snprintf auto-expands, but log the final length or validate against reasonable upper bounds to catch misconfigurations early.


595-612: Reduce payload logging to avoid PII leakage.

Even at debug, 200 chars may leak sensitive data. Trim further or behind a verbose flag.

-        flb_plg_debug(ctx->ins, "[STREAMING_INGESTION] Payload sample (first 200 chars): %.200s", (char*)payload);
+        flb_plg_trace(ctx->ins, "[STREAMING_INGESTION] Payload sample (first 64 chars): %.64s", (char*)payload);

622-636: Accept HTTP 202 as success (defensive).

Some services use 202 for accepted streaming writes. Consider treating 200/202/204 as success.

-            if (c->resp.status == 200 || c->resp.status == 204) {
+            if (c->resp.status == 200 || c->resp.status == 202 || c->resp.status == 204) {
                 ret = 0;
plugins/out_azure_kusto/azure_kusto.c (3)

1461-1471: Consider fallback from streaming to queued ingestion on failure

When streaming ingestion fails, the code returns FLB_RETRY which will retry with the same streaming method. Consider falling back to queued ingestion for more resilient operation.

         /* Perform streaming ingestion to Kusto */
         flb_plg_info(ctx->ins, "[FLUSH_STREAMING] Initiating streaming ingestion to Kusto");
         ret = azure_kusto_streaming_ingestion(ctx, event_chunk->tag, tag_len, final_payload, final_payload_size);
         flb_plg_info(ctx->ins, "[FLUSH_STREAMING] Streaming ingestion completed with result: %d", ret);
 
         if (ret != 0) {
-            flb_plg_error(ctx->ins, "[FLUSH_STREAMING] ERROR: Streaming ingestion failed, will retry");
-            ret = FLB_RETRY;
-            goto error;
+            flb_plg_warn(ctx->ins, "[FLUSH_STREAMING] WARNING: Streaming ingestion failed, falling back to queued ingestion");
+            
+            /* Fallback to queued ingestion */
+            ret = azure_kusto_load_ingestion_resources(ctx, config);
+            if (ret != 0) {
+                flb_plg_error(ctx->ins, "cannot load ingestion resources for fallback");
+                ret = FLB_RETRY;
+                goto error;
+            }
+            
+            ret = azure_kusto_queued_ingestion(ctx, event_chunk->tag, tag_len, final_payload, final_payload_size, NULL);
+            if (ret != 0) {
+                flb_plg_error(ctx->ins, "fallback queued ingestion also failed");
+                ret = FLB_RETRY;
+                goto error;
+            }
         } else {
             flb_plg_info(ctx->ins, "[FLUSH_STREAMING] SUCCESS: Streaming ingestion completed successfully");
         }

1449-1459: Add metrics/monitoring for streaming ingestion

Consider adding metrics to track streaming ingestion success/failure rates and payload sizes to help with monitoring and debugging in production.

+        /* Track metrics for streaming ingestion */
+        if (ctx->metrics) {
+            flb_metrics_add(ctx->metrics->streaming_attempts, 1);
+            flb_metrics_add(ctx->metrics->streaming_bytes, final_payload_size);
+        }
+        
         /* Check payload size limit for streaming ingestion (4MB) */
         flb_plg_info(ctx->ins, "[FLUSH_STREAMING] Checking payload size: %zu bytes against 4MB limit", final_payload_size);
         if (final_payload_size > 4194304) { /* 4MB = 4 * 1024 * 1024 */
             flb_plg_error(ctx->ins, "[FLUSH_STREAMING] ERROR: Payload size %zu bytes exceeds 4MB limit for streaming ingestion", final_payload_size);
+            if (ctx->metrics) {
+                flb_metrics_add(ctx->metrics->streaming_size_exceeded, 1);
+            }
             ret = FLB_ERROR;
             goto error;
         }

982-983: Consider configuration validation for invalid endpoint formats

The warning when the ingestion endpoint doesn't contain "ingest-" might not be appropriate for all deployment scenarios. Some users might intentionally use direct cluster endpoints.

     } else {
-        flb_plg_warn(ctx->ins, "ingestion endpoint does not contain 'ingest-' prefix, using as cluster endpoint");
+        flb_plg_info(ctx->ins, "Using ingestion endpoint directly as cluster endpoint (no 'ingest-' prefix found)");
         /* Use ingestion endpoint directly as cluster endpoint */
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1132279 and bf150d8.

📒 Files selected for processing (5)
  • plugins/out_azure_kusto/azure_kusto.c (4 hunks)
  • plugins/out_azure_kusto/azure_kusto.h (3 hunks)
  • plugins/out_azure_kusto/azure_kusto_conf.c (1 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.c (2 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.h (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
plugins/out_azure_kusto/azure_kusto_ingest.h (1)
plugins/out_azure_kusto/azure_kusto_ingest.c (1)
  • azure_kusto_streaming_ingestion (520-658)
plugins/out_azure_kusto/azure_kusto_ingest.c (5)
include/fluent-bit/flb_compat.h (1)
  • gmtime_r (75-81)
src/flb_upstream.c (2)
  • flb_upstream_conn_get (700-833)
  • flb_upstream_conn_release (851-936)
plugins/out_azure_kusto/azure_kusto.c (1)
  • get_azure_kusto_token (114-161)
src/flb_sds.c (3)
  • flb_sds_create_size (92-95)
  • flb_sds_destroy (389-399)
  • flb_sds_snprintf (405-428)
src/flb_http_client.c (4)
  • flb_http_client (800-845)
  • flb_http_add_header (949-981)
  • flb_http_do (1522-1571)
  • flb_http_client_destroy (1627-1634)
plugins/out_azure_kusto/azure_kusto.c (3)
src/flb_sds.c (2)
  • flb_sds_create (78-90)
  • flb_sds_destroy (389-399)
src/flb_upstream.c (2)
  • flb_upstream_create_url (416-481)
  • flb_upstream_destroy (649-687)
plugins/out_azure_kusto/azure_kusto_ingest.c (1)
  • azure_kusto_streaming_ingestion (520-658)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (3)
plugins/out_azure_kusto/azure_kusto_ingest.c (1)

646-655: Cleanup is correct; ensure u_conn non-NULL prior to release.

Looks fine since early returns happen before assignment; keep as-is.

plugins/out_azure_kusto/azure_kusto.c (2)

1580-1583: LGTM! Proper cleanup of cluster upstream

The cleanup of the cluster upstream connection in the exit callback is correctly implemented.


1649-1653: LGTM! Well-documented configuration option

The streaming ingestion configuration option is properly documented with clear limitations (4MB limit, no buffering support).

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
plugins/out_azure_kusto/azure_kusto.c (1)

915-915: Use %zu format specifier for size_t.

The file_size field is of type size_t, which should use the %zu format specifier rather than %lu. On some platforms, size_t may not match unsigned long, leading to warnings or incorrect output.

Apply this diff:

-        flb_plg_debug(ctx->ins, "Using upload size %lu bytes", ctx->file_size);
+        flb_plg_debug(ctx->ins, "Using upload size %zu bytes", ctx->file_size);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf150d8 and 4fd3fd4.

📒 Files selected for processing (4)
  • plugins/out_azure_kusto/azure_kusto.c (5 hunks)
  • plugins/out_azure_kusto/azure_kusto_conf.c (1 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.c (2 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.h (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
plugins/out_azure_kusto/azure_kusto_ingest.h (1)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_queued_ingestion (758-810)
  • azure_kusto_streaming_ingestion (520-666)
plugins/out_azure_kusto/azure_kusto_ingest.c (5)
include/fluent-bit/flb_compat.h (1)
  • gmtime_r (75-81)
src/flb_upstream.c (2)
  • flb_upstream_conn_get (700-833)
  • flb_upstream_conn_release (851-936)
plugins/out_azure_kusto/azure_kusto.c (1)
  • get_azure_kusto_token (114-161)
src/flb_sds.c (3)
  • flb_sds_create_size (92-95)
  • flb_sds_destroy (389-399)
  • flb_sds_snprintf (405-428)
src/flb_http_client.c (4)
  • flb_http_client (800-845)
  • flb_http_add_header (949-981)
  • flb_http_do (1522-1571)
  • flb_http_client_destroy (1627-1634)
plugins/out_azure_kusto/azure_kusto.c (4)
src/flb_sds.c (2)
  • flb_sds_create (78-90)
  • flb_sds_destroy (389-399)
src/flb_upstream.c (2)
  • flb_upstream_destroy (649-687)
  • flb_upstream_create_url (416-481)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_streaming_ingestion (520-666)
  • azure_kusto_queued_ingestion (758-810)
plugins/out_azure_kusto/azure_kusto_conf.c (1)
  • azure_kusto_load_ingestion_resources (530-683)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: PR - fuzzing test
  • GitHub Check: pr-compile-without-cxx (3.31.6)
🔇 Additional comments (7)
plugins/out_azure_kusto/azure_kusto_conf.c (1)

799-815: LGTM! Mutual exclusivity and mode logging implemented correctly.

The implementation properly resolves the conflict between buffering and streaming modes by deterministically disabling streaming when buffering is enabled (with a clear warning), exactly as suggested in the previous review. The mode selection logging provides good visibility into which ingestion path is active and its characteristics.

plugins/out_azure_kusto/azure_kusto_ingest.h (1)

26-35: LGTM! Size limits and function declaration are correct.

The streaming ingestion size limits are clearly defined with appropriate comments referencing the ManagedStreamingIngestClient behavior. The constants (4 MB uncompressed, 1 MB compressed) align with Azure Kusto's documented streaming ingestion limits.

plugins/out_azure_kusto/azure_kusto_ingest.c (2)

520-666: LGTM! Streaming ingestion implementation is solid.

The function correctly implements upfront size validation (lines 542-560) before any network operations, addressing the concern from the previous review. The fallback to queued ingestion when payload exceeds limits is well-handled. Resource management is proper throughout—connections, tokens, URIs, and HTTP clients are all cleaned up on both success and error paths. The HTTP response handling correctly accepts 200/204 status codes as success.


758-810: LGTM! Queued ingestion signature update is correct.

The addition of the upload_file parameter properly enables conditional deletion of buffer files after successful blob ingestion (lines 784-789). The function correctly handles NULL values for this parameter (as passed from streaming ingestion fallback), ensuring compatibility with both buffered and non-buffered usage paths.

plugins/out_azure_kusto/azure_kusto.c (3)

946-1001: LGTM! Cluster upstream creation properly handles errors and URL parsing.

The implementation correctly addresses the issues from previous reviews:

  1. Memory leak fix: Lines 962-964, 984-986, and 996-998 properly clean up ctx->u when cluster upstream creation fails.
  2. Robust URL parsing: Lines 953-977 isolate the hostname portion before checking for the "ingest-" prefix, avoiding false matches in other URL components.
  3. Graceful fallback: Lines 990-999 handle endpoints that don't have the "ingest-" prefix by using the ingestion endpoint directly.

1456-1514: LGTM! Payload size validation and ingestion routing are correct.

The flush logic properly implements the dual size limits for streaming ingestion:

  1. Uncompressed validation (line 1468): Correctly checks json_size against the 4 MB limit before compression.
  2. Compressed validation (line 1475): Checks final_payload_size against the 1 MB limit when compression is enabled.

This addresses the concern from the previous review about validating uncompressed size. The routing between streaming and queued ingestion is clean, with proper resource loading and error handling for both paths.


1603-1606: LGTM! Cleanup and configuration are correct.

The cluster upstream cleanup (lines 1603-1606) properly releases resources on exit. The configuration map entry (lines 1672-1676) has clear documentation explaining that streaming ingestion bypasses blob storage/queues, has a 4 MB limit, and doesn't support buffering—making the trade-offs clear to users.

Also applies to: 1672-1676

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
plugins/out_azure_kusto/azure_kusto.c (1)

915-915: Appropriate log level adjustment.

Reducing the upload size message from info to debug level is a good refinement—this configuration detail doesn't warrant info-level visibility during normal operation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4fd3fd4 and 9b47a6d.

📒 Files selected for processing (5)
  • plugins/out_azure_kusto/azure_kusto.c (5 hunks)
  • plugins/out_azure_kusto/azure_kusto.h (3 hunks)
  • plugins/out_azure_kusto/azure_kusto_conf.c (1 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.c (2 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.h (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/out_azure_kusto/azure_kusto_conf.c
  • plugins/out_azure_kusto/azure_kusto_ingest.h
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/out_azure_kusto/azure_kusto_ingest.c (5)
include/fluent-bit/flb_compat.h (1)
  • gmtime_r (75-81)
src/flb_upstream.c (2)
  • flb_upstream_conn_get (700-833)
  • flb_upstream_conn_release (851-936)
plugins/out_azure_kusto/azure_kusto.c (1)
  • get_azure_kusto_token (114-161)
src/flb_sds.c (3)
  • flb_sds_create_size (92-95)
  • flb_sds_destroy (389-399)
  • flb_sds_snprintf (405-428)
src/flb_http_client.c (4)
  • flb_http_client (800-845)
  • flb_http_add_header (949-981)
  • flb_http_do (1522-1571)
  • flb_http_client_destroy (1627-1634)
plugins/out_azure_kusto/azure_kusto.c (4)
src/flb_sds.c (2)
  • flb_sds_create (78-90)
  • flb_sds_destroy (389-399)
src/flb_upstream.c (2)
  • flb_upstream_destroy (649-687)
  • flb_upstream_create_url (416-481)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_streaming_ingestion (520-666)
  • azure_kusto_queued_ingestion (758-810)
plugins/out_azure_kusto/azure_kusto_conf.c (1)
  • azure_kusto_load_ingestion_resources (530-683)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
🔇 Additional comments (6)
plugins/out_azure_kusto/azure_kusto.h (1)

111-112: LGTM! Clean additions to support streaming ingestion.

The new fields streaming_ingestion_enabled and u_cluster are well-integrated into the existing struct and properly documented with comments explaining their purpose.

Also applies to: 173-174

plugins/out_azure_kusto/azure_kusto_ingest.c (1)

520-666: Excellent resource management and cleanup.

The function properly handles resource cleanup in all code paths:

  • Connection is released via flb_upstream_conn_release(u_conn) in all exit paths (success and error)
  • Token and URI are destroyed using null-safe flb_sds_destroy() calls
  • HTTP client is destroyed after use via flb_http_client_destroy(c)

The cleanup logic at lines 656-663 ensures no resource leaks regardless of success or failure.

plugins/out_azure_kusto/azure_kusto.c (4)

946-1001: Well-implemented streaming upstream creation with proper error handling.

The logic correctly:

  • Parses the URL to isolate the hostname portion (lines 952-954)
  • Checks if the hostname starts with "ingest-" prefix using strncmp (line 957)
  • Removes the prefix when present to derive the cluster endpoint (lines 959-977)
  • Falls back to using the ingestion endpoint directly when the prefix is absent (lines 990-1000)
  • Cleans up ctx->u on all error paths to prevent resource leaks (lines 962-964, 984-986, 996-998)

This addresses the concerns from past reviews about robust URL manipulation and proper cleanup on failure.


1467-1482: Correct implementation of dual size limits for streaming ingestion.

The validation properly enforces both Azure Kusto streaming ingestion limits as documented:

  1. Uncompressed data limit: 4 MB (checked at line 1468 using json_size)
  2. Compressed data limit: 1 MB (checked at line 1475 using final_payload_size when compression is enabled)

This is the appropriate location for validation because both the original uncompressed size (json_size) and the final payload size (final_payload_size) are available. This addresses the concerns raised in past reviews about validating the uncompressed size regardless of compression usage.


1456-1514: Clean separation of streaming and queued ingestion paths.

The conditional branching correctly:

  • Routes to streaming ingestion when enabled (lines 1456-1494), validating sizes upfront and sending directly to the cluster endpoint
  • Routes to queued ingestion when streaming is disabled (lines 1495-1514), loading the necessary resources (blob/queue upstreams and identity token) before ingestion
  • Handles errors appropriately with FLB_RETRY for transient failures
  • Logs debug and info messages at appropriate levels for observability

The separation ensures each path has the resources it needs—streaming uses the cluster upstream, while queued ingestion uses blob/queue resources.


1603-1606: Proper cleanup of streaming upstream on exit.

The cleanup correctly mirrors the pattern used for the ingestion upstream (ctx->u) and ensures the streaming cluster upstream (ctx->u_cluster) is properly destroyed when the plugin exits, preventing resource leaks.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/out_azure_kusto/azure_kusto.c (1)

140-148: Fix deadlock: early return while token_mutex is held

If flb_sds_create_size fails, the function returns without unlocking token_mutex, blocking all future token fetches.

-        output = flb_sds_create_size(flb_sds_len(ctx->o->token_type) +
-                                     flb_sds_len(ctx->o->access_token) + 2);
-        if (!output) {
-            flb_plg_error(ctx->ins, "error creating token buffer");
-            return NULL;
-        }
+        output = flb_sds_create_size(flb_sds_len(ctx->o->token_type) +
+                                     flb_sds_len(ctx->o->access_token) + 2);
+        if (!output) {
+            flb_plg_error(ctx->ins, "error creating token buffer");
+            /* Avoid deadlock: release mutex before returning */
+            pthread_mutex_unlock(&ctx->token_mutex);
+            return NULL;
+        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b47a6d and 47707be.

📒 Files selected for processing (1)
  • plugins/out_azure_kusto/azure_kusto.c (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
plugins/out_azure_kusto/azure_kusto.c (3)
src/flb_sds.c (2)
  • flb_sds_create (78-90)
  • flb_sds_destroy (389-399)
src/flb_upstream.c (2)
  • flb_upstream_destroy (649-687)
  • flb_upstream_create_url (416-481)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_streaming_ingestion (520-666)
  • azure_kusto_queued_ingestion (758-810)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: PR - fuzzing test
🔇 Additional comments (3)
plugins/out_azure_kusto/azure_kusto.c (3)

946-1002: Cluster endpoint derivation and error-path cleanup look solid

Hostname-prefix check, safe memmove, and upstream cleanups are correct. Nice recovery for both prefix and no‑prefix paths.

If ingestion_endpoint ever lacks a scheme (http/https), prefix removal won’t run. Confirm config validation guarantees a full URL.


1612-1615: Good: destroy u_cluster on exit

Prevents leaking the cluster upstream when streaming was enabled.


1681-1685: Correct the streaming ingestion help text; remove inaccurate 1MB compressed limit claim

Runtime enforcement of mutual exclusivity between buffering and streaming ingestion is verified (lines 801–804 in azure_kusto_conf.c). However, the 4 MB limit is for uncompressed data per Microsoft documentation, and there is no separate 1 MB compressed payload limit. The suggested text claiming "If compression is enabled, keep payload <= 1MB" is inaccurate and should be removed. The current help text is substantially correct; minor clarification is acceptable, but do not add unsupported compressed-size guidance.

Likely an incorrect or invalid review comment.

ctx->timer_created = FLB_FALSE;
ctx->timer_ms = (int) (ctx->upload_timeout / 6) * 1000;
flb_plg_info(ctx->ins, "Using upload size %lu bytes", ctx->file_size);
flb_plg_debug(ctx->ins, "Using upload size %lu bytes", ctx->file_size);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use %zu for size_t in logs to avoid UB across platforms

On Windows and some platforms, size_t isn’t unsigned long. Prefer %zu.

-        flb_plg_debug(ctx->ins, "Using upload size %lu bytes", ctx->file_size);
+        flb_plg_debug(ctx->ins, "Using upload size %zu bytes", ctx->file_size);
🤖 Prompt for AI Agents
In plugins/out_azure_kusto/azure_kusto.c around line 915, the debug log uses the
"%lu" format specifier for ctx->file_size which is a size_t and can cause
undefined behavior on platforms where size_t is not unsigned long; change the
format specifier to "%zu" so the printf-family call correctly formats size_t
across platforms and rebuild to ensure no warnings.

Signed-off-by: Tanmaya Panda <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/out_azure_kusto/azure_kusto_ingest.c (1)

670-686: Fix thread-safety issue in random string generation.

srand() and rand() use shared global state that is not thread-safe. Since generate_random_string is called from azure_kusto_create_blob_id, which in turn is called from the flush path (potentially by multiple concurrent coroutines), concurrent calls will race on the RNG state, producing poor-quality randomness or duplicate values that could cause blob ID collisions.

Use the existing thread-safe flb_random_bytes helper already available in the codebase:

 void generate_random_string(char *str, size_t length)
 {
     const char charset[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
     const size_t charset_size = sizeof(charset) - 1;
-
-    /* Seed the random number generator with multiple sources of entropy */
-    unsigned int seed = (unsigned int)(time(NULL) ^ clock() ^ getpid());
-    srand(seed);
 
     size_t i;
+    uint64_t rand_val;
+    
     for (i = 0; i < length; ++i) {
-        size_t index = (size_t)rand() % charset_size;
+        if (flb_random_bytes((unsigned char *)&rand_val, sizeof(uint64_t))) {
+            rand_val = time(NULL) + i;
+        }
+        size_t index = (size_t)(rand_val % charset_size);
         str[i] = charset[index];
     }
 
     str[length] = '\0';
 }
♻️ Duplicate comments (1)
plugins/out_azure_kusto/azure_kusto.c (1)

915-915: Use %zu format specifier for size_t to ensure portability.

The %lu format specifier assumes size_t is unsigned long, which is not true on all platforms (e.g., Windows x64 where size_t is unsigned long long). This causes undefined behavior.

Apply this fix:

-        flb_plg_debug(ctx->ins, "Using upload size %lu bytes", ctx->file_size);
+        flb_plg_debug(ctx->ins, "Using upload size %zu bytes", ctx->file_size);

Based on past review comments.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 47707be and 600918b.

📒 Files selected for processing (2)
  • plugins/out_azure_kusto/azure_kusto.c (6 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.c (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
plugins/out_azure_kusto/azure_kusto.c (3)
src/flb_sds.c (2)
  • flb_sds_create (78-90)
  • flb_sds_destroy (389-399)
src/flb_upstream.c (2)
  • flb_upstream_destroy (649-687)
  • flb_upstream_create_url (416-481)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_streaming_ingestion (520-666)
  • azure_kusto_queued_ingestion (758-810)
plugins/out_azure_kusto/azure_kusto_ingest.c (5)
include/fluent-bit/flb_compat.h (1)
  • gmtime_r (75-81)
src/flb_upstream.c (2)
  • flb_upstream_conn_get (700-833)
  • flb_upstream_conn_release (851-936)
plugins/out_azure_kusto/azure_kusto.c (1)
  • get_azure_kusto_token (114-161)
src/flb_sds.c (3)
  • flb_sds_create_size (92-95)
  • flb_sds_destroy (389-399)
  • flb_sds_snprintf (405-428)
src/flb_http_client.c (4)
  • flb_http_client (800-845)
  • flb_http_add_header (949-981)
  • flb_http_do (1522-1571)
  • flb_http_client_destroy (1627-1634)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: PR - fuzzing test
🔇 Additional comments (3)
plugins/out_azure_kusto/azure_kusto.c (3)

946-1001: LGTM! Cluster endpoint derivation logic is well-implemented.

The code correctly:

  • Isolates the hostname portion after the protocol scheme
  • Checks if the hostname specifically starts with the "ingest-" prefix using strncmp (avoiding false matches elsewhere in the URL)
  • Removes the prefix safely via memmove and adjusts the SDS length
  • Falls back to using the ingestion endpoint directly when the prefix is absent
  • Cleans up ctx->u on all error paths to prevent leaks

The improvements address all concerns from previous review comments.


1456-1497: LGTM! Resource loading correctly supports fallback scenario.

Loading ingestion resources at line 1461 before the streaming/queued branch is the right approach: it ensures streaming ingestion can fall back to queued ingestion when payload limits are exceeded, and queued ingestion requires these resources. The slight inefficiency of loading resources even when streaming succeeds is acceptable for robustness.

Note: The call to azure_kusto_streaming_ingestion at line 1477 should be updated to pass the uncompressed size (json_size) once that parameter is added (see comment on the streaming ingestion function in azure_kusto_ingest.c).


1586-1589: LGTM! Proper cleanup of cluster upstream.

The cleanup pattern matches the existing pattern for ctx->u and correctly sets the pointer to NULL after destruction.

Comment on lines 538 to 560
/*
* Size validation (matching ManagedStreamingIngestClient behavior)
* Check payload size against limits before attempting streaming ingestion
*/
if (ctx->compression_enabled) {
size_limit = KUSTO_STREAMING_MAX_COMPRESSED_SIZE;
} else {
size_limit = KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE;
}

if (payload_size > size_limit) {
flb_plg_warn(ctx->ins, "[STREAMING_INGESTION] Payload size %zu bytes exceeds %s limit of %zu bytes - falling back to queued ingestion",
payload_size, ctx->compression_enabled ? "compressed" : "uncompressed", size_limit);

/* Fallback to queued ingestion */
ret = azure_kusto_queued_ingestion(ctx, tag, tag_len, payload, payload_size, NULL);
if (ret == 0) {
flb_plg_info(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion and succeeded");
} else {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion failed");
}
return ret;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Size validation checks wrong value when compression is enabled.

When compression is enabled, payload_size represents the compressed payload size, but the caller compresses the JSON before passing it to this function. The 4 MB streaming-ingestion limit applies to the uncompressed data size per Azure documentation (as confirmed in past review comments), yet this code validates the compressed size against KUSTO_STREAMING_MAX_COMPRESSED_SIZE (1 MB).

A payload with 5 MB uncompressed but 900 KB compressed would pass validation here but exceed Kusto's 4 MB uncompressed limit, causing server-side rejection and wasted bandwidth.

Add an uncompressed_size parameter to this function and validate it against KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE regardless of compression:

 int azure_kusto_streaming_ingestion(struct flb_azure_kusto *ctx, flb_sds_t tag,
-                                    size_t tag_len, flb_sds_t payload, size_t payload_size)
+                                    size_t tag_len, flb_sds_t payload, size_t payload_size,
+                                    size_t uncompressed_size)
 {
     int ret = -1;
     struct flb_connection *u_conn;
     struct flb_http_client *c;
     flb_sds_t uri = NULL;
     flb_sds_t token = NULL;
     size_t resp_size;
     time_t now;
     struct tm tm;
     char tmp[64];
     int len;
-    size_t size_limit;

     flb_plg_debug(ctx->ins, "[STREAMING_INGESTION] Starting for tag: %.*s, payload: %zu bytes, db: %s, table: %s, compression: %s",
                  (int)tag_len, tag, payload_size, ctx->database_name, ctx->table_name, ctx->compression_enabled ? "enabled" : "disabled");

     /* 
-     * Size validation (matching ManagedStreamingIngestClient behavior)
-     * Check payload size against limits before attempting streaming ingestion
+     * Validate uncompressed size against 4 MB limit (Azure requirement)
      */
-    if (ctx->compression_enabled) {
-        size_limit = KUSTO_STREAMING_MAX_COMPRESSED_SIZE;
-    } else {
-        size_limit = KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE;
-    }
-
-    if (payload_size > size_limit) {
-        flb_plg_warn(ctx->ins, "[STREAMING_INGESTION] Payload size %zu bytes exceeds %s limit of %zu bytes - falling back to queued ingestion",
-                     payload_size, ctx->compression_enabled ? "compressed" : "uncompressed", size_limit);
+    if (uncompressed_size > KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE) {
+        flb_plg_warn(ctx->ins, "[STREAMING_INGESTION] Uncompressed payload %zu bytes exceeds 4MB limit - falling back to queued ingestion", uncompressed_size);
         
         /* Fallback to queued ingestion */
-        ret = azure_kusto_queued_ingestion(ctx, tag, tag_len, payload, payload_size, NULL);
+        ret = azure_kusto_queued_ingestion(ctx, tag, tag_len, payload, payload_size, NULL);
         if (ret == 0) {
-            flb_plg_info(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion and succeeded");
+            flb_plg_info(ctx->ins, "[STREAMING_INGESTION] Successfully fell back to queued ingestion");
         } else {
             flb_plg_error(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion failed");
         }
         return ret;
     }

Then update the caller in azure_kusto.c to pass the uncompressed size:

/* In azure_kusto.c around line 1477 */
ret = azure_kusto_streaming_ingestion(ctx, event_chunk->tag, tag_len, final_payload, final_payload_size, json_size);
🤖 Prompt for AI Agents
In plugins/out_azure_kusto/azure_kusto_ingest.c around lines 538-560, the
function currently checks payload_size against compressed or uncompressed limits
based on ctx->compression_enabled, which is incorrect because when compression
is enabled payload_size is the compressed size while Kusto's streaming limit is
based on uncompressed data; change the function signature to accept an
additional size_t uncompressed_size parameter, always validate uncompressed_size
against KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE and fall back to queued ingestion
if it exceeds that limit, and (optionally) keep the existing compressed-size
check for an extra safeguard; then update the caller(s) in azure_kusto.c (around
line ~1477) to pass the original JSON/uncompressed size as the new
uncompressed_size argument and update any other call sites to match the new
signature.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
plugins/out_azure_kusto/azure_kusto_ingest.c (1)

696-719: Null-check required for generated_random_string allocation.

generated_random_string is malloc’ed and then used unconditionally; NULL leads to crash.

-    generated_random_string = flb_malloc(ctx->blob_uri_length + 1);
-    flb_time_get(&tm);
+    generated_random_string = flb_malloc(ctx->blob_uri_length + 1);
+    if (!generated_random_string) {
+        flb_errno();
+        return NULL;
+    }
+    flb_time_get(&tm);
plugins/out_azure_kusto/azure_kusto.c (2)

935-944: Potential NULL deref: check ctx->u before dereferencing.

ctx->u is used (flags/io_timeout) before the NULL check. Move the NULL check up.

-    ctx->u = flb_upstream_create_url(config, ctx->ingestion_endpoint, io_flags, ins->tls);
-    if (ctx->buffering_enabled ==  FLB_TRUE){
-        flb_stream_disable_flags(&ctx->u->base, FLB_IO_ASYNC);
-        ctx->u->base.net.io_timeout = ctx->io_timeout;
-        ctx->has_old_buffers = azure_kusto_store_has_data(ctx);
-    }
-    if (!ctx->u) {
+    ctx->u = flb_upstream_create_url(config, ctx->ingestion_endpoint, io_flags, ins->tls);
+    if (!ctx->u) {
         flb_plg_error(ctx->ins, "upstream creation failed");
         return -1;
     }
+    if (ctx->buffering_enabled ==  FLB_TRUE){
+        flb_stream_disable_flags(&ctx->u->base, FLB_IO_ASYNC);
+        ctx->u->base.net.io_timeout = ctx->io_timeout;
+        ctx->has_old_buffers = azure_kusto_store_has_data(ctx);
+    }

1505-1514: Do not free event_chunk->tag; only destroy owned strings.

When unify_tag is false, tag_name points to event_chunk->tag. Cleanup destroys it unconditionally -> double free/UB. Track ownership.

-    flb_sds_t tag_name = NULL;
+    flb_sds_t tag_name = NULL;
+    int tag_name_owned = FLB_FALSE;
...
-        if (ctx->unify_tag == FLB_TRUE){
-            tag_name = flb_sds_create("fluentbit-buffer-file-unify-tag.log");
-        }
-        else {
-            tag_name = event_chunk->tag;
-        }
+        if (ctx->unify_tag == FLB_TRUE){
+            tag_name = flb_sds_create("fluentbit-buffer-file-unify-tag.log");
+            tag_name_owned = FLB_TRUE;
+        } else {
+            tag_name = event_chunk->tag; /* not owned */
+            tag_name_owned = FLB_FALSE;
+        }
...
-    if (tag_name) {
-        flb_sds_destroy(tag_name);
-    }
+    if (tag_name_owned && tag_name) {
+        flb_sds_destroy(tag_name);
+    }
...
-    if (tag_name) {
-        flb_sds_destroy(tag_name);
-    }
+    if (tag_name_owned && tag_name) {
+        flb_sds_destroy(tag_name);
+    }

Also applies to: 1518-1526, 1304-1311

♻️ Duplicate comments (1)
plugins/out_azure_kusto/azure_kusto.c (1)

915-915: Use %zu for size_t (printf portability).

size_t may not be unsigned long on all platforms.

-        flb_plg_debug(ctx->ins, "Using upload size %lu bytes", ctx->file_size);
+        flb_plg_debug(ctx->ins, "Using upload size %zu bytes", ctx->file_size);
🧹 Nitpick comments (3)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)

666-682: Avoid srand/rand; use cryptographic RNG and limit symbol scope.

srand/rand are process‑global and not thread‑safe. Use flb_random_bytes and make the helper static.

-void generate_random_string(char *str, size_t length)
+static void generate_random_string(char *str, size_t length)
 {
     const char charset[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
     const size_t charset_size = sizeof(charset) - 1;
-
-    /* Seed the random number generator with multiple sources of entropy */
-    unsigned int seed = (unsigned int)(time(NULL) ^ clock() ^ getpid());
-    srand(seed);
-
-    size_t i;
-    for (i = 0; i < length; ++i) {
-        size_t index = (size_t)rand() % charset_size;
-        str[i] = charset[index];
-    }
+    size_t i;
+    for (i = 0; i < length; ++i) {
+        unsigned char b = 0;
+        if (flb_random_bytes(&b, 1) != 0) {
+            /* fallback with time-based byte if RNG fails */
+            b = (unsigned char)((time(NULL) ^ getpid() ^ (uintptr_t)&i) & 0xFF);
+        }
+        str[i] = charset[b % charset_size];
+    }
     str[length] = '\0';
 }

183-187: Fix mutex error messages (lock vs unlock).

Logs say “error unlocking mutex” on lock failures; invert messages for accuracy.

-        flb_plg_error(ctx->ins, "error unlocking mutex");
+        flb_plg_error(ctx->ins, "error locking mutex");
...
-        flb_plg_error(ctx->ins, "error unlocking mutex");
+        flb_plg_error(ctx->ins, "error unlocking mutex");
...
-        flb_plg_error(ctx->ins, "error unlocking mutex");
+        flb_plg_error(ctx->ins, "error locking mutex");
...
-        flb_plg_error(ctx->ins, "error unlocking mutex");
+        flb_plg_error(ctx->ins, "error unlocking mutex");

Also applies to: 191-194, 762-765, 771-775

plugins/out_azure_kusto/azure_kusto.c (1)

979-1001: Optionally associate u_cluster with the output instance.

Not required, but setting the owner improves observability and lifecycle coherence.

             ctx->u_cluster = flb_upstream_create_url(config, cluster_endpoint, io_flags, ins->tls);
             if (!ctx->u_cluster) {
                 ...
             }
+            flb_output_upstream_set(ctx->u_cluster, ins);
...
             ctx->u_cluster = flb_upstream_create_url(config, ctx->ingestion_endpoint, io_flags, ins->tls);
             if (!ctx->u_cluster) {
                 ...
             }
+            flb_output_upstream_set(ctx->u_cluster, ins);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 600918b and dbff29b.

📒 Files selected for processing (3)
  • plugins/out_azure_kusto/azure_kusto.c (6 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.c (2 hunks)
  • plugins/out_azure_kusto/azure_kusto_ingest.h (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
plugins/out_azure_kusto/azure_kusto_ingest.h (1)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_queued_ingestion (754-806)
  • azure_kusto_streaming_ingestion (520-662)
plugins/out_azure_kusto/azure_kusto_ingest.c (5)
include/fluent-bit/flb_compat.h (1)
  • gmtime_r (75-81)
src/flb_upstream.c (2)
  • flb_upstream_conn_get (700-833)
  • flb_upstream_conn_release (851-936)
plugins/out_azure_kusto/azure_kusto.c (1)
  • get_azure_kusto_token (114-161)
src/flb_sds.c (3)
  • flb_sds_create_size (92-95)
  • flb_sds_destroy (389-399)
  • flb_sds_snprintf (405-428)
src/flb_http_client.c (4)
  • flb_http_client (800-845)
  • flb_http_add_header (949-981)
  • flb_http_do (1522-1571)
  • flb_http_client_destroy (1627-1634)
plugins/out_azure_kusto/azure_kusto.c (3)
src/flb_sds.c (2)
  • flb_sds_create (78-90)
  • flb_sds_destroy (389-399)
src/flb_upstream.c (2)
  • flb_upstream_destroy (649-687)
  • flb_upstream_create_url (416-481)
plugins/out_azure_kusto/azure_kusto_ingest.c (2)
  • azure_kusto_streaming_ingestion (520-662)
  • azure_kusto_queued_ingestion (754-806)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: PR - fuzzing test
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: pr-compile-centos-7
🔇 Additional comments (2)
plugins/out_azure_kusto/azure_kusto_ingest.h (2)

26-35: LGTM on API and limit macro.

The 4 MiB uncompressed cap and the extra uncompressed_size arg align with the streaming constraints.


26-28: Comment based on incorrect assumptions about function signatures.

The code is correct: azure_kusto_streaming_ingestion receives uncompressed_size properly (passes json_size at line 1477, which is confirmed as the uncompressed payload size before compression). However, azure_kusto_queued_ingestion has a different signature and intentionally lacks an uncompressed_size parameter—it doesn't need one. The review conflates two distinct functions.

The minor "4 MB" → "4 MiB" terminology nit is valid but optional.

Likely an incorrect or invalid review comment.

Comment on lines +520 to +662
int azure_kusto_streaming_ingestion(struct flb_azure_kusto *ctx, flb_sds_t tag,
size_t tag_len, flb_sds_t payload, size_t payload_size,
size_t uncompressed_size)
{
int ret = -1;
struct flb_connection *u_conn;
struct flb_http_client *c;
flb_sds_t uri = NULL;
flb_sds_t token = NULL;
size_t resp_size;
time_t now;
struct tm tm;
char tmp[64];
int len;

flb_plg_info(ctx->ins, "[STREAMING_INGESTION] Starting for tag: %.*s, uncompressed: %zu bytes, payload: %zu bytes, db: %s, table: %s, compression: %s",
(int)tag_len, tag, uncompressed_size, payload_size, ctx->database_name, ctx->table_name, ctx->compression_enabled ? "enabled" : "disabled");

/*
* Size validation per Azure Kusto documentation:
* https://learn.microsoft.com/en-us/azure/data-explorer/ingest-data-streaming
* "Data size limit: The data size limit for a streaming ingestion request is 4 MB."
* This limit applies to the uncompressed data size.
*/
if (uncompressed_size > KUSTO_STREAMING_MAX_UNCOMPRESSED_SIZE) {
flb_plg_warn(ctx->ins, "[STREAMING_INGESTION] Uncompressed data size %zu bytes exceeds 4MB limit - falling back to queued ingestion",
uncompressed_size);

/* Fallback to queued ingestion */
ret = azure_kusto_queued_ingestion(ctx, tag, tag_len, payload, payload_size, NULL);
if (ret == 0) {
flb_plg_info(ctx->ins, "[STREAMING_INGESTION] Successfully fell back to queued ingestion");
} else {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion failed");
}
return ret;
}

now = time(NULL);
gmtime_r(&now, &tm);
len = strftime(tmp, sizeof(tmp) - 1, "%a, %d %b %Y %H:%M:%S GMT", &tm);

/* Get upstream connection to the main Kusto cluster endpoint (for streaming ingestion) */
if (!ctx->u_cluster) {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: cluster upstream not available - streaming ingestion requires cluster endpoint");
return -1;
}

u_conn = flb_upstream_conn_get(ctx->u_cluster);
if (!u_conn) {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: failed to get cluster upstream connection");
return -1;
}

/* Get authentication token */
token = get_azure_kusto_token(ctx);
if (!token) {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: failed to retrieve OAuth2 token");
flb_upstream_conn_release(u_conn);
return -1;
}

/* Build the streaming ingestion URI */
uri = flb_sds_create_size(256);
if (!uri) {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: failed to create URI buffer");
flb_sds_destroy(token);
flb_upstream_conn_release(u_conn);
return -1;
}

/* Create the streaming ingestion URI */
if (ctx->ingestion_mapping_reference) {
flb_sds_snprintf(&uri, flb_sds_alloc(uri),
"/v1/rest/ingest/%s/%s?streamFormat=MultiJSON&mappingName=%s",
ctx->database_name, ctx->table_name, ctx->ingestion_mapping_reference);
} else {
flb_sds_snprintf(&uri, flb_sds_alloc(uri),
"/v1/rest/ingest/%s/%s?streamFormat=MultiJSON",
ctx->database_name, ctx->table_name);
flb_plg_debug(ctx->ins, "[STREAMING_INGESTION] No ingestion mapping specified");
}

/* Create HTTP client for streaming ingestion */
c = flb_http_client(u_conn, FLB_HTTP_POST, uri, payload, payload_size,
NULL, 0, NULL, 0);

if (c) {
/* Add required headers for streaming ingestion */
flb_http_add_header(c, "User-Agent", 10, "Fluent-Bit", 10);
flb_http_add_header(c, "Accept", 6, "application/json", 16);
flb_http_add_header(c, "Authorization", 13, token, flb_sds_len(token));
flb_http_add_header(c, "x-ms-date", 9, tmp, len);
flb_http_add_header(c, "x-ms-client-version", 19, FLB_VERSION_STR, strlen(FLB_VERSION_STR));
flb_http_add_header(c, "x-ms-app", 8, "Kusto.Fluent-Bit", 16);
flb_http_add_header(c, "x-ms-user", 9, "Kusto.Fluent-Bit", 16);

/* Set Content-Type based on whether compression is enabled */
if (ctx->compression_enabled) {
flb_http_add_header(c, "Content-Type", 12, "application/json", 16);
flb_http_add_header(c, "Content-Encoding", 16, "gzip", 4);
} else {
flb_http_add_header(c, "Content-Type", 12, "application/json; charset=utf-8", 31);
}

/* Send the HTTP request */
ret = flb_http_do(c, &resp_size);

flb_plg_info(ctx->ins, "[STREAMING_INGESTION] HTTP request completed - http_do result: %d, HTTP Status: %i, Response size: %zu", ret, c->resp.status, resp_size);

if (ret == 0) {
/* Check for successful HTTP status codes */
if (c->resp.status == 200 || c->resp.status == 204) {
ret = 0;
} else {
ret = -1;
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: HTTP request failed with status: %i", c->resp.status);

if (c->resp.payload_size > 0) {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] Error response body (size: %zu): %.*s",
c->resp.payload_size, (int)c->resp.payload_size, c->resp.payload);
}
}
} else {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: HTTP request failed at transport level (ret=%d)", ret);
}

flb_http_client_destroy(c);
} else {
flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: failed to create HTTP client context");
}

/* Cleanup */
if (uri) {
flb_sds_destroy(uri);
}
if (token) {
flb_sds_destroy(token);
}
flb_upstream_conn_release(u_conn);

return ret;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Streaming: accept all 2xx; add retryable fallback (413/429/5xx).

Only 200/204 are treated as success; add 202 and, safer, any 2xx. Also fall back to queued ingestion on classic retryable statuses.

-            /* Check for successful HTTP status codes */
-            if (c->resp.status == 200 || c->resp.status == 204) {
+            /* Treat any 2xx as success */
+            if (c->resp.status >= 200 && c->resp.status < 300) {
                 ret = 0;
             } else {
                 ret = -1;
                 flb_plg_error(ctx->ins, "[STREAMING_INGESTION] ERROR: HTTP request failed with status: %i", c->resp.status);
 
                 if (c->resp.payload_size > 0) {
                     flb_plg_error(ctx->ins, "[STREAMING_INGESTION] Error response body (size: %zu): %.*s",
                                 c->resp.payload_size, (int)c->resp.payload_size, c->resp.payload);
                 }
+
+                /* Optional: automatic fallback on retryable statuses */
+                if (c->resp.status == 413 || c->resp.status == 429 ||
+                    (c->resp.status >= 500 && c->resp.status < 600)) {
+                    flb_plg_warn(ctx->ins, "[STREAMING_INGESTION] Status %i -> attempting queued ingestion fallback", c->resp.status);
+                    /* destroy client before fallback return path */
+                    flb_http_client_destroy(c);
+                    c = NULL;
+                    /* Fall back to queued ingestion */
+                    ret = azure_kusto_queued_ingestion(ctx, tag, tag_len, payload, payload_size, NULL);
+                    if (ret == 0) {
+                        flb_plg_info(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion succeeded");
+                    }
+                    else {
+                        flb_plg_error(ctx->ins, "[STREAMING_INGESTION] Fallback to queued ingestion failed");
+                    }
+                    /* Ensure we free local resources before returning */
+                    if (uri) { flb_sds_destroy(uri); }
+                    if (token) { flb_sds_destroy(token); }
+                    flb_upstream_conn_release(u_conn);
+                    return ret;
+                }
             }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant