Skip to content

Conversation

varunbharadwaj
Copy link
Contributor

@varunbharadwaj varunbharadwaj commented Oct 12, 2025

Description

This PR fixes the following bugs and inefficiencies.

  1. Removes persisted pointer concept to fix correctness corner cases related to user initiated consumer rewind (as described in the linked issue) when versioning is not used. Pull-based ingestion will provide atleast-once processing guarantees. Versioning must be used to ensure consistent view of documents on rewind/replay, if eventually consistent view is not acceptable.
  2. Setauto.offset.reset Kafka setting to none by default to throw errors on out-of-bounds offsets. This could be possible if the ingestion was paused and exceeds Kafka retention period. The user will be able to monitor consumer errors to identify this error and handle it accordingly. User still has the choice to set auto.offset.reset to earliest/latest.
  3. Update the Kafka polling logic to remove batch size filtering. This prevents ingestion getting stuck, if user configures auto.offset.reset to earliest/latest and offset is out-of-bounds. Instead, the OpenSearch consumer setting poll.max_batch_size is used in place of max.poll.records for Kafka consumer initialization to control batch size. If max.poll.records is configured by user, it takes precedence.

Related Issues

Resolves #19591

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions github-actions bot added bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing labels Oct 12, 2025
Copy link
Contributor

❌ Gradle check result for 4eb43e8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@varunbharadwaj varunbharadwaj force-pushed the vb/fixpoller branch 2 times, most recently from 6fabbc4 to cb49bb0 Compare October 12, 2025 04:22
Copy link
Contributor

✅ Gradle check result for cb49bb0: SUCCESS

Copy link

codecov bot commented Oct 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.10%. Comparing base (252cff8) to head (afaf5e0).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #19607      +/-   ##
============================================
- Coverage     73.11%   73.10%   -0.01%     
+ Complexity    70661    70634      -27     
============================================
  Files          5724     5725       +1     
  Lines        323498   323701     +203     
  Branches      46852    46876      +24     
============================================
+ Hits         236518   236655     +137     
- Misses        67846    67897      +51     
- Partials      19134    19149      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@varunbharadwaj varunbharadwaj changed the title [Pull-based Ingestion] Fix poller bug and remove persisted pointers [WIP][Pull-based Ingestion] Fix poller bug and remove persisted pointers Oct 13, 2025
@varunbharadwaj varunbharadwaj force-pushed the vb/fixpoller branch 2 times, most recently from 797bb76 to 4cfe2a1 Compare October 13, 2025 23:53
Copy link
Contributor

❌ Gradle check result for 4cfe2a1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@varunbharadwaj varunbharadwaj changed the title [WIP][Pull-based Ingestion] Fix poller bug and remove persisted pointers [Pull-based Ingestion] Fix poller bug and remove persisted pointers Oct 14, 2025
Copy link
Contributor

❌ Gradle check result for 515649b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 78e0ba0: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

…cords when not explicitly configured

Signed-off-by: Varun Bharadwaj <[email protected]>
Signed-off-by: Varun Bharadwaj <[email protected]>
Copy link
Contributor

❕ Gradle check result for afaf5e0: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@msfroh msfroh merged commit 14578f6 into opensearch-project:main Oct 14, 2025
55 checks passed
peteralfonsi pushed a commit to peteralfonsi/OpenSearch that referenced this pull request Oct 15, 2025
…pensearch-project#19607)

* fix poller bug and remove persisted pointers
* update kafka consumer to use opensearch max_batch_size as max.poll.records when not explicitly configured
* fix kinesis tests

---------

Signed-off-by: Varun Bharadwaj <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Indexing Indexing, Bulk Indexing and anything related to indexing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Fix duplicate or old message skipping logic in pull-based ingestion when versioning is not used

2 participants