Skip to content

Conversation

@xuxiong1
Copy link
Contributor

@xuxiong1 xuxiong1 commented Dec 3, 2025

Description

Added support for handling non-segment files (custom metadata) in subdirectory-aware store recovery.

  • Renamed loadSubdirectoryMetadataFromSegments() to loadSubdirectoryMetadata()
  • Added computeFileMetadata() to calculate metadata (checksum, length) for non-segment files

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • New Features

    • Support for custom non-segment metadata files in subdirectory stores; their metadata (length, checksum, writer version) is now read and included.
  • Bug Fixes

    • Recovery now enforces matching sets of segment and non-segment metadata files across nodes to avoid inconsistent recoveries.
  • Tests

    • Added tests for mixed segment/non-segment loading, checksum validation, and metadata population.
  • Chores

    • Changelog updated to document handling of custom metadata files.

✏️ Tip: You can customize this high-level summary in your review settings.

@xuxiong1 xuxiong1 requested a review from a team as a code owner December 3, 2025 00:58
@coderabbitai
Copy link

coderabbitai bot commented Dec 3, 2025

Walkthrough

Loads metadata for both segment files and custom non-segment files (prefixed test_metadata_) in subdirectory stores, computing length, checksum, and Lucene writtenBy/version for non-segment files; integrates these into metadata aggregation and recovery comparisons and adds tests that create and require consistent subdirectory file sets across nodes.

Changes

Cohort / File(s) Change Summary
Subdirectory-aware store logic
modules/store-subdirectory/src/main/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareStore.java
Replaces segment-only loader with loadSubdirectoryMetadata to handle both segments_* and non-segment files; adds helpers to compute per-file metadata (length, checksum via CodecUtil, writtenBy/Version) for non-segment files and integrates them into the metadata builder and document counts.
Recovery tests (internalClusterTest)
modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java
Tests now create Lucene-compatible non-segment files (test_metadata_*) during flush/commit, read and aggregate both segment and non-segment files per node, and require identical file sets across nodes; adds helper(s) to write non-segment files and imports (CodecUtil, IOContext, IndexOutput).
Plugin unit tests / test scaffolding
modules/store-subdirectory/src/test/java/org/opensearch/plugin/store/subdirectory/SubdirectoryStorePluginTests.java
Adds test scaffolding (createTestStoreSetup, writeTestFileWithCodec) and a nested TestIndexCommit that aggregates files from main directory and subdirectory (filtering custom metadata files); new tests validate metadata population, checksum correctness, mixed file handling, and duplicate avoidance; adds Lucene IO/checksum imports.
Changelog
CHANGELOG.md
Adds Unreleased 3.x entry: "Handle custom metadata files in subdirectory-store (#20157)".

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Verify checksum computation and CodecUtil usage in SubdirectoryAwareStore matches Lucene expectations (endianness, footer/header handling).
  • Review writtenBy/version extraction and any compatibility implications.
  • Check non-segment file creation in tests (IndexOutput, IOContext) for correctness and potential flakiness.
  • Validate TestIndexCommit aggregation/filtering logic to ensure no duplicates/missing files and parity with real commit behavior.
  • Confirm that integrating non-segment metadata does not unintentionally alter doc counts or recovery logic.

Poem

🐰 I hopped through bytes beneath the moonlight,
I stitched small files with header bright,
Checksums hummed like tunes at night,
Nodes compared what felt just right,
🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.93% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding support for handling custom metadata files in the subdirectory-store module.
Description check ✅ Passed The description covers the main functionality changes and includes test validation checkbox, but lacks related issue number in 'Related Issues' section which is part of the template.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14730ff and ddac05e.

📒 Files selected for processing (1)
  • CHANGELOG.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: check-files
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: detect-breaking-change

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
modules/store-subdirectory/src/main/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareStore.java (2)

140-154: Consider more precise segment file identification.

The check fileName.contains(IndexFileNames.SEGMENTS) could potentially match non-segment files that happen to contain "segments" in their name. Consider using a more precise check:

-                if (fileName.contains(IndexFileNames.SEGMENTS)) {
+                String localFileName = filePath.getFileName().toString();
+                if (localFileName.startsWith(IndexFileNames.SEGMENTS)) {

This ensures only actual segment files (prefixed with "segments_") are matched, rather than any file containing "segments" anywhere in the path.


247-247: Consider documenting the version selection rationale.

Using minimumIndexCompatibilityVersion().luceneVersion is a reasonable fallback for non-segment files where the actual writer version is unknown. Consider adding a brief comment explaining this choice for future maintainability.

+                // Use minimum compatible version as fallback since non-segment files don't store writer version
                 Version version = org.opensearch.Version.CURRENT.minimumIndexCompatibilityVersion().luceneVersion;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b6d266 and 65c8087.

📒 Files selected for processing (2)
  • modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java (7 hunks)
  • modules/store-subdirectory/src/main/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareStore.java (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
🔇 Additional comments (9)
modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java (6)

11-22: LGTM!

The new imports for CodecUtil, IOContext, and IndexOutput are necessary for the non-segment file creation functionality.


116-133: LGTM!

The verification logic correctly aggregates both segment files (from SegmentInfos) and non-segment files (by prefix matching), ensuring comprehensive recovery validation.


146-157: LGTM!

The additional verification ensures non-segment files are present in the subdirectory after recovery, which is essential for validating the new functionality.


205-211: LGTM!

The constant and counter are well-defined. The counter is an instance field, which is appropriate for this test engine context where each shard has its own engine instance.


259-273: Well-structured non-segment file creation.

The method properly creates a Lucene-compatible file with CodecUtil.writeHeader and CodecUtil.writeFooter, ensuring the checksum can be validated during recovery. The use of try-with-resources ensures proper resource cleanup.


343-351: LGTM!

The modification to include non-segment files in getFileNames() is consistent with the verification logic and ensures these files are properly included in the recovery process.

modules/store-subdirectory/src/main/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareStore.java (3)

13-23: LGTM!

The new imports for CodecUtil and Version are necessary for the checksum computation and version handling in the new computeFileMetadata method.


127-128: LGTM!

The updated comment and method call correctly reflect the expanded scope to include non-segment files.


158-167: LGTM!

The processing logic correctly handles both file types:

  • Segment files are processed via loadMetadataFromSubdirectorySegmentsFile which aggregates all referenced segment files
  • Non-segment files are only processed if not already present in the builder, avoiding duplicate work

@xuxiong1
Copy link
Contributor Author

xuxiong1 commented Dec 3, 2025

@coderabbitai generate docstrings

@coderabbitai
Copy link

coderabbitai bot commented Dec 3, 2025

✅ Actions performed

Initiated docstring generation; will generate only if new commits exist.

@coderabbitai
Copy link

coderabbitai bot commented Dec 3, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #20158

coderabbitai bot added a commit that referenced this pull request Dec 3, 2025
Docstrings generation was requested by @xuxiong1.

* #20157 (comment)

The following files were modified:

* `modules/store-subdirectory/src/main/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareStore.java`
@github-actions
Copy link
Contributor

github-actions bot commented Dec 3, 2025

❌ Gradle check result for 65c8087: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: xuxiong1 <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java (1)

342-350: LGTM!

The addition correctly includes non-segment files in the IndexCommit.getFileNames() result, ensuring these files are included during recovery. The logic appropriately mirrors the verification method for consistency.

Optional: The non-segment file detection logic (filtering by prefix) is duplicated here and in verifySubdirectoryFilesOnAllNodes. Consider extracting a helper method like isNonSegmentFile(String fileName) to centralize this logic, though for test code the current approach is acceptable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65c8087 and bfd6b6f.

📒 Files selected for processing (2)
  • CHANGELOG.md (1 hunks)
  • modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java (7 hunks)
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: detect-breaking-change
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: Analyze (java)
🔇 Additional comments (5)
modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java (5)

11-22: LGTM!

The new imports are necessary for creating Lucene-compatible non-segment files with proper codec headers and footers.


116-133: LGTM!

The verification logic correctly aggregates both segment files (via SegmentInfos) and non-segment files (via prefix filtering) to ensure complete file coverage during recovery verification.


146-156: LGTM!

Good addition of the assertion to verify non-segment files are actually present. This ensures the test validates the intended functionality rather than silently passing when no non-segment files exist.


204-210: LGTM!

The constant for non-segment file prefix and the instance-level counter for generating unique file names are appropriately defined.


251-272: LGTM!

The createNonSegmentFile method correctly creates Lucene-compatible files with proper codec headers and footers. This ensures the non-segment files have valid checksums that can be verified during recovery, which aligns with the PR's goal of testing custom metadata file handling.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 3, 2025

✅ Gradle check result for bfd6b6f: SUCCESS

@codecov
Copy link

codecov bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 95.65217% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 73.19%. Comparing base (da18cc6) to head (42c9c06).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...gin/store/subdirectory/SubdirectoryAwareStore.java 95.65% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20157      +/-   ##
============================================
- Coverage     73.25%   73.19%   -0.07%     
- Complexity    71684    71706      +22     
============================================
  Files          5788     5793       +5     
  Lines        327866   328073     +207     
  Branches      47218    47248      +30     
============================================
- Hits         240194   240134      -60     
- Misses        68399    68599     +200     
- Partials      19273    19340      +67     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: xuxiong1 <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
modules/store-subdirectory/src/test/java/org/opensearch/plugin/store/subdirectory/SubdirectoryStorePluginTests.java (5)

150-187: TestStoreSetup/createTestStoreSetup look sound, but consider reusing in existing tests

The holder and factory encapsulate shard/store setup cleanly and mirror the existing testStats setup; construction of SubdirectoryAwareStore and paths looks consistent. You might optionally refactor testStats to use createTestStoreSetup() as well to avoid duplication and keep test configuration changes in one place.


241-292: Mixed files test logic is good; re-use subdirectory path helper to avoid drift

The mixed segment/non‑segment scenario and duplicate‑detection assertions are solid and should catch regressions in how getMetadata merges file lists.

Same concern as above about setup.path.resolve("subdir"): it would be safer to compute this path via a shared helper (or a constant matching the production subdirectory name and relative root) so both tests and implementation refer to the same physical location.


294-349: Checksum validation test is strong; consider asserting presence of both files explicitly

This test nicely validates that the checksum produced by the store metadata matches a direct CodecUtil.checksumEntireFile computation and that different content yields different checksums.

Two small suggestions:

  • Add explicit assertTrue(metadata.contains("subdir/checksum_test.dat")) / contains("subdir/checksum_test2.dat") before reading from metadata2 to get clearer failure messages if the files are accidentally filtered out.
  • As with the other tests, consider factoring out subdirectory path computation to ensure it stays aligned with the production subdirectory location.

351-361: writeTestFileWithCodec helper is correct; optional deterministic timestamp

The helper correctly writes a Lucene‑style header/footer and some payload, which is appropriate for files that will be checksummed via Lucene APIs.

If you ever need bit‑for‑bit deterministic test data (e.g., across JVMs or runs), you could replace System.currentTimeMillis() with a fixed value. For the current use (only comparing different files), the current approach is fine.


363-460: TestIndexCommit implementation is generally correct; consider deduplication and small robustness tweaks

The custom TestIndexCommit correctly:

  • Delegates segment metadata to a captured SegmentInfos.
  • Aggregates main‑directory segment files plus any subdirectory segment files (when a commit exists).
  • Adds non‑segment “custom metadata” files while filtering out Lucene internals and test infrastructure via isCustomMetadataFile.

A few optional improvements:

  1. Avoid shadowing directory: inside getFileNames, the local Directory directory = FSDirectory.open(subdirPath) shadows the field directory. Renaming the local (e.g., subdirDirectory) would improve readability.

  2. Return a Set to enforce uniqueness: using a Set<String> for allFiles would guarantee uniqueness at construction time rather than relying on downstream distinct() checks.

  3. Code reuse with existing TestIndexCommit: there’s a very similar TestIndexCommit in SubdirectoryAwareRecoveryTests. If feasible, consider sharing a common helper (e.g., in a test utility) to avoid divergence in how file lists are composed and filtered.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bfd6b6f and 42c9c06.

📒 Files selected for processing (1)
  • modules/store-subdirectory/src/test/java/org/opensearch/plugin/store/subdirectory/SubdirectoryStorePluginTests.java (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
modules/store-subdirectory/src/test/java/org/opensearch/plugin/store/subdirectory/SubdirectoryStorePluginTests.java (2)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
modules/store-subdirectory/src/internalClusterTest/java/org/opensearch/plugin/store/subdirectory/SubdirectoryAwareRecoveryTests.java (1)
  • TestIndexCommit (314-385)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-latest)

@github-actions
Copy link
Contributor

github-actions bot commented Dec 3, 2025

✅ Gradle check result for 42c9c06: SUCCESS

Copy link
Contributor

@yupeng9 yupeng9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good

try (IndexInput in = dir.openInput(localFileName, IOContext.READONCE)) {
long length = in.length();
String checksum = Store.digestToString(CodecUtil.checksumEntireFile(in));
Version version = org.opensearch.Version.CURRENT.minimumIndexCompatibilityVersion().luceneVersion;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for non-segment info file, we also need lucene version as placeholder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you check the constructor of StoreFileMetadata.java, the version is mandatory.

this.writtenBy = Objects.requireNonNull(writtenBy, "writtenBy must not be null");

@github-actions
Copy link
Contributor

github-actions bot commented Dec 5, 2025

❌ Gradle check result for ddac05e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants