-
Notifications
You must be signed in to change notification settings - Fork 1.4k
MergeRollupTask config behaviour enhancement for output.segment.dir.uri
#17015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the configuration behavior for MergeRollupTask to properly prioritize the output.segment.dir.uri value from TaskConfig over the controller's data directory. This enables METADATA push mode for merge rollup operations when using remote storage.
Key Changes:
- Refactored
getPushTaskConfigmethod to prioritize task-level configuration for output segment directory - Added new helper method
getOutputSegmentDirURIto handle URI resolution with proper precedence - Restructured the push mode logic using a switch statement for better clarity
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| MinionTaskUtils.java | Refactored push task configuration logic to prioritize task config over controller data dir and improved code structure |
| MinionTaskUtilsTest.java | Added comprehensive test coverage for various push mode scenarios and configuration combinations |
...on-builtin-tasks/src/test/java/org/apache/pinot/plugin/minion/tasks/MinionTaskUtilsTest.java
Outdated
Show resolved
Hide resolved
| URI outputSegmentDirURI = getOutputSegmentDirURI(taskConfigs, clusterInfoAccessor, tableName); | ||
| if (!isLocalOutputDir(outputSegmentDirURI.getScheme())) { | ||
| switch (segmentPushType) { |
Copilot
AI
Oct 14, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider extracting the switch statement logic into a separate method to improve readability and reduce the complexity of getPushTaskConfig.
…c/test/java/org/apache/pinot/plugin/minion/tasks/MinionTaskUtilsTest.java Co-authored-by: Copilot <[email protected]>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17015 +/- ##
============================================
+ Coverage 63.18% 63.20% +0.02%
- Complexity 1425 1433 +8
============================================
Files 3123 3124 +1
Lines 184813 185318 +505
Branches 28320 28334 +14
============================================
+ Hits 116765 117124 +359
- Misses 59028 59163 +135
- Partials 9020 9031 +11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
shounakmk219
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for adding this functionality!
|
@Jackie-Jiang could you please have a look? Thanks :) |
|
Please take a look at the failed test, and also solve the conflict |
|
@Jackie-Jiang: Fixed the conflict, and it appears that an unrelated test is now failing. Should I just try to re-run? |
|
@t0mpere Thanks for the contribution! Could you please also help update the pinot doc about this change? |
output.segment.dir.urioutput.segment.dir.uri
See: #17014
Changing behaviour to let MergeRollupTask pick up
output.segment.dir.urivalue from the TaskConfig first.The current behaviour only reads from
controller.data.dirand if this path points to a local directory, it's impossible to run ametadataMergeRollup job.Example:
Controller config
Task config
Result
Expected