HIVE-28549: Limit the maximum number of operators merged by SharedWorkOptimizer #5492
ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java
    ArrayListMultimap<String, TableScanOperator> tableNameToOps, int batchSize) {
  if (batchSize == -1) {
    return Collections.singletonList(sortedTables.stream().map(Entry::getKey)
        .flatMap(tableName -> tableNameToOps.get(tableName).stream()).collect(Collectors.toList()));
This puts all TS ops into a single List, regardless of their source table. Maybe you intended the following code?
if (batchSize == -1) {
  return sortedTables.stream()
      .map(entry -> tableNameToOps.get(entry.getKey()))
      .collect(Collectors.toList());
}
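To illustrate the difference between the two shapes, here is a minimal self-contained sketch. It is not the PR's code: plain strings stand in for TableScanOperator, a java.util.Map stands in for Guava's ArrayListMultimap, and the class and method names (BatchShapeDemo, sampleOps, flattened, perTable) are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BatchShapeDemo {
  // Hypothetical stand-in for tableNameToOps: table name -> names of its scan operators.
  static Map<String, List<String>> sampleOps() {
    Map<String, List<String>> ops = new LinkedHashMap<>();
    ops.put("t1", List.of("TS_1", "TS_2"));
    ops.put("t2", List.of("TS_3"));
    return ops;
  }

  // Shape of the original line: operators of every table end up in one shared list.
  static List<List<String>> flattened(Map<String, List<String>> ops) {
    return Collections.singletonList(
        ops.keySet().stream()
            .flatMap(table -> ops.get(table).stream())
            .collect(Collectors.toList()));
  }

  // Shape of the suggested code: one list per table, so sources stay separate.
  static List<List<String>> perTable(Map<String, List<String>> ops) {
    return ops.keySet().stream()
        .map(ops::get)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(flattened(sampleOps())); // [[TS_1, TS_2, TS_3]]
    System.out.println(perTable(sampleOps()));  // [[TS_1, TS_2], [TS_3]]
  }
}
```

With the flattened shape, operators scanning different tables land in the same group, which is what the comment above points out.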
You're right; I was confused when I wrote that line. I modified it and ran some qtests which failed.
e574161
LGTM, +1
(The failed test seems to be just a flaky one; it finished successfully in my local environment.)
e574161 to 3bb41b2
Retriggered CI just in case.
3bb41b2 to c6b76d9
I am checking why
Hi @okumin, should we rebase?
b11b2d7 to 0ed6b63
@deniskuzZ
LGTM, minor comment
Hi @okumin, not sure if you noticed the comment above. Do you think it's legit?
@deniskuzZ Yes, I do. My machine is currently busy reviewing and testing another PR. I will likely update this one tomorrow.
Thanks
What changes were proposed in this pull request?
This PR limits the maximum number of table scan operators that SharedWorkOptimizer (SWO) tries to merge.
https://issues.apache.org/jira/browse/HIVE-28549
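As a rough illustration of the idea only, the sketch below splits one table's scan operators into groups of at most batchSize, so that no more than batchSize operators would be considered for merging together. It is not the PR's implementation: the class and method names (OperatorBatcher, partition) are hypothetical, plain strings stand in for TableScanOperator, and treating -1 as "no limit" is an assumption based on the `batchSize == -1` branch discussed in the review thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OperatorBatcher {
  // Hypothetical helper: splits ops into consecutive groups of at most batchSize elements.
  // Assumption: batchSize == -1 means "no limit", so everything stays in one group.
  static <T> List<List<T>> partition(List<T> ops, int batchSize) {
    if (batchSize == -1) {
      return Collections.singletonList(ops);
    }
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive or -1");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int start = 0; start < ops.size(); start += batchSize) {
      int end = Math.min(start + batchSize, ops.size());
      // Copy the sub-list so each batch is independent of the input list.
      batches.add(new ArrayList<>(ops.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<String> ops = List.of("TS_1", "TS_2", "TS_3", "TS_4", "TS_5");
    System.out.println(partition(ops, 2));  // [[TS_1, TS_2], [TS_3, TS_4], [TS_5]]
    System.out.println(partition(ops, -1)); // [[TS_1, TS_2, TS_3, TS_4, TS_5]]
  }
}
```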
Why are the changes needed?
We observed that SWO has a negative impact when it merges too many operators, e.g. 50. If the operators are memory-intensive, they might fail with an OOM error or slow down.
I believe we can resolve the OOM with the following patch, but we still want an upper limit so that we can tune concurrency or RAM per operator reasonably.
#5478
Does this PR introduce any user-facing change?
No.
Is the change a dependency upgrade?
No.
How was this patch tested?
I added a qtest.