-
Couldn't load subscription status.
- Fork 25.6k
ES|QL: manage INLINE STATS count(*) on result sets with no columns #137017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ES|QL: manage INLINE STATS count(*) on result sets with no columns #137017
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Hi @luigidellaquila, I've created a changelog YAML for you. |
|
|
||
| private int blockIndex() { | ||
| return countAll ? 0 : channels.get(0); | ||
| return countAll ? -1 : channels.get(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically this was wrong: countAll does not rely on blocks, it only relies on page.positionCount
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the intent here to return an illegal value when countAll is true given that addRawInput does not call blockIndex in that case? E.g. in case a future modification leads to calling page.getBlock(-1) at line 100 ?
Also curious if the intent is to trigger the assert in addIntermediateInput in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the assert should better protect us now, in case of empty pages.
page.getBlock(-1) will never happen with current implementation, and in case of modifications it will throw a clear error (IMHO better than silently returning a wrong value, because we are calculating results based on the wrong block)
| if (mask.getBoolean(0) == false) { | ||
| return; | ||
| } | ||
| count = countAll ? block.getPositionCount() : block.getTotalValueCount(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't countAll always be false in this path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, it can be simplified, thanks @cimequinox!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change relies on properties of the mask when encountering blocks with no values. The calculations appear correct.
Two suggestions:
- It would be good to add a comment describing the intent of the -1 constant returned by
blockIndex. - The count calculation on line 107 could be simplified to
count = block.getTotalValueCount();
|
Thanks @cimequinox, I pushed a fix for the suggested changes |
| from employees | ||
| | keep emp_no | ||
| | drop emp_no | ||
| | inline stats c = count(*) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix doesn't seem to be inline stats specific. If so (please, correct me if I'm wrong), there should be use cases where a regular stats should reveal this error, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original query had a STATS as last command.
Unfortunately it doesn't run anymore, because of the limitations introduced by #136981, and without the LIMIT the query works just fine.
(Disabling that limitation, and with this fix, the original query doesn't fail anymore btw)
I suspect that the problem could still happen with STATS, but I couldn't find a way to reproduce it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that the problem could still happen with STATS
I mean, it could have happened, but this fix should also cover normal STATS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to debug the two queries
ROW a = 12 | DROP a | STATS count(*)
ROW a = 12 | DROP a | INLINE STATS b = count(*)
The interesting bit is that with INLINE STATS the page in the aggregator has no blocks, while with STATS it still has the original block for a.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am looking at this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've looked at this and your point @luigidellaquila about aggregator having a different behavior for stats vs. inline stats showed that what we do with inline stats has a "manual" step that I did question in code several times so far: marking a plan as optimized without pushing it through another round of optimizations, only because the parts it's made of were already pushed through the logical optimizer.
What happens is that with inline stats we get to execute a plan like this one below where the EsqlProject with empty projection list as a ProjectOperator on the execution layer gets to create a Page with no blocks (because there is nothing to project)
Aggregate[[],[COUNT(*[KEYWORD],true[BOOLEAN]) AS count(*)#11]]
\_EsqlProject[[]]
\_LocalRelation[[a{r}#9],org.elasticsearch.xpack.esql.plan.logical.local.CopyingLocalSupplier@7cc]
While with regular stats (the same aggregation as above), the optimizer "merges" the two plans:
Rule logical.CombineProjections applied with change
Limit[1000[INTEGER],false,false] = Limit[1000[INTEGER],false,false]
\_Aggregate[[],[COUNT(*[KEYWORD],true[BOOLEAN]) AS count(*)#117]] = \_Aggregate[[],[COUNT(*[KEYWORD],true[BOOLEAN]) AS count(*)#117]]
\_EsqlProject[[]] ! \_Row[[12[INTEGER] AS a#115]]
\_Row[[12[INTEGER] AS a#115]] !
And there is no ProjectOperator creating an empty Page for which AggregateOperator not to know what to do with it.
I am not saying the fix in this PR is not necessary @luigidellaquila (you know better than me the Page with no rows story), but the fact that the code reached a point with inline stats that it wouldn't reach with regular stats made me wonder. I have opened a Draft PR to explore an idea that I didn't explore so far because there were no concrete reasons to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the details and for the in-depth analysis @astefan!
As long as we have zero-blocks pages, I think it makes sense to have this PR merged, but I agree with you that it would be better to be consistent between STATS and INLINE STATS.
…o_columns' into esql/fix_inline_stats_no_columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Fixing queries where INLINE STATS count(*) is calculated on results with no columns, eg.
Fixes: #136851