Conversation

jainankitk
Contributor

Description

This code change introduces AbstractQueryProfilerBreakdown, which can be extended by ConcurrentQueryProfilerBreakdown to show query profiling information for concurrent search executions.

Issue

Relates to #14375

@jpountz
Contributor

jpountz commented Mar 27, 2025

Can you explain why we need two impls? I would have assumed that the ConcurrentQueryProfilerBreakdown could also be used for searches that are not concurrent?

@jainankitk
Contributor Author

jainankitk commented Mar 27, 2025

Can you explain why we need two impls? I would have assumed that the ConcurrentQueryProfilerBreakdown could also be used for searches that are not concurrent?

ConcurrentQueryProfilerBreakdown maintains a separate instance of QueryProfilerBreakdown for each segment. In the context method, ConcurrentQueryProfilerBreakdown returns the corresponding QueryProfilerBreakdown object for each segment, unlike DefaultQueryProfilerBreakdown, which shares the same object across segments:

  @Override
  public QueryProfilerBreakdown context(LeafReaderContext context) {
    return this;
  }

Hence, I felt we don't need to take the overhead of creating a breakdown per segment and then merging them together for the response. That being said, we can eventually keep just ConcurrentQueryProfilerBreakdown if we prefer that for simplicity.
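The contrast between the two implementations can be illustrated with a minimal sketch (class and member names below are illustrative stand-ins, not the PR's actual code; the real breakdown type is modeled by a long[]): the concurrent variant hands out a distinct breakdown per segment via computeIfAbsent, instead of returning `this`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the per-segment behavior described above.
class ConcurrentBreakdownSketch {
  // Keyed by the segment's LeafReaderContext (modeled as Object here).
  private final Map<Object, long[]> perSegment = new ConcurrentHashMap<>();

  long[] context(Object leafReaderContext) {
    // computeIfAbsent is atomic, so parallel slices get a stable mapping
    // from segment to its own breakdown, with no shared mutable timings.
    return perSegment.computeIfAbsent(leafReaderContext, ctx -> new long[16]);
  }
}
```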

@msfroh
Contributor

msfroh commented Mar 27, 2025

Does it make sense to create a separate QueryProfilerBreakDown per leaf? Or should it create one per slice?

Can this be implemented as part of the ProfilerCollectorManager#newCollector logic? Maybe not, since we would also need to profile the work done by the Weight + Scorer on each slice.

@jpountz
Contributor

jpountz commented Mar 27, 2025

@jainankitk OK. In my opinion, it's more important to handle the concurrent and non-concurrent cases consistently than to save some overhead when searches are not concurrent. I'd really like non-concurrent search to look and feel like a concurrent search with a single slice running on a SameThreadExecutorService as far as profiling is concerned. So I wouldn't specialize the class hierarchy for concurrent vs. non-concurrent.

@jainankitk
Contributor Author

Does it make sense to create a separate QueryProfilerBreakDown per leaf? Or should it create one per slice?

Actually, creating one per slice makes a lot of sense.

Can this be implemented as part of the ProfilerCollectorManager#newCollector logic? Maybe not, since we would also need to profile the work done by the Weight + Scorer on each slice

We can always use the thread ID to correctly track the mapping from slice to QueryProfilerBreakdown object.
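The thread-ID mapping could look roughly like this (a sketch with assumed names; the breakdown type is again stood in by a long[]): since Lucene executes each slice on a single thread, keying by the current thread approximates per-slice tracking without Lucene exposing a slice id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one breakdown per executing thread (i.e. per slice).
class ThreadKeyedBreakdowns {
  private final Map<Long, long[]> byThread = new ConcurrentHashMap<>();

  long[] forCurrentThread() {
    // All calls from the same slice run on the same thread, so they
    // consistently resolve to the same breakdown.
    long id = Thread.currentThread().getId();
    return byThread.computeIfAbsent(id, key -> new long[16]);
  }
}
```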

@jainankitk
Contributor Author

In my opinion, it's more important to handle the concurrent and non-concurrent cases consistently than to save some overhead when searches are not concurrent. I'd really like non-concurrent search to look and feel like a concurrent search with a single slice running on a SameThreadExecutorService as far as profiling is concerned.

Let me try and see if we can maintain per slice QueryProfilerBreakdown object. With that, both concurrent and non-concurrent paths would be consistent as well as efficient.

@jainankitk
Contributor Author

One of the failing checks is:

----------
1. ERROR in /home/runner/work/lucene/lucene/lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerBreakdown.java (at line 63)
	currentThreadId, ctx -> new QuerySliceProfilerBreakdown());
	                 ^^^
The value of the lambda parameter ctx is not used
----------
1 problem (1 error)

> Task :lucene:sandbox:ecjLintMain FAILED

I am wondering if there is a workaround for this? One option is to use putIfAbsent, which doesn't require a function as input, but then we need an explicit get before returning.
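The putIfAbsent option mentioned above would look something like this (names are illustrative; the real value type QuerySliceProfilerBreakdown is stood in by Object):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the lint workaround: putIfAbsent takes a value, not a function,
// so there is no lambda parameter to trip the "unused parameter" check.
class BreakdownRegistry {
  private final Map<Long, Object> breakdowns = new ConcurrentHashMap<>();

  Object forThread(long threadId) {
    // Trade-offs vs. computeIfAbsent: the value is allocated eagerly even
    // when the key already exists, and a second lookup is needed to return it.
    breakdowns.putIfAbsent(threadId, new Object());
    return breakdowns.get(threadId);
  }
}
```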

@jpountz
Contributor

jpountz commented Mar 28, 2025

You just need to replace ctx with _.

@jainankitk
Contributor Author

You just need to replace ctx with _.

Ah, my bad! I tried ., but we can't use that as part of a variable name. Thanks for the suggestion @jpountz.

At a high level, I have unified the concurrent/non-concurrent profiling paths as suggested. The QueryProfilerTree is shared across slices, and we recursively build the ProfilerTree for each slice for the response. There are a few kinks that we still need to iron out. For example:

  • Weight creation is global across slices. How do we account for its time? Should we have a separate global tree with just the weight times? We can't just get away with having the weight count at the top, as the Weight is shared for child queries as well, right?
  • The new in-memory structure for profiled queries is a bit like below (notice the additional list for slices):
"query": [            <-- for list of slices
  [                   <-- for list of root queries
    {
      "type": "TermQuery",
      "description": "foo:bar",
      "time_in_nanos": 11972972,
      "breakdown": {...}
    }
  ]
]
We can probably have a map of slices, with the key being the sliceId:

"query": {
  "some global information": ...,
  "slices": {
    "slice1": [       <-- for list of root queries
      {
        "type": "TermQuery",
        "description": "foo:bar",
        "time_in_nanos": 11972972,
        "breakdown": {...}
      }
    ],
    "slice2": [],
    "slice3": []
  }
}

@jainankitk jainankitk changed the title Preparing existing profiler for adding concurrent profiling Adding profiling support for concurrent segment search Mar 31, 2025
@jainankitk
Contributor Author

@jpountz - Can you provide your thoughts on above?

@jpountz
Contributor

jpountz commented Apr 1, 2025

I'd have a top-level tree for everything related to initializing the search and combining results (rewrite(), createWeight(), CollectorManager#reduce) and then a list of trees for each slice. Related, it'd be nice if each per-slice object could also tell us about the thread that it ran in and its start time so that we could understand how exactly Lucene managed to parallelize the search.

@jainankitk
Contributor Author

I'd have a top-level tree for everything related to initializing the search and combining results (rewrite(), createWeight(), CollectorManager#reduce) and then a list of trees for each slice.

While working on the code, I realized it is better to have a list of slices within the tree itself at each level, instead of repeating the query structure and information across multiple trees. In this approach, we can easily view the tree for a specific sliceId using jq or a simple script. The structure looks like below:

"query": [                    <-- for list of root queries
  {
    "type": "TermQuery",
    "description": "foo:bar",
    "startTime": 11972972,
    "totalTime": 354343,
    "breakdown": {...},       <-- query level breakdown like weight count and time
    "sliceBreakdowns": [
      {...},                  <-- first slice information
      {...}                   <-- second slice information
    ],
    "queryChildren": [
      {...},                  <-- recursive repetition of above structure
      {...}
    ]
  }
]

Related, it'd be nice if each per-slice object could also tell us about the thread that it ran in and its start time so that we could understand how exactly Lucene managed to parallelize the search.

Yes, that would be really useful. I have included the threadId as the sliceId, along with startTime and totalTime information for each slice object at every query level.

@jainankitk
Contributor Author

@jpountz - The code changes are ready for review. For now, I have made changes to accommodate all the timers in QueryProfilerTimingType.

While this does not cover rewrite() and CollectorManager#reduce called out earlier, as they are not part of QueryProfilerTimingType, it does lay the groundwork using createWeight(). We can take those changes in a follow-up PR.

@github-actions github-actions bot added the Stale label May 14, 2025
@jainankitk
Contributor Author

I submitted a talk on this topic (Profiling Concurrent Search in Lucene: A Deep Dive into Parallel Execution) for the ASF conference (https://communityovercode.org/schedule/) and it was selected. Would love to iterate and get this PR merged before that!

@github-actions github-actions bot removed the Stale label May 28, 2025
Contributor

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Jun 11, 2025
Contributor

@jpountz jpountz left a comment


Sorry, I had lost track of this PR. I think that my only concern left is how it computes end times by adding the approximate timing to the start time. I'd rather not report it since this profiler doesn't actually compute an end time, but I'd be fine with reporting the sum of approximate timings if we think that it helps.

}

/** Retrieve the lucene description of this query (e.g. the "explain" text) */
public long getId() {
Contributor


There seems to be a mismatch between javadocs and the method signature.

Contributor Author


I seem to have copied these getters from some existing code and missed editing the javadocs. Let me fix that!


/** The timing breakdown for this node. */
public Map<String, Long> getTimeBreakdown() {
return Collections.unmodifiableMap(breakdown);
Contributor


You could do the wrapping only once in the constructor?

Contributor Author


Oh yeah, good point! Since the object is immutable, we can do the wrapping once in the constructor.
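A minimal sketch of the suggested change (class and field names here are illustrative, not the PR's actual code): wrap the map once in the constructor, so the getter returns the already-unmodifiable view without re-wrapping on every call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Immutable node timings: wrap once at construction instead of per call.
final class NodeTimings {
  private final Map<String, Long> breakdown;

  NodeTimings(Map<String, Long> breakdown) {
    // Defensive copy + single unmodifiable wrapper.
    this.breakdown = Collections.unmodifiableMap(new HashMap<>(breakdown));
  }

  // Returns the same immutable view every time; no allocation per call.
  public Map<String, Long> getTimeBreakdown() {
    return breakdown;
  }
}
```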

this.leafLevel = leafLevel;
}

public boolean isLeafLevel() {
Contributor


Add some javadocs? I believe that this means that this operation runs on a LeafReader as opposed to the top-level IndexReader?

Contributor Author


Yes, your understanding is correct. Added the javadocs!


import java.util.List;

interface QueryLeafProfilerAggregator {
Contributor


I wonder if we actually need this interface since it seems to have a single implementation?

Contributor Author

@jainankitk jainankitk Jul 25, 2025


I was planning to introduce other implementations later, but I guess we can add this interface at that point. Removing for now!

Arrays.stream(QueryProfilerTimingType.values()).filter(t -> t.isLeafLevel()).toList();

/** The accumulated timings for this query node */
private final QueryProfilerTimer[] timers;
Contributor


What about using an EnumMap, which will be implemented by an array like this one under the hood?

Contributor Author


Although minor, EnumMap will initialize the timers for all the possible keys (vals = new Object[keyUniverse.length];). It seemed a bit unnecessary to me, but we can change it if using EnumMap improves the readability.
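The two layouts under discussion can be sketched side by side (TimingType is a hypothetical stand-in for QueryProfilerTimingType): an array indexed by enum ordinal, and an EnumMap, which internally allocates one slot per enum constant either way.

```java
import java.util.EnumMap;

// Illustrative subset of timing types.
enum TimingType { CREATE_WEIGHT, BUILD_SCORER, NEXT_DOC }

class TimerStore {
  // Array indexed by ordinal: the layout used in the PR.
  private final long[] byOrdinal = new long[TimingType.values().length];
  // EnumMap alternative: same array-backed storage under the hood.
  private final EnumMap<TimingType, Long> byMap = new EnumMap<>(TimingType.class);

  void add(TimingType type, long nanos) {
    byOrdinal[type.ordinal()] += nanos;
    byMap.merge(type, nanos, Long::sum);
  }

  long get(TimingType type) {
    return byOrdinal[type.ordinal()];
  }
}
```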

sliceStartTime = Math.min(sliceStartTime, timer.getEarliestTimerStartTime());
sliceEndTime =
Math.max(
sliceEndTime, timer.getEarliestTimerStartTime() + timer.getApproximateTiming());
Contributor


It doesn't feel right to compute the end time as the start time plus the approximate timing, since operations will often be interleaved. What about reporting the sum of the approximate timings across operation types instead, i.e. the value of toTotalTime()?

Contributor Author


Good catch! That was an oversight on my side. Will change this to toTotalTime(), which should approximately reflect the CPU time consumed processing this query.
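The agreed fix amounts to a simple aggregation (sketch only; the real toTotalTime() presumably iterates the profiler's timers rather than a bare array): sum the approximate timings across operation types instead of deriving an end time from start + duration, which is misleading when operations interleave.

```java
// Sketch of summing approximate timings across operation types.
class SliceTotals {
  static long toTotalTime(long[] approximateTimings) {
    long total = 0;
    for (long nanos : approximateTimings) {
      total += nanos;
    }
    return total;
  }
}
```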

SLICE,
// Aggregate leaf level breakdowns based on thread execution
THREAD
}
Contributor


Is this actually used? I think that from previous discussions we agreed that SLICE wouldn't work since Lucene doesn't tell us what slice it's in. And I don't see LEAF being used in this PR, only THREAD?

Contributor Author


Yeah, only THREAD is being used for now. I was planning to introduce other implementations like LEAF later, but I guess we can add this enum at that point.


@github-actions github-actions bot added the Stale label Jun 28, 2025
@jainankitk
Contributor Author

Sorry, I had lost track of this PR. I think that my only concern left is how it computes end times by adding the approximate timing to the start time. I'd rather not report it since this profiler doesn't actually compute an end time, but I'd be fine with reporting the sum of approximate timings if we think that it helps.

No worries @jpountz, thanks for the review. Even I was away for a bit, will try to address the comments and push new revision this week.

@github-actions github-actions bot removed the Stale label Jul 22, 2025
Signed-off-by: Ankit Jain <[email protected]>
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

Signed-off-by: Ankit Jain <[email protected]>

Signed-off-by: Ankit Jain <[email protected]>
@github-actions github-actions bot added this to the 10.3.0 milestone Jul 28, 2025
@jainankitk
Contributor Author

@jpountz - I have addressed all the comments from earlier review. Are you able to take another look, to help close this out?


@github-actions github-actions bot added the Stale label Aug 20, 2025
@jainankitk
Contributor Author

Thanks all for reviewing this PR. Planning to merge this PR by tomorrow, if there is no new feedback. Again, thanks for helping improve this change with your inputs!

@github-actions github-actions bot removed the Stale label Aug 26, 2025
@jainankitk jainankitk merged commit f30c7b6 into apache:main Aug 27, 2025
8 checks passed
@jainankitk jainankitk deleted the profiling-changes branch August 27, 2025 16:14
jainankitk added a commit that referenced this pull request Aug 27, 2025
@dungba88
Contributor

dungba88 commented Oct 14, 2025

Hi @jainankitk, is QueryProfilerResult no longer JSON (de)serializable? The reason is the introduction of AggregatedQueryLeafProfilerResult, which contains a Thread object. I don't see the Thread object being used anywhere; can it be removed?

@msokolov
Contributor

or if we do need to maintain the Thread association, can we store the Thread.id (a long) instead of the Thread?

@jainankitk
Contributor Author

Thanks @dungba88 for trying this out.

I don't see its Thread object is being used anywhere, can it be removed?

The Thread object is used for maintaining the association (as @msokolov called out) during aggregation of segment-level profiler results. I was not exactly sure which fields from Thread might be needed or make sense for serialization, so I kept Thread itself in AggregatedQueryLeafProfilerResult. I am wondering if it makes sense to add @JsonIgnore / transient to not serialize Thread (and avoid exceptions/errors) and instead have a limited view of Thread, say a SerializableThread containing all the properties that we want in the serialization?
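One possible shape for that idea (all names hypothetical; this assumes Java serialization, whereas a JSON library would need its own ignore annotation instead of transient): keep the live Thread transient for in-memory aggregation and expose only serializable fields.

```java
import java.io.Serializable;

// Hypothetical sketch: serializable result that excludes the live Thread.
class AggregatedResultSketch implements Serializable {
  // transient: needed only while aggregating in memory, never serialized.
  private final transient Thread thread;
  private final long threadId;
  private final String threadName;

  AggregatedResultSketch(Thread thread) {
    this.thread = thread;
    this.threadId = thread.getId();
    this.threadName = thread.getName();
  }

  // Live thread for aggregation-time association.
  Thread liveThread() { return thread; }
  // Serializable view of the thread.
  long threadId() { return threadId; }
  String threadName() { return threadName; }
}
```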
