Conversation

jainankitk
Contributor

Description

This code change introduces AbstractQueryProfilerBreakdown, which can be extended by ConcurrentQueryProfilerBreakdown to show query profiling information for concurrent search executions.

Issue

Relates to #14375

@jpountz
Contributor

jpountz commented Mar 27, 2025

Can you explain why we need two impls? I would have assumed that the ConcurrentQueryProfilerBreakdown could also be used for searches that are not concurrent?

@jainankitk
Contributor Author

jainankitk commented Mar 27, 2025

Can you explain why we need two impls? I would have assumed that the ConcurrentQueryProfilerBreakdown could also be used for searches that are not concurrent?

ConcurrentQueryProfilerBreakdown maintains a separate instance of QueryProfilerBreakdown for each segment. In the context method, ConcurrentQueryProfilerBreakdown returns the corresponding QueryProfilerBreakdown object for each segment, unlike DefaultQueryProfilerBreakdown, which shares the same object across segments:

  @Override
  public QueryProfilerBreakdown context(LeafReaderContext context) {
    return this;
  }

Hence, I felt we don't need to take the overhead of creating a breakdown per segment and then merging them together for the response. That being said, we can eventually keep just ConcurrentQueryProfilerBreakdown if we prefer that for simplicity.
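The contrast between the two implementations can be illustrated with a minimal sketch (class and member names below are illustrative stand-ins, not the PR's actual code; the real breakdown type is modeled by a long[]): the concurrent variant hands out a distinct breakdown per segment via computeIfAbsent, instead of returning `this`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the per-segment behavior described above.
class ConcurrentBreakdownSketch {
  // Keyed by the segment's LeafReaderContext (modeled as Object here).
  private final Map<Object, long[]> perSegment = new ConcurrentHashMap<>();

  long[] context(Object leafReaderContext) {
    // computeIfAbsent is atomic, so parallel slices get a stable mapping
    // from segment to its own breakdown, with no shared mutable timings.
    return perSegment.computeIfAbsent(leafReaderContext, ctx -> new long[16]);
  }
}
```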

@msfroh
Contributor

msfroh commented Mar 27, 2025

Does it make sense to create a separate QueryProfilerBreakDown per leaf? Or should it create one per slice?

Can this be implemented as part of the ProfilerCollectorManager#newCollector logic? Maybe not, since we would also need to profile the work done by the Weight + Scorer on each slice.

@jpountz
Contributor

jpountz commented Mar 27, 2025

@jainankitk OK. In my opinion, it's more important to handle the concurrent and non-concurrent cases consistently than to save some overhead when searches are not concurrent. I'd really like non-concurrent search to look and feel like a concurrent search with a single slice running on a SameThreadExecutorService as far as profiling is concerned. So I wouldn't specialize the class hierarchy for concurrent vs. non-concurrent.

@jainankitk
Contributor Author

Does it make sense to create a separate QueryProfilerBreakDown per leaf? Or should it create one per slice?

Actually, creating one per slice makes a lot of sense.

Can this be implemented as part of the ProfilerCollectorManager#newCollector logic? Maybe not, since we would also need to profile the work done by the Weight + Scorer on each slice

We can always use the thread ID to correctly track the mapping from slice to QueryProfilerBreakdown object.
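The thread-ID mapping could look roughly like this (a sketch with assumed names; the breakdown type is again stood in by a long[]): since Lucene executes each slice on a single thread, keying by the current thread approximates per-slice tracking without Lucene exposing a slice id.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one breakdown per executing thread (i.e. per slice).
class ThreadKeyedBreakdowns {
  private final Map<Long, long[]> byThread = new ConcurrentHashMap<>();

  long[] forCurrentThread() {
    // All calls from the same slice run on the same thread, so they
    // consistently resolve to the same breakdown.
    long id = Thread.currentThread().getId();
    return byThread.computeIfAbsent(id, key -> new long[16]);
  }
}
```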

@jainankitk
Contributor Author

In my opinion, it's more important to handle the concurrent and non-concurrent cases consistently than to save some overhead when searches are not concurrent. I'd really like non-concurrent search to look and feel like a concurrent search with a single slice running on a SameThreadExecutorService as far as profiling is concerned.

Let me try and see if we can maintain per slice QueryProfilerBreakdown object. With that, both concurrent and non-concurrent paths would be consistent as well as efficient.

@jainankitk
Contributor Author

One of the failing checks is:

----------
1. ERROR in /home/runner/work/lucene/lucene/lucene/sandbox/src/java/org/apache/lucene/sandbox/search/QueryProfilerBreakdown.java (at line 63)
	currentThreadId, ctx -> new QuerySliceProfilerBreakdown());
	                 ^^^
The value of the lambda parameter ctx is not used
----------
1 problem (1 error)

> Task :lucene:sandbox:ecjLintMain FAILED

I am wondering if there is a workaround for this? One option is to use putIfAbsent, which doesn't require a function as input, but then we need an explicit get before returning.
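The putIfAbsent option mentioned above would look something like this (names are illustrative; the real value type QuerySliceProfilerBreakdown is stood in by Object):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the lint workaround: putIfAbsent takes a value, not a function,
// so there is no lambda parameter to trip the "unused parameter" check.
class BreakdownRegistry {
  private final Map<Long, Object> breakdowns = new ConcurrentHashMap<>();

  Object forThread(long threadId) {
    // Trade-offs vs. computeIfAbsent: the value is allocated eagerly even
    // when the key already exists, and a second lookup is needed to return it.
    breakdowns.putIfAbsent(threadId, new Object());
    return breakdowns.get(threadId);
  }
}
```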

@jpountz
Contributor

jpountz commented Mar 28, 2025

You just need to replace ctx with _.

@jainankitk
Contributor Author

You just need to replace ctx with _.

Ah, my bad! I tried ., but we can't use that as part of a variable name. Thanks for the suggestion @jpountz.

At a high level, I have unified the concurrent/non-concurrent profiling paths as suggested. The QueryProfilerTree is shared across slices, and we recursively build the ProfilerTree for each slice for the response. There are a few kinks that we still need to iron out. For example:

  • Weight creation is global across slices. How do we account for its time? Should we have a separate global tree with just the weight times? We can't just get away with having the weight count at the top, as the Weight is shared for child queries as well, right?
  • The new in-memory structure for profiled queries is a bit like below (notice the additional list for slices):
"query": [            <-- for list of slices
  [                   <-- for list of root queries
    {
      "type": "TermQuery",
      "description": "foo:bar",
      "time_in_nanos": 11972972,
      "breakdown": {...}
    }
  ]
]
We can probably have a map of slices, with the key being the sliceId:

"query": {
  "some global information": ...,
  "slices": {
    "slice1": [       <-- for list of root queries
      {
        "type": "TermQuery",
        "description": "foo:bar",
        "time_in_nanos": 11972972,
        "breakdown": {...}
      }
    ],
    "slice2": [],
    "slice3": []
  }
}

@jainankitk jainankitk changed the title Preparing existing profiler for adding concurrent profiling Adding profiling support for concurrent segment search Mar 31, 2025
@jainankitk
Contributor Author

@jpountz - Can you provide your thoughts on above?

@jpountz
Contributor

jpountz commented Apr 1, 2025

I'd have a top-level tree for everything related to initializing the search and combining results (rewrite(), createWeight(), CollectorManager#reduce) and then a list of trees for each slice. Related, it'd be nice if each per-slice object could also tell us about the thread that it ran in and its start time so that we could understand how exactly Lucene managed to parallelize the search.

@jainankitk
Contributor Author

I'd have a top-level tree for everything related to initializing the search and combining results (rewrite(), createWeight(), CollectorManager#reduce) and then a list of trees for each slice.

While working on the code, I realized it is better to have a list of slices within the tree itself at each level, instead of repeating the query structure and information across multiple trees. In this approach, we can easily view the tree for a specific sliceId using jq or a simple script. The structure looks like below:

"query": [                    <-- for list of root queries
  {
    "type": "TermQuery",
    "description": "foo:bar",
    "startTime": 11972972,
    "totalTime": 354343,
    "breakdown": {...},       <-- query level breakdown like weight count and time
    "sliceBreakdowns": [
      {...},                  <-- first slice information
      {...}                   <-- second slice information
    ],
    "queryChildren": [
      {...},                  <-- recursive repetition of above structure
      {...}
    ]
  }
]

Related, it'd be nice if each per-slice object could also tell us about the thread that it ran in and its start time so that we could understand how exactly Lucene managed to parallelize the search.

Yes, that would be really useful. I have included the threadId as the sliceId, along with startTime and totalTime information for each slice object at every query level.

@jainankitk
Contributor Author

@jpountz - The code changes are ready for review. For now, I have made changes to accommodate all the timers in QueryProfilerTimingType.

While this does not cover rewrite() and CollectorManager#reduce called out earlier, as they are not part of QueryProfilerTimingType, it does lay the groundwork using createWeight(). We can take those changes in a follow-up PR.

@github-actions github-actions bot added the Stale label May 14, 2025
@jainankitk
Contributor Author

I submitted a talk on this topic (Profiling Concurrent Search in Lucene: A Deep Dive into Parallel Execution) for the ASF conference (https://communityovercode.org/schedule/) and it was selected. Would love to iterate and get this PR merged before that!

@github-actions github-actions bot removed the Stale label May 28, 2025
Contributor

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Jun 11, 2025
Contributor

@jpountz jpountz left a comment


Sorry, I had lost track of this PR. I think that my only concern left is how it computes end times by adding the approximate timing to the start time. I'd rather not report it since this profiler doesn't actually compute an end time, but I'd be fine with reporting the sum of approximate timings if we think that it helps.

}

/** Retrieve the lucene description of this query (e.g. the "explain" text) */
public long getId() {
Contributor


There seems to be a mismatch between javadocs and the method signature.

Contributor Author


I seem to have copied these getters from some existing code and missed editing the javadocs. Let me fix that!


/** The timing breakdown for this node. */
public Map<String, Long> getTimeBreakdown() {
return Collections.unmodifiableMap(breakdown);
Contributor


You could do the wrapping only once in the constructor?

Contributor Author


Oh yeah, good point! Since the object is immutable, we can do the wrapping once in the constructor.
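A minimal sketch of the suggested change (class and field names here are illustrative, not the PR's actual code): wrap the map once in the constructor, so the getter returns the already-unmodifiable view without re-wrapping on every call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Immutable node timings: wrap once at construction instead of per call.
final class NodeTimings {
  private final Map<String, Long> breakdown;

  NodeTimings(Map<String, Long> breakdown) {
    // Defensive copy + single unmodifiable wrapper.
    this.breakdown = Collections.unmodifiableMap(new HashMap<>(breakdown));
  }

  // Returns the same immutable view every time; no allocation per call.
  public Map<String, Long> getTimeBreakdown() {
    return breakdown;
  }
}
```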

this.leafLevel = leafLevel;
}

public boolean isLeafLevel() {
Contributor


Add some javadocs? I believe that this means that this operation runs on a LeafReader as opposed to the top-level IndexReader?

Contributor Author


Yes, your understanding is correct. Added the javadocs!


import java.util.List;

interface QueryLeafProfilerAggregator {
Contributor


I wonder if we actually need this interface since it seems to have a single implementation?

Contributor Author

@jainankitk jainankitk Jul 25, 2025


I was planning to introduce other implementations later, but I guess we can add this interface at that point. Removing for now!

Arrays.stream(QueryProfilerTimingType.values()).filter(t -> t.isLeafLevel()).toList();

/** The accumulated timings for this query node */
private final QueryProfilerTimer[] timers;
Contributor


What about using an EnumMap, which will be implemented by an array like this one under the hood?

Contributor Author


Although minor, EnumMap will initialize the timers for all the possible keys (vals = new Object[keyUniverse.length];). It seemed a bit unnecessary to me, but we can change it if using EnumMap improves the readability.
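The two layouts under discussion can be sketched side by side (TimingType is a hypothetical stand-in for QueryProfilerTimingType): an array indexed by enum ordinal, and an EnumMap, which internally allocates one slot per enum constant either way.

```java
import java.util.EnumMap;

// Illustrative subset of timing types.
enum TimingType { CREATE_WEIGHT, BUILD_SCORER, NEXT_DOC }

class TimerStore {
  // Array indexed by ordinal: the layout used in the PR.
  private final long[] byOrdinal = new long[TimingType.values().length];
  // EnumMap alternative: same array-backed storage under the hood.
  private final EnumMap<TimingType, Long> byMap = new EnumMap<>(TimingType.class);

  void add(TimingType type, long nanos) {
    byOrdinal[type.ordinal()] += nanos;
    byMap.merge(type, nanos, Long::sum);
  }

  long get(TimingType type) {
    return byOrdinal[type.ordinal()];
  }
}
```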

sliceStartTime = Math.min(sliceStartTime, timer.getEarliestTimerStartTime());
sliceEndTime =
Math.max(
sliceEndTime, timer.getEarliestTimerStartTime() + timer.getApproximateTiming());
Contributor


It doesn't feel right to compute the end time as the start time plus the approximate timing, since operations will often be interleaved. What about reporting the sum of the approximate timings across operation types instead, i.e. the value of toTotalTime()?

Contributor Author


Good catch! That was an oversight on my side. Will change this to toTotalTime(), which should approximately reflect the CPU time consumed processing this query.
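The agreed fix amounts to a simple aggregation (sketch only; the real toTotalTime() presumably iterates the profiler's timers rather than a bare array): sum the approximate timings across operation types instead of deriving an end time from start + duration, which is misleading when operations interleave.

```java
// Sketch of summing approximate timings across operation types.
class SliceTotals {
  static long toTotalTime(long[] approximateTimings) {
    long total = 0;
    for (long nanos : approximateTimings) {
      total += nanos;
    }
    return total;
  }
}
```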

SLICE,
// Aggregate leaf level breakdowns based on thread execution
THREAD
}
Contributor


Is this actually used? I think that from previous discussions we agreed that SLICE wouldn't work since Lucene doesn't tell us what slice it's in. And I don't see LEAF being used in this PR, only THREAD?

Contributor Author


Yeah, only THREAD is being used for now. I was planning to introduce other implementations like LEAF later, but I guess we can add this enum at that point.


@github-actions github-actions bot added the Stale label Jun 28, 2025
@jainankitk
Contributor Author

Sorry, I had lost track of this PR. I think that my only concern left is how it computes end times by adding the approximate timing to the start time. I'd rather not report it since this profiler doesn't actually compute an end time, but I'd be fine with reporting the sum of approximate timings if we think that it helps.

No worries @jpountz, thanks for the review. Even I was away for a bit, will try to address the comments and push new revision this week.

@github-actions github-actions bot removed the Stale label Jul 22, 2025
Signed-off-by: Ankit Jain <[email protected]>
Contributor

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR.

Signed-off-by: Ankit Jain <[email protected]>

Signed-off-by: Ankit Jain <[email protected]>
@github-actions github-actions bot added this to the 10.3.0 milestone Jul 28, 2025
@jainankitk
Contributor Author

@jpountz - I have addressed all the comments from earlier review. Are you able to take another look, to help close this out?


@github-actions github-actions bot added the Stale label Aug 20, 2025
@jainankitk
Contributor Author

Thanks all for reviewing this PR. Planning to merge this PR by tomorrow, if there is no new feedback. Again, thanks for helping improve this change with your inputs!

@github-actions github-actions bot removed the Stale label Aug 26, 2025
@jainankitk jainankitk merged commit f30c7b6 into apache:main Aug 27, 2025
8 checks passed
@jainankitk jainankitk deleted the profiling-changes branch August 27, 2025 16:14
jainankitk added a commit that referenced this pull request Aug 27, 2025
@dungba88
Contributor

dungba88 commented Oct 14, 2025

Hi @jainankitk, is QueryProfilerResult no longer JSON (de)serializable? The reason is the introduction of AggregatedQueryLeafProfilerResult, which contains a Thread object. I don't see the Thread object being used anywhere; can it be removed?

@msokolov
Contributor

or if we do need to maintain the Thread association, can we store the Thread.id (a long) instead of the Thread?

@jainankitk
Contributor Author

Thanks @dungba88 for trying this out.

I don't see its Thread object is being used anywhere, can it be removed?

The Thread object is used for maintaining the association (as @msokolov called out) during aggregation of segment-level profiler results. I was not exactly sure which fields from Thread might be needed or make sense for serialization, so I kept Thread itself in AggregatedQueryLeafProfilerResult. I am wondering if it makes sense to add @JsonIgnore / transient to not serialize Thread (and avoid exceptions/errors) and instead have a limited view of Thread, say a SerializableThread containing all the properties that we want in the serialization?
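One possible shape for that idea (all names hypothetical; this assumes Java serialization, whereas a JSON library would need its own ignore annotation instead of transient): keep the live Thread transient for in-memory aggregation and expose only serializable fields.

```java
import java.io.Serializable;

// Hypothetical sketch: serializable result that excludes the live Thread.
class AggregatedResultSketch implements Serializable {
  // transient: needed only while aggregating in memory, never serialized.
  private final transient Thread thread;
  private final long threadId;
  private final String threadName;

  AggregatedResultSketch(Thread thread) {
    this.thread = thread;
    this.threadId = thread.getId();
    this.threadName = thread.getName();
  }

  // Live thread for aggregation-time association.
  Thread liveThread() { return thread; }
  // Serializable view of the thread.
  long threadId() { return threadId; }
  String threadName() { return threadName; }
}
```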
