[L0v2] fix unbounded memory growth of queue's submitted kernels #20847

pbalcer · 2025-12-05T21:39:14Z

L0v2 avoids internally tracking each kernel submission through an event for lifetime management. Instead, when a kernel is submitted to the queue, its handle is added to a vector, to be removed at the next queue synchronization point, urQueueFinish(). This is a much more efficient way of handling kernel tracking, since it avoids taking and storing an event. However, if the application never synchronizes the queue, this vector of submitted kernels will grow unbounded.

This patch forcibly synchronizes the queue once the submitted kernels vector reaches a threshold.
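For readers who want to see the shape of the change, here is a minimal sketch of the threshold check, not the adapter code itself. `ToyQueue`, `KernelHandle`, `recordSubmittedKernel` and `forcedSyncs` are hypothetical names made up for illustration, and the threshold value is deliberately tiny; only `submittedKernels`, `MAX_QUEUE_SUBMITTED_KERNELS` and the `size() > threshold` comparison mirror the diff.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ur_kernel_handle_t (assumption, illustration only).
using KernelHandle = int;

// Illustrative value only; the patch's actual constant is not shown here.
constexpr std::size_t MAX_QUEUE_SUBMITTED_KERNELS = 4;

struct ToyQueue {
  // Kernels submitted since the last synchronization point.
  std::vector<KernelHandle> submittedKernels;
  // Counts how often the threshold forced a synchronization.
  std::size_t forcedSyncs = 0;

  // Stand-in for queue synchronization: a real queue would wait for the
  // device to finish and then release every tracked kernel.
  void synchronize() { submittedKernels.clear(); }

  // Track a kernel submission, synchronizing first if the vector is too big.
  void recordSubmittedKernel(KernelHandle hKernel) {
    if (submittedKernels.size() > MAX_QUEUE_SUBMITTED_KERNELS) {
      synchronize();
      ++forcedSyncs;
    }
    submittedKernels.push_back(hKernel);
  }
};
```

With this check in place the vector never holds more than `MAX_QUEUE_SUBMITTED_KERNELS + 1` entries, at the cost of a blocking synchronization each time the threshold is crossed.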

pbalcer · 2025-12-05T21:39:58Z

I've created a PR against 6.2, since this functionality changed a little in the latest version, and will require a different patch.

nrspruit

Looks good to me!

MichalMrozek · 2025-12-08T04:17:44Z

unified-runtime/source/adapters/level_zero/v2/queue_immediate_in_order.cpp

-    ur_kernel_handle_t hKernel) {
+    locked<ur_command_list_manager> &commandList, ur_kernel_handle_t hKernel) {
+  if (submittedKernels.size() > MAX_QUEUE_SUBMITTED_KERNELS) {
+    synchronize(commandList);


It is very unlikely that we will have 1000 unique kernels; most likely we have duplicates here.
Right now this will always synchronize every 1k submits, which is undesirable.

Can you compact the vector by finding duplicates and releasing those?
You only need one kernel instance in the container to keep the object alive; any additional ones are not needed.

I would also consider not adding a kernel to the vector if it is already there.
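A rough sketch of both suggestions, assuming each vector entry holds one retained reference to the kernel. `KernelHandle`, `compactSubmittedKernels` and `trackIfNew` are hypothetical names for illustration; a real implementation would call the adapter's release routine where the comments indicate.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for ur_kernel_handle_t (assumption, illustration only).
using KernelHandle = int;

// Option 1: compact the vector in place, keeping the first occurrence of each
// kernel. Returns how many duplicate entries were dropped.
std::size_t compactSubmittedKernels(std::vector<KernelHandle> &kernels) {
  std::unordered_set<KernelHandle> seen;
  std::size_t released = 0;
  auto keep = kernels.begin();
  for (KernelHandle h : kernels) {
    if (seen.insert(h).second) {
      *keep++ = h; // first time we see this kernel: keep one entry
    } else {
      ++released; // duplicate: a real adapter would release its reference here
    }
  }
  kernels.erase(keep, kernels.end());
  return released;
}

// Option 2: skip the insertion when the kernel is already tracked.
// Returns true if the kernel was added.
bool trackIfNew(std::vector<KernelHandle> &kernels, KernelHandle hKernel) {
  if (std::find(kernels.begin(), kernels.end(), hKernel) != kernels.end()) {
    return false; // already tracked, no extra reference needed
  }
  kernels.push_back(hKernel);
  return true;
}
```

Option 2 pays for a linear scan on every submission, which stays cheap only while the number of distinct kernels is small. Option 1 defers that cost to the moment the threshold is hit, and a forced synchronization would then be needed only if the compacted vector is still above the threshold.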

pbalcer requested a review from a team as a code owner December 5, 2025 21:39

pbalcer temporarily deployed to WindowsCILock December 5, 2025 21:39 — with GitHub Actions Inactive

nrspruit approved these changes Dec 5, 2025

pbalcer had a problem deploying to WindowsCILock December 5, 2025 22:00 — with GitHub Actions Failure

pbalcer force-pushed the cleanup-large-submitted-kernels branch from 753bc9e to f99e1cd Compare December 5, 2025 22:09

pbalcer temporarily deployed to WindowsCILock December 5, 2025 22:09 — with GitHub Actions Inactive

pbalcer had a problem deploying to WindowsCILock December 5, 2025 22:30 — with GitHub Actions Failure

pbalcer had a problem deploying to WindowsCILock December 5, 2025 22:39 — with GitHub Actions Failure

pbalcer marked this pull request as draft December 5, 2025 23:59

pbalcer force-pushed the cleanup-large-submitted-kernels branch from f99e1cd to 57aef87 Compare December 6, 2025 09:15

pbalcer temporarily deployed to WindowsCILock December 6, 2025 09:16 — with GitHub Actions Inactive

pbalcer had a problem deploying to WindowsCILock December 6, 2025 09:36 — with GitHub Actions Failure

pbalcer force-pushed the cleanup-large-submitted-kernels branch from 57aef87 to 156235d Compare December 6, 2025 09:39

pbalcer temporarily deployed to WindowsCILock December 6, 2025 09:40 — with GitHub Actions Inactive

pbalcer had a problem deploying to WindowsCILock December 6, 2025 09:50 — with GitHub Actions Failure

nrspruit approved these changes Dec 6, 2025

MichalMrozek reviewed Dec 8, 2025
