I’m exploring approaches to introduce cluster-level admission control in Kyuubi for multi-tenant workloads (BI + batch), and wanted to validate the approach and learn from existing production practices.
Background
As far as I understand, Kyuubi currently acts as a SQL gateway that forwards queries to engines immediately. While it provides strong multi-tenancy and engine isolation, it does not seem to have:
global concurrency limits
query admission control
built-in queuing mechanism
In bursty workloads (e.g., Tableau dashboards or Airflow jobs), this can lead to:
driver/executor overload
degraded latency for all queries
lack of fairness across tenants
Current Approach
We are experimenting with admission control at the operation layer:
Intercept at KyuubiOperationManager.newExecuteStatementOperation()
Wrap ExecuteStatement
Acquire a global token (via an external store such as Redis) before execution
Release the token in afterRun()
This allows enforcing a cluster-wide concurrency limit before queries reach the engine.
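The acquire/run/release flow above can be sketched in-process as follows. This is a minimal sketch under stated assumptions, not Kyuubi code: `AdmissionGate`, `runAdmitted`, and `availableTokens` are hypothetical names, and a local `java.util.concurrent.Semaphore` stands in for the Redis-backed token store, so the limit here applies to one JVM rather than the whole cluster. The `finally` block plays the role of the release in afterRun().

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical fail-fast admission gate; not part of Kyuubi.
// A local Semaphore stands in for the external (e.g. Redis) token store.
final class AdmissionGate {
    private final Semaphore tokens;

    AdmissionGate(int maxConcurrent) {
        this.tokens = new Semaphore(maxConcurrent);
    }

    // Runs the statement only if a token is free; otherwise rejects immediately.
    <T> T runAdmitted(Supplier<T> statement) {
        if (!tokens.tryAcquire()) {
            throw new IllegalStateException("admission limit reached");
        }
        try {
            return statement.get();
        } finally {
            // Released on success and on failure, like a release in afterRun().
            tokens.release();
        }
    }

    // Number of tokens currently free.
    int availableTokens() {
        return tokens.availablePermits();
    }
}
```

With a real external store, the token would also need an expiry or lease so that a crashed server instance cannot hold it forever; that part is not shown here.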
Open Questions
How are others handling admission control today?
Spark scheduler (FAIR pools / queues)?
Engine isolation (per user/group)?
External gateway or orchestration layer?
Any custom Kyuubi extensions or patches?
Query behavior under saturation:
Our current implementation is fail-fast (the query is rejected when the limit is reached)
However, for BI workloads, queueing (blocking execute) seems more appropriate
Has anyone implemented query queuing that preserves JDBC semantics, so that clients wait instead of failing?
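For comparison with the fail-fast behavior, here is a minimal sketch of the queueing variant: the caller blocks until a token is free or a maximum wait elapses. `QueueingGate` and `runQueued` are hypothetical names, and the local fair `Semaphore` is again only a stand-in for a shared token store. Whether blocking at this point is acceptable for a given JDBC client depends on its timeouts, which this sketch does not address.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical queueing gate: callers wait for a token instead of failing at once.
final class QueueingGate {
    private final Semaphore tokens;
    private final long maxWaitMillis;

    QueueingGate(int maxConcurrent, long maxWaitMillis) {
        // fair = true: waiting callers receive tokens in arrival (FIFO) order.
        this.tokens = new Semaphore(maxConcurrent, true);
        this.maxWaitMillis = maxWaitMillis;
    }

    // Blocks up to maxWaitMillis for a token, then runs the statement.
    <T> T runQueued(Supplier<T> statement) {
        boolean admitted;
        try {
            admitted = tokens.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while queued", e);
        }
        if (!admitted) {
            throw new IllegalStateException("queued longer than " + maxWaitMillis + " ms");
        }
        try {
            return statement.get();
        } finally {
            tokens.release();
        }
    }
}
```

The bounded wait keeps a saturated cluster from holding client connections open indefinitely; a very small `maxWaitMillis` makes the gate behave almost like the fail-fast version.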
Extension points:
Is there any plan or discussion around making OperationManager pluggable via config?
Or introducing a pre-execution admission hook in Kyuubi?
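To make the hook question concrete, this is roughly the shape of interface we have in mind. Everything here is an assumption for discussion: `AdmissionHook`, `tryAdmit`, `release`, and `PerUserLimitHook` are hypothetical names, not existing Kyuubi APIs, and how such a hook would be loaded from config is left open. The example implementation applies a per-user cap as one simple way to get fairness across tenants.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical pre-execution hook; not an existing Kyuubi interface.
interface AdmissionHook {
    // Returns true if the statement may start executing now.
    boolean tryAdmit(String user, String statement);

    // Must be called once for every admitted statement when it finishes.
    void release(String user);
}

// Example implementation: at most maxPerUser running statements per user.
final class PerUserLimitHook implements AdmissionHook {
    private final int maxPerUser;
    private final Map<String, Semaphore> perUser = new ConcurrentHashMap<>();

    PerUserLimitHook(int maxPerUser) {
        this.maxPerUser = maxPerUser;
    }

    @Override
    public boolean tryAdmit(String user, String statement) {
        // Lazily create one semaphore per user, then try to take a token.
        return perUser
            .computeIfAbsent(user, u -> new Semaphore(maxPerUser))
            .tryAcquire();
    }

    @Override
    public void release(String user) {
        Semaphore s = perUser.get(user);
        if (s != null) {
            s.release();
        }
    }
}
```

A caller would invoke `tryAdmit` before the operation starts and `release` from the same place the operation reports completion, whether it succeeded or failed.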
Goal
Ultimately, we are trying to move towards more warehouse-like behavior:
controlled concurrency
fairness across workloads
stable performance under load
Would appreciate any insights, prior art, or guidance from folks running Kyuubi at scale.
Thanks!