
CA DRA: correctly handle Node readiness after scale-up #7780


Open
towca opened this issue Jan 29, 2025 · 4 comments · May be fixed by #8109 or #8082
Labels
area/cluster-autoscaler · area/core-autoscaler · wg/device-management

Comments

@towca
Collaborator

towca commented Jan 29, 2025

Which component are you using?:

/area cluster-autoscaler
/area core-autoscaler
/wg device-management

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

Nodes with custom resources exposed by device plugins (e.g. GPUs) report the Ready condition before they actually expose those resources. Cluster Autoscaler has to hack such Nodes to appear not-Ready until the resources are exposed; otherwise the unschedulable pods don't get packed onto them in filter_out_schedulable and CA triggers another, unnecessary scale-up.

The same happens for DRA resources: until the driver for a given Node publishes its ResourceSlices, the Node is considered Ready but Pods can't schedule on it, so CA triggers another unnecessary scale-up.
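To make the analogy concrete, here's a minimal sketch of the kind of readiness override the GPU hack performs. This is not the actual CA code; the helper name, the expectsGPU flag, and the example resource name are all assumptions for illustration.

```go
// Illustrative sketch only, not the actual Cluster Autoscaler code. A Node that
// reports condition Ready but does not yet expose an expected extended resource
// is treated as not ready, so that filter_out_schedulable doesn't pack pending
// pods onto it and trigger a duplicate scale-up.
package main

import (
	apiv1 "k8s.io/api/core/v1"
)

// exampleGPUResource is an example extended resource name; real code would
// derive the expected resources from the node group's template node.
const exampleGPUResource = apiv1.ResourceName("nvidia.com/gpu")

// shouldOverrideReadiness returns true when a Node that is expected to expose
// the GPU resource reports Ready but hasn't exposed the resource yet.
func shouldOverrideReadiness(node *apiv1.Node, expectsGPU bool) bool {
	if !expectsGPU {
		return false
	}
	ready := false
	for _, cond := range node.Status.Conditions {
		if cond.Type == apiv1.NodeReady && cond.Status == apiv1.ConditionTrue {
			ready = true
		}
	}
	if !ready {
		return false // already not Ready, nothing to override
	}
	allocatable, found := node.Status.Allocatable[exampleGPUResource]
	return !found || allocatable.IsZero()
}
```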

Describe the solution you'd like.:

We could redo the current GPU hack and treat Nodes that should have ResourceSlices exposed, but don't, as not Ready. We can detect whether a given Node should have ResourceSlices exposed by comparing it with the template node for its node group.
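A rough sketch of what this option could look like, assuming the resource.k8s.io/v1beta1 DRA API and a hypothetical helper that receives the template node's ResourceSlices alongside the real Node's; none of this is existing CA code.

```go
// Hypothetical sketch: treat a Node as not ready while a DRA driver that the
// node group's template expects hasn't published its ResourceSlices yet.
package main

import (
	apiv1 "k8s.io/api/core/v1"
	resourceapi "k8s.io/api/resource/v1beta1"
)

// resourceSlicesMissing compares the drivers that publish ResourceSlices for
// the node group's template node against the drivers that have already
// published slices for the real Node.
func resourceSlicesMissing(node *apiv1.Node, templateSlices, nodeSlices []*resourceapi.ResourceSlice) bool {
	expected := map[string]bool{}
	for _, s := range templateSlices {
		expected[s.Spec.Driver] = true
	}
	if len(expected) == 0 {
		return false // the node group doesn't use DRA, nothing to wait for
	}
	published := map[string]bool{}
	for _, s := range nodeSlices {
		if s.Spec.NodeName == node.Name {
			published[s.Spec.Driver] = true
		}
	}
	for driver := range expected {
		if !published[driver] {
			return true // an expected driver hasn't published its slices yet
		}
	}
	return false
}
```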

Alternatively, maybe we could add a new Condition to the Node, specifying whether ResourceSlices have been exposed already? Then CA could just look at the condition instead of correlating with the template node. This seems like a much cleaner solution, but it requires changes to core Kubernetes objects, so it's not clear how feasible it is.
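For illustration, the condition type below ("ResourceSlicesPublished") is purely made up; no such condition exists in core Kubernetes today.

```go
// Hypothetical: if core Kubernetes reported a dedicated Node condition once a
// node's DRA drivers had published their ResourceSlices, CA could check it
// directly instead of correlating with the template node.
package main

import apiv1 "k8s.io/api/core/v1"

// nodeResourceSlicesPublished is a made-up condition type, used only to
// illustrate the idea.
const nodeResourceSlicesPublished apiv1.NodeConditionType = "ResourceSlicesPublished"

func resourceSlicesPublished(node *apiv1.Node) bool {
	for _, cond := range node.Status.Conditions {
		if cond.Type == nodeResourceSlicesPublished {
			return cond.Status == apiv1.ConditionTrue
		}
	}
	// Condition not reported: conservatively assume slices aren't published yet.
	return false
}
```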

Additional context.:

This is a part of Dynamic Resource Allocation (DRA) support in Cluster Autoscaler. An MVP of the support was implemented in #7530 (with the whole implementation tracked in kubernetes/kubernetes#118612). There are a number of post-MVP follow-ups to be addressed before DRA autoscaling is ready for production use - this is one of them.

@k8s-ci-robot added the area/cluster-autoscaler, area/core-autoscaler, and wg/device-management labels on Jan 29, 2025
@johnbelamaric
Member

This same "ready before it's actually ready" thing is really the root cause of kubernetes/kubernetes#129310. We worked around it by tweaking the semantics of All because it made sense. But some sort of additional Node condition (or Node equivalent to Pod readiness gates) would probably be a better solution. I'm surprised we don't have something like this already. cc @SergeyKanzhelev

@towca
Collaborator Author

towca commented Apr 11, 2025

/assign @abdelrahman882

@k8s-ci-robot
Contributor

@towca: GitHub didn't allow me to assign the following users: abdelrahman882.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @abdelrahman882

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@jackfrancis
Contributor

@towca long-term solution for this will probably be here:

I've added a few brief notes to ensure we consider the DRA use case; let's review this for CA generally and make sure it solves the general use case for us as well.
