
KEP-4816 update for beta in 1.34 #5261


Open
wants to merge 1 commit into
base: master

@mortent mortent commented Apr 27, 2025

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 27, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mortent
Once this PR has been reviewed and has the lgtm label, please assign sanposhiho, soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Apr 27, 2025
@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Apr 27, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Scheduling Apr 27, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 27, 2025

Scheduling a claim that uses this feature may take a bit longer, if it is
necessary to go deeper into the list of alternative options before finding a
suitable device. We can measure this impact in alpha.
Member Author

I'm not sure we can easily measure this, as it largely depends on the structure of the ResourceClaim.

But in general, each subrequest requires the same amount of work as a regular request, so a request with two subrequests will in the worst case take about twice as long as a single request. The maximum number of subrequests per request is 8, so in the worst case, where none of the eight subrequests succeed, it would take about 8 times longer than a normal request.

Since the subrequests are tried in priority order, the extra work is only needed when the earlier subrequests cannot be satisfied, that is, in situations where a single request would have failed to allocate devices.
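
To make the cost argument concrete, here is a minimal Go sketch of trying subrequests in priority order and counting the attempts. It is not the scheduler's allocator: the `subrequest` type, the `fits` flag, and `allocateFirstAvailable` are hypothetical names used only for illustration, and each attempt is assumed to cost about as much as one regular request.

```go
package main

import "fmt"

// maxSubrequests mirrors the limit of 8 subrequests per request
// mentioned in the comment above.
const maxSubrequests = 8

// subrequest is a hypothetical stand-in for one entry in a request's
// prioritized list. fits says whether devices could be allocated for it.
type subrequest struct {
	name string
	fits bool
}

// allocateFirstAvailable walks the subrequests in priority order and
// stops at the first one that fits. attempts counts how many were tried,
// which is the unit of work in this simplified model.
func allocateFirstAvailable(subs []subrequest) (chosen string, attempts int, ok bool) {
	for _, s := range subs {
		attempts++
		if s.fits {
			return s.name, attempts, true
		}
	}
	return "", attempts, false
}

func main() {
	// First choice fits: same cost as a regular request.
	_, n, _ := allocateFirstAvailable([]subrequest{
		{name: "preferred", fits: true},
		{name: "fallback", fits: true},
	})
	fmt.Println("first choice fits, attempts:", n)

	// First choice fails: a plain single request would have failed here.
	_, n, _ = allocateFirstAvailable([]subrequest{
		{name: "preferred", fits: false},
		{name: "fallback", fits: true},
	})
	fmt.Println("fallback needed, attempts:", n)

	// Worst case: all eight subrequests fail.
	worst := make([]subrequest, maxSubrequests)
	_, n, ok := allocateFirstAvailable(worst)
	fmt.Println("nothing fits, attempts:", n, "allocated:", ok)
}
```

In this model the extra attempts only appear in the second and third cases, which are exactly the cases where a request without alternatives would not have been allocated at all.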

@@ -1010,7 +1017,6 @@ ensure they are handled by the scheduler as described in this KEP.
#### Beta

- Gather feedback
- Implement node scoring

Looks like we initially thought we'd pick this up in the beta cycle but now we're kicking it out of the Prioritized Alternatives scope altogether?

I'm thinking specifically about the comment above the DeviceRequest.FirstAvailable property.

    // DRA does not yet implement scoring, so the scheduler will
    // select the first set of devices that satisfies all the
    // requests in the claim. And if the requirements can
    // be satisfied on more than one node, other scheduling features
    // will determine which node is chosen. This means that the set of
    // devices allocated to a claim might not be the optimal set
    // available to the cluster. Scoring will be implemented later.


3 participants