-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
📖 Update autoscaling from zero enhancement proposal with support for platform-aware autoscale from zero #11962
base: main
Are you sure you want to change the base?
📖 Update autoscaling from zero enhancement proposal with support for platform-aware autoscale from zero #11962
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Welcome @aleskandro! |
Hi @aleskandro. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
4fbe927
to
666c952
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems very straightforward, no objection from me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this generally makes sense to me, and i don't have any objections.
i would like to get some attention on the API field change though, just to make sure folks agree about the name and placement.
Thanks @elmiko. To give more context about the choice of naming and structure for everyone to discuss alternatives. I couldn't consider the architecture as part of the capacity field because the values of keys in that stanza are expected to be a Quantity. I tried to trace back the reasoning the community did about the capacity field and I noticed that the Node objects' status field has the same field - capacity - with the same content we set in the Infrastructure Machine Template objects. The Node objects' status field also has a nodeInfo field with the same name and type that I'm now proposing for the Infrastructure Machine Template objects. Among the alternatives I was evaluating, this seemed the most reasonable to me. |
makes sense to me @aleskandro , no objection from me. |
666c952
to
106bcbd
Compare
/hold Especially for getting consensus on and resolve #11962 (comment) |
106bcbd
to
20f9c76
Compare
20f9c76
to
d6e69f8
Compare
…atform-aware autoscale from zero This commit updates the contract between the cluster-autoscaler Cluster API provider and the infrastructure provider's controllers that reconcile the Infrastructure Machine Template to support platform-aware autoscale from 0 in clusters consisting of nodes heterogeneous in CPU architecture and OS. With this commit, the infrastructure providers implementing controllers to reconcile the status of their Infrastructure Machine Templates for supporting autoscale from 0 will be able to fill the status.nodeInfo stanza with additional information about the nodes. The status.nodeInfo stanza has type corev1.NodeSystemInfo to reflect the same content, the rendered nodes' objects would store in their status field. The cluster-autoscaler can use that information to build the node template labels `kubernetes.io/arch` and `kubernetes.io/os` if that information is present. Suppose the pending pods that trigger the cluster autoscaler have a node selector or a requiredDuringSchedulingIgnoredDuringExecution node affinity concerning the architecture or operating system of the node where they can execute. In that case, the autoscaler will be able to filter the nodes groups options according to the architecture or operating system requested by the pod. The users could already provide this information to the cluster autoscaler through the labels capacity annotation. However, there is no similar capability to support future labels/taints through information set by the reconcilers of the status of Infrastructure Machine Templates.
d6e69f8
to
85da18e
Compare
`platform.NodeSystemInfo` is a struct that contains the architecture and operating system information of the node, to | ||
implement in the `util` package of the `cluster-api` project. The defined types and constants are exported for use by the | ||
cluster-api providers integrations. | ||
Its definition would look like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not put an API struct in the util package.
On the other side I'm also not sure if we should put it in any of our existing packages of our api groups. Because as soon we create a new apiVersion every infra provider CRD also requires a new apiVersion to bump to our new apiVersion.
So I think it might be a good idea to create a new API package for this. One option could be api/clusterautoscaler/<version>
(could be fine to directly start with version v1).
In general I would like to see this API versioned, so we have a chance to evolve it over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sbueringer, I'm looking into the api package. Does it make more sense to put these structs in the api/${version:-v1beta1}/
package? Perhaps we can have a machinetemplate_types.go
file as we do for other APIs.
I'm not sure about api/clusterautoscaler/<version>
because (a) other types are defined in api/<version>/...
, and (b) the clusterautoscaler may not be the direct consume of these types as it reads the MachineTemplate as an unstructured object. Providers, though, should use these types to expose the platform information to the cluster autoscaler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api//...
Will be going away soon. We'll move to something like this
api
core
addons
bootstrap/kubeadm
controlplane/kubeadm
I think we should not put it into the core API package (today api/v1beta1, later api/core/v1beta1) as it is not part of the core CAPI API.
Regarding the package name. What about api/infrastructure/\<version\>
?
But I would like to hear opinions from other maintainers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to mention it. Another alternative could be to just define the struct here in this KEP and then infra providers just copy it over into their API types. Would also avoid coupling between core CAPI and infra providers (i.e. if CAPI does an API bump that affects the core CAPI struct, providers are not affected).
This might be okay as the actual dependency is to the cluster-autoscaler code that uses these fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then infra providers just copy it over into their API types.
Do we expect every provider to implement every single field? If we define the schema here for each possible field, then the providers can copy over only the fields they support, and not have to add fields to their API that they may not actually do anything with.
If they later add a new capability, they just need to update by looking at how the field was defined here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm definitely fine with the copy approach. Embedding structs is really tricky and probably not worth the coupling.
Also unclear to me if all infra providers are able to implement all fields (especially if we are going to expand the struct in the future).
If we don't have the struct in a Go file we can 100% make sure that nobody embeds it (not embedding it is probably the safest option anyway)
@JoelSpeed If you find some time, PTAL :) |
0b19770
to
8349163
Compare
8349163
to
6d2fb96
Compare
/ok-to-test |
lgtm from my side. Pending resolution of #11962 (comment) (I'll be on PTO for the next few weeks, but fine for me if there is consensus with other maintainers) |
package platform | ||
|
||
// Architecture represents the architecture of the node. Its underlying type is a string. | ||
// +enum |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, this marker isn't the correct one for all tools, so you need two markers. In the future we plan to fix this in controller-gen though
// +enum | |
// +kubebuilder:validation:Enum:=amd64;arm64;s390x;ppc64le | |
// +enum |
ArchitectureAmd64 Architecture = "amd64" | ||
ArchitectureArm64 Architecture = "arm64" | ||
ArchitectureS390x Architecture = "s390x" | ||
ArchitecturePpc64le Architecture = "ppc64le" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use PascalCase for enum values to be consistent with other Kube APIs, that, for these probably means all caps
ArchitectureAmd64 Architecture = "amd64" | |
ArchitectureArm64 Architecture = "arm64" | |
ArchitectureS390x Architecture = "s390x" | |
ArchitecturePpc64le Architecture = "ppc64le" | |
ArchitectureAmd64 Architecture = "AMD64" | |
ArchitectureArm64 Architecture = "ARM64" | |
ArchitectureS390x Architecture = "S390X" | |
ArchitecturePpc64le Architecture = "PPC64LE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I double-checked this and k/k is using lower case for these specifically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example from a kind node
nodeInfo:
architecture: arm64
bootID: e71b173f-a131-4aa8-bb05-f677d37e9a55
containerRuntimeVersion: containerd://2.0.2
kernelVersion: 6.8.0-57-generic
kubeProxyVersion: v1.32.2
kubeletVersion: v1.32.2
machineID: 7da9cbb2def44eda8a3a0c35fd10dcbb
operatingSystem: linux
osImage: Debian GNU/Linux 12 (bookworm)
systemUUID: 4435bcc4-f4a1-42be-8099-171de75b12cb
Given that our value is inferred from the value from the Node object I think it would be good if we use the same constants
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, throughout Kubernetes we use the LLVM/Goarch format to represent architectures. I would keep it that way, as the ecosystem already contains a mix of formats (e.g., GNU, LLVM, and hybrid approaches), which makes architecture-specific logic more complex than necessary to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack, if that's already in use for node objects then it makes sense to keep
type NodeInfo struct { | ||
// Architecture is the architecture of the node. It is a string that can be any of (amd64, arm64, s390x, ppc64le). | ||
// +optional | ||
// +kubebuilder:validation:Enum=amd64;arm64;s390x;ppc64le |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On my other comment, probably want the enum tag there rather than here
|
||
// NodeInfo contains information about the node's architecture and operating system. | ||
type NodeInfo struct { | ||
// Architecture is the architecture of the node. It is a string that can be any of (amd64, arm64, s390x, ppc64le). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Architecture is the architecture of the node. It is a string that can be any of (amd64, arm64, s390x, ppc64le). | |
// architecture is the CPU architecture of the node. | |
// It is a string that can be any of AMD64, ARM64, S390X, PPC64LE. |
// +optional | ||
// +kubebuilder:validation:Enum=amd64;arm64;s390x;ppc64le | ||
Architecture Architecture `json:"architecture,omitempty"` | ||
// OperatingSystem is a string representing the operating system of the node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// OperatingSystem is a string representing the operating system of the node. | |
// operatingSystem is a string representing the operating system of the node. |
What kind of values do we support here? What would this look like for a ubuntu based node, vs a windows based node? Do we care about the difference between different linux distros?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have to make sure that we align with how Kubernetes sets Node.status.nodeInfo.operatingSystem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But what does the nodeinfo in the node status do? Looks like linux
or ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup linux
(#11962 (comment)), but I don't know all the possible values that Kubernetes uses or where the value comes from (that's why I just added the generic statement of: "it has to match up", because afaik the cluster-autoscaler expects the Node that will be created from this InfraMachineTemplate to have the same OS)
`platform.NodeSystemInfo` is a struct that contains the architecture and operating system information of the node, to | ||
implement in the `util` package of the `cluster-api` project. The defined types and constants are exported for use by the | ||
cluster-api providers integrations. | ||
Its definition would look like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then infra providers just copy it over into their API types.
Do we expect every provider to implement every single field? If we define the schema here for each possible field, then the providers can copy over only the fields they support, and not have to add fields to their API that they may not actually do anything with.
If they later add a new capability, they just need to update by looking at how the field was defined here
What this PR does / why we need it:
This PR updates the contract between the cluster-autoscaler Cluster API provider and the infrastructure provider's controllers that reconcile the Infrastructure Machine Template to support platform-aware autoscale from 0 in clusters consisting of nodes heterogeneous in CPU architecture and OS.
With this commit, the infrastructure providers implementing controllers to reconcile the status of their Infrastructure Machine Templates for supporting autoscale from 0 will be able to fill the status.nodeInfo stanza with additional information about the nodes.
The status.nodeInfo stanza has type corev1.NodeSystemInfo to reflect the same content, the rendered nodes' objects would store in their status field.
The cluster-autoscaler can use that information to build the node template labels
kubernetes.io/arch
andkubernetes.io/os
if that information is present.Suppose the pending pods that trigger the cluster autoscaler have a node selector or a requiredDuringSchedulingIgnoredDuringExecution node affinity concerning the architecture or operating system of the node where they can execute. In that case, the autoscaler will be able to filter the nodes groups options according to the architecture or operating system requested by the pod.
The users could already provide this information to the cluster autoscaler through the labels capacity annotation. However, there is no similar capability to support future labels/taints through information set by the reconcilers of the status of Infrastructure Machine Templates.
Which issue(s) this PR fixes
Related to #11961
/area provider/core