Is your feature request related to a problem or existing issue? Please describe.
Cluster Autoscaler treats a taint as a startup taint in two ways: the --startup-taint flag, or the reserved key prefix startup-taint.cluster-autoscaler.kubernetes.io/, which is auto-detected without any flag.
On managed platforms the flag is not user-editable, so the reserved prefix is the only way to get startup-taint semantics:
- GKE documents the prefix as the only supported mechanism (docs).
- AKS supports boot-time taints via --node-init-taints, but its autoscaler does not expose --startup-taint (Azure/AKS#3276).
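NRC's validation currently accepts only taint keys that start with readiness.k8s.io/, while the autoscaler auto-detects only keys that start with its own reserved prefix. A single taint key cannot carry both prefixes, so on GKE and AKS, NRC cannot be the component that removes the startup taint.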
Concrete use case
Gating scheduling on autoscaled GPU nodes: the node pool template applies a startup taint, Node Problem Detector publishes a GPU readiness node condition, and NRC removes the taint when the condition is True. This works with self-managed Cluster Autoscaler (where --startup-taint can be set) but not on GKE or AKS.
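The gating step in this use case can be sketched as follows. This is a minimal illustration, not NRC's actual implementation: the Node, Condition, and Taint types are simplified local stand-ins for the Kubernetes API types, and the condition type GPUReady and the taint key suffix gpu-not-ready are hypothetical names chosen for the example.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes node types (assumption: only the
// fields needed for the sketch are modelled).
type Condition struct{ Type, Status string }
type Taint struct{ Key, Effect string }
type Node struct {
	Conditions []Condition
	Taints     []Taint
}

// taintsAfterGate returns the taints the node should carry: the gating taint
// is dropped only once the named condition reports status "True".
func taintsAfterGate(n Node, conditionType, taintKey string) []Taint {
	ready := false
	for _, c := range n.Conditions {
		if c.Type == conditionType && c.Status == "True" {
			ready = true
			break
		}
	}
	if !ready {
		return n.Taints // condition missing or not True: keep gating
	}
	kept := make([]Taint, 0, len(n.Taints))
	for _, t := range n.Taints {
		if t.Key != taintKey {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	// Hypothetical key under the autoscaler's reserved startup-taint prefix.
	const key = "startup-taint.cluster-autoscaler.kubernetes.io/gpu-not-ready"
	node := Node{
		Conditions: []Condition{{Type: "GPUReady", Status: "False"}},
		Taints:     []Taint{{Key: key, Effect: "NoSchedule"}},
	}
	fmt.Println("taints while not ready:", len(taintsAfterGate(node, "GPUReady", key)))

	node.Conditions[0].Status = "True" // Node Problem Detector reports readiness
	fmt.Println("taints once ready:", len(taintsAfterGate(node, "GPUReady", key)))
}
```

The point of the sketch is that the same key has to satisfy both sides: the autoscaler must recognise it as a startup taint, and NRC must be allowed to manage it.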
Describe the solution you'd like
Extend the validation to also accept the Cluster Autoscaler startup-taint prefix:
// +kubebuilder:validation:XValidation:rule="self.key.startsWith('readiness.k8s.io/') || self.key.startsWith('startup-taint.cluster-autoscaler.kubernetes.io/')"
Optionally also allow the legacy ignore-taint.cluster-autoscaler.kubernetes.io/ prefix, which older autoscaler versions auto-detect.
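For reference, the intended behaviour of the extended rule can be mirrored in plain Go. This is only a sketch for checking example keys. The helper name isAcceptedTaintKey and the example key suffixes are hypothetical, and the legacy prefix is included here on the assumption that the optional part of the proposal is accepted.

```go
package main

import (
	"fmt"
	"strings"
)

// Prefixes named in this proposal. The last entry is the optional legacy one.
var acceptedPrefixes = []string{
	"readiness.k8s.io/",
	"startup-taint.cluster-autoscaler.kubernetes.io/",
	"ignore-taint.cluster-autoscaler.kubernetes.io/",
}

// isAcceptedTaintKey mirrors the proposed CEL rule: the key must start with
// one of the accepted prefixes. Hypothetical helper, not part of NRC.
func isAcceptedTaintKey(key string) bool {
	for _, p := range acceptedPrefixes {
		if strings.HasPrefix(key, p) {
			return true
		}
	}
	return false
}

func main() {
	keys := []string{
		"readiness.k8s.io/gpu",
		"startup-taint.cluster-autoscaler.kubernetes.io/gpu",
		"ignore-taint.cluster-autoscaler.kubernetes.io/gpu",
		"example.com/gpu",
	}
	for _, k := range keys {
		fmt.Printf("%s accepted=%v\n", k, isAcceptedTaintKey(k))
	}
}
```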
Describe alternatives you've considered
- Admin-configured prefix allowlist (controller flag). More general, but CRD-level CEL cannot read controller flags, so apply-time validation would move to the reconciler (or webhook).
- Cluster Autoscaler auto-detecting readiness.k8s.io/ as a startup-taint namespace. A good long-term change on the autoscaler side, but this requires coordination among multiple providers.
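To make the first alternative concrete, here is a rough sketch of what reconciler-side (or webhook-side) validation against an admin-configured allowlist might look like. The flag name --allowed-taint-prefixes and both helper functions are hypothetical. Nothing here exists in NRC today.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// parsePrefixes splits a comma-separated flag value and drops empty entries.
func parsePrefixes(csv string) []string {
	var out []string
	for _, p := range strings.Split(csv, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// validateTaintKey is the check a reconciler or webhook would run, since
// CRD-level CEL cannot see the controller's flags.
func validateTaintKey(key string, allowed []string) error {
	for _, p := range allowed {
		if strings.HasPrefix(key, p) {
			return nil
		}
	}
	return fmt.Errorf("taint key %q matches none of the allowed prefixes %v", key, allowed)
}

func main() {
	// Hypothetical flag. The default keeps today's readiness.k8s.io/ behaviour.
	raw := flag.String("allowed-taint-prefixes", "readiness.k8s.io/",
		"comma-separated taint key prefixes the controller may manage")
	flag.Parse()

	allowed := parsePrefixes(*raw)
	key := "startup-taint.cluster-autoscaler.kubernetes.io/gpu"
	if err := validateTaintKey(key, allowed); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("accepted:", key)
}
```

The trade-off is visible in the sketch: a rule with a disallowed key is rejected only once the controller or webhook sees it, rather than by the API server's CRD validation at apply time.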