Improve compatibility with kstatus: avoid "resource is ready" race #9299

Closed
tmmorin opened this issue Aug 24, 2023 · 18 comments

Labels
  • area/api: Issues or PRs related to the APIs
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • needs-priority: Indicates an issue lacks a `priority/foo` label and requires one.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@tmmorin

tmmorin commented Aug 24, 2023

What would you like to be added (User Story)?

There is a low-hanging fruit here: something easy to do that would make CAPI resources play much nicer with tools that rely on kstatus to know whether a CAPI resource is ready.

Detailed Description

CAPI custom resources can be considered "Ready" by the kstatus library before they actually are ready.

In a context where a tool relying on kstatus is used (e.g. FluxCD), this opens the door to inconsistencies: wrongly concluding that something is ready and triggering things that depend on it too early.

This typically happens very shortly after resource creation, lasts for a very short period of time, and resolves quickly on its own. But there is a race condition if the tool relying on kstatus evaluates the resource during that problematic time window.

The problematic window is when, for instance, a Cluster CR has no status yet, or has a status but no status.conditions yet, with only status.observedGeneration set to 1 (equal to metadata.generation). As soon as the resource is processed by its controller, the controller sets its status to include in status.conditions a condition of type Ready with status False, and from then on kstatus reports a correct result (InProgress, which means "not ready").

The typical solution to this issue is to ensure that the CRD defines a default of -1 for status.observedGeneration; this is sufficient for the kstatus library to ignore the rest and conclude that the resource isn't ready yet.
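
To make the race concrete, here is a rough sketch (not from the original report) that feeds a hypothetical, freshly created Cluster-like object to the kstatus library; it assumes the sigs.k8s.io/cli-utils/pkg/kstatus/status package and its Compute function:

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/cli-utils/pkg/kstatus/status"
)

func main() {
	// A just-created CR: generation is 1, but the controller has not written any status yet.
	fresh := &unstructured.Unstructured{Object: map[string]interface{}{
		"apiVersion": "cluster.x-k8s.io/v1beta1",
		"kind":       "Cluster",
		"metadata": map[string]interface{}{
			"name":       "demo",
			"generation": int64(1),
		},
	}}

	res, err := status.Compute(fresh)
	if err != nil {
		panic(err)
	}
	// With no status and no conditions, kstatus falls through to Current ("ready"):
	// this is the race window described above.
	fmt.Println(res.Status)

	// Same object, but with the proposed default of -1 for status.observedGeneration.
	defaulted := fresh.DeepCopy()
	_ = unstructured.SetNestedField(defaulted.Object, int64(-1), "status", "observedGeneration")

	res, err = status.Compute(defaulted)
	if err != nil {
		panic(err)
	}
	// observedGeneration (-1) != metadata.generation (1), so kstatus reports InProgress ("not ready yet").
	fmt.Println(res.Status)
}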

Quoting @stefanprodan (FluxCD dev):

They should set this to make ClusterAPI compatible
// +kubebuilder:default={"observedGeneration":-1}

Example here (FluxCD does this on their own CRDs):
https://github.com/fluxcd/source-controller/blob/a302c71c57e370403042a2e307e3f4446b539730/api/v1/gitrepository_types.go#L328
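
For reference, a minimal sketch of what this could look like on a CAPI-style type (hypothetical, modeled on the FluxCD type linked above; not actual Cluster API code):

package v1beta1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

type Cluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec ClusterSpec `json:"spec,omitempty"`

	// The marker asks the API server to default the whole status object on creation,
	// so a brand-new object would carry status.observedGeneration: -1 before any controller runs.
	// +kubebuilder:default={"observedGeneration":-1}
	Status ClusterStatus `json:"status,omitempty"`
}

// ClusterSpec is elided here; only the status shape matters for this sketch.
type ClusterSpec struct{}

type ClusterStatus struct {
	// ObservedGeneration is the latest generation observed by the controller.
	// +optional
	ObservedGeneration int64 `json:"observedGeneration,omitempty"`

	// +optional
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}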

Anything else you would like to add?

#5472 was opened a while ago and is strongly related, but it has a much more generic/wider scope than what I describe here, which focuses only on making the InProgress/ready state free of this resource-creation race condition.

Label(s) to be applied

/kind feature
/area api

@k8s-ci-robot k8s-ci-robot added the kind/feature and area/api labels Aug 24, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage label Aug 24, 2023
@tmmorin
Author

tmmorin commented Aug 24, 2023

This was discussed on Slack two days ago with @killianmuldoon, who suggested that I file an issue.

@sbueringer
Member

sbueringer commented Aug 28, 2023

@fabriziopandini WDYT?

Seems sort of okay to me, although I'm not a huge fan of doing something like this just for Flux. Is this some sort of standard?

(I assume observedGeneration 0 would work if we didn't use omitempty?)
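
As a small illustration of the omitempty point above (a sketch, not code from this thread): with that tag, a zero value is dropped entirely during serialization, so a consumer like kstatus never sees the field at all.

package main

import (
	"encoding/json"
	"fmt"
)

type demoStatus struct {
	ObservedGeneration int64 `json:"observedGeneration,omitempty"`
}

func main() {
	out, _ := json.Marshal(demoStatus{ObservedGeneration: 0})
	// Prints {}: with omitempty the zero value is omitted, so "observedGeneration: 0"
	// never appears in the serialized object.
	fmt.Println(string(out))
}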

@stefanprodan

Seems sort of okay to me, although I'm not a huge fan of doing something like this just for Flux, is this some sort of standard?

I'm quite intrigued by this statement. Flux is a graduated CNCF project; why wouldn't we want better synergy between CNCF projects?

PS. As @tmmorin mentioned in this issue, this is about Kubernetes kstatus compatibility. Kstatus is a library used in kubectl and also in Flux; maybe other CNCF projects use it too. Docs here: https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus

@sbueringer
Member

sbueringer commented Aug 28, 2023

I didn't mean to say that we don't want to be compatible with Flux. I just wondered if there is some standard across the ecosystem (which includes more than just Flux). For example, at the moment Cluster API itself is using Condition types which diverge from the standard k/k conditions (afaik Cluster API and k/k conditions were introduced at roughly the same time). One interesting difference there is that observedGeneration is part of the conditions themselves in k/k (xref: https://github.com/kubernetes/apimachinery/blob/master/pkg/apis/meta/v1/types.go#L1497).

We have been thinking for a while about whether there is a way to make our status / status.conditions more standard, so we can move utils like our patch helper to controller-runtime instead of requiring our own in Cluster API.

It would just be nice if we could find a way for our conditions to align with Kubernetes-ecosystem-wide standards.

I assume the Kubernetes condition type is compatible? (I mean it basically has to be, right? :)).

So setting the observedGeneration default to -1 would be a stopgap to make Cluster API conditions compatible, but wouldn't it be another problem if we want to move to k/k conditions? (Because they don't have a default of -1, as far as I can tell.)

@stefanprodan

The issue here is not with conditions, but with just-created resources that don't have .status set, before the controller gets the chance to set conditions. The Kubernetes built-in resources are handled in kstatus as exceptions: a Deployment without status will not return ready, while a custom resource would. Using the -1 default tells kstatus that a custom resource hasn't yet been processed by a controller. Another way of handling this is by adopting the kstatus conditions and setting the default for Reconciling to True. In Flux we've adopted the kstatus Reconciling/Stalled conditions, and we also set observedGeneration for each condition, but this is a lot of work compared to setting the status observedGeneration to -1.
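
As a rough sketch of that alternative (assuming metav1.Conditions and the k8s.io/apimachinery/pkg/api/meta helpers; this is not Flux or Cluster API code), a controller could mark the object as Reconciling as soon as it picks it up and record the generation it observed on the condition itself:

package controller

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markReconciling sets the kstatus-style Reconciling condition, including the generation
// the controller has observed, so tools can tell that work is still in flight.
func markReconciling(conditions *[]metav1.Condition, observedGeneration int64) {
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:               "Reconciling",
		Status:             metav1.ConditionTrue,
		Reason:             "Progressing",
		Message:            "reconciliation in progress",
		ObservedGeneration: observedGeneration,
	})
}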

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Jan 27, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Feb 26, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) Mar 27, 2024
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sbueringer
Member

sbueringer commented Apr 9, 2025

/reopen
/remove lifecycle rotten

Let's revisit this given the ongoing work on #11947. Especially the work on moving to metav1.Conditions.
(with CAPI v1.11 (August 2025 release) we will be using metav1.Conditions in status.conditions)

I'm not sure if we can entirely follow what is laid out here: https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus e.g.:

The conditions defined in the library are designed to adhere to the "abnormal-true" polarity pattern (https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties) , i.e. that conditions are present and with a value of true whenever something unusual happens. So the absence of any conditions means everything is normal.

Cluster API follows https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties which states

Condition type names should make sense for humans; neither positive nor negative polarity can be recommended as a general rule

Example: We have an "Available" condition that is "True" if the Cluster is available.

What is not entirely clear to me is why kstatus considers a CR that has a status subresource with an empty status as fully reconciled. I guess they want to handle the case where a CR has a status but it's valid that even after reconcile the status is entirely empty.

@stefanprodan If we added the default value for status.observedGeneration, what would this improve on the Flux side nowadays? I think you'd still need configuration so Flux knows which condition to look for? (The full list of conditions of the Cluster v1beta2 type can be seen here: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20240916-improve-status-in-CAPI-resources.md#cluster-newconditions.)
I'm not sure I understand how generation/observedGeneration relates to the fact that we'd still need some config on the Flux side to specify which condition is the right one to look at (i.e. whether we need one or both of these).

Related: What is the information you are looking for? The v1beta1 Cluster Ready condition doesn't cover much (especially if a Cluster is e.g. upgraded or workers are scaled up)

@JoelSpeed What is your take on setting a default of observedGeneration -1? I couldn't find anything about that in the API conventions.

cc @fabriziopandini

P.S. It was mentioned above that kstatus is used in kubectl. Is there a simple kubectl command I can use with our Cluster CR to play around with this?

@k8s-ci-robot k8s-ci-robot reopened this Apr 9, 2025
@k8s-ci-robot
Contributor

@sbueringer: Reopened this issue.

In response to this:

/reopen
/remove lifecycle rotten

Let's revisit this given the ongoing work on #11947. Especially the work on moving to metav1.Conditions.
(with CAPI v1.11 (August 2025) release we will be using metav1.Conditions in status.conditions)

I'm not sure if we can entirely follow what is laid out here: https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus e.g.:

The conditions defined in the library are designed to adhere to the "abnormal-true" polarity pattern (https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties) , i.e. that conditions are present and with a value of true whenever something unusual happens. So the absence of any conditions means everything is normal.

Cluster API follows https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties which states

Condition type names should make sense for humans; neither positive nor negative polarity can be recommended as a general rule

What is not entirely clear to me is why kstatus considers a CR that has a status subresource with an empty status as fully reconciled. I guess they want to handle the case where a CR has a status but it's valid that even after reconcile the status is entirely empty.

@JoelSpeed What is your take on setting a default of observedGeneration -1? I couldn't find anything about that in the API conventions.

@stefanprodan If we would add the default value, what would this improve on the Flux side nowadays? I think you'll still need configuration so Flux knows which condition to look for? (the full list of conditions of the Cluster v1beta2 type can be seen here: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20240916-improve-status-in-CAPI-resources.md#cluster-newconditions)
Related: What is the information you are looking for? The v1beta1 Cluster Ready condition doesn't cover much (especially if a Cluster is e.g. upgraded or workers are scaled up)

cc @JoelSpeed @fabriziopandini

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority label Apr 9, 2025
@stefanprodan

@sbueringer given that the Ready condition that kstatus looks for is being replaced with Available, we can't rely on kstatus in Flux for ClusterAPI. Flux 2.5 has added custom health checks using CEL, and we've added ClusterAPI to our docs here: https://fluxcd.io/flux/cheatsheets/cel-healthchecks/#cluster

After upgrading to v1beta2 the current health checks will fail, so Flux users will need to update their CEL expression and switch to Available, or write an expression that looks for both.

@sbueringer
Member

sbueringer commented Apr 9, 2025

I think something like this should work

healthCheckExprs:
  - apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    failed: status.conditions.filter(e, e.type == 'Ready' || e.type == 'Available').all(e, e.status == 'False')
    current: status.conditions.filter(e, e.type == 'Ready' || e.type == 'Available').all(e, e.status == 'True')

(Note for other readers (discussed offline): this won't actually use v1beta1, it will use the preferred version, which is why we have to handle both conditions as soon as v1beta2 has been added to the CAPI CRD.)

@JoelSpeed
Contributor

(I assume observedGeneration 0 would work, if we wouldn't use omitempty?)

Why does 0 not work, even with omitempty?

The Kubernetes builtin resources are handled in kstatus as exceptions, a Deployment without status will not return ready, while a custom resource would.

Why would a custom resource without status be considered ready? A lack of status, assuming the resource has a status to set, clearly means the controller hasn't got to it yet

@JoelSpeed What is your take on setting a default of observedGeneration -1? I couldn't find anything about that in the API conventions.

Observed generation already has a sensible default when the object is created: it is omitted, just like the rest of the status. I'm not sure why we would want to set this to a non-obvious default.

From an end-user perspective, a lack of status clearly means "nothing has happened yet", whereas a partial status with a default value looks odd, and it isn't necessarily obvious why there's a partial status.

I'm currently -1 on setting any defaults on status objects.

@sbueringer
Member

sbueringer commented Apr 9, 2025

Thx for the feedback everyone.

given that the Ready condition that kstatus looks for is being replaced with Available we can't rely on kstatus in Flux for ClusterAPI.

I think, based on that, there would no longer be any benefit in setting a default of -1 on observedGeneration.

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) May 9, 2025
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
