KEP-5241: Beta Feature Gate Promotion Requirements #5242

Open: wants to merge 3 commits into base: master

Conversation

deads2k (Contributor) commented Apr 14, 2025

  • One-line PR description: Feature gates must include all functional, security, and testing requirements along with resolving all issues and gaps identified prior to being enabled by default.
  • Other comments:

@k8s-ci-robot added the cncf-cla: yes label (Indicates the PR's author has signed the CNCF CLA.) Apr 14, 2025
@k8s-ci-robot added the labels kind/kep (Categorizes KEP tracking issues and PRs modifying the KEP directory), sig/architecture (Categorizes an issue or PR as relevant to SIG Architecture.), and size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.) Apr 14, 2025
@deads2k changed the title from "KEP-5241: Requirements for feature gates enabled in production clusters by default" to "KEP-5241: Beta Feature Gate Promotion Requirements" Apr 14, 2025
@deads2k force-pushed the production-quality branch from d15bc9d to 3b33ffc, April 14, 2025 15:55
in v1.Y broke under the same feature gate.

#### Who will make sure that new KEPs follow the promotion rules?
We'll adjust the KEP template to indicate the allowed criteria, so authors should notice.
Contributor

Only for new features. Authors of some existing KEP only notice if they actively sync with the KEP template (rarely done) or we announce this KEP here widely.

Member

There's another topic about that in SIG Arch today. bit.ly/kep-versioning_phase1

@kikisdeliveryservice added the kind/template label (Categorizes changes to the KEP template) Apr 21, 2025
@kikisdeliveryservice added this to the v1.34 milestone Apr 21, 2025
3. Beta means that a feature gate is usually enabled in all production Kubernetes clusters by default
and that feature can be disabled.
Exceptions exist for entirely new APIs and some node features, but this is broadly the case.
4Alpha means that a feature gate is disabled in all production Kubernetes clusters by default and
Member

Suggested change
- 4Alpha means that a feature gate is disabled in all production Kubernetes clusters by default and
+ 4. Alpha means that a feature gate is disabled in all production Kubernetes clusters by default and

@deads2k force-pushed the production-quality branch from 5c76f39 to 43e3c6a, May 7, 2025 17:57
johnbelamaric (Member) left a comment

/approve

@k8s-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) May 7, 2025
@dims self-assigned this May 7, 2025
dims (Member) commented May 7, 2025

/approve

thanks for writing this up @deads2k

k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, dims, johnbelamaric

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

derekwaynecarr (Member) left a comment

i think this may need a tweak on the template readme to call out monitoring/instrumentation.


## Summary

Feature gates must include all functional, security, monitoring, and testing requirements along with
Member

unless i am mistaken, the summary adds 'monitoring' requirements, but it's not highlighted earlier. i do not see it called out in beta or GA explicitly unless i missed it. relevant instrumentation is important, but i am not sure if addition of a metric observed during a beta phase extends the beta phase, blocks promotion to GA phase, or what. wdyt?

To balance these concerns, we are changing how we evaluate Beta and GA stability criteria.
The only valid GA criteria are “all issues and gaps identified as feedback during beta are resolved”.
Promotion from Beta to GA must be zero-diff for the release.
This means that Beta criteria must include all functional, security, monitoring, and testing requirements along
Member

if this is the intent, recommend calling out instrumentation in the kep template under beta, and note that any new instrumentation would extend the beta another release. i am not sure if this introduces a negative incentive in some cases, but we would have to see how that plays out. in some cases, correlating a metric to a single feature may not always be clear if it had cross-cutting value.

Member

I don't think adding a new metric should "automatically" require a second beta, for these reasons:

  1. Metrics themselves have independent alpha/beta/stable guarantees which are not generally in sync with the feature they monitor.
  2. There is space between metrics that are "necessary to support the feature in production" and those that are "useful to have".

The evaluation of whether the metrics that are necessary to support a feature in production are all there is a GA criterion. This KEP update is saying that "we should require those to be available in beta", which I think is a good thing.

Even if we screw that up, and miss something that is critical in beta, adding it before GA should be sufficient. I don't think we want to say "all bugs/missing metrics must be fixed and then we must do another beta before GA".

Contributor Author

I've updated "zero-diff" to "no significant change" to try to better reflect our intent.

Labels
  • approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/kep (Categorizes KEP tracking issues and PRs modifying the KEP directory)
  • kind/template (Categorizes changes to the KEP template)
  • sig/architecture (Categorizes an issue or PR as relevant to SIG Architecture.)
  • size/L (Denotes a PR that changes 100-499 lines, ignoring generated files.)

10 participants