KEP-3926: updating the PRR questionnaire #5645
ibihim commented Oct 9, 2025
- One-line PR description: Refining PRR questionnaire based on alpha implementation learnings
- Issue link: Handling undecryptable resources #3926
- Other comments:
- Updates Production Readiness Review sections with corrections from alpha implementation
Force-pushed from 6dec232 to a893ad0.
/approve
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: enj, ibihim.
@deads2k could you take a look at this for PRR?
soltysh left a comment:
Some answers required for beta are missing, but the biggest gap is links to integration tests that show this new and risky functionality works as expected.
@@ -558,6 +559,11 @@ in back-to-back releases.
- Error type is implemented
Above, the links to integration tests are missing. Since you're not planning e2e tests at all, that's a major blocker for promotion.
If this is a major blocker for promotion, we could add e2e tests; I just thought they might not be necessary. By links to integration tests, do you mean links to the source code or to the PRs?
Don't do e2e tests. They are brittle and not suited to this. Integration tests exercise the real kube-apiserver and etcd.
Integration tests here means test-server tests; we have one for kube-apiserver.
Nit: add a note in the e2e section that you're only going with integration tests, because they are much better suited to the test scenarios you're exercising, where you need full control over kube-apiserver and etcd for the duration of the test.
I will then update this section: `##### e2e tests`, right?
https://github.com/kubernetes/kubernetes/pull/97058/files#diff-7826f7adbc1996a05ab52e3f5f02429e94b68ce6bce0dc534d1be636154fded3R246-R282
-->
The implementation, including tests, is waiting for an approval of this enhancement.
All tests verify feature enablement / disablement to ensure backwards
Again, add links to the tests here, or make sure they are included in the earlier sections.
If the average time of `apiserver_request_duration_seconds{verb="delete"}` of the kube-apiserver
increases greatly, this feature might have caused a performance regression.
If the average time of `apiserver_request_duration_seconds{verb="delete"}` or
`apiserver_request_duration_seconds{verb="list"}` the amount of
Nit: can you make sure these metrics are also listed in the metrics section at the end of kep.yaml, please?
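For readers following the metric discussion: a Prometheus histogram such as `apiserver_request_duration_seconds` does not export an average directly; it is usually derived from the `_sum` and `_count` series over a time window. The sketch below shows only that arithmetic, assuming two scrapes of those series for `verb="delete"`. The function names, the sample values, and the 2x threshold are hypothetical and do not come from the KEP.

```python
def average_latency(sum_start, sum_end, count_start, count_end):
    """Average seconds per request between two scrapes of a histogram.

    sum_* are samples of <metric>_sum, count_* are samples of <metric>_count.
    Returns None when no requests were observed in the window.
    """
    requests = count_end - count_start
    if requests <= 0:
        return None  # no traffic, or a counter reset between the scrapes
    return (sum_end - sum_start) / requests


def regressed(baseline_avg, current_avg, factor=2.0):
    """Flag a possible regression when the average grows by more than
    `factor` (a hypothetical threshold, not taken from the KEP)."""
    if baseline_avg is None or current_avg is None:
        return False
    return current_avg > baseline_avg * factor


# Hypothetical (sum, count) samples for verb="delete" at three scrape times.
baseline = average_latency(10.0, 14.0, 100, 140)  # 4 s spent on 40 requests
current = average_latency(14.0, 26.0, 140, 180)   # 12 s spent on 40 requests
print(baseline, current, regressed(baseline, current))
```

What counts as "increases greatly" is left open in the KEP text, so any concrete threshold would need to be agreed on separately.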
Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->
No testing of upgrade->downgrade->upgrade necessary.
Can you explain why no such tests are necessary?
-->
All corrupt object DELETEs complete, when feature is enabled, option is set and
the user is authorized.
The question is about the SLO, i.e. what is the expected time for a delete to complete? Check https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md for suggestions.
- Impact of its outage on the feature:
- Impact of its degraded performance or high-error rates on the feature:
-->
- kube-apiserver
This question is about services external to Kubernetes, so in your case "No" is a sufficient answer.
approver: "@deads2k"
No newline at end of file
approver: "@deads2k"
beta:
approver: "@deads2k"
You can put my name here, since I'm looking at this one.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Force-pushed from 76939f8 to b368ea9.
Force-pushed from 475f8ce to 1e88573.
- Extended testing is available
- Dry-Run is implemented
### Upgrade / Downgrade Strategy
Both this section and the Version Skew Strategy section must be filled in. Your answers should be straightforward: since your change is contained within kube-apiserver, I don't expect either to need a specific strategy. But please write that down explicitly.
rollout. Similarly, consider large clusters and how enablement/disablement
will rollout across nodes.
-->
No impact on rollout or rollback.
Explain why; the reasoning will be similar to one of the previous sections. Basically, the change is contained within kube-apiserver only, so you're not expecting any problems during rollout or rollback.
No.
### Troubleshooting
This entire section is missing answers, as are the Implementation History, Drawbacks, and Alternatives sections.
Adds Production Readiness Review responses for beta promotion:
- Feature enablement/rollback documentation
- Monitoring requirements with metrics
- Scalability considerations
- Troubleshooting guidance
- Test plan with integration test references

- Explain why integration tests are used instead of e2e
- Add test links with feature gate toggle line numbers
- Fill Upgrade/Downgrade Strategy section
- Fill Version Skew Strategy section
- Expand rollout/rollback failure explanation
- Answer Troubleshooting section questions
- Add Implementation History with alpha/beta milestones
- Add Drawbacks section
- Add Alternatives section
Force-pushed from 58fb12b to 677c572.
It seems all the changes to PRR were copied over to #5739, so I'm going to close this as a duplicate of the other.
/close
@soltysh: Closed this PR.