Commit 49b7592

Merge pull request #5175 from serathius/kep-4988-checklist
Fix template and fill checklist KEP-4988
2 parents 687b9fa + 4e9fc54 commit 49b7592

keps/sig-api-machinery/4988-snapshottable-api-server-cache/README.md

Lines changed: 35 additions & 30 deletions
@@ -7,10 +7,11 @@
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
  - [Proposal](#proposal)
- - [Snapshotting](#snapshotting)
- - [Cache Inconsistency Detection Mechanism](#cache-inconsistency-detection-mechanism)
  - [Risks and Mitigations](#risks-and-mitigations)
  - [Memory overhead](#memory-overhead)
+ - [Design Details](#design-details)
+ - [Snapshotting](#snapshotting)
+ - [Cache Inconsistency Detection Mechanism](#cache-inconsistency-detection-mechanism)
  - [Test Plan](#test-plan)
  - [Prerequisite testing updates](#prerequisite-testing-updates)
  - [Unit tests](#unit-tests)
@@ -39,20 +40,20 @@

  Items marked with (R) are required *prior to targeting to a milestone / release*.

- - [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- - [ ] (R) KEP approvers have approved the KEP status as `implementable`
- - [ ] (R) Design details are appropriately documented
- - [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+ - [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+ - [x] (R) KEP approvers have approved the KEP status as `implementable`
+ - [x] (R) Design details are appropriately documented
+ - [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
  - [ ] e2e Tests for all Beta API Operations (endpoints)
- - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- - [ ] (R) Graduation criteria is in place
- - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- - [ ] (R) Production readiness review completed
- - [ ] (R) Production readiness review approved
+ - [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+ - [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free
+ - [x] (R) Graduation criteria is in place
+ - [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+ - [x] (R) Production readiness review completed
+ - [x] (R) Production readiness review approved
  - [ ] "Implementation History" section is up-to-date for milestone
  - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- - [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+ - [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

  [kubernetes.io]: https://kubernetes.io/
  [kubernetes/enhancements]: https://git.k8s.io/enhancements
@@ -123,6 +124,25 @@ a robust mechanism for detecting inconsistencies is crucial.
  Therefore, we propose an automatic mechanism to validate cache consistency with etcd,
  providing users with confidence in the cache's accuracy without requiring manual debugging efforts.

+ ### Risks and Mitigations
+
+ #### Memory overhead
+
+ B-tree snapshots are designed to minimize memory overhead by storing pointers to
+ the actual objects, rather than the objects themselves. Since the objects are
+ already cached to serve watch events, the primary memory impact comes from the
+ B-tree structure itself. To quantify the memory overhead, we run 5k scalability tests.
+ They should represent the worst case scenario, as they utilize large number of small objects.
+ The results are promising:
+
+ * **Object Allocations:** Allocation profile collected during the test test has
+ shown an increase of 7GB in object allocations, which translates to a
+ negligible 0.2% of total allocations.
+ * **Memory Usage:** Memory in use profile collected during the test has shown
+ Btree memory usage of 300MB, representing a 1.3% of total memory used.
+
+ ## Design Details
+
  ### Snapshotting

  1. **Snapshot Creation:** When a watch event is received, the cacher creates
@@ -187,23 +207,6 @@ apiserver_storage_hash{resource="pods", storage="cache", hash="f364dcd6b58ebf020
  ```
  Metric values for each resource should be updated atomically to prevent false positives.

- ### Risks and Mitigations
-
- #### Memory overhead
-
- B-tree snapshots are designed to minimize memory overhead by storing pointers to
- the actual objects, rather than the objects themselves. Since the objects are
- already cached to serve watch events, the primary memory impact comes from the
- B-tree structure itself. To quantify the memory overhead, we run 5k scalability tests.
- They should represent the worst case scenario, as they utilize large number of small objects.
- The results are promising:
-
- * **Object Allocations:** Allocation profile collected during the test test has
- shown an increase of 7GB in object allocations, which translates to a
- negligible 0.2% of total allocations.
- * **Memory Usage:** Memory in use profile collected during the test has shown
- Btree memory usage of 300MB, representing a 1.3% of total memory used.
-
  ### Test Plan

  [x] I/we understand the owners of the involved components may require updates to
@@ -385,6 +388,8 @@ Disabling the feature-gate.

  ## Implementation History

+ - 1.33: KEP proposed and approved for implementation
+
  ## Drawbacks

  <!--

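Note on the "Memory overhead" text moved by this commit: the claim that snapshots store pointers to objects rather than the objects themselves can be illustrated with a minimal Go sketch. This is not the KEP's implementation. The `Object` and `Snapshot` types and the `With` and `Get` methods are hypothetical names, and a key-sorted slice of pointers stands in for the B-tree. The only point it shows is that two snapshots can reference the same cached object, so the extra memory per snapshot is the index structure and not a copy of the payload.

```go
package main

import (
	"fmt"
	"sort"
)

// Object stands in for a cached API object (hypothetical type).
type Object struct {
	Key  string
	Data []byte
}

// Snapshot is an immutable, key-sorted view that holds pointers only.
// A sorted slice is used here in place of a B-tree for brevity.
type Snapshot struct {
	ResourceVersion uint64
	items           []*Object
}

// search returns the index of the first item whose key is >= key.
func (s Snapshot) search(key string) int {
	return sort.Search(len(s.items), func(i int) bool { return s.items[i].Key >= key })
}

// Get looks up an object by key using binary search.
func (s Snapshot) Get(key string) (*Object, bool) {
	i := s.search(key)
	if i < len(s.items) && s.items[i].Key == key {
		return s.items[i], true
	}
	return nil, false
}

// With returns a new snapshot containing obj (inserted or replaced).
// Only the slice of pointers is copied; the objects are shared.
func (s Snapshot) With(rv uint64, obj *Object) Snapshot {
	i := s.search(obj.Key)
	replace := i < len(s.items) && s.items[i].Key == obj.Key
	items := make([]*Object, 0, len(s.items)+1)
	items = append(items, s.items[:i]...)
	items = append(items, obj)
	if replace {
		items = append(items, s.items[i+1:]...)
	} else {
		items = append(items, s.items[i:]...)
	}
	return Snapshot{ResourceVersion: rv, items: items}
}

func main() {
	a := &Object{Key: "pods/default/a", Data: make([]byte, 1024)}
	b := &Object{Key: "pods/default/b", Data: make([]byte, 1024)}

	s1 := Snapshot{}.With(1, a)
	s2 := s1.With(2, b)

	got1, _ := s1.Get(a.Key)
	got2, _ := s2.Get(a.Key)
	fmt.Println(got1 == got2) // true: both snapshots point at the same object

	_, ok := s1.Get(b.Key)
	fmt.Println(ok) // false: the older snapshot is unaffected by the later insert
}
```

Unlike this sketch, which copies the whole pointer slice on every update, a copy-on-write B-tree can also share unchanged nodes between snapshots, which would reduce the per-snapshot cost further.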