|
7 | 7 | - [Goals](#goals)
|
8 | 8 | - [Non-Goals](#non-goals)
|
9 | 9 | - [Proposal](#proposal)
|
10 |
| - - [Snapshotting](#snapshotting) |
11 |
| - - [Cache Inconsistency Detection Mechanism](#cache-inconsistency-detection-mechanism) |
12 | 10 | - [Risks and Mitigations](#risks-and-mitigations)
|
13 | 11 | - [Memory overhead](#memory-overhead)
|
| 12 | +- [Design Details](#design-details) |
| 13 | + - [Snapshotting](#snapshotting) |
| 14 | + - [Cache Inconsistency Detection Mechanism](#cache-inconsistency-detection-mechanism) |
14 | 15 | - [Test Plan](#test-plan)
|
15 | 16 | - [Prerequisite testing updates](#prerequisite-testing-updates)
|
16 | 17 | - [Unit tests](#unit-tests)
|
|
39 | 40 |
|
40 | 41 | Items marked with (R) are required *prior to targeting to a milestone / release*.
|
41 | 42 |
|
42 |
| -- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) |
43 |
| -- [ ] (R) KEP approvers have approved the KEP status as `implementable` |
44 |
| -- [ ] (R) Design details are appropriately documented |
45 |
| -- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) |
| 43 | +- [x] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) |
| 44 | +- [x] (R) KEP approvers have approved the KEP status as `implementable` |
| 45 | +- [x] (R) Design details are appropriately documented |
| 46 | +- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) |
46 | 47 | - [ ] e2e Tests for all Beta API Operations (endpoints)
|
47 |
| - - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
48 |
| - - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free |
49 |
| -- [ ] (R) Graduation criteria is in place |
50 |
| - - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
51 |
| -- [ ] (R) Production readiness review completed |
52 |
| -- [ ] (R) Production readiness review approved |
| 48 | + - [x] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
| 49 | + - [x] (R) Minimum Two Week Window for GA e2e tests to prove flake free |
| 50 | +- [x] (R) Graduation criteria is in place |
| 51 | + - [x] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) |
| 52 | +- [x] (R) Production readiness review completed |
| 53 | +- [x] (R) Production readiness review approved |
53 | 54 | - [ ] "Implementation History" section is up-to-date for milestone
|
54 | 55 | - [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
|
55 |
| -- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes |
| 56 | +- [x] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes |
56 | 57 |
|
57 | 58 | [kubernetes.io]: https://kubernetes.io/
|
58 | 59 | [kubernetes/enhancements]: https://git.k8s.io/enhancements
|
@@ -123,6 +124,25 @@ a robust mechanism for detecting inconsistencies is crucial.
|
123 | 124 | Therefore, we propose an automatic mechanism to validate cache consistency with etcd,
|
124 | 125 | providing users with confidence in the cache's accuracy without requiring manual debugging efforts.
|
125 | 126 |
|
| 127 | +### Risks and Mitigations |
| 128 | + |
| 129 | +#### Memory overhead |
| 130 | + |
| 131 | +B-tree snapshots are designed to minimize memory overhead by storing pointers to |
| 132 | +the actual objects, rather than the objects themselves. Since the objects are |
| 133 | +already cached to serve watch events, the primary memory impact comes from the |
| 134 | +B-tree structure itself. To quantify the memory overhead, we ran 5k scalability tests. |
| 135 | +They should represent the worst-case scenario, as they use a large number of small objects. |
| 136 | +The results are promising: |
| 137 | + |
| 138 | +* **Object Allocations:** The allocation profile collected during the test |
| 139 | + showed an increase of 7GB in object allocations, which translates to a |
| 140 | + negligible 0.2% of total allocations. |
| 141 | +* **Memory Usage:** The in-use memory profile collected during the test showed |
| 142 | + B-tree memory usage of 300MB, representing 1.3% of total memory used. |
| 143 | + |
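The pointer-sharing argument above can be illustrated with a minimal sketch. It is not the cacher's implementation: the `object` and `store` types are hypothetical, and a sorted slice of pointers stands in for the B-tree so the example needs only the standard library. It shows only that taking a snapshot copies pointers, not object payloads.

```go
package main

import (
	"fmt"
	"sort"
)

// object is a hypothetical stand-in for a cached API object.
// The payload is what dominates memory, not the index entry.
type object struct {
	key     string
	payload []byte
}

// store is a hypothetical, simplified ordered index. It keeps pointers
// sorted by key, so copying the index never copies a payload.
type store struct {
	items []*object
}

// upsert inserts obj, or replaces the pointer stored under obj.key,
// keeping items sorted by key.
func (s *store) upsert(obj *object) {
	i := sort.Search(len(s.items), func(i int) bool { return s.items[i].key >= obj.key })
	if i < len(s.items) && s.items[i].key == obj.key {
		s.items[i] = obj
		return
	}
	s.items = append(s.items, nil)
	copy(s.items[i+1:], s.items[i:])
	s.items[i] = obj
}

// snapshot copies only the pointer slice; the objects stay shared
// between the snapshot and the live store.
func (s *store) snapshot() []*object {
	return append([]*object(nil), s.items...)
}

func main() {
	s := &store{}
	s.upsert(&object{key: "pods/b", payload: make([]byte, 1024)})
	s.upsert(&object{key: "pods/a", payload: make([]byte, 1024)})

	snap := s.snapshot()

	// A later update replaces the pointer in the live store only.
	s.upsert(&object{key: "pods/a", payload: make([]byte, 2048)})

	fmt.Println(snap[0] == s.items[0]) // false: the snapshot still holds the old pods/a
	fmt.Println(snap[1] == s.items[1]) // true: the unchanged pods/b is shared
	fmt.Println(len(snap[0].payload))  // 1024
}
```

In this sketch each snapshot costs one pointer per object. A B-tree with copy-on-write cloning would presumably lower that further by sharing unchanged nodes between snapshots, which is consistent with the small overhead reported above.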
| 144 | +## Design Details |
| 145 | + |
126 | 146 | ### Snapshotting
|
127 | 147 |
|
128 | 148 | 1. **Snapshot Creation:** When a watch event is received, the cacher creates
|
@@ -187,23 +207,6 @@ apiserver_storage_hash{resource="pods", storage="cache", hash="f364dcd6b58ebf020
|
187 | 207 | ```
|
188 | 208 | Metric values for each resource should be updated atomically to prevent false positives.
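One way to read the atomicity requirement is that a reader must never see a fresh digest from one storage next to a stale digest from the other. The sketch below illustrates that idea under stated assumptions: `hashPair` and `hashRegistry` are hypothetical names, the digests are placeholders, and the real metric plumbing is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// hashPair holds the digests computed for one resource from both
// storages. Hypothetical type, for illustration only.
type hashPair struct {
	etcd  string
	cache string
}

// hashRegistry publishes digests per resource under a lock, so a
// reader sees either the old pair or the new pair, never a mix.
type hashRegistry struct {
	mu     sync.RWMutex
	hashes map[string]hashPair
}

func newHashRegistry() *hashRegistry {
	return &hashRegistry{hashes: map[string]hashPair{}}
}

// set replaces both digests for a resource in a single step.
func (r *hashRegistry) set(resource string, p hashPair) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.hashes[resource] = p
}

// consistent reports whether the two digests match, and whether any
// digest has been published for the resource at all.
func (r *hashRegistry) consistent(resource string) (bool, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	p, found := r.hashes[resource]
	return found && p.etcd == p.cache, found
}

func main() {
	reg := newHashRegistry()
	reg.set("pods", hashPair{etcd: "digest-1", cache: "digest-1"})
	match, found := reg.consistent("pods")
	fmt.Println(match, found) // true true
}
```

Updating the two digests separately would let a reader compare a new etcd digest against the previous cache digest and report a mismatch that does not exist.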
|
189 | 209 |
|
190 |
| -### Risks and Mitigations |
191 |
| - |
192 |
| -#### Memory overhead |
193 |
| - |
194 |
| -B-tree snapshots are designed to minimize memory overhead by storing pointers to |
195 |
| -the actual objects, rather than the objects themselves. Since the objects are |
196 |
| -already cached to serve watch events, the primary memory impact comes from the |
197 |
| -B-tree structure itself. To quantify the memory overhead, we ran 5k scalability tests. |
198 |
| -They should represent the worst-case scenario, as they use a large number of small objects. |
199 |
| -The results are promising: |
200 |
| - |
201 |
| -* **Object Allocations:** The allocation profile collected during the test |
202 |
| - showed an increase of 7GB in object allocations, which translates to a |
203 |
| - negligible 0.2% of total allocations. |
204 |
| -* **Memory Usage:** The in-use memory profile collected during the test showed |
205 |
| - B-tree memory usage of 300MB, representing 1.3% of total memory used. |
206 |
| - |
207 | 210 | ### Test Plan
|
208 | 211 |
|
209 | 212 | [x] I/we understand the owners of the involved components may require updates to
|
@@ -385,6 +388,8 @@ Disabling the feature-gate.
|
385 | 388 |
|
386 | 389 | ## Implementation History
|
387 | 390 |
|
| 391 | +- 1.33: KEP proposed and approved for implementation |
| 392 | + |
388 | 393 | ## Drawbacks
|
389 | 394 |
|
390 | 395 | <!--
|
|