-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
container: make cpu_manager_policy
optional in kubelet_config
#11572
container: make cpu_manager_policy
optional in kubelet_config
#11572
Conversation
Hello! I am a robot. Tests will require approval from a repository maintainer to run. @melinath, a repository maintainer, has been assigned to review your changes. If you have not received review feedback within 2 business days, please leave a comment on this PR asking them to take a look. You can help make sure that review is quick by doing a self-review and by running impacted tests locally. |
Here's the full PR where it was originally marked required: #3760 |
Hi there, I'm the Modular magician. I've detected the following information about your changes: Diff reportYour PR generated some diffs in downstreams - here they are.
Missing test reportYour PR includes resource fields which are not covered by any test. Resource: resource "google_container_cluster" "primary" {
node_config {
kubelet_config {
cpu_manager_policy = # value needed
}
}
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding this correctly, the API has an API-side default of "none", but if you send ""
or "none"
it returns an empty value? If so, this might fall under https://googlecloudplatform.github.io/magic-modules/develop/permadiff/#default_if_empty
@hoskeri could you confirm the expected API behavior? |
Tests analyticsTotal tests: 203 Click here to see the affected service packages
Action takenFound 2 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
|
I’m not sure if there’s a permadiff in this case; I imagine a real fix to the linked issue may involve some more code changes. This is only intended to be a partial fix in that I don’t believe this parameter is actually required? But I think the behavior may be a net neutral or slight improvement as it is, and also lets people set unrelated parameters when needed. If the value is unset, I believe the behavior should already be default by setting it to an empty string or If I’m understanding the original intent correctly the main point was to make sure that at least one parameter was set in the block (which I think there’s a Maybe there are some other cases where certain things are required, but AFAICT, even the other So this PR solves the problem of people not being able to set other parameters in the same block without having to set Put differently: other than allowing some people to remove the attribute, I don't believe this should cause a change in behavior for anyone who has it currently set to one of the 3 allowed values ( I don’t think this should change the behavior for existing users as best I could ascertain. |
|
note: with this change, we could technically remove |
I can't comment on how it's supposed to behave, obviously, so hopefully we'll get a response on that, but: https://pkg.go.dev/google.golang.org/api/container/v1#NodeKubeletConfig
Even as-is, though, What I see is if I create a new cluster with the current 6.0.1 (unpatched) provider, it sends node_config {
kubelet_config {
cpu_manager_policy = "none"
}
} Call to API during initial cluster creation:
Interestingly, the API does seem to return a value vs. no value in that case: maybe @hoskeri can clarify. I'm not sure what that old comment about
( This is probably good from the standpoint of the tf provider - there is no permadrift when setting the value to |
@melinath This PR has been waiting for review for 3 weekdays. Please take a look! Use the label |
Side note: I confirmed with someone that this was (at the time this was added) common practice at one point to require one attribute to avoid a situation where an empty block was defined and cause issues, and apparently there are much better checks for this in place now. |
@GoogleCloudPlatform/terraform-team @melinath This PR has been waiting for review for 1 week. Please take a look! Use the label |
@melinath not sure if you heard back from @hoskeri, but a question I have is whether if we can establish that this PR is relatively safe / neutral as-is in terms of behavior, whether you and the team would feel comfortable allowing this as-is before a proper resolution to hashicorp/terraform-provider-google#19225. I could do some more testing if there are specific scenarios you're thinking of, but IMO, this will improve the existing behavior in a relatively safe way. I do absolutely think the behavior and the docs around the confusing should eventually be resolved and made more consistent, but I think that this change as-is should help avoid confusing diffs / behavior to users who need to set other values within Once we get into coercing |
apologies for the delayed review. I've been working through a backlog. I should be able to take a look tomorrow. In general this does look like a good change; I just want to do some manual testing to make sure I understand how it behaves. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having some trouble reproducing the described permadiff (unless I try to update a cluster, but that's due to hashicorp/terraform-provider-google#19225, which I agree is orthogonal to this issue and doesn't need to be fixed at the same time.) hashicorp/terraform-provider-google#19225 so they don't need to be fixed in the same PR.
My main concern is that treating ""
(unset) and "none"
as distinct values even though they both behave the same may cause confusion for users, since the server could silently change the behavior for that case (similar to what will happen with insecure_kubelet_readonly_port_enabled
eventually).
However, this block may also have problems with sending values to the API when they're unset - ref: hashicorp/terraform-provider-google#15767 and hashicorp/terraform-provider-google#19428 (for another block on the same resource.)
I'm currently inclined to go with your proposed implementation and just make the field optional with no additional changes.
@@ -1172,6 +1172,7 @@ func expandKubeletConfig(v interface{}) *container.NodeKubeletConfig { | |||
kConfig := &container.NodeKubeletConfig{} | |||
if cpuManagerPolicy, ok := cfg["cpu_manager_policy"]; ok { | |||
kConfig.CpuManagerPolicy = cpuManagerPolicy.(string) | |||
kConfig.ForceSendFields = append(kConfig.ForceSendFields, "CpuManagerPolicy") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause the field to be sent even if unset - per hashicorp/terraform-provider-google#15767 that might not be necessary / desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's been a few, so I may have to run through some different scenarios to remember exactly why this ended up seemingly being needed.
IIRC, there was some kind of issue when I didn't have it set -- at the least, I did try it first without that line added.
If you're already setup to test it, you could try building the provider with this line removed and step through some of the tests that I did. If I've got time to dig into it, I will try to see as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did at least a basic test with this line removed again, and a config with other kubelet_config
stuff set seemed to do what I'd expect.
Another scenario to look at is when the nodepool is managed separately (where nodepool updates do work)... I think there's a case where removing the attribute when it was set might cause an issue when the attribute doesn't get set in the API call. That said, I haven't been able to induce that issue thus far with this example, and TestAccContainerNodePool_withKubeletConfig
passes, so I'm pushing up an update that removes this again. I didn't test ""
but since that's not a documented value, probably doesn't matter too much even if that did break somehow.
resource "google_container_cluster" "test_cfs_quota" {
name = "test-cfs-quota"
location = "us-central1-f"
initial_node_count = 1
node_config {
kubelet_config {
cpu_cfs_quota = true
insecure_kubelet_readonly_port_enabled = "FALSE"
}
}
deletion_protection = false
network = "default"
subnetwork = "default"
}
resource "google_container_node_pool" "test" {
name = "secondary-test"
cluster = google_container_cluster.test_cfs_quota.id
node_count = 1
node_config {
kubelet_config {
cpu_cfs_quota = true
cpu_manager_policy = "none"
insecure_kubelet_readonly_port_enabled = "FALSE"
}
}
}
mmv1/third_party/terraform/website/docs/r/container_cluster.html.markdown
Outdated
Show resolved
Hide resolved
Partial fix for hashicorp/terraform-provider-google#19225 This should resolve some confusing behavior with `cpu_manager_policy` * It frequently will show a permadrift when it can't be set * It also doesn't seem to accept the documented value of "none" as an empty value, though the previously undocumented empty string (`""`) seems to work. This doesn't resolve all of the issues, but resolves other issues where it must be set where it's not actually needed (for example, if `insecure_kubelet_readonly_port_enabled` is set). It appears that it was marked as `Required` somewhat arbitrarily, and it's also possible that some of what's in place is tied to an API level bug that may have since been resolved.
…ml.markdown Co-authored-by: Stephen Lewis (Burrows) <[email protected]>
38cfe3c
to
d7bd3d9
Compare
yeah force pushes that leave the history intact are fine. |
Hi there, I'm the Modular magician. I've detected the following information about your changes: Diff reportYour PR generated some diffs in downstreams - here they are.
Missing test reportYour PR includes resource fields which are not covered by any test. Resource: resource "google_container_cluster" "primary" {
node_config {
kubelet_config {
cpu_manager_policy = # value needed
}
}
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about this the more I agree with you that just making the field optional is safest for now. We may want to make other changes in the future but this fixes the immediate issue.
It would be best practice to make sure the google_container_cluster.node_config.kubelet_config.cpu_manager_policy
field is used in at least one test, though.
Tests analyticsTotal tests: 208 Click here to see the affected service packages
Action takenFound 2 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
|
|
Sure - I can add it. Do you want it in an existing test or a standalone new test case? only caveat is that, since we already know that update is broken when it’s used that way, it probably can’t be a test where the resource is updated. |
It is still present here, right? magic-modules/mmv1/third_party/terraform/services/container/resource_container_cluster_test.go.erb Line 6692 in 4ca0795
|
An existing test would be ideal (but it doesn't have to be one of the ones already modified in this PR).
That's |
@melinath I didn't see any real obvious choice, and the |
sure, seems fine |
Hi there, I'm the Modular magician. I've detected the following information about your changes: Diff reportYour PR generated some diffs in downstreams - here they are.
|
Tests analyticsTotal tests: 209 Click here to see the affected service packages
Action takenFound 1 affected test(s) by replaying old test recordings. Starting RECORDING based on the most recent commit. Click here to see the affected tests
|
|
…ogleCloudPlatform#11572) Co-authored-by: Stephen Lewis (Burrows) <[email protected]>
…ogleCloudPlatform#11572) Co-authored-by: Stephen Lewis (Burrows) <[email protected]>
…ogleCloudPlatform#11572) Co-authored-by: Stephen Lewis (Burrows) <[email protected]>
…ogleCloudPlatform#11572) Co-authored-by: Stephen Lewis (Burrows) <[email protected]>
Part of hashicorp/terraform-provider-google#19225
This should resolve some confusing behavior with
cpu_manager_policy
""
) seems to work.efb71a9#r458238583
efb71a9#r473173480
☝️ context on when it was originally marked
Required
This doesn't resolve all of the issues, but resolves other issues where it must be set where it's not actually needed (for example, if
insecure_kubelet_readonly_port_enabled
is set).It appears that it was marked as
Required
somewhat arbitrarily (see above), and it's also possible that some of what's in place is tied to an API level bug that may have since been resolved. Maybe we could require at last one instead -- happy to do that if there's a good example to follow.I did come across hashicorp/terraform-provider-google#15767 while testing this, but I think this is neutral as far as that goes.
Release Note Template for Downstream PRs (will be copied)