GKE blue_green_settings batch_node_count & batch_percentage is not idempotent #20283

Open
tumd opened this issue Nov 11, 2024 · 4 comments

@tumd

tumd commented Nov 11, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to a user, that user is claiming responsibility for the issue.
  • Customers working with a Google Technical Account Manager or Customer Engineer can ask them to reach out internally to expedite investigation and resolution of this issue.

Terraform Version & Provider Version(s)

Terraform v1.9.8
on darwin_arm64

  • provider registry.terraform.io/hashicorp/google v6.10.0

Affected Resource(s)

google_container_node_pool

Terraform Configuration

variable "project_id" {}
variable "location" { default = "us-central1" }
variable "cluster_name" { default = "my-gke-cluster" }

resource "google_service_account" "default" {
  project      = var.project_id
  account_id   = "service-account-id"
  display_name = "Service Account"
}

resource "google_container_cluster" "primary" {
  project  = var.project_id
  name     = var.cluster_name
  location = var.location

  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary_preemptible_nodes" {
  project    = var.project_id
  name       = "my-node-pool"
  cluster    = google_container_cluster.primary.id
  node_count = 2

  node_config {
    preemptible  = true
    machine_type = "e2-medium"

    # Google recommends custom service accounts that have cloud-platform scope and permissions granted via IAM Roles.
    service_account = google_service_account.default.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
  upgrade_settings {
    blue_green_settings {
      node_pool_soak_duration = "100s"

      standard_rollout_policy {
        batch_node_count    = 0
        batch_soak_duration = "0s"
      }
    }

    strategy = "BLUE_GREEN"
  }
}

data "google_container_cluster" "primary" {
  project  = var.project_id
  name     = var.cluster_name
  location = var.location
}

output "node_pools" {
  value = data.google_container_cluster.primary.node_pool[0].upgrade_settings
}

Debug Output

No response

Expected Behavior

batch_node_count should be set to 0.

Actual Behavior

batch_node_count is set to 1.

Steps to reproduce

  1. terraform apply
  2. terraform apply

Important Factoids

No response

References

No response

b/379359342

@tumd tumd added the bug label Nov 11, 2024
@github-actions github-actions bot added the forward/review and service/container labels Nov 11, 2024
@tumd
Author

tumd commented Nov 12, 2024

Perhaps far-fetched, but I wanted to mention it here too, in case a Googler picks it up:

Setting batch_node_count to 0 seems to be a bit buggy overall, which could be what causes this bug. See this bug-report.

@ggtisc ggtisc self-assigned this Nov 15, 2024
@ggtisc
Collaborator

ggtisc commented Nov 16, 2024

Confirmed issue!

Both terraform apply and terraform plan allow the value 0 for the batch_node_count argument but in the end assign it a value of 1.

Neither the Terraform Registry nor the API documentation mentions whether a value of 0 is allowed. The user also shared a link to a related bug report with additional information.

As for the batch_percentage argument, it accepts a value of 0 and records it correctly in the tfstate file.

@ggtisc ggtisc removed their assignment Nov 16, 2024
@ggtisc ggtisc removed the forward/review label Nov 16, 2024
@wangzhen127

Both terraform apply and terraform plan allow the value 0 for the batch_node_count argument but in the end assign it a value of 1.

Do you mean that Terraform wouldn't let you change the batch_node_count field value to 0? But #20283 (comment) shows that the value is 0.

I tried with gcloud, and the blue pool draining phase is correctly skipped.

@VeraQin

VeraQin commented Feb 11, 2025

Zero values for scalar fields (0, 0.0, false, "", the zero value of an enum) are treated the same as unset in Terraform, and the default value for batch_node_count is 1 when it is not provided. An explicit 0 is therefore never sent, and the API falls back to 1.

I will fix this by adding the ForceSendFields property (similar to sendEmptyValue) to the batch_node_count field in file. With the flag, the provider sends “empty” values (such as zero, false, or empty strings) to the API if they are set explicitly in the user’s configuration.
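
To illustrate the mechanism, here is a minimal Go sketch. It is not the provider's actual code: the rolloutPolicy struct and the marshalPolicy helper are hypothetical stand-ins, and only the ForceSendFields name is taken from the comment above. It assumes the request body is built from a struct whose fields are tagged omitempty, which drops a zero batch_node_count unless the field is explicitly listed as forced.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rolloutPolicy is a hypothetical stand-in for the API request struct.
// It is NOT the provider's real type.
type rolloutPolicy struct {
	BatchNodeCount    int64  `json:"batchNodeCount,omitempty"`
	BatchSoakDuration string `json:"batchSoakDuration,omitempty"`

	// ForceSendFields lists Go field names that must be sent even when
	// they hold a zero value. It is never serialized itself.
	ForceSendFields []string `json:"-"`
}

// marshalPolicy is a hypothetical helper. It serializes the policy with
// the usual omitempty rules, then re-adds any field named in
// ForceSendFields so that explicit zero values reach the API.
func marshalPolicy(p rolloutPolicy) (string, error) {
	raw, err := json.Marshal(p) // zero-valued fields are dropped here
	if err != nil {
		return "", err
	}
	body := map[string]interface{}{}
	if err := json.Unmarshal(raw, &body); err != nil {
		return "", err
	}
	for _, name := range p.ForceSendFields {
		switch name {
		case "BatchNodeCount":
			body["batchNodeCount"] = p.BatchNodeCount
		case "BatchSoakDuration":
			body["batchSoakDuration"] = p.BatchSoakDuration
		}
	}
	out, err := json.Marshal(body) // map keys are emitted in sorted order
	return string(out), err
}

func main() {
	// Without forcing, the explicit 0 is indistinguishable from "unset",
	// so the server would apply its default.
	dropped, err := marshalPolicy(rolloutPolicy{BatchNodeCount: 0, BatchSoakDuration: "0s"})
	if err != nil {
		panic(err)
	}
	fmt.Println(dropped) // {"batchSoakDuration":"0s"}

	// With the field forced, the 0 is sent explicitly.
	forced, err := marshalPolicy(rolloutPolicy{
		BatchNodeCount:    0,
		BatchSoakDuration: "0s",
		ForceSendFields:   []string{"BatchNodeCount"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(forced) // {"batchNodeCount":0,"batchSoakDuration":"0s"}
}
```

In the first call the body carries no batchNodeCount key at all, which matches the behavior reported in this issue: the configuration says 0, but the value read back is the default of 1.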
