Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error 409 when trying to update the resource google_compute_security_policy #14738

Open
marcusricardoaguiar opened this issue May 26, 2023 · 17 comments · May be fixed by GoogleCloudPlatform/magic-modules#12318

Comments

@marcusricardoaguiar
Copy link

marcusricardoaguiar commented May 26, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

Version: 1.5.0

Affected Resource(s)

  • google_compute_security_policy

Terraform Configuration Files

After making some changes on the resource google_compute_security_policy that triggers the resource recreation, it is trying to create the resource again first before deleting it. Then, since we already have the resource with the same name, it's crashing it.

resource "google_compute_security_policy" "backend_apis_cloud_armor" {
  name = "backend-api"
  provider = google-beta
  lifecycle {
    ignore_changes = [
      # This is being ignored because GCP is raising an issue when trying to set this value to PREMIUM tier
      # Layer 7 DDoS defense premium is currently not enabled.
      adaptive_protection_config[0].layer_7_ddos_defense_config[0].rule_visibility
    ]
  }

  rule {
	...
  }
  rule {
  	...
  }
  
  rule {
	...
  }

  rule {
	...
  }
}

Debug Output

Panic Output

p1

Expected Behavior

I would like that GCP/Terraform was able to handle this situation and, in this case, remove the resource first before creating the new one.

Actual Behavior

Since it tries to create before remove, it will crash when trying to create a resource with the same name as the existing one.

Steps to Reproduce

Use an already existing google_compute_security_policy resource and make some changes on their terraform code. This might try to recreate the resource. However, you should have that already existing stuff.

@edwardmedia
Copy link
Contributor

@marcusricardoaguiar changing on two fields, name and project, trigger replacement. It works fine with me after I renamed the resource.

What did you do to trigger the recreation?

After making some changes on the resource google_compute_security_policy that triggers the resource recreation,

@marcusricardoaguiar
Copy link
Author

@marcusricardoaguiar changing on two fields, name and project, trigger replacement. It works fine with me after I renamed the resource.

What did you do to trigger the recreation?

After making some changes on the resource google_compute_security_policy that triggers the resource recreation,

Hey @edwardmedia , for instance, the last time it happened, it was a rule that I replaced. I changed the rule priority and the expressions. Then, it triggered the security policy recreation.

@marcusricardoaguiar
Copy link
Author

@marcusricardoaguiar changing on two fields, name and project, trigger replacement. It works fine with me after I renamed the resource.

What did you do to trigger the recreation?

After making some changes on the resource google_compute_security_policy that triggers the resource recreation,

However, I haven't changed the project nor the name.

@edwardmedia
Copy link
Contributor

edwardmedia commented May 26, 2023

@marcusricardoaguiar can you detail the steps to repro, specific the way to trigger recreation? Besides project and name, I do not see other fields that will trigger it.

@marcusricardoaguiar
Copy link
Author

marcusricardoaguiar commented Jun 1, 2023

@marcusricardoaguiar can you detail the steps to repro, specific the way to trigger recreation? Besides project and name, I do not see other fields that will trigger it.

HI @edwardmedia , I've just found out what's going on. I've just created a google_compute_security_policy and also attached a backend service to this security policy as shown below:

resource "google_compute_security_policy" "cloud_armor" {
  name = "cloud-armor"
  provider = google-beta

  rule {
    action   = "deny(403)"
    description = "Deny violations"
    priority = 4
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('*****')"
      }
    }
  }

  rule {
    action   = "deny(403)"
    description = "Block all traffic"
    priority = 2
    match {
      expr {
        expression = join(" ", [
          "origin.region_code == 'BR'"
        ])
      }
    }
  }

  # Every security policy must have a rule with priority 2147483647 and match "*".
  rule {
    action   = "allow"
    priority = "2147483647"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default rule"
  }
}

# PgAdmin Internal Service
resource "google_compute_backend_service" "pgadmin_cloudrun" {
  name                            = "pgadmin-service"
  enable_cdn                      = false
  port_name                       = "https"
  timeout_sec                     = 90
  connection_draining_timeout_sec = 90
  protocol                        = "HTTPS"

  backend {
    group = module.pgadmin.pgadmin4_instance_group
  }

  health_checks = [
    google_compute_https_health_check.pgadmin4_health.id,
  ]

  log_config {
    enable = true
  }

  security_policy = google_compute_security_policy.cloud_armor.self_link
}

Then, I tried to update the security policy and I added a new rule with a high priority:

resource "google_compute_security_policy" "cloud_armor" {
  name = "cloud-armor"
  ...
  rule {
    description = "Login rate limit"
    action      = "throttle"
    priority    = 28
    match {
      expr {
        expression = join(" ", [
          "request.method == 'POST'",
          "&& !has(request.headers['*****'])"
        ])
      }
    }

    rate_limit_options {
      conform_action = "allow"
      exceed_action = "deny(429)"

      enforce_on_key = ""

      enforce_on_key_configs {
        enforce_on_key_type = "HTTP_HEADER"
        enforce_on_key_name = "*******"
      }

      rate_limit_threshold {
        count = 5
        interval_sec = 60
      }
    }
  }
  ...
}

In this case, since this security policy has been used by the backend service, it will try to create the new one before deleting it. Then, it crashes because of name already exist (409 error).

@edwardmedia
Copy link
Contributor

@marcusricardoaguiar Using below config, I tested by 1) initial apply and 2) uncomment the 3rd rule and apply, and did not hit the problem. Adding the 3rd rule did not trigger recreate google_compute_security_policy. Can you try my config? Also what version of the provider do you use?

resource "google_compute_security_policy" "policy" {
  name = "issue14738-1"
  //type = "CLOUD_ARMOR"
  type = "CLOUD_ARMOR_INTERNAL_SERVICE"

  rule {
    action   = "deny(403)"
    priority = "1000"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["9.9.9.0/24"]
      }
    }
    description = "Deny access to IPs in 9.9.9.0/24"
  }

  rule {
    action   = "allow"
    priority = "2147483647"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default rule"
  }
/*  uncomment for the 2nd apply
  rule {
    description = "Login rate limit"
    action      = "throttle"
    priority    = 28
    match {
      expr {
        expression = join(" ", [
          "request.method == 'POST'",
          "&& !has(request.headers['*****'])"
        ])
      }
    }

    rate_limit_options {
      conform_action = "allow"
      exceed_action = "deny(429)"

      enforce_on_key = ""

      rate_limit_threshold {
        count = 5
        interval_sec = 60
      }
    }
  }
  */
}


resource "google_compute_backend_service" "default" {
  name          = "backend-service"
  health_checks = [google_compute_http_health_check.default.id]

  security_policy = google_compute_security_policy.policy.self_link
}

resource "google_compute_http_health_check" "default" {
  name               = "health-check"
  request_path       = "/"
  check_interval_sec = 1
  timeout_sec        = 1
}

@marcusricardoaguiar
Copy link
Author

marcusricardoaguiar commented Jun 2, 2023

@marcusricardoaguiar Using below config, I tested by 1) initial apply and 2) uncomment the 3rd rule and apply, and did not hit the problem. Adding the 3rd rule did not trigger recreate google_compute_security_policy. Can you try my config? Also what version of the provider do you use?

resource "google_compute_security_policy" "policy" {
  name = "issue14738-1"
  //type = "CLOUD_ARMOR"
  type = "CLOUD_ARMOR_INTERNAL_SERVICE"

  rule {
    action   = "deny(403)"
    priority = "1000"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["9.9.9.0/24"]
      }
    }
    description = "Deny access to IPs in 9.9.9.0/24"
  }

  rule {
    action   = "allow"
    priority = "2147483647"
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default rule"
  }
/*  uncomment for the 2nd apply
  rule {
    description = "Login rate limit"
    action      = "throttle"
    priority    = 28
    match {
      expr {
        expression = join(" ", [
          "request.method == 'POST'",
          "&& !has(request.headers['*****'])"
        ])
      }
    }

    rate_limit_options {
      conform_action = "allow"
      exceed_action = "deny(429)"

      enforce_on_key = ""

      rate_limit_threshold {
        count = 5
        interval_sec = 60
      }
    }
  }
  */
}


resource "google_compute_backend_service" "default" {
  name          = "backend-service"
  health_checks = [google_compute_http_health_check.default.id]

  security_policy = google_compute_security_policy.policy.self_link
}

resource "google_compute_http_health_check" "default" {
  name               = "health-check"
  request_path       = "/"
  check_interval_sec = 1
  timeout_sec        = 1
}

Hi @edwardmedia , I've been using google-beta provider version 4.66.0. I've tried your suggestion and it worked as you mentioned. However, if we include the block enforce_on_key_configs it will trigger the policy recreation. Instead of applying your suggested new rule, I've just tried to add the rule below and it triggered the policy recreation:

  rule {
    description = "Login rate limit"
    action      = "throttle"
    priority    = 28
    match {
      expr {
        expression = join(" ", [
          "request.method == 'POST'",
          "&& !has(request.headers['******'])"
        ])
      }
    }

    rate_limit_options {
      conform_action = "allow"
      exceed_action = "deny(429)"

      enforce_on_key = ""

      enforce_on_key_configs {
        enforce_on_key_type = "HTTP_HEADER"
        enforce_on_key_name = "****"
      }

      rate_limit_threshold {
        count = 5
        interval_sec = 60
      }
    }
  }

@edwardmedia
Copy link
Contributor

@marcusricardoaguiar oh, yes, that is right. enforce_on_key_configs is only available in beta and it is a field that could trigger recreation.

@marcusricardoaguiar
Copy link
Author

Ah I see, thanks @edwardmedia , so, do you know if this will not trigger recreation once it's not beta anymore?
Thanks!

@edwardmedia
Copy link
Contributor

@marcusricardoaguiar any changes on that field will trigger recreation once it is GA. I do not have a solution off the top of my head to your situation.

@ryanyuan what do you think?

@edwardmedia edwardmedia assigned roaks3 and unassigned edwardmedia Jun 5, 2023
@ryanyuan
Copy link
Contributor

ryanyuan commented Jun 6, 2023

hey @edwardmedia, I reckon you wanted to tag @roaks3

@roaks3
Copy link
Collaborator

roaks3 commented Jun 6, 2023

I've checked through the code, and nothing stood out as an obvious cause of the 409. The force_replacement behavior will call the resource's delete function and then its create function, which both appear to properly retrieve operations and wait for them to complete.

My suspicion at this point is that the resource is deleted and allowed to proceed to the create, but the resource's name is not yet available (google_storage_bucket has similar behavior where a name cannot be reused immediately after a bucket is deleted, and it caused problems with ForceNew: GoogleCloudPlatform/magic-modules#7314).

@roaks3 roaks3 removed their assignment Jun 12, 2023
@santoshpandit1
Copy link

santoshpandit1 commented Jul 11, 2023

I could see the main potential issue here is:
google_compute_backend_service has the security_policy attribute which is string instead of an object type of google_compute_security_policy ;

We are creating two separate resources here and mapping it after the resources are created. This will have an issue because the proper dependency is missing.

@github-actions github-actions bot added forward/review In review; remove label to forward service/compute-security-policy labels Aug 17, 2023
@edwardmedia
Copy link
Contributor

b/297932215

@edwardmedia edwardmedia added forward/linked and removed forward/review In review; remove label to forward labels Aug 28, 2023
@jaksky
Copy link

jaksky commented Sep 14, 2023

@santoshpandit1 even adding dependency between LB BE to Security policy didn't help. Is there any resolution? I was updating a match part

@maxi-cit
Copy link

Hello there, I just checked this has been available in beta for some time now so I am bumping it to GA.

Additionally they removed the force_new flag for the enforce_on_key_configs block so now it updates-in-place instead of recreate.

@roaks3
Copy link
Collaborator

roaks3 commented Nov 13, 2024

Yea, looks like enforce_on_key_configs was fixed in GoogleCloudPlatform/magic-modules#8165. That leaves only name and project that can trigger replacements, for which we were not able to observe this issue.

@maxi-cit we should be able to close with your change (or even close it now), and then let users raise a new issue if there are still problems with the replacement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants