|
| 1 | +# Handling Broken Infrastructure as Code |
| 2 | + |
| 3 | +When working with Infrastructure as Code (IaC) at scale, you may occasionally encounter broken or invalid configuration files that prevent Terragrunt from running operations successfully. These issues can block an entire CI/CD pipeline, preventing even valid infrastructure changes from being deployed. |
| 4 | + |
| 5 | +This guide presents several strategies for handling broken IaC while keeping your pipelines operational. |
| 6 | + |
| 7 | +## Understanding the Problem |
| 8 | + |
| 9 | +Common causes of broken IaC include: |
| 10 | + |
| 11 | +- Syntax errors in Terragrunt or Terraform files |
| 12 | +- References to non-existent modules or resources |
| 13 | +- Dependencies on resources that no longer exist |
| 14 | +- Partially committed changes with missing dependencies |
| 15 | +- Breaking changes in module versions |
| 16 | +- Temporary or experimental code |
| 17 | +- Resources or modules that are a work in progress |
| 18 | + |
| 19 | +Depending on the type of run a pipeline is executing, broken IaC can fail the pipeline and prevent other, legitimate changes from being deployed. This matters most when pipelines trigger a `terragrunt run-all`, because every unit in scope must then be either valid or properly excluded. |
| 20 | + |
| 21 | +## Resolution Strategies |
| 22 | + |
| 23 | +Here are several approaches to manage broken IaC, presented in order of preference: |
| 24 | + |
| 25 | +### 1. Fix the Invalid Code (Preferred Solution) |
| 26 | + |
| 27 | +The ideal solution is to fix the underlying issues: |
| 28 | + |
| 29 | +```bash |
| 30 | +# Create a branch for your fix |
| 31 | +git checkout -b fix-broken-infrastructure |
| 32 | + |
| 33 | +# Fix the invalid IaC files |
| 34 | +# ... |
| 35 | + |
| 36 | +# Commit and push your changes (assumes the remote is named origin) |
| 37 | +git add . |
| 38 | +git commit -m "Fix broken IaC in module-x" |
| 39 | +git push -u origin fix-broken-infrastructure |
| 40 | +``` |
| 41 | + |
| 42 | +Then create a merge/pull request to apply the fix to your main branch. |
| 43 | + |
| 44 | +### 2. Remove the Invalid IaC |
| 45 | + |
| 46 | +If you can't fix the issue immediately but the infrastructure is no longer needed, you can remove the problematic code: |
| 47 | + |
| 48 | +```bash |
| 49 | +# Remove the broken IaC directory |
| 50 | +git rm -rf path/to/broken/module |
| 51 | + |
| 52 | +# Commit and push your changes |
| 53 | +git add . |
| 54 | +git commit -m "Remove deprecated infrastructure module" |
| 55 | +git push |
| 56 | +``` |
| 57 | + |
| 58 | +### 3. Use a `.terragrunt-excludes` File |
| 59 | + |
| 60 | +If you wish to keep the broken code as is and simply have it ignored by pipelines and Terragrunt, you can use a `.terragrunt-excludes` file to skip problematic units: |
| 61 | + |
| 62 | +1. Create a `.terragrunt-excludes` file in the root of your repository: |
| 63 | + |
| 64 | +``` |
| 65 | +# .terragrunt-excludes |
| 66 | +# One directory per line (no globs) |
| 67 | +account/region/broken-module1 |
| 68 | +account/region/broken-module2 |
| 69 | +``` |
| 70 | + |
| 71 | +2. Commit this file to your repository, and Terragrunt will automatically exclude these directories when using `run-all`. Note that if you change the code in one of those units and Pipelines triggers a `run` in that directory itself, the exclusion is not applied. |
| 72 | + |
| 73 | +### 4. Configure Exclusions with Pipelines Environment Variables |
| 74 | + |
| 75 | +If you don't want to keep `.terragrunt-excludes` in the root of the repository, you can create the file in a different location and set the `TG_QUEUE_EXCLUDES_FILE` environment variable to that path. Set the variable with the Pipelines [`env` block](/2.0/reference/pipelines/configurations-as-code/api#env-block) in your `.gruntwork/pipelines.hcl` configuration, which defines environment variables that control Terragrunt's behavior: |
| 76 | + |
| 77 | +```hcl |
| 78 | +repository { |
| 79 | + env { |
| 80 | + # Set path to a file containing directories to exclude |
| 81 | + TG_QUEUE_EXCLUDES_FILE = ".terragrunt-excludes" |
| 82 | + |
| 83 | + # ... |
| 84 | + } |
| 85 | +} |
| 86 | +``` |
| 87 | + |
| 88 | +:::warning |
| 89 | + Pipelines often uses `TG_QUEUE_EXCLUDE_DIR` internally, so we recommend using `TG_QUEUE_EXCLUDES_FILE` instead to avoid conflicts. |
| 90 | + ::: |
| 91 | + |
| 92 | +### Handling Dependencies |
| 93 | + |
| 94 | +When excluding modules, be aware of dependencies: |
| 95 | + |
| 96 | +1. If module B depends on module A, and module A is excluded, you may need to exclude module B as well. |
| 97 | +2. Use `terragrunt graph-dependencies` to visualize your dependency tree. |
| 98 | + |
| 99 | +## Best Practices |
| 100 | + |
| 101 | +1. **Document exclusions**: Add comments to your `.terragrunt-excludes` file explaining why each directory is excluded. |
| 102 | +2. **Track in issue system**: Create tickets for excluded modules that need to be fixed, including any relevant dates/timelines for when they should be revisited. |
| 103 | +3. **Regular cleanup**: Periodically review and update your excluded directories. |
| 104 | +4. **Validate locally**: Run `terragrunt hcl-validate` or `terragrunt validate` locally before committing changes. |
| 105 | + |
| 106 | +## Troubleshooting |
| 107 | + |
| 108 | +If you're still experiencing issues after excluding directories: |
| 109 | + |
| 110 | +- Ensure paths in your exclusion file match exactly (they're case-sensitive and relative to the repository root) |
| 111 | +- Check if excluded modules are dependencies for other modules |
| 112 | +- Review pipeline logs to confirm exclusions are being applied |
| 113 | +- Verify you don't have conflicting environment variable settings |
| 114 | + |
| 115 | +By implementing these strategies, you can keep your infrastructure pipelines running smoothly while addressing underlying issues in your codebase. |
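
The first troubleshooting check in the guide (paths in the exclusion file must match exactly and are relative to the repository root) can be sketched as a small shell helper. This is only a minimal sketch and not part of Terragrunt or Pipelines: the function name `check_excludes` is hypothetical, and it assumes that blank lines and lines starting with `#` in the file are comments to be skipped, as the example file in the guide suggests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report entries in an excludes file that do not
# match an existing directory. Not part of Terragrunt or Pipelines.
check_excludes() {
  local file="${1:-.terragrunt-excludes}"   # excludes file to check
  local root="${2:-.}"                      # repository root
  local missing=0 line

  if [ ! -f "$file" ]; then
    echo "excludes file not found: $file" >&2
    return 2
  fi

  while IFS= read -r line || [ -n "$line" ]; do
    # Trim leading and trailing whitespace.
    line="${line#"${line%%[![:space:]]*}"}"
    line="${line%"${line##*[![:space:]]}"}"
    # Skip blank lines and comments (assumed to be ignored).
    case "$line" in
      ''|'#'*) continue ;;
    esac
    # Each remaining entry should be a directory under the root.
    if [ ! -d "$root/$line" ]; then
      echo "missing: $line"
      missing=1
    fi
  done < "$file"

  return "$missing"
}
```

Run from the repository root, `check_excludes .terragrunt-excludes .` prints one `missing:` line per entry that has no matching directory and returns a non-zero status if any are found, so it could be used as a pre-commit or CI check. On a case-insensitive filesystem it may not catch a path that differs only in case.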