fix(server): remove and add nat config back if applicable during resize server job#6675
fix(server): remove and add nat config back if applicable during resize server job#6675phot0n wants to merge 1 commit into
Conversation
phot0n
commented
Jun 10, 2026
- added as a temp fix untill we find the permanent fix for this
…ze server job * added as a temp fix untill we find the permanent fix for this
Confidence Score: 3/5Needs fixes before merging — a mid-resize failure will leave the server running without NAT iptables, making it unreachable. The failure handler restarts the VM but never re-installs NAT rules, so any error between NAT removal and NAT re-addition produces a silently broken server. The skip guard also has a subtle logic hole that would attempt NAT operations on servers without a configured nat_server. resize_server.py — both the
|
| Filename | Overview |
|---|---|
| press/press/doctype/press_job/jobs/resize_server.py | Adds remove/add NAT iptables tasks to the resize flow; failure handler does not restore NAT config, and the skip guard has an edge-case logic flaw. |
| press/press/doctype/server/server.py | Minimal fix: _install_nat_iptables and _remove_nat_iptables now return the Ansible play result instead of discarding it, enabling callers to inspect play status. |
Sequence Diagram
sequenceDiagram
participant Job as ResizeServerJob
participant Server as server_doc
participant Ansible as Ansible Playbook
participant VM as VirtualMachine
Job->>Server: remove_nat_config_if_applicable()
Server->>Ansible: run remove_nat_iptables.yml
Ansible-->>Server: play result
Server-->>Job: play (checked for Success)
Job->>VM: stop_virtual_machine()
Job->>VM: wait_for_virtual_machine_to_stop()
Job->>VM: resize_virtual_machine()
Job->>VM: start_virtual_machine()
Job->>VM: wait_for_virtual_machine_to_start()
Job->>Server: wait_for_server_to_be_accessible()
Job->>Server: add_nat_config_if_applicable()
Server->>Ansible: run nat_iptables.yml
Ansible-->>Server: play result
Server-->>Job: play (checked for Success)
Job->>Server: set_additional_config()
Job->>Server: increase_disk_size()
note over Job,Server: on_press_job_failure only calls start_virtual_machine() — NAT not restored
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
press/press/doctype/press_job/jobs/resize_server.py:150-154
**NAT config not restored on job failure**
`on_press_job_failure` only calls `self.start_virtual_machine()` — it does not call `add_nat_config_if_applicable()`. If the job fails at any step after `remove_nat_config_if_applicable()` succeeds (e.g. during `resize_virtual_machine` or `wait_for_virtual_machine_to_start`), the VM is restarted without its NAT iptables rules, leaving it unreachable via NAT routing.
### Issue 2 of 3
press/press/doctype/press_job/jobs/resize_server.py:36-41
**Edge case in NAT skip guard**
The condition `not self.server_doc.nat_server and self.server_doc.ip` skips NAT operations only when a server has no NAT server **and** has a direct IP. If both fields are falsy (e.g. a pending or mis-configured server), the AND evaluates to `False`, so the code proceeds to run the Ansible NAT playbook against a server with no `nat_server` set — this will likely fail and block the resize. A simpler, safer check is to skip whenever `nat_server` is not set.
```suggestion
@task
def remove_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server
):
return
```
### Issue 3 of 3
press/press/doctype/press_job/jobs/resize_server.py:111-116
Same edge-case guard applies to `add_nat_config_if_applicable`.
```suggestion
@task
def add_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server
):
return
```
Reviews (1): Last reviewed commit: "fix(server): remove and add nat config b..." | Re-trigger Greptile
| @task | ||
| def remove_nat_config_if_applicable(self): | ||
| if self.server_type not in ("Server", "Database Server") or ( | ||
| not self.server_doc.nat_server and self.server_doc.ip | ||
| ): | ||
| return |
There was a problem hiding this comment.
The condition not self.server_doc.nat_server and self.server_doc.ip skips NAT operations only when a server has no NAT server and has a direct IP. If both fields are falsy (e.g. a pending or mis-configured server), the AND evaluates to False, so the code proceeds to run the Ansible NAT playbook against a server with no nat_server set — this will likely fail and block the resize. A simpler, safer check is to skip whenever nat_server is not set.
| @task | |
| def remove_nat_config_if_applicable(self): | |
| if self.server_type not in ("Server", "Database Server") or ( | |
| not self.server_doc.nat_server and self.server_doc.ip | |
| ): | |
| return | |
| @task | |
| def remove_nat_config_if_applicable(self): | |
| if self.server_type not in ("Server", "Database Server") or ( | |
| not self.server_doc.nat_server | |
| ): | |
| return |
Prompt To Fix With AI
This is a comment left during a code review.
Path: press/press/doctype/press_job/jobs/resize_server.py
Line: 36-41
Comment:
**Edge case in NAT skip guard**
The condition `not self.server_doc.nat_server and self.server_doc.ip` skips NAT operations only when a server has no NAT server **and** has a direct IP. If both fields are falsy (e.g. a pending or mis-configured server), the AND evaluates to `False`, so the code proceeds to run the Ansible NAT playbook against a server with no `nat_server` set — this will likely fail and block the resize. A simpler, safer check is to skip whenever `nat_server` is not set.
```suggestion
@task
def remove_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server
):
return
```
How can I resolve this? If you propose a fix, please make it concise.| @task | ||
| def add_nat_config_if_applicable(self): | ||
| if self.server_type not in ("Server", "Database Server") or ( | ||
| not self.server_doc.nat_server and self.server_doc.ip | ||
| ): | ||
| return |
There was a problem hiding this comment.
Same edge-case guard applies to
add_nat_config_if_applicable.
| @task | |
| def add_nat_config_if_applicable(self): | |
| if self.server_type not in ("Server", "Database Server") or ( | |
| not self.server_doc.nat_server and self.server_doc.ip | |
| ): | |
| return | |
| @task | |
| def add_nat_config_if_applicable(self): | |
| if self.server_type not in ("Server", "Database Server") or ( | |
| not self.server_doc.nat_server | |
| ): | |
| return |
Prompt To Fix With AI
This is a comment left during a code review.
Path: press/press/doctype/press_job/jobs/resize_server.py
Line: 111-116
Comment:
Same edge-case guard applies to `add_nat_config_if_applicable`.
```suggestion
@task
def add_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server
):
return
```
How can I resolve this? If you propose a fix, please make it concise.| not self.server_doc.nat_server and self.server_doc.ip | ||
| ): | ||
| return | ||
|
|
There was a problem hiding this comment.
maybe i should only add this if it was removed in the previous block?
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (55.55%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## develop #6675 +/- ##
===========================================
- Coverage 56.46% 50.48% -5.99%
===========================================
Files 992 992
Lines 83408 83513 +105
Branches 685 526 -159
===========================================
- Hits 47097 42161 -4936
- Misses 36275 41320 +5045
+ Partials 36 32 -4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|