Skip to content

fix(server): remove and add nat config back if applicable during resize server job#6675

Open
phot0n wants to merge 1 commit into
frappe:developfrom
phot0n:nat-fx-resize
Open

fix(server): remove and add nat config back if applicable during resize server job#6675
phot0n wants to merge 1 commit into
frappe:developfrom
phot0n:nat-fx-resize

Conversation

@phot0n

@phot0n phot0n commented Jun 10, 2026

Copy link
Copy Markdown
Member
  • added as a temp fix untill we find the permanent fix for this

…ze server job

* added as a temp fix untill we find the permanent fix for this
@phot0n phot0n requested a review from adityahase as a code owner June 10, 2026 19:05
@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Confidence Score: 3/5

Needs fixes before merging — a mid-resize failure will leave the server running without NAT iptables, making it unreachable.

The failure handler restarts the VM but never re-installs NAT rules, so any error between NAT removal and NAT re-addition produces a silently broken server. The skip guard also has a subtle logic hole that would attempt NAT operations on servers without a configured nat_server.

resize_server.py — both the on_press_job_failure method and the NAT skip-guard condition need corrections.

Important Files Changed

Filename Overview
press/press/doctype/press_job/jobs/resize_server.py Adds remove/add NAT iptables tasks to the resize flow; failure handler does not restore NAT config, and the skip guard has an edge-case logic flaw.
press/press/doctype/server/server.py Minimal fix: _install_nat_iptables and _remove_nat_iptables now return the Ansible play result instead of discarding it, enabling callers to inspect play status.

Sequence Diagram

sequenceDiagram
    participant Job as ResizeServerJob
    participant Server as server_doc
    participant Ansible as Ansible Playbook
    participant VM as VirtualMachine

    Job->>Server: remove_nat_config_if_applicable()
    Server->>Ansible: run remove_nat_iptables.yml
    Ansible-->>Server: play result
    Server-->>Job: play (checked for Success)

    Job->>VM: stop_virtual_machine()
    Job->>VM: wait_for_virtual_machine_to_stop()
    Job->>VM: resize_virtual_machine()
    Job->>VM: start_virtual_machine()
    Job->>VM: wait_for_virtual_machine_to_start()
    Job->>Server: wait_for_server_to_be_accessible()

    Job->>Server: add_nat_config_if_applicable()
    Server->>Ansible: run nat_iptables.yml
    Ansible-->>Server: play result
    Server-->>Job: play (checked for Success)

    Job->>Server: set_additional_config()
    Job->>Server: increase_disk_size()

    note over Job,Server: on_press_job_failure only calls start_virtual_machine() — NAT not restored
Loading
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
press/press/doctype/press_job/jobs/resize_server.py:150-154
**NAT config not restored on job failure**

`on_press_job_failure` only calls `self.start_virtual_machine()` — it does not call `add_nat_config_if_applicable()`. If the job fails at any step after `remove_nat_config_if_applicable()` succeeds (e.g. during `resize_virtual_machine` or `wait_for_virtual_machine_to_start`), the VM is restarted without its NAT iptables rules, leaving it unreachable via NAT routing.

### Issue 2 of 3
press/press/doctype/press_job/jobs/resize_server.py:36-41
**Edge case in NAT skip guard**

The condition `not self.server_doc.nat_server and self.server_doc.ip` skips NAT operations only when a server has no NAT server **and** has a direct IP. If both fields are falsy (e.g. a pending or mis-configured server), the AND evaluates to `False`, so the code proceeds to run the Ansible NAT playbook against a server with no `nat_server` set — this will likely fail and block the resize. A simpler, safer check is to skip whenever `nat_server` is not set.

```suggestion
	@task
	def remove_nat_config_if_applicable(self):
		if self.server_type not in ("Server", "Database Server") or (
			not self.server_doc.nat_server
		):
			return
```

### Issue 3 of 3
press/press/doctype/press_job/jobs/resize_server.py:111-116
Same edge-case guard applies to `add_nat_config_if_applicable`.

```suggestion
	@task
	def add_nat_config_if_applicable(self):
		if self.server_type not in ("Server", "Database Server") or (
			not self.server_doc.nat_server
		):
			return
```

Reviews (1): Last reviewed commit: "fix(server): remove and add nat config b..." | Re-trigger Greptile

Comment on lines +36 to +41
@task
def remove_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server and self.server_doc.ip
):
return

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Edge case in NAT skip guard

The condition not self.server_doc.nat_server and self.server_doc.ip skips NAT operations only when a server has no NAT server and has a direct IP. If both fields are falsy (e.g. a pending or mis-configured server), the AND evaluates to False, so the code proceeds to run the Ansible NAT playbook against a server with no nat_server set — this will likely fail and block the resize. A simpler, safer check is to skip whenever nat_server is not set.

Suggested change
@task
def remove_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server and self.server_doc.ip
):
return
@task
def remove_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server
):
return
Prompt To Fix With AI
This is a comment left during a code review.
Path: press/press/doctype/press_job/jobs/resize_server.py
Line: 36-41

Comment:
**Edge case in NAT skip guard**

The condition `not self.server_doc.nat_server and self.server_doc.ip` skips NAT operations only when a server has no NAT server **and** has a direct IP. If both fields are falsy (e.g. a pending or mis-configured server), the AND evaluates to `False`, so the code proceeds to run the Ansible NAT playbook against a server with no `nat_server` set — this will likely fail and block the resize. A simpler, safer check is to skip whenever `nat_server` is not set.

```suggestion
	@task
	def remove_nat_config_if_applicable(self):
		if self.server_type not in ("Server", "Database Server") or (
			not self.server_doc.nat_server
		):
			return
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +111 to +116
@task
def add_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server and self.server_doc.ip
):
return

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Same edge-case guard applies to add_nat_config_if_applicable.

Suggested change
@task
def add_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server and self.server_doc.ip
):
return
@task
def add_nat_config_if_applicable(self):
if self.server_type not in ("Server", "Database Server") or (
not self.server_doc.nat_server
):
return
Prompt To Fix With AI
This is a comment left during a code review.
Path: press/press/doctype/press_job/jobs/resize_server.py
Line: 111-116

Comment:
Same edge-case guard applies to `add_nat_config_if_applicable`.

```suggestion
	@task
	def add_nat_config_if_applicable(self):
		if self.server_type not in ("Server", "Database Server") or (
			not self.server_doc.nat_server
		):
			return
```

How can I resolve this? If you propose a fix, please make it concise.

not self.server_doc.nat_server and self.server_doc.ip
):
return

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe i should only add this if it was removed in the previous block?

@codecov-commenter

codecov-commenter commented Jun 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 55.55556% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 50.48%. Comparing base (8d1e382) to head (4185aa7).
⚠️ Report is 20 commits behind head on develop.

Files with missing lines Patch % Lines
...ress/press/doctype/press_job/jobs/resize_server.py 56.25% 7 Missing ⚠️
press/press/doctype/server/server.py 50.00% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (55.55%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6675      +/-   ##
===========================================
- Coverage    56.46%   50.48%   -5.99%     
===========================================
  Files          992      992              
  Lines        83408    83513     +105     
  Branches       685      526     -159     
===========================================
- Hits         47097    42161    -4936     
- Misses       36275    41320    +5045     
+ Partials        36       32       -4     
Flag Coverage Δ
dashboard 62.84% <ø> (-28.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants