Skip to content

network: backport VM QoS live/hot update to 1.7-dev#2

Open
zbb88888 wants to merge 1 commit into
qiniu:7niu-release-1.7from
kubehostvm:1.7-dev
Open

network: backport VM QoS live/hot update to 1.7-dev#2
zbb88888 wants to merge 1 commit into
qiniu:7niu-release-1.7from
kubehostvm:1.7-dev

Conversation

@zbb88888
Copy link
Copy Markdown
Collaborator

@zbb88888 zbb88888 commented Apr 7, 2026

Backport the VM network QoS changes from pr-17213 into the 1.7-dev branch.

This backport includes:

  • initial VM network QoS API and generated artifacts

  • NET_ADMIN capability for root virt-launcher so tc-based QoS commands can run

  • live update and hot-update handling for interface QoS changes

  • cleanup of temporary QoS hot-update debug logs

  • 1.7-dev-specific conflict resolution that ports virtwrap converter changes into the legacy converter layout

What this PR does

Before this PR:

After this PR:

References

Why we need it and why it was done in this way

The following tradeoffs were made:

The following alternatives were considered:

Links to places where the discussion took place:

Special notes for your reviewer

Checklist

This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.

Release note


Backport the VM network QoS changes from pr-17213 into the 1.7-dev branch.

This backport includes:

- initial VM network QoS API and generated artifacts

- NET_ADMIN capability for root virt-launcher so tc-based QoS commands can run

- live update and hot-update handling for interface QoS changes

- cleanup of temporary QoS hot-update debug logs

- 1.7-dev-specific conflict resolution that ports virtwrap converter changes into the legacy converter layout
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for network interface bandwidth (QoS) limits for Virtual Machine Instances. It adds new Bandwidth and BandwidthParams fields to the interface specification, allowing users to define inbound and outbound average, peak, and burst rates. The implementation includes API updates, conversion logic from Kubernetes quantities to Libvirt units, and the addition of the NET_ADMIN capability to the virt-launcher container to enable traffic control commands. Furthermore, the PR enables dynamic updates of bandwidth settings for existing interfaces. I have no feedback to provide as there were no review comments to evaluate.

Copy link
Copy Markdown

@fennoai fennoai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — VM Network QoS Backport

This PR introduces bandwidth QoS support via new Bandwidth/BandwidthParams API types, a NET_ADMIN capability grant, and live/hot update propagation to libvirt. Overall the structure is solid, but a few issues warrant attention.

if !util.IsNonRootVMI(vmi) {
// add a CAP_SYS_NICE capability to allow setting cpu affinity
capabilities = append(capabilities, CAP_SYS_NICE)
// add NET_ADMIN capability to allow network QoS / tc commands
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: NET_ADMIN granted to all root virt-launcher pods, not just those with QoS configured.

NET_ADMIN is one of the most powerful Linux capabilities — it allows arbitrary manipulation of network interfaces, iptables, routing tables, and traffic control. Adding it unconditionally to every root virt-launcher (regardless of whether any interface has a Bandwidth spec) violates least-privilege and expands the blast radius of any virt-launcher compromise.

Consider gating this on whether the VMI actually has at least one interface with a non-nil Bandwidth field, or applying it only to the specific code path that invokes tc commands.

}

type BandwidthParams struct {
// Average rate in bytes/sec
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API docs claim bytes/sec but conversion divides by 1024; also missing peak >= average enforcement.

  1. The converter's quantityToLibvirtKB divides every value by 1024 before passing it to libvirt. The test at converter_test.go inputs "1000Ki" and expects 1000, confirming the function really expects kibibytes as input — not raw bytes. However, the field comments here say "Average rate in bytes/sec". Any user who passes "1000" (1000 bytes) will get 0 after integer division (1000 / 1024 = 0), silently disabling the rate limit.

    The API documentation should match the actual unit contract (kibibytes/sec for average/peak, kibibytes for burst), or the converter should be updated to correctly handle byte-unit input.

  2. Libvirt requires peak >= average when both are set. There is no CEL validation marker or admission webhook check enforcing this ordering. A violating spec will pass API validation and fail opaquely at the libvirt layer rather than producing a clear Kubernetes admission error.

@@ -114,7 +126,6 @@ func (vim *virtIOInterfaceManager) updateDomainLinkState(currentDomain, desiredD
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Observability gap: log line removed without replacement, and the error message below still says "link state" when bandwidth changes can also flow through this path.

The log.Log.Infof("preparing to update link state...") call was removed here, and no replacement log was added for bandwidth updates. In a production cluster, there is now no trace log when updateIfaceInDomain is called, making it harder to diagnose QoS enforcement failures.

Also, the error message a few lines down ("libvirt failed to set link state to interface...") now fires for both link-state and bandwidth changes, which will mislead operators. Suggest replacing with something like: "libvirt failed to update interface %s: %v".

vmiSpecCopy.Domain.Devices.Interfaces = append(vmiSpecCopy.Domain.Devices.Interfaces, vmIface)

case shouldUpdateExistingIfaceState:
case shouldUpdateExistingIfaceState || shouldUpdateExistingIfaceBandwidth:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: state is unconditionally overwritten when only bandwidth changed.

When shouldUpdateExistingIfaceState == false and only shouldUpdateExistingIfaceBandwidth == true, vmiIface.State = vmIface.State still executes. This is currently safe because the states must already be equal (otherwise shouldUpdateExistingIfaceState would be true), but the code doesn't make that invariant explicit. A future change to the condition logic could silently introduce an unintended state reset.

Guarding the state assignment with if shouldUpdateExistingIfaceState would make intent unambiguous:

if shouldUpdateExistingIfaceState {
    vmiIface.State = vmIface.State
}
if shouldUpdateExistingIfaceBandwidth {
    vmiIface.Bandwidth = vmIface.Bandwidth.DeepCopy()
}

}

maxUint := uint64(^uint(0))
if uint64(val) > maxUint {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dead overflow guard on 64-bit platforms.

maxUint := uint64(^uint(0))
if uint64(val) > maxUint {
    return uint(maxUint)
}

val is int64; on a 64-bit platform ^uint(0) is uint64 max (18446744073709551615), which is always greater than any int64 value. The branch can never be taken. This provides false safety assurance — callers may assume extreme values are capped, but on 64-bit they pass through unchanged. Consider removing the dead check and instead enforcing an explicit upper-bound limit (e.g., reject anything above a sensible max like 10 Gbit/s) in the admission webhook.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant