QoS: T7415: Fix tcp flags matching #4490

Merged
merged 2 commits into from
May 27, 2025

Conversation

l0crian1
Contributor

@l0crian1 l0crian1 commented May 1, 2025

Change summary

This fixes the incorrect behavior reported in T7415. The qos.py conf-mode script removes empty dictionaries from the configuration. Because the ack and syn entries were valueless leaf nodes rather than values, they were treated as empty and removed, so the matches were never applied.

This change makes syn and ack values of the tcp leaf node.
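
To illustrate the difference, here is a minimal sketch of a recursive "drop empty dictionaries" pass applied to both config shapes. The function name `prune_empty` and the two example dictionaries are assumptions made for illustration; they are not copied from qos.py, and the exact shape of the real config dict may differ.

```python
def prune_empty(conf):
    """Recursively drop keys whose value is, or collapses to, an empty dict."""
    if isinstance(conf, dict):
        cleaned = {}
        for node, val in conf.items():
            sub = prune_empty(val)
            if sub != {}:
                cleaned[node] = sub
        return cleaned
    # Non-dict values (for example strings) are kept unchanged.
    return conf


# Assumed shape before this change: 'ack' is a valueless node (an empty dict).
old_style = {'match': {'ACK': {'ip': {'tcp': {'ack': {}}}}}}

# Assumed shape after this change: 'ack' is the value of the 'tcp' leaf node.
new_style = {'match': {'ACK': {'ip': {'tcp': 'ack'}}}}

print(prune_empty(old_style))  # {}
print(prune_empty(new_style))  # {'match': {'ACK': {'ip': {'tcp': 'ack'}}}}
```

In the old shape the whole match collapses, because every parent becomes empty once `ack` is dropped. In the new shape the string value survives the pass.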

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

https://vyos.dev/T7415

Related PR(s)

How to test / Smoketest result

Configure QoS policy:

set qos interface eth1 egress 'EGRESS-QOS'
set qos policy shaper EGRESS-QOS bandwidth '20gbit'
set qos policy shaper EGRESS-QOS class 2 bandwidth '20%'
set qos policy shaper EGRESS-QOS class 2 ceiling '100%'
set qos policy shaper EGRESS-QOS class 2 description 'TEST'
set qos policy shaper EGRESS-QOS class 2 match-group 'qACK'
set qos policy shaper EGRESS-QOS class 2 priority '0'
set qos policy shaper EGRESS-QOS class 2 queue-type 'fair-queue'
set qos policy shaper EGRESS-QOS default bandwidth '43%'
set qos policy shaper EGRESS-QOS default ceiling '100%'
set qos policy shaper EGRESS-QOS default priority '7'
set qos policy shaper EGRESS-QOS default queue-type 'random-detect'
set qos policy shaper EGRESS-QOS description 'test1'
set qos traffic-match-group qACK match ACK ip tcp 'ack'
set qos traffic-match-group qACK match SYNACK ip tcp 'syn'

Verify counters with show qos shaper:

vyos@cli-dev# run show qos shaper
--------------------------------------------------------------------------------
Interface: eth1
Policy Name: EGRESS-QOS

Class    Type      Bandwidth    Max. BW     Bytes    Pkts    Drops    Queued
-------  ------  -----------  ---------  --------  ------  -------  --------
root     htb       20.000 Gb  20.000 Gb  7.560 KB      50        0         0
2        sfq        4.000 Gb   1.000 Gb  7.266 KB      47        0         0
default  red        8.600 Gb  20.000 Gb    294  B       3        0         0

Verify tc filter:

vyos@cli-dev# sudo tc filter show dev eth1
filter parent 1: protocol all pref 49151 u32 chain 0
filter parent 1: protocol all pref 49151 u32 chain 0 fh 801: ht divisor 1
filter parent 1: protocol all pref 49151 u32 chain 0 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:2 not_in_hw
  match 00020000/00020000 at 32
        action order 1:  police 0x1 rate 200Mbit burst 15325b mtu 2Kb action reclassify overhead 0b
        ref 1 bind 1

filter parent 1: protocol all pref 49152 u32 chain 0
filter parent 1: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent 1: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:2 not_in_hw
  match 00100000/00100000 at 32
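
The two `match VALUE/MASK at 32` lines can be read as TCP flag matches. The sketch below decodes them, assuming a 20-byte IPv4 header with no options, so that bytes 32 to 35 of the packet are bytes 12 to 15 of the TCP header (data offset, flags, window). The helper name `decode_u32_tcp_flags` is made up for this example; the flag bit values are the standard TCP header bits.

```python
# Standard TCP flag bits in the flags byte (TCP header byte 13).
TCP_FLAGS = {0x01: 'FIN', 0x02: 'SYN', 0x04: 'RST', 0x08: 'PSH', 0x10: 'ACK', 0x20: 'URG'}


def decode_u32_tcp_flags(match: str) -> list[str]:
    """Decode a 'VALUE/MASK' u32 match at offset 32 into TCP flag names.

    Assumes no IPv4 options, so the 32-bit word at offset 32 is
    [data offset][flags][window high][window low].
    """
    value_hex, mask_hex = match.split('/')
    value = int(value_hex, 16)
    mask = int(mask_hex, 16)
    # The flags byte is the second byte of the big-endian word.
    flags_value = (value >> 16) & 0xFF
    flags_mask = (mask >> 16) & 0xFF
    return [name for bit, name in TCP_FLAGS.items()
            if flags_mask & bit and flags_value & bit]


print(decode_u32_tcp_flags('00020000/00020000'))  # ['SYN']
print(decode_u32_tcp_flags('00100000/00100000'))  # ['ACK']
```

Under that assumption, the filter at pref 49151 matches packets with SYN set and the filter at pref 49152 matches packets with ACK set, which lines up with the two `traffic-match-group` entries configured above.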

Smoketest results:

test_01_cake (__main__.TestQoS.test_01_cake) ... ok
test_02_drop_tail (__main__.TestQoS.test_02_drop_tail) ... ok
test_03_fair_queue (__main__.TestQoS.test_03_fair_queue) ... ok
test_04_fq_codel (__main__.TestQoS.test_04_fq_codel) ... ok
test_05_limiter (__main__.TestQoS.test_05_limiter) ... ok
test_06_network_emulator (__main__.TestQoS.test_06_network_emulator) ... ok
test_07_priority_queue (__main__.TestQoS.test_07_priority_queue) ... ok
test_08_random_detect (__main__.TestQoS.test_08_random_detect) ... ok
test_09_rate_control (__main__.TestQoS.test_09_rate_control) ... ok
test_10_round_robin (__main__.TestQoS.test_10_round_robin) ... ok
test_11_shaper (__main__.TestQoS.test_11_shaper) ... ok
test_12_shaper_with_red_queue (__main__.TestQoS.test_12_shaper_with_red_queue) ... ok
test_13_shaper_delete_only_rule (__main__.TestQoS.test_13_shaper_delete_only_rule) ... ok
test_14_policy_limiter_marked_traffic (__main__.TestQoS.test_14_policy_limiter_marked_traffic) ... ok
test_15_traffic_match_group (__main__.TestQoS.test_15_traffic_match_group) ... ok
test_16_wrong_traffic_match_group (__main__.TestQoS.test_16_wrong_traffic_match_group) ... ok
test_17_cake_updates (__main__.TestQoS.test_17_cake_updates) ... ok
test_18_priority_queue_default (__main__.TestQoS.test_18_priority_queue_default) ... ok
test_19_priority_queue_default_random_detect (__main__.TestQoS.test_19_priority_queue_default_random_detect) ... ok
test_20_round_robin_policy_default (__main__.TestQoS.test_20_round_robin_policy_default) ... ok
test_21_shaper_hfsc (__main__.TestQoS.test_21_shaper_hfsc) ... ok
test_22_rate_control_default (__main__.TestQoS.test_22_rate_control_default) ... ok
test_23_policy_limiter_iif_filter (__main__.TestQoS.test_23_policy_limiter_iif_filter) ... ok
test_24_policy_shaper_match_ether (__main__.TestQoS.test_24_policy_shaper_match_ether) ... ok

----------------------------------------------------------------------
Ran 24 tests in 493.018s

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

Empty leaf nodes are cleaned, causing the tcp
ack and syn flags to not match. These flags were moved
to values of the tcp leafNode

github-actions bot commented May 1, 2025

👍
No issues in PR Title / Commit Title

Member

@dmbaturin dmbaturin left a comment

I think we have two unrelated problems here that need two separate PRs to address.

The behavior of the QoS script is a bug that can be fixed by not deleting "empty" nodes that are actually significant. If the bug exists in 1.4, we will have to produce such a fix, maybe by adding some hardcoded special cases, because we cannot change the CLI in an LTS release.

Then there's the question of the CLI change. I agree with the general idea but there are caveats.

First, your change will break existing configs because currently that node is ip { tcp { ack } } and that will not load when ack is supposed to be a value of a leaf node. So any such change will need a migration script.

Second, your change makes it impossible to match SYN ACK packets, only either SYN or ACK. That's possible in the original CLI so we have to account for that. The simplest way to do that is to mark the node as <multi> but I wonder if there are use cases for matching SYN, ACK packets separately from either SYN or ACK ones.
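
The distinction between "SYN and ACK together" and "either SYN or ACK" can be sketched in terms of flag masks. This is an illustration only: `FLAG_BITS`, `u32_flags_match`, and `packet_matches` are hypothetical names, not part of vyos-1x, and the code does not show how a `<multi>` node would actually be rendered into tc filters.

```python
# Standard TCP flag bits; the table and helper names are hypothetical.
FLAG_BITS = {'fin': 0x01, 'syn': 0x02, 'rst': 0x04, 'psh': 0x08, 'ack': 0x10, 'urg': 0x20}


def u32_flags_match(flags):
    """Build one 'VALUE/MASK' string that requires all given flags to be set."""
    bits = 0
    for flag in flags:
        bits |= FLAG_BITS[flag]
    # Place the flags byte where it sits in the 32-bit word at offset 32.
    word = bits << 16
    return f'{word:08x}/{word:08x}'


def packet_matches(packet_flags, match):
    """Check a packet's TCP flags byte against a 'VALUE/MASK' string."""
    value_hex, mask_hex = match.split('/')
    value = (int(value_hex, 16) >> 16) & 0xFF
    mask = (int(mask_hex, 16) >> 16) & 0xFF
    return (packet_flags & mask) == value


both = u32_flags_match(['syn', 'ack'])
print(both)                        # 00120000/00120000
print(packet_matches(0x12, both))  # True: SYN+ACK packet
print(packet_matches(0x02, both))  # False: SYN-only packet
```

A single combined mask matches only packets with both flags set, while two separate single-flag filters match packets with either flag. Note that a single-flag mask such as `00020000/00020000` also matches SYN+ACK packets, since it only checks that one bit.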

@l0crian1
Contributor Author

l0crian1 commented May 2, 2025

@dmbaturin

If the bug exists in 1.4, we will have to produce such a fix, maybe by adding some hardcoded special cases, because we cannot change the CLI in an LTS release.

The bug is present in 1.4.2

In this context, is "CLI" the ordering of nodes, tag nodes, and leaf nodes? Or just syntax? If it's the latter, the syntax would be identical to the user. But obviously, the underlying structure that is invisible to the user would be different.

First, your change will break existing configs because currently that node is ip { tcp { ack } } and that will not load when ack is supposed to be a value of a leaf node. So any such change will need a migration script.

I didn't realize that this would break existing configs, but I tested and it does. In my head, the config is stored as set qos traffic-match-group qACK match ACK ip tcp ack regardless, but I guess it'd be ip tcp ack vs ip tcp 'ack'

If your CLI concern is not syntax, but the structure of the node types, along with the issue of breaking existing configs, I agree with modifying the clean function instead. I was trying to avoid that initially since the simpler and cleaner fix seemed to be avoiding the cleaning.

Here are my thoughts on the change. Is this along the lines of what you were thinking?

From:

    if isinstance(conf, dict):      
        return {node: _clean_conf_dict(val) for node, val in conf.items() if val != {} and _clean_conf_dict(val) != {}}

To:

    if isinstance(conf, dict):
        preserve_empty_nodes = {'syn', 'ack'}

        return {
            node: _clean_conf_dict(val)
            for node, val in conf.items()
            if (val != {} and _clean_conf_dict(val) != {}) or node in preserve_empty_nodes
        }       
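
A self-contained sketch of the proposed "To" version, for trying the logic outside of qos.py. The non-dict branch (`return conf`) and the sample dictionary are assumptions, since the snippets above show only the dict branch of the function.

```python
PRESERVE_EMPTY_NODES = {'syn', 'ack'}


def clean_conf_dict(conf):
    """Drop empty dicts recursively, except nodes listed in PRESERVE_EMPTY_NODES."""
    if isinstance(conf, dict):
        return {
            node: clean_conf_dict(val)
            for node, val in conf.items()
            if (val != {} and clean_conf_dict(val) != {}) or node in PRESERVE_EMPTY_NODES
        }
    # Assumption: anything that is not a dict is returned unchanged.
    return conf


# Hypothetical input: one significant valueless node and one truly empty branch.
conf = {
    'match': {
        'ACK': {'ip': {'tcp': {'ack': {}}}},
        'EMPTY': {'ip': {}},
    }
}

print(clean_conf_dict(conf))
# {'match': {'ACK': {'ip': {'tcp': {'ack': {}}}}}}
```

The `ack` node is kept even though its value is empty, so its parents no longer collapse, while the unrelated empty branch is still removed.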

Member

@sarthurdev sarthurdev left a comment

Have you tested if removing this _clean_conf_dict function works and also resolves this issue?

It was introduced as part of T4248, which seems to be an issue prior to QoS rewrite by @c-po. A quick glance at the new vyos.qos module seems to do sufficient checks on match criteria.

Tested without it: the issue from T4248 persists.

I agree that, for easy backporting, the exclusions in your comment are likely the best way forward.

Empty leaf nodes are cleaned, causing the tcp
ack and syn flags to not match. These values are exempted from being cleaned.

CI integration ❌ failed!

Details

CI logs

  • CLI Smoketests (no interfaces) ❌ failed
  • CLI Smoketests (interfaces only) ❌ failed
  • Config tests ❌ failed
  • RAID1 tests ❌ failed
  • TPM tests ❌ failed

Member

@sarthurdev sarthurdev left a comment

Resolves issue with removed TCP nodes, without requiring migration. Suitable for circinus/sagitta.

@sarthurdev added the bp/sagitta (Create automatic backport for sagitta LTS version) and bp/circinus (Create automatic backport for circinus) labels May 22, 2025
Member

@dmbaturin dmbaturin left a comment

If Simon's testing confirms it, I'm fine with it. The logic does seem correct.

@dmbaturin dmbaturin merged commit 3436daa into vyos:current May 27, 2025
13 of 16 checks passed
@github-actions github-actions bot added the mirror-initiated This PR initiated for mirror sync workflow label May 27, 2025
@vyosbot vyosbot added mirror-completed and removed mirror-initiated This PR initiated for mirror sync workflow labels May 27, 2025
Labels
bp/circinus (Create automatic backport for circinus), bp/sagitta (Create automatic backport for sagitta LTS version), current, mirror-completed, rebase
4 participants