SDN-5330: Add openvswitch-ipsec package for ipsec plugin #1718

Merged


pperiyasamy
Member

@pperiyasamy pperiyasamy commented Jan 24, 2025

Currently, the network operator brings up the ovn-ipsec-host daemonset pod once the ipsec machine config plugin is installed on the node. The pod runs the ovs-monitor-ipsec script to create and update the mesh of IPsec connections across the nodes. As a result, IPsec connections for existing nodes are established only some time after kubelet has started, and workloads scheduled on the node in the meantime hit traffic drops because the IPsec connections between nodes are not yet available. This makes the IPsec jobs in CI unstable, and the monitor jobs always fail during IPsec upgrade.

The FDP story (https://issues.redhat.com/browse/FDP-1051) provides an openvswitch-ipsec systemd service (which runs ovs-monitor-ipsec) with the configurable parameters required by the network operator. It is available with OVS 3.5, so OCP can use this service, running on the host, to configure IPsec for east-west traffic.

This PR bumps the OVS version to 3.5 and adds the openvswitch-ipsec package to the ipsec extension. This lets ovs-monitor-ipsec run as a systemd service on the node, and the ovn-ipsec-host pod would now only be used to configure the service. This provides more flexibility in managing the IPsec connections created by OVN and OVS, and helps bring up existing IPsec connections in time, before the kubelet service comes up, when a node reboots.

@openshift-ci openshift-ci bot requested review from aaradhak and jmarrero January 24, 2025 11:14
@pperiyasamy
Member Author

/assign @trozet @zshi-redhat

@dustymabe
Member

/retest

@travier
Member

travier commented Jan 29, 2025

Can you clarify what this package brings / why it is needed?

@jlebon
Member

jlebon commented Jan 29, 2025

To be clear, this package is inert unless the systemd service is enabled, right?

Any upgrading cluster with the ipsec extension enabled will now get this package. So we need to make sure that it doesn't break existing setups.

@pperiyasamy pperiyasamy force-pushed the add-openvswitch-ipsec branch from 83249fb to 5212a25 Compare February 24, 2025 09:43
@pperiyasamy
Member Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2025
@pperiyasamy pperiyasamy force-pushed the add-openvswitch-ipsec branch from 5212a25 to 2a27e82 Compare February 24, 2025 10:04
@pperiyasamy
Member Author

Can you clarify what this package brings / why it is needed?

@travier I've updated the PR description with more details. Hope that helps.

@pperiyasamy
Member Author

pperiyasamy commented Feb 24, 2025

To be clear, this package is inert unless the systemd service is enabled, right?

Any upgrading cluster with the ipsec extension enabled will now get this package. So we need to make sure that it doesn't break existing setups.

Yes @jlebon, this service would be enabled only when the ipsec extension is deployed. I will be testing this with a few more dependent PRs in cluster-network-operator and ovn-kubernetes, and will come back to this once we have solid results for IPsec install/upgrade.

@igsilya
Contributor

igsilya commented Mar 11, 2025

CentOS builds should be available now.
/retest

@pperiyasamy pperiyasamy changed the title Add openvswitch-ipsec package for ipsec plugin SDN-5330: Add openvswitch-ipsec package for ipsec plugin Mar 11, 2025
@openshift-ci-robot

openshift-ci-robot commented Mar 11, 2025

@pperiyasamy: This pull request references SDN-5330 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.


@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 11, 2025
@pperiyasamy
Member Author

/hold cancel

The openvswitch3.5-ipsec package only installs the openvswitch-ipsec systemd service and does not activate it by default, so we are safe here.
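
For anyone who wants to verify this on a node, here is a minimal Python sketch built around `systemctl is-enabled`. The unit name `openvswitch-ipsec.service` is assumed from the service name mentioned in this thread, and the helper names `unit_state` and `starts_at_boot` are hypothetical. It is not part of this PR.

```python
import subprocess

# `systemctl is-enabled` states that mean the unit is wired to start at boot
# through its [Install] section.
BOOT_STATES = {"enabled", "enabled-runtime"}


def unit_state(unit, run=subprocess.run):
    """Return the state printed by `systemctl is-enabled <unit>`.

    `is-enabled` exits non-zero for states such as "disabled", so the exit
    code is deliberately not checked. An empty output (for example, when the
    unit file is missing) is reported as "unknown".
    """
    result = run(
        ["systemctl", "is-enabled", unit],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip() or "unknown"


def starts_at_boot(state):
    """True if the reported state means the unit is enabled for boot."""
    return state in BOOT_STATES
```

On a node with the extension installed but not yet configured by the operator, `starts_at_boot(unit_state("openvswitch-ipsec.service"))` would be expected to return `False` if the service is indeed left disabled by the package.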

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 11, 2025
@pperiyasamy
Member Author

/assign @igsilya

@pperiyasamy
Member Author

pperiyasamy commented Mar 17, 2025

It looks like we must land this PR first, because the CI build with the testwith command never gets the openvswitch3.5-ipsec package installed when testing the CNO PR openshift/cluster-network-operator#2662:

2025-03-17T15:55:45.766574723Z Deployments:
2025-03-17T15:55:45.766574723Z * ostree-unverified-registry:registry.build03.ci.openshift.org/ci-op-00pwshpj/stable@sha256:03dab5ad4a79f27e857b40e283f352065aa10f78e44ee728cfe19e877aef3594
2025-03-17T15:55:45.766574723Z                    Digest: sha256:feefbcf7435f08e6f63b00ac60ed29b28ca3672adf3ecc55930173cc9f9c85d9
2025-03-17T15:55:45.766574723Z                   Version: 9.6.20250317-0 (2025-03-17T13:21:08Z)
2025-03-17T15:55:45.766574723Z           LayeredPackages: libreswan NetworkManager-libreswan
2025-03-17T15:55:45.766574723Z
2025-03-17T15:55:45.766574723Z   ostree-unverified-registry:registry.build03.ci.openshift.org/ci-op-00pwshpj/stable-initial@sha256:5bbe3dd09bbc56c1e7c2faf410ec0a2db5931efc3ca1494db2085b35d1eac4ea
2025-03-17T15:55:45.766574723Z                    Digest: sha256:5bbe3dd09bbc56c1e7c2faf410ec0a2db5931efc3ca1494db2085b35d1eac4ea
2025-03-17T15:55:45.766574723Z                   Version: 419.96.202503141748-0 (2025-03-14T17:53:03Z)
2025-03-17T15:55:45.766574723Z           LayeredPackages: libreswan NetworkManager-libreswan
2025-03-17T15:55:45.767175701Z I0317 15:55:45.767143    3210 coreos.go:53] CoreOS aleph version: mtime=2022-08-01 23:42:11 +0000 UTC
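
The missing package can be confirmed mechanically from output like the one above. This is a rough Python sketch that assumes the `LayeredPackages:` line format shown in this log. The helper names are hypothetical, and the sample text is trimmed from the log lines quoted here.

```python
import re


def layered_packages(status_text):
    """Return one list of layered package names per LayeredPackages line."""
    deployments = []
    for line in status_text.splitlines():
        match = re.search(r"LayeredPackages:\s*(.*)$", line)
        if match:
            deployments.append(match.group(1).split())
    return deployments


def has_package(status_text, package):
    """True only if every listed deployment layers the given package."""
    deployments = layered_packages(status_text)
    return bool(deployments) and all(package in pkgs for pkgs in deployments)


# Two LayeredPackages lines copied from the log in this comment.
SAMPLE = (
    "2025-03-17T15:55:45.766574723Z           LayeredPackages: libreswan NetworkManager-libreswan\n"
    "2025-03-17T15:55:45.766574723Z           LayeredPackages: libreswan NetworkManager-libreswan\n"
)
```

With this sample, `has_package(SAMPLE, "openvswitch3.5-ipsec")` is false for both deployments, which matches the observation that only libreswan and NetworkManager-libreswan were layered.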

Contributor

@igsilya igsilya left a comment


Looks good to me. The openvswitch-ipsec service is indeed disabled by default, so it shouldn't cause any issues, and the OVS bump to 3.5 will be needed for other things as well, like improved handling of mixed (ipv4+ipv6) flow tables.

@igsilya
Contributor

igsilya commented Mar 26, 2025

/retest

@igsilya
Contributor

igsilya commented Mar 26, 2025

The rhcos-96-build-test-* test failures do not make a lot of sense to me. I assume they are caused by a large refactor of extensions that went in, so this PR may need a rebase even if the check doesn't think so. We should also probably add the openvswitch-ipsec package to extensions-okd-c9s.yaml.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
Currently the network operator brings up the ovn-ipsec-host daemonset pod once
the ipsec machine config plugin is installed on the node. The pod runs the
ovs-monitor-ipsec script to create and update the mesh of IPsec connections
across the nodes. As a result, IPsec connections for existing nodes are
established only some time after kubelet has started, and workloads scheduled
on the node in the meantime hit traffic drops because the IPsec connections
between nodes are not yet available. This makes the IPsec jobs in CI unstable,
and the monitor jobs always fail during IPsec upgrade.

The FDP story (https://issues.redhat.com/browse/FDP-1051) provides an
openvswitch-ipsec systemd service (which runs ovs-monitor-ipsec) with the
configurable parameters required by the network operator. It is available with
OVS 3.5, so OCP can use this service, running on the host, to configure IPsec
for east-west traffic.

Hence this commit adds the openvswitch-ipsec package to the ipsec extension, so
that ovs-monitor-ipsec runs as a systemd service on the node and the
ovn-ipsec-host pod would now only be used to configure the service.
This provides more flexibility in managing the IPsec connections created by OVN
and OVS, and helps bring up existing IPsec connections in time, before the
kubelet service comes up, when a node reboots.

Signed-off-by: Periyasamy Palanisamy <[email protected]>
@pperiyasamy pperiyasamy force-pushed the add-openvswitch-ipsec branch from 2a27e82 to 61b76c1 Compare March 31, 2025 07:15
Contributor

openshift-ci bot commented Mar 31, 2025

@pperiyasamy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/rhcos-96-build-test-metal | 83249fb | link | true | /test rhcos-96-build-test-metal
ci/prow/rhcos-96-build-test-qemu | 83249fb | link | true | /test rhcos-96-build-test-qemu


@pperiyasamy
Member Author

/retest

@jlebon
Member

jlebon commented Apr 1, 2025

/retest

@jlebon
Member

jlebon commented Apr 1, 2025

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 1, 2025
Contributor

openshift-ci bot commented Apr 1, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: igsilya, jlebon, pperiyasamy


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 1, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit e1cf9d5 into openshift:master Apr 1, 2025
13 of 14 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.