SDN-5544: Unpin libreswan version#1771

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:master from pperiyasamy:bump-libreswan
Apr 3, 2025

Conversation

@pperiyasamy
Member

@pperiyasamy pperiyasamy commented Mar 14, 2025

This enables consuming libreswan 5 from the FDP repository.
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=66972299

@openshift-ci openshift-ci bot requested review from cgwalters and jschintag March 14, 2025 12:17
@pperiyasamy
Member Author

/assign @zshi-redhat @huiran0826

@pperiyasamy
Member Author

/retest

Member

@travier travier left a comment

/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 17, 2025
@pperiyasamy
Member Author

/hold

I'm still testing the changes.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 17, 2025
Comment thread extensions-ocp-rhel-9.6.yaml Outdated
# pin to 4.6-3.el9_0.3 for now for https://issues.redhat.com/browse/OCPBUGS-43498
# we can revert once that's fixed in latest libreswan
- libreswan-4.6-3.el9_0.3
- libreswan-5.2-1.el9fdp
Contributor

Instead of binding the version this way, we should just remove the pin here, i.e. remove the version. Once libreswan 5 is available in FDP, it will be installed automatically and we'll pick up all bug fixes as soon as they are available. Pinning a specific version is not a good long-term solution.

Member Author

@igsilya sure, will do. Please let us know once libreswan 5 is available in FDP; I will then remove the pin to a specific version. cc @The-Mule
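
To make the distinction in this thread concrete: a pinned entry carries a version and release after the package name (`libreswan-4.6-3.el9_0.3`), while an unpinned entry is the bare name (`libreswan`). Below is a minimal shell sketch of that check, assuming entries follow the usual `name-version-release` shape. The `is_pinned` helper is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of openshift/os): succeed if a package
# entry looks like "name-version-release", i.e. it pins a specific build.
is_pinned() {
    entry=$1
    # A pinned entry needs at least two hyphens: name-version-release.
    case "$entry" in
        *-*-*) ;;
        *) return 1 ;;
    esac
    rest=${entry%-*}       # drop the trailing "-release" field
    version=${rest##*-}    # last remaining field should be the version
    # Assumption: versions start with a digit, name fragments do not.
    case "$version" in
        [0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

for entry in libreswan-4.6-3.el9_0.3 libreswan-5.2-1.el9fdp libreswan; do
    if is_pinned "$entry"; then
        echo "$entry: pinned"
    else
        echo "$entry: unpinned"
    fi
done
```

This is only a heuristic: a package whose name contains a hyphenated fragment starting with a digit would be misclassified.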

@pperiyasamy
Member Author

/retest

@pperiyasamy
Member Author

/assign @trozet

This would enable to consume libreswan 5 version from FDP repository.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 19, 2025
@pperiyasamy pperiyasamy changed the title from "Bump libreswan version to 5.2-1" to "Unpin libreswan version" Mar 19, 2025
@pperiyasamy
Member Author

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 19, 2025
@jlebon
Member

jlebon commented Mar 20, 2025

CI OKD failure looks unrelated

/override ci/prow/okd-scos-e2e-aws-ovn

/lgtm

Do you have an associated Jira card with this? If so, can you add that in the title?
Also, are we planning to backport this to some of the older versions where we also pinned libreswan?

@openshift-ci
Contributor

openshift-ci bot commented Mar 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlebon, pperiyasamy, travier

@openshift-ci
Contributor

openshift-ci bot commented Mar 20, 2025

@jlebon: Overrode contexts on behalf of jlebon: ci/prow/okd-scos-e2e-aws-ovn

@igsilya
Contributor

igsilya commented Mar 20, 2025

/hold
We should hold this until libreswan 5.2 is actually released. The tests are not finished yet.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 20, 2025
@pperiyasamy pperiyasamy changed the title from "Unpin libreswan version" to "SDN-5544: Unpin libreswan version" Mar 20, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 20, 2025
@openshift-ci-robot

openshift-ci-robot commented Mar 20, 2025

@pperiyasamy: This pull request references SDN-5544 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

@pperiyasamy
Member Author

> Do you have an associated Jira card with this? If so, can you add that in the title?

@jlebon sure, added the corresponding Jira issue to the PR title.

> Also, are we planning to backport this to some of the older versions where we also pinned libreswan?

Yes, libreswan 4.6 is pinned in releases back to OCP 4.14, so we will have to backport this change all the way to 4.14.

@jlebon
Member

jlebon commented Mar 20, 2025

> Yes, libreswan 4.6 is pinned in releases back to OCP 4.14, so we will have to backport this change all the way to 4.14.

And I assume the risks of bumping to a new major version all the way back to 4.14 have been evaluated?

@igsilya
Contributor

igsilya commented Mar 20, 2025

> We should hold this until libreswan 5.2 is actually released. The tests are not finished yet.

For some context on the hold:
We need to wait until https://errata.devel.redhat.com/advisory/147278 ships.
There is also an ongoing mess with the 5.2 builds being tagged into OCP, which has made the 4.6 builds unavailable. A decision is pending.

@pperiyasamy
Member Author

> > Yes, libreswan 4.6 is pinned in releases back to OCP 4.14, so we will have to backport this change all the way to 4.14.
>
> And I assume the risks of bumping to a new major version all the way back to 4.14 have been evaluated?

@jlebon I thought access to libreswan 4.6 had been removed completely for OCP, but per @igsilya's previous comment the issue seems to be temporary. So let's test libreswan 5.2 with OCP 4.19 after this PR is merged; if everything looks good, we can then think about backporting it. Is that correct, @igsilya?

@igsilya
Contributor

igsilya commented Mar 20, 2025

> > > Yes, libreswan 4.6 is pinned in releases back to OCP 4.14, so we will have to backport this change all the way to 4.14.
>
> > And I assume the risks of bumping to a new major version all the way back to 4.14 have been evaluated?
>
> @jlebon I thought access to libreswan 4.6 had been removed completely for OCP, but per @igsilya's previous comment the issue seems to be temporary. So let's test libreswan 5.2 with OCP 4.19 after this PR is merged; if everything looks good, we can then think about backporting it. Is that correct, @igsilya?

Yes, the 4.6 build should return to the rhocp repository sometime soon (I see it has already appeared in some of the versions).

For 5.2, the plan is to wait for the official build to be released in FDP (approx. Apr 3rd), then get it into 4.19, let it soak for a bit, and then backport this change all the way down to 4.14. The reason is that we want to move away from that specific pinned 4.6 build, as it is not sustainable to support it. It was just a hot fix for an immediate issue we had, not a long-term solution.
The problem is that we can't update to any of the Libreswan 4 builds shipping in RHEL, because they all have, and will continue to have, the same problem that we were trying to solve with the 4.6 pin. The only way forward is to upgrade all the releases to Libreswan 5. It is not possible to ship Libreswan 5 in RHEL 9 due to compatibility problems, but there should be no compatibility issues for the limited use case we have in OCP specifically. That's why we're shipping this package via FDP repositories.

Note: Backporting to OCP 4.15 and 4.14 will also depend on #1774, because we need OVS 3.3 for compatibility with Libreswan 5.

@pperiyasamy
Member Author

I can now see that the bot cluster deploys libreswan 5.2 with this PR.

# oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.19.0-0.ci.test-2025-03-21-123033-ci-ln-s727v3t-latest   True        False         133m    Cluster version is 4.19.0-0.ci.test-2025-03-21-123033-ci-ln-s727v3t-latest
# oc debug node/ip-10-0-118-144.us-west-2.compute.internal
Starting pod/ip-10-0-118-144us-west-2computeinternal-debug-rbcsk ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.118.144
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# rpm -q libreswan
libreswan-5.2-1.el9fdp.x86_64
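
The same check can be scripted. This is a small sketch, assuming it runs on the host (after `chroot /host`) and that only the major version matters. `libreswan_major_ok` is a hypothetical helper name; `rpm -q --queryformat '%{VERSION}'` prints just the version field of the installed package.

```shell
# Hypothetical helper: succeed if the given version string has major >= 5.
# A non-numeric argument (e.g. rpm's "package ... is not installed" message)
# makes the numeric test fail, so it is reported as not OK.
libreswan_major_ok() {
    major=${1%%.*}          # "5.2" -> "5"
    [ "$major" -ge 5 ]
}

# Usage on a node (not executed here):
#   if libreswan_major_ok "$(rpm -q --queryformat '%{VERSION}' libreswan)"; then
#       echo "libreswan 5 or newer"
#   else
#       echo "libreswan older than 5, or not installed"
#   fi
```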

@jlebon
Member

jlebon commented Mar 28, 2025

Do we still need to hold this one?

@pperiyasamy
Member Author

/label acknowledge-critical-fixes-only

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Mar 28, 2025
@pperiyasamy
Member Author

> Do we still need to hold this one?

@jlebon sure, we can merge this PR now. I need to open a follow-up PR in MCO and get the ovnk PR openshift/ovn-kubernetes#2498 merged as well. Will do it ASAP.

@igsilya
Contributor

igsilya commented Mar 28, 2025

> > Do we still need to hold this one?
>
> @jlebon sure, we can merge this PR now. I need to open a follow-up PR in MCO and get the ovnk PR openshift/ovn-kubernetes#2498 merged as well. Will do it ASAP.

@pperiyasamy don't we need a "regex" fix first to avoid the connection wait service failures?

Also, the errata is not released yet. I'd prefer that we just wait for the official process, even if the unreleased libreswan-5.2 build is cross-tagged into OCP.

@pperiyasamy
Member Author

pperiyasamy commented Mar 28, 2025

> > > Do we still need to hold this one?
>
> > @jlebon sure, we can merge this PR now. I need to open a follow-up PR in MCO and get the ovnk PR openshift/ovn-kubernetes#2498 merged as well. Will do it ASAP.
>
> @pperiyasamy don't we need a "regex" fix first to avoid the connection wait service failures?

Without the "regex" fix, the connection wait service adds a 60s delay at startup, but overall that is not a problem for the IPsec upgrade. Anyway, I have raised a PR now: openshift/machine-config-operator#4959.

> Also, the errata is not released yet. I'd prefer that we just wait for the official process, even if the unreleased libreswan-5.2 build is cross-tagged into OCP.

Sure, let's wait then.

@joepvd

joepvd commented Mar 28, 2025

@igsilya
Contributor

igsilya commented Mar 28, 2025

> Think this is as official as it gets:
>
> * https://access.redhat.com/errata/RHSA-2025:3068
> * https://access.redhat.com/errata/RHSA-2025:3061

@joepvd I'm a little lost: how is that related to libreswan? The mentioned CVE seems to be for some Go library.
I see the list of packages in those errata somehow mentions libreswan-5.2, which is messed up if that actually got released somehow. It would mean there is a bug in the OCP compose.

@igsilya
Contributor

igsilya commented Mar 28, 2025

> I see the list of packages in those errata somehow mentions libreswan-5.2, which is messed up if that actually got released somehow. It would mean there is a bug in the OCP compose.

OK, AFAIU, it's mentioned because it was cross-tagged and is in fact in the compose. But that's fine since we're not actually installing that package. False alarm.

But I'm still not sure how these errata are otherwise related to this PR.

@huiran0826

huiran0826 commented Apr 3, 2025

Pre-merge testing with a build from openshift/os#1771, openshift/ovn-kubernetes#2498, and openshift/machine-config-operator#4959: libreswan 5.2 was installed on the nodes.

1. # oc debug node/ip-10-0-94-160.us-east-2.compute.internal
Starting pod/ip-10-0-94-160us-east-2computeinternal-debug-cvtk2 ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.94.160
If you don't see a command prompt, try pressing enter.
sh-5.1# chroot /host
sh-5.1# ipsec --version
Libreswan 5.2
sh-5.1# rpm -qa | grep libreswan
libreswan-5.2-1.el9fdp.x86_64
2. e2e passed
03-31 14:12:02.462  passed: (9.3s) 2025-03-31T06:12:02 "[sig-networking] SDN IPSEC Author:huirwang-NonHyperShiftHOST-High-72893-IPSec state can be shown in prometheus endpoint."
03-31 14:12:03.384  passed: (10.5s) 2025-03-31T06:12:03 "[sig-networking] SDN IPSEC EW Author:huirwang-Medium-37591-Make sure IPsec SA's are establishing in a transport mode"
03-31 14:12:25.454  passed: (29s) 2025-03-31T06:12:21 "[sig-networking] SDN IPSEC EW Author:huirwang-High-39216-Pod created on IPsec cluster should have appropriate MTU size to accomdate IPsec Header."
03-31 14:12:25.454  passed: (29.2s) 2025-03-31T06:12:22 "[sig-networking] SDN IPSEC EW Author:huirwang-High-38846-Should be able to send node to node ESP traffic on IPsec clusters"
03-31 14:12:37.596  passed: (43.4s) 2025-03-31T06:12:36 "[sig-networking] SDN IPSEC EW Author:huirwang-High-37392-pod to pod traffic on different nodes should be IPSec encrypted"
03-31 14:12:49.766  passed: (48.2s) 2025-03-31T06:12:49 "[sig-networking] SDN IPSEC EW Author:huirwang-Critical-79184-pod2pod cross nodes traffic should work and not broken."
03-31 15:12:57.161  passed: (7m45s) 2025-03-31T07:12:49 "[sig-networking] SDN IPSEC EW Author:huirwang-Medium-80232-After node rebooting, IPSec pod2pod connection should work as well. [Disruptive] [Serial]"
03-31 15:15:33.534  passed: (2m37s) 2025-03-31T07:15:26 "[sig-networking] SDN IPSEC EW Author:huirwang-High-38845-High-37590-Restarting pluto daemon, restarting ovn-ipsec pods, pods connection should not be broken. [Disruptive] [Serial]"
03-31 15:26:14.483  passed: (10m11s) 2025-03-31T07:26:02 "[sig-networking] SDN IPSEC EW Author:huirwang-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-80993-IPSec mode switch between Full and External. [Disruptive] [Serial]"
03-31 18:10:38.331  passed: (2m41s) 2025-03-31T10:10:26 "[sig-networking] SDN IPSEC NS Author:anusaxen-High-74220-[rdu2cluster] Transport mode can be setup for IPSec NS in NAT env - Host2Net [Serial][Disruptive] [Serial]"
03-31 18:13:14.729  passed: (2m39s) 2025-03-31T10:13:05 "[sig-networking] SDN IPSEC NS Author:anusaxen-High-74222-[rdu2cluster] Transport tunnel can be setup for IPSEC NS in NAT env, [Serial][Disruptive] [Serial]"
03-31 18:16:06.090  passed: (2m49s) 2025-03-31T10:15:54 "[sig-networking] SDN IPSEC NS Author:anusaxen-High-74223-[rdu2cluster] Tunnel mode can be setup for IPSEC NS in NAT env, [Serial][Disruptive] [Serial]"
03-31 18:18:42.448  passed: (2m39s) 2025-03-31T10:18:33 "[sig-networking] SDN IPSEC NS Author:anusaxen-High-74221-[rdu2cluster] Tunnel mode can be setup for IPSec NS in NAT env - Host2Net [Serial][Disruptive] [Serial]"
03-31 19:39:35.346  passed: (6m55s) 2025-03-31T11:39:22 "[sig-networking] SDN IPSEC NS Author:huirwang-High-67472-Transport tunnel can be setup for IPSEC NS, [Serial][Disruptive] [Serial]"
03-31 19:47:50.472  passed: (7m30s) 2025-03-31T11:47:39 "[sig-networking] SDN IPSEC NS Author:ansaxen-Medium-73554-External Traffic should still be IPsec encrypted in presense of Admin Network Policy application at egress node [Disruptive] [Serial]"
03-31 19:58:27.045  passed: (10m34s) 2025-03-31T11:58:13 "[sig-networking] SDN IPSEC NS Author:anusaxen-Longduration-NonPreRelease-High-71465-Multiplexing Tunnel and Transport type IPsec should work with external host. [Serial][Disruptive] [Serial]"
03-31 20:07:48.592  passed: (9m24s) 2025-03-31T12:07:37 "[sig-networking] SDN IPSEC NS Author:huirwang-High-69178-High-38873-Tunnel mode can be setup for IPSec NS,IPSec NS tunnel can be teared down by nmstate config. [Serial][Disruptive] [Serial]"
03-31 20:13:55.051  passed: (6m13s) 2025-03-31T12:13:49 "[sig-networking] SDN IPSEC NS Author:huirwang-High-67475-Be able to access hostnetwork pod with traffic encrypted,  [Serial][Disruptive] [Serial]"
03-31 20:27:01.649  passed: (13m4s) 2025-03-31T12:26:53 "[sig-networking] SDN IPSEC NS Author:huirwang-Longduration-NonPreRelease-Medium-67474-Medium-69176-IPSec tunnel can be up after restart IPSec service or restart node, [Serial][Disruptive] [Serial]"
03-31 20:33:23.090  passed: (6m23s) 2025-03-31T12:33:17 "[sig-networking] SDN IPSEC NS Author:huirwang-High-67473-Service nodeport can be accessed with ESP encrypted, [Serial][Disruptive] [Serial]"
03-31 19:38:55.440  passed: (50m15s) 2025-03-31T11:38:47 "[sig-networking] SDN IPSEC Author:qiowang-NonHyperShiftHOST-NonPreRelease-Longduration-Medium-64077-79804-[NETWORKCUSIM] IPSec enabled/disabled test at runtime. [Disruptive] [Serial] [Slow]"
3. Upgrade from 4.18 (libreswan 4.6) to the PR build with the worker MCP paused succeeded. In the intermediate stage, with master nodes on libreswan 5.2 and worker nodes on libreswan 4.6, there were no crashes of the ipsec pods or the ipsec systemd service, and the pod-to-pod connection check passed.
4. The pod-to-pod connection check passed 10 iterations on 27 nodes; see the results at https://privatebin.corp.redhat.com/?bfa1b6f7f80257c8#82R3UwHBM3a7JVbmbb8GCdGEm5y9xNFybebASBfhznek
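
For the intermediate upgrade stage described above, where master and worker nodes run different libreswan versions, a per-node listing makes the split visible. This is a sketch under assumptions: `oc` is logged in with permission to run `oc debug` on nodes, and `libreswan_per_node` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<node> <installed libreswan package>" per node.
libreswan_per_node() {
    # "oc get nodes -o name" prints one "node/<name>" per line.
    for node in $(oc get nodes -o name); do
        # Discard stderr, where oc debug's informational messages are
        # assumed to go, so only the rpm output is captured.
        pkg=$(oc debug "$node" -- chroot /host rpm -q libreswan 2>/dev/null)
        printf '%s %s\n' "$node" "$pkg"
    done
}

# Usage (not executed here):
#   libreswan_per_node
```

The loop uses `for` rather than `while read` so that `oc debug` cannot consume the node list from standard input.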

@igsilya
Contributor

igsilya commented Apr 3, 2025

OK, with openshift/machine-config-operator#4959 merged, the FDP erratum https://errata.devel.redhat.com/advisory/147278 shipped, and testing done in #1771 (comment), let's remove the hold.

/remove-hold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 3, 2025
@openshift-ci
Contributor

openshift-ci bot commented Apr 3, 2025

@pperiyasamy: all tests passed!

@openshift-merge-bot openshift-merge-bot bot merged commit 432b2ff into openshift:master Apr 3, 2025
14 checks passed

Labels

acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

9 participants