Contributor

@vigh-m vigh-m commented Oct 10, 2025

Issue number:

Closes #536

Description of changes:

  • Build systemd-cryptsetup for systemd-257
  • Also update systemd-257 to the latest upstream release.
    • This is required because upstream has added support for building systemd without openssl/ui.h, which we need in order to build systemd-cryptsetup.

Testing done:

All testing was done on an aws-k8s-1.34(-nvidia) variant

  • No errors in the journalctl logs:

    Oct 10 17:29:29 localhost kernel: Run /sbin/init as init process
    Oct 10 17:29:29 localhost kernel:   with arguments:
    Oct 10 17:29:29 localhost kernel:     /sbin/init
    Oct 10 17:29:29 localhost kernel:     systemd.log_target=journal-or-kmsg
    Oct 10 17:29:29 localhost kernel:     systemd.log_color=0
    Oct 10 17:29:29 localhost kernel:     systemd.show_status=true
    Oct 10 17:29:29 localhost kernel:   with environment:
    Oct 10 17:29:29 localhost kernel:     HOME=/
    Oct 10 17:29:29 localhost kernel:     TERM=linux
    Oct 10 17:29:29 localhost kernel:     SYSTEMD_DEFAULT_MOUNT_RATE_LIMIT_BURST=25
    Oct 10 17:29:29 localhost kernel:     SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1
    Oct 10 17:29:29 localhost kernel:     BOOT_IMAGE=(hd0,gpt3)/vmlinuz
    Oct 10 17:29:29 localhost kernel:     selinux=1
    Oct 10 17:29:29 localhost kernel:     enforcing=1
    .
    .
    .
    Oct 10 17:29:29 localhost systemd[1]: systemd 257.9 running in system mode (-PAM -AUDIT +SELINUX -APPARMOR -IMA +IPE -SMACK +SECCOMP -GCRYPT -GNUTLS +OPENSSL +ACL +BLKID -CURL -ELFUTILS -FIDO2 -IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBCRYPTSETUP_PLUGINS +LIBFDISK -PCRE2 -PWQUALITY -P11KIT -QRENCODE +TPM2 -BZIP2 -LZ4 -XZ -ZLIB -ZSTD -BPF_FRAMEWORK -BTF -XKBCOMMON -UTMP -SYSVINIT -LIBARCHIVE)
    
  • Tested a few systemd commands without error

    bash-5.1# systemd-analyze
    Startup finished in 798ms (firmware) + 644ms (loader) + 2.376s (kernel) + 13.863s (userspace) = 17.682s
    multi-user.target reached after 13.824s in userspace.
    
    bash-5.1# systemctl restart kubelet
    bash-5.1# systemctl status kubelet
    ● kubelet.service - Kubelet
         Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/kubelet.service; enabled; preset: enabled)
        Drop-In: /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d
                 └─00-aws-config.conf
                 /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/kubelet.service.d
                 └─dockershim-symlink.conf
                 /etc/systemd/system/kubelet.service.d
                 └─exec-start.conf
                 /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/kubelet.service.d
                 └─make-kubelet-dirs.conf, prestart-load-pause-ctr.conf
        Active: active (running) since Fri 2025-10-10 17:35:03 UTC; 1s ago
     Invocation: 8b9155cd79764e9193afb39a475e60b2
           Docs: https://github.com/kubernetes/kubernetes
        Process: 4957 ExecStartPre=/sbin/iptables -P FORWARD ACCEPT (code=exited, status=0/SUCCESS)
        Process: 4958 ExecStartPre=/bin/ln -sf ./containerd/containerd.sock /run/dockershim.sock (code=exited, status=0/SUCCESS)
        Process: 4961 ExecStartPre=/usr/bin/mkdir -p /var/lib/kubelet/providers/secrets-store /var/lib/kubelet/node-feature-discovery/features.d (code=exited, status=0/SUCCESS)
        Process: 4963 ExecStartPre=/usr/bin/ctr --namespace=k8s.io image import --all-platforms /usr/libexec/kubernetes/kubernetes-pause.tar (code=exited, status=0/SUCCESS)
        Process: 4969 ExecStartPre=/usr/bin/ctr --namespace=k8s.io image label localhost/kubernetes/pause:0.1.0 io.cri-containerd.pinned=pinned (code=exited, status=0/SUCCESS)
       Main PID: 4974 (kubelet)
          Tasks: 11 (limit: 9313)
         Memory: 30.5M (peak: 38M)
            CPU: 381ms
         CGroup: /runtime.slice/kubelet.service
                 └─4974 /usr/bin/kubelet --cloud-provider external --kubeconfig /etc/kubernetes/kubelet/kubeconfig --config /etc/kubernetes/kubelet/config --container-runtime-endpoint=unix:///run/containerd/containerd.sock --containerd=/run/containerd/containerd.sock --root-dir /var/lib/kubelet --cert-dir /var/lib/kubel…
    
    
  • No unit files showing any errors

    bash-5.1# systemctl | grep -e "error" -e "ERROR" -e "Error" -e "failed" -e "FAILED" -e  "Failed"
    bash-5.1#
    
  • The instance joins the cluster as expected

  • Basic workloads are launched on the node correctly

  • Tested neuron instances:

    bash-5.1# systemctl status modprobe@neuron
    ● modprobe@neuron.service - Load Kernel Module neuron
         Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/modprobe@.service; static)
        Drop-In: /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/service.d
                 └─00-aws-config.conf
                 /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/modprobe@.service.d
                 └─10-remain-after-exit.conf
                 /x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/modprobe@neuron.service.d
                 └─neuron.conf
         Active: active (exited) since Fri 2025-10-10 18:30:49 UTC; 4min 33s ago
     Invocation: dd5a2c8f3c434733be23f8067fc24207
           Docs: man:modprobe(8)
       Main PID: 1306 (code=exited, status=0/SUCCESS)
       Mem peak: 2.8M
            CPU: 39ms
    
    Oct 10 18:30:49 localhost systemd[1]: Finished Load Kernel Module neuron.
    Notice: journal has been rotated since unit was started, output may be incomplete.
    
    % kubectl get nodes "-o=custom-columns=NAME:.metadata.name,NeuronCore:.status.allocatable.aws\.amazon\.com/neuroncore"
    NAME                                          NeuronCore
    ip-192-168-70-48.us-west-2.compute.internal   2
    
  • Tested Neuron Workloads:

    === RUN   TestNeuronNodes/multi-node
        neuron_test.go:157: Skipping feature "multi-node": name not matched
    --- PASS: TestNeuronNodes (360.09s)
        --- PASS: TestNeuronNodes/single-node (360.09s)
            --- PASS: TestNeuronNodes/single-node/Single_node_test_Job_succeeds (360.02s)
       --- SKIP: TestNeuronNodes/multi-node (0.00s)
    PASS
    ok      github.com/aws/aws-k8s-tester/test/cases/neuron 431.977s
    
  • Nvidia instances working as expected:

    bash-5.1# nvidia-smi
    Fri Oct 10 21:23:46 2025
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
    +-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  Tesla T4                       On  |   00000000:00:1E.0 Off |                    0 |
    | N/A   31C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+
    
    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
    
  • Basic Nvidia workload tests worked as expected.

  • Ran basic scale testing by launching ~100 g4dn.xlarge, m5.large, c6i.large, r7a.large,

  • Soak testing with an AMI (Pending)
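
The "no unit files showing any errors" check in the list above uses six separate patterns; a case-insensitive match covers the same cases in one pass. A minimal sketch, assuming the unit listing is piped in on stdin. The helper name `units_look_clean` is hypothetical and not part of this PR:

```shell
# Succeeds (exit 0) only when no input line mentions "error" or "failed",
# in any letter case. Reads a unit listing on stdin.
# units_look_clean is a hypothetical helper name, not part of the PR.
units_look_clean() {
    ! grep -i -q -e 'error' -e 'failed'
}

# Intended use on a node (not run here):
#   systemctl --no-pager | units_look_clean && echo "no failing units"
```

`systemctl --failed` lists failed units directly and may be a simpler first check.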

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@vigh-m vigh-m force-pushed the systemd-cryptsetup-build branch from 6587c86 to a197f44 on October 13, 2025 17:58
Contributor Author

vigh-m commented Oct 13, 2025

⬆️ split the changes into multiple commits

Contributor

@bcressey bcressey left a comment

Can you run this again now that cryptsetup is enabled and fix any spurious differences?

git diff --no-index packages/systemd-25{2,7}/systemd*.spec
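
In the command above, the shell expands `systemd-25{2,7}` into the two package directories before git sees the arguments. A sketch of the same comparison with the paths spelled out, assuming it runs from the repository root and that each directory holds exactly one matching spec file. The function name `spec_diff` and its optional root argument are made up for illustration:

```shell
# Compare the systemd-252 and systemd-257 spec files.
# spec_diff and its optional root-directory argument are hypothetical.
# git diff --no-index exits non-zero when the files differ.
spec_diff() {
    root="${1:-.}"
    git diff --no-index \
        "$root"/packages/systemd-252/systemd*.spec \
        "$root"/packages/systemd-257/systemd*.spec
}

# Intended use from the repository root (not run here):
#   spec_diff | less
```

Because the exit status reflects whether any differences remain, the function can also serve as a quick check after each cleanup pass.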

Comment on lines 22 to 23
+# else
+# include <openssl/hmac.h>
Contributor

This should have its own guard:

#  ifndef OPENSSL_HMAC_H
#    include <openssl/hmac.h>
#  endif

if (r < 0)
return log_debug_errno(SYNTHETIC_ERRNO(EIO),
- "Signature verification failed: 0x%lx", ERR_get_error());
+ "Signature verification failed: 0x%u", ERR_get_error());
Contributor

If this is really a uint32_t, you should use the matching printf specifier:

#include <inttypes.h>
...
"Signature verification failed: 0x%" PRIx32, ERR_get_error()

Contributor Author

Yeah, aws-lc and OpenSSL have different implementations for this.

DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(ASN1_TYPE*, ASN1_TYPE_free, NULL);
DEFINE_TRIVIAL_CLEANUP_FUNC_FULL(ASN1_STRING*, ASN1_STRING_free, NULL);

+# ifndef OPENSSL_NO_UI_CONSOLE
Contributor

nit: spacing is off here

Comment on lines 6 to 7
# Disable OpenSSL UI since aws-lc does not support it.
%global _cross_cflags %{_cross_cflags} -DOPENSSL_NO_UI_CONSOLE=1
Contributor

I'd slightly prefer either:

  1. patching the pkgconfig file in libcrypto to include this by default
  2. patching meson.build to include it

Comment on lines 73 to 75
# Disable sb-sign since that has a dependency on PKCS7 which is not provided
# by aws-lc
Patch9015: 9015-disable-sb-sign.patch
Contributor

Suggested change
# Disable sb-sign since that has a dependency on PKCS7 which is not provided
# by aws-lc
Patch9015: 9015-disable-sb-sign.patch
# Disable sb-sign since that has a dependency on PKCS7 which is not provided
# by aws-lc
Patch9015: 9015-disable-sb-sign.patch

Comment on lines 821 to 833
%{_cross_libdir}/pcrlock.d/350-action-efi-application.pcrlock
%{_cross_libdir}/pcrlock.d/400-secureboot-separator.pcrlock.d/300-0x00000000.pcrlock
%{_cross_libdir}/pcrlock.d/400-secureboot-separator.pcrlock.d/600-0xffffffff.pcrlock
%{_cross_libdir}/pcrlock.d/500-separator.pcrlock.d/300-0x00000000.pcrlock
%{_cross_libdir}/pcrlock.d/500-separator.pcrlock.d/600-0xffffffff.pcrlock
%{_cross_libdir}/pcrlock.d/700-action-efi-exit-boot-services.pcrlock.d/300-present.pcrlock
%{_cross_libdir}/pcrlock.d/700-action-efi-exit-boot-services.pcrlock.d/600-absent.pcrlock
%{_cross_libdir}/pcrlock.d/750-enter-initrd.pcrlock
%{_cross_libdir}/pcrlock.d/800-leave-initrd.pcrlock
%{_cross_libdir}/pcrlock.d/850-sysinit.pcrlock
%{_cross_libdir}/pcrlock.d/900-ready.pcrlock
%{_cross_libdir}/pcrlock.d/950-shutdown.pcrlock
%{_cross_libdir}/pcrlock.d/990-final.pcrlock
Contributor

I doubt that these specific pcrlock policy files will be useful, though the mechanism is elegant.

I'd like to get systemd-pcrextend included; that's what would perform some of the measurements that these pcrlock files anticipate.

Contributor Author

Ack. I've %exclude(d) them in their own section for now. systemd-pcrextend requires BOOTLOADER to be enabled, which I can explore separately from this PR.

Comment on lines 834 to 836
%{_cross_libdir}/systemd/system-generators/systemd-cryptsetup-generator
%{_cross_libdir}/systemd/system-generators/systemd-integritysetup-generator
%{_cross_libdir}/systemd/system-generators/systemd-veritysetup-generator
Contributor

Suggested change
-%{_cross_libdir}/systemd/system-generators/systemd-cryptsetup-generator
-%{_cross_libdir}/systemd/system-generators/systemd-integritysetup-generator
-%{_cross_libdir}/systemd/system-generators/systemd-veritysetup-generator
+%{_cross_systemdgeneratordir}/systemd-cryptsetup-generator
+%{_cross_systemdgeneratordir}/systemd-integritysetup-generator
+%{_cross_systemdgeneratordir}/systemd-veritysetup-generator

Comment on lines 837 to 838
%{_cross_libdir}/systemd/system/cryptsetup-pre.target
%{_cross_libdir}/systemd/system/cryptsetup.target
Contributor

Use %{_cross_unitdir} instead of %{_cross_libdir}/systemd/system.

Comment on lines 839 to 840
%{_cross_libdir}/systemd/system/initrd-root-device.target.wants/remote-cryptsetup.target
%{_cross_libdir}/systemd/system/initrd-root-device.target.wants/remote-veritysetup.target
Contributor

omit: we don't have an initrd

Suggested change
-%{_cross_libdir}/systemd/system/initrd-root-device.target.wants/remote-cryptsetup.target
-%{_cross_libdir}/systemd/system/initrd-root-device.target.wants/remote-veritysetup.target

Comment on lines 843 to 844
%{_cross_libdir}/systemd/system/remote-cryptsetup.target
%{_cross_libdir}/systemd/system/remote-veritysetup.target
Contributor

Add these to the "Exclude remote filesystem targets" section above.

This extends the upstream patch to allow building systemd with openssl
drop-ins that don't have UI support.

openssl and aws-lc (and boringssl) have diverged wrt the return type of
ERR_get_error(), and `long unsigned int` has been patched to be
`uint32_t` instead.

sb-sign has a dependency on PKCS7, which is not provided by aws-lc.
Adding a new meson option to prevent it from being built.

Stub out install_secure_boot_auto_enroll since it depends on PKCS7. Instead,
default to the EOPNOTSUPP condition with a debug log.

Add a patch to extend meson options to set OPENSSL_NO_UI_CONSOLE=1
during build. The option can be controlled with the
CONFIGURE_OPTS in the specfile.

@vigh-m vigh-m force-pushed the systemd-cryptsetup-build branch from a197f44 to 48b5ad8 on October 17, 2025 17:04
Contributor Author

vigh-m commented Oct 17, 2025

⬆️ Addressed comments

Comment on lines 846 to 854
%{_cross_libdir}/systemd/system/integritysetup-pre.target
%{_cross_libdir}/systemd/system/integritysetup.target
%{_cross_libdir}/systemd/system/sysinit.target.wants/cryptsetup.target
%{_cross_libdir}/systemd/system/sysinit.target.wants/integritysetup.target
%{_cross_libdir}/systemd/system/sysinit.target.wants/veritysetup.target
%{_cross_libdir}/systemd/system/system-systemd\x2dcryptsetup.slice
%{_cross_libdir}/systemd/system/system-systemd\x2dveritysetup.slice
%{_cross_libdir}/systemd/system/veritysetup-pre.target
%{_cross_libdir}/systemd/system/veritysetup.target
Contributor

replace %{_cross_libdir}/systemd/system with %{_cross_unitdir}.
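
The macro substitutions requested in this review are mechanical, so they could be applied with sed rather than by hand. A rough sketch, assuming the spec's file entries use exactly the path prefixes quoted in the comments above. The function name `normalize_spec_paths` is hypothetical, and the spec path in the usage comment is an assumption:

```shell
# Rewrite long-form systemd paths to the shorter macros named in the review.
# normalize_spec_paths is a hypothetical helper; it filters stdin to stdout.
normalize_spec_paths() {
    sed \
        -e 's|%{_cross_libdir}/systemd/system-generators/|%{_cross_systemdgeneratordir}/|' \
        -e 's|%{_cross_libdir}/systemd/system/|%{_cross_unitdir}/|'
}

# Intended use (spec path assumed, not run here):
#   normalize_spec_paths < packages/systemd-257/systemd-257.spec > systemd-257.spec.new
```

The generator rule and the unit rule cannot shadow each other, since `systemd/system-generators/` never contains the `systemd/system/` prefix. Reviewing the result with a diff before replacing the spec is still advisable.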

@vigh-m vigh-m force-pushed the systemd-cryptsetup-build branch from 48b5ad8 to aa360e9 on October 20, 2025 17:56
Contributor Author

vigh-m commented Oct 20, 2025

⬆️ replaced %{_cross_libdir}/systemd/system with %{_cross_unitdir}. Also moved things around a little to reduce the diff between systemd252 and systemd257 specs

Development

Successfully merging this pull request may close these issues.

Upgrade systemd to the latest stable major version available
