efi: support updating multiple EFIs in mirrored setups (RAID1) #855

Open
wants to merge 4 commits into
base: main

HuijingHei
Member

@HuijingHei HuijingHei commented Feb 18, 2025

The prep PR is #824

Fixes #132

Include 4 PRs:

  • blockdev: remove #[allow(dead_code)] for functions
  • bios: remove checking is_efi_booted() as it would fail on BIOS only
  • efi: support updating multiple EFIs in mirrored setups (RAID1)
  • tests: add kola test for updating multiple EFIs

Comment on lines +79 to +83
if let Some(esp_device) = self.get_esp_device() {
    esp_devices.push(esp_device.to_string_lossy().into_owned());
} else {
    esp_devices = blockdev::find_colocated_esps("/").expect("get esp devices");
};
Contributor

@champtar champtar Mar 15, 2025

Have you considered or tried the case of 2 disks with 2 separate OSes? (We do that in our lab: 2 or even 3 rpm-ostree based OSes, each on its own disk, all installed via anaconda.)
I'm asking because, from a quick look at the code, I would use find_colocated_esps("/") first and get_esp_device() as a fallback.

The safest might even be:

  1. mounted ESP(s)
  2. colocated ESP(s)
  3. get_esp_device

If you have a bootc-installed OS and, on a second disk, an anaconda-installed OS (not even ostree based), right now you will always pick the anaconda ESP.
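
The ordering suggested in this comment can be sketched as a fallback chain. This is only an illustration, not code from the PR: `select_esps` and its three closure parameters are hypothetical stand-ins for "mounted ESPs", `find_colocated_esps("/")`, and `get_esp_device()`, and each later strategy runs only if the earlier ones found nothing.

```rust
/// Return ESP device paths from the first discovery strategy that yields any.
///
/// Hypothetical sketch: the closures stand in for the three strategies named
/// in the discussion; none of them is the real bootupd API.
fn select_esps<M, C, L>(mounted: M, colocated: C, by_label: L) -> Vec<String>
where
    M: FnOnce() -> Vec<String>,
    C: FnOnce() -> Vec<String>,
    L: FnOnce() -> Option<String>,
{
    // 1. ESPs that are already mounted are the least ambiguous.
    let found = mounted();
    if !found.is_empty() {
        return found;
    }
    // 2. ESPs on the disk(s) backing the booted system.
    let found = colocated();
    if !found.is_empty() {
        return found;
    }
    // 3. Label-based lookup as a last resort (zero or one device).
    by_label().into_iter().collect()
}
```

With this ordering, a label match on an unrelated disk is only consulted when nothing closer to the booted system was found.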

Member Author

@HuijingHei HuijingHei Mar 17, 2025

> I would use find_colocated_esps("/") first and get_esp_device() as a fallback.

  • get_esp_device() finds the ESP by partition label: the CoreOS label /dev/disk/by-partlabel/EFI-SYSTEM, or the Anaconda label /dev/disk/by-partlabel/EFI\x20System\x20Partition.

  • find_colocated_esps("/") lists the ESP partitions on the devices backing the /boot mount point. That does not mean an ESP is mounted; for example, CoreOS does not mount the ESP after boot (see doc).

On a single disk we can easily get the ESP via get_esp_device() using the label /dev/disk/by-partlabel/EFI-SYSTEM, but on multiple disks the ESPs would be /dev/disk/by-label/esp-1 and /dev/disk/by-label/esp-2. Does this make sense?

Member Author

@HuijingHei HuijingHei Mar 17, 2025

> The safest might even be:
>
> 1. mounted ESP(s)
> 2. colocated ESP(s)
> 3. get_esp_device

What I did: when running update, get_esp_devices() lists all the ESP devices; then, for each device in the loop, we check whether the ESP is already mounted and, if not, mount it before upgrading. I think you mean that we should check for mounted ESPs first, at the very beginning. I will try that, thanks!

Edit: For multiple disks, we still need the loop for each esp device. Will keep it as it is, WDYT?
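
The per-device loop described in this comment might look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: the `EspMounter` trait, `FakeMounter`, and `update_all_esps` are hypothetical names introduced here so the control flow (reuse an existing mount, otherwise mount, then update) can be shown without touching real devices. Unmounting afterwards is left out.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over mount state (not part of bootupd).
trait EspMounter {
    /// Existing mount point of `device`, if it is already mounted.
    fn mounted_at(&self, device: &str) -> Option<String>;
    /// Mount `device` and return the new mount point.
    fn mount(&mut self, device: &str) -> Result<String, String>;
}

/// Update every ESP in `devices`, reusing an existing mount when there is one.
/// Returns the mount points that were updated, in device order.
fn update_all_esps<M, F>(
    devices: &[String],
    mounter: &mut M,
    mut update: F,
) -> Result<Vec<String>, String>
where
    M: EspMounter,
    F: FnMut(&str) -> Result<(), String>,
{
    let mut updated = Vec::new();
    for device in devices {
        // Prefer a mount that already exists; only mount when needed.
        let mount_point = match mounter.mounted_at(device) {
            Some(existing) => existing,
            None => mounter.mount(device)?,
        };
        update(&mount_point)?;
        updated.push(mount_point);
    }
    Ok(updated)
}

/// In-memory mounter used only for illustration.
struct FakeMounter {
    mounts: HashMap<String, String>,
}

impl EspMounter for FakeMounter {
    fn mounted_at(&self, device: &str) -> Option<String> {
        self.mounts.get(device).cloned()
    }

    fn mount(&mut self, device: &str) -> Result<String, String> {
        // Derive a fake mount point from the device name, e.g. /run/esp/sdc1.
        let name = device.rsplit('/').next().unwrap_or(device);
        let mount_point = format!("/run/esp/{}", name);
        self.mounts.insert(device.to_string(), mount_point.clone());
        Ok(mount_point)
    }
}
```

Because the loop runs once per device, it covers the single-disk case (one iteration) and the mirrored case (one iteration per member disk) with the same code path.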

Contributor

> get_esp_devices() would list the all esp devices

Maybe I don't understand the Rust code, but as I read it, it first calls get_esp_device(), which returns 1 device based on a weak heuristic; only if that fails does it try to find multiple devices.

Here are 3 test cases, each with the second OS (sdb) booted and running get_esp_devices():
sda: normal anaconda install
sdb: ESP not mounted, not using anaconda/coreos label
-> returns sda ?

sda: normal anaconda install
sdb & sdc: raid 1 mirrored, ESP not mounted, not using anaconda/coreos label
-> returns sda ?

sda: normal coreos install
sdb: also a coreos install, ESP not mounted
-> not sure which PARTLABEL wins, likely random at each boot

Member Author

I do not have such an environment, but in my understanding, if the second OS (sdb) is booted, sda will not be mounted by the booted OS, so in this case we do not scan sda.

> sda: normal anaconda install
> sdb: ESP not mounted, not using anaconda/coreos label

Skip sda; get_esp_devices() will call find_colocated_esps("/") to find the device backing the /boot mount point, then get the ESP partition on the same device.

> sda: normal anaconda install
> sdb & sdc: raid 1 mirrored, ESP not mounted, not using anaconda/coreos label

Skip sda. This is actually the case we need to resolve: find_colocated_esps("/") finds the device backing the /boot mount point, which is /dev/md126, then needs to get the list of backing devices (sdb and sdc) and find the ESP partition on each of them.

> sda: normal coreos install
> sdb: also a coreos install, ESP not mounted

When I tried to boot a VM with 2 CoreOS installs, it failed with:

[    9.373774] rdcore[957]: Error: System has 2 devices with a filesystem labeled 'boot': ["/dev/vdb3", "/dev/vda3"]
[FAILED] Failed to start CoreOS Ignition Ensure Unique Boot Filesystem.

So we can skip this environment.
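
For the mirrored case above (/boot on /dev/md126, backed by sdb and sdc), the resolution steps can be sketched against an in-memory model. This is a simplified illustration, not the PR's `find_colocated_esps`: the `Partition` and `BootSource` types and the `colocated_esps` function are hypothetical, and a real implementation would read this information from the system. The ESP type GUID is the standard GPT value.

```rust
/// GPT partition type GUID of an EFI System Partition.
const ESP_TYPE_GUID: &str = "c12a7328-f81f-11d2-ba4b-00a0c93ec93b";

/// Simplified, hypothetical view of one partition.
struct Partition {
    path: String,     // e.g. "/dev/sdb1"
    disk: String,     // parent disk, e.g. "/dev/sdb"
    parttype: String, // GPT partition type GUID
}

/// What backs /boot: a plain partition, or an MD RAID array whose members
/// are partitions on several disks.
enum BootSource {
    Partition(String),
    Raid { members: Vec<String> },
}

/// List the ESPs that live on the same disk(s) as the partitions backing /boot.
fn colocated_esps(boot: &BootSource, parts: &[Partition]) -> Vec<String> {
    // Step 1: partitions that back /boot (one, or every RAID member).
    let backing: Vec<&str> = match boot {
        BootSource::Partition(p) => vec![p.as_str()],
        BootSource::Raid { members } => members.iter().map(String::as_str).collect(),
    };

    // Step 2: parent disks of those partitions, de-duplicated.
    let mut disks: Vec<&str> = Vec::new();
    for b in &backing {
        if let Some(p) = parts.iter().find(|p| p.path == *b) {
            if !disks.contains(&p.disk.as_str()) {
                disks.push(p.disk.as_str());
            }
        }
    }

    // Step 3: ESPs on exactly those disks; other disks are ignored.
    parts
        .iter()
        .filter(|p| disks.contains(&p.disk.as_str()))
        .filter(|p| p.parttype.eq_ignore_ascii_case(ESP_TYPE_GUID))
        .map(|p| p.path.clone())
        .collect()
}
```

In this model the anaconda install on sda is never considered, because no partition backing /boot lives on that disk.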

Member Author

cc @travier to confirm my reply, as I'm not sure my understanding is correct. Thanks!

Contributor

On a server with 3 disks / 3 installs side by side:

# udevadm info /dev/sda1 | grep '^S: disk/by-partlabel'
S: disk/by-partlabel/EFI\x20System\x20Partition
# udevadm info /dev/sdb1 | grep '^S: disk/by-partlabel'
S: disk/by-partlabel/EFI\x20System\x20Partition
# udevadm info /dev/sdc1 | grep '^S: disk/by-partlabel'
S: disk/by-partlabel/EFI\x20System\x20Partition

All 3 want the by-partlabel symlink, but only the first one gets it, and I'm not aware of any ordering or priorities.

In cases 1 and 2 we do not mount sda, but udev definitely scans it.
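
One way to guard against the collision shown above is to check whether a PARTLABEL is claimed by more than one partition before trusting a by-partlabel lookup. The sketch below is a hypothetical helper, not something in the PR: `ambiguous_partlabels` takes (device, label) pairs that a caller would have gathered elsewhere and reports labels with more than one claimant.

```rust
use std::collections::BTreeMap;

/// Group partitions by PARTLABEL and keep only the labels that more than one
/// partition claims. Input pairs are (device path, partition label).
fn ambiguous_partlabels(parts: &[(&str, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut by_label: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (device, label) in parts {
        by_label
            .entry(label.to_string())
            .or_default()
            .push(device.to_string());
    }
    // A label owned by a single partition is unambiguous; drop it.
    by_label.retain(|_, devices| devices.len() > 1);
    by_label
}
```

If the ESP label shows up in the result, a label-based lookup cannot tell the installs apart, and a disk-based lookup is the safer choice.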

Member Author

> All 3 want the by-partlabel symlink

If sda is booted, then it will find the ESP partition on sda; I do not think it will retrieve all the devices.

@HuijingHei
Member Author

@travier could you help to review when you have time? Thanks!

@travier
Member

travier commented Apr 9, 2025

Dusty added some logic in openshift/os#1795 to make sure that the ESPs we find are on the same device as the /boot partition, and we should do the same for both the RAID and non-RAID cases.
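
A "same device as /boot" filter could be sketched as below. This is an assumption-laden illustration, not the logic from openshift/os#1795 or from this PR: it assumes `lsblk` pairs output with NAME, PKNAME, and PARTTYPE columns (for example from `lsblk --pairs --paths --output NAME,PKNAME,PARTTYPE`), and `parse_pairs` and `esps_on_disk` are hypothetical helper names.

```rust
use std::collections::HashMap;

/// GPT partition type GUID of an EFI System Partition.
const ESP_TYPE_GUID: &str = "c12a7328-f81f-11d2-ba4b-00a0c93ec93b";

/// Parse one line of `KEY="value"` pairs into a map.
fn parse_pairs(line: &str) -> HashMap<String, String> {
    let mut out = HashMap::new();
    let mut rest = line.trim();
    while let Some(eq) = rest.find("=\"") {
        let key = rest[..eq].trim().to_string();
        let after = &rest[eq + 2..];
        // Stop on a malformed line with an unterminated value.
        let end = match after.find('"') {
            Some(end) => end,
            None => break,
        };
        out.insert(key, after[..end].to_string());
        rest = &after[end + 1..];
    }
    out
}

/// Return the ESP partitions whose parent device (PKNAME) is `disk`.
fn esps_on_disk(lsblk_output: &str, disk: &str) -> Vec<String> {
    lsblk_output
        .lines()
        .map(parse_pairs)
        // Keep only partitions that belong to the requested disk.
        .filter(|row| row.get("PKNAME").map(String::as_str) == Some(disk))
        // Keep only partitions typed as ESP (GUID compare is case-insensitive).
        .filter(|row| {
            row.get("PARTTYPE")
                .map_or(false, |t| t.eq_ignore_ascii_case(ESP_TYPE_GUID))
        })
        .filter_map(|row| row.get("NAME").cloned())
        .collect()
}
```

Calling `esps_on_disk` once per disk that backs /boot would give the same answer for the non-RAID case (one disk) and the RAID case (one call per member disk).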

@HuijingHei
Member Author

> Dusty added some logic in openshift/os#1795 to make sure that the ESPs we find are on the same device as the /boot partition, and we should do the same for both the RAID and non-RAID cases.

Agreed, and this patch follows the same approach.

Successfully merging this pull request may close these issues.

Support updating multiple EFIs in mirrored setups
3 participants