Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

virtio/block: allow using host OS block devices in addition to regular files #278

Closed
wants to merge 1 commit into from

Conversation

nohajc
Copy link
Contributor

@nohajc nohajc commented Mar 13, 2025

This PR fixes file size detection of virtio block devices in case the backing file is also a block device on the host OS.
See #275.

#[cfg(windows)]
let is_block_device = false;
#[cfg(unix)]
let is_block_device = m.file_type().is_block_device();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so actually this is not a complete fix. It works on macOS with the buffered /dev/disk* but not /dev/rdisk*.
For one thing, I'd have to change the condition here to is_block_device() || is_char_device() but when I do that, I get ERROR devices::virtio::block::worker] error processing request: WritingToDescriptor(Os { code: 22, kind: InvalidInput, message: "Invalid argument" }).

I also noticed my disk sector size is 4096 but there's a lot of hardcoded 512 in the code. Raw disks don't allow unaligned access so that could be it...

Anyway, I'm not sure if I should attempt to fix this or rather ignore the character device case for now. Maybe return an error here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, imago should be updated to support multiple block sizes, or at least 4096 in addition to 512. On macOS, using the raw disk is the preferred option to avoid polluting both the guest and the host page cache.

Copy link

@XanClic XanClic Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think imago prescribes any block size. It queries the underlying block size and ensures requests are aligned to it (with read-modify-write cycles if necessary).

Imago’s users are responsible for choosing an appropriate block size, and imago reports the minimal underlying alignment via FormatAccess::req_align() and FormatAccess::mem_align().

(EDIT: When I say “responsible”, I mean, if users send unaligned requests, imago will re-align them using bounce buffers and potentially RMW cycles. I.e., it should be completely fine to send unaligned requests, it’s just bad for performance. As for why the EINVAL occurs, we’d have to find out what alignment is queried/probed. The code to query the alignment from the OS for macOS is here: https://gitlab.com/hreitz/imago/-/blob/main/src/file.rs?ref_type=heads#L709 – and the code to probe it is here: https://gitlab.com/hreitz/imago/-/blob/main/src/file.rs?ref_type=heads#L496 . That is to say, the alignment is first queried, and then also probed to verify it works.)

I assume the 512 comes from libkrun’s virtio-blk code, because 512 is the virtio-blk sector size. It’s possible to report other block sizes to the guest and thus force it to use those (via virtio_blk_config.blk_size and virtio_blk_config.topology.physical_block_exp).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's actually the code I was referring to. The virtio block worker in libkrun.

And I was wondering how it reports the virtualized block size into the guest. So thanks for pointers.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, made a mistake: The code I linked is obviously the one in the main branch, it’s quite different in 0.1 (precisely because I didn’t want to go wrong, and that code would need testing first). The 0.1 code is actually just this: https://gitlab.com/hreitz/imago/-/blob/v0.1-stable/src/file.rs?ref_type=heads#L88 – i.e. it always assumes a 4k alignment to be safe. But it also does so for the memory alignment – maybe 4k is too little? We’d have to know the actual syscall that failed, specifically its parameters.

Copy link
Contributor Author

@nohajc nohajc Mar 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XanClic I opened a merge request to main. Should we port this to 0.1 too? I'm going to try to cherry-pick the commit you mentioned and see if it resolves my issue with raw disk access.

EDIT:
I tried it with both main and 0.1.3 with 6aa125b0 + my fix. It works. Now I can use /dev/rdisk*.
Thanks for your help!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slp Another improvement might be to set direct_io to true in libkrun if the file path begins with /dev/. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nohajc I think in libkrun we should use direct_io unconditionally to avoid double-caching. @jakecorrenti @XanClic WDYT? Any reason to favor not using direct_io?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nohajc Thanks a lot of opening the MR and for testing 6aa125b0. I can backport those to 0.1, no problem.

@slp I think direct_io is a good idea. As far as I understand, for QEMU, it is generally suggested to disable the host page cache on VM images to avoid double-caching, as you say. It may cause issues with images in locations that don’t support O_DIRECT, but the only case I know of was tmpfs, which now supports it, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and think that direct_io is a good idea here.

@jakecorrenti
Copy link
Member

jakecorrenti commented Mar 13, 2025

Hi @nohajc, thank you for the contribution! The imago crate is vendored from https://gitlab.com/hreitz/imago, so any changes should be made to that repository.

We should actually remove the vendor-ed code from libkrun and just add the imago dependency since it's packaged in Fedora now.

cc @XanClic

@nohajc
Copy link
Contributor Author

nohajc commented Mar 13, 2025

Oh, I haven't noticed. Thanks for pointing it out.

@slp
Copy link
Contributor

slp commented Mar 13, 2025

@nohajc Thanks for working on this. I'm going to keep this one open until the MR is created in the imago repo.

@nohajc
Copy link
Contributor Author

nohajc commented Mar 13, 2025

I'm happy to contribute. :) I'll see if I can open the correct MR today.

@nohajc
Copy link
Contributor Author

nohajc commented Mar 13, 2025

@nohajc Thanks for working on this. I'm going to keep this one open until the MR is created in the imago repo.

MR created: https://gitlab.com/hreitz/imago/-/merge_requests/18

@XanClic
Copy link

XanClic commented Mar 17, 2025

I’ve released 0.1.4 to fix this. I’ll do the Fedora release tomorrow.

@XanClic
Copy link

XanClic commented Mar 21, 2025

(By the way, the Fedora release is done (from my side), too: https://src.fedoraproject.org/rpms/rust-imago)

@nohajc
Copy link
Contributor Author

nohajc commented Mar 22, 2025

I suppose this can be closed now. I'll look forward to the next release. Thank you everyone for your help!

@slp
Copy link
Contributor

slp commented Mar 24, 2025

I suppose this can be closed now. I'll look forward to the next release. Thank you everyone for your help!

Thank you @nohajc !

@slp slp closed this Mar 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants