-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
virtio/block: allow using host OS block devices in addition to regular files #278
Conversation
…r files Signed-off-by: Jan Noha <[email protected]>
#[cfg(windows)] | ||
let is_block_device = false; | ||
#[cfg(unix)] | ||
let is_block_device = m.file_type().is_block_device(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so actually this is not a complete fix. It works on macOS with the buffered /dev/disk*
but not /dev/rdisk*
.
For one thing, I'd have to change the condition here to is_block_device() || is_char_device()
but when I do that, I get ERROR devices::virtio::block::worker] error processing request: WritingToDescriptor(Os { code: 22, kind: InvalidInput, message: "Invalid argument" })
.
I also noticed my disk sector size is 4096 but there's a lot of hardcoded 512 in the code. Raw disks don't allow unaligned access so that could be it...
Anyway, I'm not sure if I should attempt to fix this or rather ignore the character device case for now. Maybe return an error here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, imago
should be updated to support multiple block sizes, or at least 4096 in addition to 512. On macOS, using the raw disk is the preferred option to avoid polluting both the guest and the host page cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think imago prescribes any block size. It queries the underlying block size and ensures requests are aligned to it (with read-modify-write cycles if necessary).
Imago’s users are responsible for choosing an appropriate block size, and imago reports the minimal underlying alignment via FormatAccess::req_align() and FormatAccess::mem_align().
(EDIT: When I say “responsible”, I mean, if users send unaligned requests, imago will re-align them using bounce buffers and potentially RMW cycles. I.e., it should be completely fine to send unaligned requests, it’s just bad for performance. As for why the EINVAL
occurs, we’d have to find out what alignment is queried/probed. The code to query the alignment from the OS for macOS is here: https://gitlab.com/hreitz/imago/-/blob/main/src/file.rs?ref_type=heads#L709 – and the code to probe it is here: https://gitlab.com/hreitz/imago/-/blob/main/src/file.rs?ref_type=heads#L496 . That is to say, the alignment is first queried, and then also probed to verify it works.)
I assume the 512 comes from libkrun’s virtio-blk code, because 512 is the virtio-blk sector size. It’s possible to report other block sizes to the guest and thus force it to use those (via virtio_blk_config.blk_size
and virtio_blk_config.topology.physical_block_exp
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's actually the code I was referring to. The virtio block worker in libkrun.
And I was wondering how it reports the virtualized block size into the guest. So thanks for pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, made a mistake: The code I linked is obviously the one in the main branch, it’s quite different in 0.1 (precisely because I didn’t want to go wrong, and that code would need testing first). The 0.1 code is actually just this: https://gitlab.com/hreitz/imago/-/blob/v0.1-stable/src/file.rs?ref_type=heads#L88 – i.e. it always assumes a 4k alignment to be safe. But it also does so for the memory alignment – maybe 4k is too little? We’d have to know the actual syscall that failed, specifically its parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XanClic I opened a merge request to main. Should we port this to 0.1 too? I'm going to try to cherry-pick the commit you mentioned and see if it resolves my issue with raw disk access.
EDIT:
I tried it with both main and 0.1.3 with 6aa125b0 + my fix. It works. Now I can use /dev/rdisk*
.
Thanks for your help!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slp Another improvement might be to set direct_io
to true
in libkrun if the file path begins with /dev/
. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nohajc I think in libkrun we should use direct_io
unconditionally to avoid double-caching. @jakecorrenti @XanClic WDYT? Any reason to favor not using direct_io
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nohajc Thanks a lot of opening the MR and for testing 6aa125b0. I can backport those to 0.1, no problem.
@slp I think direct_io
is a good idea. As far as I understand, for QEMU, it is generally suggested to disable the host page cache on VM images to avoid double-caching, as you say. It may cause issues with images in locations that don’t support O_DIRECT
, but the only case I know of was tmpfs, which now supports it, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and think that direct_io
is a good idea here.
Hi @nohajc, thank you for the contribution! The We should actually remove the vendor-ed code from libkrun and just add the cc @XanClic |
Oh, I haven't noticed. Thanks for pointing it out. |
@nohajc Thanks for working on this. I'm going to keep this one open until the MR is created in the imago repo. |
I'm happy to contribute. :) I'll see if I can open the correct MR today. |
MR created: https://gitlab.com/hreitz/imago/-/merge_requests/18 |
I’ve released 0.1.4 to fix this. I’ll do the Fedora release tomorrow. |
(By the way, the Fedora release is done (from my side), too: https://src.fedoraproject.org/rpms/rust-imago) |
I suppose this can be closed now. I'll look forward to the next release. Thank you everyone for your help! |
Thank you @nohajc ! |
This PR fixes file size detection of virtio block devices in case the backing file is also a block device on the host OS.
See #275.