-
Notifications
You must be signed in to change notification settings - Fork 296
qemu-wrapper: increase max supported NVMe request size #6781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lindig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would like @rosslagerwall to review this. (I think the handling of n in this code is difficult to reason about. Why are we looking at quemu_args[n+1]?)
The current default value for the NVMe MDTS parameter exposed in QEMU emulated NMVe devices is 7 (max 512KiB requests). However there seems to be an internal Windows Server 2025 issue that possibly triggers when splitting bigger requests into smaller on in the NVMe Windows driver. Increase the exposed MDTS value on the emulated QEMU NVMe device to 9 (max 2MiB request size), as that seems to drop the reproduction rate of the issue. Discussion is ongoing with Microsoft to get the issue identified and possibly sorted on their end. For the time being apply this mitigation in qemu-wrapper as a workaround. Signed-off-by: Roger Pau Monné <[email protected]>
|
|
||
| if p == "-device": | ||
| if qemu_args[n + 1].startswith('nvme,'): | ||
| qemu_args[n + 1] += ',mdts=9' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also change xc/device.ml and change nvme,serial=nvme0,... to nvme,serial=nvme0,mdts=9,... around line 2739 in let extra_args
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that would end up appending mdts=9 twice to the command line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion on whether to do it here or in Ocaml code, but if it's Ocaml code I would defer to someone more familiar with XAPI than me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diff --git a/ocaml/xenopsd/xc/device.ml b/ocaml/xenopsd/xc/device.ml
index 73f136feca..9beeecf436 100644
--- a/ocaml/xenopsd/xc/device.ml
+++ b/ocaml/xenopsd/xc/device.ml
@@ -2774,7 +2774,7 @@ module Backend = struct
]
(* 4 and 5 are NICs, and we can only have two, 6 is platform *)
- let extra_args = ["-device"; "nvme,serial=nvme0,id=nvme0,addr=7"]
+ let extra_args = ["-device"; "nvme,serial=nvme0,mdts=9,id=nvme0,addr=7"]
endJust like this? In that case we should do it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a PR with that change: #6783
|
We are going to use #6783 |
The current default value for the NVMe MDTS parameter exposed in QEMU emulated NMVe devices is 7 (max 512KiB requests). However there seems to be an internal Windows Server 2025 issue that possibly triggers when splitting bigger requests into smaller on in the NVMe Windows driver.
Increase the exposed MDTS value on the emulated QEMU NVMe device to 9 (max 2MiB request size), as that seems to drop the reproduction rate of the issue.
Discussion is ongoing with Microsoft to get the issue identified and possibly sorted on their end. For the time being apply this mitigation in qemu-wrapper as a workaround.