-
Notifications
You must be signed in to change notification settings - Fork 132
storage: storvsc user-mode implementation #1256
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
struct StorvscInner { | ||
next_transaction_id: AtomicU64, | ||
new_request_receiver: Receiver<StorvscRequest>, | ||
transactions: Mutex<HashMap<u64, PendingOperation>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use a slab::Slab<PendingOperation>
instead of a HashMap
. The slab will give you a usize
key which you can use as the transaction ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see from the Slab docs that it can reuse keys for removed items. Could there be any issues with reused, non-sequential keys on the storvsp side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a protocol violation--storvsp can only send a completion for a transaction once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So does slab still work for this if it can reuse keys, given that it can result in duplicate transaction IDs?
Looking good! |
Pinging this issue because it still needs reviews |
async fn test_request_response(driver: DefaultDriver) { | ||
let (guest, host) = connected_async_channels(16 * 1024); | ||
let host_queue = Queue::new(host).unwrap(); | ||
let test_guest_mem = GuestMemory::allocate(16384); | ||
|
||
let storvsp = TestStorvspWorker::start( | ||
driver.clone(), | ||
test_guest_mem.clone(), | ||
host_queue, | ||
Vec::new(), | ||
); | ||
let mut storvsc = TestStorvscWorker::new(); | ||
storvsc.start(driver.clone(), guest); | ||
|
||
let mut timer = PolledTimer::new(&driver); | ||
timer.sleep(time::Duration::from_secs(1)).await; | ||
|
||
storvsc.stop().await; | ||
assert!(storvsc.get_mut().has_negotiated); | ||
storvsc.resume().await; | ||
|
||
// Send SCSI write request | ||
let write_buf = [7u8; 4096]; | ||
test_guest_mem.write_at(4096, &write_buf).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the same code that is in storage_tests/tests/storvsc.rs
. What's the difference between them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. These tests don't have a "host" side controller or disk, whereas the ones in storage_tests
do. I'm not sure what we get by having the tests with a backend disk. Presumably it sets us up for some more interesting tests in the future? Was that your intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests here are unit tests, which the ones in storage_test are integration tests. I think it is important to test the component in isolation, as well as with a validated storvsp to ensure proper protocol operations.
let (payload_bytes, padding_bytes) = if len > storvsp_protocol::SCSI_REQUEST_LEN_MAX { | ||
( | ||
&payload[..storvsp_protocol::SCSI_REQUEST_LEN_MAX - size_of_val(&header)], | ||
&[][..], | ||
) | ||
} else { | ||
( | ||
payload, | ||
&padding[..storvsp_protocol::SCSI_REQUEST_LEN_MAX - len], | ||
) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected us to just panic if the vmbus packet is bigger than what we expect for our protocol, since we're truncating it anyways in a way that is potentially semantically unsafe. But I imagine there are some quirks in the vstor protocol ( :) ), so what quirks are we hitting here that require the code to hande this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The storvsp implementation also does this unusual truncation:
openvmm/vm/devices/storage/storvsp/src/lib.rs
Lines 457 to 471 in 3c2dda8
// Zero pad or truncate the payload to the queue's packet size. This is | |
// necessary because Windows guests check that each packet's size is | |
// exactly the largest possible packet size for the negotiated protocol | |
// version. | |
let len = size_of_val(&header) + size_of_val(payload); | |
let padding = [0; storvsp_protocol::SCSI_REQUEST_LEN_MAX]; | |
let (payload_bytes, padding_bytes) = if len > packet_size { | |
(&payload[..packet_size - size_of_val(&header)], &[][..]) | |
} else { | |
(payload, &padding[..packet_size - len]) | |
}; | |
assert_eq!( | |
size_of_val(&header) + payload_bytes.len() + padding_bytes.len(), | |
packet_size | |
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but I don't think the storvsp
behavior is sufficient justification. They're different ends of the pipe, and storvsp
has to make sure that it is compatible with various guests.
I don't think we should truncate without understanding it is safe.
Furthermore, the padding is only necessary if storvsp
(in either the openvmm or Hyper-V implementations) require it for packets incoming to storvsp.
Silently truncating packets seems like a recipe for difficult debugging later. I'd rather panic
here and then know we need to re-add this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have looked over this code multiple times, including the comments in storvsp. The comments linked from storvsp talk about padding or truncating the payload specifically, but it seems to also consider the length of the header when padding/truncating. The size of a SCSI request (the payload) is 52 bytes, which is the length limited to in storvsp and here. Therefore, I believe the intention is to pad/truncate the payload only. I have modified the code here in storvsc (not in storvsp) to remove truncation (to avoid any hidden issues) and to pad only the payload. It is worth revisiting the code in storvsp to see if this is an error there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I think that's reasonable. Please ensure that this is covered in a test case, and then I'm happy to consider this resolved.
Overall this is looking quite good. I've left some suggestions and asked some questions to help me understand both your thinking and the design here. Thanks for working on this! I have ideas, but it is unclear to me how and where we plug in this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some questions and comments that I'd like you to address, but that can happen in a follow-up PR. Please review my comments before completing. Thanks for the hard work to get this in, Eric!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been mulling this over based on other comments that you made. I now understand the value here. I do think there are changes that you should make so that this is more useful to our testing:
- there are several cases where this code logs an
error
orwarning
due to unexpected state. Those should all beassert
orpanic
calls, so that we catch if storvsc is misbehaving instantly (rather than tracing a warning and returning an error packet back to storvsp). - the fake storvsp here does not provide configurability to test interesting cases (at least I don't think it does). For example: testing various subsets of interesting versions.
} | ||
|
||
#[derive(Debug)] | ||
#[allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is dead_code
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these fields are necessary because they are in the packet structure, but we don't actually end up using most of them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I don't see any value in keeping dead code around. This packet is not a byte representation of anything, so we don't need the padding.
However, I think it could be interesting to log the details when we see something unexpected, like in this case:
// Wait for completion
let completion = match self.next_packet(&mut reader).await? {
Packet::Completion(packet) => Ok(packet),
Packet::Data(_) => Err(StorvscError(StorvscErrorInner::PacketError(
PacketError::InvalidPacketType,
))),
}?;
No need to make a change to this PR, we can do this as a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fields will also likely be used if we add support for data packets that actually do something (and are not no-ops like EnumerateBus)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements the storvsc user‐mode driver (the client counterpart to storvsp) with updated feature flag usage and enhanced testing support. Key changes include renaming the "fuzz_helpers" feature to "test" in multiple Cargo.toml files and source code, adding new test helper functions for integration tests, and updating integration tests for storvsc and storvsp functionality.
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
vm/devices/storage/storvsp_protocol/Cargo.toml | Removed the public "fuzz_helpers" array in favor of the new "test" feature |
vm/devices/storage/storvsp/src/test_helpers.rs | Updated feature flag from "fuzz_helpers" to "test" and added a new teardown_or_panic method |
vm/devices/storage/storvsp/src/lib.rs | Replaced conditional compilation from "fuzz_helpers" to "test" for the test_helpers module |
vm/devices/storage/storvsp/fuzz/Cargo.toml | Changed dependency feature list from "fuzz_helpers" to "test" for storvsp |
vm/devices/storage/storvsc_driver/* | Added a new test_helpers module and updated integration test helpers with corresponding feature flag changes |
vm/devices/storage/storage_tests/tests/{storvsc.rs, tests.rs} | Added and updated integration tests exercising storvsp and storvsc workflows |
Comments suppressed due to low confidence (2)
vm/devices/storage/storvsp_protocol/Cargo.toml:10
- Removing the fuzz_helpers field in favor of the new 'test' feature is fine, but please ensure that downstream consumers are updated to refer to 'test' consistently in their documentation and configuration.
# Enable generating arbitrary values of types useful for fuzzing.
vm/devices/storage/storvsp/src/test_helpers.rs:52
- [nitpick] The conversion from 'fuzz_helpers' to 'test' improves clarity but the name 'test' can be ambiguous because of the built-in #[cfg(test)] attribute; consider a more descriptive feature name such as 'integration_tests' if applicable.
#[cfg(feature = "test")]
self.inner | ||
.send_packet_and_expect_completion( | ||
&mut self.queue, | ||
storvsp_protocol::Operation::BEGIN_INITIALIZATION, | ||
1, | ||
&(), | ||
) | ||
.await?; | ||
|
||
// Step 2: QUERY_PROTOCOL_VERSION - request latest version | ||
self.inner | ||
.send_packet_and_expect_completion( | ||
&mut self.queue, | ||
storvsp_protocol::Operation::QUERY_PROTOCOL_VERSION, | ||
2, | ||
&self.version, | ||
) | ||
.await | ||
.map_err(|err| match err { | ||
StorvscError(StorvscErrorInner::PacketError(PacketError::UnexpectedStatus( | ||
storvsp_protocol::NtStatus::INVALID_DEVICE_STATE, | ||
))) => StorvscError(StorvscErrorInner::UnsupportedProtocolVersion), | ||
_ => err, | ||
})?; | ||
|
||
// Step 3: QUERY_PROPERTIES | ||
let properties_packet = self | ||
.inner | ||
.send_packet_and_expect_completion( | ||
&mut self.queue, | ||
storvsp_protocol::Operation::QUERY_PROPERTIES, | ||
3, | ||
&(), | ||
) | ||
.await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The current pattern of sending the request and immediately propagating Ok(()) using '?' could be refactored to directly propagate the result of the send operation to enhance clarity and maintainability.
self.inner | |
.send_packet_and_expect_completion( | |
&mut self.queue, | |
storvsp_protocol::Operation::BEGIN_INITIALIZATION, | |
1, | |
&(), | |
) | |
.await?; | |
// Step 2: QUERY_PROTOCOL_VERSION - request latest version | |
self.inner | |
.send_packet_and_expect_completion( | |
&mut self.queue, | |
storvsp_protocol::Operation::QUERY_PROTOCOL_VERSION, | |
2, | |
&self.version, | |
) | |
.await | |
.map_err(|err| match err { | |
StorvscError(StorvscErrorInner::PacketError(PacketError::UnexpectedStatus( | |
storvsp_protocol::NtStatus::INVALID_DEVICE_STATE, | |
))) => StorvscError(StorvscErrorInner::UnsupportedProtocolVersion), | |
_ => err, | |
})?; | |
// Step 3: QUERY_PROPERTIES | |
let properties_packet = self | |
.inner | |
.send_packet_and_expect_completion( | |
&mut self.queue, | |
storvsp_protocol::Operation::QUERY_PROPERTIES, | |
3, | |
&(), | |
) | |
.await?; | |
self.inner.send_packet_and_expect_completion( | |
&mut self.queue, | |
storvsp_protocol::Operation::BEGIN_INITIALIZATION, | |
1, | |
&(), | |
).await?; | |
// Step 2: QUERY_PROTOCOL_VERSION - request latest version | |
self.inner.send_packet_and_expect_completion( | |
&mut self.queue, | |
storvsp_protocol::Operation::QUERY_PROTOCOL_VERSION, | |
2, | |
&self.version, | |
).await.map_err(|err| match err { | |
StorvscError(StorvscErrorInner::PacketError(PacketError::UnexpectedStatus( | |
storvsp_protocol::NtStatus::INVALID_DEVICE_STATE, | |
))) => StorvscError(StorvscErrorInner::UnsupportedProtocolVersion), | |
_ => err, | |
})?; | |
// Step 3: QUERY_PROPERTIES | |
let properties_packet = self.inner.send_packet_and_expect_completion( | |
&mut self.queue, | |
storvsp_protocol::Operation::QUERY_PROPERTIES, | |
3, | |
&(), | |
).await?; |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i reviewed only the dependency changes
This is an implementation of storvsc (client complement to storvsp) for SCSI storage over VMBus.
#273