`ot_spi_device`: discard all packets for a failed transaction #263

jwnrt · 2025-10-28T15:38:21Z

This changes the SPI device's discarding behaviour in two ways:

The byte count is correctly adjusted for discarded bytes so we can receive a new packet when it's finished.
We discard subsequent packets in a transaction if one of them fails. This ensures we treat the transaction as one and don't stop discarding at an arbitrary packet boundary.

This is motivated by seeing that QEMU never stops discarding after it receives an unrecognised flash command as part of a multi-packet transaction (e.g. a "read" transaction which sends a "write read command" packet with EOT=0 followed by a "read bytes" packet with EOT=1).
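
As an illustration of the first change, here is a minimal sketch, not the actual `ot_spi_device` code. The `Packet` struct, its fields and `packet_consume` are hypothetical names. The only point is that discarded bytes must still be subtracted from the outstanding byte count; otherwise the end of the packet is never detected and the device keeps discarding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-packet bookkeeping; not the real ot_spi_device layout. */
typedef struct {
    size_t remaining; /* payload bytes still expected for this packet */
    bool discarding;  /* true once the payload must be thrown away */
    size_t stored;    /* bytes actually kept (stand-in for a real buffer) */
} Packet;

/*
 * Consume `len` incoming bytes. Returns true when the packet is complete,
 * i.e. when the next header may be parsed.
 */
bool packet_consume(Packet *p, size_t len)
{
    assert(len <= p->remaining);
    if (!p->discarding) {
        p->stored += len;
    }
    /* Discarded bytes count too: without this, `remaining` never reaches 0. */
    p->remaining -= len;
    return p->remaining == 0;
}
```

With this accounting, a discarded 8-byte payload delivered as 5 + 3 bytes reports completion on the second call, just as an accepted payload would.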

rivos-eblot · 2025-10-28T15:53:03Z

I do understand the rationale.
However, would there be a way to implement this feature using a BUS_STATE state rather than another boolean?

jwnrt · 2025-10-28T15:57:04Z

I tried that implementation, but the state ended up identical to IDLE except for the transition to DISCARD in handle_header.

I don't mind switching to a state if you'd prefer

rivos-eblot · 2025-10-28T16:08:07Z

I don't mind switching to a state if you'd prefer

If it is not too much work and does not clutter the code, I would say it is preferable, as it is always easier to debug with a single state than to track a combination of multiple variables. If this ends up being harder to read, please leave it as is.

jwnrt · 2025-10-28T16:21:35Z

I've made the change and it passes my test, though I may not have got the ideal implementation

edit: oops, let me fix formatting

rivos-eblot

I think you also want to add

 * Copyright (c) 2025 lowRISC contributors.

to the file header.

hw/opentitan/ot_spi_device.c

AlexJones0

Looks good to me after @rivos-eblot's comments are addressed.

It might also be nice to quickly document this behaviour in the "SPI device CharDev protocol" section of docs/opentitan/spi_device.md, given that the protocol is described in detail (including handling error states) there?

jwnrt · 2025-10-29T10:15:22Z

Rebased on TPM changes

rivos-eblot

I think I totally lost track of what we want to achieve and how we handle it.
DISCARD_PACKET may not be helping, as we actively store the incoming packet.

I really need to spend some time to understand this new state machine to be sure it does not introduce unexpected behavior.

I think this part is the most confusing:

    switch (bus->state) {
    case SPI_BUS_IDLE:
    case SPI_BUS_FLASH:
    case SPI_BUS_DISCARD_PACKET:
        BUS_CHANGE_STATE(s, IDLE);
        break;
    case SPI_BUS_DISCARD:
    case SPI_BUS_ERROR:
        BUS_CHANGE_STATE(s, DISCARD_PACKET);
        break;
    default:
        g_assert_not_reached();
        break;
    }

I do not think we should return to IDLE until the full transaction is over. Maybe that is the case here, I'm not sure (at the very least, we need some comments).
Moving from ERROR to DISCARD_PACKET is certainly wrong. For example, ERROR is set when the incoming header is invalid, which means there is no way we can recover from it, as we no longer understand the content of the incoming data.

Maybe switching to a new state was not a good idea, I'm not sure, but what is missing is that once we enter the state where we want to discard everything until the last byte of the last packet of a transaction is received (and the same number of 0xFF bytes is sent back), we should not return to a regular state. Maybe we need one or more extra states to manage:

ignoring bytes till the end of the current packet (DISCARD_PACKET)
entering a special state to handle the header of the next packet if the transaction to discard is not over (IDLE_IGNORE?), only entering IDLE once the last byte of the packet of the ignored transaction has been received
entering the above state when a new packet is received (DISCARD_PACKET)

DISCARD might also need to be renamed or removed: entering this state follows a flash error, so I guess that we want to enter DISCARD_PACKET in this case. It does not seem DISCARD is needed anymore.
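
One possible reading of the states suggested above, as a minimal sketch. The enum values, `on_header`, `on_packet_end` and the `header_valid` / `cmd_ok` parameters are hypothetical and only model the transitions; they are not taken from the actual source. DISCARD is dropped here, as suggested.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    BUS_IDLE,           /* waiting for the header of a new packet */
    BUS_FLASH,          /* processing the payload of a flash packet */
    BUS_DISCARD_PACKET, /* ignoring bytes until the end of the current packet */
    BUS_IDLE_IGNORE,    /* waiting for the next header of a discarded transaction */
    BUS_ERROR,          /* invalid header: latched until device reset */
} BusState;

/* State after a packet header has been parsed. */
BusState on_header(BusState s, bool header_valid, bool cmd_ok)
{
    /* Headers are only parsed between packets (or ignored once in error). */
    assert(s == BUS_IDLE || s == BUS_IDLE_IGNORE || s == BUS_ERROR);
    if (s == BUS_ERROR || !header_valid) {
        return BUS_ERROR; /* the stream can no longer be understood */
    }
    if (s == BUS_IDLE_IGNORE || !cmd_ok) {
        return BUS_DISCARD_PACKET;
    }
    return BUS_FLASH;
}

/* State after the last byte of a packet; `eot` is that packet's EOT bit. */
BusState on_packet_end(BusState s, bool eot)
{
    switch (s) {
    case BUS_FLASH:
        return BUS_IDLE;
    case BUS_DISCARD_PACKET:
        /* Only return to IDLE once the discarded transaction is over. */
        return eot ? BUS_IDLE : BUS_IDLE_IGNORE;
    default:
        return s; /* ERROR stays latched; idle states are unaffected */
    }
}
```

In this model a soft error never reaches IDLE before a packet with EOT=1 has been consumed, and ERROR has no outgoing transition other than a reset (not modelled).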

jwnrt · 2025-10-29T12:46:18Z

Let me check my understanding:

When errors occur on malformed packets, we want to remain latched in the error state until reset.
When "soft" errors occur (e.g. unrecognised commands) we want to discard all packets until the end of transaction.

So perhaps we need two state machines, one for the transaction and one for the current packet.

The transaction might have states:

IDLE: waiting for a new transaction.
DISCARD: discard all bytes until end of transaction.
ERROR: latch until reset, stop parsing packets.
FLASH: processing a flash transaction.
TPM: processing a TPM transaction.

The packet might have states:

IDLE: waiting for a new packet.
DISCARD: discard remaining bytes in packet.
ACCEPT/PROCESS: process flash or TPM data.

rivos-eblot · 2025-10-29T13:22:34Z

Let me check my understanding:

When errors occur on malformed packets, we want to remain latched in the error state until reset.

When "soft" errors occur (e.g. unrecognised commands) we want to discard all packets until the end of transaction.

Yes, I think so.

So perhaps we need two state machines, one for the transaction and one for the current packet.

If you think it is easier, why not.

Questions:

Why would there be a differentiation at transaction level between flash and TPM modes?
What would be the use case for DISCARD at packet level? It seems that if some unexpected data is received, it should be propagated to the end of the transaction (?)

jwnrt · 2025-10-29T13:37:29Z

Why would there be a differentiation at transaction level between flash and TPM modes?

@engdoreis can help me out here, but I think the TPM and flash have separate buffers and configuration registers that influence how incoming data is processed. Apparently both TPM and flash can be enabled simultaneously, but the chip select determines which mode to use for a transaction. We don't want to accept transactions which interleave flash and TPM packets, so we need to remember which mode the transaction is in.

What would be the use case for DISCARD at packet level? It seems that if some unexpected data is received, it should be propagated to the end of the transaction (?)

My intention was to handle a case like this:

First packet arrives with an unknown command. EOT=0.
This is a soft error, so we enter DISCARD at both the transaction and packet level.
At the end of the packet, we return to IDLE at the packet level but retain DISCARD at the transaction level.
The next packet arrives, but we transition directly to DISCARD at the packet level instead of ACCEPT.

I think I'm not understanding what you mean by propagating the data to the end of the transaction. My plan was for DISCARD at the transaction level to remember that we need to discard all subsequent packets, and DISCARD at the packet level to remember that we need to discard all remaining bytes in the current packet.
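
The case above can be sketched with two small state machines. This is only an illustration under stated assumptions: the `Bus` struct, `bus_header`, `bus_packet_end` and the `malformed` / `known_cmd` parameters are hypothetical names, and the TPM state is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>

/* Transaction level: persists across packets until EOT=1. */
typedef enum { TXN_IDLE, TXN_DISCARD, TXN_ERROR, TXN_FLASH } TxnState;
/* Packet level: reset at the end of every packet. */
typedef enum { PKT_IDLE, PKT_DISCARD, PKT_ACCEPT } PktState;

typedef struct {
    TxnState txn;
    PktState pkt;
} Bus;

/* A packet header has been received. */
void bus_header(Bus *b, bool malformed, bool known_cmd)
{
    if (b->txn == TXN_ERROR) {
        return; /* latched until reset: stop parsing packets */
    }
    assert(b->pkt == PKT_IDLE);
    if (malformed) {
        b->txn = TXN_ERROR; /* hard error */
        return;
    }
    if (b->txn == TXN_DISCARD || !known_cmd) {
        /* Soft error, now or earlier in this transaction. */
        b->txn = TXN_DISCARD;
        b->pkt = PKT_DISCARD;
        return;
    }
    b->txn = TXN_FLASH;
    b->pkt = PKT_ACCEPT;
}

/* The last byte of the current packet has been received. */
void bus_packet_end(Bus *b, bool eot)
{
    if (b->txn == TXN_ERROR) {
        return;
    }
    b->pkt = PKT_IDLE;
    if (eot) {
        b->txn = TXN_IDLE; /* the transaction is over */
    }
}
```

The transaction-level DISCARD remembers that all subsequent packets must be dropped, while the packet-level DISCARD only covers the remaining bytes of the current packet.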

engdoreis · 2025-10-29T13:53:34Z

Apparently both TPM and flash can be enabled simultaneously, but the chip select determines which mode to use for a transaction

Yes, that's correct, that is how the HW is implemented: it's possible to enable flash mode and TPM mode at the same time, and the CS that is asserted determines how the incoming packet is processed.

rivos-eblot · 2025-10-29T14:12:21Z

@engdoreis can help me out here, but I think the TPM and flash have separate buffers and configuration registers that influence how incoming data is processed. Apparently both TPM and flash can be enabled simultaneously, but the chip select determines which mode to use for a transaction. We don't want to accept transactions which interleave flash and TPM packets, so we need to remember which mode the transaction is in.

If I got it right, as the SPI device CharDev protocol does not support this feature yet, and it has been chosen to not support it for now, I think this use case can be dismissed for now.

First packet arrives with an unknown command. EOT=0.

This is a soft error, so we enter DISCARD at both the transaction and packet level.

At the end of the packet, we return to IDLE at the packet level but retain DISCARD at the transaction level.

The next packet arrives, but we transition directly to DISCARD at the packet level instead of ACCEPT.

I'm really not sure this can be handled with 2 SMs as it seems both are inter-dependent, and/or I think the naming is confusing.

The transport level should decode the header, enter a fatal error state if some byte is deemed invalid, forward the payload, and inform the next layer whether it is the last packet of a transaction. So it needs an ERROR state for whenever it is no longer able to parse the incoming stream, and it should leave this ERROR state only on SPI Device reset. I do not think it needs a "DISCARD" state.

The next layer, and here starts my confusion about the naming, should enter the FLASH or TPM mode, and enter a DISCARD mode if the current command no longer accepts extra bytes for the current packet. I'm not sure whether it needs both ERROR and DISCARD modes in this case.

Please ignore my last response regarding the data, it is meaningless.

ziuziakowska · 2025-10-29T19:12:32Z

We would never want to discard only some bytes in the middle of a transaction that has been marked as erroneous, and discarding should persist until after a packet with EOT=1 (which represents the de-assertion of chip select), so I understand the rationale of this change, but I think an additional DISCARD_PACKET state makes things a lot more confusing and risks making the DISCARD state redundant.

I think the onus should fall on the IDLE state here in ot_spi_device_chr_handle_header - it is solely responsible for parsing the header and dispatching to the different states such as Flash or TPM. If the previous transaction packet has been sent with EOT=0 and the bus ended in the DISCARD state, that should set a flag that makes this function dispatch into DISCARD until after it has received a packet with EOT=1.
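
A minimal sketch of that idea, assuming hypothetical names throughout (`Dev`, `discard_next`, `handle_header`, `packet_done`); it is not the actual `ot_spi_device_chr_handle_header` code and only models the dispatch decision.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ST_IDLE, ST_FLASH, ST_DISCARD } State;

typedef struct {
    State state;
    /* Set when a packet with EOT=0 ended while the bus was in DISCARD. */
    bool discard_next;
} Dev;

/* Called in IDLE once a header is available; dispatches to the next state. */
void handle_header(Dev *d, bool known_cmd)
{
    assert(d->state == ST_IDLE);
    if (d->discard_next || !known_cmd) {
        d->state = ST_DISCARD;
    } else {
        d->state = ST_FLASH;
    }
}

/* Called when the last byte of a packet has been consumed. */
void packet_done(Dev *d, bool eot)
{
    /* Keep discarding until a packet with EOT=1 (CS de-asserted) is seen. */
    d->discard_next = (d->state == ST_DISCARD) && !eot;
    d->state = ST_IDLE;
}
```

The state machine itself stays unchanged; the flag only influences which state IDLE dispatches to for the next header.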

jwnrt · 2025-10-30T11:02:28Z

I have reverted back to the boolean flag separate from the state machine, but maybe the SPI device does need overhauling separately

rivos-eblot

LGTM (not tested)

jwnrt requested review from AlexJones0 and rivos-eblot October 28, 2025 15:38

jwnrt force-pushed the jw/spi-device-discard branch from 9109cba to 3354aca Compare October 28, 2025 16:21

jwnrt force-pushed the jw/spi-device-discard branch from 3354aca to c1970f6 Compare October 28, 2025 16:23

rivos-eblot requested changes Oct 28, 2025

AlexJones0 approved these changes Oct 28, 2025

jwnrt force-pushed the jw/spi-device-discard branch from c1970f6 to ea8242e Compare October 29, 2025 09:29

jwnrt requested a review from rivos-eblot October 29, 2025 09:29

jwnrt force-pushed the jw/spi-device-discard branch from ea8242e to d5144c4 Compare October 29, 2025 10:15

jwnrt requested a review from AlexJones0 October 29, 2025 10:17

rivos-eblot reviewed Oct 29, 2025

jwnrt force-pushed the jw/spi-device-discard branch 2 times, most recently from 9109cba to b46c1d6 Compare October 30, 2025 10:37

rivos-eblot reviewed Oct 30, 2025

jwnrt added 2 commits October 30, 2025 15:59

[ot] hw/opentitan: spi_device: correctly adjust for discarded bytes

e60b225

We must adjust the byte count by the number discarded so that we can return to `IDLE` when the packet ends. Signed-off-by: James Wainwright <[email protected]>

[ot] hw/opentitan: spi_device: discard all packets in failed transaction

8a9f6c3

If one packet in a transaction triggers an error and needs discarding, we must continue discarding the remaining packets until CS is released. Signed-off-by: James Wainwright <[email protected]>

jwnrt force-pushed the jw/spi-device-discard branch from b46c1d6 to 8a9f6c3 Compare October 30, 2025 15:59

jwnrt requested a review from rivos-eblot October 30, 2025 16:00

rivos-eblot approved these changes Oct 30, 2025

AlexJones0 approved these changes Oct 30, 2025

jwnrt merged commit bd6d66f into lowRISC:ot-9.2.0 Oct 30, 2025
12 of 13 checks passed

jwnrt deleted the jw/spi-device-discard branch October 30, 2025 23:59

AlexJones0 mentioned this pull request Nov 3, 2025

[qemu] Implement opentitantool qemu transport lowRISC/opentitan#27909

Open

8 tasks
