Skip to content

Conversation

@sunceping
Copy link
Owner

@sunceping sunceping commented Dec 13, 2024

Refer to intel-tdx-module-api spec section 5.5.21, GPR mask
(TDVMCALL_EXPOSE_REGS_MASK) is a bitmap that controls which
part of the guest TD GPR and XMM state is passed as-is to
the VMM and back.

  • A bit value of 0 indicates that the corresponding register
    is saved by the Intel TDX module and not passed as-is to
    Host VMM.
  • A bit value of 1 indicates that the corresponding register
    is passed as-is to the host VMM.

Currently, RBX is used as the mailbox address in ApRunLoop.nasm,
the corresponding bit value of RBX in MASK(Bit 3) is set as 1 which
means the value is passed to Host VMM as-is and it can be changed by
Host VMM.

So the bitmask shall be set as 0 to avoid this situation.

Reference:
[TDX-API]: intel-tdx-module-abi-spec
https://cdrdv2.intel.com/v1/dl/getContent/733579

Cc: Erdem Aktas [email protected]
Cc: Jiewen Yao [email protected]
Cc: Min Xu [email protected]
Cc: Gerd Hoffmann [email protected]
Cc: Elena Reshetova [email protected]

Description

<Include a description of the change and why this change was made.>

<For each item, place an "x" in between [ and ] if true. Example: [x] (you can also check items in GitHub UI)>

<Create the PR as a Draft PR if it is only created to run CI checks.>

<Delete lines in <> tags before creating the PR.>

  • Breaking change?
    • Breaking change - Does this PR cause a break in build or boot behavior?
    • Examples: Does it add a new library class or move a module to a different repo.
  • Impacts security?
    • Security - Does this PR have a direct security impact?
    • Examples: Crypto algorithm change or buffer overflow fix.
  • Includes tests?
    • Tests - Does this PR include any explicit test code?
    • Examples: Unit tests or integration tests.

How This Was Tested

Tested on intel platform without problem.

Integration Instructions

<Describe how these changes should be integrated. Use N/A if nothing is required.>

@sunceping sunceping force-pushed the updateMaskinApRunLoop.v1 branch from b296ba0 to b5a6345 Compare December 13, 2024 02:46
Refer to intel-tdx-module-api spec section 5.5.21, GPR mask
(TDVMCALL_EXPOSE_REGS_MASK) is a bitmap that controls which
part of the guest TD GPR and XMM state is passed as-is to
the VMM and back.
- A bit value of 0 indicates that the corresponding register
   is saved by the Intel TDX module and not passed as-is to
   Host VMM.
- A bit value of 1 indicates that the corresponding register
   is passed as-is to the host VMM.

Currently, RBX is used as the mailbox address in ApRunLoop.nasm,
the corresponding bit value of RBX in MASK(Bit 3) is set as 1 which
means the value is passed to Host VMM as-is and it can be changed by
Host VMM.

So the bitmask shall be set as 0 to avoid this situation.

Reference:
[TDX-API]: intel-tdx-module-abi-spec
https://cdrdv2.intel.com/v1/dl/getContent/733579

Cc: Erdem Aktas <[email protected]>
Cc: Jiewen Yao <[email protected]>
Cc: Min Xu <[email protected]>
Cc: Gerd Hoffmann <[email protected]>
Cc: Hunter Adrian <[email protected]>
Signed-off-by: Ceping Sun <[email protected]>
@sunceping sunceping force-pushed the updateMaskinApRunLoop.v1 branch from b5a6345 to 10b9439 Compare December 13, 2024 02:59
@sunceping sunceping closed this Dec 13, 2024
@sunceping sunceping reopened this Jun 17, 2025
@sunceping sunceping closed this Jun 17, 2025
sunceping pushed a commit that referenced this pull request Jul 1, 2025
…eEntry

This patch introduces a synchronization point between the BSP and APs to
ensure all APs have entered their SMM wait-loop (while (TRUE) in APHandler
()) before the BSP calls into the SMI handler logic via gSmmCpuPrivate
->SmmCoreEntry().

Previously, the BSP would invoke ReleaseAllAPs() and immediately proceed
to SmmCoreEntry() without confirming whether APs had reached the stable
waiting state. If SmmStartupThisAp() was called inside the SMI handler
shortly after ReleaseAllAPs(), it might lead to a race condition:
APs are issued two consecutive wait signals (SmmCpuSyncWaitForBsp()).
BSP sends two consecutive releases (ReleaseAllAPs() + SmmStartupThisAp())
If an AP has not yet responded to the first release, the second release may
overwrite the semaphore state, and the AP might miss the notification,
causing it to hang or behave unpredictably.

To address this:
A SmmCpuSyncWaitForAPs() is added in BSP after
mmCpuPlatformHookBeforeMmiHandler() and before entering SmmCoreEntry().
A matching SmmCpuSyncReleaseBsp() is added in AP immediately after its own
SmmCpuPlatformHookBeforeMmiHandler()

This ensures that BSP does not enter SMI handler logic or dispatch any
AP-related requests before all APs are confirmed to be idle and ready.
Debug sync point markers (e.g., /// #6, #7) are updated accordingly.
This change eliminates a subtle but critical race condition in
multi-processor/multi-socket systems during SMM entry and improves overall
synchronization safety.

Signed-off-by: Wei6 Xu <[email protected]>
mergify bot pushed a commit that referenced this pull request Jul 2, 2025
…eEntry

This patch introduces a synchronization point between the BSP and APs to
ensure all APs have entered their SMM wait-loop (while (TRUE) in APHandler
()) before the BSP calls into the SMI handler logic via gSmmCpuPrivate
->SmmCoreEntry().

Previously, the BSP would invoke ReleaseAllAPs() and immediately proceed
to SmmCoreEntry() without confirming whether APs had reached the stable
waiting state. If SmmStartupThisAp() was called inside the SMI handler
shortly after ReleaseAllAPs(), it might lead to a race condition:
APs are issued two consecutive wait signals (SmmCpuSyncWaitForBsp()).
BSP sends two consecutive releases (ReleaseAllAPs() + SmmStartupThisAp())
If an AP has not yet responded to the first release, the second release may
overwrite the semaphore state, and the AP might miss the notification,
causing it to hang or behave unpredictably.

To address this:
A SmmCpuSyncWaitForAPs() is added in BSP after
mmCpuPlatformHookBeforeMmiHandler() and before entering SmmCoreEntry().
A matching SmmCpuSyncReleaseBsp() is added in AP immediately after its own
SmmCpuPlatformHookBeforeMmiHandler()

This ensures that BSP does not enter SMI handler logic or dispatch any
AP-related requests before all APs are confirmed to be idle and ready.
Debug sync point markers (e.g., /// #6, #7) are updated accordingly.
This change eliminates a subtle but critical race condition in
multi-processor/multi-socket systems during SMM entry and improves overall
synchronization safety.

Signed-off-by: Wei6 Xu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants