Commit 6a3ea3e

Sean Christopherson authored and KAGA-KOKO committed
x86/entry/64: Do not use RDPID in paranoid entry to accommodate KVM

KVM has an optimization to avoid expensive MSR reads/writes on VMENTER/VMEXIT. It caches the MSR values and restores them either when leaving the run loop, on preemption, or when going out to user space.

The affected MSRs are not required for kernel context operations. This changed with the recently introduced mechanism to handle FSGSBASE in the paranoid entry code, which has to retrieve the kernel GSBASE value by accessing per-CPU memory. The mechanism needs to retrieve the CPU number and uses either LSL or, if the processor supports it, RDPID.

Unfortunately RDPID uses MSR_TSC_AUX, which is in the list of cached and lazily restored MSRs. This means that between the point where the guest value is written and the point of restore, MSR_TSC_AUX contains an arbitrary value chosen by the guest.

If an NMI or any other exception which uses the paranoid entry path happens in such a context, then RDPID returns the guest's MSR_TSC_AUX value. As a consequence, the entry code reads from the wrong memory location to retrieve the kernel GSBASE value. Kernel GS is used for all regular this_cpu_*() operations. If the GSBASE in the exception handler points to the per-CPU memory of a different CPU, the result is data corruption and crashes.

As the paranoid entry path is the only place which accesses MSR_TSC_AUX (via RDPID) and the fallback via LSL is not significantly slower, remove the RDPID alternative from the entry path and always use LSL.

The alternative would be to write MSR_TSC_AUX on every VMENTER and VMEXIT, which would inflict massive overhead on that code path.

[ tglx: Rewrote changelog ]

Fixes: eaad981 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Reported-by: Tom Lendacky <[email protected]>
Debugged-by: Tom Lendacky <[email protected]>
Suggested-by: Andy Lutomirski <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

1 parent 50f6c7d commit 6a3ea3e
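
The failure described in the changelog can be illustrated with a small user-space model in C. This is only a sketch, not kernel or KVM code: `struct cpu_model` and all function names are hypothetical, and the mask value 0xfff is an assumption about what the kernel's VDSO_CPUNODE_MASK is defined as. The model keeps two sources for the CPU number: the segment limit that LSL reads, which KVM never changes, and MSR_TSC_AUX, which RDPID reads and which KVM restores lazily.

```c
#include <stdint.h>

/* Assumed to correspond to the kernel's VDSO_CPUNODE_MASK. */
#define MODEL_CPUNODE_MASK 0xfffu

/* Hypothetical model of the per-CPU state relevant to this bug. */
struct cpu_model {
    uint32_t gdt_cpunode_limit; /* value LSL would return; untouched by KVM */
    uint32_t tsc_aux;           /* value RDPID would return (MSR_TSC_AUX)   */
    uint32_t host_tsc_aux;      /* host value cached for the lazy restore   */
};

/* KVM loads the guest's TSC_AUX and remembers the host value. */
static void kvm_enter_guest(struct cpu_model *c, uint32_t guest_tsc_aux)
{
    c->host_tsc_aux = c->tsc_aux;
    c->tsc_aux = guest_tsc_aux;
}

/* The host value only comes back when KVM leaves the run loop,
 * is preempted, or returns to user space. */
static void kvm_lazy_restore(struct cpu_model *c)
{
    c->tsc_aux = c->host_tsc_aux;
}

/* CPU number as the removed RDPID alternative would compute it. */
static uint32_t cpu_via_rdpid(const struct cpu_model *c)
{
    return c->tsc_aux & MODEL_CPUNODE_MASK;
}

/* CPU number as the LSL path computes it. */
static uint32_t cpu_via_lsl(const struct cpu_model *c)
{
    return c->gdt_cpunode_limit & MODEL_CPUNODE_MASK;
}
```

Under this model, if CPU 3 calls `kvm_enter_guest()` with a guest value of 7 and an NMI arrives before `kvm_lazy_restore()` runs, `cpu_via_rdpid()` yields 7 while `cpu_via_lsl()` still yields 3. Indexing the per-CPU offset table with 7 is the wrong-GSBASE case the changelog describes.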

1 file changed: +6 -4 lines changed

arch/x86/entry/calling.h

@@ -374,12 +374,14 @@ For 32-bit we have the following conventions - kernel is built with
  * Fetch the per-CPU GSBASE value for this processor and put it in @reg.
  * We normally use %gs for accessing per-CPU data, but we are setting up
  * %gs here and obviously can not use %gs itself to access per-CPU data.
+ *
+ * Do not use RDPID, because KVM loads guest's TSC_AUX on vm-entry and
+ * may not restore the host's value until the CPU returns to userspace.
+ * Thus the kernel would consume a guest's TSC_AUX if an NMI arrives
+ * while running KVM's run loop.
  */
 .macro GET_PERCPU_BASE reg:req
-	ALTERNATIVE \
-		"LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
-		"RDPID	\reg", \
-		X86_FEATURE_RDPID
+	LOAD_CPU_AND_NODE_SEG_LIMIT \reg
 	andq	$VDSO_CPUNODE_MASK, \reg
 	movq	__per_cpu_offset(, \reg, 8), \reg
 .endm
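
For readers less familiar with the assembly, the last two instructions of GET_PERCPU_BASE can be sketched in C as a mask followed by a table lookup. This is a rough user-space illustration, not the kernel's implementation: the names `DEMO_CPUNODE_BITS`, `DEMO_CPUNODE_MASK`, `demo_pack_cpunode` and `demo_percpu_base` are hypothetical, and the 12-bit CPU field with the node number stored above it is an assumption about how the kernel encodes the value that LSL returns.

```c
#include <stdint.h>

/* Assumed layout: low 12 bits hold the CPU number, the bits above hold the node. */
#define DEMO_CPUNODE_BITS 12u
#define DEMO_CPUNODE_MASK 0xfffu

/* Build the packed value that the segment limit is assumed to carry. */
static inline uint32_t demo_pack_cpunode(uint32_t cpu, uint32_t node)
{
    return (node << DEMO_CPUNODE_BITS) | (cpu & DEMO_CPUNODE_MASK);
}

/*
 * Mirrors the tail of the macro:
 *   andq $VDSO_CPUNODE_MASK, reg          -> keep only the CPU number
 *   movq __per_cpu_offset(, reg, 8), reg  -> load the 8-byte table entry
 * The table is passed in as a parameter here; in the kernel it is a global array.
 */
static inline uint64_t demo_percpu_base(uint32_t cpunode,
                                        const uint64_t *per_cpu_offset)
{
    uint32_t cpu = cpunode & DEMO_CPUNODE_MASK;
    return per_cpu_offset[cpu];
}
```

With a four-entry table, `demo_percpu_base(demo_pack_cpunode(2, 1), table)` returns `table[2]`: the node bits are masked away and only the CPU number selects the entry. The lookup is only as trustworthy as the CPU number fed into it, which is why the source of that number matters in the paranoid entry path.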
