Message ID | 20220424070951.106990-2-darcy.sh@antgroup.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [kvm-unit-tests,1/2] x86: replace `push` `pop` with callee-clobbered list | expand |
On Sun, Apr 24, 2022, SU Hang wrote: Why? As gross as it is, I actually think INTn is a better option because it doesn't require writing multiple MSRs, and can work for both 64-bit and 32-bit KUT. The latter is currently a moot point since this code is 64-bit only, but the UMIP test _does_ support 32-bit, and it's do_ring3() should really be rolled into this framework. Furthermore, we really should have a test to verify that KVM correctly emulates SYSCALL at CPL3 with EFER.SCE=0, and forcing EFER.SCE=1 just to get to CPL3 would make it impossible to utilize this framework for such a test. > Signed-off-by: SU Hang <darcy.sh@antgroup.com> > --- > lib/x86/usermode.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c > index 477cb9f..e4cb899 100644 > --- a/lib/x86/usermode.c > +++ b/lib/x86/usermode.c > @@ -12,7 +12,6 @@ > #include <stdint.h> > > #define USERMODE_STACK_SIZE 0x2000 > -#define RET_TO_KERNEL_IRQ 0x20 > > static jmp_buf jmpbuf; > > @@ -40,9 +39,11 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, > static unsigned char user_stack[USERMODE_STACK_SIZE]; > > *raised_vector = 0; > - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); > handle_exception(fault_vector, > restore_exec_to_jmpbuf_exception_handler); > + wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_SCE); Tangentially related, KUT should really explicitly initialize EFER.SCE during boot. cstart64.S leaves it alone, which means KUT has no defined value for EFER.SCE. One thought would be to set EFER.SCE=1 during boot (in a separate patch) and remove the code from the syscall test that forces EFER.SCE. AFAICT, no existing test verifies that KVM injects #UD on SYSCALL without EFER.SCE set, though it would be nice to add one. > + wrmsr(MSR_STAR, ((u64)(USER_CS32 << 16) | KERNEL_CS) << 32); It doesn't matter at this time because this framework doesn't ses SYSRET, but this should be USER_CS or USER_CS64. > + wrmsr(MSR_LSTAR, (u64)&ret_to_kernel); > > if (setjmp(jmpbuf) != 0) { > *raised_vector = 1; > @@ -73,7 +74,7 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, > "mov %[arg4], %%rcx\n\t" > "call *%[func]\n\t" > /* Return to kernel via system call */ > - "int %[kernel_entry_vector]\n\t" > + "syscall\n\t" > /* Kernel Mode */ > "ret_to_kernel:\n\t" > "mov %[rsp0], %%rsp\n\t" > @@ -89,8 +90,7 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, > [user_ds]"i"(USER_DS), > [user_cs]"i"(USER_CS), > [user_stack_top]"r"(user_stack + > - sizeof(user_stack)), > - [kernel_entry_vector]"i"(RET_TO_KERNEL_IRQ) > + sizeof(user_stack)) > : > "rsi", "rdi", "rbx", "rcx", "rdx", "r8", "r9", "r10", "r11"); > > -- > 2.32.0.3.g01195cf9f >
On Mon, Apr 25, 2022, Sean Christopherson wrote: > On Sun, Apr 24, 2022, SU Hang wrote: > > Why? As gross as it is, I actually think INTn is a better option because it > doesn't require writing multiple MSRs, and can work for both 64-bit and 32-bit KUT. > The latter is currently a moot point since this code is 64-bit only, but the UMIP > test _does_ support 32-bit, and it's do_ring3() should really be rolled into this > framework. > > Furthermore, we really should have a test to verify that KVM correctly emulates > SYSCALL at CPL3 with EFER.SCE=0, and forcing EFER.SCE=1 just to get to CPL3 would > make it impossible to utilize this framework for such a test. And a concrete reason not to apply this patch: it causes the nVMX #AC test to fail: FAIL: #AC handled by L2 FAIL: #AC handled by L1
> Why? We are implementing a para-virtualization hypervisor, which doesn't allow guest to trigger soft interrupt > 0x20(but `int 0x80` works fine), so I want to replace `int 0x20` with a more common `syscall`. > it's do_ring3() should really be rolled into this framework. Yes, it is worth working on it, I'll do it in my part-time. > no existing test verifies that KVM injects #UD on SYSCALL without EFER.SCE > set, though it would be nice to add one. I am also interested in it, maybe do it later. > > + wrmsr(MSR_STAR, ((u64)(USER_CS32 << 16) | KERNEL_CS) << 32); > It doesn't matter at this time because this framework doesn't ses SYSRET, but > this should be USER_CS or USER_CS64. Oops, intel SDM vol.3 <chap 5.8.8> says: """ When SYSRET transfers control to 64-bit mode user code using REX.W, the processor gets the privilege level 3 target code segment, instruction pointer, stack segment, and flags as follows: • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48] + 16. • Stack segment — IA32_STAR[63:48] + 8. """ Since the value of USER_CS is 0x4b in 64 bit mode, SS register points to 0x53 = 0x4b + 8, (offset is 0x50) But `gdt + offset(0x50)` hasn't been setup(so does DS register). > refs: https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/blob/master/lib/x86/desc.c#L34 > refs: https://gitlab.com/kvm-unit-tests/kvm-unit-tests/-/blob/master/x86/syscall.c#L68 Linux also does so, the reason is to reuse user segment descriptor in both 32/64 bit. > refs1: https://github.com/torvalds/linux/blob/46cf2c613f4b10eb12f749207b0fd2c1bfae3088/arch/x86/kernel/cpu/common.c#L1942 > refs2: https://github.com/torvalds/linux/blob/46cf2c613f4b10eb12f749207b0fd2c1bfae3088/arch/x86/include/asm/segment.h#L211 > refs3: https://github.com/torvalds/linux/blob/46cf2c613f4b10eb12f749207b0fd2c1bfae3088/arch/x86/kernel/cpu/common.c#L216 > And a concrete reason not to apply this patch: it causes the nVMX #AC test to fail: It's awkward, some KUT test cases results diffs on my different machines, which makes me don't know which result I could trust, so I only pay attention to the test cases that I care about. I'll keep an eye on the rest cases in the future.
On Wed, Apr 27, 2022, SU Hang wrote: > > It doesn't matter at this time because this framework doesn't ses SYSRET, but > > this should be USER_CS or USER_CS64. > Oops, intel SDM vol.3 <chap 5.8.8> says: > """ > When SYSRET transfers control to 64-bit mode user code using REX.W, the > processor gets the privilege level 3 target code segment, instruction pointer, > stack segment, and flags as follows: > • Target code segment — Reads a non-NULL selector from IA32_STAR[63:48] + 16. Ugh, missed that detail. > • Stack segment — IA32_STAR[63:48] + 8. > """
diff --git a/lib/x86/usermode.c b/lib/x86/usermode.c index 477cb9f..e4cb899 100644 --- a/lib/x86/usermode.c +++ b/lib/x86/usermode.c @@ -12,7 +12,6 @@ #include <stdint.h> #define USERMODE_STACK_SIZE 0x2000 -#define RET_TO_KERNEL_IRQ 0x20 static jmp_buf jmpbuf; @@ -40,9 +39,11 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, static unsigned char user_stack[USERMODE_STACK_SIZE]; *raised_vector = 0; - set_idt_entry(RET_TO_KERNEL_IRQ, &ret_to_kernel, 3); handle_exception(fault_vector, restore_exec_to_jmpbuf_exception_handler); + wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_SCE); + wrmsr(MSR_STAR, ((u64)(USER_CS32 << 16) | KERNEL_CS) << 32); + wrmsr(MSR_LSTAR, (u64)&ret_to_kernel); if (setjmp(jmpbuf) != 0) { *raised_vector = 1; @@ -73,7 +74,7 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, "mov %[arg4], %%rcx\n\t" "call *%[func]\n\t" /* Return to kernel via system call */ - "int %[kernel_entry_vector]\n\t" + "syscall\n\t" /* Kernel Mode */ "ret_to_kernel:\n\t" "mov %[rsp0], %%rsp\n\t" @@ -89,8 +90,7 @@ uint64_t run_in_user(usermode_func func, unsigned int fault_vector, [user_ds]"i"(USER_DS), [user_cs]"i"(USER_CS), [user_stack_top]"r"(user_stack + - sizeof(user_stack)), - [kernel_entry_vector]"i"(RET_TO_KERNEL_IRQ) + sizeof(user_stack)) : "rsi", "rdi", "rbx", "rcx", "rdx", "r8", "r9", "r10", "r11");
Signed-off-by: SU Hang <darcy.sh@antgroup.com> --- lib/x86/usermode.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)