[1/2] KVM: nVMX: Fix nested #PF breaking L1's vmlaunch/vmresume

Submitter Wanpeng Li
Date Sept. 13, 2017, 11:03 a.m.
Message ID <1505300602-7236-1-git-send-email-wanpeng.li@hotmail.com>
Permalink /patch/9951069/
State New

Comments

Wanpeng Li - Sept. 13, 2017, 11:03 a.m.
From: Wanpeng Li <wanpeng.li@hotmail.com>

------------[ cut here ]------------
 WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0+ #17
 RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 Call Trace:
  ? emulator_read_emulated+0x15/0x20 [kvm]
  ? segmented_read+0xae/0xf0 [kvm]
  vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  x86_emulate_instruction+0x733/0x810 [kvm]
  vmx_handle_exit+0x2f4/0xda0 [kvm_intel]
  ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm]
  kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  entry_SYSCALL_64_fastpath+0x23/0xc2

A nested #PF is triggered while L0 emulates an instruction for L2. However,
the injection path does not take into account that we must not break L1's
vmlaunch/vmresume. This patch fixes it by queuing the #PF exception instead,
requesting an immediate VM exit from L2, and keeping the exception for L1
pending for a subsequent nested VM exit.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/vmx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Paolo Bonzini - Sept. 13, 2017, 9:45 p.m.
On 13/09/2017 13:03, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> A nested #PF is triggered while L0 emulates an instruction for L2. However,
> the injection path does not take into account that we must not break L1's
> vmlaunch/vmresume. This patch fixes it by queuing the #PF exception instead,
> requesting an immediate VM exit from L2, and keeping the exception for L1
> pending for a subsequent nested VM exit.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/vmx.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 4253ade..fda9dd6 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9829,7 +9829,8 @@ static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
>  
>  	WARN_ON(!is_guest_mode(vcpu));
>  
> -	if (nested_vmx_is_page_fault_vmexit(vmcs12, fault->error_code)) {
> +	if (nested_vmx_is_page_fault_vmexit(vmcs12, fault->error_code) &&
> +		!to_vmx(vcpu)->nested.nested_run_pending) {
>  		vmcs12->vm_exit_intr_error_code = fault->error_code;
>  		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
>  				  PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
> 

Is vmx_inject_page_fault_nested even needed at all these days?

kvm_inject_page_fault's call to kvm_queue_exception_e should transform
into an L2->L1 vmexit when vmx_check_nested_events is called.

Paolo
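
The flow suggested above (queue the exception first, let the nested-events
check turn it into an L2->L1 vmexit later) can be sketched as a small toy
model. This is only an illustration under stated assumptions: struct
toy_nested_state and both toy_* functions are hypothetical names, not KVM
code, and the deferral while a nested run is pending is an assumption of the
sketch rather than a quote of vmx_check_nested_events.

```c
#include <stdbool.h>

/* Hypothetical, simplified vCPU state; none of these are kernel types. */
struct toy_nested_state {
	bool in_l2;               /* L2 is the currently running guest */
	bool l1_intercepts_pf;    /* L1 asked to see this #PF as a VM exit */
	bool nested_run_pending;  /* L1's vmlaunch/vmresume not yet completed */
	bool exception_pending;   /* a #PF has been queued */
	unsigned int pending_error_code;
	unsigned int exit_error_code; /* what L1 would see after the exit */
};

/* Stand-in for queuing the exception instead of exiting right away. */
void toy_queue_page_fault(struct toy_nested_state *s, unsigned int error_code)
{
	s->exception_pending = true;
	s->pending_error_code = error_code;
}

/*
 * Stand-in for the nested-events check. Returns true if the queued #PF
 * was converted into an L2->L1 exit, false if nothing was done.
 */
bool toy_check_nested_events(struct toy_nested_state *s)
{
	if (!s->in_l2 || !s->exception_pending || !s->l1_intercepts_pf)
		return false;

	/* Assumed rule: never exit to L1 while its entry is still pending. */
	if (s->nested_run_pending)
		return false;

	s->exit_error_code = s->pending_error_code;
	s->exception_pending = false;
	s->in_l2 = false;
	return true;
}
```

In this model a #PF queued while nested_run_pending is set stays queued, and
the first check after the entry completes delivers it to L1 with the original
error code.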

Wanpeng Li - Sept. 23, 2017, 12:51 a.m.
2017-09-15 19:26 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 15/09/2017 05:48, Wanpeng Li wrote:
>> 2017-09-14 5:45 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>> Is vmx_inject_page_fault_nested even needed at all these days?
>>>
>>> kvm_inject_page_fault's call to kvm_queue_exception_e should transform
>>> into an L2->L1 vmexit when vmx_check_nested_events is called.
>>
>> After more investigation, this will break the original goal of what
>> vmx_inject_page_fault_nested() tries to fix.
>> http://www.spinics.net/lists/kvm/msg96579.html
>
> Right!  I think I have a generic patch for the same issue that Gleb
> solved there.  We can fill in the IDT vectoring info early in the
> vmexit, so that the L1 vmexit can overwrite the L2 exception easily.

In the meantime, I think my patch could be merged as an interim fix.

Regards,
Wanpeng Li

Patch

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 4253ade..fda9dd6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9829,7 +9829,8 @@  static void vmx_inject_page_fault_nested(struct kvm_vcpu *vcpu,
 
 	WARN_ON(!is_guest_mode(vcpu));
 
-	if (nested_vmx_is_page_fault_vmexit(vmcs12, fault->error_code)) {
+	if (nested_vmx_is_page_fault_vmexit(vmcs12, fault->error_code) &&
+		!to_vmx(vcpu)->nested.nested_run_pending) {
 		vmcs12->vm_exit_intr_error_code = fault->error_code;
 		nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI,
 				  PF_VECTOR | INTR_TYPE_HARD_EXCEPTION |
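
For readers who want the decision in isolation, the condition this hunk
changes can be modelled roughly as follows. This is a minimal sketch, not the
kernel function: struct toy_vcpu, enum toy_pf_action and
toy_pf_decision() are hypothetical names, and the two booleans stand in for
nested_vmx_is_page_fault_vmexit() and nested.nested_run_pending.

```c
#include <stdbool.h>

/* Hypothetical outcome of handling a #PF raised while emulating for L2. */
enum toy_pf_action {
	TOY_PF_EXIT_TO_L1_NOW, /* synthesize the L2->L1 exit immediately */
	TOY_PF_QUEUE,          /* queue the exception and deliver it later */
};

/* Hypothetical, simplified stand-in for the state the real check reads. */
struct toy_vcpu {
	bool l1_intercepts_pf;   /* models nested_vmx_is_page_fault_vmexit() */
	bool nested_run_pending; /* models nested.nested_run_pending */
};

/*
 * Models the patched condition: exit to L1 at once only if L1 wants the
 * fault AND its vmlaunch/vmresume has already completed. Otherwise the
 * fault is queued, as the commit message describes.
 */
enum toy_pf_action toy_pf_decision(const struct toy_vcpu *v)
{
	if (v->l1_intercepts_pf && !v->nested_run_pending)
		return TOY_PF_EXIT_TO_L1_NOW;
	return TOY_PF_QUEUE;
}
```

Before the patch, the model would have ignored nested_run_pending, which is
the case that ends up in the WARNING shown in the trace.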