
KVM: X86: Fix preempt the preemption timer cancel

Message ID 1495337552-78885-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive)
State New, archived

Commit Message

Wanpeng Li May 21, 2017, 3:32 a.m. UTC
From: Wanpeng Li <wanpeng.li@hotmail.com>

 WARNING: CPU: 3 PID: 1952 at arch/x86/kvm/lapic.c:1529 kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
 CPU: 3 PID: 1952 Comm: qemu-system-x86 Not tainted 4.12.0-rc1+ #24
 RIP: 0010:kvm_lapic_expired_hv_timer+0xb5/0xd0 [kvm]
  Call Trace:
  handle_preemption_timer+0xe/0x20 [kvm_intel]
  vmx_handle_exit+0xc9/0x15f0 [kvm_intel]
  ? lock_acquire+0xdb/0x250
  ? lock_acquire+0xdb/0x250
  ? kvm_arch_vcpu_ioctl_run+0xdf3/0x1ce0 [kvm]
  kvm_arch_vcpu_ioctl_run+0xe55/0x1ce0 [kvm]
  kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm]
  ? __fget+0xf3/0x210
  do_vfs_ioctl+0xa4/0x700
  ? __fget+0x114/0x210
  SyS_ioctl+0x79/0x90
  do_syscall_64+0x8f/0x750
  ? trace_hardirqs_on_thunk+0x1a/0x1c
  entry_SYSCALL64_slow_path+0x25/0x25
 
This can be reproduced sporadically while booting L2 on a preemptible L1; the
splat appears on L1.

          CPU0                              CPU1 

vmx_cancel_hv_timer
  vCPU0's vmx->hv_deadline_tsc = -1

  preemption occurs

                                     clear preemption timer field in CPU1's active vmcs
                                     vCPU0's lapic_timer.hv_timer_in_use = false
vmx_vcpu_run(vCPU0)
  vmx_arm_hv_timer
    if (vmx->hv_deadline_tsc == -1)
      nothing changes
	 
handle_preemption_timer(vCPU0)
  kvm_lapic_expired_hv_timer
    WARN_ON(!apic->lapic_timer.hv_timer_in_use); 
  
Preemption can occur while the preemption timer is being cancelled, which leaves
the lapic, vmx and vmcs state inconsistent. This patch fixes it by disabling
preemption while the preemption timer is cancelled.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Paolo Bonzini May 26, 2017, 3:45 p.m. UTC | #1
On 21/05/2017 05:32, Wanpeng Li wrote:
> vmx_cancel_hv_timer
>   vCPU0's vmx->hv_deadline_tsc = -1
> 
>   preempt occur
> 
>                                      clear preemption timer field in CPU1's active vmcs
>                                      vCPU0's apic_timer.hv_timer_in_use = false
> vmx_vcpu_run(vCPU0)
>   vmx_arm_hv_timer
>     if (vmx->hv_deadline_tsc == -1)
> 	  nothing change
> 	 
> handle_preemption_timer(vCPU0)
>   kvm_lapic_expired_hv_timer
>     WARN_ON(!apic->lapic_timer.hv_timer_in_use); 
>   
> Preemption can occur during cancel preemption timer, and there will be inconsistent 
> status in lapic, vmx and vmcs field. This patch fixes it by disable preemption for 
> cancelling preemption timer.

I see, so the purpose is to serialize against kvm_arch_vcpu_load.  Nice
catch, I've queued the patch for kvm/master.

Paolo

> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/lapic.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index c329d28..6e6f345 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1495,8 +1495,10 @@ EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);
>  
>  static void cancel_hv_timer(struct kvm_lapic *apic)
>  {
> +	preempt_disable();
>  	kvm_x86_ops->cancel_hv_timer(apic->vcpu);
>  	apic->lapic_timer.hv_timer_in_use = false;
> +	preempt_enable();
>  }
>  
>  static bool start_hv_timer(struct kvm_lapic *apic)
> --
Paolo Bonzini May 26, 2017, 3:57 p.m. UTC | #2
On 21/05/2017 05:32, Wanpeng Li wrote:
>           CPU0                              CPU1 
> 
> vmx_cancel_hv_timer
>   vCPU0's vmx->hv_deadline_tsc = -1
> 
>   preempt occur
> 
>                                      clear preemption timer field in CPU1's active vmcs
>                                      vCPU0's apic_timer.hv_timer_in_use = false
> vmx_vcpu_run(vCPU0)
>   vmx_arm_hv_timer
>     if (vmx->hv_deadline_tsc == -1)
> 	  nothing change
> 	 
> handle_preemption_timer(vCPU0)
>   kvm_lapic_expired_hv_timer
>     WARN_ON(!apic->lapic_timer.hv_timer_in_use); 


I think it's more like this, what do you think?

          CPU0                    CPU1

  preemption timer vmexit
  handle_preemption_timer(vCPU0)
    kvm_lapic_expired_hv_timer
      vmx_cancel_hv_timer
        vmx->hv_deadline_tsc = -1
        vmcs_clear_bits
        /* hv_timer_in_use still true */
  sched_out
                           sched_in
                           kvm_arch_vcpu_load
                             vmx_set_hv_timer
                               write vmx->hv_deadline_tsc
                               vmcs_set_bits
                           /* back in kvm_lapic_expired_hv_timer */
                           hv_timer_in_use = false
                           ...
                           vmx_vcpu_run
                             vmx_arm_hv_timer
                               write preemption timer deadline
                             spurious preemption timer vmexit
                               handle_preemption_timer(vCPU0)
                                 kvm_lapic_expired_hv_timer
                                   WARN_ON(!apic->lapic_timer.hv_timer_in_use);


Thanks,

Paolo
Wanpeng Li May 26, 2017, 10:07 p.m. UTC | #3
2017-05-26 23:57 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 21/05/2017 05:32, Wanpeng Li wrote:
>>           CPU0                              CPU1
>>
>> vmx_cancel_hv_timer
>>   vCPU0's vmx->hv_deadline_tsc = -1
>>
>>   preempt occur
>>
>>                                      clear preemption timer field in CPU1's active vmcs
>>                                      vCPU0's apic_timer.hv_timer_in_use = false
>> vmx_vcpu_run(vCPU0)
>>   vmx_arm_hv_timer
>>     if (vmx->hv_deadline_tsc == -1)
>>         nothing change
>>
>> handle_preemption_timer(vCPU0)
>>   kvm_lapic_expired_hv_timer
>>     WARN_ON(!apic->lapic_timer.hv_timer_in_use);
>
>
> I think it's more like this, what do you think?
>
>           CPU0                    CPU1
>
>   preemption timer vmexit
>   handle_preemption_timer(vCPU0)
>     kvm_lapic_expired_hv_timer
>       vmx_cancel_hv_timer
>         vmx->hv_deadline_tsc = -1
>         vmcs_clear_bits
>         /* hv_timer_in_use still true */
>   sched_out
>                            sched_in
>                            kvm_arch_vcpu_load
>                              vmx_set_hv_timer
>                                write vmx->hv_deadline_tsc
>                                vmcs_set_bits
>                            /* back in kvm_lapic_expired_hv_timer */
>                            hv_timer_in_use = false
>                            ...
>                            vmx_vcpu_run
>                              vmx_arm_hv_timer
>                                write preemption timer deadline
>                              spurious preemption timer vmexit
>                                handle_preemption_timer(vCPU0)
>                                  kvm_lapic_expired_hv_timer
>                                    WARN_ON(!apic->lapic_timer.hv_timer_in_use);

Looks good to me, thanks for your help, Paolo. :)

Regards,
Wanpeng Li

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c329d28..6e6f345 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1495,8 +1495,10 @@  EXPORT_SYMBOL_GPL(kvm_lapic_hv_timer_in_use);
 
 static void cancel_hv_timer(struct kvm_lapic *apic)
 {
+	preempt_disable();
 	kvm_x86_ops->cancel_hv_timer(apic->vcpu);
 	apic->lapic_timer.hv_timer_in_use = false;
+	preempt_enable();
 }
 
 static bool start_hv_timer(struct kvm_lapic *apic)
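
The ordering problem discussed in this thread can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not kernel
code: every toy_* name, the struct layout, and the single-threaded way
preemption is simulated are hypothetical. It follows the sequence in Paolo's
diagram, where a vcpu load between the vmx-side cancel and the lapic-side flag
update re-arms the timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NO_DEADLINE UINT64_MAX  /* stands in for hv_deadline_tsc = -1 */

/* Hypothetical, simplified stand-in for the lapic/vmx/vmcs timer state. */
struct toy_vcpu {
    bool hv_timer_in_use;      /* lapic's view of the timer */
    uint64_t hv_deadline_tsc;  /* vmx's cached deadline */
    bool vmcs_timer_armed;     /* preemption timer bit in the VMCS */
    uint64_t tsc_deadline;     /* deadline the guest programmed */
    int preempt_count;         /* > 0 means preemption is disabled */
    bool resched_pending;      /* a preemption arrived while disabled */
};

/* Models the re-arm that the thread attributes to kvm_arch_vcpu_load. */
static void toy_vcpu_load(struct toy_vcpu *v)
{
    if (v->hv_timer_in_use) {
        v->hv_deadline_tsc = v->tsc_deadline;
        v->vmcs_timer_armed = true;
    }
}

/* A point where the scheduler would switch the task out and back in. */
static void toy_preempt_point(struct toy_vcpu *v)
{
    if (v->preempt_count > 0)
        v->resched_pending = true;  /* deferred until preemption is re-enabled */
    else
        toy_vcpu_load(v);           /* sched_out followed by sched_in */
}

static void toy_preempt_disable(struct toy_vcpu *v)
{
    v->preempt_count++;
}

static void toy_preempt_enable(struct toy_vcpu *v)
{
    assert(v->preempt_count > 0);
    if (--v->preempt_count == 0 && v->resched_pending) {
        v->resched_pending = false;
        toy_vcpu_load(v);           /* the deferred preemption runs now */
    }
}

/* Mirrors the shape of cancel_hv_timer(), with the fix made optional. */
static void toy_cancel_hv_timer(struct toy_vcpu *v, bool with_fix)
{
    if (with_fix)
        toy_preempt_disable(v);
    v->hv_deadline_tsc = TOY_NO_DEADLINE;  /* vmx side */
    v->vmcs_timer_armed = false;
    toy_preempt_point(v);                  /* worst-case preemption */
    v->hv_timer_in_use = false;            /* lapic side */
    if (with_fix)
        toy_preempt_enable(v);
}

/* Returns true when the lapic flag and the VMCS bit agree after the cancel. */
static bool toy_cancel_is_consistent(bool with_fix)
{
    struct toy_vcpu v = {
        .hv_timer_in_use = true,
        .hv_deadline_tsc = 1000,
        .vmcs_timer_armed = true,
        .tsc_deadline = 1000,
    };

    toy_cancel_hv_timer(&v, with_fix);
    return v.hv_timer_in_use == v.vmcs_timer_armed;
}
```

Calling toy_cancel_is_consistent(false) from a main() models the unpatched
function: the simulated load runs while hv_timer_in_use is still true, so the
timer ends up armed with the flag cleared. With with_fix set, the load is
deferred until after the flag is cleared and finds nothing to re-arm.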