[2/2] KVM: Fix rcu splat if vm creation fails

Message ID 1572848879-21011-2-git-send-email-wanpengli@tencent.com (mailing list archive)
State New, archived
Series [1/2] KVM: Fix NULL-ptr defer after kvm_create_vm fails

Commit Message

Wanpeng Li Nov. 4, 2019, 6:27 a.m. UTC
From: Wanpeng Li <wanpengli@tencent.com>

Reported by syzkaller:

   =============================
   WARNING: suspicious RCU usage
   -----------------------------
   ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
   
   other info that might help us debug this:

   rcu_scheduler_active = 2, debug_locks = 1
   no locks held by repro_11/12688.
    
   stack backtrace:
   Call Trace:
    dump_stack+0x7d/0xc5
    lockdep_rcu_suspicious+0x123/0x170
    kvm_dev_ioctl+0x9a9/0x1260 [kvm]
    do_vfs_ioctl+0x1a1/0xfb0
    ksys_ioctl+0x6d/0x80
    __x64_sys_ioctl+0x73/0xb0
    do_syscall_64+0x108/0xaa0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

Commit a97b0e773e49 ("kvm: call kvm_arch_destroy_vm if vm creation fails")
sets users_count to 1 before kvm_arch_init_vm(). However, if kvm_arch_init_vm()
fails, nothing decrements the count again. Either decrement it on that error
path, or move the refcount_set() after kvm_arch_init_vm(), as this patch does.

syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=15209b84e00000

Reported-by: syzbot+75475908cd0910f141ee@syzkaller.appspotmail.com
Fixes: a97b0e773e49 ("kvm: call kvm_arch_destroy_vm if vm creation fails")
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paolo Bonzini Nov. 4, 2019, 11:18 a.m. UTC | #1
On 04/11/19 07:27, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> Reported by syzkaller:
> 
>    =============================
>    WARNING: suspicious RCU usage
>    -----------------------------
>    ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
>    
>    other info that might help us debug this:
> 
>    rcu_scheduler_active = 2, debug_locks = 1
>    no locks held by repro_11/12688.
>     
>    stack backtrace:
>    Call Trace:
>     dump_stack+0x7d/0xc5
>     lockdep_rcu_suspicious+0x123/0x170
>     kvm_dev_ioctl+0x9a9/0x1260 [kvm]
>     do_vfs_ioctl+0x1a1/0xfb0
>     ksys_ioctl+0x6d/0x80
>     __x64_sys_ioctl+0x73/0xb0
>     do_syscall_64+0x108/0xaa0
>     entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> Commit a97b0e773e4 (kvm: call kvm_arch_destroy_vm if vm creation fails)
> sets users_count to 1 before kvm_arch_init_vm(), however, if kvm_arch_init_vm()
> fails, we need to dec this count. Or, we can move the sets refcount after 
> kvm_arch_init_vm().

I don't understand this one, hasn't

        WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));

decreased the count already?  With your patch the refcount would then
underflow.

Paolo
Wanpeng Li Nov. 4, 2019, 12:16 p.m. UTC | #2
On Mon, 4 Nov 2019 at 19:18, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/11/19 07:27, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > Reported by syzkaller:
> >
> >    =============================
> >    WARNING: suspicious RCU usage
> >    -----------------------------
> >    ./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
> >
> >    other info that might help us debug this:
> >
> >    rcu_scheduler_active = 2, debug_locks = 1
> >    no locks held by repro_11/12688.
> >
> >    stack backtrace:
> >    Call Trace:
> >     dump_stack+0x7d/0xc5
> >     lockdep_rcu_suspicious+0x123/0x170
> >     kvm_dev_ioctl+0x9a9/0x1260 [kvm]
> >     do_vfs_ioctl+0x1a1/0xfb0
> >     ksys_ioctl+0x6d/0x80
> >     __x64_sys_ioctl+0x73/0xb0
> >     do_syscall_64+0x108/0xaa0
> >     entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > Commit a97b0e773e4 (kvm: call kvm_arch_destroy_vm if vm creation fails)
> > sets users_count to 1 before kvm_arch_init_vm(), however, if kvm_arch_init_vm()
> > fails, we need to dec this count. Or, we can move the sets refcount after
> > kvm_arch_init_vm().
>
> I don't understand this one, hasn't
>
>         WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>
> decreased the count already?  With your patch the refcount would then
> underflow.

r = kvm_arch_init_vm(kvm, type);
if (r)
    goto out_err_no_arch_destroy_vm;

out_err_no_disable:
    kvm_arch_destroy_vm(kvm);
    WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
out_err_no_arch_destroy_vm:

So, if kvm_arch_init_vm() fails, we will not execute
WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
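
To make the skipped decrement concrete, here is a minimal userspace sketch of
the error ladder quoted above. It is a model, not kernel code: struct vm_model,
arch_init() and create_vm_before_fix() are hypothetical names, and a plain int
stands in for the refcount_t users_count. My understanding is that the splat
itself comes from the cleanup loop calling kvm_get_bus(), whose RCU check
expects either a held lock or a zero users_count; treat that as an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm; only the field discussed here. */
struct vm_model {
	int users_count;
};

/* Models kvm_arch_init_vm(): returns nonzero on failure. */
static int arch_init(bool fail)
{
	return fail ? -1 : 0;
}

/*
 * Models the goto ladder before the fix. Returns the users_count value
 * that the shared cleanup code after out_err_no_arch_destroy_vm sees
 * (or the count handed to the caller on success).
 */
static int create_vm_before_fix(bool arch_init_fails, bool later_step_fails)
{
	struct vm_model vm = { .users_count = 0 };

	vm.users_count = 1;		/* refcount_set(&kvm->users_count, 1) */
	if (arch_init(arch_init_fails))
		goto out_err_no_arch_destroy_vm;

	if (later_step_fails)		/* e.g. hardware_enable_all() failing */
		goto out_err_no_disable;

	return vm.users_count;		/* success: one reference held */

out_err_no_disable:
	vm.users_count--;		/* refcount_dec_and_test() */
out_err_no_arch_destroy_vm:
	return vm.users_count;		/* still 1 if arch init failed */
}
```

Calling create_vm_before_fix(true, false) returns 1: the jump lands below the
decrement, so the cleanup code runs with a nonzero count.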

    Wanpeng
Paolo Bonzini Nov. 4, 2019, 12:21 p.m. UTC | #3
On 04/11/19 13:16, Wanpeng Li wrote:
>> I don't understand this one, hasn't
>>
>>         WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>>
>> decreased the count already?  With your patch the refcount would then
>> underflow.
> 
> r = kvm_arch_init_vm(kvm, type);
> if (r)
>     goto out_err_no_arch_destroy_vm;
> 
> out_err_no_disable:
>     kvm_arch_destroy_vm(kvm);
>     WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
> out_err_no_arch_destroy_vm:
> 
> So, if kvm_arch_init_vm() fails, we will not execute
> WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));

Uuh of course.  But I'd rather do the opposite: move the refcount_set
earlier so that refcount_dec_and_test can be moved after
no_arch_destroy_vm.  Moving the refcount_set is not strictly necessary,
but avoids the introduction of yet another label.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e22ff63e5b1a..e7a07132cd7f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -650,6 +650,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	if (init_srcu_struct(&kvm->irq_srcu))
 		goto out_err_no_irq_srcu;

+	refcount_set(&kvm->users_count, 1);
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
 		struct kvm_memslots *slots = kvm_alloc_memslots();

@@ -667,7 +668,6 @@ static struct kvm *kvm_create_vm(unsigned long type)
 			goto out_err_no_arch_destroy_vm;
 	}

-	refcount_set(&kvm->users_count, 1);
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -696,8 +696,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	hardware_disable_all();
 out_err_no_disable:
 	kvm_arch_destroy_vm(kvm);
-	WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
 out_err_no_arch_destroy_vm:
+	WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
 	for (i = 0; i < KVM_NR_BUSES; i++)
 		kfree(kvm_get_bus(kvm, i));
 	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
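
The reordering in the diff above can be sketched in userspace as follows. This
is only a model under stated assumptions: enum fail_point and
create_vm_after_fix() are hypothetical names, a plain int stands in for
refcount_t, and assert() stands in for the WARN_ON_ONCE() around
refcount_dec_and_test(). It shows why setting the count before the memslot
loop lets every path through out_err_no_arch_destroy_vm share one decrement.

```c
#include <assert.h>

/* Hypothetical failure points, in the order kvm_create_vm() reaches them. */
enum fail_point {
	FAIL_NONE,
	FAIL_MEMSLOTS,		/* kvm_alloc_memslots() fails */
	FAIL_ARCH_INIT,		/* kvm_arch_init_vm() fails */
	FAIL_HW_ENABLE		/* hardware_enable_all() fails */
};

/*
 * Models the goto ladder after the proposed change. Returns the
 * users_count value seen by the cleanup code (or held on success).
 */
static int create_vm_after_fix(enum fail_point fp)
{
	int users_count = 0;

	users_count = 1;	/* refcount_set(), now before the memslot loop */

	if (fp == FAIL_MEMSLOTS)
		goto out_err_no_arch_destroy_vm;
	if (fp == FAIL_ARCH_INIT)
		goto out_err_no_arch_destroy_vm;
	if (fp == FAIL_HW_ENABLE)
		goto out_err_no_disable;

	return users_count;	/* success: one reference held */

out_err_no_disable:
	/* kvm_arch_destroy_vm(kvm) would run here */
out_err_no_arch_destroy_vm:
	assert(users_count == 1);	/* the decrement must reach zero */
	users_count--;
	return users_count;
}
```

Every failure point returns 0 here, so the bus and memslot cleanup that follows
the label always runs with the count already dropped.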
Wanpeng Li Nov. 4, 2019, 12:25 p.m. UTC | #4
On Mon, 4 Nov 2019 at 20:21, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 04/11/19 13:16, Wanpeng Li wrote:
> >> I don't understand this one, hasn't
> >>
> >>         WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
> >>
> >> decreased the count already?  With your patch the refcount would then
> >> underflow.
> >
> > r = kvm_arch_init_vm(kvm, type);
> > if (r)
> >     goto out_err_no_arch_destroy_vm;
> >
> > out_err_no_disable:
> >     kvm_arch_destroy_vm(kvm);
> >     WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
> > out_err_no_arch_destroy_vm:
> >
> > So, if kvm_arch_init_vm() fails, we will not execute
> > WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>
> Uuh of course.  But I'd rather do the opposite: move the refcount_set
> earlier so that refcount_dec_and_test can be moved after
> no_arch_destroy_vm.  Moving the refcount_set is not strictly necessary,
> but avoids the introduction of yet another label.
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e22ff63e5b1a..e7a07132cd7f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -650,6 +650,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
>         if (init_srcu_struct(&kvm->irq_srcu))
>                 goto out_err_no_irq_srcu;
>
> +       refcount_set(&kvm->users_count, 1);
>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>                 struct kvm_memslots *slots = kvm_alloc_memslots();
>
> @@ -667,7 +668,6 @@ static struct kvm *kvm_create_vm(unsigned long type)
>                         goto out_err_no_arch_destroy_vm;
>         }
>
> -       refcount_set(&kvm->users_count, 1);
>         r = kvm_arch_init_vm(kvm, type);
>         if (r)
>                 goto out_err_no_arch_destroy_vm;
> @@ -696,8 +696,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
>         hardware_disable_all();
>  out_err_no_disable:
>         kvm_arch_destroy_vm(kvm);
> -       WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>  out_err_no_arch_destroy_vm:
> +       WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>         for (i = 0; i < KVM_NR_BUSES; i++)
>                 kfree(kvm_get_bus(kvm, i));
>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)

Good point, I will handle these two patches later.

    Wanpeng
Paolo Bonzini Nov. 4, 2019, 12:26 p.m. UTC | #5
On 04/11/19 13:25, Wanpeng Li wrote:
> On Mon, 4 Nov 2019 at 20:21, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 04/11/19 13:16, Wanpeng Li wrote:
>>>> I don't understand this one, hasn't
>>>>
>>>>         WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>>>>
>>>> decreased the count already?  With your patch the refcount would then
>>>> underflow.
>>>
>>> r = kvm_arch_init_vm(kvm, type);
>>> if (r)
>>>     goto out_err_no_arch_destroy_vm;
>>>
>>> out_err_no_disable:
>>>     kvm_arch_destroy_vm(kvm);
>>>     WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>>> out_err_no_arch_destroy_vm:
>>>
>>> So, if kvm_arch_init_vm() fails, we will not execute
>>> WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>>
>> Uuh of course.  But I'd rather do the opposite: move the refcount_set
>> earlier so that refcount_dec_and_test can be moved after
>> no_arch_destroy_vm.  Moving the refcount_set is not strictly necessary,
>> but avoids the introduction of yet another label.
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index e22ff63e5b1a..e7a07132cd7f 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -650,6 +650,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
>>         if (init_srcu_struct(&kvm->irq_srcu))
>>                 goto out_err_no_irq_srcu;
>>
>> +       refcount_set(&kvm->users_count, 1);
>>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
>>                 struct kvm_memslots *slots = kvm_alloc_memslots();
>>
>> @@ -667,7 +668,6 @@ static struct kvm *kvm_create_vm(unsigned long type)
>>                         goto out_err_no_arch_destroy_vm;
>>         }
>>
>> -       refcount_set(&kvm->users_count, 1);
>>         r = kvm_arch_init_vm(kvm, type);
>>         if (r)
>>                 goto out_err_no_arch_destroy_vm;
>> @@ -696,8 +696,8 @@ static struct kvm *kvm_create_vm(unsigned long type)
>>         hardware_disable_all();
>>  out_err_no_disable:
>>         kvm_arch_destroy_vm(kvm);
>> -       WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>>  out_err_no_arch_destroy_vm:
>> +       WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>>         for (i = 0; i < KVM_NR_BUSES; i++)
>>                 kfree(kvm_get_bus(kvm, i));
>>         for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
> 
> Good point, I will handle these two patches later.

No problem, I can send v2 after testing.

Paolo
Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d6f0696..62ae0c9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -662,11 +662,11 @@  static struct kvm *kvm_create_vm(unsigned long type)
 			goto out_err_no_arch_destroy_vm;
 	}
 
-	refcount_set(&kvm->users_count, 1);
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
 
+	refcount_set(&kvm->users_count, 1);
 	r = hardware_enable_all();
 	if (r)
 		goto out_err_no_disable;