diff mbox

KVM: ioapic: fix NULL deref ioapic->lock

Message ID 1483242289-12323-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wanpeng Li Jan. 1, 2017, 3:44 a.m. UTC
From: Wanpeng Li <wanpeng.li@hotmail.com>

This was reported by syzkaller:

BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
IP: _raw_spin_lock+0xc/0x30
PGD 3e28eb067
PUD 3f0ac6067
PMD 0
Oops: 0002 [#1] SMP
CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
Call Trace:
 ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
 kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
 ? pick_next_task_fair+0xe1/0x4e0
 ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
 kvm_vcpu_ioctl+0x33a/0x600 [kvm]
 ? hrtimer_try_to_cancel+0x29/0x130
 ? do_nanosleep+0x97/0xf0
 do_vfs_ioctl+0xa1/0x5d0
 ? __hrtimer_init+0x90/0x90
 ? do_nanosleep+0x5b/0xf0
 SyS_ioctl+0x79/0x90
 do_syscall_64+0x6e/0x180
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0

KVM will skip over create pic/ioapic if there is a created vCPU. However, 
there is no guarantee whether ioapic is present when rescan ioapic which 
results in NULL dereference ioapic->lock. This patch fix it by adding the 
ioapic present check to ioapic scan.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/x86.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Paolo Bonzini Jan. 2, 2017, 10:09 a.m. UTC | #1
On 01/01/2017 04:44, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> This was reported by syzkaller:
> 
> BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
> IP: _raw_spin_lock+0xc/0x30
> PGD 3e28eb067
> PUD 3f0ac6067
> PMD 0
> Oops: 0002 [#1] SMP
> CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
> Call Trace:
>  ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
>  kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
>  ? pick_next_task_fair+0xe1/0x4e0
>  ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
>  kvm_vcpu_ioctl+0x33a/0x600 [kvm]
>  ? hrtimer_try_to_cancel+0x29/0x130
>  ? do_nanosleep+0x97/0xf0
>  do_vfs_ioctl+0xa1/0x5d0
>  ? __hrtimer_init+0x90/0x90
>  ? do_nanosleep+0x5b/0xf0
>  SyS_ioctl+0x79/0x90
>  do_syscall_64+0x6e/0x180
>  entry_SYSCALL64_slow_path+0x25/0x25
> RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0
> 
> KVM will skip over create pic/ioapic if there is a created vCPU. However, 
> there is no guarantee whether ioapic is present when rescan ioapic which 
> results in NULL dereference ioapic->lock. This patch fix it by adding the 
> ioapic present check to ioapic scan.
> 
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/x86.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 51ccfe0..9ca175c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6556,7 +6556,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>  	else {
>  		if (vcpu->arch.apicv_active)
>  			kvm_x86_ops->sync_pir_to_irr(vcpu);
> -		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
> +		if (ioapic_irqchip(vcpu->kvm))
> +			kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>  	}
>  	bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
>  		  vcpu_to_synic(vcpu)->vec_bitmap, 256);

Commit message for fuzzing bugs have usually included a beautified
reproducer.  However, you are not even saying if it is a race, or it is
deterministic.

The fix seems wrong to me at first impression, because "LAPIC enabled"
and "irqchip not split" should imply the existence of an in-kernel
IOAPIC.  However, I cannot suggest the right course of action without
seeing a testcase.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Vyukov Jan. 2, 2017, 10:17 a.m. UTC | #2
On Mon, Jan 2, 2017 at 11:09 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
>
> On 01/01/2017 04:44, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpeng.li@hotmail.com>
> >
> > This was reported by syzkaller:
> >
> > BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
> > IP: _raw_spin_lock+0xc/0x30
> > PGD 3e28eb067
> > PUD 3f0ac6067
> > PMD 0
> > Oops: 0002 [#1] SMP
> > CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
> > Call Trace:
> >  ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
> >  kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
> >  ? pick_next_task_fair+0xe1/0x4e0
> >  ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
> >  kvm_vcpu_ioctl+0x33a/0x600 [kvm]
> >  ? hrtimer_try_to_cancel+0x29/0x130
> >  ? do_nanosleep+0x97/0xf0
> >  do_vfs_ioctl+0xa1/0x5d0
> >  ? __hrtimer_init+0x90/0x90
> >  ? do_nanosleep+0x5b/0xf0
> >  SyS_ioctl+0x79/0x90
> >  do_syscall_64+0x6e/0x180
> >  entry_SYSCALL64_slow_path+0x25/0x25
> > RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0
> >
> > KVM will skip over create pic/ioapic if there is a created vCPU. However,
> > there is no guarantee whether ioapic is present when rescan ioapic which
> > results in NULL dereference ioapic->lock. This patch fix it by adding the
> > ioapic present check to ioapic scan.
> >
> > Reported-by: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> > ---
> >  arch/x86/kvm/x86.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 51ccfe0..9ca175c 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -6556,7 +6556,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
> >       else {
> >               if (vcpu->arch.apicv_active)
> >                       kvm_x86_ops->sync_pir_to_irr(vcpu);
> > -             kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
> > +             if (ioapic_irqchip(vcpu->kvm))
> > +                     kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
> >       }
> >       bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
> >                 vcpu_to_synic(vcpu)->vec_bitmap, 256);
>
> Commit message for fuzzing bugs have usually included a beautified
> reproducer.  However, you are not even saying if it is a race, or it is
> deterministic.
>
> The fix seems wrong to me at first impression, because "LAPIC enabled"
> and "irqchip not split" should imply the existence of an in-kernel
> IOAPIC.  However, I cannot suggest the right course of action without
> seeing a testcase.



I've created a reasonably beautified reproducer here:
https://groups.google.com/forum/#!msg/syzkaller/6V-KXaMDYi8/rOvBl-69DAAJ


FWIW the patch has fixed the crash, but there is another similar one
that happens slightly earlier:

general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 17462 Comm: syz-executor4 Not tainted 4.10.0-rc1+ #119
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88003d6a2280 task.stack: ffff8800357e8000
RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline]
RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline]
RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline]
RIP: 0010:vcpu_run arch/x86/kvm/x86.c:6944 [inline]
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
RSP: 0018:ffff8800357ef7d8 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff88003a459170 RCX: ffffc90000a76000
RDX: 0000000000000014 RSI: ffffffff810ec1ce RDI: 00000000000000a0
RBP: ffff8800357efa50 R08: ffff88003d7f2560 R09: 0000000000000000
R10: 38260856151e7596 R11: dffffc0000000000 R12: dffffc0000000000
R13: ffff8800357ef9e0 R14: ffff88003a459140 R15: 0000000000000000
FS:  00007faa53a40700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000003d5c6000 CR4: 00000000000026e0
Call Trace:
 kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
 vfs_ioctl fs/ioctl.c:43 [inline]
 do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
 SYSC_ioctl fs/ioctl.c:698 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x443a19
RSP: 002b:00007faa53a3fb58 EFLAGS: 00000286 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000018 RCX: 0000000000443a19
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
RBP: 00000000006ddb30 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000286 R12: 00000000007000a8
R13: 00007faa53c23328 R14: 00007faa53c25358 R15: 0000000000000003
Code: e0 03 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 82 1a 00
00 49 8b 86 e0 03 00 00 48 8d b8 a0 00 00 00 48 89 fa 48 c1 ea 03 <42>
80 3c 22 00 0f 85 3f 15 00 00 48 8b 80 a0 00 00 00 48 8d b8
RIP: kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline] RSP: ffff8800357ef7d8
RIP: vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline] RSP: ffff8800357ef7d8
RIP: vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline] RSP: ffff8800357ef7d8
RIP: vcpu_run arch/x86/kvm/x86.c:6944 [inline] RSP: ffff8800357ef7d8
RIP: kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
RSP: ffff8800357ef7d8
---[ end trace 72f3e29e9ea09f21 ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled


static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
{
u64 eoi_exit_bitmap[4];

if (!kvm_apic_hw_enabled(vcpu->arch.apic)) // <-----HERE
return;


static inline int kvm_apic_hw_enabled(struct kvm_lapic *apic)
{
if (static_key_false(&apic_hw_disabled.key))
return apic->vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE; // <--- HERE
return MSR_IA32_APICBASE_ENABLE;
}
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini Jan. 2, 2017, 6:01 p.m. UTC | #3
On 02/01/2017 11:17, Dmitry Vyukov wrote:
> On Mon, Jan 2, 2017 at 11:09 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>
>>
>> On 01/01/2017 04:44, Wanpeng Li wrote:
>>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>>
>>> This was reported by syzkaller:
>>>
>>> BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
>>> IP: _raw_spin_lock+0xc/0x30
>>> PGD 3e28eb067
>>> PUD 3f0ac6067
>>> PMD 0
>>> Oops: 0002 [#1] SMP
>>> CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
>>> Call Trace:
>>>  ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
>>>  kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
>>>  ? pick_next_task_fair+0xe1/0x4e0
>>>  ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
>>>  kvm_vcpu_ioctl+0x33a/0x600 [kvm]
>>>  ? hrtimer_try_to_cancel+0x29/0x130
>>>  ? do_nanosleep+0x97/0xf0
>>>  do_vfs_ioctl+0xa1/0x5d0
>>>  ? __hrtimer_init+0x90/0x90
>>>  ? do_nanosleep+0x5b/0xf0
>>>  SyS_ioctl+0x79/0x90
>>>  do_syscall_64+0x6e/0x180
>>>  entry_SYSCALL64_slow_path+0x25/0x25
>>> RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0
>>>
>>> KVM will skip over create pic/ioapic if there is a created vCPU. However,
>>> there is no guarantee whether ioapic is present when rescan ioapic which
>>> results in NULL dereference ioapic->lock. This patch fix it by adding the
>>> ioapic present check to ioapic scan.
>>>
>>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>>> ---
>>>  arch/x86/kvm/x86.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 51ccfe0..9ca175c 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -6556,7 +6556,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>>>       else {
>>>               if (vcpu->arch.apicv_active)
>>>                       kvm_x86_ops->sync_pir_to_irr(vcpu);
>>> -             kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>>> +             if (ioapic_irqchip(vcpu->kvm))
>>> +                     kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>>>       }
>>>       bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
>>>                 vcpu_to_synic(vcpu)->vec_bitmap, 256);
>>
>> Commit message for fuzzing bugs have usually included a beautified
>> reproducer.  However, you are not even saying if it is a race, or it is
>> deterministic.
>>
>> The fix seems wrong to me at first impression, because "LAPIC enabled"
>> and "irqchip not split" should imply the existence of an in-kernel
>> IOAPIC.  However, I cannot suggest the right course of action without
>> seeing a testcase.
> 
> 
> 
> I've created a reasonably beautified reproducer here:
> https://groups.google.com/forum/#!msg/syzkaller/6V-KXaMDYi8/rOvBl-69DAAJ

Thanks, this is beautiful enough. :)

Hmm, the combination of 6c7caebc26c5 ("KVM: introduce
kvm->created_vcpus", 2016-06-16) and 4c5ea0a9cd02 ("locking/static_key:
Fix concurrent static_key_slow_inc()", 2016-06-24) should have fixed it
for good.

Is the ENABLE_CAP necessary to reproduce?  Then, the bug is simply that
the ENABLE_CAP should have failed without an irqchip (the
KVM_CREATE_IRQCHIP in turn must have failed with EINVAL).

Paolo

> 
> FWIW the patch has fixed the crash, but there is another similar one
> that happens slightly earlier:
> 
> general protection fault: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 17462 Comm: syz-executor4 Not tainted 4.10.0-rc1+ #119
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> task: ffff88003d6a2280 task.stack: ffff8800357e8000
> RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline]
> RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline]
> RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline]
> RIP: 0010:vcpu_run arch/x86/kvm/x86.c:6944 [inline]
> RIP: 0010:kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
> RSP: 0018:ffff8800357ef7d8 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff88003a459170 RCX: ffffc90000a76000
> RDX: 0000000000000014 RSI: ffffffff810ec1ce RDI: 00000000000000a0
> RBP: ffff8800357efa50 R08: ffff88003d7f2560 R09: 0000000000000000
> R10: 38260856151e7596 R11: dffffc0000000000 R12: dffffc0000000000
> R13: ffff8800357ef9e0 R14: ffff88003a459140 R15: 0000000000000000
> FS:  00007faa53a40700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000003d5c6000 CR4: 00000000000026e0
> Call Trace:
>  kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
>  vfs_ioctl fs/ioctl.c:43 [inline]
>  do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
>  SYSC_ioctl fs/ioctl.c:698 [inline]
>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> RIP: 0033:0x443a19
> RSP: 002b:00007faa53a3fb58 EFLAGS: 00000286 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000018 RCX: 0000000000443a19
> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
> RBP: 00000000006ddb30 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000286 R12: 00000000007000a8
> R13: 00007faa53c23328 R14: 00007faa53c25358 R15: 0000000000000003
> Code: e0 03 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 82 1a 00
> 00 49 8b 86 e0 03 00 00 48 8d b8 a0 00 00 00 48 89 fa 48 c1 ea 03 <42>
> 80 3c 22 00 0f 85 3f 15 00 00 48 8b 80 a0 00 00 00 48 8d b8
> RIP: kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline] RSP: ffff8800357ef7d8
> RIP: vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline] RSP: ffff8800357ef7d8
> RIP: vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline] RSP: ffff8800357ef7d8
> RIP: vcpu_run arch/x86/kvm/x86.c:6944 [inline] RSP: ffff8800357ef7d8
> RIP: kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
> RSP: ffff8800357ef7d8
> ---[ end trace 72f3e29e9ea09f21 ]---
> Kernel panic - not syncing: Fatal exception
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> 
> 
> static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
> {
> u64 eoi_exit_bitmap[4];
> 
> if (!kvm_apic_hw_enabled(vcpu->arch.apic)) // <-----HERE
> return;
> 
> 
> static inline int kvm_apic_hw_enabled(struct kvm_lapic *apic)
> {
> if (static_key_false(&apic_hw_disabled.key))
> return apic->vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE; // <--- HERE
> return MSR_IA32_APICBASE_ENABLE;
> }
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Wanpeng Li Jan. 2, 2017, 10:37 p.m. UTC | #4
2017-01-03 2:01 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 02/01/2017 11:17, Dmitry Vyukov wrote:
>> On Mon, Jan 2, 2017 at 11:09 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>>
>>>
>>> On 01/01/2017 04:44, Wanpeng Li wrote:
>>>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>>>
>>>> This was reported by syzkaller:
>>>>
>>>> BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
>>>> IP: _raw_spin_lock+0xc/0x30
>>>> PGD 3e28eb067
>>>> PUD 3f0ac6067
>>>> PMD 0
>>>> Oops: 0002 [#1] SMP
>>>> CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
>>>> Call Trace:
>>>>  ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
>>>>  kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
>>>>  ? pick_next_task_fair+0xe1/0x4e0
>>>>  ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
>>>>  kvm_vcpu_ioctl+0x33a/0x600 [kvm]
>>>>  ? hrtimer_try_to_cancel+0x29/0x130
>>>>  ? do_nanosleep+0x97/0xf0
>>>>  do_vfs_ioctl+0xa1/0x5d0
>>>>  ? __hrtimer_init+0x90/0x90
>>>>  ? do_nanosleep+0x5b/0xf0
>>>>  SyS_ioctl+0x79/0x90
>>>>  do_syscall_64+0x6e/0x180
>>>>  entry_SYSCALL64_slow_path+0x25/0x25
>>>> RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0
>>>>
>>>> KVM will skip over create pic/ioapic if there is a created vCPU. However,
>>>> there is no guarantee whether ioapic is present when rescan ioapic which
>>>> results in NULL dereference ioapic->lock. This patch fix it by adding the
>>>> ioapic present check to ioapic scan.
>>>>
>>>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>>>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>>>> ---
>>>>  arch/x86/kvm/x86.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 51ccfe0..9ca175c 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -6556,7 +6556,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>>>>       else {
>>>>               if (vcpu->arch.apicv_active)
>>>>                       kvm_x86_ops->sync_pir_to_irr(vcpu);
>>>> -             kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>>>> +             if (ioapic_irqchip(vcpu->kvm))
>>>> +                     kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>>>>       }
>>>>       bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
>>>>                 vcpu_to_synic(vcpu)->vec_bitmap, 256);
>>>
>>> Commit message for fuzzing bugs have usually included a beautified
>>> reproducer.  However, you are not even saying if it is a race, or it is
>>> deterministic.
>>>
>>> The fix seems wrong to me at first impression, because "LAPIC enabled"
>>> and "irqchip not split" should imply the existence of an in-kernel
>>> IOAPIC.  However, I cannot suggest the right course of action without
>>> seeing a testcase.
>>
>>
>>
>> I've created a reasonably beautified reproducer here:
>> https://groups.google.com/forum/#!msg/syzkaller/6V-KXaMDYi8/rOvBl-69DAAJ
>
> Thanks, this is beautiful enough. :)
>
> Hmm, the combination of 6c7caebc26c5 ("KVM: introduce
> kvm->created_vcpus", 2016-06-16) and 4c5ea0a9cd02 ("locking/static_key:
> Fix concurrent static_key_slow_inc()", 2016-06-24) should have fixed it
> for good.
>
> Is the ENABLE_CAP necessary to reproduce?  Then, the bug is simply that
> the ENABLE_CAP should have failed without an irqchip (the

The beautified reproducer didn't check KVM_CAP_IRQCHIP.

> KVM_CREATE_IRQCHIP in turn must have failed with EINVAL).
>
> Paolo
>
>>
>> FWIW the patch has fixed the crash, but there is another similar one
>> that happens slightly earlier:

Maybe a irqchip_in_kernel() check at the head of function vcpu_scan_ioapic().

Regards,
Wanpeng Li

>>
>> general protection fault: 0000 [#1] SMP KASAN
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 1 PID: 17462 Comm: syz-executor4 Not tainted 4.10.0-rc1+ #119
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> task: ffff88003d6a2280 task.stack: ffff8800357e8000
>> RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline]
>> RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline]
>> RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline]
>> RIP: 0010:vcpu_run arch/x86/kvm/x86.c:6944 [inline]
>> RIP: 0010:kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
>> RSP: 0018:ffff8800357ef7d8 EFLAGS: 00010206
>> RAX: 0000000000000000 RBX: ffff88003a459170 RCX: ffffc90000a76000
>> RDX: 0000000000000014 RSI: ffffffff810ec1ce RDI: 00000000000000a0
>> RBP: ffff8800357efa50 R08: ffff88003d7f2560 R09: 0000000000000000
>> R10: 38260856151e7596 R11: dffffc0000000000 R12: dffffc0000000000
>> R13: ffff8800357ef9e0 R14: ffff88003a459140 R15: 0000000000000000
>> FS:  00007faa53a40700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 000000003d5c6000 CR4: 00000000000026e0
>> Call Trace:
>>  kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
>>  vfs_ioctl fs/ioctl.c:43 [inline]
>>  do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
>>  SYSC_ioctl fs/ioctl.c:698 [inline]
>>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>> RIP: 0033:0x443a19
>> RSP: 002b:00007faa53a3fb58 EFLAGS: 00000286 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 0000000000000018 RCX: 0000000000443a19
>> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
>> RBP: 00000000006ddb30 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000286 R12: 00000000007000a8
>> R13: 00007faa53c23328 R14: 00007faa53c25358 R15: 0000000000000003
>> Code: e0 03 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 82 1a 00
>> 00 49 8b 86 e0 03 00 00 48 8d b8 a0 00 00 00 48 89 fa 48 c1 ea 03 <42>
>> 80 3c 22 00 0f 85 3f 15 00 00 48 8b 80 a0 00 00 00 48 8d b8
>> RIP: kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline] RSP: ffff8800357ef7d8
>> RIP: vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline] RSP: ffff8800357ef7d8
>> RIP: vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline] RSP: ffff8800357ef7d8
>> RIP: vcpu_run arch/x86/kvm/x86.c:6944 [inline] RSP: ffff8800357ef7d8
>> RIP: kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
>> RSP: ffff8800357ef7d8
>> ---[ end trace 72f3e29e9ea09f21 ]---
>> Kernel panic - not syncing: Fatal exception
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Kernel Offset: disabled
>>
>>
>> static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>> {
>> u64 eoi_exit_bitmap[4];
>>
>> if (!kvm_apic_hw_enabled(vcpu->arch.apic)) // <-----HERE
>> return;
>>
>>
>> static inline int kvm_apic_hw_enabled(struct kvm_lapic *apic)
>> {
>> if (static_key_false(&apic_hw_disabled.key))
>> return apic->vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE; // <--- HERE
>> return MSR_IA32_APICBASE_ENABLE;
>> }
>>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Vyukov Jan. 3, 2017, 9:27 a.m. UTC | #5
On Mon, Jan 2, 2017 at 7:01 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 02/01/2017 11:17, Dmitry Vyukov wrote:
>> On Mon, Jan 2, 2017 at 11:09 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>
>>>
>>>
>>> On 01/01/2017 04:44, Wanpeng Li wrote:
>>>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>>>
>>>> This was reported by syzkaller:
>>>>
>>>> BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
>>>> IP: _raw_spin_lock+0xc/0x30
>>>> PGD 3e28eb067
>>>> PUD 3f0ac6067
>>>> PMD 0
>>>> Oops: 0002 [#1] SMP
>>>> CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
>>>> Call Trace:
>>>>  ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
>>>>  kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
>>>>  ? pick_next_task_fair+0xe1/0x4e0
>>>>  ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
>>>>  kvm_vcpu_ioctl+0x33a/0x600 [kvm]
>>>>  ? hrtimer_try_to_cancel+0x29/0x130
>>>>  ? do_nanosleep+0x97/0xf0
>>>>  do_vfs_ioctl+0xa1/0x5d0
>>>>  ? __hrtimer_init+0x90/0x90
>>>>  ? do_nanosleep+0x5b/0xf0
>>>>  SyS_ioctl+0x79/0x90
>>>>  do_syscall_64+0x6e/0x180
>>>>  entry_SYSCALL64_slow_path+0x25/0x25
>>>> RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0
>>>>
>>>> KVM will skip over create pic/ioapic if there is a created vCPU. However,
>>>> there is no guarantee whether ioapic is present when rescan ioapic which
>>>> results in NULL dereference ioapic->lock. This patch fix it by adding the
>>>> ioapic present check to ioapic scan.
>>>>
>>>> Reported-by: Dmitry Vyukov <dvyukov@google.com>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>>>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>>>> ---
>>>>  arch/x86/kvm/x86.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 51ccfe0..9ca175c 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -6556,7 +6556,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>>>>       else {
>>>>               if (vcpu->arch.apicv_active)
>>>>                       kvm_x86_ops->sync_pir_to_irr(vcpu);
>>>> -             kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>>>> +             if (ioapic_irqchip(vcpu->kvm))
>>>> +                     kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
>>>>       }
>>>>       bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
>>>>                 vcpu_to_synic(vcpu)->vec_bitmap, 256);
>>>
>>> Commit message for fuzzing bugs have usually included a beautified
>>> reproducer.  However, you are not even saying if it is a race, or it is
>>> deterministic.
>>>
>>> The fix seems wrong to me at first impression, because "LAPIC enabled"
>>> and "irqchip not split" should imply the existence of an in-kernel
>>> IOAPIC.  However, I cannot suggest the right course of action without
>>> seeing a testcase.
>>
>>
>>
>> I've created a reasonably beautified reproducer here:
>> https://groups.google.com/forum/#!msg/syzkaller/6V-KXaMDYi8/rOvBl-69DAAJ
>
> Thanks, this is beautiful enough. :)
>
> Hmm, the combination of 6c7caebc26c5 ("KVM: introduce
> kvm->created_vcpus", 2016-06-16) and 4c5ea0a9cd02 ("locking/static_key:
> Fix concurrent static_key_slow_inc()", 2016-06-24) should have fixed it
> for good.
>
> Is the ENABLE_CAP necessary to reproduce?  Then, the bug is simply that
> the ENABLE_CAP should have failed without an irqchip (the
> KVM_CREATE_IRQCHIP in turn must have failed with EINVAL).

ENABLE_CAP is necessary to reproduce.
You are right KVM_CREATE_IRQCHIP fails with EINVAL:
ioctl(4, KVM_CREATE_IRQCHIP, 0)         = -1 EINVAL (Invalid argument)
so I guess it is not necessary to reproduce.
It looks like ENABLE_CAP(KVM_CAP_HYPERV_SYNIC) badly races with exit in KVM_RUN.



>> FWIW the patch has fixed the crash, but there is another similar one
>> that happens slightly earlier:
>>
>> general protection fault: 0000 [#1] SMP KASAN
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 1 PID: 17462 Comm: syz-executor4 Not tainted 4.10.0-rc1+ #119
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> task: ffff88003d6a2280 task.stack: ffff8800357e8000
>> RIP: 0010:kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline]
>> RIP: 0010:vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline]
>> RIP: 0010:vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline]
>> RIP: 0010:vcpu_run arch/x86/kvm/x86.c:6944 [inline]
>> RIP: 0010:kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
>> RSP: 0018:ffff8800357ef7d8 EFLAGS: 00010206
>> RAX: 0000000000000000 RBX: ffff88003a459170 RCX: ffffc90000a76000
>> RDX: 0000000000000014 RSI: ffffffff810ec1ce RDI: 00000000000000a0
>> RBP: ffff8800357efa50 R08: ffff88003d7f2560 R09: 0000000000000000
>> R10: 38260856151e7596 R11: dffffc0000000000 R12: dffffc0000000000
>> R13: ffff8800357ef9e0 R14: ffff88003a459140 R15: 0000000000000000
>> FS:  00007faa53a40700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 000000003d5c6000 CR4: 00000000000026e0
>> Call Trace:
>>  kvm_vcpu_ioctl+0x673/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
>>  vfs_ioctl fs/ioctl.c:43 [inline]
>>  do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:683
>>  SYSC_ioctl fs/ioctl.c:698 [inline]
>>  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
>>  entry_SYSCALL_64_fastpath+0x1f/0xc2
>> RIP: 0033:0x443a19
>> RSP: 002b:00007faa53a3fb58 EFLAGS: 00000286 ORIG_RAX: 0000000000000010
>> RAX: ffffffffffffffda RBX: 0000000000000018 RCX: 0000000000443a19
>> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000018
>> RBP: 00000000006ddb30 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000286 R12: 00000000007000a8
>> R13: 00007faa53c23328 R14: 00007faa53c25358 R15: 0000000000000003
>> Code: e0 03 00 00 48 89 f8 48 c1 e8 03 42 80 3c 20 00 0f 85 82 1a 00
>> 00 49 8b 86 e0 03 00 00 48 8d b8 a0 00 00 00 48 89 fa 48 c1 ea 03 <42>
>> 80 3c 22 00 0f 85 3f 15 00 00 48 8b 80 a0 00 00 00 48 8d b8
>> RIP: kvm_apic_hw_enabled arch/x86/kvm/lapic.h:156 [inline] RSP: ffff8800357ef7d8
>> RIP: vcpu_scan_ioapic arch/x86/kvm/x86.c:6559 [inline] RSP: ffff8800357ef7d8
>> RIP: vcpu_enter_guest arch/x86/kvm/x86.c:6691 [inline] RSP: ffff8800357ef7d8
>> RIP: vcpu_run arch/x86/kvm/x86.c:6944 [inline] RSP: ffff8800357ef7d8
>> RIP: kvm_arch_vcpu_ioctl_run+0x272c/0x4660 arch/x86/kvm/x86.c:7102
>> RSP: ffff8800357ef7d8
>> ---[ end trace 72f3e29e9ea09f21 ]---
>> Kernel panic - not syncing: Fatal exception
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Kernel Offset: disabled
>>
>>
>> static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
>> {
>> u64 eoi_exit_bitmap[4];
>>
>> if (!kvm_apic_hw_enabled(vcpu->arch.apic)) // <-----HERE
>> return;
>>
>>
>> static inline int kvm_apic_hw_enabled(struct kvm_lapic *apic)
>> {
>> if (static_key_false(&apic_hw_disabled.key))
>> return apic->vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE; // <--- HERE
>> return MSR_IA32_APICBASE_ENABLE;
>> }
>>
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 51ccfe0..9ca175c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6556,7 +6556,8 @@  static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	else {
 		if (vcpu->arch.apicv_active)
 			kvm_x86_ops->sync_pir_to_irr(vcpu);
-		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
+		if (ioapic_irqchip(vcpu->kvm))
+			kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
 	bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
 		  vcpu_to_synic(vcpu)->vec_bitmap, 256);