Message ID: 20200422161844.3848063-6-maz@kernel.org (mailing list archive)
State: New, archived
Series: KVM: arm: vgic fixes for 5.7
Hi Marc, Zenghui,

On 22/04/2020 17:18, Marc Zyngier wrote:
> From: Zenghui Yu <yuzenghui@huawei.com>
>
> It's likely that the vcpu fails to handle all virtual interrupts if
> userspace decides to destroy it, leaving the pending ones stay in the
> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
> be eventually leaked because of an extra refcount increment in
> vgic_queue_irq_unlock().

> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index a963b9d766b73..53ec9b9d9bc43 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>
> +	/*
> +	 * Retire all pending LPIs on this vcpu anyway as we're
> +	 * going to destroy it.
> +	 */

Looking at the other caller, do we need something like:
|	if (vgic_cpu->lpis_enabled)

?

> +	vgic_flush_pending_lpis(vcpu);
> +

Otherwise, I get this on a gic-v2 machine!:

[ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
[ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
[ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
[ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jul 30 2018
[ 1742.222596] Call trace:
[ 1742.225059]  dump_backtrace+0x0/0x328
[ 1742.228738]  show_stack+0x18/0x28
[ 1742.232071]  dump_stack+0x134/0x1b0
[ 1742.235578]  print_address_description.isra.0+0x6c/0x350
[ 1742.240910]  __kasan_report+0x10c/0x180
[ 1742.244763]  kasan_report+0x4c/0x68
[ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
[ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
[ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
[ 1742.261919]  kvm_vgic_destroy+0x30/0x48
[ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
[ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
[ 1742.273567]  kvm_vm_release+0x3c/0x60
[ 1742.277248]  __fput+0x218/0x630
[ 1742.280406]  ____fput+0x10/0x20
[ 1742.283565]  task_work_run+0xd8/0x1f0
[ 1742.287245]  do_exit+0x87c/0x2640
[ 1742.290575]  do_group_exit+0xd0/0x258
[ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
[ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
[ 1742.303529]  do_el0_svc+0x48/0xd0
[ 1742.306861]  el0_sync_handler+0x11c/0x1b8
[ 1742.310888]  el0_sync+0x158/0x180
[ 1742.315716] The buggy address belongs to the page:
[ 1742.320529] page:fffffe002366fc40 refcount:0 mapcount:0 mapping:000000007e21d29f index:0x0
[ 1742.328821] flags: 0x2ffff00000000000()
[ 1742.332678] raw: 2ffff00000000000 0000000000000000 ffffffff23660401 0000000000000000
[ 1742.340449] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1742.348215] page dumped because: kasan: bad access detected
[ 1742.355304] Memory state around the buggy address:
[ 1742.360115]  ffff0008e1bf1e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.367360]  ffff0008e1bf1e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.374606] >ffff0008e1bf1f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.381851]                                   ^
[ 1742.386399]  ffff0008e1bf1f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.393645]  ffff0008e1bf2000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.400889] ==================================================================
[ 1742.408132] Disabling lock debugging due to kernel taint

With that:
Reviewed-by: James Morse <james.morse@arm.com>

Thanks,

James
Hi James,

Thanks for having a look and testing it!

On 2020/4/23 19:35, James Morse wrote:
> Hi Marc, Zenghui,
>
> On 22/04/2020 17:18, Marc Zyngier wrote:
>> From: Zenghui Yu <yuzenghui@huawei.com>
>>
>> It's likely that the vcpu fails to handle all virtual interrupts if
>> userspace decides to destroy it, leaving the pending ones stay in the
>> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
>> be eventually leaked because of an extra refcount increment in
>> vgic_queue_irq_unlock().
>
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index a963b9d766b73..53ec9b9d9bc43 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>
>> +	/*
>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>> +	 * going to destroy it.
>> +	 */
>
> Looking at the other caller, do we need something like:
> |	if (vgic_cpu->lpis_enabled)
>
> ?

If LPIs are disabled at redistributor level, yes there should be no
pending LPIs in the ap_list. But I'm not sure how can the following
use-after-free BUG be triggered.

>
>> +	vgic_flush_pending_lpis(vcpu);
>> +
>
> Otherwise, I get this on a gic-v2 machine!:

I don't have a gic-v2 one and thus can't reproduce it :-(

> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development
> Platform, BIOS EDK II Jul 30 2018
> [ 1742.222596] Call trace:
> [ 1742.225059]  dump_backtrace+0x0/0x328
> [ 1742.228738]  show_stack+0x18/0x28
> [ 1742.232071]  dump_stack+0x134/0x1b0
> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
> [ 1742.240910]  __kasan_report+0x10c/0x180
> [ 1742.244763]  kasan_report+0x4c/0x68
> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0

Could you please show the result of

	./scripts/faddr2line vmlinux vgic_flush_pending_lpis+0x250/0x2c0

on your setup?

> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
> [ 1742.273567]  kvm_vm_release+0x3c/0x60
> [ 1742.277248]  __fput+0x218/0x630
> [ 1742.280406]  ____fput+0x10/0x20
> [ 1742.283565]  task_work_run+0xd8/0x1f0
> [ 1742.287245]  do_exit+0x87c/0x2640
> [ 1742.290575]  do_group_exit+0xd0/0x258
> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
> [ 1742.303529]  do_el0_svc+0x48/0xd0
> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
> [ 1742.310888]  el0_sync+0x158/0x180
> [ 1742.315716] The buggy address belongs to the page:
> [ 1742.320529] page:fffffe002366fc40 refcount:0 mapcount:0 mapping:000000007e21d29f index:0x0
> [ 1742.328821] flags: 0x2ffff00000000000()
> [ 1742.332678] raw: 2ffff00000000000 0000000000000000 ffffffff23660401 0000000000000000
> [ 1742.340449] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [ 1742.348215] page dumped because: kasan: bad access detected
> [ 1742.355304] Memory state around the buggy address:
> [ 1742.360115]  ffff0008e1bf1e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.367360]  ffff0008e1bf1e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.374606] >ffff0008e1bf1f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.381851]                                   ^
> [ 1742.386399]  ffff0008e1bf1f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.393645]  ffff0008e1bf2000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.400889] ==================================================================
> [ 1742.408132] Disabling lock debugging due to kernel taint
>
>
> With that:
> Reviewed-by: James Morse <james.morse@arm.com>

Thanks a lot!

Zenghui
Hi James,

Thanks for the heads up.

On 2020-04-23 12:35, James Morse wrote:
> Hi Marc, Zenghui,
>
> On 22/04/2020 17:18, Marc Zyngier wrote:
>> From: Zenghui Yu <yuzenghui@huawei.com>
>>
>> It's likely that the vcpu fails to handle all virtual interrupts if
>> userspace decides to destroy it, leaving the pending ones stay in the
>> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
>> be eventually leaked because of an extra refcount increment in
>> vgic_queue_irq_unlock().
>
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c
>> b/virt/kvm/arm/vgic/vgic-init.c
>> index a963b9d766b73..53ec9b9d9bc43 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>
>> +	/*
>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>> +	 * going to destroy it.
>> +	 */
>
> Looking at the other caller, do we need something like:
> |	if (vgic_cpu->lpis_enabled)
>
> ?

Huh... On its own, this call is absolutely harmless even if you
don't have LPIs. But see below.

>
>> +	vgic_flush_pending_lpis(vcpu);
>> +
>
> Otherwise, I get this on a gic-v2 machine!:
> [ 1742.187139] BUG: KASAN: use-after-free in
> vgic_flush_pending_lpis+0x250/0x2c0
> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task
> qemu-system-aar/542
> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development
> Platform/ARM Juno Development
> Platform, BIOS EDK II Jul 30 2018
> [ 1742.222596] Call trace:
> [ 1742.225059]  dump_backtrace+0x0/0x328
> [ 1742.228738]  show_stack+0x18/0x28
> [ 1742.232071]  dump_stack+0x134/0x1b0
> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
> [ 1742.240910]  __kasan_report+0x10c/0x180
> [ 1742.244763]  kasan_report+0x4c/0x68
> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
> [ 1742.273567]  kvm_vm_release+0x3c/0x60
> [ 1742.277248]  __fput+0x218/0x630
> [ 1742.280406]  ____fput+0x10/0x20
> [ 1742.283565]  task_work_run+0xd8/0x1f0
> [ 1742.287245]  do_exit+0x87c/0x2640
> [ 1742.290575]  do_group_exit+0xd0/0x258
> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
> [ 1742.303529]  do_el0_svc+0x48/0xd0
> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
> [ 1742.310888]  el0_sync+0x158/0x180
> [ 1742.315716] The buggy address belongs to the page:
> [ 1742.320529] page:fffffe002366fc40 refcount:0 mapcount:0
> mapping:000000007e21d29f index:0x0
> [ 1742.328821] flags: 0x2ffff00000000000()
> [ 1742.332678] raw: 2ffff00000000000 0000000000000000 ffffffff23660401
> 0000000000000000
> [ 1742.340449] raw: 0000000000000000 0000000000000000 00000000ffffffff
> 0000000000000000
> [ 1742.348215] page dumped because: kasan: bad access detected
> [ 1742.355304] Memory state around the buggy address:
> [ 1742.360115]  ffff0008e1bf1e00: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.367360]  ffff0008e1bf1e80: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.374606] >ffff0008e1bf1f00: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.381851]                                   ^
> [ 1742.386399]  ffff0008e1bf1f80: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.393645]  ffff0008e1bf2000: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.400889] ==================================================================
> [ 1742.408132] Disabling lock debugging due to kernel taint
>
>
> With that:
> Reviewed-by: James Morse <james.morse@arm.com>

I think this is slightly more concerning. The issue is that we have
started freeing parts of the interrupt state already (we free the
SPIs early in kvm_vgic_dist_destroy()).

If a SPI was pending or active at this stage (i.e. present in the
ap_list), we are going to iterate over memory that has been freed
already. This is bad, and this can happen on GICv3 as well.

I think this should solve it, but I need to test it on a GICv2 system:

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 53ec9b9d9bc43..30dbec9fe0b4a 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
 
 	vgic_debug_destroy(kvm);
 
-	kvm_vgic_dist_destroy(kvm);
-
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_destroy(vcpu);
+
+	kvm_vgic_dist_destroy(kvm);
 }
 
 void kvm_vgic_destroy(struct kvm *kvm)

Thanks,

        M.
On 2020/4/23 20:03, Marc Zyngier wrote:
>
> I think this is slightly more concerning. The issue is that we have
> started freeing parts of the interrupt state already (we free the
> SPIs early in kvm_vgic_dist_destroy()).
>
> If a SPI was pending or active at this stage (i.e. present in the
> ap_list), we are going to iterate over memory that has been freed
> already. This is bad, and this can happen on GICv3 as well.

Ah, I think this should be the case.

>
> I think this should solve it, but I need to test it on a GICv2 system:

Agreed.

>
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
>
>  	vgic_debug_destroy(kvm);
>
> -	kvm_vgic_dist_destroy(kvm);
> -
>  	kvm_for_each_vcpu(i, vcpu, kvm)
>  		kvm_vgic_vcpu_destroy(vcpu);
> +
> +	kvm_vgic_dist_destroy(kvm);
>  }
>
>  void kvm_vgic_destroy(struct kvm *kvm)

Thanks for the fix,

Zenghui
Hi Zenghui,

On 23/04/2020 12:57, Zenghui Yu wrote:
> On 2020/4/23 19:35, James Morse wrote:
>> On 22/04/2020 17:18, Marc Zyngier wrote:
>>> From: Zenghui Yu <yuzenghui@huawei.com>
>>>
>>> It's likely that the vcpu fails to handle all virtual interrupts if
>>> userspace decides to destroy it, leaving the pending ones stay in the
>>> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
>>> be eventually leaked because of an extra refcount increment in
>>> vgic_queue_irq_unlock().
>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>>> index a963b9d766b73..53ec9b9d9bc43 100644
>>> --- a/virt/kvm/arm/vgic/vgic-init.c
>>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>>  {
>>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> +	/*
>>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>>> +	 * going to destroy it.
>>> +	 */
>>
>> Looking at the other caller, do we need something like:
>> |	if (vgic_cpu->lpis_enabled)
>>
>> ?
>
> If LPIs are disabled at redistributor level, yes there should be no
> pending LPIs in the ap_list. But I'm not sure how can the following
> use-after-free BUG be triggered.

>
>>
>>> +	vgic_flush_pending_lpis(vcpu);
>>> +
>>
>> Otherwise, I get this on a gic-v2 machine!:
>
> I don't have a gic-v2 one and thus can't reproduce it :-(
>
>> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
>> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
>> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
>> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development
>> Platform, BIOS EDK II Jul 30 2018
>> [ 1742.222596] Call trace:
>> [ 1742.225059]  dump_backtrace+0x0/0x328
>> [ 1742.228738]  show_stack+0x18/0x28
>> [ 1742.232071]  dump_stack+0x134/0x1b0
>> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
>> [ 1742.240910]  __kasan_report+0x10c/0x180
>> [ 1742.244763]  kasan_report+0x4c/0x68
>> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
>> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
>
> Could you please show the result of
>
> 	./scripts/faddr2line vmlinux vgic_flush_pending_lpis+0x250/0x2c0
>
> on your setup?

vgic_flush_pending_lpis+0x250/0x2c0:
vgic_flush_pending_lpis at arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic.c:159

Which is:
|	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {

I think this confirms Marc's view of the KASAN splat.

>> With that:
>> Reviewed-by: James Morse <james.morse@arm.com>
>
> Thanks a lot!

Heh, it looks like I had the wrong end of the stick with this... I
wrongly assumed calling this on gicv2 would go using structures that
held something else.

Thanks,

James
Hi guys,

On 23/04/2020 13:03, Marc Zyngier wrote:
> On 2020-04-23 12:35, James Morse wrote:
>> On 22/04/2020 17:18, Marc Zyngier wrote:
>>> From: Zenghui Yu <yuzenghui@huawei.com>
>>>
>>> It's likely that the vcpu fails to handle all virtual interrupts if
>>> userspace decides to destroy it, leaving the pending ones stay in the
>>> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
>>> be eventually leaked because of an extra refcount increment in
>>> vgic_queue_irq_unlock().
>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>>> index a963b9d766b73..53ec9b9d9bc43 100644
>>> --- a/virt/kvm/arm/vgic/vgic-init.c
>>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>>  {
>>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>
>>> +	/*
>>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>>> +	 * going to destroy it.
>>> +	 */
>>
>> Looking at the other caller, do we need something like:
>> |	if (vgic_cpu->lpis_enabled)
>>
>> ?
>
> Huh... On its own, this call is absolutely harmless even if you
> don't have LPIs. But see below.

>
>>
>>> +	vgic_flush_pending_lpis(vcpu);
>>> +
>>
>> Otherwise, I get this on a gic-v2 machine!:
>> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task
>> qemu-system-aar/542
>> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
>> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
>> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development
>> Platform/ARM Juno Development
>> Platform, BIOS EDK II Jul 30 2018
>> [ 1742.222596] Call trace:
>> [ 1742.225059]  dump_backtrace+0x0/0x328
>> [ 1742.228738]  show_stack+0x18/0x28
>> [ 1742.232071]  dump_stack+0x134/0x1b0
>> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
>> [ 1742.240910]  __kasan_report+0x10c/0x180
>> [ 1742.244763]  kasan_report+0x4c/0x68
>> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
>> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
>> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
>> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
>> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
>> [ 1742.273567]  kvm_vm_release+0x3c/0x60
>> [ 1742.277248]  __fput+0x218/0x630
>> [ 1742.280406]  ____fput+0x10/0x20
>> [ 1742.283565]  task_work_run+0xd8/0x1f0
>> [ 1742.287245]  do_exit+0x87c/0x2640
>> [ 1742.290575]  do_group_exit+0xd0/0x258
>> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
>> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
>> [ 1742.303529]  do_el0_svc+0x48/0xd0
>> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
>> [ 1742.310888]  el0_sync+0x158/0x180
>> [ 1742.348215] page dumped because: kasan: bad access detected

> I think this is slightly more concerning. The issue is that we have
> started freeing parts of the interrupt state already (we free the
> SPIs early in kvm_vgic_dist_destroy()).

(I took this to be some wild pointer access. Previously for
use-after-free I've seen it print where it was allocated and where it
was freed).

> If a SPI was pending or active at this stage (i.e. present in the
> ap_list), we are going to iterate over memory that has been freed
> already. This is bad, and this can happen on GICv3 as well.

> I think this should solve it, but I need to test it on a GICv2 system:
>
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
>
>  	vgic_debug_destroy(kvm);
>
> -	kvm_vgic_dist_destroy(kvm);
> -
>  	kvm_for_each_vcpu(i, vcpu, kvm)
>  		kvm_vgic_vcpu_destroy(vcpu);
> +
> +	kvm_vgic_dist_destroy(kvm);
>  }
>
>  void kvm_vgic_destroy(struct kvm *kvm)

This works for me on Juno.

Thanks,

James
Hi James,

On 2020-04-23 15:34, James Morse wrote:
> Hi guys,
>
> On 23/04/2020 13:03, Marc Zyngier wrote:
>> On 2020-04-23 12:35, James Morse wrote:

[...]

>>> [ 1742.348215] page dumped because: kasan: bad access detected
>
>> I think this is slightly more concerning. The issue is that we have
>> started freeing parts of the interrupt state already (we free the
>> SPIs early in kvm_vgic_dist_destroy()).
>
> (I took this to be some wild pointer access. Previously for
> use-after-free I've seen it
> print where it was allocated and where it was freed).

This is indeed what I managed to trigger by forcing a pending SPI
(the kvmtool UART interrupt) in the guest and forcefully terminating it:

[ 3807.084237] ==================================================================
[ 3807.086516] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x54/0x198
[ 3807.088027] Read of size 8 at addr ffff00085514a328 by task ioeventfd-worke/231
[ 3807.089771]
[ 3807.090911] CPU: 4 PID: 231 Comm: ioeventfd-worke Not tainted 5.7.0-rc2-00086-g2100c066e9a78 #200
[ 3807.092864] Hardware name: FVP Base RevC (DT)
[ 3807.094003] Call trace:
[ 3807.095180]  dump_backtrace+0x0/0x268
[ 3807.096445]  show_stack+0x1c/0x28
[ 3807.097961]  dump_stack+0xe8/0x144
[ 3807.099374]  print_address_description.isra.11+0x6c/0x354
[ 3807.101002]  __kasan_report+0x110/0x1c8
[ 3807.102332]  kasan_report+0x48/0x60
[ 3807.103769]  __asan_load8+0x9c/0xc0
[ 3807.105113]  vgic_flush_pending_lpis+0x54/0x198
[ 3807.107187]  __kvm_vgic_destroy+0x120/0x278
[ 3807.108814]  kvm_vgic_destroy+0x30/0x48
[ 3807.110443]  kvm_arch_destroy_vm+0x20/0xa8
[ 3807.111868]  kvm_put_kvm+0x234/0x460
[ 3807.113697]  kvm_vm_release+0x34/0x48
[ 3807.115162]  __fput+0x104/0x2f8
[ 3807.116464]  ____fput+0x14/0x20
[ 3807.117929]  task_work_run+0xbc/0x188
[ 3807.119419]  do_exit+0x514/0xff8
[ 3807.120859]  do_group_exit+0x78/0x108
[ 3807.122323]  get_signal+0x164/0xcc0
[ 3807.123951]  do_notify_resume+0x244/0x5e0
[ 3807.125416]  work_pending+0x8/0x10
[ 3807.126392]
[ 3807.126969] Allocated by task 229:
[ 3807.128834]  save_stack+0x24/0x50
[ 3807.130462]  __kasan_kmalloc.isra.10+0xc4/0xe0
[ 3807.132134]  kasan_kmalloc+0xc/0x18
[ 3807.133554]  __kmalloc+0x174/0x270
[ 3807.135182]  vgic_init.part.2+0xe0/0x4f0
[ 3807.136809]  vgic_init+0x48/0x58
[ 3807.138095]  vgic_set_common_attr.isra.4+0x2fc/0x388
[ 3807.140081]  vgic_v3_set_attr+0x8c/0x350
[ 3807.141692]  kvm_device_ioctl_attr+0x124/0x190
[ 3807.143260]  kvm_device_ioctl+0xe8/0x170
[ 3807.144947]  ksys_ioctl+0xb8/0xf8
[ 3807.146575]  __arm64_sys_ioctl+0x48/0x60
[ 3807.148365]  el0_svc_common.constprop.1+0xc8/0x1c8
[ 3807.150015]  do_el0_svc+0x94/0xa0
[ 3807.151605]  el0_sync_handler+0x120/0x190
[ 3807.152922]  el0_sync+0x140/0x180
[ 3807.153899]
[ 3807.154784] Freed by task 231:
[ 3807.156178]  save_stack+0x24/0x50
[ 3807.157805]  __kasan_slab_free+0x10c/0x188
[ 3807.159433]  kasan_slab_free+0x10/0x18
[ 3807.160897]  kfree+0x88/0x350
[ 3807.162570]  __kvm_vgic_destroy+0x5c/0x278
[ 3807.164153]  kvm_vgic_destroy+0x30/0x48
[ 3807.165780]  kvm_arch_destroy_vm+0x20/0xa8
[ 3807.167408]  kvm_put_kvm+0x234/0x460
[ 3807.168691]  kvm_vm_release+0x34/0x48
[ 3807.170281]  __fput+0x104/0x2f8
[ 3807.171870]  ____fput+0x14/0x20
[ 3807.173268]  task_work_run+0xbc/0x188
[ 3807.174733]  do_exit+0x514/0xff8
[ 3807.176242]  do_group_exit+0x78/0x108
[ 3807.177434]  get_signal+0x164/0xcc0
[ 3807.179289]  do_notify_resume+0x244/0x5e0
[ 3807.180755]  work_pending+0x8/0x10
[ 3807.181731]
[ 3807.182707] The buggy address belongs to the object at ffff00085514a000
[ 3807.182707]  which belongs to the cache kmalloc-4k of size 4096
[ 3807.185381] The buggy address is located 808 bytes inside of
[ 3807.185381]  4096-byte region [ffff00085514a000, ffff00085514b000)
[ 3807.187591] The buggy address belongs to the page:
[ 3807.189381] page:fffffe0021345200 refcount:1 mapcount:0 mapping:00000000090b1068 index:0x0 head:fffffe0021345200 order:3 compound_mapcount:0 compound_pincount:0
[ 3807.192148] flags: 0x2ffff00000010200(slab|head)
[ 3807.194123] raw: 2ffff00000010200 dead000000000100 dead000000000122 ffff00085a00f200
[ 3807.196379] raw: 0000000000000000 0000000080040004 00000001ffffffff 0000000000000000
[ 3807.198097] page dumped because: kasan: bad access detected
[ 3807.199289]
[ 3807.200123] Memory state around the buggy address:
[ 3807.201750]  ffff00085514a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.203704]  ffff00085514a280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.205657] >ffff00085514a300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.207285]                                   ^
[ 3807.208826]  ffff00085514a380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.210812]  ffff00085514a400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.212402] ==================================================================

>> If a SPI was pending or active at this stage (i.e. present in the
>> ap_list), we are going to iterate over memory that has been freed
>> already. This is bad, and this can happen on GICv3 as well.
>
>
>> I think this should solve it, but I need to test it on a GICv2 system:
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c
>> b/virt/kvm/arm/vgic/vgic-init.c
>> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
>>
>>  	vgic_debug_destroy(kvm);
>>
>> -	kvm_vgic_dist_destroy(kvm);
>> -
>>  	kvm_for_each_vcpu(i, vcpu, kvm)
>>  		kvm_vgic_vcpu_destroy(vcpu);
>> +
>> +	kvm_vgic_dist_destroy(kvm);
>>  }
>>
>> void kvm_vgic_destroy(struct kvm *kvm)
>
> This works for me on Juno.

I've verified that the above splat disappears on the FVP too. I'll
squash the fix in, add your RB (which I assume stands) and send the
whole thing as a lockdown present to Paolo!

Thanks,

        M.
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index a963b9d766b73..53ec9b9d9bc43 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
+	/*
+	 * Retire all pending LPIs on this vcpu anyway as we're
+	 * going to destroy it.
+	 */
+	vgic_flush_pending_lpis(vcpu);
+
 	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
 }