Message ID | 20200414030349.625-2-yuzenghui@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: arm64: vgic_irq: Fix memory leaks |
On Tue, 14 Apr 2020 11:03:47 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:

Hi Zenghui,

> It's likely that the vcpu fails to handle all virtual interrupts if
> userspace decides to destroy it, leaving the pending ones stay in the
> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
> be eventually leaked because of an extra refcount increment in
> vgic_queue_irq_unlock().
>
> This was detected by kmemleak on almost every guest destroy, the
> backtrace is as follows:
>
> unreferenced object 0xffff80725aed5500 (size 128):
>   comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff  ...........sm...
>     c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff  .a... ..(.U.l...
>   backtrace:
>     [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
>     [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
>     [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
>     [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
>     [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
>     [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
>     [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
>     [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
>     [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
>     [<0000000078197602>] io_mem_abort+0x484/0x7b8
>     [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
>     [<00000000e0d0cd65>] handle_exit+0x24c/0x770
>     [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
>     [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
>     [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
>     [<00000000e7f39607>] ksys_ioctl+0x98/0xd8
>
> Fix it by retiring all pending LPIs in the ap_list on the destroy path.
>
> p.s. I can also reproduce it on a normal guest shutdown. It is because
> userspace still send LPIs to vcpu (through KVM_SIGNAL_MSI ioctl) while
> the guest is being shutdown and unable to handle it. A little strange
> though and haven't dig further...

What userspace are you using? You'd hope that the VMM would stop
processing I/Os when destroying the guest. But we still need to handle
it anyway, and I thing this fix makes sense.

>
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> ---
>  virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index a963b9d766b7..53ec9b9d9bc4 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>
> +	/*
> +	 * Retire all pending LPIs on this vcpu anyway as we're
> +	 * going to destroy it.
> +	 */
> +	vgic_flush_pending_lpis(vcpu);
> +
>  	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>  }
>

I guess that at this stage, the INIT_LIST_HEAD() is superfluous, right?

Otherwise, looks good. If you agree with the above, I can fix that
locally, no need to resend this patch.

Thanks,

	M.
Hi Marc,

On 2020/4/14 18:54, Marc Zyngier wrote:
> On Tue, 14 Apr 2020 11:03:47 +0800
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>
> Hi Zenghui,
>
>> It's likely that the vcpu fails to handle all virtual interrupts if
>> userspace decides to destroy it, leaving the pending ones stay in the
>> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
>> be eventually leaked because of an extra refcount increment in
>> vgic_queue_irq_unlock().
>>
>> This was detected by kmemleak on almost every guest destroy, the
>> backtrace is as follows:
>>
>> unreferenced object 0xffff80725aed5500 (size 128):
>>   comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
>>   hex dump (first 32 bytes):
>>     00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff  ...........sm...
>>     c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff  .a... ..(.U.l...
>>   backtrace:
>>     [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
>>     [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
>>     [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
>>     [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
>>     [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
>>     [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
>>     [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
>>     [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
>>     [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
>>     [<0000000078197602>] io_mem_abort+0x484/0x7b8
>>     [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
>>     [<00000000e0d0cd65>] handle_exit+0x24c/0x770
>>     [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
>>     [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
>>     [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
>>     [<00000000e7f39607>] ksys_ioctl+0x98/0xd8
>>
>> Fix it by retiring all pending LPIs in the ap_list on the destroy path.
>>
>> p.s. I can also reproduce it on a normal guest shutdown. It is because
>> userspace still send LPIs to vcpu (through KVM_SIGNAL_MSI ioctl) while
>> the guest is being shutdown and unable to handle it. A little strange
>> though and haven't dig further...
>
> What userspace are you using? You'd hope that the VMM would stop
> processing I/Os when destroying the guest. But we still need to handle
> it anyway, and I thing this fix makes sense.

I'm using Qemu (master) for debugging. Looks like an interrupt
corresponding to a virtio device configuration change, triggered after
all other devices had freed their irqs. Not sure if it's expected.

>>
>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>> ---
>>  virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index a963b9d766b7..53ec9b9d9bc4 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>
>> +	/*
>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>> +	 * going to destroy it.
>> +	 */
>> +	vgic_flush_pending_lpis(vcpu);
>> +
>>  	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
>>  }
>>
>
> I guess that at this stage, the INIT_LIST_HEAD() is superfluous, right?

I was just thinking that the ap_list_head may not be empty (besides LPI,
with other active or pending interrupts), so leave it unchanged.

> Otherwise, looks good. If you agree with the above, I can fix that
> locally, no need to resend this patch.

Thanks,
Zenghui
On Tue, 14 Apr 2020 19:17:49 +0800
Zenghui Yu <yuzenghui@huawei.com> wrote:

> Hi Marc,
>
> On 2020/4/14 18:54, Marc Zyngier wrote:
> > On Tue, 14 Apr 2020 11:03:47 +0800
> > Zenghui Yu <yuzenghui@huawei.com> wrote:
> >
> > Hi Zenghui,
> >
> >> It's likely that the vcpu fails to handle all virtual interrupts if
> >> userspace decides to destroy it, leaving the pending ones stay in the
> >> ap_list. If the un-handled one is a LPI, its vgic_irq structure will
> >> be eventually leaked because of an extra refcount increment in
> >> vgic_queue_irq_unlock().
> >>
> >> This was detected by kmemleak on almost every guest destroy, the
> >> backtrace is as follows:
> >>
> >> unreferenced object 0xffff80725aed5500 (size 128):
> >>   comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
> >>   hex dump (first 32 bytes):
> >>     00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff  ...........sm...
> >>     c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff  .a... ..(.U.l...
> >>   backtrace:
> >>     [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
> >>     [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
> >>     [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
> >>     [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
> >>     [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
> >>     [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
> >>     [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
> >>     [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
> >>     [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
> >>     [<0000000078197602>] io_mem_abort+0x484/0x7b8
> >>     [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
> >>     [<00000000e0d0cd65>] handle_exit+0x24c/0x770
> >>     [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
> >>     [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
> >>     [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
> >>     [<00000000e7f39607>] ksys_ioctl+0x98/0xd8
> >>
> >> Fix it by retiring all pending LPIs in the ap_list on the destroy path.
> >>
> >> p.s. I can also reproduce it on a normal guest shutdown. It is because
> >> userspace still send LPIs to vcpu (through KVM_SIGNAL_MSI ioctl) while
> >> the guest is being shutdown and unable to handle it. A little strange
> >> though and haven't dig further...
> >
> > What userspace are you using? You'd hope that the VMM would stop
> > processing I/Os when destroying the guest. But we still need to handle
> > it anyway, and I thing this fix makes sense.
>
> I'm using Qemu (master) for debugging. Looks like an interrupt
> corresponding to a virtio device configuration change, triggered after
> all other devices had freed their irqs. Not sure if it's expected.
>
> >>
> >> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> >> ---
> >>  virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> >> index a963b9d766b7..53ec9b9d9bc4 100644
> >> --- a/virt/kvm/arm/vgic/vgic-init.c
> >> +++ b/virt/kvm/arm/vgic/vgic-init.c
> >> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
> >>  {
> >>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >>
> >> +	/*
> >> +	 * Retire all pending LPIs on this vcpu anyway as we're
> >> +	 * going to destroy it.
> >> +	 */
> >> +	vgic_flush_pending_lpis(vcpu);
> >> +
> >>  	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
> >>  }
> >>
> >
> > I guess that at this stage, the INIT_LIST_HEAD() is superfluous, right?
>
> I was just thinking that the ap_list_head may not be empty (besides LPI,
> with other active or pending interrupts), so leave it unchanged.

It isn't clear what purpose this serves (the vcpus are about to be
freed, and so are the ap_lists), but I guess it doesn't hurt either.

I'll queue both patches.

Thanks,

	M.
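For readers following the INIT_LIST_HEAD() exchange above, here is a minimal userspace sketch, not kernel code, of what re-initialising a list head does while an entry is still linked. The struct has the same shape as the kernel's struct list_head, but init_list_head(), list_add_tail_sketch(), list_is_empty() and reinit_orphans_entry() are hypothetical helpers written only for this illustration. Under those assumptions it suggests that the re-init empties the head without unlinking or freeing anything, so it cannot by itself drop references held on queued interrupts.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for INIT_LIST_HEAD(): point the head at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical stand-in for list_add_tail(): link entry before head. */
static void list_add_tail_sketch(struct list_head *entry,
				 struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* A list is empty when the head points back at itself. */
static bool list_is_empty(const struct list_head *head)
{
	return head->next == head;
}

/*
 * Queue one entry, then re-initialise the head. Returns true when the
 * head reads as empty while the entry still carries its old links,
 * i.e. the entry was orphaned rather than released.
 */
static bool reinit_orphans_entry(void)
{
	struct list_head head, entry;

	init_list_head(&head);
	list_add_tail_sketch(&entry, &head);
	assert(!list_is_empty(&head));

	init_list_head(&head);	/* only the head is written */

	return list_is_empty(&head) &&
	       entry.next == &head && entry.prev == &head;
}
```

In this toy model reinit_orphans_entry() returns true: the head looks empty afterwards, yet nothing was done to the entry itself.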
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index a963b9d766b7..53ec9b9d9bc4 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;

+	/*
+	 * Retire all pending LPIs on this vcpu anyway as we're
+	 * going to destroy it.
+	 */
+	vgic_flush_pending_lpis(vcpu);
+
 	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
 }
It's likely that the vcpu fails to handle all virtual interrupts if
userspace decides to destroy it, leaving the pending ones stay in the
ap_list. If the un-handled one is a LPI, its vgic_irq structure will
be eventually leaked because of an extra refcount increment in
vgic_queue_irq_unlock().

This was detected by kmemleak on almost every guest destroy, the
backtrace is as follows:

unreferenced object 0xffff80725aed5500 (size 128):
  comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff  ...........sm...
    c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff  .a... ..(.U.l...
  backtrace:
    [<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
    [<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
    [<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
    [<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
    [<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
    [<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
    [<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
    [<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
    [<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
    [<0000000078197602>] io_mem_abort+0x484/0x7b8
    [<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
    [<00000000e0d0cd65>] handle_exit+0x24c/0x770
    [<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
    [<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
    [<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
    [<00000000e7f39607>] ksys_ioctl+0x98/0xd8

Fix it by retiring all pending LPIs in the ap_list on the destroy path.

p.s. I can also reproduce it on a normal guest shutdown. It is because
userspace still send LPIs to vcpu (through KVM_SIGNAL_MSI ioctl) while
the guest is being shutdown and unable to handle it. A little strange
though and haven't dig further...

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
---
 virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
 1 file changed, 6 insertions(+)
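
The refcount imbalance described in the commit message can be modelled in userspace. The following is a toy sketch under stated assumptions, not the kernel implementation: struct toy_irq, struct toy_vcpu and every toy_* function are hypothetical names invented here, a plain int stands in for the kernel's reference counter, and a singly linked list stands in for the ap_list. It assumes one reference is held by whoever created the LPI and a second one is taken when the interrupt is queued, which is the "extra refcount increment" the message attributes to vgic_queue_irq_unlock().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for an interrupt descriptor (hypothetical, not vgic_irq). */
struct toy_irq {
	int refcount;		/* plain counter for illustration */
	bool is_lpi;
	struct toy_irq *next;	/* link in the per-vcpu pending list */
};

/* Toy stand-in for a vcpu: just the pending list. */
struct toy_vcpu {
	struct toy_irq *ap_list;
};

static int live_irqs;	/* toy_irq objects allocated and not yet freed */

/* Allocate an LPI; the creator (the "LPI table") holds one reference. */
static struct toy_irq *toy_add_lpi(void)
{
	struct toy_irq *irq = calloc(1, sizeof(*irq));

	if (!irq)
		return NULL;
	irq->refcount = 1;
	irq->is_lpi = true;
	live_irqs++;
	return irq;
}

/* Drop one reference; free the object when the last one goes away. */
static void toy_put_irq(struct toy_irq *irq)
{
	assert(irq->refcount > 0);
	if (--irq->refcount == 0) {
		live_irqs--;
		free(irq);
	}
}

/* Queueing takes an extra reference on behalf of the pending list. */
static void toy_queue_irq(struct toy_vcpu *vcpu, struct toy_irq *irq)
{
	irq->refcount++;
	irq->next = vcpu->ap_list;
	vcpu->ap_list = irq;
}

/* Unlink every queued LPI and drop the list's reference on it. */
static void toy_flush_pending_lpis(struct toy_vcpu *vcpu)
{
	struct toy_irq **link = &vcpu->ap_list;

	while (*link) {
		struct toy_irq *irq = *link;

		if (irq->is_lpi) {
			*link = irq->next;
			irq->next = NULL;
			toy_put_irq(irq);
		} else {
			link = &irq->next;
		}
	}
}

/*
 * Tear down a guest with one LPI still pending. Returns the number of
 * objects left allocated afterwards, or -1 on allocation failure.
 */
static int toy_destroy_guest(bool flush_first)
{
	struct toy_vcpu vcpu = { .ap_list = NULL };
	struct toy_irq *lpi = toy_add_lpi();
	int leaked;

	if (!lpi)
		return -1;

	toy_queue_irq(&vcpu, lpi);		/* refcount is now 2 */
	if (flush_first)
		toy_flush_pending_lpis(&vcpu);	/* back to 1, unlinked */

	vcpu.ap_list = NULL;	/* resetting the head frees nothing */
	toy_put_irq(lpi);	/* creator drops its reference */

	leaked = live_irqs;
	if (leaked) {		/* tidy up so the demo itself does not leak */
		free(lpi);
		live_irqs = 0;
	}
	return leaked;
}
```

In this model toy_destroy_guest(false) reports one object left behind, because the list's reference is never dropped, while toy_destroy_guest(true) reports none. That matches the shape of the fix in the patch: flush the pending LPIs before the list head is reset.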