Message ID | 20240819125045.3474845-1-maz@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors |
On 2024/8/19 20:50, Marc Zyngier wrote:
> We recently moved the teardown of the vgic part of a vcpu inside
> a critical section guarded by the config_lock. This teardown phase
> involves calling into kvm_io_bus_unregister_dev(), which takes the
> kvm->srcu lock.
>
> However, this violates the established order where kvm->srcu is
> taken on a memory fault (such as an MMIO access), possibly
> followed by taking the config_lock if the GIC emulation requires
> mutual exclusion from the other vcpus.
>
> It therefore results in a bad lockdep splat, as reported by Zenghui.
>
> Fix this by moving the call to kvm_io_bus_unregister_dev() outside
> of the config_lock critical section. At this stage, there shouldn't
> be any need to hold the config_lock.
>
> As an additional bonus, document the ordering between kvm->slots_lock,
> kvm->srcu and kvm->arch.config_lock so that I cannot pretend I didn't
> know about those anymore.
>
> Fixes: 9eb18136af9f ("KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface")
> Reported-by: Zenghui Yu <yuzenghui@huawei.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>

Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Tested-by: Zenghui Yu <yuzenghui@huawei.com>

Thanks,
Zenghui

On Mon, 19 Aug 2024 13:50:45 +0100, Marc Zyngier wrote:
> We recently moved the teardown of the vgic part of a vcpu inside
> a critical section guarded by the config_lock. This teardown phase
> involves calling into kvm_io_bus_unregister_dev(), which takes the
> kvm->srcu lock.
>
> However, this violates the established order where kvm->srcu is
> taken on a memory fault (such as an MMIO access), possibly
> followed by taking the config_lock if the GIC emulation requires
> mutual exclusion from the other vcpus.
>
> [...]

Tested this w/ kvm-unit-tests, selftests, and a few VMs on a lockdep kernel.

Applied to kvmarm/fixes, thanks!

[1/1] KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors
      https://git.kernel.org/kvmarm/kvmarm/c/f616506754d3

--
Best,
Oliver

diff --git a/arch/arm64/kvm/vgic/vgic-init.c b/arch/arm64/kvm/vgic/vgic-init.c
index 41feb858ff9a..e7c53e8af3d1 100644
--- a/arch/arm64/kvm/vgic/vgic-init.c
+++ b/arch/arm64/kvm/vgic/vgic-init.c
@@ -417,10 +417,8 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 	kfree(vgic_cpu->private_irqs);
 	vgic_cpu->private_irqs = NULL;
 
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
-		vgic_unregister_redist_iodev(vcpu);
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
 		vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
-	}
 }
 
 void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
@@ -448,6 +446,11 @@ void kvm_vgic_destroy(struct kvm *kvm)
 	kvm_vgic_dist_destroy(kvm);
 
 	mutex_unlock(&kvm->arch.config_lock);
+
+	if (kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			vgic_unregister_redist_iodev(vcpu);
+
 	mutex_unlock(&kvm->slots_lock);
 }
 
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 2caa64415ff3..f50274fd5581 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -36,6 +36,11 @@ struct vgic_global kvm_vgic_global_state __ro_after_init = {
  * we have to disable IRQs before taking this lock and everything lower
  * than it.
  *
+ * The config_lock has additional ordering requirements:
+ * kvm->slots_lock
+ *   kvm->srcu
+ *     kvm->arch.config_lock
+ *
  * If you need to take multiple locks, always take the upper lock first,
  * then the lower ones, e.g. first take the its_lock, then the irq_lock.
  * If you are already holding a lock and need to take a higher one, you

We recently moved the teardown of the vgic part of a vcpu inside
a critical section guarded by the config_lock. This teardown phase
involves calling into kvm_io_bus_unregister_dev(), which takes the
kvm->srcu lock.

However, this violates the established order where kvm->srcu is
taken on a memory fault (such as an MMIO access), possibly
followed by taking the config_lock if the GIC emulation requires
mutual exclusion from the other vcpus.

It therefore results in a bad lockdep splat, as reported by Zenghui.

Fix this by moving the call to kvm_io_bus_unregister_dev() outside
of the config_lock critical section. At this stage, there shouldn't
be any need to hold the config_lock.

As an additional bonus, document the ordering between kvm->slots_lock,
kvm->srcu and kvm->arch.config_lock so that I cannot pretend I didn't
know about those anymore.

Fixes: 9eb18136af9f ("KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface")
Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/vgic/vgic-init.c | 9 ++++++---
 arch/arm64/kvm/vgic/vgic.c      | 5 +++++
 2 files changed, 11 insertions(+), 3 deletions(-)
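
The ordering rule described in the commit message can be illustrated with a small user-space model. This is only a sketch and not kernel code: the `lock_rank` enum, `struct lock_event`, `N_EVENTS()` and `trace_respects_order()` are hypothetical names invented for this example, and kvm->srcu is treated as if it were an ordinary lock, which simplifies how SRCU and lockdep really behave. A trace is accepted only if every acquisition happens while no equal-or-inner lock is held, matching the documented order kvm->slots_lock, then kvm->srcu, then kvm->arch.config_lock.

```c
#include <assert.h>   /* for callers that assert on the result */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ranks for this sketch: a lower value is an outer lock. */
enum lock_rank {
	RANK_SLOTS_LOCK,	/* stands in for kvm->slots_lock */
	RANK_SRCU,		/* stands in for kvm->srcu */
	RANK_CONFIG_LOCK,	/* stands in for kvm->arch.config_lock */
	NR_RANKS,
};

/* One step of a locking trace: acquire or release a given lock. */
struct lock_event {
	enum lock_rank lock;
	bool acquire;
};

#define N_EVENTS(trace) (sizeof(trace) / sizeof((trace)[0]))

/*
 * Return true if no lock is ever acquired while a lock of equal or
 * inner rank is still held.
 */
static bool trace_respects_order(const struct lock_event *ev, size_t n)
{
	bool held[NR_RANKS] = { false };

	for (size_t i = 0; i < n; i++) {
		if (!ev[i].acquire) {
			held[ev[i].lock] = false;
			continue;
		}
		for (int r = ev[i].lock; r < NR_RANKS; r++)
			if (held[r])
				return false;	/* order inversion */
		held[ev[i].lock] = true;
	}
	return true;
}

/* MMIO fault path: kvm->srcu first, then possibly config_lock. */
static const struct lock_event mmio_fault[] = {
	{ RANK_SRCU, true }, { RANK_CONFIG_LOCK, true },
	{ RANK_CONFIG_LOCK, false }, { RANK_SRCU, false },
};

/* Teardown before the fix: srcu is taken while config_lock is held. */
static const struct lock_event teardown_before[] = {
	{ RANK_SLOTS_LOCK, true }, { RANK_CONFIG_LOCK, true },
	{ RANK_SRCU, true }, { RANK_SRCU, false },
	{ RANK_CONFIG_LOCK, false }, { RANK_SLOTS_LOCK, false },
};

/* Teardown after the fix: config_lock is dropped before srcu is taken. */
static const struct lock_event teardown_after[] = {
	{ RANK_SLOTS_LOCK, true }, { RANK_CONFIG_LOCK, true },
	{ RANK_CONFIG_LOCK, false }, { RANK_SRCU, true },
	{ RANK_SRCU, false }, { RANK_SLOTS_LOCK, false },
};
```

In this model, `trace_respects_order()` accepts `mmio_fault` and `teardown_after` but rejects `teardown_before`, which corresponds to the inversion that lockdep reported. The real fix still holds kvm->slots_lock across the unregistration, which the `teardown_after` trace reflects.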