From patchwork Tue Apr 9 14:13:46 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Cédric Le Goater
X-Patchwork-Id: 10891355
From: Cédric Le Goater
To: kvm-ppc@vger.kernel.org
Cc: Paul Mackerras, David Gibson, kvm@vger.kernel.org, Cédric Le Goater
Subject: [RFC PATCH v4.1 16/17] KVM: PPC: Book3S HV: XIVE: introduce a xive_devices array under the VM
Date: Tue, 9 Apr 2019 16:13:46 +0200
Message-Id: <20190409141347.3029-1-clg@kaod.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190320083751.27001-1-clg@kaod.org>
References: <20190320083751.27001-1-clg@kaod.org>
X-Mailing-List: kvm@vger.kernel.org

On P9 sPAPR guests, the interrupt mode (XICS legacy or XIVE native) is
determined at CAS time and the chosen mode is activated after a machine
reset. To be able to switch from one mode to another, subsequent
patches will introduce the capability to destroy the KVM device
without destroying the VM.

This is not considered a safe operation, as the vCPUs are still running
and could be referencing the KVM device through their presenters. To
protect the system from any breakage, the kvmppc_xive objects
representing both KVM devices are now stored in an array under the
VM. Allocation is performed on first usage and memory is freed only
when the VM exits.

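For illustration only (this is not part of the patch), the lifetime rule
described above can be sketched in plain userspace C. All demo_* names are
hypothetical stand-ins for the kernel types, and calloc()/free() stand in
for kzalloc()/kfree(). The point is that a device teardown leaves the
object allocated, so a stale pointer held elsewhere never dangles, and
only the VM teardown releases the memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical slot indices: one per interrupt device flavour. */
#define DEMO_DEV_XICS_ON_XIVE	0u
#define DEMO_DEV_XIVE_NATIVE	1u
#define DEMO_DEV_COUNT		2u

/* Hypothetical stand-in for struct kvmppc_xive. */
struct demo_xive {
	int in_use;
};

/* Hypothetical stand-in for the per-VM state holding the array. */
struct demo_vm {
	struct demo_xive *xive_devices[DEMO_DEV_COUNT];
};

/* Return the object for a slot, allocating it zeroed on first use. */
static struct demo_xive *demo_get_device(struct demo_vm *vm, unsigned int type)
{
	assert(type < DEMO_DEV_COUNT);

	if (!vm->xive_devices[type])
		vm->xive_devices[type] = calloc(1, sizeof(struct demo_xive));

	return vm->xive_devices[type];	/* NULL if the allocation failed */
}

/* Device teardown: mark the object unused but keep its memory. */
static void demo_free_device(struct demo_vm *vm, unsigned int type)
{
	assert(type < DEMO_DEV_COUNT);

	if (vm->xive_devices[type])
		vm->xive_devices[type]->in_use = 0;
}

/* VM teardown: the only place where the memory is released. */
static void demo_destroy_vm(struct demo_vm *vm)
{
	unsigned int i;

	for (i = 0; i < DEMO_DEV_COUNT; i++) {
		free(vm->xive_devices[i]);	/* free(NULL) is a no-op */
		vm->xive_devices[i] = NULL;
	}
}
```

In this sketch, creating a device again after demo_free_device() returns
the same pointer as before, which is the reuse the xive_devices[] array
is meant to provide.
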
Signed-off-by: Cédric Le Goater
Reviewed-by: David Gibson
---
 arch/powerpc/include/asm/kvm_host.h   |  1 +
 arch/powerpc/kvm/book3s_xive.h        |  1 +
 arch/powerpc/kvm/book3s_xive.c        | 23 +++++++++++++++++++++--
 arch/powerpc/kvm/book3s_xive_native.c |  9 +++++++--
 arch/powerpc/kvm/powerpc.c            |  6 ++++++
 5 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 9cc6abdce1b9..ed059c95e56a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -314,6 +314,7 @@ struct kvm_arch {
 #ifdef CONFIG_KVM_XICS
 	struct kvmppc_xics *xics;
 	struct kvmppc_xive *xive;
+	struct kvmppc_xive *xive_devices[2];
 	struct kvmppc_passthru_irqmap *pimap;
 #endif
 	struct kvmppc_ops *kvm_ops;
diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
index e011622dc038..426146332984 100644
--- a/arch/powerpc/kvm/book3s_xive.h
+++ b/arch/powerpc/kvm/book3s_xive.h
@@ -283,6 +283,7 @@ void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb);
 int kvmppc_xive_select_target(struct kvm *kvm, u32 *server, u8 prio);
 int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
 				  bool single_escalation);
+struct kvmppc_xive *kvmppc_xive_get_device(struct kvm *kvm, u32 type);
 
 #endif /* CONFIG_KVM_XICS */
 #endif /* _KVM_PPC_BOOK3S_XICS_H */
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 480a3fc6b9fd..4d4e1730de84 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -1846,11 +1846,30 @@ static void kvmppc_xive_free(struct kvm_device *dev)
 	if (xive->vp_base != XIVE_INVALID_VP)
 		xive_native_free_vp_block(xive->vp_base);
 
+	/*
+	 * A reference of the kvmppc_xive pointer is now kept under
+	 * the xive_devices[] array of the machine for reuse. It is
+	 * freed when the VM is destroyed.
+	 */
 
-	kfree(xive);
 	kfree(dev);
 }
 
+struct kvmppc_xive *kvmppc_xive_get_device(struct kvm *kvm, u32 type)
+{
+	struct kvmppc_xive *xive;
+	bool xive_native_index = type == KVM_DEV_TYPE_XIVE;
+
+	xive = kvm->arch.xive_devices[xive_native_index];
+
+	if (!xive) {
+		xive = kzalloc(sizeof(*xive), GFP_KERNEL);
+		kvm->arch.xive_devices[xive_native_index] = xive;
+	}
+
+	return xive;
+}
+
 static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
 {
 	struct kvmppc_xive *xive;
@@ -1859,7 +1878,7 @@ static int kvmppc_xive_create(struct kvm_device *dev, u32 type)
 
 	pr_devel("Creating xive for partition\n");
 
-	xive = kzalloc(sizeof(*xive), GFP_KERNEL);
+	xive = kvmppc_xive_get_device(kvm, type);
 	if (!xive)
 		return -ENOMEM;
 
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 62648f833adf..092db0efe628 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -987,7 +987,12 @@ static void kvmppc_xive_native_free(struct kvm_device *dev)
 	if (xive->vp_base != XIVE_INVALID_VP)
 		xive_native_free_vp_block(xive->vp_base);
 
-	kfree(xive);
+	/*
+	 * A reference of the kvmppc_xive pointer is now kept under
+	 * the xive_devices[] array of the machine for reuse. It is
+	 * freed when the VM is destroyed.
+	 */
+
 	kfree(dev);
 }
 
@@ -1002,7 +1007,7 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
 	if (kvm->arch.xive)
 		return -EEXIST;
 
-	xive = kzalloc(sizeof(*xive), GFP_KERNEL);
+	xive = kvmppc_xive_get_device(kvm, type);
 	if (!xive)
 		return -ENOMEM;
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index f54926c78320..d0914316ddc7 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -501,6 +501,12 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 
 	mutex_unlock(&kvm->lock);
 
+	for (i = 0; i < ARRAY_SIZE(kvm->arch.xive_devices); i++) {
+		struct kvmppc_xive *xive = kvm->arch.xive_devices[i];
+		if (xive)
+			kfree(xive);
+	}
+
 	/* drop the module reference */
 	module_put(kvm->arch.kvm_ops->owner);
 }

From patchwork Tue Apr 9 14:13:47 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Cédric Le Goater
X-Patchwork-Id: 10891365
From: Cédric Le Goater
To: kvm-ppc@vger.kernel.org
Cc: Paul Mackerras, David Gibson, kvm@vger.kernel.org, Cédric Le Goater
Subject: [RFC PATCH v4 17/17] KVM: PPC: Book3S HV: XIVE: introduce a 'release' device operation
Date: Tue, 9 Apr 2019 16:13:47 +0200
Message-Id: <20190409141347.3029-2-clg@kaod.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190409141347.3029-1-clg@kaod.org>
References: <20190320083751.27001-1-clg@kaod.org> <20190409141347.3029-1-clg@kaod.org>
X-Mailing-List: kvm@vger.kernel.org

When the VM boots, the CAS negotiation process determines which
interrupt mode to use and invokes a machine reset. At that time, any
links to the previous KVM interrupt device should be 'destroyed'
before the newly chosen one is created. To perform the necessary
cleanups in KVM, we extend the KVM device interface with a new
'release' operation which is called when the file descriptor of the
device is closed.

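As a rough illustration (not part of the patch), the shape of such an
optional callback can be sketched in userspace C. The demo_* types and
functions are hypothetical and much simpler than struct kvm_device_ops.
The sketch only shows that the close path calls 'release' when a device
type provides it and leaves other device types untouched.

```c
#include <assert.h>
#include <stddef.h>

struct demo_device;

/* Hypothetical miniature of a device operations table. */
struct demo_device_ops {
	const char *name;
	void (*destroy)(struct demo_device *dev);	/* run at VM teardown */
	void (*release)(struct demo_device *dev);	/* optional, run on fd close */
};

struct demo_device {
	const struct demo_device_ops *ops;
	int released;
};

/* Hypothetical release hook: just records that the device was released. */
static void demo_xive_release(struct demo_device *dev)
{
	dev->released = 1;
}

static const struct demo_device_ops demo_xive_ops = {
	.name = "demo-xive",
	.destroy = NULL,
	.release = demo_xive_release,
};

/*
 * fd close path: call the hook only if the device type defines one.
 * Returns 1 when the hook ran and 0 otherwise.
 */
static int demo_device_fd_close(struct demo_device *dev)
{
	if (!dev || !dev->ops || !dev->ops->release)
		return 0;

	dev->ops->release(dev);
	return 1;
}
```

Keeping the hook optional means device types that do not set it keep
their existing behaviour on close.
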
Such operations are defined for the XICS-on-XIVE and the XIVE native
KVM devices. They clear the vCPU interrupt presenters that could be
attached and then destroy the device.

Signed-off-by: Cédric Le Goater
---
 include/linux/kvm_host.h              |  1 +
 arch/powerpc/kvm/book3s_xive.c        | 50 +++++++++++++++++++++++++--
 arch/powerpc/kvm/book3s_xive_native.c | 23 ++++++++++++
 virt/kvm/kvm_main.c                   | 13 +++++++
 4 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 831d963451d8..3b444620d8fc 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1246,6 +1246,7 @@ struct kvm_device_ops {
 	long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
 		      unsigned long arg);
 	int (*mmap)(struct kvm_device *dev, struct vm_area_struct *vma);
+	void (*release)(struct kvm_device *dev);
 };
 
 void kvm_device_get(struct kvm_device *dev);
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 4d4e1730de84..ba777db849d7 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -1100,11 +1100,19 @@ void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu)
 void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
-	struct kvmppc_xive *xive = xc->xive;
+	struct kvmppc_xive *xive;
 	int i;
 
+	if (!kvmppc_xics_enabled(vcpu))
+		return;
+
+	if (!xc)
+		return;
+
 	pr_devel("cleanup_vcpu(cpu=%d)\n", xc->server_num);
 
+	xive = xc->xive;
+
 	/* Ensure no interrupt is still routed to that VP */
 	xc->valid = false;
 	kvmppc_xive_disable_vcpu_interrupts(vcpu);
@@ -1141,6 +1149,10 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu)
 	}
 	/* Free the VP */
 	kfree(xc);
+
+	/* Cleanup the vcpu */
+	vcpu->arch.irq_type = KVMPPC_IRQ_DEFAULT;
+	vcpu->arch.xive_vcpu = NULL;
 }
 
 int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
@@ -1158,7 +1170,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
 	}
 	if (xive->kvm != vcpu->kvm)
 		return -EPERM;
-	if (vcpu->arch.irq_type)
+	if (vcpu->arch.irq_type != KVMPPC_IRQ_DEFAULT)
 		return -EBUSY;
 	if (kvmppc_xive_find_server(vcpu->kvm, cpu)) {
 		pr_devel("Duplicate !\n");
@@ -1855,6 +1867,39 @@ static void kvmppc_xive_free(struct kvm_device *dev)
 	kfree(dev);
 }
 
+static void kvmppc_xive_release(struct kvm_device *dev)
+{
+	struct kvmppc_xive *xive = dev->private;
+	struct kvm *kvm = xive->kvm;
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	pr_devel("Releasing xive device\n");
+
+	/*
+	 * When releasing the KVM device fd, the vCPUs can still be
+	 * running and we should clean up the vCPU interrupt
+	 * presenters first.
+	 */
+	if (atomic_read(&kvm->online_vcpus) != 0) {
+		/*
+		 * call kick_all_cpus_sync() to ensure that all CPUs
+		 * have executed any pending interrupts
+		 */
+		if (is_kvmppc_hv_enabled(kvm))
+			kick_all_cpus_sync();
+
+		/*
+		 * TODO: There is still a race window with the early
+		 * checks in kvmppc_native_connect_vcpu()
+		 */
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			kvmppc_xive_cleanup_vcpu(vcpu);
+	}
+
+	kvmppc_xive_free(dev);
+}
+
 struct kvmppc_xive *kvmppc_xive_get_device(struct kvm *kvm, u32 type)
 {
 	struct kvmppc_xive *xive;
@@ -2043,6 +2088,7 @@ struct kvm_device_ops kvm_xive_ops = {
 	.name = "kvm-xive",
 	.create = kvmppc_xive_create,
 	.init = kvmppc_xive_init,
+	.release = kvmppc_xive_release,
 	.destroy = kvmppc_xive_free,
 	.set_attr = xive_set_attr,
 	.get_attr = xive_get_attr,
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 092db0efe628..629da7bf2a89 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -996,6 +996,28 @@ static void kvmppc_xive_native_free(struct kvm_device *dev)
 	kfree(dev);
 }
 
+static void kvmppc_xive_native_release(struct kvm_device *dev)
+{
+	struct kvmppc_xive *xive = dev->private;
+	struct kvm *kvm = xive->kvm;
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	pr_devel("Releasing xive native device\n");
+
+	/*
+	 * When releasing the KVM device fd, the vCPUs can still be
+	 * running and we should clean up the vCPU interrupt
+	 * presenters first.
+	 */
+	if (atomic_read(&kvm->online_vcpus) != 0) {
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			kvmppc_xive_native_cleanup_vcpu(vcpu);
+	}
+
+	kvmppc_xive_native_free(dev);
+}
+
 static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
 {
 	struct kvmppc_xive *xive;
@@ -1187,6 +1209,7 @@ struct kvm_device_ops kvm_xive_native_ops = {
 	.name = "kvm-xive-native",
 	.create = kvmppc_xive_native_create,
 	.init = kvmppc_xive_native_init,
+	.release = kvmppc_xive_native_release,
 	.destroy = kvmppc_xive_native_free,
 	.set_attr = kvmppc_xive_native_set_attr,
 	.get_attr = kvmppc_xive_native_get_attr,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ea2018ae1cd7..ea2619d5ca98 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2938,6 +2938,19 @@ static int kvm_device_release(struct inode *inode, struct file *filp)
 	struct kvm_device *dev = filp->private_data;
 	struct kvm *kvm = dev->kvm;
 
+	if (!dev)
+		return -ENODEV;
+
+	if (dev->kvm != kvm)
+		return -EPERM;
+
+	if (dev->ops->release) {
+		mutex_lock(&kvm->lock);
+		list_del(&dev->vm_node);
+		dev->ops->release(dev);
+		mutex_unlock(&kvm->lock);
+	}
+
 	kvm_put_kvm(kvm);
 	return 0;
 }