From: Alexey Kardashevskiy
To: qemu-devel@nongnu.org
Cc: Alexey Kardashevskiy, Alex Williamson, qemu-ppc@nongnu.org, Alexander Graf, David Gibson
Date: Mon, 4 Apr 2016 19:33:43 +1000
Message-Id: <1459762426-18440-15-git-send-email-aik@ozlabs.ru>
In-Reply-To: <1459762426-18440-1-git-send-email-aik@ozlabs.ru>
References: <1459762426-18440-1-git-send-email-aik@ozlabs.ru>
X-Patchwork-Submitter: Alexey Kardashevskiy
X-Patchwork-Id: 8739261
Subject: [Qemu-devel] [PATCH qemu v15 14/17] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping being used by VFIO

The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU - a
guest view of the table and a hardware TCE table. If there is no VFIO
presence in the address space, then just the guest view is used; in
that case it is allocated in KVM. However, since there is no support
yet for VFIO in KVM TCE hypercalls, when we start using VFIO, we need
to move the guest view from KVM to userspace, and we need to do this
for every IOMMU on a bus with VFIO devices.

This adds vfio_start/vfio_stop callbacks to MemoryRegionIOMMUOps to
notify the IOMMU about the changing environment so it can reallocate
the table to/from KVM or (when available) hook the IOMMU groups with
the logical bus (LIOBN) in KVM.

This removes the explicit spapr_tce_set_need_vfio() call from the PCI
hotplug path as the new callbacks do this better - they notify the
IOMMU at the exact moment when the configuration is changed, and this
also covers the case of PCI hot unplug.

As there can be multiple containers attached to the same PHB/LIOBN,
this replaces the @need_vfio flag in sPAPRTCETable with a counter of
VFIO users.

Signed-off-by: Alexey Kardashevskiy
---
Changes:
v15:
* s/need_vfio/vfio_users/g
---
 hw/ppc/spapr_iommu.c   | 30 ++++++++++++++++++++----------
 hw/ppc/spapr_pci.c     |  6 ------
 hw/vfio/common.c       |  9 +++++++++
 include/exec/memory.h  |  4 ++++
 include/hw/ppc/spapr.h |  2 +-
 5 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index c945dba..ea09414 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -155,6 +155,16 @@ static uint64_t spapr_tce_get_page_sizes(MemoryRegion *iommu)
     return 1ULL << tcet->page_shift;
 }
 
+static void spapr_tce_vfio_start(MemoryRegion *iommu)
+{
+    spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), true);
+}
+
+static void spapr_tce_vfio_stop(MemoryRegion *iommu)
+{
+    spapr_tce_set_need_vfio(container_of(iommu, sPAPRTCETable, iommu), false);
+}
+
 static void spapr_tce_table_do_enable(sPAPRTCETable *tcet);
 static void spapr_tce_table_do_disable(sPAPRTCETable *tcet);
 
@@ -239,6 +249,8 @@ static const VMStateDescription vmstate_spapr_tce_table = {
 static MemoryRegionIOMMUOps spapr_iommu_ops = {
     .translate = spapr_tce_translate_iommu,
     .get_page_sizes = spapr_tce_get_page_sizes,
+    .vfio_start = spapr_tce_vfio_start,
+    .vfio_stop = spapr_tce_vfio_stop,
 };
 
 static int spapr_tce_table_realize(DeviceState *dev)
@@ -248,7 +260,7 @@ static int spapr_tce_table_realize(DeviceState *dev)
     char tmp[32];
 
     tcet->fd = -1;
-    tcet->need_vfio = false;
+    tcet->vfio_users = 0;
     snprintf(tmp, sizeof(tmp), "tce-root-%x", tcet->liobn);
     memory_region_init(&tcet->root, tcetobj, tmp, UINT64_MAX);
 
@@ -268,20 +280,18 @@ void spapr_tce_set_need_vfio(sPAPRTCETable *tcet, bool need_vfio)
     size_t table_size = tcet->nb_table * sizeof(uint64_t);
     void *newtable;
 
-    if (need_vfio == tcet->need_vfio) {
-        /* Nothing to do */
-        return;
-    }
+    tcet->vfio_users += need_vfio ? 1 : -1;
+    g_assert(tcet->vfio_users >= 0);
+    g_assert(tcet->table);
 
-    if (!need_vfio) {
+    if (!tcet->vfio_users) {
         /* FIXME: We don't support transition back to KVM accelerated
          * TCEs yet */
         return;
     }
 
-    tcet->need_vfio = true;
-
-    if (tcet->fd < 0) {
+    if (tcet->vfio_users > 1) {
+        g_assert(tcet->fd < 0);
         /* Table is already in userspace, nothing to be do */
         return;
     }
@@ -327,7 +337,7 @@ static void spapr_tce_table_do_enable(sPAPRTCETable *tcet)
                                         tcet->page_shift,
                                         tcet->nb_table,
                                         &tcet->fd,
-                                        tcet->need_vfio);
+                                        tcet->vfio_users != 0);
 
     memory_region_set_size(&tcet->iommu,
                            (uint64_t)tcet->nb_table << tcet->page_shift);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 5497a18..f864fde 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1083,12 +1083,6 @@ static void spapr_phb_add_pci_device(sPAPRDRConnector *drc,
     void *fdt = NULL;
     int fdt_start_offset = 0, fdt_size;
 
-    if (object_dynamic_cast(OBJECT(pdev), "vfio-pci")) {
-        sPAPRTCETable *tcet = spapr_tce_find_by_liobn(phb->dma_liobn);
-
-        spapr_tce_set_need_vfio(tcet, true);
-    }
-
     if (dev->hotplugged) {
         fdt = create_device_tree(&fdt_size);
         fdt_start_offset = spapr_create_pci_child_dt(phb, pdev, fdt, 0);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ea79311..5e5b77c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -421,6 +421,9 @@ static void vfio_listener_region_add(MemoryListener *listener,
         QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
         memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
+        if (section->mr->iommu_ops && section->mr->iommu_ops->vfio_start) {
+            section->mr->iommu_ops->vfio_start(section->mr);
+        }
         memory_region_iommu_replay(giommu->iommu, &giommu->n, false);
 
@@ -466,6 +469,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
     VFIOContainer *container = container_of(listener, VFIOContainer, listener);
     hwaddr iova, end;
     int ret;
+    MemoryRegion *iommu = NULL;
 
     if (vfio_listener_skipped_section(section)) {
         trace_vfio_listener_region_del_skip(
@@ -487,6 +491,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
         QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
             if (giommu->iommu == section->mr) {
                 memory_region_unregister_iommu_notifier(&giommu->n);
+                iommu = giommu->iommu;
                 QLIST_REMOVE(giommu, giommu_next);
                 g_free(giommu);
                 break;
@@ -519,6 +524,10 @@ static void vfio_listener_region_del(MemoryListener *listener,
                      "0x%"HWADDR_PRIx") = %d (%m)",
                      container, iova, end - iova, ret);
     }
+
+    if (iommu && iommu->iommu_ops && iommu->iommu_ops->vfio_stop) {
+        iommu->iommu_ops->vfio_stop(section->mr);
+    }
 }
 
 static const MemoryListener vfio_memory_listener = {
diff --git a/include/exec/memory.h b/include/exec/memory.h
index eb5ce67..f1de133f 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -152,6 +152,10 @@ struct MemoryRegionIOMMUOps {
     IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr, bool is_write);
     /* Returns supported page sizes */
     uint64_t (*get_page_sizes)(MemoryRegion *iommu);
+    /* Called when VFIO starts using this */
+    void (*vfio_start)(MemoryRegion *iommu);
+    /* Called when VFIO stops using this */
+    void (*vfio_stop)(MemoryRegion *iommu);
 };
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 471eb4a..5c00e38 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -548,7 +548,7 @@ struct sPAPRTCETable {
     uint32_t mig_nb_table;
     uint64_t *mig_table;
     bool bypass;
-    bool need_vfio;
+    int vfio_users;
     int fd;
     MemoryRegion root, iommu;
     struct VIOsPAPRDevice *vdev; /* for @bypass migration compatibility only */
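
For readers following the user-counting logic outside of the QEMU tree,
here is a minimal standalone sketch of the idea: optional start/stop
callbacks that are NULL-checked by the caller, and a counter that
triggers the table move only for the first user. None of this is QEMU
code. DemoTable, DemoOps and the demo_* functions are hypothetical names
made up for illustration, and the in_userspace flag is an assumption of
the sketch standing in for the "tcet->fd < 0" state in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for sPAPRTCETable: only what the sketch needs. */
typedef struct DemoTable {
    int vfio_users;     /* number of VFIO containers using this table */
    bool in_userspace;  /* sketch-only model of "guest view left KVM" */
    int moves;          /* how many times the table was relocated */
} DemoTable;

/* Hypothetical stand-in for MemoryRegionIOMMUOps: optional callbacks. */
typedef struct DemoOps {
    void (*vfio_start)(DemoTable *t);
    void (*vfio_stop)(DemoTable *t);
} DemoOps;

static void demo_vfio_start(DemoTable *t)
{
    t->vfio_users++;
    if (t->vfio_users == 1 && !t->in_userspace) {
        /* First user: relocate the table to userspace exactly once. */
        t->in_userspace = true;
        t->moves++;
    }
}

static void demo_vfio_stop(DemoTable *t)
{
    t->vfio_users--;
    assert(t->vfio_users >= 0);
    /* Like the FIXME in the patch: no move back when the count hits 0. */
}

/* Callers check both the ops pointer and the callback before calling. */
static void demo_notify_start(const DemoOps *ops, DemoTable *t)
{
    if (ops && ops->vfio_start) {
        ops->vfio_start(t);
    }
}

static void demo_notify_stop(const DemoOps *ops, DemoTable *t)
{
    if (ops && ops->vfio_stop) {
        ops->vfio_stop(t);
    }
}
```

With two containers attached to the same table, two
demo_notify_start() calls leave vfio_users at 2 and moves at 1, and two
demo_notify_stop() calls bring the counter back to 0 while the table
stays in userspace. An ops structure with NULL callbacks is simply
skipped, which is how IOMMUs that do not care about VFIO would behave
in this model.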