From patchwork Mon Mar 21 07:47:06 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexey Kardashevskiy X-Patchwork-Id: 8630081 Return-Path: X-Original-To: patchwork-qemu-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id D9E72C0553 for ; Mon, 21 Mar 2016 07:52:40 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 268DB20220 for ; Mon, 21 Mar 2016 07:52:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 265B120219 for ; Mon, 21 Mar 2016 07:52:37 +0000 (UTC) Received: from localhost ([::1]:56092 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ahudg-0003A2-Kk for patchwork-qemu-devel@patchwork.kernel.org; Mon, 21 Mar 2016 03:52:36 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35580) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ahuar-0006Qd-3g for qemu-devel@nongnu.org; Mon, 21 Mar 2016 03:49:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ahuan-0002An-PY for qemu-devel@nongnu.org; Mon, 21 Mar 2016 03:49:41 -0400 Received: from e23smtp03.au.ibm.com ([202.81.31.145]:34116) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ahuam-0002AL-VP for qemu-devel@nongnu.org; Mon, 21 Mar 2016 03:49:37 -0400 Received: from localhost by e23smtp03.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 21 Mar 2016 17:49:35 +1000 Received: from d23dlp02.au.ibm.com (202.81.31.213) by e23smtp03.au.ibm.com (202.81.31.209) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 21 Mar 2016 17:49:34 +1000 X-IBM-Helo: d23dlp02.au.ibm.com X-IBM-MailFrom: aik@ozlabs.ru X-IBM-RcptTo: qemu-devel@nongnu.org;qemu-ppc@nongnu.org Received: from d23relay09.au.ibm.com (d23relay09.au.ibm.com [9.185.63.181]) by d23dlp02.au.ibm.com (Postfix) with ESMTP id DFF0A2BB0054; Mon, 21 Mar 2016 18:49:29 +1100 (EST) Received: from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97]) by d23relay09.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u2L7nL5h4194790; Mon, 21 Mar 2016 18:49:29 +1100 Received: from d23av03.au.ibm.com (localhost [127.0.0.1]) by d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id u2L7mve6009091; Mon, 21 Mar 2016 18:48:57 +1100 Received: from ozlabs.au.ibm.com (ozlabs.au.ibm.com [9.192.253.14]) by d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id u2L7mvgC008411; Mon, 21 Mar 2016 18:48:57 +1100 Received: from bran.ozlabs.ibm.com (haven.au.ibm.com [9.192.254.114]) by ozlabs.au.ibm.com (Postfix) with ESMTP id 4F82EA03B5; Mon, 21 Mar 2016 18:47:09 +1100 (AEDT) Received: from vpl2.ozlabs.ibm.com (vpl2.ozlabs.ibm.com [10.61.141.27]) by bran.ozlabs.ibm.com (Postfix) with ESMTP id 56675E39B7; Mon, 21 Mar 2016 18:47:09 +1100 (AEDT) From: Alexey Kardashevskiy To: qemu-devel@nongnu.org Date: Mon, 21 Mar 2016 18:47:06 +1100 Message-Id: <1458546426-26222-19-git-send-email-aik@ozlabs.ru> X-Mailer: git-send-email 2.5.0.rc3 In-Reply-To: <1458546426-26222-1-git-send-email-aik@ozlabs.ru> References: <1458546426-26222-1-git-send-email-aik@ozlabs.ru> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16032107-0009-0000-0000-0000068A4638 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 202.81.31.145 Cc: Alexey Kardashevskiy , Alex Williamson , qemu-ppc@nongnu.org, David Gibson Subject: [Qemu-devel] [PATCH qemu v14 18/18] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) This implements DDW for emulated and VFIO devices. This reserves RTAS token numbers for DDW calls. This changes the TCE table migration descriptor to support dynamic tables as from now on, PHB will create as many stub TCE table objects as PHB can possibly support but not all of them might be initialized at the time of migration because DDW might or might not be requested by the guest. The "ddw" property is enabled by default on a PHB but for compatibility the pseries-2.5 machine and older disable it. This implements DDW for VFIO. The host kernel support is required. This adds a "levels" property to PHB to control the number of levels in the actual TCE table allocated by the host kernel, 0 is the default value to tell QEMU to calculate the correct value. Current hardware supports up to 5 levels. The existing linux guests try creating one additional huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If succeeded, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap later. This adds a "dma64_win_addr" property which is a bus address for the 64bit window and by default set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware uses and this allows having emulated and VFIO devices on the same bus. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PCI. Signed-off-by: Alexey Kardashevskiy --- hw/ppc/Makefile.objs | 1 + hw/ppc/spapr.c | 7 +- hw/ppc/spapr_pci.c | 73 ++++++++--- hw/ppc/spapr_rtas_ddw.c | 300 ++++++++++++++++++++++++++++++++++++++++++++ hw/vfio/common.c | 5 - include/hw/pci-host/spapr.h | 13 ++ include/hw/ppc/spapr.h | 16 ++- trace-events | 4 + 8 files changed, 395 insertions(+), 24 deletions(-) create mode 100644 hw/ppc/spapr_rtas_ddw.c diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs index c1ffc77..986b36f 100644 --- a/hw/ppc/Makefile.objs +++ b/hw/ppc/Makefile.objs @@ -7,6 +7,7 @@ obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o spapr_rng.o ifeq ($(CONFIG_PCI)$(CONFIG_PSERIES)$(CONFIG_LINUX), yyy) obj-y += spapr_pci_vfio.o endif +obj-$(CONFIG_PSERIES) += spapr_rtas_ddw.o # PowerPC 4xx boards obj-y += ppc405_boards.o ppc4xx_devs.o ppc405_uc.o ppc440_bamboo.o obj-y += ppc4xx_pci.o diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index d0bb423..ef4c637 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2362,7 +2362,12 @@ DEFINE_SPAPR_MACHINE(2_6, "2.6", true); * pseries-2.5 */ #define SPAPR_COMPAT_2_5 \ - HW_COMPAT_2_5 + HW_COMPAT_2_5 \ + {\ + .driver = TYPE_SPAPR_PCI_HOST_BRIDGE,\ + .property = "ddw",\ + .value = stringify(off),\ + }, static void spapr_machine_2_5_instance_options(MachineState *machine) { diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index af99a36..3bb294a 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -803,12 +803,12 @@ static char *spapr_phb_get_loc_code(sPAPRPHBState *sphb, PCIDevice *pdev) return buf; } -static void spapr_phb_dma_window_enable(sPAPRPHBState *sphb, - uint32_t liobn, - uint32_t page_shift, - uint64_t window_addr, - uint64_t window_size, - Error **errp) +void spapr_phb_dma_window_enable(sPAPRPHBState *sphb, + uint32_t liobn, + uint32_t page_shift, + uint64_t window_addr, + uint64_t window_size, + Error **errp) { sPAPRTCETable *tcet; uint32_t nb_table = window_size >> page_shift; @@ -825,10 +825,16 @@ static void spapr_phb_dma_window_enable(sPAPRPHBState *sphb, return; } + if (SPAPR_PCI_DMA_WINDOW_NUM(liobn) && !sphb->ddw_enabled) { + error_setg(errp, + "Attempt to use second window when DDW is disabled on PHB"); + return; + } + spapr_tce_table_enable(tcet, page_shift, window_addr, nb_table); } -static int spapr_phb_dma_window_disable(sPAPRPHBState *sphb, uint32_t liobn) +int spapr_phb_dma_window_disable(sPAPRPHBState *sphb, uint32_t liobn) { sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); @@ -1492,14 +1498,18 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp) } /* DMA setup */ - tcet = spapr_tce_new_table(DEVICE(sphb), sphb->dma_liobn); - if (!tcet) { - error_report("No default TCE table for %s", sphb->dtbusname); - return; - } - memory_region_add_subregion_overlap(&sphb->iommu_root, 0, - spapr_tce_get_iommu(tcet), 0); + for (i = 0; i < SPAPR_PCI_DMA_MAX_WINDOWS; ++i) { + tcet = spapr_tce_new_table(DEVICE(sphb), + SPAPR_PCI_LIOBN(sphb->index, i)); + if (!tcet) { + error_setg(errp, "Creating window#%d failed for %s", + i, sphb->dtbusname); + return; + } + memory_region_add_subregion_overlap(&sphb->iommu_root, 0, + spapr_tce_get_iommu(tcet), 0); + } sphb->msi = g_hash_table_new_full(g_int_hash, g_int_equal, g_free, g_free); } @@ -1517,11 +1527,16 @@ static int spapr_phb_children_reset(Object *child, void *opaque) void spapr_phb_dma_reset(sPAPRPHBState *sphb) { - sPAPRTCETable *tcet = spapr_tce_find_by_liobn(sphb->dma_liobn); Error *local_err = NULL; + int i; - if (tcet && tcet->enabled) { - spapr_phb_dma_window_disable(sphb, sphb->dma_liobn); + for (i = 0; i < SPAPR_PCI_DMA_MAX_WINDOWS; ++i) { + uint32_t liobn = SPAPR_PCI_LIOBN(sphb->index, i); + sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn); + + if (tcet && tcet->enabled) { + spapr_phb_dma_window_disable(sphb, liobn); + } } /* Register default 32bit DMA window */ @@ -1562,6 +1577,13 @@ static Property spapr_phb_properties[] = { /* Default DMA window is 0..1GB */ DEFINE_PROP_UINT64("dma_win_addr", sPAPRPHBState, dma_win_addr, 0), DEFINE_PROP_UINT64("dma_win_size", sPAPRPHBState, dma_win_size, 0x40000000), + DEFINE_PROP_UINT64("dma64_win_addr", sPAPRPHBState, dma64_window_addr, + 0x800000000000000ULL), + DEFINE_PROP_BOOL("ddw", sPAPRPHBState, ddw_enabled, true), + DEFINE_PROP_UINT32("windows", sPAPRPHBState, windows_supported, + SPAPR_PCI_DMA_MAX_WINDOWS), + DEFINE_PROP_UINT64("pgsz", sPAPRPHBState, page_size_mask, + (1ULL << 12) | (1ULL << 16) | (1ULL << 24)), DEFINE_PROP_END_OF_LIST(), }; @@ -1815,6 +1837,15 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, uint32_t interrupt_map_mask[] = { cpu_to_be32(b_ddddd(-1)|b_fff(0)), 0x0, 0x0, cpu_to_be32(-1)}; uint32_t interrupt_map[PCI_SLOT_MAX * PCI_NUM_PINS][7]; + uint32_t ddw_applicable[] = { + cpu_to_be32(RTAS_IBM_QUERY_PE_DMA_WINDOW), + cpu_to_be32(RTAS_IBM_CREATE_PE_DMA_WINDOW), + cpu_to_be32(RTAS_IBM_REMOVE_PE_DMA_WINDOW) + }; + uint32_t ddw_extensions[] = { + cpu_to_be32(1), + cpu_to_be32(RTAS_IBM_RESET_PE_DMA_WINDOW) + }; sPAPRTCETable *tcet; PCIBus *bus = PCI_HOST_BRIDGE(phb)->bus; sPAPRFDT s_fdt; @@ -1839,6 +1870,14 @@ int spapr_populate_pci_dt(sPAPRPHBState *phb, _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pci-config-space-type", 0x1)); _FDT(fdt_setprop_cell(fdt, bus_off, "ibm,pe-total-#msi", XICS_IRQS)); + /* Dynamic DMA window */ + if (phb->ddw_enabled) { + _FDT(fdt_setprop(fdt, bus_off, "ibm,ddw-applicable", &ddw_applicable, + sizeof(ddw_applicable))); + _FDT(fdt_setprop(fdt, bus_off, "ibm,ddw-extensions", + &ddw_extensions, sizeof(ddw_extensions))); + } + /* Build the interrupt-map, this must matches what is done * in pci_spapr_map_irq */ diff --git a/hw/ppc/spapr_rtas_ddw.c b/hw/ppc/spapr_rtas_ddw.c new file mode 100644 index 0000000..37f805f --- /dev/null +++ b/hw/ppc/spapr_rtas_ddw.c @@ -0,0 +1,300 @@ +/* + * QEMU sPAPR Dynamic DMA windows support + * + * Copyright (c) 2015 Alexey Kardashevskiy, IBM Corporation. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, + * or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "hw/ppc/spapr.h" +#include "hw/pci-host/spapr.h" +#include "trace.h" + +static int spapr_phb_get_active_win_num_cb(Object *child, void *opaque) +{ + sPAPRTCETable *tcet; + + tcet = (sPAPRTCETable *) object_dynamic_cast(child, TYPE_SPAPR_TCE_TABLE); + if (tcet && tcet->enabled) { + ++*(unsigned *)opaque; + } + return 0; +} + +static unsigned spapr_phb_get_active_win_num(sPAPRPHBState *sphb) +{ + unsigned ret = 0; + + object_child_foreach(OBJECT(sphb), spapr_phb_get_active_win_num_cb, &ret); + + return ret; +} + +static int spapr_phb_get_free_liobn_cb(Object *child, void *opaque) +{ + sPAPRTCETable *tcet; + + tcet = (sPAPRTCETable *) object_dynamic_cast(child, TYPE_SPAPR_TCE_TABLE); + if (tcet && !tcet->enabled) { + *(uint32_t *)opaque = tcet->liobn; + return 1; + } + return 0; +} + +static unsigned spapr_phb_get_free_liobn(sPAPRPHBState *sphb) +{ + uint32_t liobn = 0; + + object_child_foreach(OBJECT(sphb), spapr_phb_get_free_liobn_cb, &liobn); + + return liobn; +} + +static uint32_t spapr_query_mask(struct ppc_one_seg_page_size *sps, + uint64_t page_mask) +{ + int i, j; + uint32_t mask = 0; + const struct { int shift; uint32_t mask; } masks[] = { + { 12, RTAS_DDW_PGSIZE_4K }, + { 16, RTAS_DDW_PGSIZE_64K }, + { 24, RTAS_DDW_PGSIZE_16M }, + { 25, RTAS_DDW_PGSIZE_32M }, + { 26, RTAS_DDW_PGSIZE_64M }, + { 27, RTAS_DDW_PGSIZE_128M }, + { 28, RTAS_DDW_PGSIZE_256M }, + { 34, RTAS_DDW_PGSIZE_16G }, + }; + + for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) { + for (j = 0; j < ARRAY_SIZE(masks); ++j) { + if ((sps[i].page_shift == masks[j].shift) && + (page_mask & (1ULL << masks[j].shift))) { + mask |= masks[j].mask; + } + } + } + + return mask; +} + +static void rtas_ibm_query_pe_dma_window(PowerPCCPU *cpu, + sPAPRMachineState *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ + CPUPPCState *env = &cpu->env; + sPAPRPHBState *sphb; + uint64_t buid, max_window_size; + uint32_t avail, addr, pgmask = 0; + + if ((nargs != 3) || (nret != 5)) { + goto param_error_exit; + } + + buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2); + addr = rtas_ld(args, 0); + sphb = spapr_pci_find_phb(spapr, buid); + if (!sphb || !sphb->ddw_enabled) { + goto param_error_exit; + } + + /* Work out supported page masks */ + pgmask = spapr_query_mask(env->sps.sps, sphb->page_size_mask); + + /* + * This is "Largest contiguous block of TCEs allocated specifically + * for (that is, are reserved for) this PE". + * Return the maximum number as maximum supported RAM size was in 4K pages. + */ + max_window_size = MACHINE(spapr)->maxram_size >> SPAPR_TCE_PAGE_SHIFT; + + avail = sphb->windows_supported - spapr_phb_get_active_win_num(sphb); + + rtas_st(rets, 0, RTAS_OUT_SUCCESS); + rtas_st(rets, 1, avail); + rtas_st(rets, 2, max_window_size); + rtas_st(rets, 3, pgmask); + rtas_st(rets, 4, 0); /* DMA migration mask, not supported */ + + trace_spapr_iommu_ddw_query(buid, addr, avail, max_window_size, pgmask); + return; + +param_error_exit: + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_create_pe_dma_window(PowerPCCPU *cpu, + sPAPRMachineState *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ + sPAPRPHBState *sphb; + sPAPRTCETable *tcet = NULL; + uint32_t addr, page_shift, window_shift, liobn; + uint64_t buid; + Error *local_err = NULL; + + if ((nargs != 5) || (nret != 4)) { + goto param_error_exit; + } + + buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2); + addr = rtas_ld(args, 0); + sphb = spapr_pci_find_phb(spapr, buid); + if (!sphb || !sphb->ddw_enabled) { + goto param_error_exit; + } + + page_shift = rtas_ld(args, 3); + window_shift = rtas_ld(args, 4); + liobn = spapr_phb_get_free_liobn(sphb); + + if (!liobn || !(sphb->page_size_mask & (1ULL << page_shift)) || + spapr_phb_get_active_win_num(sphb) == sphb->windows_supported) { + goto hw_error_exit; + } + + if (window_shift < page_shift) { + goto param_error_exit; + } + + spapr_phb_dma_window_enable(sphb, liobn, page_shift, + sphb->dma64_window_addr, + 1ULL << window_shift, &local_err); + if (local_err) { + error_report_err(local_err); + goto hw_error_exit; + } + + tcet = spapr_tce_find_by_liobn(liobn); + trace_spapr_iommu_ddw_create(buid, addr, 1ULL << page_shift, + 1ULL << window_shift, + tcet ? tcet->bus_offset : 0xbaadf00d, liobn); + if (local_err || !tcet) { + goto hw_error_exit; + } + + rtas_st(rets, 0, RTAS_OUT_SUCCESS); + rtas_st(rets, 1, liobn); + rtas_st(rets, 2, tcet->bus_offset >> 32); + rtas_st(rets, 3, tcet->bus_offset & ((uint32_t) -1)); + + return; + +hw_error_exit: + rtas_st(rets, 0, RTAS_OUT_HW_ERROR); + return; + +param_error_exit: + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_remove_pe_dma_window(PowerPCCPU *cpu, + sPAPRMachineState *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ + sPAPRPHBState *sphb; + sPAPRTCETable *tcet; + uint32_t liobn; + long ret; + + if ((nargs != 1) || (nret != 1)) { + goto param_error_exit; + } + + liobn = rtas_ld(args, 0); + tcet = spapr_tce_find_by_liobn(liobn); + if (!tcet) { + goto param_error_exit; + } + + sphb = SPAPR_PCI_HOST_BRIDGE(OBJECT(tcet)->parent); + if (!sphb || !sphb->ddw_enabled || !spapr_phb_get_active_win_num(sphb)) { + goto param_error_exit; + } + + ret = spapr_phb_dma_window_disable(sphb, liobn); + trace_spapr_iommu_ddw_remove(liobn, ret); + if (ret) { + goto hw_error_exit; + } + + rtas_st(rets, 0, RTAS_OUT_SUCCESS); + return; + +hw_error_exit: + rtas_st(rets, 0, RTAS_OUT_HW_ERROR); + return; + +param_error_exit: + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void rtas_ibm_reset_pe_dma_window(PowerPCCPU *cpu, + sPAPRMachineState *spapr, + uint32_t token, uint32_t nargs, + target_ulong args, + uint32_t nret, target_ulong rets) +{ + sPAPRPHBState *sphb; + uint64_t buid; + uint32_t addr; + + if ((nargs != 3) || (nret != 1)) { + goto param_error_exit; + } + + buid = ((uint64_t)rtas_ld(args, 1) << 32) | rtas_ld(args, 2); + addr = rtas_ld(args, 0); + sphb = spapr_pci_find_phb(spapr, buid); + if (!sphb || !sphb->ddw_enabled) { + goto param_error_exit; + } + + spapr_phb_dma_reset(sphb); + trace_spapr_iommu_ddw_reset(buid, addr); + + rtas_st(rets, 0, RTAS_OUT_SUCCESS); + + return; + +param_error_exit: + rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR); +} + +static void spapr_rtas_ddw_init(void) +{ + spapr_rtas_register(RTAS_IBM_QUERY_PE_DMA_WINDOW, + "ibm,query-pe-dma-window", + rtas_ibm_query_pe_dma_window); + spapr_rtas_register(RTAS_IBM_CREATE_PE_DMA_WINDOW, + "ibm,create-pe-dma-window", + rtas_ibm_create_pe_dma_window); + spapr_rtas_register(RTAS_IBM_REMOVE_PE_DMA_WINDOW, + "ibm,remove-pe-dma-window", + rtas_ibm_remove_pe_dma_window); + spapr_rtas_register(RTAS_IBM_RESET_PE_DMA_WINDOW, + "ibm,reset-pe-dma-window", + rtas_ibm_reset_pe_dma_window); +} + +type_init(spapr_rtas_ddw_init) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 421d6eb..b0ea146 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -994,11 +994,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) } } - /* - * This only considers the host IOMMU's 32-bit window. At - * some point we need to add support for the optional 64-bit - * window and dynamic windows - */ info.argsz = sizeof(info); ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info); if (ret) { diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h index 7848366..e81b751 100644 --- a/include/hw/pci-host/spapr.h +++ b/include/hw/pci-host/spapr.h @@ -71,6 +71,11 @@ struct sPAPRPHBState { spapr_pci_msi_mig *msi_devs; QLIST_ENTRY(sPAPRPHBState) list; + + bool ddw_enabled; + uint32_t windows_supported; + uint64_t page_size_mask; + uint64_t dma64_window_addr; }; #define SPAPR_PCI_MAX_INDEX 255 @@ -89,6 +94,8 @@ struct sPAPRPHBState { #define SPAPR_PCI_MSI_WINDOW 0x40000000000ULL +#define SPAPR_PCI_DMA_MAX_WINDOWS 2 + static inline qemu_irq spapr_phb_lsi_qirq(struct sPAPRPHBState *phb, int pin) { sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine()); @@ -148,5 +155,11 @@ static inline void spapr_phb_vfio_reset(DeviceState *qdev) #endif void spapr_phb_dma_reset(sPAPRPHBState *sphb); +void spapr_phb_dma_window_enable(sPAPRPHBState *sphb, + uint32_t liobn, uint32_t page_shift, + uint64_t window_addr, + uint64_t window_size, + Error **errp); +int spapr_phb_dma_window_disable(sPAPRPHBState *sphb, uint32_t liobn); #endif /* __HW_SPAPR_PCI_H__ */ diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index 471eb4a..41b32c6 100644 --- a/include/hw/ppc/spapr.h +++ b/include/hw/ppc/spapr.h @@ -417,6 +417,16 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi); #define RTAS_OUT_NOT_AUTHORIZED -9002 #define RTAS_OUT_SYSPARM_PARAM_ERROR -9999 +/* DDW pagesize mask values from ibm,query-pe-dma-window */ +#define RTAS_DDW_PGSIZE_4K 0x01 +#define RTAS_DDW_PGSIZE_64K 0x02 +#define RTAS_DDW_PGSIZE_16M 0x04 +#define RTAS_DDW_PGSIZE_32M 0x08 +#define RTAS_DDW_PGSIZE_64M 0x10 +#define RTAS_DDW_PGSIZE_128M 0x20 +#define RTAS_DDW_PGSIZE_256M 0x40 +#define RTAS_DDW_PGSIZE_16G 0x80 + /* RTAS tokens */ #define RTAS_TOKEN_BASE 0x2000 @@ -458,8 +468,12 @@ int spapr_allocate_irq_block(int num, bool lsi, bool msi); #define RTAS_IBM_SET_SLOT_RESET (RTAS_TOKEN_BASE + 0x23) #define RTAS_IBM_CONFIGURE_PE (RTAS_TOKEN_BASE + 0x24) #define RTAS_IBM_SLOT_ERROR_DETAIL (RTAS_TOKEN_BASE + 0x25) +#define RTAS_IBM_QUERY_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x26) +#define RTAS_IBM_CREATE_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x27) +#define RTAS_IBM_REMOVE_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x28) +#define RTAS_IBM_RESET_PE_DMA_WINDOW (RTAS_TOKEN_BASE + 0x29) -#define RTAS_TOKEN_MAX (RTAS_TOKEN_BASE + 0x26) +#define RTAS_TOKEN_MAX (RTAS_TOKEN_BASE + 0x2A) /* RTAS ibm,get-system-parameter token values */ #define RTAS_SYSPARM_SPLPAR_CHARACTERISTICS 20 diff --git a/trace-events b/trace-events index f2b75a3..e68d0e4 100644 --- a/trace-events +++ b/trace-events @@ -1431,6 +1431,10 @@ spapr_iommu_pci_indirect(uint64_t liobn, uint64_t ioba, uint64_t tce, uint64_t i spapr_iommu_pci_stuff(uint64_t liobn, uint64_t ioba, uint64_t tce_value, uint64_t npages, uint64_t ret) "liobn=%"PRIx64" ioba=0x%"PRIx64" tcevalue=0x%"PRIx64" npages=%"PRId64" ret=%"PRId64 spapr_iommu_xlate(uint64_t liobn, uint64_t ioba, uint64_t tce, unsigned perm, unsigned pgsize) "liobn=%"PRIx64" 0x%"PRIx64" -> 0x%"PRIx64" perm=%u mask=%x" spapr_iommu_new_table(uint64_t liobn, void *table, int fd) "liobn=%"PRIx64" table=%p fd=%d" +spapr_iommu_ddw_query(uint64_t buid, uint32_t cfgaddr, unsigned wa, uint64_t win_size, uint32_t pgmask) "buid=%"PRIx64" addr=%"PRIx32", %u windows available, max window size=%"PRIx64", mask=%"PRIx32 +spapr_iommu_ddw_create(uint64_t buid, uint32_t cfgaddr, uint64_t pg_size, uint64_t req_size, uint64_t start, uint32_t liobn) "buid=%"PRIx64" addr=%"PRIx32", page size=0x%"PRIx64", requested=0x%"PRIx64", start addr=%"PRIx64", liobn=%"PRIx32 +spapr_iommu_ddw_remove(uint32_t liobn, long ret) "liobn=%"PRIx32", ret = %ld" +spapr_iommu_ddw_reset(uint64_t buid, uint32_t cfgaddr) "buid=%"PRIx64" addr=%"PRIx32 # hw/ppc/ppc.c ppc_tb_adjust(uint64_t offs1, uint64_t offs2, int64_t diff, int64_t seconds) "adjusted from 0x%"PRIx64" to 0x%"PRIx64", diff %"PRId64" (%"PRId64"s)"