From patchwork Fri Dec 13 12:53:29 2019
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11291115
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Juergen Gross, Kevin Tian, Stefano Stabellini, Julien Grall, Wei Liu,
    Konrad Wilk, George Dunlap, Andrew Cooper, Paul Durrant, Ian Jackson,
    Roger Pau Monné
Date: Fri, 13 Dec 2019 13:53:29 +0100
Subject: [Xen-devel] [PATCH v2] IOMMU: make DMA containment of quarantined
 devices optional

Containing still in flight DMA was introduced to work around certain
devices / systems hanging hard upon hitting an IOMMU fault. Passing
through (such) devices (on such systems) is inherently insecure (as
guests could easily arrange for IOMMU faults to occur). Defaulting to a
mode where admins may not even become aware of issues with devices can
be considered undesirable. Therefore convert this mode of operation to
an optional one, not one enabled by default.

This involves resurrecting code that commit ea38867831da ("x86 / iommu:
set up a scratch page in the quarantine domain") removed, in a slightly
extended and abstracted fashion.

Here, instead of reintroducing a pretty pointless use of "goto" in
domain_context_unmap(), and instead of making the function (at least
temporarily) inconsistent, take the opportunity and replace the other
similarly pointless "goto" as well.

In order to key the re-instated bypasses off of there (not) being a
root page table, this further requires moving the
allocate_domain_resources() invocation from reassign_device() to
amd_iommu_setup_domain_device() (or else reassign_device() would
allocate a root page table anyway); this is benign to the second caller
of the latter function.

Signed-off-by: Jan Beulich
Reviewed-by: Roger Pau Monné
---
As far as 4.13 is concerned, I guess if we can't come to an agreement
here, the only other option is to revert ea38867831da from the branch,
for having been committed prematurely (I'm not so much worried about
the master branch, where we have ample time until 4.14). What I surely
want to see us avoid is a back and forth in behavior of released
versions. (Note that 4.12.2 is similarly blocked on a decision either
way here.)

I'm happy to take better suggestions to replace "full".
---
v2: Don't use true/false. Introduce QUARANTINE_SKIP() (albeit I'm not
    really convinced this is an improvement). Add comment.

--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1232,7 +1232,7 @@ detection of systems known to misbehave
 > Default: `new` unless directed-EOI is supported
 
 ### iommu
-    = List of [ <bool>, verbose, debug, force, required, quarantine,
+    = List of [ <bool>, verbose, debug, force, required, quarantine[=full],
                 sharept, intremap, intpost, crash-disable, snoop, qinval,
                 igfx, amd-iommu-perdev-intremap, dom0-{passthrough,strict} ]
 
@@ -1270,11 +1270,13 @@ boolean (e.g. `iommu=no`) can override t
   will prevent Xen from booting if IOMMUs aren't discovered and enabled
   successfully.
 
-* The `quarantine` boolean can be used to control Xen's behavior when
-  de-assigning devices from guests. If enabled (the default), Xen always
-  quarantines such devices; they must be explicitly assigned back to Dom0
-  before they can be used there again. If disabled, Xen will only
-  quarantine devices the toolstack hass arranged for getting quarantined.
+* The `quarantine` option can be used to control Xen's behavior when
+  de-assigning devices from guests. If set to true (the default), Xen
+  always quarantines such devices; they must be explicitly assigned back
+  to Dom0 before they can be used there again. If set to "full", still
+  active DMA will additionally be directed to a "sink" page. If set to
+  false, Xen will only quarantine devices the toolstack has arranged for
+  getting quarantined.
 
 * The `sharept` boolean controls whether the IOMMU pagetables are shared
   with the CPU-side HAP pagetables, or allocated separately. Sharing
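
As an illustration (not part of the patch), the sketch below maps the
documented settings onto the IOMMU_quarantine_* constants and the
iommu_quarantine variable introduced in xen/drivers/passthrough/iommu.c
further down; the helper function is hypothetical and merely restates
the documented behavior:

    /*
     * iommu=no-quarantine    -> IOMMU_quarantine_none  (no quarantining on
     *                           de-assign)
     * iommu=quarantine       -> IOMMU_quarantine_basic (default; in-flight
     *                           DMA hits IOMMU faults)
     * iommu=quarantine=full  -> IOMMU_quarantine_full  (in-flight DMA gets
     *                           directed to a "sink" page)
     */
    static bool quarantine_uses_sink_page(void) /* hypothetical helper */
    {
        return iommu_quarantine >= IOMMU_quarantine_full;
    }

The ">=" comparison mirrors the new check in iommu_quarantine_init(),
which skips the sink-page setup for anything below "full".
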
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -28,6 +28,9 @@
 #include
 #include "../ats.h"
 
+/* dom_io is used as a sentinel for quarantined devices */
+#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)->arch.root_table)
+
 static bool_t __read_mostly init_done;
 
 static const struct iommu_init_ops _iommu_init_ops;
@@ -85,18 +88,35 @@ int get_dma_requestor_id(uint16_t seg, u
     return req_id;
 }
 
-static void amd_iommu_setup_domain_device(
+static int __must_check allocate_domain_resources(struct domain_iommu *hd)
+{
+    int rc;
+
+    spin_lock(&hd->arch.mapping_lock);
+    rc = amd_iommu_alloc_root(hd);
+    spin_unlock(&hd->arch.mapping_lock);
+
+    return rc;
+}
+
+static int __must_check amd_iommu_setup_domain_device(
     struct domain *domain, struct amd_iommu *iommu,
     uint8_t devfn, struct pci_dev *pdev)
 {
     struct amd_iommu_dte *table, *dte;
     unsigned long flags;
-    int req_id, valid = 1;
+    int req_id, valid = 1, rc;
     u8 bus = pdev->bus;
-    const struct domain_iommu *hd = dom_iommu(domain);
+    struct domain_iommu *hd = dom_iommu(domain);
+
+    if ( QUARANTINE_SKIP(domain) )
+        return 0;
+
+    BUG_ON(!hd->arch.paging_mode || !iommu->dev_table.buffer);
 
-    BUG_ON( !hd->arch.root_table || !hd->arch.paging_mode ||
-            !iommu->dev_table.buffer );
+    rc = allocate_domain_resources(hd);
+    if ( rc )
+        return rc;
 
     if ( iommu_hwdom_passthrough && is_hardware_domain(domain) )
         valid = 0;
@@ -151,6 +171,8 @@ static void amd_iommu_setup_domain_devic
 
         amd_iommu_flush_iotlb(devfn, pdev, INV_IOMMU_ALL_PAGES_ADDRESS, 0);
     }
+
+    return 0;
 }
 
 int __init acpi_ivrs_init(void)
@@ -220,17 +242,6 @@ int amd_iommu_alloc_root(struct domain_i
     return 0;
 }
 
-static int __must_check allocate_domain_resources(struct domain_iommu *hd)
-{
-    int rc;
-
-    spin_lock(&hd->arch.mapping_lock);
-    rc = amd_iommu_alloc_root(hd);
-    spin_unlock(&hd->arch.mapping_lock);
-
-    return rc;
-}
-
 int amd_iommu_get_paging_mode(unsigned long entries)
 {
     int level = 1;
@@ -294,6 +305,9 @@ static void amd_iommu_disable_domain_dev
     int req_id;
     u8 bus = pdev->bus;
 
+    if ( QUARANTINE_SKIP(domain) )
+        return;
+
     BUG_ON ( iommu->dev_table.buffer == NULL );
     req_id = get_dma_requestor_id(iommu->seg, PCI_BDF2(bus, devfn));
     table = iommu->dev_table.buffer;
@@ -340,7 +354,6 @@ static int reassign_device(struct domain
 {
     struct amd_iommu *iommu;
     int bdf, rc;
-    struct domain_iommu *t = dom_iommu(target);
 
     bdf = PCI_BDF2(pdev->bus, pdev->devfn);
     iommu = find_iommu_for_device(pdev->seg, bdf);
@@ -361,11 +374,10 @@ static int reassign_device(struct domain
         pdev->domain = target;
     }
 
-    rc = allocate_domain_resources(t);
+    rc = amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
     if ( rc )
         return rc;
 
-    amd_iommu_setup_domain_device(target, iommu, devfn, pdev);
     AMD_IOMMU_DEBUG("Re-assign %04x:%02x:%02x.%u from dom%d to dom%d\n",
                     pdev->seg, pdev->bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
                     source->domain_id, target->domain_id);
@@ -522,8 +534,7 @@ static int amd_iommu_add_device(u8 devfn
         spin_unlock_irqrestore(&iommu->lock, flags);
     }
 
-    amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
-    return 0;
+    return amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
 }
 
 static int amd_iommu_remove_device(u8 devfn, struct pci_dev *pdev)
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -30,13 +30,17 @@ bool_t __initdata iommu_enable = 1;
 bool_t __read_mostly iommu_enabled;
 bool_t __read_mostly force_iommu;
 bool_t __read_mostly iommu_verbose;
-bool __read_mostly iommu_quarantine = true;
 bool_t __read_mostly iommu_igfx = 1;
 bool_t __read_mostly iommu_snoop = 1;
 bool_t __read_mostly iommu_qinval = 1;
 bool_t __read_mostly iommu_intremap = 1;
 bool_t __read_mostly iommu_crash_disable;
 
+#define IOMMU_quarantine_none  0
+#define IOMMU_quarantine_basic 1
+#define IOMMU_quarantine_full  2
+uint8_t __read_mostly iommu_quarantine = IOMMU_quarantine_basic;
+
 static bool __hwdom_initdata iommu_hwdom_none;
 bool __hwdom_initdata iommu_hwdom_strict;
 bool __read_mostly iommu_hwdom_passthrough;
@@ -81,6 +85,8 @@ static int __init parse_iommu_param(cons
             force_iommu = val;
         else if ( (val = parse_boolean("quarantine", s, ss)) >= 0 )
             iommu_quarantine = val;
+        else if ( ss == s + 15 && !strncmp(s, "quarantine=full", 15) )
+            iommu_quarantine = IOMMU_quarantine_full;
         else if ( (val = parse_boolean("igfx", s, ss)) >= 0 )
             iommu_igfx = val;
         else if ( (val = parse_boolean("verbose", s, ss)) >= 0 )
@@ -451,7 +457,7 @@ static int __init iommu_quarantine_init(
     dom_io->options |= XEN_DOMCTL_CDF_iommu;
 
     rc = iommu_domain_init(dom_io, 0);
-    if ( rc )
+    if ( rc || iommu_quarantine < IOMMU_quarantine_full )
         return rc;
 
     if ( !hd->platform_ops->quarantine_init )
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -41,6 +41,9 @@
 #include "vtd.h"
 #include "../ats.h"
 
+/* dom_io is used as a sentinel for quarantined devices */
+#define QUARANTINE_SKIP(d) ((d) == dom_io && !dom_iommu(d)->arch.pgd_maddr)
+
 struct mapped_rmrr {
     struct list_head list;
     u64 base, end;
@@ -1291,6 +1294,9 @@ int domain_context_mapping_one(
     int agaw, rc, ret;
     bool_t flush_dev_iotlb;
 
+    if ( QUARANTINE_SKIP(domain) )
+        return 0;
+
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
     maddr = bus_to_context_maddr(iommu, bus);
@@ -1537,6 +1543,9 @@ int domain_context_unmap_one(
     int iommu_domid, rc, ret;
     bool_t flush_dev_iotlb;
 
+    if ( QUARANTINE_SKIP(domain) )
+        return 0;
+
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
 
@@ -1598,7 +1607,7 @@ static int domain_context_unmap(struct d
 {
     struct acpi_drhd_unit *drhd;
     struct vtd_iommu *iommu;
-    int ret = 0;
+    int ret;
     u8 seg = pdev->seg, bus = pdev->bus, tmp_bus, tmp_devfn, secbus;
     int found = 0;
 
@@ -1614,14 +1623,12 @@ static int domain_context_unmap(struct d
             printk(VTDPREFIX "d%d:Hostbridge: skip %04x:%02x:%02x.%u unmap\n",
                    domain->domain_id, seg, bus,
                    PCI_SLOT(devfn), PCI_FUNC(devfn));
-        if ( !is_hardware_domain(domain) )
-            return -EPERM;
-        goto out;
+        return is_hardware_domain(domain) ? 0 : -EPERM;
 
     case DEV_TYPE_PCIe_BRIDGE:
     case DEV_TYPE_PCIe2PCI_BRIDGE:
     case DEV_TYPE_LEGACY_PCI_BRIDGE:
-        goto out;
+        return 0;
 
     case DEV_TYPE_PCIe_ENDPOINT:
         if ( iommu_debug )
@@ -1665,10 +1672,12 @@ static int domain_context_unmap(struct d
         dprintk(XENLOG_ERR VTDPREFIX, "d%d:unknown(%u): %04x:%02x:%02x.%u\n",
                 domain->domain_id, pdev->type,
                 seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
-        ret = -EINVAL;
-        goto out;
+        return -EINVAL;
     }
 
+    if ( QUARANTINE_SKIP(domain) )
+        return ret;
+
     /*
      * if no other devices under the same iommu owned by this domain,
      * clear iommu in iommu_bitmap and clear domain_id in domid_bitmp
@@ -1694,16 +1703,12 @@ static int domain_context_unmap(struct d
 
         iommu_domid = domain_iommu_domid(domain, iommu);
         if ( iommu_domid == -1 )
-        {
-            ret = -EINVAL;
-            goto out;
-        }
+            return -EINVAL;
 
         clear_bit(iommu_domid, iommu->domid_bitmap);
         iommu->domid_map[iommu_domid] = 0;
     }
 
-out:
     return ret;
 }
 
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -53,8 +53,10 @@ static inline bool_t dfn_eq(dfn_t x, dfn
 }
 
 extern bool_t iommu_enable, iommu_enabled;
-extern bool force_iommu, iommu_quarantine, iommu_verbose, iommu_igfx;
+extern bool force_iommu, iommu_verbose, iommu_igfx;
 extern bool_t iommu_snoop, iommu_qinval, iommu_intremap, iommu_intpost;
+/* Boolean except for the specific purposes of drivers/passthrough/iommu.c. */
+extern uint8_t iommu_quarantine;
 
 #if defined(CONFIG_IOMMU_FORCE_PT_SHARE)
 #define iommu_hap_pt_share true
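
As a closing illustration (not part of the patch) of the "Boolean except
for the specific purposes of drivers/passthrough/iommu.c" comment on the
new declaration: the fragments below are hypothetical callers; only the
iommu_quarantine variable, the IOMMU_quarantine_full constant, and the
"<" comparison are taken from the actual changes.

    /* Outside drivers/passthrough/iommu.c, any non-zero value still just
     * means "quarantine on de-assign" (hypothetical caller):
     */
    if ( iommu_quarantine )
        target = dom_io;

    /* Only iommu.c itself distinguishes "basic" from "full", as in the
     * iommu_quarantine_init() hunk above:
     */
    if ( rc || iommu_quarantine < IOMMU_quarantine_full )
        return rc;  /* leave dom_io without a root table / sink page */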