From patchwork Thu Nov 14 16:43:51 2019
X-Patchwork-Id: 11244015
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Juergen Gross, Stefano Stabellini, Julien Grall, Wei Liu, Konrad Wilk,
    George Dunlap, Andrew Cooper, Sander Eikelenboom, Ian Jackson,
    Roger Pau Monné
Date: Thu, 14 Nov 2019 17:43:51 +0100
In-Reply-To: <8e8866de-33a8-68c0-3352-d6dfeec4a9b6@suse.com>
References: <8e8866de-33a8-68c0-3352-d6dfeec4a9b6@suse.com>
Subject: [Xen-devel] [PATCH v2 1/2] introduce GFN notification for translated domains

In order for individual IOMMU drivers (and, from an abstract point of view,
also architectures) to be able to adjust, ahead of actual mapping requests,
their data structures when they might cover only a sub-range of all possible
GFNs, introduce a notification call used by the various code paths that
potentially install a fresh mapping of a never-used GFN (for a particular
domain).

Note that before this patch, in gnttab_transfer(), once past assign_pages(),
further errors modifying the physmap are ignored (presumably because it would
be too complicated to try to roll back at that point). This patch follows suit
by ignoring failed notify_gfn()s, as well as races resulting from the need to
intermediately drop locks, and simply prints a warning that the GFN may not be
accessible due to the failure.
Signed-off-by: Jan Beulich
---
v2: Introduce arch_notify_gfn(), to invoke gfn_valid() on x86 (this
    unfortunately means it and notify_gfn() now need to be macros, or else
    include file dependencies get in the way, as gfn_valid() lives in
    paging.h, which we shouldn't include from xen/sched.h). Improve
    description.

TBD: Does Arm actually have anything to check against in its
     arch_notify_gfn()?

--- a/xen/arch/x86/hvm/dom0_build.c
+++ b/xen/arch/x86/hvm/dom0_build.c
@@ -173,7 +173,8 @@ static int __init pvh_populate_memory_ra
             continue;
         }
 
-        rc = guest_physmap_add_page(d, _gfn(start), page_to_mfn(page),
+        rc = notify_gfn(d, _gfn(start + (1UL << order) - 1)) ?:
+             guest_physmap_add_page(d, _gfn(start), page_to_mfn(page),
                                     order);
         if ( rc != 0 )
         {
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4304,9 +4304,17 @@ static int hvmop_set_param(
         if ( a.value > SHUTDOWN_MAX )
             rc = -EINVAL;
         break;
+
     case HVM_PARAM_IOREQ_SERVER_PFN:
-        d->arch.hvm.ioreq_gfn.base = a.value;
+        if ( d->arch.hvm.params[HVM_PARAM_NR_IOREQ_SERVER_PAGES] )
+            rc = notify_gfn(
+                     d,
+                     _gfn(a.value + d->arch.hvm.params
+                                    [HVM_PARAM_NR_IOREQ_SERVER_PAGES] - 1));
+        if ( !rc )
+            d->arch.hvm.ioreq_gfn.base = a.value;
         break;
+
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
     {
         unsigned int i;
@@ -4317,6 +4325,9 @@ static int hvmop_set_param(
             rc = -EINVAL;
             break;
         }
+        rc = notify_gfn(d, _gfn(d->arch.hvm.ioreq_gfn.base + a.value - 1));
+        if ( rc )
+            break;
         for ( i = 0; i < a.value; i++ )
             set_bit(i, &d->arch.hvm.ioreq_gfn.mask);
@@ -4330,7 +4341,11 @@ static int hvmop_set_param(
         BUILD_BUG_ON(HVM_PARAM_BUFIOREQ_PFN >
                      sizeof(d->arch.hvm.ioreq_gfn.legacy_mask) * 8);
         if ( a.value )
-            set_bit(a.index, &d->arch.hvm.ioreq_gfn.legacy_mask);
+        {
+            rc = notify_gfn(d, _gfn(a.value));
+            if ( !rc )
+                set_bit(a.index, &d->arch.hvm.ioreq_gfn.legacy_mask);
+        }
         break;
 
     case HVM_PARAM_X87_FIP_WIDTH:
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -946,6 +946,16 @@ map_grant_ref(
         return;
     }
 
+    if ( paging_mode_translate(ld) /* && (op->flags & GNTMAP_host_map) */ &&
+         (rc = notify_gfn(ld, gaddr_to_gfn(op->host_addr))) )
+    {
+        gdprintk(XENLOG_INFO, "notify(%"PRI_gfn") -> %d\n",
+                 gfn_x(gaddr_to_gfn(op->host_addr)), rc);
+        op->status = GNTST_general_error;
+        return;
+        BUILD_BUG_ON(GNTST_okay);
+    }
+
     if ( unlikely((rd = rcu_lock_domain_by_id(op->dom)) == NULL) )
     {
         gdprintk(XENLOG_INFO, "Could not find domain %d\n", op->dom);
@@ -2123,6 +2133,7 @@ gnttab_transfer(
     {
         bool_t okay;
         int rc;
+        gfn_t gfn;
 
         if ( i && hypercall_preempt_check() )
             return i;
@@ -2300,21 +2311,52 @@ gnttab_transfer(
         act = active_entry_acquire(e->grant_table, gop.ref);
 
         if ( evaluate_nospec(e->grant_table->gt_version == 1) )
+            gfn = _gfn(shared_entry_v1(e->grant_table, gop.ref).frame);
+        else
+            gfn = _gfn(shared_entry_v2(e->grant_table, gop.ref).full_page.frame);
+
+        if ( paging_mode_translate(e) )
         {
-            grant_entry_v1_t *sha = &shared_entry_v1(e->grant_table, gop.ref);
+            gfn_t gfn2;
+
+            active_entry_release(act);
+            grant_read_unlock(e->grant_table);
+
+            rc = notify_gfn(e, gfn);
+            if ( rc )
+                printk(XENLOG_G_WARNING
+                       "%pd: gref %u: xfer GFN %"PRI_gfn" may be inaccessible (%d)\n",
+                       e, gop.ref, gfn_x(gfn), rc);
+
+            grant_read_lock(e->grant_table);
+            act = active_entry_acquire(e->grant_table, gop.ref);
 
-            guest_physmap_add_page(e, _gfn(sha->frame), mfn, 0);
-            if ( !paging_mode_translate(e) )
-                sha->frame = mfn_x(mfn);
+            if ( evaluate_nospec(e->grant_table->gt_version == 1) )
+                gfn2 = _gfn(shared_entry_v1(e->grant_table, gop.ref).frame);
+            else
+                gfn2 = _gfn(shared_entry_v2(e->grant_table, gop.ref).
+                            full_page.frame);
+
+            if ( !gfn_eq(gfn, gfn2) )
+            {
+                printk(XENLOG_G_WARNING
+                       "%pd: gref %u: xfer GFN went %"PRI_gfn" -> %"PRI_gfn"\n",
+                       e, gop.ref, gfn_x(gfn), gfn_x(gfn2));
+                gfn = gfn2;
+            }
         }
-        else
-        {
-            grant_entry_v2_t *sha = &shared_entry_v2(e->grant_table, gop.ref);
 
-            guest_physmap_add_page(e, _gfn(sha->full_page.frame), mfn, 0);
-            if ( !paging_mode_translate(e) )
-                sha->full_page.frame = mfn_x(mfn);
+        guest_physmap_add_page(e, gfn, mfn, 0);
+
+        if ( !paging_mode_translate(e) )
+        {
+            if ( evaluate_nospec(e->grant_table->gt_version == 1) )
+                shared_entry_v1(e->grant_table, gop.ref).frame = mfn_x(mfn);
+            else
+                shared_entry_v2(e->grant_table, gop.ref).full_page.frame =
+                    mfn_x(mfn);
         }
+
         smp_wmb();
         shared_entry_header(e->grant_table, gop.ref)->flags |=
             GTF_transfer_completed;
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -203,6 +203,10 @@ static void populate_physmap(struct memo
         if ( unlikely(__copy_from_guest_offset(&gpfn, a->extent_list, i, 1)) )
             goto out;
 
+        if ( paging_mode_translate(d) &&
+             notify_gfn(d, _gfn(gpfn + (1U << a->extent_order) - 1)) )
+            goto out;
+
         if ( a->memflags & MEMF_populate_on_demand )
         {
             /* Disallow populating PoD pages on oneself. */
@@ -745,6 +749,10 @@ static long memory_exchange(XEN_GUEST_HA
                 continue;
             }
 
+            if ( paging_mode_translate(d) )
+                rc = notify_gfn(d,
+                                _gfn(gpfn + (1U << exch.out.extent_order) - 1));
+
             mfn = page_to_mfn(page);
             guest_physmap_add_page(d, _gfn(gpfn), mfn,
                                    exch.out.extent_order);
@@ -813,12 +821,20 @@ int xenmem_add_to_physmap(struct domain
     extra.foreign_domid = DOMID_INVALID;
 
     if ( xatp->space != XENMAPSPACE_gmfn_range )
-        return xenmem_add_to_physmap_one(d, xatp->space, extra,
+        return notify_gfn(d, _gfn(xatp->gpfn)) ?:
+               xenmem_add_to_physmap_one(d, xatp->space, extra,
                                          xatp->idx, _gfn(xatp->gpfn));
 
     if ( xatp->size < start )
         return -EILSEQ;
 
+    if ( !start && xatp->size )
+    {
+        rc = notify_gfn(d, _gfn(xatp->gpfn + xatp->size - 1));
+        if ( rc )
+            return rc;
+    }
+
     xatp->idx += start;
     xatp->gpfn += start;
     xatp->size -= start;
@@ -891,7 +907,8 @@ static int xenmem_add_to_physmap_batch(s
                                                extent, 1)) )
             return -EFAULT;
 
-        rc = xenmem_add_to_physmap_one(d, xatpb->space,
+        rc = notify_gfn(d, _gfn(gpfn)) ?:
+             xenmem_add_to_physmap_one(d, xatpb->space,
                                        xatpb->u,
                                        idx, _gfn(gpfn));
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -530,6 +530,14 @@ void iommu_share_p2m_table(struct domain
         iommu_get_ops()->share_p2m(d);
 }
 
+int iommu_notify_gfn(struct domain *d, gfn_t gfn)
+{
+    const struct iommu_ops *ops = dom_iommu(d)->platform_ops;
+
+    return need_iommu_pt_sync(d) && ops->notify_dfn
+           ? iommu_call(ops, notify_dfn, d, _dfn(gfn_x(gfn))) : 0;
+}
+
 void iommu_crash_shutdown(void)
 {
     if ( !iommu_crash_disable )
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -272,6 +272,8 @@ static inline void free_vcpu_guest_conte
 
 static inline void arch_vcpu_block(struct vcpu *v) {}
 
+#define arch_notify_gfn(d, gfn) ((void)(d), (void)(gfn), 0)
+
 #endif /* __ASM_DOMAIN_H__ */
 
 /*
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -647,6 +647,8 @@ static inline void free_vcpu_guest_conte
 
 void arch_vcpu_regs_init(struct vcpu *v);
 
+#define arch_notify_gfn(d, gfn) (gfn_valid(d, gfn) ? 0 : -EADDRNOTAVAIL)
+
 struct vcpu_hvm_context;
 int arch_set_info_hvm_guest(struct vcpu *v,
                             const struct vcpu_hvm_context *ctx);
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -237,6 +237,8 @@ struct iommu_ops {
     int __must_check (*lookup_page)(struct domain *d, dfn_t dfn, mfn_t *mfn,
                                     unsigned int *flags);
 
+    int __must_check (*notify_dfn)(struct domain *d, dfn_t dfn);
+
     void (*free_page_table)(struct page_info *);
 
 #ifdef CONFIG_X86
@@ -331,6 +333,7 @@ void iommu_crash_shutdown(void);
 int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
 
 void iommu_share_p2m_table(struct domain *d);
+int __must_check iommu_notify_gfn(struct domain *d, gfn_t gfn);
 
 #ifdef CONFIG_HAS_PCI
 int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1039,6 +1039,8 @@ static always_inline bool is_iommu_enabl
     return evaluate_nospec(d->options & XEN_DOMCTL_CDF_iommu);
 }
 
+#define notify_gfn(d, gfn) (arch_notify_gfn(d, gfn) ?: iommu_notify_gfn(d, gfn))
+
 extern bool sched_smt_power_savings;
 extern bool sched_disable_smt_switching;
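
As a quick way to see how the pieces of this patch compose, here is a minimal,
self-contained sketch of the resulting call chain: an arch-level validity check
chained (via ?:) with the IOMMU notification, invoked with the last GFN of the
range about to be mapped. Everything except that chaining is a simplified
stand-in - the struct domain model, the max_gfn/has_iommu fields, and the use
of errno.h's EADDRNOTAVAIL are assumptions for illustration only, not the Xen
definitions.

#include <errno.h>
#include <stdio.h>

struct domain {
    unsigned long max_gfn;  /* highest GFN the domain may use (stand-in) */
    int has_iommu;          /* does the domain have IOMMU page tables? */
};

/* Stand-in for the x86 arch_notify_gfn(): reject GFNs the domain cannot hold. */
static int arch_notify_gfn(const struct domain *d, unsigned long gfn)
{
    return gfn <= d->max_gfn ? 0 : -EADDRNOTAVAIL;
}

/* Stand-in for iommu_notify_gfn(): forward to the driver's notify_dfn hook. */
static int iommu_notify_gfn(const struct domain *d, unsigned long gfn)
{
    if ( !d->has_iommu )
        return 0;
    printf("IOMMU asked to prepare for dfn %#lx\n", gfn);
    return 0;
}

/* Mirrors the shape of the sched.h macro: arch check, then IOMMU notification. */
static int notify_gfn(const struct domain *d, unsigned long gfn)
{
    int rc = arch_notify_gfn(d, gfn);

    return rc ? rc : iommu_notify_gfn(d, gfn);
}

int main(void)
{
    struct domain d = { .max_gfn = (1UL << 20) - 1, .has_iommu = 1 };

    /* Callers pass the *last* GFN of the range they are about to map. */
    printf("in range:     %d\n", notify_gfn(&d, (1UL << 20) - 1));
    printf("out of range: %d\n", notify_gfn(&d, 1UL << 30));
    return 0;
}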
From patchwork Thu Nov 14 16:44:16 2019
X-Patchwork-Id: 11244017
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Juergen Gross, Andrew Cooper, Sander Eikelenboom
Date: Thu, 14 Nov 2019 17:44:16 +0100
Message-ID: <54412e2e-d28e-a117-9515-689e11db0df6@suse.com>
In-Reply-To: <8e8866de-33a8-68c0-3352-d6dfeec4a9b6@suse.com>
References: <8e8866de-33a8-68c0-3352-d6dfeec4a9b6@suse.com>
Subject: [Xen-devel] [PATCH v2 2/2] AMD/IOMMU: use notify_dfn() hook to update paging mode

update_paging_mode() expects to be invoked with the PCI devices lock held.
Since that expectation is checked only when the mode actually needs updating,
the violation of this rule by the majority of callers went unnoticed until
per-domain IOMMU setup was changed to do away with on-demand creation of
IOMMU page tables.

Acquiring the necessary lock in amd_iommu_map_page(), or in intermediate
layers of the generic IOMMU code, is not possible - we'd risk all sorts of
lock order violations. Hence the call to update_paging_mode() gets pulled out
of the function, to be invoked instead from the new notify_dfn() hook, where
no potentially conflicting locks are held by the callers.

Similarly the call to amd_iommu_alloc_root() gets pulled out - now that we
receive notification of all DFN range increases, there's no longer a need to
do this check when actually mapping a page.

Note that this ought to result in a small performance improvement as well:
the hook often gets invoked just once for larger blocks of pages, so rather
than going through amd_iommu_alloc_root() and update_paging_mode() once per
page, we may now invoke them just once per batch.

Reported-by: Sander Eikelenboom
Signed-off-by: Jan Beulich

--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -383,35 +383,16 @@ int amd_iommu_map_page(struct domain *d,
                        unsigned int flags, unsigned int *flush_flags)
 {
     struct domain_iommu *hd = dom_iommu(d);
-    int rc;
     unsigned long pt_mfn[7];
 
     memset(pt_mfn, 0, sizeof(pt_mfn));
 
     spin_lock(&hd->arch.mapping_lock);
 
-    rc = amd_iommu_alloc_root(hd);
-    if ( rc )
+    if ( !hd->arch.root_table )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, dfn = %"PRI_dfn"\n",
-                        dfn_x(dfn));
-        domain_crash(d);
-        return rc;
-    }
-
-    /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for wider dfn now */
-    if ( is_hvm_domain(d) )
-    {
-        if ( update_paging_mode(d, dfn_x(dfn)) )
-        {
-            spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed dfn = %"PRI_dfn"\n",
-                            dfn_x(dfn));
-            domain_crash(d);
-            return -EFAULT;
-        }
+        return -ENODATA;
     }
 
     if ( iommu_pde_from_dfn(d, dfn_x(dfn), pt_mfn, true) || (pt_mfn[1] == 0) )
@@ -468,6 +449,48 @@ int amd_iommu_unmap_page(struct domain *
 
     return 0;
 }
+
+int amd_iommu_notify_dfn(struct domain *d, dfn_t dfn)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    int rc;
+
+    ASSERT(is_hvm_domain(d));
+
+    /*
+     * Since HVM domain is initialized with 2 level IO page table,
+     * we might need a deeper page table for wider dfn now.
+     */
+    pcidevs_lock();
+    spin_lock(&hd->arch.mapping_lock);
+
+    rc = amd_iommu_alloc_root(hd);
+    if ( rc )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        pcidevs_unlock();
+        AMD_IOMMU_DEBUG("Root table alloc failed, dfn = %"PRI_dfn" (rc %d)\n",
+                        dfn_x(dfn), rc);
+        domain_crash(d);
+        return rc;
+    }
+
+    rc = update_paging_mode(d, dfn_x(dfn));
+    if ( rc )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        pcidevs_unlock();
+        AMD_IOMMU_DEBUG("Update paging mode failed dfn %"PRI_dfn" (rc %d)\n",
+                        dfn_x(dfn), rc);
+        domain_crash(d);
+        return rc;
+    }
+
+    spin_unlock(&hd->arch.mapping_lock);
+    pcidevs_unlock();
+
+    return 0;
+}
 
 static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
                                  unsigned int order)
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -628,6 +628,7 @@ static const struct iommu_ops __initcons
     .teardown = amd_iommu_domain_destroy,
     .map_page = amd_iommu_map_page,
     .unmap_page = amd_iommu_unmap_page,
+    .notify_dfn = amd_iommu_notify_dfn,
     .iotlb_flush = amd_iommu_flush_iotlb_pages,
     .iotlb_flush_all = amd_iommu_flush_iotlb_all,
     .free_page_table = deallocate_page_table,
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -61,6 +61,7 @@ int __must_check amd_iommu_map_page(stru
 int __must_check amd_iommu_unmap_page(struct domain *d, dfn_t dfn,
                                       unsigned int *flush_flags);
 int __must_check amd_iommu_alloc_root(struct domain_iommu *hd);
+int __must_check amd_iommu_notify_dfn(struct domain *d, dfn_t dfn);
 
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
                                        paddr_t phys_addr,
                                        unsigned long size, int iw, int ir);
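
To illustrate the locking argument made in the description - grow the IO page
table from the notification hook under a fixed lock order (device lock first,
then the per-domain mapping lock), so that the map path itself never needs the
device lock - here is a small standalone model. The pthread mutexes, the
levels_for() helper, and the 9-bits-per-level assumption are illustrative
stand-ins, not the actual AMD IOMMU code.

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t pcidevs_lock = PTHREAD_MUTEX_INITIALIZER;

struct domain_iommu {
    pthread_mutex_t mapping_lock;
    unsigned int paging_levels;      /* current depth of the IO page table */
};

/* Levels needed so that 'dfn' is addressable, assuming 9 bits per level. */
static unsigned int levels_for(unsigned long dfn)
{
    unsigned int levels = 2;

    while ( dfn >> (9 * levels) )
        ++levels;
    return levels;
}

/* Model of the notify_dfn() hook: runs before any mapping of 'dfn'. */
static int notify_dfn(struct domain_iommu *hd, unsigned long dfn)
{
    pthread_mutex_lock(&pcidevs_lock);       /* device lock first ... */
    pthread_mutex_lock(&hd->mapping_lock);   /* ... then the mapping lock */

    if ( levels_for(dfn) > hd->paging_levels )
    {
        hd->paging_levels = levels_for(dfn);
        printf("grew IO page table to %u levels for dfn %#lx\n",
               hd->paging_levels, dfn);
    }

    pthread_mutex_unlock(&hd->mapping_lock);
    pthread_mutex_unlock(&pcidevs_lock);
    return 0;
}

/* Model of the map path: only the mapping lock; depth must already suffice. */
static int map_page(struct domain_iommu *hd, unsigned long dfn)
{
    int rc = 0;

    pthread_mutex_lock(&hd->mapping_lock);
    if ( levels_for(dfn) > hd->paging_levels )
        rc = -1;                             /* caller failed to notify */
    pthread_mutex_unlock(&hd->mapping_lock);
    return rc;
}

int main(void)
{
    struct domain_iommu hd = {
        .mapping_lock = PTHREAD_MUTEX_INITIALIZER,
        .paging_levels = 2,
    };
    unsigned long dfn = 1UL << 30;

    notify_dfn(&hd, dfn);                    /* once per batch of pages */
    printf("map_page -> %d\n", map_page(&hd, dfn));
    return 0;
}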