From patchwork Fri Nov 17 09:47:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13458670 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1CF25C197A0 for ; Fri, 17 Nov 2023 09:48:21 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.634961.990615 (Exim 4.92) (envelope-from ) id 1r3vSQ-0007rF-3M; Fri, 17 Nov 2023 09:48:14 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 634961.990615; Fri, 17 Nov 2023 09:48:14 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r3vSP-0007r8-WA; Fri, 17 Nov 2023 09:48:13 +0000 Received: by outflank-mailman (input) for mailman id 634961; Fri, 17 Nov 2023 09:48:13 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r3vSP-0007p0-7m for xen-devel@lists.xenproject.org; Fri, 17 Nov 2023 09:48:13 +0000 Received: from mail-lf1-x130.google.com (mail-lf1-x130.google.com [2a00:1450:4864:20::130]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 6c50d818-852e-11ee-98dc-6d05b1d4d9a1; Fri, 17 Nov 2023 10:48:12 +0100 (CET) Received: by mail-lf1-x130.google.com with SMTP id 2adb3069b0e04-507b9408c61so2482600e87.0 for ; Fri, 17 Nov 2023 01:48:12 -0800 (PST) Received: from localhost ([213.195.113.99]) by smtp.gmail.com with ESMTPSA id p8-20020adff208000000b0032da75af3easm1754119wro.80.2023.11.17.01.48.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 17 Nov 2023 01:48:11 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 6c50d818-852e-11ee-98dc-6d05b1d4d9a1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1700214492; x=1700819292; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=14W9TQ4ErT83JuBFcjXmKlZRmBO8QnfuQD3+/1nam+w=; b=ICZXsxFvRV0OAbdY81cBVOmcrGyE/pGUzGkwlbwd2FstdBdUgFHWubNeba1IAyFfFZ D59YijrLDGs5SaLLJkco5H3CEK0ueBXW5VVXLTmYPXepe7/eKSZ32uPdI0QPXVyqAeRt aq+bUIZxZ6GuLM+G4A8qR6DQ1wQe+BxaXGFXQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700214492; x=1700819292; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=14W9TQ4ErT83JuBFcjXmKlZRmBO8QnfuQD3+/1nam+w=; b=Zrx1XtU6HmUyGW/UTg5lQi0T0FJaRunv+LdEUdUjB2L1AKZOHviA0dBNZ5qetNibld cIzXXC2DfM/7bXf3tF7IMv6qhuBbkXAIoxtzfPDyTntBvuynh8O9T3UxViEFkIXFZP9U y9foCyX9jiRRUx/WUSGZqlyHk48hXb7L+oA3YapIvFNCF9VV+2zOYafIaNyDfO4CdE83 QsbJ83MS9QlrlMwnE7PG7oWne98UIPkl1TTV4F2FZe7Xx9fwUOI5yCjnjfCYdyaKfdhE HFTDLAwPjAxepvoXKbmPf4JIKvHk1z6qIViPplvevZGZlcWSP6+NRG3P3omHEeC+hwwh as2g== X-Gm-Message-State: AOJu0YwYqxUHglr6jC99X6mrCRhdOb057+kGB93kD9JNXG/vCDyEr5aH x1638ZMc5uyEkBNQASgO8OugvsziJLITVM2i6Jk= X-Google-Smtp-Source: AGHT+IG7BD7sg4RKTTHhLpcIXugWZZXYVgnmSbO2lXN/5DElldJzmOO33eb7CQCIUzUpHWXURaHP9g== X-Received: by 2002:ac2:5e8b:0:b0:509:4587:bdd0 with SMTP id b11-20020ac25e8b000000b005094587bdd0mr13530628lfq.7.1700214492128; Fri, 17 Nov 2023 01:48:12 -0800 (PST) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: Roger Pau Monne , Jan Beulich , Andrew Cooper Subject: [PATCH 1/3] amd-vi: use the same IOMMU page table levels for PV and HVM Date: Fri, 17 Nov 2023 10:47:47 +0100 Message-ID: <20231117094749.81091-2-roger.pau@citrix.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231117094749.81091-1-roger.pau@citrix.com> References: <20231117094749.81091-1-roger.pau@citrix.com> MIME-Version: 1.0 Using different page table levels for HVM or PV guest is not helpful, and is not inline with the IOMMU implementation used by the other architecture vendor (VT-d). Switch to uniformly use DEFAULT_DOMAIN_ADDRESS_WIDTH in order to set the AMD-Vi page table levels. Note using the max RAM address for PV was bogus anyway, as there's no guarantee there can't be device MMIO or reserved regions past the maximum RAM region. Signed-off-by: Roger Pau Monné Acked-by: Andrew Cooper , although... --- xen/drivers/passthrough/amd/pci_amd_iommu.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c index 6bc73dc21052..f9e749d74da2 100644 --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -359,21 +359,17 @@ int __read_mostly amd_iommu_min_paging_mode = 1; static int cf_check amd_iommu_domain_init(struct domain *d) { struct domain_iommu *hd = dom_iommu(d); + int pgmode = amd_iommu_get_paging_mode( + 1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT)); + + if ( pgmode < 0 ) + return pgmode; /* - * Choose the number of levels for the IOMMU page tables. - * - PV needs 3 or 4, depending on whether there is RAM (including hotplug - * RAM) above the 512G boundary. - * - HVM could in principle use 3 or 4 depending on how much guest - * physical address space we give it, but this isn't known yet so use 4 - * unilaterally. - * - Unity maps may require an even higher number. + * Choose the number of levels for the IOMMU page tables, taking into + * account unity maps. */ - hd->arch.amd.paging_mode = max(amd_iommu_get_paging_mode( - is_hvm_domain(d) - ? 1UL << (DEFAULT_DOMAIN_ADDRESS_WIDTH - PAGE_SHIFT) - : get_upper_mfn_bound() + 1), - amd_iommu_min_paging_mode); + hd->arch.amd.paging_mode = max(pgmode, amd_iommu_min_paging_mode); return 0; } From patchwork Fri Nov 17 09:47:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13458671 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ADA82C197A0 for ; Fri, 17 Nov 2023 09:48:26 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.634964.990625 (Exim 4.92) (envelope-from ) id 1r3vSS-00089J-EI; Fri, 17 Nov 2023 09:48:16 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 634964.990625; Fri, 17 Nov 2023 09:48:16 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r3vSS-00089C-An; Fri, 17 Nov 2023 09:48:16 +0000 Received: by outflank-mailman (input) for mailman id 634964; Fri, 17 Nov 2023 09:48:15 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r3vSR-0007p0-69 for xen-devel@lists.xenproject.org; Fri, 17 Nov 2023 09:48:15 +0000 Received: from mail-lf1-x136.google.com (mail-lf1-x136.google.com [2a00:1450:4864:20::136]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 6d6d9253-852e-11ee-98dc-6d05b1d4d9a1; Fri, 17 Nov 2023 10:48:14 +0100 (CET) Received: by mail-lf1-x136.google.com with SMTP id 2adb3069b0e04-5094cb3a036so2672441e87.2 for ; Fri, 17 Nov 2023 01:48:14 -0800 (PST) Received: from localhost ([213.195.113.99]) by smtp.gmail.com with ESMTPSA id l17-20020a5d5611000000b0032f7d7ec4adsm1771881wrv.92.2023.11.17.01.48.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 17 Nov 2023 01:48:13 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 6d6d9253-852e-11ee-98dc-6d05b1d4d9a1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1700214493; x=1700819293; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pbfnLiSboEjj7ZovZYmQgy+lf6RZEwYxMOTQ6veSeok=; b=pPK9SBm2mp8lb2PfodqufAhENb6wvlC+6vaY8FZ1qz4hEMhft1b1Ya90ZgXa8V5tU4 nOp6ToUqrDEIzMbh48/Iu2gE1QSaq/b+4PbqMHHzPVnykA+xWKPhkZ3JgmGxl8p1b+k2 hMmzuhhRAzZ0C5jYyDChJHZfcHQAZOWpcwEik= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700214493; x=1700819293; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pbfnLiSboEjj7ZovZYmQgy+lf6RZEwYxMOTQ6veSeok=; b=v46mYQsiXJYuzlQzEUNVDB8dpjnuGvfk2FllahL4C5b/HZbsUy4AdZCEoXQIbHu9x+ gv87R8sPZBBAXI1UbiFqBN6AB3jcJmDQWvJzRl1PHbmNPYGg8mXfvkjCcZVFbrRld6rG ycajK4Z4aMKwV91wjhw0ETVA61eP+62aM9Q3IVlCBB/PDE3X1APRnVvuaOdaHqMgLNkY 4Dfv4U9GSF/h69f4Is8ooH4GSFIDUaUZObj8LgeKWUTK+zH6Wz7YPXnBXXJau75jmITh CQK3vIuSwtus7zep2yagMLPaiBIMeZ2yPfh5wlxk0j2pWYh1BrQ6hfXHAIOSye7yBbkg JSyQ== X-Gm-Message-State: AOJu0Ywn0YK9d5OZ5++pdh5d6alDTdWHIQLxZw4ANLYHwXz60rnW5ft1 jukr4RhFk+KIzzI3Tc9q9SFG9quZiYt2DSkCq5g= X-Google-Smtp-Source: AGHT+IErDsdGtVszEcMZBEn6fOFlC16pRsXfTnVi9InEOTwzXkIN9XjUoH+lhV4FzSftE2Ubi6zOqg== X-Received: by 2002:a19:434b:0:b0:507:9ae6:6913 with SMTP id m11-20020a19434b000000b005079ae66913mr11254723lfj.28.1700214493702; Fri, 17 Nov 2023 01:48:13 -0800 (PST) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: Roger Pau Monne , Jan Beulich , Andrew Cooper , Wei Liu , Paul Durrant Subject: [PATCH 2/3] x86/iommu: move xen_in_range() so it can be made static Date: Fri, 17 Nov 2023 10:47:48 +0100 Message-ID: <20231117094749.81091-3-roger.pau@citrix.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231117094749.81091-1-roger.pau@citrix.com> References: <20231117094749.81091-1-roger.pau@citrix.com> MIME-Version: 1.0 No functional change intended. Signed-off-by: Roger Pau Monné --- xen/arch/x86/include/asm/setup.h | 2 -- xen/arch/x86/setup.c | 49 ---------------------------- xen/drivers/passthrough/x86/iommu.c | 50 +++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+), 51 deletions(-) diff --git a/xen/arch/x86/include/asm/setup.h b/xen/arch/x86/include/asm/setup.h index 9a460e4db8f4..4a1600decf6a 100644 --- a/xen/arch/x86/include/asm/setup.h +++ b/xen/arch/x86/include/asm/setup.h @@ -36,8 +36,6 @@ unsigned long initial_images_nrpages(nodeid_t node); void discard_initial_images(void); void *bootstrap_map(const module_t *mod); -int xen_in_range(unsigned long mfn); - extern uint8_t kbd_shift_flags; #ifdef NDEBUG diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index a3d3f797bb1e..54daff3d4942 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -2080,55 +2080,6 @@ void arch_get_xen_caps(xen_capabilities_info_t *info) } } -int __hwdom_init xen_in_range(unsigned long mfn) -{ - paddr_t start, end; - int i; - - enum { region_s3, region_ro, region_rw, region_bss, nr_regions }; - static struct { - paddr_t s, e; - } xen_regions[nr_regions] __hwdom_initdata; - - /* initialize first time */ - if ( !xen_regions[0].s ) - { - /* S3 resume code (and other real mode trampoline code) */ - xen_regions[region_s3].s = bootsym_phys(trampoline_start); - xen_regions[region_s3].e = bootsym_phys(trampoline_end); - - /* - * This needs to remain in sync with the uses of the same symbols in - * - __start_xen() (above) - * - is_xen_fixed_mfn() - * - tboot_shutdown() - */ - - /* hypervisor .text + .rodata */ - xen_regions[region_ro].s = __pa(&_stext); - xen_regions[region_ro].e = __pa(&__2M_rodata_end); - /* hypervisor .data + .bss */ - xen_regions[region_rw].s = __pa(&__2M_rwdata_start); - xen_regions[region_rw].e = __pa(&__2M_rwdata_end); - if ( efi_boot_mem_unused(&start, &end) ) - { - ASSERT(__pa(start) >= xen_regions[region_rw].s); - ASSERT(__pa(end) <= xen_regions[region_rw].e); - xen_regions[region_rw].e = __pa(start); - xen_regions[region_bss].s = __pa(end); - xen_regions[region_bss].e = __pa(&__2M_rwdata_end); - } - } - - start = (paddr_t)mfn << PAGE_SHIFT; - end = start + PAGE_SIZE; - for ( i = 0; i < nr_regions; i++ ) - if ( (start < xen_regions[i].e) && (end > xen_regions[i].s) ) - return 1; - - return 0; -} - static int __hwdom_init cf_check io_bitmap_cb( unsigned long s, unsigned long e, void *ctx) { diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c index 857dccb6a465..d70cee9fea77 100644 --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -13,6 +13,7 @@ */ #include +#include #include #include #include @@ -300,6 +301,55 @@ void iommu_identity_map_teardown(struct domain *d) } } +static int __hwdom_init xen_in_range(unsigned long mfn) +{ + paddr_t start, end; + int i; + + enum { region_s3, region_ro, region_rw, region_bss, nr_regions }; + static struct { + paddr_t s, e; + } xen_regions[nr_regions] __hwdom_initdata; + + /* initialize first time */ + if ( !xen_regions[0].s ) + { + /* S3 resume code (and other real mode trampoline code) */ + xen_regions[region_s3].s = bootsym_phys(trampoline_start); + xen_regions[region_s3].e = bootsym_phys(trampoline_end); + + /* + * This needs to remain in sync with the uses of the same symbols in + * - __start_xen() + * - is_xen_fixed_mfn() + * - tboot_shutdown() + */ + + /* hypervisor .text + .rodata */ + xen_regions[region_ro].s = __pa(&_stext); + xen_regions[region_ro].e = __pa(&__2M_rodata_end); + /* hypervisor .data + .bss */ + xen_regions[region_rw].s = __pa(&__2M_rwdata_start); + xen_regions[region_rw].e = __pa(&__2M_rwdata_end); + if ( efi_boot_mem_unused(&start, &end) ) + { + ASSERT(__pa(start) >= xen_regions[region_rw].s); + ASSERT(__pa(end) <= xen_regions[region_rw].e); + xen_regions[region_rw].e = __pa(start); + xen_regions[region_bss].s = __pa(end); + xen_regions[region_bss].e = __pa(&__2M_rwdata_end); + } + } + + start = (paddr_t)mfn << PAGE_SHIFT; + end = start + PAGE_SIZE; + for ( i = 0; i < nr_regions; i++ ) + if ( (start < xen_regions[i].e) && (end > xen_regions[i].s) ) + return 1; + + return 0; +} + static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d, unsigned long pfn, unsigned long max_pfn) From patchwork Fri Nov 17 09:47:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13458672 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E3712C072A2 for ; Fri, 17 Nov 2023 09:48:28 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.634965.990635 (Exim 4.92) (envelope-from ) id 1r3vSV-0008Sk-MR; Fri, 17 Nov 2023 09:48:19 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 634965.990635; Fri, 17 Nov 2023 09:48:19 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r3vSV-0008Sb-J0; Fri, 17 Nov 2023 09:48:19 +0000 Received: by outflank-mailman (input) for mailman id 634965; Fri, 17 Nov 2023 09:48:17 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1r3vST-0007p0-KV for xen-devel@lists.xenproject.org; Fri, 17 Nov 2023 09:48:17 +0000 Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [2a00:1450:4864:20::329]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id 6ea78262-852e-11ee-98dc-6d05b1d4d9a1; Fri, 17 Nov 2023 10:48:16 +0100 (CET) Received: by mail-wm1-x329.google.com with SMTP id 5b1f17b1804b1-407c3adef8eso15443045e9.2 for ; Fri, 17 Nov 2023 01:48:16 -0800 (PST) Received: from localhost ([213.195.113.99]) by smtp.gmail.com with ESMTPSA id n27-20020a05600c3b9b00b00405959469afsm2140251wms.3.2023.11.17.01.48.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 17 Nov 2023 01:48:14 -0800 (PST) X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 6ea78262-852e-11ee-98dc-6d05b1d4d9a1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=citrix.com; s=google; t=1700214495; x=1700819295; darn=lists.xenproject.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=G0uzIW7esXwONO7eMjXy6VgiWyzem2oujkU/WJcseRA=; b=aZAV/MgHtF7xWDL2lR2btRjyhKv2KsF4JBFeGu0QE9VohRRkA9N8lnR9048SiJ3R6V Wl+0DqvjOLZAZrSWhI9d4/gDKLs+BEqR/lL85m75TfFjiNA24HRIVtFIbrYKLRWNX5EL FIkmyloQ4719j8qd8UMcO9+WXz08p0MsrpNtA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700214495; x=1700819295; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=G0uzIW7esXwONO7eMjXy6VgiWyzem2oujkU/WJcseRA=; b=pF6UO0RlIhkhUkCgMrofZ0Y/iLbSvTscjXtDpcczs79CkYizI4IcETC+Iv1cB5A1wB A694padRA6t1LEtLAwfP+B/rZ86Elrf5px/lHyWOUnL0sv1HG3uXDMEYYqj+Aqc9TIXU TWViYY1XGuxnwB4w9gz+HYlQlk8fJ96JWpG8iWe0FLBc4TpFOdHvdIcumYdKb/Xwrm3s NA43Ps65ob+lkif48CArtDiqb+G53Aphb6i4JhwpKBKN759x5OGhnDzv+uDJZADcnpaq a8lhWHAG+Wqs1emHUWbMRMvQ8FV9s2KuCScB5It/taVp9Yqwu5YPGTfrqgWTghRJx/1W Q4fQ== X-Gm-Message-State: AOJu0Yyo50WdMOc08AGlsdctZe8XrF2TKU1vY0XLN0vg+Pt8lBbZETFV vlbnV088H9NKfwc+SSSYrmQWAsOkH4dFkgB823s= X-Google-Smtp-Source: AGHT+IGmR6cKgc8EYPAsakfgX5DVJKAwzRszpfSTMBYURWhaclALSaZzr17mYsFCANRIgUFfWHL4yQ== X-Received: by 2002:a05:600c:4f13:b0:409:6e0e:e948 with SMTP id l19-20020a05600c4f1300b004096e0ee948mr14594620wmq.1.1700214495307; Fri, 17 Nov 2023 01:48:15 -0800 (PST) From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: Roger Pau Monne , Paul Durrant , Jan Beulich , Andrew Cooper , Wei Liu Subject: [PATCH 3/3] x86/iommu: use a rangeset for hwdom setup Date: Fri, 17 Nov 2023 10:47:49 +0100 Message-ID: <20231117094749.81091-4-roger.pau@citrix.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231117094749.81091-1-roger.pau@citrix.com> References: <20231117094749.81091-1-roger.pau@citrix.com> MIME-Version: 1.0 The current loop that iterates from 0 to the maximum RAM address in order to setup the IOMMU mappings is highly inefficient, and it will get worse as the amount of RAM increases. It's also not accounting for any reserved regions past the last RAM address. Instead of iterating over memory addresses, iterate over the memory map regions and use a rangeset in order to keep track of which ranges need to be identity mapped in the hardware domain physical address space. On an AMD EPYC 7452 with 512GiB of RAM, the time to execute arch_iommu_hwdom_init() in nanoseconds is: x old + new N Min Max Median Avg Stddev x 5 2.2364154e+10 2.338244e+10 2.2474685e+10 2.2622409e+10 4.2949869e+08 + 5 1025012 1033036 1026188 1028276.2 3623.1194 Difference at 95.0% confidence -2.26214e+10 +/- 4.42931e+08 -99.9955% +/- 9.05152e-05% (Student's t, pooled s = 3.03701e+08) Execution time of arch_iommu_hwdom_init() goes down from ~22s to ~0.001s. Note there's a change for HVM domains (ie: PVH dom0) that get switched to create the p2m mappings using map_mmio_regions() instead of p2m_add_identity_entry(), so that ranges can be mapped with a single function call if possible. Note that the interface of map_mmio_regions() doesn't allow creating read-only mappings, but so far there are no such mappings created for PVH dom0 in arch_iommu_hwdom_init(). No change intended in the resulting mappings that a hardware domain gets. Signed-off-by: Roger Pau Monné --- xen/arch/x86/hvm/io.c | 15 +- xen/arch/x86/include/asm/hvm/io.h | 4 +- xen/drivers/passthrough/x86/iommu.c | 355 +++++++++++++++++----------- 3 files changed, 231 insertions(+), 143 deletions(-) diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c index d75af83ad01f..7c4b7317b13a 100644 --- a/xen/arch/x86/hvm/io.c +++ b/xen/arch/x86/hvm/io.c @@ -364,9 +364,20 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d, return NULL; } -bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr) +int vpci_subtract_mmcfg(const struct domain *d, struct rangeset *r) { - return vpci_mmcfg_find(d, addr); + const struct hvm_mmcfg *mmcfg; + + list_for_each_entry ( mmcfg, &d->arch.hvm.mmcfg_regions, next ) + { + int rc = rangeset_remove_range(r, PFN_DOWN(mmcfg->addr), + PFN_DOWN(mmcfg->addr + mmcfg->size - 1)); + + if ( rc ) + return rc; + } + + return 0; } static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg, diff --git a/xen/arch/x86/include/asm/hvm/io.h b/xen/arch/x86/include/asm/hvm/io.h index e5225e75ef26..c9d058fd5695 100644 --- a/xen/arch/x86/include/asm/hvm/io.h +++ b/xen/arch/x86/include/asm/hvm/io.h @@ -153,8 +153,8 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr, /* Destroy tracked MMCFG areas. */ void destroy_vpci_mmcfg(struct domain *d); -/* Check if an address is between a MMCFG region for a domain. */ -bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr); +/* Remove MMCFG regions from a given rangeset. */ +int vpci_subtract_mmcfg(const struct domain *d, struct rangeset *r); #endif /* __ASM_X86_HVM_IO_H__ */ diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c index d70cee9fea77..be2c909f61d8 100644 --- a/xen/drivers/passthrough/x86/iommu.c +++ b/xen/drivers/passthrough/x86/iommu.c @@ -301,129 +301,133 @@ void iommu_identity_map_teardown(struct domain *d) } } -static int __hwdom_init xen_in_range(unsigned long mfn) +static int __hwdom_init remove_xen_ranges(struct rangeset *r) { paddr_t start, end; - int i; - - enum { region_s3, region_ro, region_rw, region_bss, nr_regions }; - static struct { - paddr_t s, e; - } xen_regions[nr_regions] __hwdom_initdata; + int rc; - /* initialize first time */ - if ( !xen_regions[0].s ) - { - /* S3 resume code (and other real mode trampoline code) */ - xen_regions[region_s3].s = bootsym_phys(trampoline_start); - xen_regions[region_s3].e = bootsym_phys(trampoline_end); + /* S3 resume code (and other real mode trampoline code) */ + rc = rangeset_remove_range(r, PFN_DOWN(bootsym_phys(trampoline_start)), + PFN_DOWN(bootsym_phys(trampoline_end))); + if ( rc ) + return rc; - /* - * This needs to remain in sync with the uses of the same symbols in - * - __start_xen() - * - is_xen_fixed_mfn() - * - tboot_shutdown() - */ + /* + * This needs to remain in sync with the uses of the same symbols in + * - __start_xen() + * - is_xen_fixed_mfn() + * - tboot_shutdown() + */ + /* hypervisor .text + .rodata */ + rc = rangeset_remove_range(r, PFN_DOWN(__pa(&_stext)), + PFN_DOWN(__pa(&__2M_rodata_end))); + if ( rc ) + return rc; - /* hypervisor .text + .rodata */ - xen_regions[region_ro].s = __pa(&_stext); - xen_regions[region_ro].e = __pa(&__2M_rodata_end); - /* hypervisor .data + .bss */ - xen_regions[region_rw].s = __pa(&__2M_rwdata_start); - xen_regions[region_rw].e = __pa(&__2M_rwdata_end); - if ( efi_boot_mem_unused(&start, &end) ) - { - ASSERT(__pa(start) >= xen_regions[region_rw].s); - ASSERT(__pa(end) <= xen_regions[region_rw].e); - xen_regions[region_rw].e = __pa(start); - xen_regions[region_bss].s = __pa(end); - xen_regions[region_bss].e = __pa(&__2M_rwdata_end); - } + /* hypervisor .data + .bss */ + if ( efi_boot_mem_unused(&start, &end) ) + { + ASSERT(__pa(start) >= __pa(&__2M_rwdata_start)); + rc = rangeset_remove_range(r, PFN_DOWN(__pa(&__2M_rwdata_start)), + PFN_DOWN(__pa(start))); + if ( rc ) + return rc; + ASSERT(__pa(end) <= __pa(&__2M_rwdata_end)); + rc = rangeset_remove_range(r, PFN_DOWN(__pa(end)), + PFN_DOWN(__pa(&__2M_rwdata_end))); + if ( rc ) + return rc; + } + else + { + rc = rangeset_remove_range(r, PFN_DOWN(__pa(&__2M_rwdata_start)), + PFN_DOWN(__pa(&__2M_rwdata_end))); + if ( rc ) + return rc; } - - start = (paddr_t)mfn << PAGE_SHIFT; - end = start + PAGE_SIZE; - for ( i = 0; i < nr_regions; i++ ) - if ( (start < xen_regions[i].e) && (end > xen_regions[i].s) ) - return 1; return 0; } -static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d, - unsigned long pfn, - unsigned long max_pfn) +static int __hwdom_init map_subtract(unsigned long s, unsigned long e, + void *data) { - mfn_t mfn = _mfn(pfn); - unsigned int i, type, perms = IOMMUF_readable | IOMMUF_writable; + struct rangeset *map = data; - /* - * Set up 1:1 mapping for dom0. Default to include only conventional RAM - * areas and let RMRRs include needed reserved regions. When set, the - * inclusive mapping additionally maps in every pfn up to 4GB except those - * that fall in unusable ranges for PV Dom0. - */ - if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) ) - return 0; + return rangeset_remove_range(map, s, e); +} - switch ( type = page_get_ram_type(mfn) ) - { - case RAM_TYPE_UNUSABLE: - return 0; +struct map_data { + struct domain *d; + unsigned int flush_flags; + bool ro; +}; - case RAM_TYPE_CONVENTIONAL: - if ( iommu_hwdom_strict ) - return 0; - break; +static int __hwdom_init identity_map(unsigned long s, unsigned long e, + void *data) +{ + struct map_data *info = data; + struct domain *d = info->d; + long rc; - default: - if ( type & RAM_TYPE_RESERVED ) - { - if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved ) - perms = 0; - } - else if ( is_hvm_domain(d) ) - return 0; - else if ( !iommu_hwdom_inclusive || pfn > max_pfn ) - perms = 0; - } + if ( iommu_verbose ) + printk(XENLOG_INFO " [%010lx, %010lx] R%c\n", + s, e, info->ro ? 'O' : 'W'); - /* Check that it doesn't overlap with the Interrupt Address Range. */ - if ( pfn >= 0xfee00 && pfn <= 0xfeeff ) - return 0; - /* ... or the IO-APIC */ - if ( has_vioapic(d) ) + if ( paging_mode_translate(d) ) { - for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ ) - if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) ) - return 0; + if ( info->ro ) + { + ASSERT_UNREACHABLE(); + return 0; + } + while ( (rc = map_mmio_regions(d, _gfn(s), e - s + 1, _mfn(s))) > 0 ) + { + s += rc; + process_pending_softirqs(); + } } - else if ( is_pv_domain(d) ) + else { - /* - * Be consistent with CPU mappings: Dom0 is permitted to establish r/o - * ones there (also for e.g. HPET in certain cases), so it should also - * have such established for IOMMUs. - */ - if ( iomem_access_permitted(d, pfn, pfn) && - rangeset_contains_singleton(mmio_ro_ranges, pfn) ) - perms = IOMMUF_readable; + const unsigned int perms = IOMMUF_readable | IOMMUF_preempt | + (info->ro ? 0 : IOMMUF_writable); + + if ( info->ro && !iomem_access_permitted(d, s, e) ) + { + /* + * Should be more fine grained in order to not map the forbidden + * frame instead of rejecting the region as a whole, but it's only + * for read-only MMIO regions, which are very limited. + */ + printk(XENLOG_DEBUG + "IOMMU read-only mapping of region [%lx, %lx] forbidden\n", + s, e); + return 0; + } + while ( (rc = iommu_map(d, _dfn(s), _mfn(s), e - s + 1, + perms, &info->flush_flags)) > 0 ) + { + s += rc; + process_pending_softirqs(); + } } - /* - * ... or the PCIe MCFG regions. - * TODO: runtime added MMCFG regions are not checked to make sure they - * don't overlap with already mapped regions, thus preventing trapping. - */ - if ( has_vpci(d) && vpci_is_mmcfg_address(d, pfn_to_paddr(pfn)) ) - return 0; + ASSERT(rc <= 0); + if ( rc ) + printk(XENLOG_WARNING + "IOMMU identity mapping of [%lx, %lx] failed: %ld\n", + s, e, rc); - return perms; + /* Ignore errors and attempt to map the remaining regions. */ + return 0; } void __hwdom_init arch_iommu_hwdom_init(struct domain *d) { - unsigned long i, top, max_pfn, start, count; - unsigned int flush_flags = 0, start_perms = 0; + const unsigned long max_pfn = PFN_DOWN(GB(4)) - 1; + unsigned int i; + struct rangeset *map; + struct map_data map_data = { .d = d }; + int rc; BUG_ON(!is_hardware_domain(d)); @@ -447,62 +451,135 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d) if ( iommu_hwdom_passthrough ) return; - max_pfn = (GB(4) >> PAGE_SHIFT) - 1; - top = max(max_pdx, pfn_to_pdx(max_pfn) + 1); + map = rangeset_new(NULL, NULL, 0); + if ( !map ) + panic("IOMMU initialization failed unable to allocate rangeset\n"); + + if ( iommu_hwdom_inclusive ) + { + /* Add the whole range below 4GB, UNUSABLE regions will be removed. */ + rc = rangeset_add_range(map, 0, max_pfn); + if ( rc ) + panic("IOMMU inclusive mappings can't be added: %d\n", + rc); + } - for ( i = 0, start = 0, count = 0; i < top; ) + for ( i = 0; i < e820.nr_map; i++ ) { - unsigned long pfn = pdx_to_pfn(i); - unsigned int perms = hwdom_iommu_map(d, pfn, max_pfn); + struct e820entry entry = e820.map[i]; - if ( !perms ) - /* nothing */; - else if ( paging_mode_translate(d) ) + switch ( entry.type ) { - int rc; + case E820_UNUSABLE: + if ( !iommu_hwdom_inclusive || PFN_DOWN(entry.addr) > max_pfn ) + continue; - rc = p2m_add_identity_entry(d, pfn, - perms & IOMMUF_writable ? p2m_access_rw - : p2m_access_r, - 0); + rc = rangeset_remove_range(map, PFN_DOWN(entry.addr), + PFN_DOWN(entry.addr + entry.size - 1)); if ( rc ) - printk(XENLOG_WARNING - "%pd: identity mapping of %lx failed: %d\n", - d, pfn, rc); + panic("IOMMU failed to remove unusable memory: %d\n", + rc); + continue; + + case E820_RESERVED: + if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved ) + continue; + break; + + case E820_RAM: + if ( iommu_hwdom_strict ) + continue; + break; + + default: + if ( !iommu_hwdom_inclusive || PFN_DOWN(entry.addr) >= max_pfn ) + continue; + entry.size = pfn_to_paddr(max_pfn) - 1 - entry.addr; + break; } - else if ( pfn != start + count || perms != start_perms ) - { - long rc; - commit: - while ( (rc = iommu_map(d, _dfn(start), _mfn(start), count, - start_perms | IOMMUF_preempt, - &flush_flags)) > 0 ) - { - start += rc; - count -= rc; - process_pending_softirqs(); - } + if ( iommu_hwdom_inclusive && + PFN_DOWN(entry.addr + entry.size - 1) <= max_pfn ) + /* + * Any range below 4GB is already in the rangeset if using inclusive + * mode. + */ + continue; + + rc = rangeset_add_range(map, PFN_DOWN(entry.addr), + PFN_DOWN(entry.addr + entry.size - 1)); + if ( rc ) + panic("IOMMU failed to add identity range: %d\n", rc); + } + + /* Remove any areas in-use by Xen. */ + rc = remove_xen_ranges(map); + if ( rc ) + panic("IOMMU failed to remove Xen ranges: %d\n", rc); + + /* Remove any overlap with the Interrupt Address Range. */ + rc = rangeset_remove_range(map, 0xfee00, 0xfeeff); + if ( rc ) + panic("IOMMU failed to remove Interrupt Address Range: %d\n", + rc); + + /* If emulating IO-APIC(s) make sure the base address is unmapped. */ + if ( has_vioapic(d) ) + { + for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ ) + { + rc = rangeset_remove_singleton(map, + PFN_DOWN(domain_vioapic(d, i)->base_address)); if ( rc ) - printk(XENLOG_WARNING - "%pd: IOMMU identity mapping of [%lx,%lx) failed: %ld\n", - d, start, start + count, rc); - start = pfn; - count = 1; - start_perms = perms; + panic("IOMMU failed to remove IO-APIC: %d\n", + rc); } - else - ++count; + } - if ( !(++i & 0xfffff) ) - process_pending_softirqs(); + if ( is_pv_domain(d) ) + { + /* + * Be consistent with CPU mappings: Dom0 is permitted to establish r/o + * ones there (also for e.g. HPET in certain cases), so it should also + * have such established for IOMMUs. + */ + rc = rangeset_report_ranges(mmio_ro_ranges, 0, ~0UL, map_subtract, map); + if ( rc ) + panic("IOMMU failed to remove read-only regions: %d\n", + rc); + } + + if ( has_vpci(d) ) + { + /* + * TODO: runtime added MMCFG regions are not checked to make sure they + * don't overlap with already mapped regions, thus preventing trapping. + */ + rc = vpci_subtract_mmcfg(d, map); + if ( rc ) + panic("IOMMU unable to remove MMCFG areas: %d\n", rc); + } - if ( i == top && count ) - goto commit; + if ( iommu_verbose ) + printk(XENLOG_INFO "d%u: identity mappings for IOMMU:\n", + d->domain_id); + + rc = rangeset_report_ranges(map, 0, ~0UL, identity_map, &map_data); + if ( rc ) + panic("IOMMU unable to create mappings: %d\n", rc); + if ( is_pv_domain(d) ) + { + map_data.ro = true; + rc = rangeset_report_ranges(mmio_ro_ranges, 0, ~0UL, identity_map, + &map_data); + if ( rc ) + panic("IOMMU unable to create read-only mappings: %d\n", rc); } + rangeset_destroy(map); + /* Use if to avoid compiler warning */ - if ( iommu_iotlb_flush_all(d, flush_flags) ) + if ( iommu_iotlb_flush_all(d, map_data.flush_flags) ) return; }