From patchwork Wed Jan 8 14:26:41 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931031
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 01/18] x86/mm: purge unneeded destroy_perdomain_mapping()
Date: Wed, 8 Jan 2025 15:26:41 +0100
Message-ID: <20250108142659.99490-2-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

The destroy_perdomain_mapping() call in the hvm_domain_initialise() fail
path is useless: destroy_perdomain_mapping() called with nr == 0 is
effectively a no-op, as there are no entries to tear down.  Remove the
call, as arch_domain_create() already calls free_perdomain_mappings() on
failure.

There's also a call to destroy_perdomain_mapping() in pv_domain_destroy()
which is not needed either: arch_domain_destroy() already unconditionally
calls free_perdomain_mappings(), which does the same as
destroy_perdomain_mapping(), plus additionally frees the page table
structures.
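To make the no-op claim concrete, below is a condensed sketch of the entry
loop in destroy_perdomain_mapping() (the loop shape is visible in the
unmodified code that patch 03 of this series touches; this is not the full
function).  With nr == 0 the loop condition fails up front, so no entry is
ever inspected or freed:

    /* Condensed sketch: teardown is bounded by nr, so nr == 0 does nothing. */
    for ( ; nr && i < L1_PAGETABLE_ENTRIES; --nr, ++i )
    {
        /* Never reached when called with nr == 0. */
        if ( (l1e_get_flags(l1tab[i]) & (_PAGE_PRESENT | _PAGE_AVAIL0)) ==
             (_PAGE_PRESENT | _PAGE_AVAIL0) )
            free_domheap_page(l1e_get_page(l1tab[i]));
        l1tab[i] = l1e_empty();
    }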
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/hvm/hvm.c   | 1 -
 xen/arch/x86/pv/domain.c | 3 ---
 2 files changed, 4 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 922c9b3af64d..70fdddae583d 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -708,7 +708,6 @@ int hvm_domain_initialise(struct domain *d,
     XFREE(d->arch.hvm.irq);
  fail0:
     hvm_destroy_cacheattr_region_list(d);
-    destroy_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0);
  fail:
     hvm_domain_relinquish_resources(d);
     XFREE(d->arch.hvm.io_handler);
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 7aef628f55be..bc7cd0c62f0e 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -345,9 +345,6 @@ void pv_domain_destroy(struct domain *d)
 {
     pv_l1tf_domain_destroy(d);
 
-    destroy_perdomain_mapping(d, GDT_LDT_VIRT_START,
-                              GDT_LDT_MBYTES << (20 - PAGE_SHIFT));
-
     XFREE(d->arch.pv.cpuidmasks);
 
     FREE_XENHEAP_PAGE(d->arch.pv.gdt_ldt_l1tab);

From patchwork Wed Jan 8 14:26:42 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931035
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 02/18] x86/domain: limit window where curr_vcpu != current on context switch
Date: Wed, 8 Jan 2025 15:26:42 +0100
Message-ID: <20250108142659.99490-3-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

On x86 Xen will perform lazy context switches to the idle vCPU, where the
previously running vCPU context is not overwritten, and only current is
updated to point to the idle vCPU.  The state is then split between
current and curr_vcpu: current points to the idle vCPU, while curr_vcpu
points to the vCPU whose context is loaded on the pCPU.

While in that lazy context switched state, certain calls (like
map_domain_page()) will trigger a full synchronization of the pCPU state
by forcing a context switch.  Note however that calling any such function
inside the context switch code itself is very likely to trigger an
infinite recursion loop.

Attempt to limit the window where curr_vcpu != current in the context
switch code, so as to prevent an infinite recursion loop around
sync_local_execstate().

This is required for using map_domain_page() in the vCPU context switch
code; otherwise using map_domain_page() in that context ends up in a
recursive sync_local_execstate() loop:

map_domain_page() -> sync_local_execstate() -> map_domain_page() -> ...
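As a hedged illustration of the recursion being closed off (the body below
is a simplified sketch, not the actual Xen implementation, and
mapcache_lookup() is a hypothetical placeholder for the real mapcache
logic):

    void *map_domain_page_sketch(mfn_t mfn)
    {
        /*
         * The mapcache needs a fully synchronized pCPU, so a lazy state
         * (current == idle vCPU, curr_vcpu != current) forces a full
         * context switch first.
         */
        if ( this_cpu(curr_vcpu) != current )
            sync_local_execstate();      /* -> __context_switch() */

        /*
         * If __context_switch() itself used map_domain_page() while the
         * curr_vcpu != current window was still open, the check above
         * would fire again:
         *   map_domain_page() -> sync_local_execstate() -> map_domain_page()
         * Updating current and curr_vcpu back-to-back inside
         * __context_switch() closes that window.
         */
        return mapcache_lookup(mfn);     /* hypothetical placeholder */
    }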
Signed-off-by: Roger Pau Monné
---
Changes since v1:
 - New in this version.
---
 xen/arch/x86/domain.c | 58 +++++++++++++++++++++++++++++++++++--------
 xen/arch/x86/traps.c  |  2 --
 2 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 78a13e6812c9..1f680bf176ee 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1982,16 +1982,16 @@ static void load_default_gdt(unsigned int cpu)
     per_cpu(full_gdt_loaded, cpu) = false;
 }
 
-static void __context_switch(void)
+static void __context_switch(struct vcpu *n)
 {
     struct cpu_user_regs *stack_regs = guest_cpu_user_regs();
     unsigned int cpu = smp_processor_id();
     struct vcpu *p = per_cpu(curr_vcpu, cpu);
-    struct vcpu *n = current;
     struct domain *pd = p->domain, *nd = n->domain;
 
     ASSERT(p != n);
     ASSERT(!vcpu_cpu_dirty(n));
+    ASSERT(p == current);
 
     if ( !is_idle_domain(pd) )
     {
@@ -2036,6 +2036,18 @@ static void __context_switch(void)
 
     write_ptbase(n);
 
+    /*
+     * It's relevant to set both current and curr_vcpu back-to-back, to avoid a
+     * window where calls to mapcache_current_vcpu() during the context switch
+     * could trigger a recursive loop.
+     *
+     * Do the current switch immediately after switching to the new guest
+     * page-tables, so that current is (almost) always in sync with the
+     * currently loaded page-tables.
+     */
+    set_current(n);
+    per_cpu(curr_vcpu, cpu) = n;
+
 #ifdef CONFIG_PV
     /* Prefetch the VMCB if we expect to use it later in the context switch */
     if ( using_svm() && is_pv_64bit_domain(nd) && !is_idle_domain(nd) )
@@ -2048,8 +2060,6 @@ static void __context_switch(void)
     if ( pd != nd )
         cpumask_clear_cpu(cpu, pd->dirty_cpumask);
     write_atomic(&p->dirty_cpu, VCPU_CPU_CLEAN);
-
-    per_cpu(curr_vcpu, cpu) = n;
 }
 
 void context_switch(struct vcpu *prev, struct vcpu *next)
@@ -2081,16 +2091,36 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
 
     local_irq_disable();
 
-    set_current(next);
-
     if ( (per_cpu(curr_vcpu, cpu) == next) ||
          (is_idle_domain(nextd) && cpu_online(cpu)) )
     {
+        /*
+         * Lazy context switch to the idle vCPU, set current == idle.  Full
+         * context switch happens if/when sync_local_execstate() is called.
+         */
+        set_current(next);
         local_irq_enable();
     }
     else
     {
-        __context_switch();
+        /*
+         * curr_vcpu will always point to the currently loaded vCPU context, as
+         * it's not updated when doing a lazy switch to the idle vCPU.
+         */
+        struct vcpu *prev_ctx = per_cpu(curr_vcpu, cpu);
+
+        if ( prev_ctx != current )
+        {
+            /*
+             * Doing a full context switch to a non-idle vCPU from a lazy
+             * context switched state.  Adjust current to point to the
+             * currently loaded vCPU context.
+             */
+            ASSERT(current == idle_vcpu[cpu]);
+            ASSERT(!is_idle_vcpu(next));
+            set_current(prev_ctx);
+        }
+        __context_switch(next);
 
         /* Re-enable interrupts before restoring state which may fault. */
         local_irq_enable();
@@ -2156,15 +2186,23 @@ int __sync_local_execstate(void)
 {
     unsigned long flags;
     int switch_required;
+    unsigned int cpu = smp_processor_id();
+    struct vcpu *p;
 
     local_irq_save(flags);
 
-    switch_required = (this_cpu(curr_vcpu) != current);
+    p = per_cpu(curr_vcpu, cpu);
+    switch_required = (p != current);
 
     if ( switch_required )
     {
-        ASSERT(current == idle_vcpu[smp_processor_id()]);
-        __context_switch();
+        ASSERT(current == idle_vcpu[cpu]);
+        /*
+         * Restore current to the previously running vCPU, __context_switch()
+         * will update current together with curr_vcpu.
+         */
+        set_current(p);
+        __context_switch(idle_vcpu[cpu]);
     }
 
     local_irq_restore(flags);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 87b30ce4df2a..487b8c5a78c5 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2232,8 +2232,6 @@ void __init trap_init(void)
 
 void activate_debugregs(const struct vcpu *curr)
 {
-    ASSERT(curr == current);
-
     write_debugreg(0, curr->arch.dr[0]);
     write_debugreg(1, curr->arch.dr[1]);
     write_debugreg(2, curr->arch.dr[2]);

From patchwork Wed Jan 8 14:26:43 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931032
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 03/18] x86/mm: introduce helper to detect per-domain L1 entries that need freeing
Date: Wed, 8 Jan 2025 15:26:43 +0100
Message-ID: <20250108142659.99490-4-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

Present L1 entries that require the underlying page to be freed have the
_PAGE_AVAIL0 bit set.  Introduce a helper to unify the checking logic in a
single place.

No functional change intended.
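For reference, the check being centralized is a two-bit test; a minimal
standalone illustration, assuming (per the commit message) that
_PAGE_AVAIL0 marks entries whose backing page was allocated by the
per-domain mapping code, so present entries without it are skipped:

    /* Needs freeing only if both present and marked via _PAGE_AVAIL0. */
    bool needs_freeing =
        (l1e_get_flags(l1e) & (_PAGE_PRESENT | _PAGE_AVAIL0)) ==
        (_PAGE_PRESENT | _PAGE_AVAIL0);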
Signed-off-by: Roger Pau Monné
Reviewed-by: Jan Beulich
---
 xen/arch/x86/mm.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index fa21903eb25a..3d5dd22b6c36 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6294,6 +6294,12 @@ void __iomem *__init ioremap_wc(paddr_t pa, size_t len)
     return (void __force __iomem *)(va + offs);
 }
 
+static bool perdomain_l1e_needs_freeing(l1_pgentry_t l1e)
+{
+    return (l1e_get_flags(l1e) & (_PAGE_PRESENT | _PAGE_AVAIL0)) ==
+           (_PAGE_PRESENT | _PAGE_AVAIL0);
+}
+
 int create_perdomain_mapping(struct domain *d, unsigned long va,
                              unsigned int nr, l1_pgentry_t **pl1tab,
                              struct page_info **ppg)
@@ -6446,9 +6452,7 @@ void destroy_perdomain_mapping(struct domain *d, unsigned long va,
 
         for ( ; nr && i < L1_PAGETABLE_ENTRIES; --nr, ++i )
         {
-            if ( (l1e_get_flags(l1tab[i]) &
-                  (_PAGE_PRESENT | _PAGE_AVAIL0)) ==
-                 (_PAGE_PRESENT | _PAGE_AVAIL0) )
+            if ( perdomain_l1e_needs_freeing(l1tab[i]) )
                 free_domheap_page(l1e_get_page(l1tab[i]));
             l1tab[i] = l1e_empty();
         }
@@ -6498,9 +6502,7 @@ void free_perdomain_mappings(struct domain *d)
             unsigned int k;
 
             for ( k = 0; k < L1_PAGETABLE_ENTRIES; ++k )
-                if ( (l1e_get_flags(l1tab[k]) &
-                      (_PAGE_PRESENT | _PAGE_AVAIL0)) ==
-                     (_PAGE_PRESENT | _PAGE_AVAIL0) )
+                if ( perdomain_l1e_needs_freeing(l1tab[k]) )
                     free_domheap_page(l1e_get_page(l1tab[k]));
 
             unmap_domain_page(l1tab);

From patchwork Wed Jan 8 14:26:44 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931037
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 04/18] x86/pv: introduce function to populate perdomain area and use it to map Xen GDT
Date: Wed, 8 Jan 2025 15:26:44 +0100
Message-ID: <20250108142659.99490-5-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

The current code to update the Xen part of the GDT when running a PV guest
relies on caching the direct map address of all the L1 tables used to map
the GDT and LDT, so that entries can be modified.

Introduce a new function that populates the per-domain region, either
using the recursive linear mappings when the target vCPU is the current
one, or by directly modifying the L1 table of the per-domain region.

Using this function to populate per-domain addresses drops the need to
keep a reference to the per-domain L1 tables previously used to change the
per-domain mappings.
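In outline, the new function takes one of two paths; a condensed sketch of
the idea follows (map_l1tab_of() is a hypothetical helper standing in for
the explicit L3/L2/L1 walk done in the real code below):

    static void set_perdomain_l1e_sketch(const struct vcpu *v,
                                         unsigned long va, mfn_t mfn)
    {
        if ( v == current )
        {
            /* Fast path: the recursive linear mapping exposes the active
             * page tables, so the L1 entry is a plain array store. */
            l1e_write(&__linear_l1_table[l1_linear_offset(va)],
                      l1e_from_mfn(mfn, __PAGE_HYPERVISOR_RW));
        }
        else
        {
            /* Slow path: temporarily map the target vCPU's per-domain L1. */
            l1_pgentry_t *l1tab = map_l1tab_of(v, va);   /* hypothetical */

            l1e_write(&l1tab[l1_table_offset(va)],
                      l1e_from_mfn(mfn, __PAGE_HYPERVISOR_RW));
            unmap_domain_page(l1tab);
        }
    }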
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/domain.c                | 11 +++-
 xen/arch/x86/include/asm/desc.h      |  6 +-
 xen/arch/x86/include/asm/mm.h        |  2 +
 xen/arch/x86/include/asm/processor.h |  5 ++
 xen/arch/x86/mm.c                    | 88 ++++++++++++++++++++++++++++
 xen/arch/x86/smpboot.c               |  6 +-
 xen/arch/x86/traps.c                 | 10 ++--
 7 files changed, 113 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 1f680bf176ee..0bd0ef7e40f4 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1953,9 +1953,14 @@ static always_inline bool need_full_gdt(const struct domain *d)
 
 static void update_xen_slot_in_full_gdt(const struct vcpu *v, unsigned int cpu)
 {
-    l1e_write(pv_gdt_ptes(v) + FIRST_RESERVED_GDT_PAGE,
-              !is_pv_32bit_vcpu(v) ? per_cpu(gdt_l1e, cpu)
-                                   : per_cpu(compat_gdt_l1e, cpu));
+    ASSERT(v != current);
+
+    populate_perdomain_mapping(v,
+                               GDT_VIRT_START(v) +
+                               (FIRST_RESERVED_GDT_PAGE << PAGE_SHIFT),
+                               !is_pv_32bit_vcpu(v) ? &per_cpu(gdt_mfn, cpu)
+                                                    : &per_cpu(compat_gdt_mfn,
+                                                               cpu), 1);
 }
 
 static void load_full_gdt(const struct vcpu *v, unsigned int cpu)
diff --git a/xen/arch/x86/include/asm/desc.h b/xen/arch/x86/include/asm/desc.h
index a1e0807d97ed..33981bfca588 100644
--- a/xen/arch/x86/include/asm/desc.h
+++ b/xen/arch/x86/include/asm/desc.h
@@ -44,6 +44,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include
+
 #define GUEST_KERNEL_RPL(d) (is_pv_32bit_domain(d) ? 1 : 3)
 
 /* Fix up the RPL of a guest segment selector. */
@@ -212,10 +214,10 @@ struct __packed desc_ptr {
 extern seg_desc_t boot_gdt[];
 DECLARE_PER_CPU(seg_desc_t *, gdt);
-DECLARE_PER_CPU(l1_pgentry_t, gdt_l1e);
+DECLARE_PER_CPU(mfn_t, gdt_mfn);
 extern seg_desc_t boot_compat_gdt[];
 DECLARE_PER_CPU(seg_desc_t *, compat_gdt);
-DECLARE_PER_CPU(l1_pgentry_t, compat_gdt_l1e);
+DECLARE_PER_CPU(mfn_t, compat_gdt_mfn);
 DECLARE_PER_CPU(bool, full_gdt_loaded);
 
 static inline void lgdt(const struct desc_ptr *gdtr)
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 6c7e66ee21ab..b50a51327b2b 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -603,6 +603,8 @@ int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg);
 int create_perdomain_mapping(struct domain *d, unsigned long va,
                              unsigned int nr, l1_pgentry_t **pl1tab,
                              struct page_info **ppg);
+void populate_perdomain_mapping(const struct vcpu *v, unsigned long va,
+                                mfn_t *mfn, unsigned long nr);
 void destroy_perdomain_mapping(struct domain *d, unsigned long va,
                                unsigned int nr);
 void free_perdomain_mappings(struct domain *d);
diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index d247ef8dd226..82ee89f736c2 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -243,6 +243,11 @@ static inline unsigned long cr3_pa(unsigned long cr3)
     return cr3 & X86_CR3_ADDR_MASK;
 }
 
+static inline mfn_t cr3_mfn(unsigned long cr3)
+{
+    return maddr_to_mfn(cr3_pa(cr3));
+}
+
 static inline unsigned int cr3_pcid(unsigned long cr3)
 {
     return IS_ENABLED(CONFIG_PV) ? cr3 & X86_CR3_PCID_MASK : 0;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 3d5dd22b6c36..0abea792486c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6423,6 +6423,94 @@ int create_perdomain_mapping(struct domain *d, unsigned long va,
     return rc;
 }
 
+void populate_perdomain_mapping(const struct vcpu *v, unsigned long va,
+                                mfn_t *mfn, unsigned long nr)
+{
+    l1_pgentry_t *l1tab = NULL, *pl1e;
+    const l3_pgentry_t *l3tab;
+    const l2_pgentry_t *l2tab;
+    struct domain *d = v->domain;
+
+    ASSERT(va >= PERDOMAIN_VIRT_START &&
+           va < PERDOMAIN_VIRT_SLOT(PERDOMAIN_SLOTS));
+    ASSERT(!nr || !l3_table_offset(va ^ (va + nr * PAGE_SIZE - 1)));
+
+    /* Use likely to force the optimization for the fast path. */
+    if ( likely(v == current) )
+    {
+        unsigned int i;
+
+        /* Ensure page-tables are from current (if current != curr_vcpu). */
+        sync_local_execstate();
+
+        /* Fast path: get L1 entries using the recursive linear mappings. */
+        pl1e = &__linear_l1_table[l1_linear_offset(va)];
+
+        for ( i = 0; i < nr; i++, pl1e++ )
+        {
+            if ( unlikely(perdomain_l1e_needs_freeing(*pl1e)) )
+            {
+                ASSERT_UNREACHABLE();
+                free_domheap_page(l1e_get_page(*pl1e));
+            }
+            l1e_write(pl1e, l1e_from_mfn(mfn[i], __PAGE_HYPERVISOR_RW));
+        }
+
+        return;
+    }
+
+    ASSERT(d->arch.perdomain_l3_pg);
+    l3tab = __map_domain_page(d->arch.perdomain_l3_pg);
+
+    if ( unlikely(!(l3e_get_flags(l3tab[l3_table_offset(va)]) &
+                    _PAGE_PRESENT)) )
+    {
+        unmap_domain_page(l3tab);
+        gprintk(XENLOG_ERR, "unable to map at VA %lx: L3e not present\n", va);
+        ASSERT_UNREACHABLE();
+        domain_crash(d);
+
+        return;
+    }
+
+    l2tab = map_l2t_from_l3e(l3tab[l3_table_offset(va)]);
+
+    for ( ; nr--; va += PAGE_SIZE, mfn++ )
+    {
+        if ( !l1tab || !l1_table_offset(va) )
+        {
+            const l2_pgentry_t *pl2e = l2tab + l2_table_offset(va);
+
+            if ( unlikely(!(l2e_get_flags(*pl2e) & _PAGE_PRESENT)) )
+            {
+                gprintk(XENLOG_ERR,
+                        "unable to map at VA %lx: L2e not present\n", va);
+                ASSERT_UNREACHABLE();
+                domain_crash(d);
+
+                break;
+            }
+
+            unmap_domain_page(l1tab);
+            l1tab = map_l1t_from_l2e(*pl2e);
+        }
+
+        pl1e = &l1tab[l1_table_offset(va)];
+
+        if ( unlikely(perdomain_l1e_needs_freeing(*pl1e)) )
+        {
+            ASSERT_UNREACHABLE();
+            free_domheap_page(l1e_get_page(*pl1e));
+        }
+
+        l1e_write(pl1e, l1e_from_mfn(*mfn, __PAGE_HYPERVISOR_RW));
+    }
+
+    unmap_domain_page(l1tab);
+    unmap_domain_page(l2tab);
+    unmap_domain_page(l3tab);
+}
+
 void destroy_perdomain_mapping(struct domain *d, unsigned long va,
                                unsigned int nr)
 {
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 79a79c54c304..a740a6402272 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -1059,8 +1059,7 @@ static int cpu_smpboot_alloc(unsigned int cpu)
     if ( gdt == NULL )
         goto out;
     per_cpu(gdt, cpu) = gdt;
-    per_cpu(gdt_l1e, cpu) =
-        l1e_from_pfn(virt_to_mfn(gdt), __PAGE_HYPERVISOR_RW);
+    per_cpu(gdt_mfn, cpu) = _mfn(virt_to_mfn(gdt));
     memcpy(gdt, boot_gdt, NR_RESERVED_GDT_PAGES * PAGE_SIZE);
     BUILD_BUG_ON(NR_CPUS > 0x10000);
     gdt[PER_CPU_GDT_ENTRY - FIRST_RESERVED_GDT_ENTRY].a = cpu;
@@ -1069,8 +1068,7 @@ static int cpu_smpboot_alloc(unsigned int cpu)
     per_cpu(compat_gdt, cpu) = gdt = alloc_xenheap_pages(0, memflags);
     if ( gdt == NULL )
         goto out;
-    per_cpu(compat_gdt_l1e, cpu) =
-        l1e_from_pfn(virt_to_mfn(gdt), __PAGE_HYPERVISOR_RW);
+    per_cpu(compat_gdt_mfn, cpu) = _mfn(virt_to_mfn(gdt));
     memcpy(gdt, boot_compat_gdt, NR_RESERVED_GDT_PAGES * PAGE_SIZE);
     gdt[PER_CPU_GDT_ENTRY - FIRST_RESERVED_GDT_ENTRY].a = cpu;
 #endif
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 487b8c5a78c5..a7f6fb611c34 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -92,10 +92,10 @@ DEFINE_PER_CPU(uint64_t, efer);
 static DEFINE_PER_CPU(unsigned long, last_extable_addr);
 
 DEFINE_PER_CPU_READ_MOSTLY(seg_desc_t *, gdt);
-DEFINE_PER_CPU_READ_MOSTLY(l1_pgentry_t, gdt_l1e);
+DEFINE_PER_CPU_READ_MOSTLY(mfn_t, gdt_mfn);
 #ifdef CONFIG_PV32
 DEFINE_PER_CPU_READ_MOSTLY(seg_desc_t *, compat_gdt);
-DEFINE_PER_CPU_READ_MOSTLY(l1_pgentry_t, compat_gdt_l1e);
+DEFINE_PER_CPU_READ_MOSTLY(mfn_t, compat_gdt_mfn);
 #endif
 
 /* Master table, used by CPU0. */
@@ -2219,11 +2219,9 @@ void __init trap_init(void)
     init_ler();
 
     /* Cache {,compat_}gdt_l1e now that physically relocation is done. */
-    this_cpu(gdt_l1e) =
-        l1e_from_pfn(virt_to_mfn(boot_gdt), __PAGE_HYPERVISOR_RW);
+    this_cpu(gdt_mfn) = _mfn(virt_to_mfn(boot_gdt));
     if ( IS_ENABLED(CONFIG_PV32) )
-        this_cpu(compat_gdt_l1e) =
-            l1e_from_pfn(virt_to_mfn(boot_compat_gdt), __PAGE_HYPERVISOR_RW);
+        this_cpu(compat_gdt_mfn) = _mfn(virt_to_mfn(boot_compat_gdt));
 
     percpu_traps_init();

From patchwork Wed Jan 8 14:26:45 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931036
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 05/18] x86/mm: switch destroy_perdomain_mapping() parameter from domain to vCPU
Date: Wed, 8 Jan 2025 15:26:45 +0100
Message-ID: <20250108142659.99490-6-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

In preparation for the per-domain area being populated with per-vCPU
mappings, change the parameter of destroy_perdomain_mapping() to be a vCPU
instead of a domain, and also update the function logic to allow
manipulation of per-domain mappings using the linear page table mappings.
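The fast path added below relies on the recursive (linear) page-table
slot: because one top-level slot maps the page tables themselves, every L1
entry of the currently loaded address space is reachable at a fixed
virtual address.  A minimal sketch of zapping one entry this way, using
only helpers that appear in this series, and valid only for the address
space currently loaded in CR3:

    l1_pgentry_t *pl1e = &__linear_l1_table[l1_linear_offset(va)];

    if ( perdomain_l1e_needs_freeing(*pl1e) )
        free_domheap_page(l1e_get_page(*pl1e));
    l1e_write(pl1e, l1e_empty());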
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/mm.h |  2 +-
 xen/arch/x86/mm.c             | 24 +++++++++++++++++++++++-
 xen/arch/x86/pv/domain.c      |  3 +--
 xen/arch/x86/x86_64/mm.c      |  2 +-
 4 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index b50a51327b2b..65cd751087dc 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -605,7 +605,7 @@ int create_perdomain_mapping(struct domain *d, unsigned long va,
                              struct page_info **ppg);
 void populate_perdomain_mapping(const struct vcpu *v, unsigned long va,
                                 mfn_t *mfn, unsigned long nr);
-void destroy_perdomain_mapping(struct domain *d, unsigned long va,
+void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va,
                                unsigned int nr);
 void free_perdomain_mappings(struct domain *d);
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 0abea792486c..713ae8dd6fa3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6511,10 +6511,11 @@ void populate_perdomain_mapping(const struct vcpu *v, unsigned long va,
     unmap_domain_page(l3tab);
 }
 
-void destroy_perdomain_mapping(struct domain *d, unsigned long va,
+void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va,
                                unsigned int nr)
 {
     const l3_pgentry_t *l3tab, *pl3e;
+    const struct domain *d = v->domain;
 
     ASSERT(va >= PERDOMAIN_VIRT_START &&
            va < PERDOMAIN_VIRT_SLOT(PERDOMAIN_SLOTS));
@@ -6523,6 +6524,27 @@ void destroy_perdomain_mapping(struct domain *d, unsigned long va,
     if ( !d->arch.perdomain_l3_pg )
         return;
 
+    /* Use likely to force the optimization for the fast path. */
+    if ( likely(v == current) )
+    {
+        l1_pgentry_t *pl1e;
+
+        /* Ensure page-tables are from current (if current != curr_vcpu). */
+        sync_local_execstate();
+
+        pl1e = &__linear_l1_table[l1_linear_offset(va)];
+
+        /* Fast path: zap L1 entries using the recursive linear mappings. */
+        for ( ; nr--; pl1e++ )
+        {
+            if ( perdomain_l1e_needs_freeing(*pl1e) )
+                free_domheap_page(l1e_get_page(*pl1e));
+            l1e_write(pl1e, l1e_empty());
+        }
+
+        return;
+    }
+
     l3tab = __map_domain_page(d->arch.perdomain_l3_pg);
     pl3e = l3tab + l3_table_offset(va);
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index bc7cd0c62f0e..7e8bffaae9a0 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -285,8 +285,7 @@ static int pv_create_gdt_ldt_l1tab(struct vcpu *v)
 
 static void pv_destroy_gdt_ldt_l1tab(struct vcpu *v)
 {
-    destroy_perdomain_mapping(v->domain, GDT_VIRT_START(v),
-                              1U << GDT_LDT_VCPU_SHIFT);
+    destroy_perdomain_mapping(v, GDT_VIRT_START(v), 1U << GDT_LDT_VCPU_SHIFT);
 }
 
 void pv_vcpu_destroy(struct vcpu *v)
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 389d813ebe63..c08b28d9693b 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -737,7 +737,7 @@ int setup_compat_arg_xlat(struct vcpu *v)
 
 void free_compat_arg_xlat(struct vcpu *v)
 {
-    destroy_perdomain_mapping(v->domain, ARG_XLAT_START(v),
+    destroy_perdomain_mapping(v, ARG_XLAT_START(v),
                               PFN_UP(COMPAT_ARG_XLAT_SIZE));
 }

From patchwork Wed Jan 8 14:26:46 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931034
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 06/18] x86/pv: set/clear guest GDT mappings using {populate,destroy}_perdomain_mapping()
Date: Wed, 8 Jan 2025 15:26:46 +0100
Message-ID: <20250108142659.99490-7-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

The pv_{set,destroy}_gdt() functions rely on the L1 table(s) that contain
such mappings being stashed in the domain structure, and thus such
mappings being modified by merely updating the L1 entries.

Switch both pv_{set,destroy}_gdt() to instead use
{populate,destroy}_perdomain_mapping().

Note that this requires moving the pv_set_gdt() call in
arch_set_info_guest() strictly after update_cr3(), so that v->arch.cr3 is
valid when populate_perdomain_mapping() is called.
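Condensed, the resulting ordering constraint in arch_set_info_guest() is
the following (a sketch with error handling omitted, per the commit
message's rationale that the per-domain mapping code needs the vCPU's CR3
set up first):

    update_cr3(v);                        /* v->arch.cr3 becomes valid here */
    rc = pv_set_gdt(v, frames, entries);  /* now safe: internally uses
                                             populate_perdomain_mapping() */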
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/domain.c               | 33 ++++++++++++++---------------
 xen/arch/x86/pv/descriptor-tables.c | 28 +++++++++++-------------
 2 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 0bd0ef7e40f4..0481164f3727 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1376,22 +1376,6 @@ int arch_set_info_guest(
     if ( rc )
         return rc;
 
-    if ( !compat )
-        rc = pv_set_gdt(v, c.nat->gdt_frames, c.nat->gdt_ents);
-#ifdef CONFIG_COMPAT
-    else
-    {
-        unsigned long gdt_frames[ARRAY_SIZE(v->arch.pv.gdt_frames)];
-
-        for ( i = 0; i < nr_gdt_frames; ++i )
-            gdt_frames[i] = c.cmp->gdt_frames[i];
-
-        rc = pv_set_gdt(v, gdt_frames, c.cmp->gdt_ents);
-    }
-#endif
-    if ( rc != 0 )
-        return rc;
-
     set_bit(_VPF_in_reset, &v->pause_flags);
 
 #ifdef CONFIG_COMPAT
@@ -1492,7 +1476,6 @@ int arch_set_info_guest(
     {
         if ( cr3_page )
             put_page(cr3_page);
-        pv_destroy_gdt(v);
         return rc;
     }
 
@@ -1508,6 +1491,22 @@ int arch_set_info_guest(
         paging_update_paging_modes(v);
     else
         update_cr3(v);
+
+    if ( !compat )
+        rc = pv_set_gdt(v, c.nat->gdt_frames, c.nat->gdt_ents);
+#ifdef CONFIG_COMPAT
+    else
+    {
+        unsigned long gdt_frames[ARRAY_SIZE(v->arch.pv.gdt_frames)];
+
+        for ( i = 0; i < nr_gdt_frames; ++i )
+            gdt_frames[i] = c.cmp->gdt_frames[i];
+
+        rc = pv_set_gdt(v, gdt_frames, c.cmp->gdt_ents);
+    }
+#endif
+    if ( rc != 0 )
+        return rc;
 #endif /* CONFIG_PV */
 
  out:
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index 02647a2c5047..5a79f022ce13 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -49,23 +49,20 @@ bool pv_destroy_ldt(struct vcpu *v)
 
 void pv_destroy_gdt(struct vcpu *v)
 {
-    l1_pgentry_t *pl1e = pv_gdt_ptes(v);
-    mfn_t zero_mfn = _mfn(virt_to_mfn(zero_page));
-    l1_pgentry_t zero_l1e = l1e_from_mfn(zero_mfn, __PAGE_HYPERVISOR_RO);
     unsigned int i;
 
     ASSERT(v == current || !vcpu_cpu_dirty(v));
 
-    v->arch.pv.gdt_ents = 0;
-    for ( i = 0; i < FIRST_RESERVED_GDT_PAGE; i++ )
-    {
-        mfn_t mfn = l1e_get_mfn(pl1e[i]);
+    if ( v->arch.cr3 )
+        destroy_perdomain_mapping(v, GDT_VIRT_START(v),
+                                  ARRAY_SIZE(v->arch.pv.gdt_frames));
 
-        if ( (l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) &&
-             !mfn_eq(mfn, zero_mfn) )
-            put_page_and_type(mfn_to_page(mfn));
+    for ( i = 0; i < ARRAY_SIZE(v->arch.pv.gdt_frames); i++)
+    {
+        if ( !v->arch.pv.gdt_frames[i] )
+            break;
 
-        l1e_write(&pl1e[i], zero_l1e);
+        put_page_and_type(mfn_to_page(_mfn(v->arch.pv.gdt_frames[i])));
         v->arch.pv.gdt_frames[i] = 0;
     }
 }
@@ -74,8 +71,8 @@ int pv_set_gdt(struct vcpu *v, const unsigned long frames[],
                unsigned int entries)
 {
     struct domain *d = v->domain;
-    l1_pgentry_t *pl1e;
     unsigned int i, nr_frames = DIV_ROUND_UP(entries, 512);
+    mfn_t mfns[ARRAY_SIZE(v->arch.pv.gdt_frames)];
 
     ASSERT(v == current || !vcpu_cpu_dirty(v));
 
@@ -90,6 +87,8 @@ int pv_set_gdt(struct vcpu *v, const unsigned long frames[],
         if ( !mfn_valid(mfn) ||
              !get_page_and_type(mfn_to_page(mfn), d, PGT_seg_desc_page) )
             goto fail;
+
+        mfns[i] = mfn;
     }
 
     /* Tear down the old GDT. */
@@ -97,12 +96,9 @@ int pv_set_gdt(struct vcpu *v, const unsigned long frames[],
 
     /* Install the new GDT. */
     v->arch.pv.gdt_ents = entries;
-    pl1e = pv_gdt_ptes(v);
     for ( i = 0; i < nr_frames; i++ )
-    {
         v->arch.pv.gdt_frames[i] = frames[i];
-        l1e_write(&pl1e[i], l1e_from_pfn(frames[i], __PAGE_HYPERVISOR_RW));
-    }
+    populate_perdomain_mapping(v, GDT_VIRT_START(v), mfns, nr_frames);
 
     return 0;
From patchwork Wed Jan 8 14:26:47 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931038
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 07/18] x86/pv: update guest LDT mappings using the linear entries
Date: Wed, 8 Jan 2025 15:26:47 +0100
Message-ID: <20250108142659.99490-8-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

The pv_map_ldt_shadow_page() and pv_destroy_ldt() functions rely on the L1
table(s) that contain such mappings being stashed in the domain structure,
and thus such mappings being modified by merely updating the required L1
entries.

Switch pv_map_ldt_shadow_page() to unconditionally use the recursive
linear mappings, as that logic is always called while the vCPU is running
on the current pCPU.  For pv_destroy_ldt() use the linear mappings if the
vCPU is the one currently running on the pCPU, otherwise use
destroy_perdomain_mapping().

Note this requires keeping an array with the pages currently mapped at the
LDT area, as that allows dropping the extra page reference taken when
removing the mappings.
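The size of the new bookkeeping array follows from the architectural LDT
limit; a worked version of the sizing comment in the hunk below (the macro
names here are illustrative only, not part of the patch):

    #define LDT_MAX_ENTRIES  8192    /* x86 architectural LDT limit */
    #define LDT_ENTRY_SIZE   8       /* bytes per segment descriptor */
    /* 8192 * 8 = 64KiB, i.e. 16 4KiB frames to track per vCPU. */
    #define LDT_NR_FRAMES    ((LDT_MAX_ENTRIES * LDT_ENTRY_SIZE) / PAGE_SIZE)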
diff --git a/xen/arch/x86/pv/descriptor-tables.c b/xen/arch/x86/pv/descriptor-tables.c
index 5a79f022ce13..95b598a4c0cf 100644
--- a/xen/arch/x86/pv/descriptor-tables.c
+++ b/xen/arch/x86/pv/descriptor-tables.c
@@ -20,28 +20,29 @@
  */
 bool pv_destroy_ldt(struct vcpu *v)
 {
-    l1_pgentry_t *pl1e;
+    const unsigned int nr_frames = ARRAY_SIZE(v->arch.pv.ldt_frames);
     unsigned int i, mappings_dropped = 0;
-    struct page_info *page;

     ASSERT(!in_irq());

     ASSERT(v == current || !vcpu_cpu_dirty(v));

-    pl1e = pv_ldt_ptes(v);
+    destroy_perdomain_mapping(v, LDT_VIRT_START(v), nr_frames);

-    for ( i = 0; i < 16; i++ )
+    for ( i = 0; i < nr_frames; i++ )
     {
-        if ( !(l1e_get_flags(pl1e[i]) & _PAGE_PRESENT) )
-            continue;
+        mfn_t mfn = v->arch.pv.ldt_frames[i];
+        struct page_info *page;

-        page = l1e_get_page(pl1e[i]);
-        l1e_write(&pl1e[i], l1e_empty());
-        mappings_dropped++;
+        if ( mfn_eq(mfn, INVALID_MFN) )
+            continue;

+        v->arch.pv.ldt_frames[i] = INVALID_MFN;
+        page = mfn_to_page(mfn);
         ASSERT_PAGE_IS_TYPE(page, PGT_seg_desc_page);
         ASSERT_PAGE_IS_DOMAIN(page, v->domain);
         put_page_and_type(page);
+        mappings_dropped++;
     }

     return mappings_dropped;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 7e8bffaae9a0..32d7488cc186 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -303,6 +303,7 @@ void pv_vcpu_destroy(struct vcpu *v)
 int pv_vcpu_initialise(struct vcpu *v)
 {
     struct domain *d = v->domain;
+    unsigned int i;
     int rc;

     ASSERT(!is_idle_domain(d));
@@ -311,6 +312,9 @@ int pv_vcpu_initialise(struct vcpu *v)
     if ( rc )
         return rc;

+    for ( i = 0; i < ARRAY_SIZE(v->arch.pv.ldt_frames); i++ )
+        v->arch.pv.ldt_frames[i] = INVALID_MFN;
+
     BUILD_BUG_ON(X86_NR_VECTORS * sizeof(*v->arch.pv.trap_ctxt) >
                  PAGE_SIZE);
     v->arch.pv.trap_ctxt = xzalloc_array(struct trap_info, X86_NR_VECTORS);
diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 187f5f6a3e8c..4853e619f2a7 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -86,7 +86,8 @@ bool pv_map_ldt_shadow_page(unsigned int offset)
         return false;
     }

-    pl1e = &pv_ldt_ptes(curr)[offset >> PAGE_SHIFT];
+    curr->arch.pv.ldt_frames[offset >> PAGE_SHIFT] = page_to_mfn(page);
+    pl1e = &__linear_l1_table[l1_linear_offset(LDT_VIRT_START(curr) + offset)];
     l1e_add_flags(gl1e, _PAGE_RW);

     l1e_write(pl1e, gl1e);
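The split described in the commit message boils down to one invariant: the
linear (recursive) page-table slots alias whatever page-tables are currently
loaded, so they may only be written from the vCPU that owns them.  A hedged
sketch of the two cases (illustrative only; `gl1e' and `nr_frames' as in the
hunks above):

    if ( v == current )
        /* Live tables are ours: edit the L1e through the linear mapping. */
        l1e_write(&__linear_l1_table[l1_linear_offset(LDT_VIRT_START(v) +
                                                      offset)],
                  gl1e);
    else
        /* Remote vCPU: tear down via the per-domain mapping helpers. */
        destroy_perdomain_mapping(v, LDT_VIRT_START(v), nr_frames);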
From patchwork Wed Jan 8 14:26:48 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931039
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 08/18] x86/pv: remove stashing of GDT/LDT L1 page-tables
Date: Wed, 8 Jan 2025 15:26:48 +0100
Message-ID: <20250108142659.99490-9-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

There are no
remaining callers of pv_gdt_ptes() or pv_ldt_ptes() that use the stashed
L1 page-tables in the domain structure.  As such, the helpers and the
fields can now be removed.

No functional change intended, as the removed logic is not used.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/domain.h |  9 ---------
 xen/arch/x86/pv/domain.c          | 10 +---------
 2 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index b659cffc7f81..fbe59baa82ec 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -271,8 +271,6 @@ struct time_scale {
 struct pv_domain
 {
-    l1_pgentry_t **gdt_ldt_l1tab;
-
     atomic_t nr_l4_pages;

     /* Is a 32-bit PV guest? */
@@ -506,13 +504,6 @@ struct arch_domain
 #define has_pirq(d)   (!!((d)->arch.emulation_flags & X86_EMU_USE_PIRQ))
 #define has_vpci(d)   (!!((d)->arch.emulation_flags & X86_EMU_VPCI))

-#define gdt_ldt_pt_idx(v) \
-    ((v)->vcpu_id >> (PAGETABLE_ORDER - GDT_LDT_VCPU_SHIFT))
-#define pv_gdt_ptes(v) \
-    ((v)->domain->arch.pv.gdt_ldt_l1tab[gdt_ldt_pt_idx(v)] + \
-     (((v)->vcpu_id << GDT_LDT_VCPU_SHIFT) & (L1_PAGETABLE_ENTRIES - 1)))
-#define pv_ldt_ptes(v) (pv_gdt_ptes(v) + 16)
-
 struct pv_vcpu
 {
     /* map_domain_page() mapping cache. */
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 32d7488cc186..dfaeeb2e2cc2 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -279,7 +279,7 @@ static int pv_create_gdt_ldt_l1tab(struct vcpu *v)
 {
     return create_perdomain_mapping(v->domain, GDT_VIRT_START(v),
                                     1U << GDT_LDT_VCPU_SHIFT,
-                                    v->domain->arch.pv.gdt_ldt_l1tab,
+                                    NIL(l1_pgentry_t *),
                                     NULL);
 }

@@ -349,8 +349,6 @@ void pv_domain_destroy(struct domain *d)
     pv_l1tf_domain_destroy(d);

     XFREE(d->arch.pv.cpuidmasks);
-
-    FREE_XENHEAP_PAGE(d->arch.pv.gdt_ldt_l1tab);
 }

 void noreturn cf_check continue_pv_domain(void);
@@ -366,12 +364,6 @@ int pv_domain_initialise(struct domain *d)

     pv_l1tf_domain_init(d);

-    d->arch.pv.gdt_ldt_l1tab =
-        alloc_xenheap_pages(0, MEMF_node(domain_to_node(d)));
-    if ( !d->arch.pv.gdt_ldt_l1tab )
-        goto fail;
-    clear_page(d->arch.pv.gdt_ldt_l1tab);
-
     if ( levelling_caps & ~LCAP_faulting &&
          (d->arch.pv.cpuidmasks = xmemdup(&cpuidmask_defaults)) == NULL )
         goto fail;
From patchwork Wed Jan 8 14:26:49 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931040
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 09/18] x86/mm: simplify create_perdomain_mapping() interface
Date: Wed, 8 Jan 2025 15:26:49 +0100
Message-ID: <20250108142659.99490-10-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

There are no longer any callers of create_perdomain_mapping() that
request a reference to the used L1 tables, and hence the only difference
between them is whether the caller wants the region to be populated, or
just the paging structures to be allocated.

Simplify the arguments to create_perdomain_mapping() to reflect the
current usage: drop the last two arguments and instead introduce a
boolean to signal whether the caller wants the region populated.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/domain_page.c    | 10 ++++----
 xen/arch/x86/hvm/hvm.c        |  2 +-
 xen/arch/x86/include/asm/mm.h |  3 +--
 xen/arch/x86/mm.c             | 43 +++++++----------------------------
 xen/arch/x86/pv/domain.c      |  4 +---
 xen/arch/x86/x86_64/mm.c      |  3 +--
 6 files changed, 16 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index eac5e3304fb8..ad6d86be6918 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -254,8 +254,7 @@ int mapcache_domain_init(struct domain *d)
     spin_lock_init(&dcache->lock);

     return create_perdomain_mapping(d, (unsigned long)dcache->inuse,
-                                    2 * bitmap_pages + 1,
-                                    NIL(l1_pgentry_t *), NULL);
+                                    2 * bitmap_pages + 1, false);
 }

 int mapcache_vcpu_init(struct vcpu *v)
@@ -272,16 +271,15 @@ int mapcache_vcpu_init(struct vcpu *v)
     if ( ents > dcache->entries )
     {
         /* Populate page tables. */
-        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents,
-                                          NIL(l1_pgentry_t *), NULL);
+        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents, false);

         /* Populate bit maps. */
         if ( !rc )
             rc = create_perdomain_mapping(d, (unsigned long)dcache->inuse,
-                                          nr, NULL, NIL(struct page_info *));
+                                          nr, true);
         if ( !rc )
             rc = create_perdomain_mapping(d, (unsigned long)dcache->garbage,
-                                          nr, NULL, NIL(struct page_info *));
+                                          nr, true);

         if ( rc )
             return rc;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 70fdddae583d..e7817144059e 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -601,7 +601,7 @@ int hvm_domain_initialise(struct domain *d,
     INIT_LIST_HEAD(&d->arch.hvm.mmcfg_regions);
     INIT_LIST_HEAD(&d->arch.hvm.msix_tables);

-    rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, NULL);
+    rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, false);
     if ( rc )
         goto fail;
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 65cd751087dc..0c57442c9593 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -601,8 +601,7 @@ int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg);
 #define IS_NIL(ptr) (!((uintptr_t)(ptr) + sizeof(*(ptr))))

 int create_perdomain_mapping(struct domain *d, unsigned long va,
-                             unsigned int nr, l1_pgentry_t **pl1tab,
-                             struct page_info **ppg);
+                             unsigned int nr, bool populate);
 void populate_perdomain_mapping(const struct vcpu *v, unsigned long va,
                                 mfn_t *mfn, unsigned long nr);
 void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va,
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 713ae8dd6fa3..45664c56cb8f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6301,8 +6301,7 @@ static bool perdomain_l1e_needs_freeing(l1_pgentry_t l1e)
 }

 int create_perdomain_mapping(struct domain *d, unsigned long va,
-                             unsigned int nr, l1_pgentry_t **pl1tab,
-                             struct page_info **ppg)
+                             unsigned int nr, bool populate)
 {
     struct page_info *pg;
     l3_pgentry_t *l3tab;
@@ -6351,55 +6350,32 @@ int create_perdomain_mapping(struct domain *d, unsigned long va,

     unmap_domain_page(l3tab);

-    if ( !pl1tab && !ppg )
-    {
-        unmap_domain_page(l2tab);
-        return 0;
-    }
-
     for ( l1tab = NULL; !rc && nr--; )
     {
         l2_pgentry_t *pl2e = l2tab + l2_table_offset(va);

         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
         {
-            if ( pl1tab && !IS_NIL(pl1tab) )
-            {
-                l1tab = alloc_xenheap_pages(0, MEMF_node(domain_to_node(d)));
-                if ( !l1tab )
-                {
-                    rc = -ENOMEM;
-                    break;
-                }
-                ASSERT(!pl1tab[l2_table_offset(va)]);
-                pl1tab[l2_table_offset(va)] = l1tab;
-                pg = virt_to_page(l1tab);
-            }
-            else
+            pg = alloc_domheap_page(d, MEMF_no_owner);
+            if ( !pg )
             {
-                pg = alloc_domheap_page(d, MEMF_no_owner);
-                if ( !pg )
-                {
-                    rc = -ENOMEM;
-                    break;
-                }
-                l1tab = __map_domain_page(pg);
+                rc = -ENOMEM;
+                break;
             }
+            l1tab = __map_domain_page(pg);
             clear_page(l1tab);
             *pl2e = l2e_from_page(pg, __PAGE_HYPERVISOR_RW);
         }
         else if ( !l1tab )
             l1tab = map_l1t_from_l2e(*pl2e);

-        if ( ppg &&
+        if ( populate &&
             !(l1e_get_flags(l1tab[l1_table_offset(va)]) & _PAGE_PRESENT) )
         {
             pg = alloc_domheap_page(d, MEMF_no_owner);
             if ( pg )
             {
                 clear_domain_page(page_to_mfn(pg));
-                if ( !IS_NIL(ppg) )
-                    *ppg++ = pg;
                 l1tab[l1_table_offset(va)] =
                     l1e_from_page(pg, __PAGE_HYPERVISOR_RW | _PAGE_AVAIL0);
                 l2e_add_flags(*pl2e, _PAGE_AVAIL0);
@@ -6618,10 +6594,7 @@ void free_perdomain_mappings(struct domain *d)
                 unmap_domain_page(l1tab);
             }

-            if ( is_xen_heap_page(l1pg) )
-                free_xenheap_page(page_to_virt(l1pg));
-            else
-                free_domheap_page(l1pg);
+            free_domheap_page(l1pg);
         }

         unmap_domain_page(l2tab);
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index dfaeeb2e2cc2..ca32e7b5d686 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -278,9 +278,7 @@ int switch_compat(struct domain *d)
 static int pv_create_gdt_ldt_l1tab(struct vcpu *v)
 {
     return create_perdomain_mapping(v->domain, GDT_VIRT_START(v),
-                                    1U << GDT_LDT_VCPU_SHIFT,
-                                    NIL(l1_pgentry_t *),
-                                    NULL);
+                                    1U << GDT_LDT_VCPU_SHIFT, false);
 }

diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index c08b28d9693b..55bba7e473ae 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -731,8 +731,7 @@ void __init zap_low_mappings(void)
 int setup_compat_arg_xlat(struct vcpu *v)
 {
     return create_perdomain_mapping(v->domain, ARG_XLAT_START(v),
-                                    PFN_UP(COMPAT_ARG_XLAT_SIZE),
-                                    NULL, NIL(struct page_info *));
+                                    PFN_UP(COMPAT_ARG_XLAT_SIZE), true);
 }

 void free_compat_arg_xlat(struct vcpu *v)
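After this simplification a caller makes exactly one choice: allocate the
paging structures only, or also back the range with zeroed pages.  Both
patterns, mirroring the converted callers above (a sketch, error handling
trimmed):

    /* Structures only; entries are filled in later via
     * populate_perdomain_mapping(). */
    rc = create_perdomain_mapping(d, GDT_VIRT_START(v),
                                  1U << GDT_LDT_VCPU_SHIFT, false);

    /* Structures plus zeroed domheap pages backing the whole range. */
    if ( !rc )
        rc = create_perdomain_mapping(d, ARG_XLAT_START(v),
                                      PFN_UP(COMPAT_ARG_XLAT_SIZE), true);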
From patchwork Wed Jan 8 14:26:50 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931042
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 10/18] x86/mm: switch {create,destroy}_perdomain_mapping() domain parameter to vCPU
Date: Wed, 8 Jan 2025 15:26:50 +0100
Message-ID: <20250108142659.99490-11-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

In preparation for the per-domain area being per-vCPU.
This requires moving some of the {create,destroy}_perdomain_mapping()
calls from the domain initialization and teardown paths into their vCPU
counterparts.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/domain.c             | 12 ++++++++----
 xen/arch/x86/domain_page.c        | 13 +++++--------
 xen/arch/x86/hvm/hvm.c            |  5 -----
 xen/arch/x86/include/asm/domain.h |  2 +-
 xen/arch/x86/include/asm/mm.h     |  4 ++--
 xen/arch/x86/mm.c                 |  6 ++++--
 xen/arch/x86/pv/domain.c          |  2 +-
 xen/arch/x86/x86_64/mm.c          |  2 +-
 8 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 0481164f3727..6e1f622f7385 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -559,6 +559,10 @@ int arch_vcpu_create(struct vcpu *v)

     v->arch.flags = TF_kernel_mode;

+    rc = create_perdomain_mapping(v, PERDOMAIN_VIRT_START, 0, false);
+    if ( rc )
+        return rc;
+
     rc = mapcache_vcpu_init(v);
     if ( rc )
         return rc;
@@ -607,6 +611,7 @@ int arch_vcpu_create(struct vcpu *v)
     return rc;

  fail:
+    free_perdomain_mappings(v);
     paging_vcpu_teardown(v);
     vcpu_destroy_fpu(v);
     xfree(v->arch.msrs);
@@ -629,6 +634,8 @@ void arch_vcpu_destroy(struct vcpu *v)
         hvm_vcpu_destroy(v);
     else
         pv_vcpu_destroy(v);
+
+    free_perdomain_mappings(v);
 }

 int arch_sanitise_domain_config(struct xen_domctl_createdomain *config)
@@ -870,8 +877,7 @@ int arch_domain_create(struct domain *d,
     }
     else if ( is_pv_domain(d) )
     {
-        if ( (rc = mapcache_domain_init(d)) != 0 )
-            goto fail;
+        mapcache_domain_init(d);

         if ( (rc = pv_domain_initialise(d)) != 0 )
             goto fail;
@@ -909,7 +915,6 @@ int arch_domain_create(struct domain *d,
     XFREE(d->arch.cpu_policy);
     if ( paging_initialised )
         paging_final_teardown(d);
-    free_perdomain_mappings(d);

     return rc;
 }
@@ -935,7 +940,6 @@ void arch_domain_destroy(struct domain *d)
     if ( is_pv_domain(d) )
         pv_domain_destroy(d);

-    free_perdomain_mappings(d);
     free_xenheap_page(d->shared_info);
     cleanup_domain_irq_mapping(d);
diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index ad6d86be6918..1372be20224e 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -231,7 +231,7 @@ void unmap_domain_page(const void *ptr)
     local_irq_restore(flags);
 }

-int mapcache_domain_init(struct domain *d)
+void mapcache_domain_init(struct domain *d)
 {
     struct mapcache_domain *dcache = &d->arch.pv.mapcache;
     unsigned int bitmap_pages;
@@ -240,7 +240,7 @@ int mapcache_domain_init(struct domain *d)
 #ifdef NDEBUG
     if ( !mem_hotplug && max_page <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
-        return 0;
+        return;
 #endif

     BUILD_BUG_ON(MAPCACHE_VIRT_END + PAGE_SIZE * (3 +
@@ -252,9 +252,6 @@ int mapcache_domain_init(struct domain *d)
                  (bitmap_pages + 1) * PAGE_SIZE / sizeof(long);

     spin_lock_init(&dcache->lock);
-
-    return create_perdomain_mapping(d, (unsigned long)dcache->inuse,
-                                    2 * bitmap_pages + 1, false);
 }

 int mapcache_vcpu_init(struct vcpu *v)
@@ -271,14 +268,14 @@ int mapcache_vcpu_init(struct vcpu *v)
     if ( ents > dcache->entries )
     {
         /* Populate page tables. */
-        int rc = create_perdomain_mapping(d, MAPCACHE_VIRT_START, ents, false);
+        int rc = create_perdomain_mapping(v, MAPCACHE_VIRT_START, ents, false);

         /* Populate bit maps. */
         if ( !rc )
-            rc = create_perdomain_mapping(d, (unsigned long)dcache->inuse,
+            rc = create_perdomain_mapping(v, (unsigned long)dcache->inuse,
                                           nr, true);
         if ( !rc )
-            rc = create_perdomain_mapping(d, (unsigned long)dcache->garbage,
+            rc = create_perdomain_mapping(v, (unsigned long)dcache->garbage,
                                           nr, true);

         if ( rc )
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index e7817144059e..0dc693818349 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -601,10 +601,6 @@ int hvm_domain_initialise(struct domain *d,
     INIT_LIST_HEAD(&d->arch.hvm.mmcfg_regions);
     INIT_LIST_HEAD(&d->arch.hvm.msix_tables);

-    rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, false);
-    if ( rc )
-        goto fail;
-
     hvm_init_cacheattr_region_list(d);

     rc = paging_enable(d, PG_refcounts|PG_translate|PG_external);
@@ -708,7 +704,6 @@ int hvm_domain_initialise(struct domain *d,
     XFREE(d->arch.hvm.irq);
  fail0:
     hvm_destroy_cacheattr_region_list(d);
- fail:
     hvm_domain_relinquish_resources(d);
     XFREE(d->arch.hvm.io_handler);
     XFREE(d->arch.hvm.pl_time);
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index fbe59baa82ec..7c143d2a6c46 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -73,7 +73,7 @@ struct mapcache_domain {
     unsigned long *garbage;
 };

-int mapcache_domain_init(struct domain *d);
+void mapcache_domain_init(struct domain *d);
 int mapcache_vcpu_init(struct vcpu *v);
 void mapcache_override_current(struct vcpu *v);
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index 0c57442c9593..f501e5e115ff 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -600,13 +600,13 @@ int compat_arch_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg);
 #define NIL(type) ((type *)-sizeof(type))
 #define IS_NIL(ptr) (!((uintptr_t)(ptr) + sizeof(*(ptr))))

-int create_perdomain_mapping(struct domain *d, unsigned long va,
+int create_perdomain_mapping(struct vcpu *v, unsigned long va,
                              unsigned int nr, bool populate);
 void populate_perdomain_mapping(const struct vcpu *v, unsigned long va,
                                 mfn_t *mfn, unsigned long nr);
 void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va,
                                unsigned int nr);
-void free_perdomain_mappings(struct domain *d);
+void free_perdomain_mappings(struct vcpu *v);

 void __iomem *ioremap_wc(paddr_t pa, size_t len);
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 45664c56cb8f..c321f5723b04 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6300,9 +6300,10 @@ static bool perdomain_l1e_needs_freeing(l1_pgentry_t l1e)
            (_PAGE_PRESENT | _PAGE_AVAIL0);
 }

-int create_perdomain_mapping(struct domain *d, unsigned long va,
+int create_perdomain_mapping(struct vcpu *v, unsigned long va,
                              unsigned int nr, bool populate)
 {
+    struct domain *d = v->domain;
     struct page_info *pg;
     l3_pgentry_t *l3tab;
     l2_pgentry_t *l2tab;
@@ -6560,8 +6561,9 @@ void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va,
     unmap_domain_page(l3tab);
 }

-void free_perdomain_mappings(struct domain *d)
+void free_perdomain_mappings(struct vcpu *v)
 {
+    struct domain *d = v->domain;
     l3_pgentry_t *l3tab;
     unsigned int i;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index ca32e7b5d686..534d2899100f 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -277,7 +277,7 @@ int switch_compat(struct domain *d)
 static int pv_create_gdt_ldt_l1tab(struct vcpu *v)
 {
-    return create_perdomain_mapping(v->domain, GDT_VIRT_START(v),
+    return create_perdomain_mapping(v, GDT_VIRT_START(v),
                                     1U << GDT_LDT_VCPU_SHIFT, false);
 }
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 55bba7e473ae..3b421d218e0b 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -730,7 +730,7 @@ void __init zap_low_mappings(void)
 int setup_compat_arg_xlat(struct vcpu *v)
 {
-    return create_perdomain_mapping(v->domain, ARG_XLAT_START(v),
+    return create_perdomain_mapping(v, ARG_XLAT_START(v),
                                     PFN_UP(COMPAT_ARG_XLAT_SIZE), true);
 }
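Note the ordering that falls out of the vCPU-scoped interface: the per-vCPU
paging structures must exist before anything else maps through them, and the
failure path must free them again.  A condensed sketch of the
arch_vcpu_create() flow after this patch (labels simplified, not a verbatim
excerpt):

    rc = create_perdomain_mapping(v, PERDOMAIN_VIRT_START, 0, false);
    if ( rc )
        return rc;                 /* nothing to unwind yet */

    rc = mapcache_vcpu_init(v);    /* maps through the area just set up */
    if ( rc )
        return rc;

    ...
 fail:
    free_perdomain_mappings(v);    /* unwind the per-vCPU structures */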
From patchwork Wed Jan 8 14:26:51 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931041
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 11/18] x86/pv: untie issuing FLUSH_ROOT_PGTBL from XPTI
Date: Wed, 8 Jan 2025 15:26:51 +0100
Message-ID: <20250108142659.99490-12-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

The current logic gates issuing TLB flush requests with the
FLUSH_ROOT_PGTBL flag on XPTI being enabled.

In preparation for FLUSH_ROOT_PGTBL also being needed when not using
XPTI, untie it from the xpti domain boolean and instead introduce a new
flush_root_pt field.

No functional change intended, as flush_root_pt == xpti.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/domain.h   | 2 ++
 xen/arch/x86/include/asm/flushtlb.h | 2 +-
 xen/arch/x86/mm.c                   | 2 +-
 xen/arch/x86/pv/domain.c            | 2 ++
 4 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index 7c143d2a6c46..5af414fa64ac 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -281,6 +281,8 @@ struct pv_domain
     bool pcid;
     /* Mitigate L1TF with shadow/crashing? */
     bool check_l1tf;
+    /* Issue FLUSH_ROOT_PGTBL for root page-table changes. */
+    bool flush_root_pt;

     /* map_domain_page() mapping cache. */
     struct mapcache_domain mapcache;
diff --git a/xen/arch/x86/include/asm/flushtlb.h b/xen/arch/x86/include/asm/flushtlb.h
index bb0ad58db49b..1b98d03decdc 100644
--- a/xen/arch/x86/include/asm/flushtlb.h
+++ b/xen/arch/x86/include/asm/flushtlb.h
@@ -177,7 +177,7 @@ void flush_area_mask(const cpumask_t *mask, const void *va,

 #define flush_root_pgtbl_domain(d)                                       \
 {                                                                        \
-    if ( is_pv_domain(d) && (d)->arch.pv.xpti )                          \
+    if ( is_pv_domain(d) && (d)->arch.pv.flush_root_pt )                 \
         flush_mask((d)->dirty_cpumask, FLUSH_ROOT_PGTBL);                \
 }
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index c321f5723b04..49403196d56e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4178,7 +4178,7 @@ long do_mmu_update(
                                       cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
                 if ( !rc )
                     flush_linear_pt = true;
-                if ( !rc && pt_owner->arch.pv.xpti )
+                if ( !rc && pt_owner->arch.pv.flush_root_pt )
                 {
                     bool local_in_use = false;
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 534d2899100f..5bda168eadff 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -368,6 +368,8 @@ int pv_domain_initialise(struct domain *d)

     d->arch.ctxt_switch = &pv_csw;

+    d->arch.pv.flush_root_pt = d->arch.pv.xpti;
+
     if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid )
         switch ( ACCESS_ONCE(opt_pcid) )
         {
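The point of the indirection is that flush_root_pt can later be set for
reasons other than XPTI.  As a forward-looking illustration only (the second
term of the condition is an assumption about later patches in this series,
not code introduced here), a domain using per-vCPU page-tables would want:

    /* Root PT flushes are needed by XPTI and, later, by ASI domains. */
    d->arch.pv.flush_root_pt = d->arch.pv.xpti || d->arch.vcpu_pt;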
From patchwork Wed Jan 8 14:26:52 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931119
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 12/18] x86/mm: move FLUSH_ROOT_PGTBL handling before TLB flush
Date: Wed, 8 Jan 2025 15:26:52 +0100
Message-ID: <20250108142659.99490-13-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

Move the handling of FLUSH_ROOT_PGTBL in flush_area_local() ahead of the
logic that does the TLB flushing, in preparation for further changes
requiring the TLB flush to be done strictly after FLUSH_ROOT_PGTBL has
been handled.

No functional change intended.
Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/flushtlb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 65be0474a8ea..a64c28f854ea 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -191,6 +191,9 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
 {
     unsigned int order = (flags - 1) & FLUSH_ORDER_MASK;

+    if ( flags & FLUSH_ROOT_PGTBL )
+        get_cpu_info()->root_pgt_changed = true;
+
     if ( flags & (FLUSH_TLB|FLUSH_TLB_GLOBAL) )
     {
         if ( order == 0 )
@@ -254,9 +257,6 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
         }
     }

-    if ( flags & FLUSH_ROOT_PGTBL )
-        get_cpu_info()->root_pgt_changed = true;
-
     return flags;
 }
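The reason the marker has to be set first: once per-vCPU root page-tables are
in use, the flush path itself may need to re-sync root page-table state, so
root_pgt_changed must already be visible when the TLB flush runs.  A
schematic of the resulting order (illustrative; the flush call merely stands
in for the logic elided above):

    if ( flags & FLUSH_ROOT_PGTBL )
        /* Mark first: consumed when (re)loading the root page-table. */
        get_cpu_info()->root_pgt_changed = true;

    if ( flags & (FLUSH_TLB | FLUSH_TLB_GLOBAL) )
        /* ... TLB flushing, now guaranteed to observe the mark ... */
        do_tlb_flush();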
From patchwork Wed Jan 8 14:26:53 2025
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 13931078
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Andrew Cooper, Anthony PERARD, Michal Orzel,
    Jan Beulich, Julien Grall, Stefano Stabellini
Subject: [PATCH v2 13/18] x86/spec-ctrl: introduce Address Space Isolation command line option
Date: Wed, 8 Jan 2025 15:26:53 +0100
Message-ID: <20250108142659.99490-14-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

No functional change, as the option is not yet used.  It is introduced
now so that newly added functionality can be keyed on the option being
enabled, even while the feature is non-functional.

When ASI is enabled for PV domains, printing the XPTI status may be
omitted, as XPTI is uniformly disabled when ASI is in use.

Signed-off-by: Roger Pau Monné
---
Changes since v1:
 - Improve comments and documentation about what ASI provides.
 - Do not print the XPTI information if ASI is used for pv domUs and
   dom0 is PVH, or if ASI is used for both domU and dom0.

FWIW, I would print the state of XPTI uniformly, as otherwise the output
might be confusing for users expecting to check the state of XPTI.
---
 docs/misc/xen-command-line.pandoc    |  19 +++++
 xen/arch/x86/include/asm/domain.h    |   3 +
 xen/arch/x86/include/asm/spec_ctrl.h |   2 +
 xen/arch/x86/spec_ctrl.c             | 115 +++++++++++++++++++++++++--
 4 files changed, 133 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 08b0053f9ced..3c1ad7b5fe7d 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -202,6 +202,25 @@ to appropriate auditing by Xen.  Argo is disabled by default.
 This option is disabled by default, to protect domains from a DoS by a
 buggy or malicious other domain spamming the ring.
+### asi (x86)
+> `= List of [ <boolean>, {pv,hvm}=<boolean>,
+>             {vcpu-pt}=<boolean>|{pv,hvm}=<boolean> ]`
+
+Offers control over whether the hypervisor will engage in Address Space
+Isolation, by not having potentially sensitive information permanently
+mapped in the VMM page-tables.  Using this option might avoid the need to
+apply mitigations for certain speculation-related attacks, at the cost of
+mapping sensitive information on-demand.
+
+* `pv=` and `hvm=` sub-options allow enabling it for specific guest types.
+
+**WARNING: manual de-selection of enabled options will invalidate any
+protection offered by the feature.  The fine-grained options provided below
+are meant to be used for debugging purposes only.**
+
+* `vcpu-pt` ensures each vCPU uses a unique top-level page-table, and sets
+  up a virtual address space region to map memory on a per-vCPU basis.
+
 ### asid (x86)
 > `= <boolean>`
diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index 5af414fa64ac..fb92a10bf3b7 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -456,6 +456,9 @@ struct arch_domain
     /* Don't unconditionally inject #GP for unhandled MSRs. */
     bool msr_relaxed;

+    /* Use a per-vCPU root pt, and switch per-domain slot to per-vCPU. */
+    bool vcpu_pt;
+
     /* Emulated devices enabled bitmap. */
     uint32_t emulation_flags;
 } __cacheline_aligned;
diff --git a/xen/arch/x86/include/asm/spec_ctrl.h b/xen/arch/x86/include/asm/spec_ctrl.h
index 077225418956..c58afbaab671 100644
--- a/xen/arch/x86/include/asm/spec_ctrl.h
+++ b/xen/arch/x86/include/asm/spec_ctrl.h
@@ -88,6 +88,8 @@ extern uint8_t default_scf;

 extern int8_t opt_xpti_hwdom, opt_xpti_domu;

+extern int8_t opt_vcpu_pt_pv, opt_vcpu_pt_hwdom, opt_vcpu_pt_hvm;
+
 extern bool cpu_has_bug_l1tf;
 extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu;
 extern bool opt_bp_spec_reduce;
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index ced84750015c..9463a8624701 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -85,6 +85,11 @@ static int8_t __initdata opt_gds_mit = -1;
 static int8_t __initdata opt_div_scrub = -1;
 bool __ro_after_init opt_bp_spec_reduce = true;

+/* Use a per-vCPU root page-table and switch the per-domain slot to per-vCPU. */
+int8_t __ro_after_init opt_vcpu_pt_hvm = -1;
+int8_t __ro_after_init opt_vcpu_pt_hwdom = -1;
+int8_t __ro_after_init opt_vcpu_pt_pv = -1;
+
 static int __init cf_check parse_spec_ctrl(const char *s)
 {
     const char *ss;
@@ -384,6 +389,13 @@ int8_t __ro_after_init opt_xpti_domu = -1;

 static __init void xpti_init_default(void)
 {
+    ASSERT(opt_vcpu_pt_pv >= 0 && opt_vcpu_pt_hwdom >= 0);
+    if ( (opt_xpti_hwdom == 1 || opt_xpti_domu == 1) && opt_vcpu_pt_pv == 1 )
+    {
+        printk(XENLOG_ERR
+               "XPTI incompatible with per-vCPU page-tables, disabling ASI\n");
+        opt_vcpu_pt_pv = 0;
+    }
     if ( (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) ||
          cpu_has_rdcl_no )
     {
@@ -395,9 +407,9 @@ static __init void xpti_init_default(void)
     else
     {
         if ( opt_xpti_hwdom < 0 )
-            opt_xpti_hwdom = 1;
+            opt_xpti_hwdom = !opt_vcpu_pt_hwdom;
         if ( opt_xpti_domu < 0 )
-            opt_xpti_domu = 1;
+            opt_xpti_domu = !opt_vcpu_pt_pv;
     }
 }
@@ -488,6 +500,66 @@ static int __init cf_check parse_pv_l1tf(const char *s)
 }
 custom_param("pv-l1tf", parse_pv_l1tf);

+static int __init cf_check parse_asi(const char *s)
+{
+    const char *ss;
+    int val, rc = 0;
+
+    /* Interpret 'asi' alone in its positive boolean form. */
+    if ( *s == '\0' )
+        opt_vcpu_pt_pv = opt_vcpu_pt_hwdom = opt_vcpu_pt_hvm = 1;
+
+    do {
+        ss = strchr(s, ',');
+        if ( !ss )
+            ss = strchr(s, '\0');
+
+        val = parse_bool(s, ss);
+        switch ( val )
+        {
+        case 0:
+        case 1:
+            opt_vcpu_pt_pv = opt_vcpu_pt_hwdom = opt_vcpu_pt_hvm = val;
+            break;
+
+        default:
+            if ( (val = parse_boolean("pv", s, ss)) >= 0 )
+                opt_vcpu_pt_pv = val;
+            else if ( (val = parse_boolean("hvm", s, ss)) >= 0 )
+                opt_vcpu_pt_hvm = val;
+            else if ( (val = parse_boolean("vcpu-pt", s, ss)) != -1 )
+            {
+                switch ( val )
+                {
+                case 1:
+                case 0:
+                    opt_vcpu_pt_pv = opt_vcpu_pt_hvm = opt_vcpu_pt_hwdom = val;
+                    break;
+
+                case -2:
+                    s += strlen("vcpu-pt=");
+                    if ( (val = parse_boolean("pv", s, ss)) >= 0 )
+                        opt_vcpu_pt_pv = val;
+                    else if ( (val = parse_boolean("hvm", s, ss)) >= 0 )
+                        opt_vcpu_pt_hvm = val;
+                    else
+                default:
+                        rc = -EINVAL;
+                    break;
+                }
+            }
+            else if ( *s )
+                rc = -EINVAL;
+            break;
+        }
+
+        s = ss + 1;
+    } while ( *ss );
+
+    return rc;
+}
+custom_param("asi", parse_asi);
+
 static void __init print_details(enum ind_thunk thunk)
 {
     unsigned int _7d0 = 0, _7d2 = 0, e8b = 0, e21a = 0, max = 0, tmp;
@@ -668,15 +740,29 @@ static void __init print_details(enum ind_thunk thunk)
            boot_cpu_has(X86_FEATURE_IBPB_ENTRY_PV)   ? " IBPB-entry" : "",
            opt_bhb_entry_pv                          ? " BHB-entry"  : "");

-    printk("  XPTI (64-bit PV only): Dom0 %s, DomU %s (with%s PCID)\n",
-           opt_xpti_hwdom ? "enabled" : "disabled",
-           opt_xpti_domu ? "enabled" : "disabled",
-           xpti_pcid_enabled() ? "" : "out");
+    if ( !opt_vcpu_pt_pv || (!opt_dom0_pvh && !opt_vcpu_pt_hwdom) )
+        printk("  XPTI (64-bit PV only): Dom0 %s, DomU %s (with%s PCID)\n",
+               opt_xpti_hwdom ? "enabled" : "disabled",
+               opt_xpti_domu ? "enabled" : "disabled",
+               xpti_pcid_enabled() ? "" : "out");

     printk("  PV L1TF shadowing: Dom0 %s, DomU %s\n",
            opt_pv_l1tf_hwdom ? "enabled"  : "disabled",
            opt_pv_l1tf_domu  ? "enabled"  : "disabled");
 #endif
+
+#ifdef CONFIG_HVM
+    printk("  ASI features for HVM VMs:%s%s\n",
+           opt_vcpu_pt_hvm ? "" : " None",
+           opt_vcpu_pt_hvm ? " vCPU-PT" : "");
+
+#endif
+#ifdef CONFIG_PV
+    printk("  ASI features for PV VMs:%s%s\n",
+           opt_vcpu_pt_pv ? "" : " None",
+           opt_vcpu_pt_pv ? " vCPU-PT" : "");
+
+#endif
 }

 static bool __init check_smt_enabled(void)
@@ -1779,6 +1865,10 @@ void spec_ctrl_init_domain(struct domain *d)
     if ( pv )
         d->arch.pv.xpti = is_hardware_domain(d) ? opt_xpti_hwdom
                                                 : opt_xpti_domu;
+
+    d->arch.vcpu_pt = is_hardware_domain(d) ? opt_vcpu_pt_hwdom
+                      : pv                  ? opt_vcpu_pt_pv
+                                            : opt_vcpu_pt_hvm;
 }

 void __init init_speculation_mitigations(void)
@@ -2075,6 +2165,19 @@ void __init init_speculation_mitigations(void)
          hw_smt_enabled && default_xen_spec_ctrl )
         setup_force_cpu_cap(X86_FEATURE_SC_MSR_IDLE);

+    /* Disable all ASI options by default until feature is finished. */
 */
+    if ( opt_vcpu_pt_pv == -1 )
+        opt_vcpu_pt_pv = 0;
+    if ( opt_vcpu_pt_hwdom == -1 )
+        opt_vcpu_pt_hwdom = 0;
+    if ( opt_vcpu_pt_hvm == -1 )
+        opt_vcpu_pt_hvm = 0;
+
+    if ( opt_vcpu_pt_pv || opt_vcpu_pt_hvm )
+        warning_add(
+            "Address Space Isolation is not functional, this option is\n"
+            "intended to be used only for development purposes.\n");
+
     xpti_init_default();
 
     l1tf_calculations();

From patchwork Wed Jan 8 14:26:54 2025
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper, Tim Deegan
Subject: [PATCH v2 14/18] x86/mm: introduce per-vCPU L3 page-table
Date: Wed, 8 Jan 2025 15:26:54 +0100
Message-ID: <20250108142659.99490-15-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

Such table is to be used in the per-domain slot when running with Address
Space Isolation enabled for the domain.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/include/asm/domain.h |  3 +++
 xen/arch/x86/include/asm/mm.h     |  2 +-
 xen/arch/x86/mm.c                 | 45 ++++++++++++++++++++++---------
 xen/arch/x86/mm/hap/hap.c         |  2 +-
 xen/arch/x86/mm/shadow/hvm.c      |  2 +-
 xen/arch/x86/mm/shadow/multi.c    |  2 +-
 xen/arch/x86/pv/dom0_build.c      |  2 +-
 xen/arch/x86/pv/domain.c          |  2 +-
 8 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h
index fb92a10bf3b7..5bf0ad3fdcf7 100644
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -666,6 +666,9 @@ struct arch_vcpu
 
     struct vcpu_msrs *msrs;
 
+    /* ASI: per-vCPU L3 table to use in the L4 per-domain slot. */
+    struct page_info *pervcpu_l3_pg;
+
     struct {
         bool next_interrupt_enabled;
     } monitor;
diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h
index f501e5e115ff..f79d1594fde4 100644
--- a/xen/arch/x86/include/asm/mm.h
+++ b/xen/arch/x86/include/asm/mm.h
@@ -375,7 +375,7 @@ int devalidate_page(struct page_info *page, unsigned long type,
 
 void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d);
 void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn,
-                       const struct domain *d, mfn_t sl4mfn, bool ro_mpt);
+                       const struct vcpu *v, mfn_t sl4mfn, bool ro_mpt);
 bool fill_ro_mpt(mfn_t mfn);
 void zap_ro_mpt(mfn_t mfn);
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 49403196d56e..583bf4c58bf9 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1658,8 +1658,9 @@ static int promote_l3_table(struct page_info *page)
  * extended directmap.
  */
 void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn,
-                       const struct domain *d, mfn_t sl4mfn, bool ro_mpt)
+                       const struct vcpu *v, mfn_t sl4mfn, bool ro_mpt)
 {
+    const struct domain *d = v->domain;
     /*
      * PV vcpus need a shortened directmap.
HVM and Idle vcpus get the full * directmap. @@ -1687,7 +1688,9 @@ void init_xen_l4_slots(l4_pgentry_t *l4t, mfn_t l4mfn, /* Slot 260: Per-domain mappings. */ l4t[l4_table_offset(PERDOMAIN_VIRT_START)] = - l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW); + l4e_from_page(d->arch.vcpu_pt ? v->arch.pervcpu_l3_pg + : d->arch.perdomain_l3_pg, + __PAGE_HYPERVISOR_RW); /* Slot 4: Per-domain mappings mirror. */ BUILD_BUG_ON(IS_ENABLED(CONFIG_PV32) && @@ -1842,8 +1845,15 @@ static int promote_l4_table(struct page_info *page) if ( !rc ) { + /* + * Use vCPU#0 unconditionally. When not running with ASI enabled the + * per-domain table is shared between all vCPUs, so it doesn't matter + * which vCPU gets passed to init_xen_l4_slots(). When running with + * ASI enabled this L4 will not be used, as a shadow per-vCPU L4 is + * used instead. + */ init_xen_l4_slots(pl4e, l4mfn, - d, INVALID_MFN, VM_ASSIST(d, m2p_strict)); + d->vcpu[0], INVALID_MFN, VM_ASSIST(d, m2p_strict)); atomic_inc(&d->arch.pv.nr_l4_pages); } unmap_domain_page(pl4e); @@ -6313,14 +6323,17 @@ int create_perdomain_mapping(struct vcpu *v, unsigned long va, ASSERT(va >= PERDOMAIN_VIRT_START && va < PERDOMAIN_VIRT_SLOT(PERDOMAIN_SLOTS)); - if ( !d->arch.perdomain_l3_pg ) + if ( !v->arch.pervcpu_l3_pg && !d->arch.perdomain_l3_pg ) { pg = alloc_domheap_page(d, MEMF_no_owner); if ( !pg ) return -ENOMEM; l3tab = __map_domain_page(pg); clear_page(l3tab); - d->arch.perdomain_l3_pg = pg; + if ( d->arch.vcpu_pt ) + v->arch.pervcpu_l3_pg = pg; + else + d->arch.perdomain_l3_pg = pg; if ( !nr ) { unmap_domain_page(l3tab); @@ -6330,7 +6343,8 @@ int create_perdomain_mapping(struct vcpu *v, unsigned long va, else if ( !nr ) return 0; else - l3tab = __map_domain_page(d->arch.perdomain_l3_pg); + l3tab = __map_domain_page(d->arch.vcpu_pt ? v->arch.pervcpu_l3_pg + : d->arch.perdomain_l3_pg); ASSERT(!l3_table_offset(va ^ (va + nr * PAGE_SIZE - 1))); @@ -6436,8 +6450,9 @@ void populate_perdomain_mapping(const struct vcpu *v, unsigned long va, return; } - ASSERT(d->arch.perdomain_l3_pg); - l3tab = __map_domain_page(d->arch.perdomain_l3_pg); + ASSERT(d->arch.perdomain_l3_pg || v->arch.pervcpu_l3_pg); + l3tab = __map_domain_page(d->arch.vcpu_pt ? v->arch.pervcpu_l3_pg + : d->arch.perdomain_l3_pg); if ( unlikely(!(l3e_get_flags(l3tab[l3_table_offset(va)]) & _PAGE_PRESENT)) ) @@ -6498,7 +6513,7 @@ void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va, va < PERDOMAIN_VIRT_SLOT(PERDOMAIN_SLOTS)); ASSERT(!nr || !l3_table_offset(va ^ (va + nr * PAGE_SIZE - 1))); - if ( !d->arch.perdomain_l3_pg ) + if ( !d->arch.perdomain_l3_pg && !v->arch.pervcpu_l3_pg ) return; /* Use likely to force the optimization for the fast path. */ @@ -6522,7 +6537,8 @@ void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va, return; } - l3tab = __map_domain_page(d->arch.perdomain_l3_pg); + l3tab = __map_domain_page(d->arch.vcpu_pt ? v->arch.pervcpu_l3_pg + : d->arch.perdomain_l3_pg); pl3e = l3tab + l3_table_offset(va); if ( l3e_get_flags(*pl3e) & _PAGE_PRESENT ) @@ -6567,10 +6583,11 @@ void free_perdomain_mappings(struct vcpu *v) l3_pgentry_t *l3tab; unsigned int i; - if ( !d->arch.perdomain_l3_pg ) + if ( !v->arch.pervcpu_l3_pg && !d->arch.perdomain_l3_pg ) return; - l3tab = __map_domain_page(d->arch.perdomain_l3_pg); + l3tab = __map_domain_page(d->arch.vcpu_pt ? 
v->arch.pervcpu_l3_pg + : d->arch.perdomain_l3_pg); for ( i = 0; i < PERDOMAIN_SLOTS; ++i) if ( l3e_get_flags(l3tab[i]) & _PAGE_PRESENT ) @@ -6604,8 +6621,10 @@ void free_perdomain_mappings(struct vcpu *v) } unmap_domain_page(l3tab); - free_domheap_page(d->arch.perdomain_l3_pg); + free_domheap_page(d->arch.vcpu_pt ? v->arch.pervcpu_l3_pg + : d->arch.perdomain_l3_pg); d->arch.perdomain_l3_pg = NULL; + v->arch.pervcpu_l3_pg = NULL; } static void write_sss_token(unsigned long *ptr) diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c index ec5043a8aa9e..c7d9bf7c71bf 100644 --- a/xen/arch/x86/mm/hap/hap.c +++ b/xen/arch/x86/mm/hap/hap.c @@ -402,7 +402,7 @@ static mfn_t hap_make_monitor_table(struct vcpu *v) m4mfn = page_to_mfn(pg); l4e = map_domain_page(m4mfn); - init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false); + init_xen_l4_slots(l4e, m4mfn, v, INVALID_MFN, false); unmap_domain_page(l4e); return m4mfn; diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c index 114957a3e1ec..d588dbbae003 100644 --- a/xen/arch/x86/mm/shadow/hvm.c +++ b/xen/arch/x86/mm/shadow/hvm.c @@ -776,7 +776,7 @@ mfn_t sh_make_monitor_table(const struct vcpu *v, unsigned int shadow_levels) * shadow-linear mapping will either be inserted below when creating * lower level monitor tables, or later in sh_update_cr3(). */ - init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false); + init_xen_l4_slots(l4e, m4mfn, v, INVALID_MFN, false); if ( shadow_levels < 4 ) { diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c index 10ddc408ff73..a1f8147e197a 100644 --- a/xen/arch/x86/mm/shadow/multi.c +++ b/xen/arch/x86/mm/shadow/multi.c @@ -973,7 +973,7 @@ sh_make_shadow(struct vcpu *v, mfn_t gmfn, u32 shadow_type) BUILD_BUG_ON(sizeof(l4_pgentry_t) != sizeof(shadow_l4e_t)); - init_xen_l4_slots(l4t, gmfn, d, smfn, (!is_pv_32bit_domain(d) && + init_xen_l4_slots(l4t, gmfn, v, smfn, (!is_pv_32bit_domain(d) && VM_ASSIST(d, m2p_strict))); unmap_domain_page(l4t); } diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c index f54d1da5c6f4..5081c19b9a9a 100644 --- a/xen/arch/x86/pv/dom0_build.c +++ b/xen/arch/x86/pv/dom0_build.c @@ -737,7 +737,7 @@ static int __init dom0_construct(struct boot_info *bi, struct domain *d) l4start = l4tab = __va(mpt_alloc); mpt_alloc += PAGE_SIZE; clear_page(l4tab); init_xen_l4_slots(l4tab, _mfn(virt_to_mfn(l4start)), - d, INVALID_MFN, true); + d->vcpu[0], INVALID_MFN, true); v->arch.guest_table = pagetable_from_paddr(__pa(l4start)); } else diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c index 5bda168eadff..8d2428051607 100644 --- a/xen/arch/x86/pv/domain.c +++ b/xen/arch/x86/pv/domain.c @@ -125,7 +125,7 @@ static int setup_compat_l4(struct vcpu *v) mfn = page_to_mfn(pg); l4tab = map_domain_page(mfn); clear_page(l4tab); - init_xen_l4_slots(l4tab, mfn, v->domain, INVALID_MFN, false); + init_xen_l4_slots(l4tab, mfn, v, INVALID_MFN, false); unmap_domain_page(l4tab); /* This page needs to look like a pagetable so that it can be shadowed */ From patchwork Wed Jan 8 14:26:55 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13931075 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested)

From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 15/18] x86/mm: introduce a per-vCPU mapcache when using ASI
Date: Wed, 8 Jan 2025 15:26:55 +0100
Message-ID: <20250108142659.99490-16-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

When using a unique per-vCPU root page table the per-domain region becomes
per-vCPU, and hence the mapcache is no longer shared between all vCPUs of a
domain.  Introduce per-vCPU mapcache structures, and modify map_domain_page()
to create per-vCPU mappings when possible.  Note the lock is also not needed
with using per-vCPU map caches, as the structure is no longer shared.

This introduces some duplication in the domain and vcpu structures, as both
contain a mapcache field to support running with and without per-vCPU
page-tables.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/domain_page.c        | 90 ++++++++++++++++++++-----------
 xen/arch/x86/include/asm/domain.h | 20 ++++---
 2 files changed, 71 insertions(+), 39 deletions(-)

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index 1372be20224e..65900d6218f8 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -74,7 +74,9 @@ void *map_domain_page(mfn_t mfn)
     struct vcpu *v;
     struct mapcache_domain *dcache;
     struct mapcache_vcpu *vcache;
+    struct mapcache *cache;
     struct vcpu_maphash_entry *hashent;
+    struct domain *d;
 
 #ifdef NDEBUG
     if ( mfn_x(mfn) <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
@@ -85,9 +87,12 @@ void *map_domain_page(mfn_t mfn)
     if ( !v || !is_pv_vcpu(v) )
         return mfn_to_virt(mfn_x(mfn));
 
-    dcache = &v->domain->arch.pv.mapcache;
+    d = v->domain;
+    dcache = &d->arch.pv.mapcache;
     vcache = &v->arch.pv.mapcache;
-    if ( !dcache->inuse )
+    cache = d->arch.vcpu_pt ? &v->arch.pv.mapcache.cache
+                            : &d->arch.pv.mapcache.cache;
+    if ( !cache->inuse )
         return mfn_to_virt(mfn_x(mfn));
 
     perfc_incr(map_domain_page_count);
@@ -98,17 +103,18 @@ void *map_domain_page(mfn_t mfn)
     if ( hashent->mfn == mfn_x(mfn) )
     {
         idx = hashent->idx;
-        ASSERT(idx < dcache->entries);
+        ASSERT(idx < cache->entries);
         hashent->refcnt++;
         ASSERT(hashent->refcnt);
         ASSERT(mfn_eq(l1e_get_mfn(MAPCACHE_L1ENT(idx)), mfn));
         goto out;
     }
 
-    spin_lock(&dcache->lock);
+    if ( !d->arch.vcpu_pt )
+        spin_lock(&dcache->lock);
 
     /* Has some other CPU caused a wrap? We must flush if so. */
-    if ( unlikely(dcache->epoch != vcache->shadow_epoch) )
+    if ( unlikely(!d->arch.vcpu_pt && dcache->epoch != vcache->shadow_epoch) )
     {
         vcache->shadow_epoch = dcache->epoch;
         if ( NEED_FLUSH(this_cpu(tlbflush_time), dcache->tlbflush_timestamp) )
@@ -118,21 +124,21 @@ void *map_domain_page(mfn_t mfn)
         {
             flush_tlb_local();
         }
     }
 
-    idx = find_next_zero_bit(dcache->inuse, dcache->entries, dcache->cursor);
-    if ( unlikely(idx >= dcache->entries) )
+    idx = find_next_zero_bit(cache->inuse, cache->entries, cache->cursor);
+    if ( unlikely(idx >= cache->entries) )
     {
         unsigned long accum = 0, prev = 0;
 
         /* /First/, clean the garbage map and update the inuse list.
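          * (Editorial note, not part of the original comment:
          * unmap_domain_page() only marks freed slots in the 'garbage'
          * bitmap; they are folded back into 'inuse' here, and the TLB is
          * flushed before any recycled slot is handed out again.)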
*/ - for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ ) + for ( i = 0; i < BITS_TO_LONGS(cache->entries); i++ ) { accum |= prev; - dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0); - prev = ~dcache->inuse[i]; + cache->inuse[i] &= ~xchg(&cache->garbage[i], 0); + prev = ~cache->inuse[i]; } - if ( accum | (prev & BITMAP_LAST_WORD_MASK(dcache->entries)) ) - idx = find_first_zero_bit(dcache->inuse, dcache->entries); + if ( accum | (prev & BITMAP_LAST_WORD_MASK(cache->entries)) ) + idx = find_first_zero_bit(cache->inuse, cache->entries); else { /* Replace a hash entry instead. */ @@ -152,19 +158,23 @@ void *map_domain_page(mfn_t mfn) i = 0; } while ( i != MAPHASH_HASHFN(mfn_x(mfn)) ); } - BUG_ON(idx >= dcache->entries); + BUG_ON(idx >= cache->entries); /* /Second/, flush TLBs. */ perfc_incr(domain_page_tlb_flush); flush_tlb_local(); - vcache->shadow_epoch = ++dcache->epoch; - dcache->tlbflush_timestamp = tlbflush_current_time(); + if ( !d->arch.vcpu_pt ) + { + vcache->shadow_epoch = ++dcache->epoch; + dcache->tlbflush_timestamp = tlbflush_current_time(); + } } - set_bit(idx, dcache->inuse); - dcache->cursor = idx + 1; + set_bit(idx, cache->inuse); + cache->cursor = idx + 1; - spin_unlock(&dcache->lock); + if ( !d->arch.vcpu_pt ) + spin_unlock(&dcache->lock); l1e_write(&MAPCACHE_L1ENT(idx), l1e_from_mfn(mfn, __PAGE_HYPERVISOR_RW)); @@ -178,6 +188,7 @@ void unmap_domain_page(const void *ptr) unsigned int idx; struct vcpu *v; struct mapcache_domain *dcache; + struct mapcache *cache; unsigned long va = (unsigned long)ptr, mfn, flags; struct vcpu_maphash_entry *hashent; @@ -190,7 +201,9 @@ void unmap_domain_page(const void *ptr) ASSERT(v && is_pv_vcpu(v)); dcache = &v->domain->arch.pv.mapcache; - ASSERT(dcache->inuse); + cache = v->domain->arch.vcpu_pt ? &v->arch.pv.mapcache.cache + : &v->domain->arch.pv.mapcache.cache; + ASSERT(cache->inuse); idx = PFN_DOWN(va - MAPCACHE_VIRT_START); mfn = l1e_get_pfn(MAPCACHE_L1ENT(idx)); @@ -213,7 +226,7 @@ void unmap_domain_page(const void *ptr) hashent->mfn); l1e_write(&MAPCACHE_L1ENT(hashent->idx), l1e_empty()); /* /Second/, mark as garbage. */ - set_bit(hashent->idx, dcache->garbage); + set_bit(hashent->idx, cache->garbage); } /* Add newly-freed mapping to the maphash. */ @@ -225,7 +238,7 @@ void unmap_domain_page(const void *ptr) /* /First/, zap the PTE. */ l1e_write(&MAPCACHE_L1ENT(idx), l1e_empty()); /* /Second/, mark as garbage. */ - set_bit(idx, dcache->garbage); + set_bit(idx, cache->garbage); } local_irq_restore(flags); @@ -234,7 +247,6 @@ void unmap_domain_page(const void *ptr) void mapcache_domain_init(struct domain *d) { struct mapcache_domain *dcache = &d->arch.pv.mapcache; - unsigned int bitmap_pages; ASSERT(is_pv_domain(d)); @@ -243,13 +255,12 @@ void mapcache_domain_init(struct domain *d) return; #endif + if ( d->arch.vcpu_pt ) + return; + BUILD_BUG_ON(MAPCACHE_VIRT_END + PAGE_SIZE * (3 + 2 * PFN_UP(BITS_TO_LONGS(MAPCACHE_ENTRIES) * sizeof(long))) > MAPCACHE_VIRT_START + (PERDOMAIN_SLOT_MBYTES << 20)); - bitmap_pages = PFN_UP(BITS_TO_LONGS(MAPCACHE_ENTRIES) * sizeof(long)); - dcache->inuse = (void *)MAPCACHE_VIRT_END + PAGE_SIZE; - dcache->garbage = dcache->inuse + - (bitmap_pages + 1) * PAGE_SIZE / sizeof(long); spin_lock_init(&dcache->lock); } @@ -258,30 +269,45 @@ int mapcache_vcpu_init(struct vcpu *v) { struct domain *d = v->domain; struct mapcache_domain *dcache = &d->arch.pv.mapcache; + struct mapcache *cache; unsigned long i; - unsigned int ents = d->max_vcpus * MAPCACHE_VCPU_ENTRIES; + unsigned int ents = (d->arch.vcpu_pt ? 
1 : d->max_vcpus) * + MAPCACHE_VCPU_ENTRIES; unsigned int nr = PFN_UP(BITS_TO_LONGS(ents) * sizeof(long)); - if ( !is_pv_vcpu(v) || !dcache->inuse ) + if ( !is_pv_vcpu(v) ) return 0; - if ( ents > dcache->entries ) + cache = d->arch.vcpu_pt ? &v->arch.pv.mapcache.cache + : &dcache->cache; + + if ( !cache->inuse ) + return 0; + + if ( ents > cache->entries ) { /* Populate page tables. */ int rc = create_perdomain_mapping(v, MAPCACHE_VIRT_START, ents, false); + const unsigned int bitmap_pages = + PFN_UP(BITS_TO_LONGS(MAPCACHE_ENTRIES) * sizeof(long)); + + cache->inuse = (void *)MAPCACHE_VIRT_END + PAGE_SIZE; + cache->garbage = cache->inuse + + (bitmap_pages + 1) * PAGE_SIZE / sizeof(long); + /* Populate bit maps. */ if ( !rc ) - rc = create_perdomain_mapping(v, (unsigned long)dcache->inuse, + rc = create_perdomain_mapping(v, (unsigned long)cache->inuse, nr, true); if ( !rc ) - rc = create_perdomain_mapping(v, (unsigned long)dcache->garbage, + rc = create_perdomain_mapping(v, (unsigned long)cache->garbage, nr, true); if ( rc ) return rc; - dcache->entries = ents; + cache->entries = ents; } /* Mark all maphash entries as not in use. */ diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h index 5bf0ad3fdcf7..ba5440099d90 100644 --- a/xen/arch/x86/include/asm/domain.h +++ b/xen/arch/x86/include/asm/domain.h @@ -41,6 +41,16 @@ struct trap_bounce { unsigned long eip; }; +struct mapcache { + /* The number of array entries, and a cursor into the array. */ + unsigned int entries; + unsigned int cursor; + + /* Which mappings are in use, and which are garbage to reap next epoch? */ + unsigned long *inuse; + unsigned long *garbage; +}; + #define MAPHASH_ENTRIES 8 #define MAPHASH_HASHFN(pfn) ((pfn) & (MAPHASH_ENTRIES-1)) #define MAPHASHENT_NOTINUSE ((u32)~0U) @@ -54,13 +64,11 @@ struct mapcache_vcpu { uint32_t idx; uint32_t refcnt; } hash[MAPHASH_ENTRIES]; + + struct mapcache cache; }; struct mapcache_domain { - /* The number of array entries, and a cursor into the array. */ - unsigned int entries; - unsigned int cursor; - /* Protects map_domain_page(). */ spinlock_t lock; @@ -68,9 +76,7 @@ struct mapcache_domain { unsigned int epoch; u32 tlbflush_timestamp; - /* Which mappings are in use, and which are garbage to reap next epoch? 
 */
-    unsigned long *inuse;
-    unsigned long *garbage;
+    struct mapcache cache;
 };
 
 void mapcache_domain_init(struct domain *d);

From patchwork Wed Jan 8 14:26:56 2025
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Jan Beulich, Andrew Cooper
Subject: [PATCH v2 16/18] x86/pv: allow using a unique per-pCPU root page table (L4)
Date: Wed, 8 Jan 2025 15:26:56 +0100
Message-ID: <20250108142659.99490-17-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

When running PV guests it's possible for the guest to use the same root page
table (L4) for all vCPUs, which in turn will result in Xen also using the same
root page table on all pCPUs that are running any domain vCPU.

When using XPTI Xen switches to a per-CPU shadow L4 when running in guest
context, switching to the fully populated L4 when in Xen context.

Take advantage of this existing shadowing and force the usage of a per-CPU L4
that shadows the guest selected L4 when Address Space Isolation is requested
for PV guests.

The mapping of the guest L4 is done with a per-CPU fixmap entry, that however
requires that the currently loaded L4 has the per-CPU slot setup.  In order to
ensure this switch to the shadow per-CPU L4 with just the Xen slots populated,
and then map the guest L4 and copy the contents of the guest controlled slots.

Signed-off-by: Roger Pau Monné
---
 xen/arch/x86/flushtlb.c           | 22 +++++++++++++++++
 xen/arch/x86/include/asm/config.h |  6 +++++
 xen/arch/x86/include/asm/domain.h |  3 +++
 xen/arch/x86/include/asm/pv/mm.h  |  5 ++++
 xen/arch/x86/mm.c                 | 12 +++++++++-
 xen/arch/x86/mm/paging.c          |  6 +++++
 xen/arch/x86/pv/dom0_build.c      | 10 ++++++--
 xen/arch/x86/pv/domain.c          | 31 +++++++++++++++++++++++-
 xen/arch/x86/pv/mm.c              | 40 +++++++++++++++++++++++++++++++
 9 files changed, 131 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index a64c28f854ea..72692b504dd4 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /* Debug builds: Wrap frequently to stress-test the wrap logic. */
@@ -192,7 +193,28 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
     unsigned int order = (flags - 1) & FLUSH_ORDER_MASK;
 
     if ( flags & FLUSH_ROOT_PGTBL )
+    {
         get_cpu_info()->root_pgt_changed = true;
+        /*
+         * Use opt_vcpu_pt_pv instead of current->arch.vcpu_pt to avoid doing a
+         * sync_local_execstate() when per-vCPU page-tables are not enabled for
+         * PV.
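+         *
+         * (Editorial note, not part of the original comment: checking the
+         * global opt_vcpu_pt_pv first means the execution state sync is
+         * skipped entirely when no PV domain can be using per-vCPU
+         * page-tables.)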
+ */ + if ( opt_vcpu_pt_pv ) + { + const struct vcpu *curr; + const struct domain *curr_d; + + sync_local_execstate(); + + curr = current; + curr_d = curr->domain; + + if ( is_pv_domain(curr_d) && curr_d->arch.vcpu_pt ) + /* Update shadow root page-table ahead of doing TLB flush. */ + pv_asi_update_shadow_l4(curr); + } + } if ( flags & (FLUSH_TLB|FLUSH_TLB_GLOBAL) ) { diff --git a/xen/arch/x86/include/asm/config.h b/xen/arch/x86/include/asm/config.h index 19746f956ec3..af3ff3cb8705 100644 --- a/xen/arch/x86/include/asm/config.h +++ b/xen/arch/x86/include/asm/config.h @@ -265,6 +265,12 @@ extern unsigned long xen_phys_start; /* The address of a particular VCPU's GDT or LDT. */ #define GDT_VIRT_START(v) \ (PERDOMAIN_VIRT_START + ((v)->vcpu_id << GDT_LDT_VCPU_VA_SHIFT)) +/* + * There are 2 GDT pages reserved for Xen, but only one is used. Use the + * remaining one to map the guest L4 when running with ASI enabled. + */ +#define L4_SHADOW(v) \ + (GDT_VIRT_START(v) + ((FIRST_RESERVED_GDT_PAGE + 1) << PAGE_SHIFT)) #define LDT_VIRT_START(v) \ (GDT_VIRT_START(v) + (64*1024)) diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h index ba5440099d90..a3c75e323cde 100644 --- a/xen/arch/x86/include/asm/domain.h +++ b/xen/arch/x86/include/asm/domain.h @@ -591,6 +591,9 @@ struct pv_vcpu /* Deferred VA-based update state. */ bool need_update_runstate_area; struct vcpu_time_info pending_system_time; + + /* For ASI: page to use as L4 shadow of the guest selected L4. */ + root_pgentry_t *root_pgt; }; struct arch_vcpu diff --git a/xen/arch/x86/include/asm/pv/mm.h b/xen/arch/x86/include/asm/pv/mm.h index 182764542c1f..540202f9712a 100644 --- a/xen/arch/x86/include/asm/pv/mm.h +++ b/xen/arch/x86/include/asm/pv/mm.h @@ -23,6 +23,8 @@ bool pv_destroy_ldt(struct vcpu *v); int validate_segdesc_page(struct page_info *page); +void pv_asi_update_shadow_l4(const struct vcpu *v); + #else #include @@ -44,6 +46,9 @@ static inline bool pv_map_ldt_shadow_page(unsigned int off) { return false; } static inline bool pv_destroy_ldt(struct vcpu *v) { ASSERT_UNREACHABLE(); return false; } +static inline void pv_asi_update_shadow_l4(const struct vcpu *v) +{ ASSERT_UNREACHABLE(); } + #endif #endif /* __X86_PV_MM_H__ */ diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 583bf4c58bf9..3a637e508ff3 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -546,6 +546,8 @@ void write_ptbase(struct vcpu *v) } else { + if ( is_pv_domain(d) && d->arch.vcpu_pt ) + pv_asi_update_shadow_l4(v); /* Make sure to clear use_pv_cr3 and xen_cr3 before pv_cr3. */ cpu_info->use_pv_cr3 = false; cpu_info->xen_cr3 = 0; @@ -565,6 +567,7 @@ void write_ptbase(struct vcpu *v) */ pagetable_t update_cr3(struct vcpu *v) { + const struct domain *d = v->domain; mfn_t cr3_mfn; if ( paging_mode_enabled(v->domain) ) @@ -575,7 +578,14 @@ pagetable_t update_cr3(struct vcpu *v) else cr3_mfn = pagetable_get_mfn(v->arch.guest_table); - make_cr3(v, cr3_mfn); + make_cr3(v, d->arch.vcpu_pt ? 
virt_to_mfn(v->arch.pv.root_pgt) : cr3_mfn); + + if ( d->arch.vcpu_pt ) + { + populate_perdomain_mapping(v, L4_SHADOW(v), &cr3_mfn, 1); + if ( v == this_cpu(curr_vcpu) ) + flush_tlb_one_local(L4_SHADOW(v)); + } return pagetable_null(); } diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c index c77f4c1dac52..be30f21c1a7b 100644 --- a/xen/arch/x86/mm/paging.c +++ b/xen/arch/x86/mm/paging.c @@ -695,6 +695,12 @@ int paging_domctl(struct domain *d, struct xen_domctl_shadow_op *sc, return -EINVAL; } + if ( is_pv_domain(d) && d->arch.vcpu_pt ) + { + gprintk(XENLOG_ERR, "Paging not supported on PV domains with ASI\n"); + return -EOPNOTSUPP; + } + if ( resuming ? (d->arch.paging.preempt.dom != current->domain || d->arch.paging.preempt.op != sc->op) diff --git a/xen/arch/x86/pv/dom0_build.c b/xen/arch/x86/pv/dom0_build.c index 5081c19b9a9a..6c1d99a9bf0d 100644 --- a/xen/arch/x86/pv/dom0_build.c +++ b/xen/arch/x86/pv/dom0_build.c @@ -838,8 +838,11 @@ static int __init dom0_construct(struct boot_info *bi, struct domain *d) d->arch.paging.mode = 0; - /* Set up CR3 value for switch_cr3_cr4(). */ - update_cr3(v); + /* + * Set up CR3 value for switch_cr3_cr4(). Use make_cr3() instead of + * update_cr3() to avoid using an ASI page-table for dom0 building. + */ + make_cr3(v, pagetable_get_mfn(v->arch.guest_table)); /* We run on dom0's page tables for the final part of the build process. */ switch_cr3_cr4(cr3_pa(v->arch.cr3), read_cr4()); @@ -1068,6 +1071,9 @@ static int __init dom0_construct(struct boot_info *bi, struct domain *d) } #endif + /* Must be called in case ASI is enabled. */ + update_cr3(v); + v->is_initialised = 1; clear_bit(_VPF_down, &v->pause_flags); diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c index 8d2428051607..583723c5d360 100644 --- a/xen/arch/x86/pv/domain.c +++ b/xen/arch/x86/pv/domain.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #ifdef CONFIG_PV32 @@ -296,6 +297,7 @@ void pv_vcpu_destroy(struct vcpu *v) pv_destroy_gdt_ldt_l1tab(v); XFREE(v->arch.pv.trap_ctxt); + FREE_XENHEAP_PAGE(v->arch.pv.root_pgt); } int pv_vcpu_initialise(struct vcpu *v) @@ -336,6 +338,24 @@ int pv_vcpu_initialise(struct vcpu *v) goto done; } + if ( d->arch.vcpu_pt ) + { + v->arch.pv.root_pgt = alloc_xenheap_page(); + if ( !v->arch.pv.root_pgt ) + { + rc = -ENOMEM; + goto done; + } + + /* + * VM assists are not yet known, RO machine-to-phys slot will be copied + * from the guest L4. + */ + init_xen_l4_slots(v->arch.pv.root_pgt, + _mfn(virt_to_mfn(v->arch.pv.root_pgt)), + v, INVALID_MFN, false); + } + done: if ( rc ) pv_vcpu_destroy(v); @@ -368,7 +388,7 @@ int pv_domain_initialise(struct domain *d) d->arch.ctxt_switch = &pv_csw; - d->arch.pv.flush_root_pt = d->arch.pv.xpti; + d->arch.pv.flush_root_pt = d->arch.pv.xpti || d->arch.vcpu_pt; if ( !is_pv_32bit_domain(d) && use_invpcid && cpu_has_pcid ) switch ( ACCESS_ONCE(opt_pcid) ) @@ -409,6 +429,7 @@ bool __init xpti_pcid_enabled(void) static void _toggle_guest_pt(struct vcpu *v) { + const struct domain *d = v->domain; bool guest_update; pagetable_t old_shadow; unsigned long cr3; @@ -417,6 +438,14 @@ static void _toggle_guest_pt(struct vcpu *v) guest_update = v->arch.flags & TF_kernel_mode; old_shadow = update_cr3(v); + if ( d->arch.vcpu_pt ) + /* + * _toggle_guest_pt() might switch between user and kernel page tables, + * but doesn't use write_ptbase(), and hence needs an explicit call to + * sync the shadow L4. + */ + pv_asi_update_shadow_l4(v); + /* * Don't flush user global mappings from the TLB. 
Don't tick TLB clock.
      *

diff --git a/xen/arch/x86/pv/mm.c b/xen/arch/x86/pv/mm.c
index 4853e619f2a7..46c437692bea 100644
--- a/xen/arch/x86/pv/mm.c
+++ b/xen/arch/x86/pv/mm.c
@@ -12,6 +12,7 @@
 #include
 #include
+#include
 
 #include "mm.h"
 
@@ -104,6 +105,45 @@ void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
 }
 #endif
 
+void pv_asi_update_shadow_l4(const struct vcpu *v)
+{
+    const root_pgentry_t *guest_pgt;
+    root_pgentry_t *root_pgt = v->arch.pv.root_pgt;
+    const struct domain *d = v->domain;
+
+    ASSERT(!d->arch.pv.xpti);
+    ASSERT(is_pv_domain(d));
+    ASSERT(!is_idle_domain(d));
+    ASSERT(current == this_cpu(curr_vcpu));
+
+    if ( likely(v == current) )
+        guest_pgt = (void *)L4_SHADOW(v);
+    else if ( !(v->arch.flags & TF_kernel_mode) )
+        guest_pgt =
+            map_domain_page(pagetable_get_mfn(v->arch.guest_table_user));
+    else
+        guest_pgt = map_domain_page(pagetable_get_mfn(v->arch.guest_table));
+
+    if ( is_pv_64bit_domain(d) )
+    {
+        unsigned int i;
+
+        for ( i = 0; i < ROOT_PAGETABLE_FIRST_XEN_SLOT; i++ )
+            l4e_write(&root_pgt[i], guest_pgt[i]);
+        for ( i = ROOT_PAGETABLE_LAST_XEN_SLOT + 1;
+              i < L4_PAGETABLE_ENTRIES; i++ )
+            l4e_write(&root_pgt[i], guest_pgt[i]);
+
+        l4e_write(&root_pgt[l4_table_offset(RO_MPT_VIRT_START)],
+                  guest_pgt[l4_table_offset(RO_MPT_VIRT_START)]);
+    }
+    else
+        l4e_write(&root_pgt[0], guest_pgt[0]);
+
+    if ( v != this_cpu(curr_vcpu) )
+        unmap_domain_page(guest_pgt);
+}
+
 /*
  * Local variables:
  * mode: C

From patchwork Wed Jan 8 14:26:57 2025
From: Roger Pau Monne
To: xen-devel@lists.xenproject.org
Cc: Roger Pau Monne, Andrew Cooper, Anthony PERARD, Michal Orzel,
    Jan Beulich, Julien Grall, Stefano Stabellini
Subject: [PATCH v2 17/18] x86/mm: switch to a per-CPU mapped stack when using ASI
Date: Wed, 8 Jan 2025 15:26:57 +0100
Message-ID: <20250108142659.99490-18-roger.pau@citrix.com>
In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com>
References: <20250108142659.99490-1-roger.pau@citrix.com>

When using ASI the CPU stack is mapped using a range of fixmap entries in the
per-CPU region.  This ensures the stack is only accessible by the current CPU.

Note however there's further work required in order to allocate the stack from
domheap instead of xenheap, and ensure the stack is not part of the direct
map.

For domains not running with ASI enabled all the CPU stacks are mapped in the
per-domain L3, so that the stack is always at the same linear address,
regardless of whether ASI is enabled or not for the domain.

When calling UEFI runtime methods the current per-domain slot needs to be
added to the EFI L4, so that the stack is available in UEFI.

Finally, some users of callfunc IPIs pass parameters from the stack, so when
handling a callfunc IPI the stack of the caller CPU is mapped into the address
space of the CPU handling the IPI.
This needs further work to use a bounce buffer in order to avoid having to map remote CPU stacks. Signed-off-by: Roger Pau Monné --- There's also further work required in order to avoid mapping remote stack when handling callfunc IPIs. --- docs/misc/xen-command-line.pandoc | 5 +- xen/arch/x86/domain.c | 30 ++++++++++++ xen/arch/x86/include/asm/config.h | 10 +++- xen/arch/x86/include/asm/current.h | 5 ++ xen/arch/x86/include/asm/domain.h | 3 ++ xen/arch/x86/include/asm/mm.h | 2 +- xen/arch/x86/include/asm/smp.h | 12 +++++ xen/arch/x86/include/asm/spec_ctrl.h | 1 + xen/arch/x86/mm.c | 69 ++++++++++++++++++++++------ xen/arch/x86/setup.c | 32 ++++++++++--- xen/arch/x86/smp.c | 39 ++++++++++++++++ xen/arch/x86/smpboot.c | 20 +++++++- xen/arch/x86/spec_ctrl.c | 67 +++++++++++++++++++++++---- xen/arch/x86/traps.c | 8 +++- xen/common/smp.c | 10 ++++ xen/common/stop_machine.c | 10 ++++ xen/include/xen/smp.h | 8 ++++ 17 files changed, 295 insertions(+), 36 deletions(-) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc index 3c1ad7b5fe7d..e7828d092098 100644 --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -204,7 +204,7 @@ to appropriate auditing by Xen. Argo is disabled by default. ### asi (x86) > `= List of [ , {pv,hvm}=, - {vcpu-pt}=|{pv,hvm}= ]` + {vcpu-pt,cpu-stack}=|{pv,hvm}= ]` Offers control over whether the hypervisor will engage in Address Space Isolation, by not having potentially sensitive information permanently mapped @@ -221,6 +221,9 @@ meant to be used for debugging purposes only.** * `vcpu-pt` ensure each vCPU uses a unique top-level page-table and setup a virtual address space region to map memory on a per-vCPU basis. +* `cpu-stack` prevent CPUs from having permanent mappings of stacks different + than their own. Depends on the `vcpu-pt` option. + ### asid (x86) > `= ` diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 6e1f622f7385..ac6332266e95 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -563,6 +563,26 @@ int arch_vcpu_create(struct vcpu *v) if ( rc ) return rc; + if ( opt_cpu_stack_hvm || opt_cpu_stack_pv ) + { + if ( is_idle_vcpu(v) || d->arch.cpu_stack ) + create_perdomain_mapping(v, PCPU_STACK_VIRT(0), + nr_cpu_ids << STACK_ORDER, false); + else if ( !v->vcpu_id ) + { + l3_pgentry_t *idle_perdomain = + __map_domain_page(idle_vcpu[0]->domain->arch.perdomain_l3_pg); + l3_pgentry_t *guest_perdomain = + __map_domain_page(d->arch.perdomain_l3_pg); + + l3e_write(&guest_perdomain[PCPU_STACK_SLOT], + idle_perdomain[PCPU_STACK_SLOT]); + + unmap_domain_page(guest_perdomain); + unmap_domain_page(idle_perdomain); + } + } + rc = mapcache_vcpu_init(v); if ( rc ) return rc; @@ -2031,6 +2051,16 @@ static void __context_switch(struct vcpu *n) } vcpu_restore_fpu_nonlazy(n, false); nd->arch.ctxt_switch->to(n); + if ( nd->arch.cpu_stack ) + { + /* + * Tear down previous stack mappings and map current pCPU stack. + * This is safe because not yet running on 'n' page-tables. + */ + destroy_perdomain_mapping(n, PCPU_STACK_VIRT(0), + nr_cpu_ids << STACK_ORDER); + vcpu_set_stack_mappings(n, cpu, true); + } } psr_ctxt_switch_to(nd); diff --git a/xen/arch/x86/include/asm/config.h b/xen/arch/x86/include/asm/config.h index af3ff3cb8705..016d6c8b21a9 100644 --- a/xen/arch/x86/include/asm/config.h +++ b/xen/arch/x86/include/asm/config.h @@ -168,7 +168,7 @@ /* Slot 260: per-domain mappings (including map cache). 
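  * (Editorial note, not part of the original comment: with this patch the
  * per-domain area grows to 4 slots, and slot 3, PCPU_STACK_SLOT, holds the
  * per-CPU stack mappings used when ASI is active; see PCPU_STACK_VIRT()
  * below.)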
*/ #define PERDOMAIN_VIRT_START (PML4_ADDR(260)) #define PERDOMAIN_SLOT_MBYTES (PML4_ENTRY_BYTES >> (20 + PAGETABLE_ORDER)) -#define PERDOMAIN_SLOTS 3 +#define PERDOMAIN_SLOTS 4 #define PERDOMAIN_VIRT_SLOT(s) (PERDOMAIN_VIRT_START + (s) * \ (PERDOMAIN_SLOT_MBYTES << 20)) /* Slot 4: mirror of per-domain mappings (for compat xlat area accesses). */ @@ -288,6 +288,14 @@ extern unsigned long xen_phys_start; #define ARG_XLAT_START(v) \ (ARG_XLAT_VIRT_START + ((v)->vcpu_id << ARG_XLAT_VA_SHIFT)) +/* Per-CPU stacks area when using ASI. */ +#define PCPU_STACK_SLOT 3 +#define PCPU_STACK_VIRT_START PERDOMAIN_VIRT_SLOT(PCPU_STACK_SLOT) +#define PCPU_STACK_VIRT_END (PCPU_STACK_VIRT_START + \ + (PERDOMAIN_SLOT_MBYTES << 20)) +#define PCPU_STACK_VIRT(cpu) (PCPU_STACK_VIRT_START + \ + (cpu << STACK_ORDER) * PAGE_SIZE) + #define ELFSIZE 64 #define ARCH_CRASH_SAVE_VMCOREINFO diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h index bcec328c9875..4a9776f87a7a 100644 --- a/xen/arch/x86/include/asm/current.h +++ b/xen/arch/x86/include/asm/current.h @@ -24,6 +24,11 @@ * 0 - IST Shadow Stacks (4x 1k, read-only) */ +static inline bool is_shstk_slot(unsigned int i) +{ + return (i == 0 || i == PRIMARY_SHSTK_SLOT); +} + /* * Identify which stack page the stack pointer is on. Returns an index * as per the comment above. diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h index a3c75e323cde..f83d2860c0b4 100644 --- a/xen/arch/x86/include/asm/domain.h +++ b/xen/arch/x86/include/asm/domain.h @@ -465,6 +465,9 @@ struct arch_domain /* Use a per-vCPU root pt, and switch per-domain slot to per-vCPU. */ bool vcpu_pt; + /* Use per-CPU mapped stacks. */ + bool cpu_stack; + /* Emulated devices enabled bitmap. */ uint32_t emulation_flags; } __cacheline_aligned; diff --git a/xen/arch/x86/include/asm/mm.h b/xen/arch/x86/include/asm/mm.h index f79d1594fde4..77f31685fd95 100644 --- a/xen/arch/x86/include/asm/mm.h +++ b/xen/arch/x86/include/asm/mm.h @@ -519,7 +519,7 @@ extern struct rangeset *mmio_ro_ranges; #define compat_pfn_to_cr3(pfn) (((unsigned)(pfn) << 12) | ((unsigned)(pfn) >> 20)) #define compat_cr3_to_pfn(cr3) (((unsigned)(cr3) >> 12) | ((unsigned)(cr3) << 20)) -void memguard_guard_stack(void *p); +void memguard_guard_stack(void *p, unsigned int cpu); void memguard_unguard_stack(void *p); /* diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index c8c79601343d..a356f0bf0a61 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -79,6 +79,18 @@ extern bool unaccounted_cpus; void *cpu_alloc_stack(unsigned int cpu); +/* + * Setup the per-CPU area stack mappings. + * + * @v: vCPU where the mappings are to appear. + * @stack_cpu: CPU whose stacks should be mapped. + * @map_shstk: create mappings for shadow stack regions. 
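+ *
+ * (Illustrative usage, not part of the original comment: the boot path
+ * below calls
+ *     vcpu_set_stack_mappings(idle_vcpu[cpu], cpu, true);
+ * so each AP runs on its own per-CPU stack mapping from the start.)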
+ */ +void vcpu_set_stack_mappings(const struct vcpu *v, unsigned int stack_cpu, + bool map_shstk); + +#define HAS_ARCH_SMP_CALLFUNC_PREAMBLE + #endif /* !__ASSEMBLY__ */ #endif diff --git a/xen/arch/x86/include/asm/spec_ctrl.h b/xen/arch/x86/include/asm/spec_ctrl.h index c58afbaab671..c8943e81befa 100644 --- a/xen/arch/x86/include/asm/spec_ctrl.h +++ b/xen/arch/x86/include/asm/spec_ctrl.h @@ -89,6 +89,7 @@ extern uint8_t default_scf; extern int8_t opt_xpti_hwdom, opt_xpti_domu; extern int8_t opt_vcpu_pt_pv, opt_vcpu_pt_hwdom, opt_vcpu_pt_hvm; +extern int8_t opt_cpu_stack_pv, opt_cpu_stack_hwdom, opt_cpu_stack_hvm; extern bool cpu_has_bug_l1tf; extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu; diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 3a637e508ff3..22ee3170b86d 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -87,6 +87,7 @@ * doing the final put_page(), and remove it from the iommu if so. */ +#include #include #include #include @@ -6424,8 +6425,10 @@ int create_perdomain_mapping(struct vcpu *v, unsigned long va, return rc; } -void populate_perdomain_mapping(const struct vcpu *v, unsigned long va, - mfn_t *mfn, unsigned long nr) +static void populate_perdomain_mapping_flags(const struct vcpu *v, + unsigned long va, mfn_t *mfn, + unsigned long nr, + unsigned int flags) { l1_pgentry_t *l1tab = NULL, *pl1e; const l3_pgentry_t *l3tab; @@ -6454,7 +6457,7 @@ void populate_perdomain_mapping(const struct vcpu *v, unsigned long va, ASSERT_UNREACHABLE(); free_domheap_page(l1e_get_page(*pl1e)); } - l1e_write(pl1e, l1e_from_mfn(mfn[i], __PAGE_HYPERVISOR_RW)); + l1e_write(pl1e, l1e_from_mfn(mfn[i], flags)); } return; @@ -6505,7 +6508,7 @@ void populate_perdomain_mapping(const struct vcpu *v, unsigned long va, free_domheap_page(l1e_get_page(*pl1e)); } - l1e_write(pl1e, l1e_from_mfn(*mfn, __PAGE_HYPERVISOR_RW)); + l1e_write(pl1e, l1e_from_mfn(*mfn, flags)); } unmap_domain_page(l1tab); @@ -6513,6 +6516,31 @@ void populate_perdomain_mapping(const struct vcpu *v, unsigned long va, unmap_domain_page(l3tab); } +void populate_perdomain_mapping(const struct vcpu *v, unsigned long va, + mfn_t *mfn, unsigned long nr) +{ + populate_perdomain_mapping_flags(v, va, mfn, nr, __PAGE_HYPERVISOR_RW); +} + +void vcpu_set_stack_mappings(const struct vcpu *v, unsigned int stack_cpu, + bool map_shstk) +{ + unsigned int i; + + for ( i = 0; i < (1U << STACK_ORDER); i++ ) + { + unsigned int flags = is_shstk_slot(i) ? __PAGE_HYPERVISOR_SHSTK + : __PAGE_HYPERVISOR_RW; + mfn_t mfn = virt_to_mfn(stack_base[stack_cpu] + i * PAGE_SIZE); + + if ( is_shstk_slot(i) && !map_shstk ) + continue; + + populate_perdomain_mapping_flags(v, + PCPU_STACK_VIRT(stack_cpu) + i * PAGE_SIZE, &mfn, 1, flags); + } +} + void destroy_perdomain_mapping(const struct vcpu *v, unsigned long va, unsigned int nr) { @@ -6599,7 +6627,12 @@ void free_perdomain_mappings(struct vcpu *v) l3tab = __map_domain_page(d->arch.vcpu_pt ? v->arch.pervcpu_l3_pg : d->arch.perdomain_l3_pg); - for ( i = 0; i < PERDOMAIN_SLOTS; ++i) + for ( i = 0; i < PERDOMAIN_SLOTS; ++i ) + { + if ( i == PCPU_STACK_SLOT && !d->arch.cpu_stack ) + /* Without ASI the stack L3e is shared with the idle page-tables. */ + continue; + if ( l3e_get_flags(l3tab[i]) & _PAGE_PRESENT ) { struct page_info *l2pg = l3e_get_page(l3tab[i]); @@ -6629,6 +6662,7 @@ void free_perdomain_mappings(struct vcpu *v) unmap_domain_page(l2tab); free_domheap_page(l2pg); } + } unmap_domain_page(l3tab); free_domheap_page(d->arch.vcpu_pt ? 
v->arch.pervcpu_l3_pg @@ -6637,31 +6671,40 @@ void free_perdomain_mappings(struct vcpu *v) v->arch.pervcpu_l3_pg = NULL; } -static void write_sss_token(unsigned long *ptr) +static void write_sss_token(unsigned long *ptr, unsigned long va) { /* * A supervisor shadow stack token is its own linear address, with the * busy bit (0) clear. */ - *ptr = (unsigned long)ptr; + *ptr = va; } -void memguard_guard_stack(void *p) +void memguard_guard_stack(void *p, unsigned int cpu) { + unsigned long va = + (opt_cpu_stack_hvm || opt_cpu_stack_pv) ? PCPU_STACK_VIRT(cpu) + : (unsigned long)p; + /* IST Shadow stacks. 4x 1k in stack page 0. */ if ( IS_ENABLED(CONFIG_XEN_SHSTK) ) { - write_sss_token(p + (IST_MCE * IST_SHSTK_SIZE) - 8); - write_sss_token(p + (IST_NMI * IST_SHSTK_SIZE) - 8); - write_sss_token(p + (IST_DB * IST_SHSTK_SIZE) - 8); - write_sss_token(p + (IST_DF * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_MCE * IST_SHSTK_SIZE) - 8, + va + (IST_MCE * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_NMI * IST_SHSTK_SIZE) - 8, + va + (IST_NMI * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_DB * IST_SHSTK_SIZE) - 8, + va + (IST_DB * IST_SHSTK_SIZE) - 8); + write_sss_token(p + (IST_DF * IST_SHSTK_SIZE) - 8, + va + (IST_DF * IST_SHSTK_SIZE) - 8); } map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_SHSTK); /* Primary Shadow Stack. 1x 4k in stack page 5. */ p += PRIMARY_SHSTK_SLOT * PAGE_SIZE; + va += PRIMARY_SHSTK_SLOT * PAGE_SIZE; if ( IS_ENABLED(CONFIG_XEN_SHSTK) ) - write_sss_token(p + PAGE_SIZE - 8); + write_sss_token(p + PAGE_SIZE - 8, va + PAGE_SIZE - 8); map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_SHSTK); } diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c index 8ebe5a9443f3..d0b2c986962a 100644 --- a/xen/arch/x86/setup.c +++ b/xen/arch/x86/setup.c @@ -402,6 +402,11 @@ static void __init init_idle_domain(void) scheduler_init(); set_current(idle_vcpu[0]); this_cpu(curr_vcpu) = current; + if ( opt_cpu_stack_hvm || opt_cpu_stack_pv ) + /* Set per-domain slot in the idle page-tables to access stack mappings. */ + l4e_write(&idle_pg_table[l4_table_offset(PERDOMAIN_VIRT_START)], + l4e_from_page(idle_vcpu[0]->domain->arch.perdomain_l3_pg, + __PAGE_HYPERVISOR_RW)); } void srat_detect_node(int cpu) @@ -896,8 +901,6 @@ static void __init noreturn reinit_bsp_stack(void) /* Update SYSCALL trampolines */ percpu_traps_init(); - stack_base[0] = stack; - rc = setup_cpu_root_pgt(0); if ( rc ) panic("Error %d setting up PV root page table\n", rc); @@ -1864,10 +1867,6 @@ void asmlinkage __init noreturn __start_xen(void) system_state = SYS_STATE_boot; - bsp_stack = cpu_alloc_stack(0); - if ( !bsp_stack ) - panic("No memory for BSP stack\n"); - console_init_ring(); vesa_init(); @@ -2050,6 +2049,16 @@ void asmlinkage __init noreturn __start_xen(void) alternative_branches(); + /* + * Alloc the BSP stack closer to the point where the AP ones also get + * allocated - and after the speculation mitigations have been initialized. + * In order to set up the shadow stack token correctly Xen needs to know + * whether per-CPU mapped stacks are being used. 
+ */ + bsp_stack = cpu_alloc_stack(0); + if ( !bsp_stack ) + panic("No memory for BSP stack\n"); + /* * NB: when running as a PV shim VCPUOP_up/down is wired to the shim * physical cpu_add/remove functions, so launch the guest with only @@ -2155,8 +2164,17 @@ void asmlinkage __init noreturn __start_xen(void) info->last_spec_ctrl = default_xen_spec_ctrl; } + stack_base[0] = bsp_stack; + /* Copy the cpu info block, and move onto the BSP stack. */ - bsp_info = get_cpu_info_from_stack((unsigned long)bsp_stack); + if ( opt_cpu_stack_hvm || opt_cpu_stack_pv ) + { + vcpu_set_stack_mappings(idle_vcpu[0], 0, true); + bsp_info = get_cpu_info_from_stack(PCPU_STACK_VIRT(0)); + } + else + bsp_info = get_cpu_info_from_stack((unsigned long)bsp_stack); + *bsp_info = *info; asm volatile ("mov %[stk], %%rsp; jmp %c[fn]" :: diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c index 02a6ed7593f3..1b11017d5722 100644 --- a/xen/arch/x86/smp.c +++ b/xen/arch/x86/smp.c @@ -9,6 +9,7 @@ */ #include +#include #include #include #include @@ -27,6 +28,8 @@ #include #include +#include + /* Helper functions to prepare APIC register values. */ static unsigned int prepare_ICR(unsigned int shortcut, int vector) { @@ -435,3 +438,39 @@ long cf_check cpu_down_helper(void *data) ret = cpu_down(cpu); return ret; } + +void arch_smp_pre_callfunc(unsigned int cpu) +{ + if ( !opt_cpu_stack_hvm && !opt_cpu_stack_pv ) + /* + * Avoid the unconditional sync_local_execstate() call below if ASI is + * not enabled for any domain. + */ + return; + + /* + * Sync execution state, so that the page-tables cannot change while + * creating or destroying the stack mappings. + */ + sync_local_execstate(); + if ( cpu == smp_processor_id() || !current->domain->arch.cpu_stack || + /* EFI page-tables have all pCPU stacks mapped. */ + efi_rs_using_pgtables() ) + return; + + vcpu_set_stack_mappings(current, cpu, false); +} + +void arch_smp_post_callfunc(unsigned int cpu) +{ + if ( cpu == smp_processor_id() || !current->domain->arch.cpu_stack || + /* EFI page-tables have all pCPU stacks mapped. */ + efi_rs_using_pgtables() ) + return; + + ASSERT(current == this_cpu(curr_vcpu)); + destroy_perdomain_mapping(current, PCPU_STACK_VIRT(cpu), + (1U << STACK_ORDER)); + + flush_area_local((void *)PCPU_STACK_VIRT(cpu), FLUSH_ORDER(STACK_ORDER)); +} diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index a740a6402272..515ab3cb9c75 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -582,7 +582,21 @@ static int do_boot_cpu(int apicid, int cpu) printk("Booting processor %d/%d eip %lx\n", cpu, apicid, start_eip); - stack_start = stack_base[cpu] + STACK_SIZE - sizeof(struct cpu_info); + if ( opt_cpu_stack_hvm || opt_cpu_stack_pv ) + { + /* + * Uniformly run with the stack mappings in the per-domain area if ASI + * is enabled for any domain type. + */ + vcpu_set_stack_mappings(idle_vcpu[cpu], cpu, true); + + ASSERT(IS_ALIGNED(PCPU_STACK_VIRT(cpu), STACK_SIZE)); + + stack_start = (void *)PCPU_STACK_VIRT(cpu) + STACK_SIZE - + sizeof(struct cpu_info); + } + else + stack_start = stack_base[cpu] + STACK_SIZE - sizeof(struct cpu_info); /* This grunge runs the startup process for the targeted processor. 
*/ @@ -1030,7 +1044,7 @@ void *cpu_alloc_stack(unsigned int cpu) stack = alloc_xenheap_pages(STACK_ORDER, memflags); if ( stack ) - memguard_guard_stack(stack); + memguard_guard_stack(stack, cpu); return stack; } @@ -1146,6 +1160,8 @@ static struct notifier_block cpu_smpboot_nfb = { void __init smp_prepare_cpus(void) { + BUILD_BUG_ON(PCPU_STACK_VIRT(CONFIG_NR_CPUS) > PCPU_STACK_VIRT_END); + register_cpu_notifier(&cpu_smpboot_nfb); mtrr_aps_sync_begin(); diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c index 9463a8624701..4f1e912f8057 100644 --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -89,6 +89,10 @@ bool __ro_after_init opt_bp_spec_reduce = true; int8_t __ro_after_init opt_vcpu_pt_hvm = -1; int8_t __ro_after_init opt_vcpu_pt_hwdom = -1; int8_t __ro_after_init opt_vcpu_pt_pv = -1; +/* Per-CPU stacks. */ +int8_t __ro_after_init opt_cpu_stack_hvm = -1; +int8_t __ro_after_init opt_cpu_stack_hwdom = -1; +int8_t __ro_after_init opt_cpu_stack_pv = -1; static int __init cf_check parse_spec_ctrl(const char *s) { @@ -395,6 +399,7 @@ static __init void xpti_init_default(void) printk(XENLOG_ERR "XPTI incompatible with per-vCPU page-tables, disabling ASI\n"); opt_vcpu_pt_pv = 0; + opt_cpu_stack_pv = 0; } if ( (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON)) || cpu_has_rdcl_no ) @@ -507,7 +512,10 @@ static int __init cf_check parse_asi(const char *s) /* Interpret 'asi' alone in its positive boolean form. */ if ( *s == '\0' ) + { opt_vcpu_pt_pv = opt_vcpu_pt_hwdom = opt_vcpu_pt_hvm = 1; + opt_cpu_stack_pv = opt_cpu_stack_hwdom = opt_cpu_stack_hvm = 1; + } do { ss = strchr(s, ','); @@ -520,13 +528,14 @@ case 0: case 1: opt_vcpu_pt_pv = opt_vcpu_pt_hwdom = opt_vcpu_pt_hvm = val; + opt_cpu_stack_pv = opt_cpu_stack_hvm = opt_cpu_stack_hwdom = val; break; default: if ( (val = parse_boolean("pv", s, ss)) >= 0 ) - opt_vcpu_pt_pv = val; + opt_cpu_stack_pv = opt_vcpu_pt_pv = val; else if ( (val = parse_boolean("hvm", s, ss)) >= 0 ) - opt_vcpu_pt_hvm = val; + opt_cpu_stack_hvm = opt_vcpu_pt_hvm = val; else if ( (val = parse_boolean("vcpu-pt", s, ss)) != -1 ) { switch ( val ) @@ -548,6 +557,28 @@ break; } } + else if ( (val = parse_boolean("cpu-stack", s, ss)) != -1 ) + { + switch ( val ) + { + case 1: + case 0: + opt_cpu_stack_pv = opt_cpu_stack_hvm = + opt_cpu_stack_hwdom = val; + break; + + case -2: + s += strlen("cpu-stack="); + if ( (val = parse_boolean("pv", s, ss)) >= 0 ) + opt_cpu_stack_pv = val; + else if ( (val = parse_boolean("hvm", s, ss)) >= 0 ) + opt_cpu_stack_hvm = val; + else + default: + rc = -EINVAL; + break; + } + } else if ( *s ) rc = -EINVAL; break; @@ -556,6 +587,14 @@ s = ss + 1; } while ( *ss ); + /* Per-CPU stacks depend on per-vCPU mappings. */ + if ( opt_cpu_stack_pv == 1 ) + opt_vcpu_pt_pv = 1; + if ( opt_cpu_stack_hvm == 1 ) + opt_vcpu_pt_hvm = 1; + if ( opt_cpu_stack_hwdom == 1 ) + opt_vcpu_pt_hwdom = 1; + return rc; } custom_param("asi", parse_asi);
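For illustration, the accepted forms then look like this (my reading of parse_asi() above, not normative documentation):

    asi                  # vcpu-pt and cpu-stack for PV, HVM and the hardware domain
    asi=hvm              # both features for HVM guests only
    asi=cpu-stack=hvm    # per-CPU stacks for HVM only; the fixup above then also
                         # enables vcpu-pt for HVM, as cpu-stack depends on it
    asi=no-cpu-stack     # keep any vcpu-pt selection, disable per-CPU stacks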
" CPU-STACK" : ""); #endif #ifdef CONFIG_PV - printk(" ASI features for PV VMs:%s%s\n", - opt_vcpu_pt_pv ? "" : " None", - opt_vcpu_pt_pv ? " vCPU-PT" : ""); - + printk(" ASI features for PV VMs:%s%s%s\n", + opt_vcpu_pt_pv || opt_cpu_stack_pv ? "" : " None", + opt_vcpu_pt_pv ? " vCPU-PT" : "", + opt_cpu_stack_pv ? " CPU-STACK" : ""); #endif } @@ -1869,6 +1909,9 @@ void spec_ctrl_init_domain(struct domain *d) d->arch.vcpu_pt = is_hardware_domain(d) ? opt_vcpu_pt_hwdom : pv ? opt_vcpu_pt_pv : opt_vcpu_pt_hvm; + d->arch.cpu_stack = is_hardware_domain(d) ? opt_cpu_stack_hwdom + : pv ? opt_cpu_stack_pv + : opt_cpu_stack_hvm; } void __init init_speculation_mitigations(void) @@ -2172,6 +2215,12 @@ void __init init_speculation_mitigations(void) opt_vcpu_pt_hwdom = 0; if ( opt_vcpu_pt_hvm == -1 ) opt_vcpu_pt_hvm = 0; + if ( opt_cpu_stack_pv == -1 ) + opt_cpu_stack_pv = 0; + if ( opt_cpu_stack_hwdom == -1 ) + opt_cpu_stack_hwdom = 0; + if ( opt_cpu_stack_hvm == -1 ) + opt_cpu_stack_hvm = 0; if ( opt_vcpu_pt_pv || opt_vcpu_pt_hvm ) warning_add( diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index a7f6fb611c34..c80ef2268e94 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -74,6 +74,7 @@ #include #include #include +#include /* * opt_nmi: one of 'ignore', 'dom0', or 'fatal'. @@ -609,10 +610,13 @@ void show_stack_overflow(unsigned int cpu, const struct cpu_user_regs *regs) unsigned long esp = regs->rsp; unsigned long curr_stack_base = esp & ~(STACK_SIZE - 1); unsigned long esp_top, esp_bottom; + const void *stack = + (opt_cpu_stack_hvm || opt_cpu_stack_pv) ? (void *)PCPU_STACK_VIRT(cpu) + : stack_base[cpu]; - if ( _p(curr_stack_base) != stack_base[cpu] ) + if ( _p(curr_stack_base) != stack ) printk("Current stack base %p differs from expected %p\n", - _p(curr_stack_base), stack_base[cpu]); + _p(curr_stack_base), stack); esp_bottom = (esp | (STACK_SIZE - 1)) + 1; esp_top = esp_bottom - PRIMARY_STACK_SIZE; diff --git a/xen/common/smp.c b/xen/common/smp.c index a011f541f1ea..04f5aede0d3d 100644 --- a/xen/common/smp.c +++ b/xen/common/smp.c @@ -29,6 +29,7 @@ static struct call_data_struct { void (*func) (void *info); void *info; int wait; + unsigned int caller; cpumask_t selected; } call_data; @@ -63,6 +64,7 @@ void on_selected_cpus( call_data.func = func; call_data.info = info; call_data.wait = wait; + call_data.caller = smp_processor_id(); smp_send_call_function_mask(&call_data.selected); @@ -82,6 +84,12 @@ void smp_call_function_interrupt(void) if ( !cpumask_test_cpu(cpu, &call_data.selected) ) return; + /* + * TODO: use bounce buffers to pass callfunc data, so that when using ASI + * there's no need to map remote CPU stacks. 
+ */ + arch_smp_pre_callfunc(call_data.caller); + irq_enter(); if ( unlikely(!func) ) @@ -102,6 +110,8 @@ void smp_call_function_interrupt(void) } irq_exit(); + + arch_smp_post_callfunc(call_data.caller); } /* diff --git a/xen/common/stop_machine.c b/xen/common/stop_machine.c index 398cfd507c10..142059c36374 100644 --- a/xen/common/stop_machine.c +++ b/xen/common/stop_machine.c @@ -40,6 +40,7 @@ enum stopmachine_state { struct stopmachine_data { unsigned int nr_cpus; + unsigned int caller; enum stopmachine_state state; atomic_t done; @@ -104,6 +105,7 @@ int stop_machine_run(int (*fn)(void *data), void *data, unsigned int cpu) stopmachine_data.fn_result = 0; atomic_set(&stopmachine_data.done, 0); stopmachine_data.state = STOPMACHINE_START; + stopmachine_data.caller = this; smp_wmb(); @@ -148,6 +150,12 @@ static void cf_check stopmachine_action(void *data) BUG_ON(cpu != smp_processor_id()); + /* + * TODO: use bounce buffers to pass callfunc data, so that when using ASI + * there's no need to map remote CPU stacks. + */ + arch_smp_pre_callfunc(stopmachine_data.caller); + smp_mb(); while ( state != STOPMACHINE_EXIT ) @@ -180,6 +188,8 @@ static void cf_check stopmachine_action(void *data) } local_irq_enable(); + + arch_smp_post_callfunc(stopmachine_data.caller); } static int cf_check cpu_callback( diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h index 2ca9ff1bfcc1..a25d47e29dce 100644 --- a/xen/include/xen/smp.h +++ b/xen/include/xen/smp.h @@ -76,4 +76,12 @@ extern void *stack_base[NR_CPUS]; void initialize_cpu_data(unsigned int cpu); int setup_cpu_root_pgt(unsigned int cpu); +#ifdef HAS_ARCH_SMP_CALLFUNC_PREAMBLE +void arch_smp_pre_callfunc(unsigned int cpu); +void arch_smp_post_callfunc(unsigned int cpu); +#else +static inline void arch_smp_pre_callfunc(unsigned int cpu) {} +static inline void arch_smp_post_callfunc(unsigned int cpu) {} +#endif + #endif /* __XEN_SMP_H__ */ From patchwork Wed Jan 8 14:26:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13931077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B46C6E77188 for ; Wed, 8 Jan 2025 14:43:32 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.867406.1278937 (Exim 4.92) (envelope-from ) id 1tVXHI-00074t-73; Wed, 08 Jan 2025 14:43:24 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 867406.1278937; Wed, 08 Jan 2025 14:43:24 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tVXHI-00074m-42; Wed, 08 Jan 2025 14:43:24 +0000 Received: by outflank-mailman (input) for mailman id 867406; Wed, 08 Jan 2025 14:43:23 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1tVX57-0005q4-CO for xen-devel@lists.xenproject.org; Wed, 08 Jan 2025 14:30:49 +0000 Received: from mail-ed1-x531.google.com (mail-ed1-x531.google.com [2a00:1450:4864:20::531]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 
From patchwork Wed Jan 8 14:26:58 2025 X-Patchwork-Submitter: =?utf-8?q?Roger_Pau_Monn=C3=A9?= X-Patchwork-Id: 13931077 From: Roger Pau Monne To: xen-devel@lists.xenproject.org Cc: Roger Pau Monne , Andrew Cooper , Anthony PERARD , Michal Orzel , Jan Beulich , Julien Grall , Stefano Stabellini Subject: [PATCH v2 18/18] x86/mm: zero stack on context switch Date: Wed, 8 Jan 2025 15:26:58 +0100 Message-ID: <20250108142659.99490-19-roger.pau@citrix.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20250108142659.99490-1-roger.pau@citrix.com> References: <20250108142659.99490-1-roger.pau@citrix.com> With the stack mapped on a per-CPU basis there's no risk of other CPUs being able to read the stack contents, but vCPUs running on the current pCPU could read stack rubble from operations of previous vCPUs. The #DF stack is not zeroed because handling of #DF results in a panic. The contents of the shadow stack are not cleared as part of this change. It's arguable that leaking internal Xen return addresses is not guest confidential data. At most those could be used by an attacker to figure out the paths inside of Xen that previous execution flows have used.
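For reference, the region wiped by the ZERO_STACK addition below can be expressed as the following C sketch (illustrative only: the real zeroing has to be issued as "rep stosb" from the inline assembly in switch_stack_and_jump(), since it targets the very stack Xen is running on and hence cannot live in a plain C helper):

    /* Illustrative: the primary-stack range zeroed on a full context switch.
     * The struct cpu_info at the top of the stack (which holds the saved
     * guest registers) is deliberately left intact. */
    static void zero_primary_stack_sketch(void)
    {
        void *base = (void *)guest_cpu_user_regs() - PRIMARY_STACK_SIZE +
                     sizeof(struct cpu_info);

        /* Mirrors the "D" (destination), "c" (count) and "a" (zero) operands
         * fed to "rep stosb" by the patch. */
        memset(base, 0, PRIMARY_STACK_SIZE - sizeof(struct cpu_info));
    }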
Signed-off-by: Roger Pau Monné --- Is it required to zero the stack when doing a non-lazy context switch from the idle vCPU to the previously running vCPU? d0v0 -> IDLE -> sync_execstate -> zero stack? -> d0v0 This is currently done in this proposal, as when running in the idle vCPU context (iow: not lazy switched) stacks from remote pCPUs can be mapped or tasklets executed. --- Changes since v1: - Zero the stack forward to use ERMS. - Only zero the IST stacks if they have been used. - Only zero the primary stack for full context switches. --- docs/misc/xen-command-line.pandoc | 4 +- xen/arch/x86/cpu/mcheck/mce.c | 4 ++ xen/arch/x86/domain.c | 13 ++++++- xen/arch/x86/include/asm/current.h | 53 +++++++++++++++++++++++--- xen/arch/x86/include/asm/domain.h | 3 ++ xen/arch/x86/include/asm/spec_ctrl.h | 1 + xen/arch/x86/spec_ctrl.c | 57 ++++++++++++++++++++++++---- xen/arch/x86/traps.c | 5 +++ 8 files changed, 124 insertions(+), 16 deletions(-) diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc index e7828d092098..9cde9e84aff2 100644 --- a/docs/misc/xen-command-line.pandoc +++ b/docs/misc/xen-command-line.pandoc @@ -204,7 +204,7 @@ to appropriate auditing by Xen. Argo is disabled by default. ### asi (x86) > `= List of [ , {pv,hvm}=, - {vcpu-pt,cpu-stack}=|{pv,hvm}= ]` + {vcpu-pt,cpu-stack,zero-stack}=|{pv,hvm}= ]` Offers control over whether the hypervisor will engage in Address Space Isolation, by not having potentially sensitive information permanently mapped @@ -224,6 +224,8 @@ meant to be used for debugging purposes only.** * `cpu-stack` prevent CPUs from having permanent mappings of stacks different than their own. Depends on the `vcpu-pt` option. +* `zero-stack` zero CPU stacks when context switching vCPUs. + ### asid (x86) > `= ` diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c index 9028ccde5477..eaaaefe7f8ba 100644 --- a/xen/arch/x86/cpu/mcheck/mce.c +++ b/xen/arch/x86/cpu/mcheck/mce.c @@ -92,10 +92,14 @@ struct mce_callbacks __ro_after_init mce_callbacks = { static const typeof(mce_callbacks.handler) __initconst_cf_clobber __used default_handler = unexpected_machine_check; +DEFINE_PER_CPU(unsigned int, slice_mce_count); + /* Call the installed machine check handler for this CPU setup. */ void do_machine_check(const struct cpu_user_regs *regs) { + this_cpu(slice_mce_count)++; + mce_enter(); alternative_vcall(mce_callbacks.handler, regs); mce_exit(); diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index ac6332266e95..1ff9200eb081 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -2106,6 +2106,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) struct cpu_info *info = get_cpu_info(); const struct domain *prevd = prev->domain, *nextd = next->domain; unsigned int dirty_cpu = read_atomic(&next->dirty_cpu); + bool lazy = false; ASSERT(prev != next); ASSERT(local_irq_is_enabled()); @@ -2138,6 +2139,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next) */ set_current(next); local_irq_enable(); + lazy = true; } else { @@ -2212,12 +2214,19 @@ void context_switch(struct vcpu *prev, struct vcpu *next) /* Ensure that the vcpu has an up-to-date time base.
*/ update_vcpu_system_time(next); - reset_stack_and_call_ind(nextd->arch.ctxt_switch->tail); + /* + * Context switches to the idle vCPU (either lazy or full) will never + * trigger zeroing of the stack, because the idle domain doesn't have ASI + * enabled. Switching back to the previously running vCPU after a lazy + * switch shouldn't zero the stack either. + */ + reset_stack_and_call_ind(nextd->arch.ctxt_switch->tail, + !lazy && nextd->arch.zero_stack); } void continue_running(struct vcpu *same) { - reset_stack_and_call_ind(same->domain->arch.ctxt_switch->tail); + reset_stack_and_call_ind(same->domain->arch.ctxt_switch->tail, false); } int __sync_local_execstate(void) diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h index 4a9776f87a7a..9abb4e55aeea 100644 --- a/xen/arch/x86/include/asm/current.h +++ b/xen/arch/x86/include/asm/current.h @@ -170,6 +170,12 @@ unsigned long get_stack_dump_bottom (unsigned long sp); # define SHADOW_STACK_WORK "" #endif +#define ZERO_STACK \ + "test %[stk_size], %[stk_size];" \ + "jz .L_skip_zeroing.%=;" \ + "rep stosb;" \ + ".L_skip_zeroing.%=:" + #if __GNUC__ >= 9 # define ssaj_has_attr_noreturn(fn) __builtin_has_attribute(fn, __noreturn__) #else @@ -177,13 +183,43 @@ unsigned long get_stack_dump_bottom (unsigned long sp); # define ssaj_has_attr_noreturn(fn) true #endif -#define switch_stack_and_jump(fn, instr, constr) \ +DECLARE_PER_CPU(unsigned int, slice_mce_count); +DECLARE_PER_CPU(unsigned int, slice_nmi_count); +DECLARE_PER_CPU(unsigned int, slice_db_count); + +#define switch_stack_and_jump(fn, instr, constr, zero_stk) \ ({ \ unsigned int tmp; \ + \ BUILD_BUG_ON(!ssaj_has_attr_noreturn(fn)); \ + ASSERT(IS_ALIGNED((unsigned long)guest_cpu_user_regs() - \ + PRIMARY_STACK_SIZE + \ + sizeof(struct cpu_info), PAGE_SIZE)); \ + if ( zero_stk ) \ + { \ + unsigned long stack_top = get_stack_bottom() & \ + ~(STACK_SIZE - 1); \ + \ + if ( this_cpu(slice_mce_count) ) \ + { \ + this_cpu(slice_mce_count) = 0; \ + clear_page((void *)stack_top + IST_MCE * PAGE_SIZE); \ + } \ + if ( this_cpu(slice_nmi_count) ) \ + { \ + this_cpu(slice_nmi_count) = 0; \ + clear_page((void *)stack_top + IST_NMI * PAGE_SIZE); \ + } \ + if ( this_cpu(slice_db_count) ) \ + { \ + this_cpu(slice_db_count) = 0; \ + clear_page((void *)stack_top + IST_DB * PAGE_SIZE); \ + } \ + } \ __asm__ __volatile__ ( \ SHADOW_STACK_WORK \ "mov %[stk], %%rsp;" \ + ZERO_STACK \ CHECK_FOR_LIVEPATCH_WORK \ instr "[fun]" \ : [val] "=&r" (tmp), \ @@ -194,19 +230,26 @@ unsigned long get_stack_dump_bottom (unsigned long sp); ((PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8), \ [stack_mask] "i" (STACK_SIZE - 1), \ _ASM_BUGFRAME_INFO(BUGFRAME_bug, __LINE__, \ - __FILE__, NULL) \ + __FILE__, NULL), \ + /* For stack zeroing. */ \ + "D" ((void *)guest_cpu_user_regs() - \ + PRIMARY_STACK_SIZE + sizeof(struct cpu_info)), \ + [stk_size] "c" \ + ((zero_stk) ? PRIMARY_STACK_SIZE - sizeof(struct cpu_info)\ + : 0), \ + "a" (0) \ : "memory" ); \ unreachable(); \ }) #define reset_stack_and_jump(fn) \ - switch_stack_and_jump(fn, "jmp %c", "i") + switch_stack_and_jump(fn, "jmp %c", "i", false) /* The constraint may only specify non-call-clobbered registers. 
*/ -#define reset_stack_and_call_ind(fn) \ +#define reset_stack_and_call_ind(fn, zero_stk) \ ({ \ (void)((fn) == (void (*)(void))NULL); \ - switch_stack_and_jump(fn, "INDIRECT_CALL %", "b"); \ + switch_stack_and_jump(fn, "INDIRECT_CALL %", "b", zero_stk); \ }) /* diff --git a/xen/arch/x86/include/asm/domain.h b/xen/arch/x86/include/asm/domain.h index f83d2860c0b4..c2cbd73a42b4 100644 --- a/xen/arch/x86/include/asm/domain.h +++ b/xen/arch/x86/include/asm/domain.h @@ -468,6 +468,9 @@ struct arch_domain /* Use per-CPU mapped stacks. */ bool cpu_stack; + /* Zero CPU stack on non-lazy context switch. */ + bool zero_stack; + /* Emulated devices enabled bitmap. */ uint32_t emulation_flags; } __cacheline_aligned; diff --git a/xen/arch/x86/include/asm/spec_ctrl.h b/xen/arch/x86/include/asm/spec_ctrl.h index c8943e81befa..c335c5eca35d 100644 --- a/xen/arch/x86/include/asm/spec_ctrl.h +++ b/xen/arch/x86/include/asm/spec_ctrl.h @@ -90,6 +90,7 @@ extern int8_t opt_xpti_hwdom, opt_xpti_domu; extern int8_t opt_vcpu_pt_pv, opt_vcpu_pt_hwdom, opt_vcpu_pt_hvm; extern int8_t opt_cpu_stack_pv, opt_cpu_stack_hwdom, opt_cpu_stack_hvm; +extern int8_t opt_zero_stack_pv, opt_zero_stack_hwdom, opt_zero_stack_hvm; extern bool cpu_has_bug_l1tf; extern int8_t opt_pv_l1tf_hwdom, opt_pv_l1tf_domu; diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c index 4f1e912f8057..edae4b802e67 100644 --- a/xen/arch/x86/spec_ctrl.c +++ b/xen/arch/x86/spec_ctrl.c @@ -93,6 +93,10 @@ int8_t __ro_after_init opt_vcpu_pt_pv = -1; int8_t __ro_after_init opt_cpu_stack_hvm = -1; int8_t __ro_after_init opt_cpu_stack_hwdom = -1; int8_t __ro_after_init opt_cpu_stack_pv = -1; +/* Zero CPU stacks. */ +int8_t __ro_after_init opt_zero_stack_hvm = -1; +int8_t __ro_after_init opt_zero_stack_hwdom = -1; +int8_t __ro_after_init opt_zero_stack_pv = -1; static int __init cf_check parse_spec_ctrl(const char *s) { @@ -515,6 +519,7 @@ static int __init cf_check parse_asi(const char *s) { opt_vcpu_pt_pv = opt_vcpu_pt_hwdom = opt_vcpu_pt_hvm = 1; opt_cpu_stack_pv = opt_cpu_stack_hwdom = opt_cpu_stack_hvm = 1; + opt_zero_stack_pv = opt_zero_stack_hvm = opt_zero_stack_hwdom = 1; } do { @@ -529,13 +534,14 @@ static int __init cf_check parse_asi(const char *s) case 1: opt_vcpu_pt_pv = opt_vcpu_pt_hwdom = opt_vcpu_pt_hvm = val; opt_cpu_stack_pv = opt_cpu_stack_hvm = opt_cpu_stack_hwdom = val; + opt_zero_stack_pv = opt_zero_stack_hvm = opt_zero_stack_hwdom = val; break; default: if ( (val = parse_boolean("pv", s, ss)) >= 0 ) - opt_cpu_stack_pv = opt_vcpu_pt_pv = val; + opt_zero_stack_pv = opt_cpu_stack_pv = opt_vcpu_pt_pv = val; else if ( (val = parse_boolean("hvm", s, ss)) >= 0 ) - opt_cpu_stack_hvm = opt_vcpu_pt_hvm = val; + opt_zero_stack_hvm = opt_cpu_stack_hvm = opt_vcpu_pt_hvm = val; else if ( (val = parse_boolean("vcpu-pt", s, ss)) != -1 ) { switch ( val ) @@ -579,6 +585,28 @@ static int __init cf_check parse_asi(const char *s) break; } } + else if ( (val = parse_boolean("zero-stack", s, ss)) != -1 ) + { + switch ( val ) + { + case 1: + case 0: + opt_zero_stack_pv = opt_zero_stack_hvm = + opt_zero_stack_hwdom = val; + break; + + case -2: + s += strlen("zero-stack="); + if ( (val = parse_boolean("pv", s, ss)) >= 0 ) + opt_zero_stack_pv = val; + else if ( (val = parse_boolean("hvm", s, ss)) >= 0 ) + opt_zero_stack_hvm = val; + else + default: + rc = -EINVAL; + break; + } + } else if ( *s ) rc = -EINVAL; break; @@ -791,17 +819,21 @@ static void __init print_details(enum ind_thunk thunk) #endif #ifdef CONFIG_HVM - printk(" ASI features for HVM
VMs:%s%s%s\n", - opt_vcpu_pt_hvm || opt_cpu_stack_hvm ? "" : " None", + printk(" ASI features for HVM VMs:%s%s%s%s\n", + opt_vcpu_pt_hvm || opt_cpu_stack_hvm || + opt_zero_stack_hvm ? "" : " None", opt_vcpu_pt_hvm ? " vCPU-PT" : "", - opt_cpu_stack_hvm ? " CPU-STACK" : ""); + opt_cpu_stack_hvm ? " CPU-STACK" : "", + opt_zero_stack_hvm ? " ZERO-STACK" : ""); #endif #ifdef CONFIG_PV - printk(" ASI features for PV VMs:%s%s%s\n", - opt_vcpu_pt_pv || opt_cpu_stack_pv ? "" : " None", + printk(" ASI features for PV VMs:%s%s%s%s\n", + opt_vcpu_pt_pv || opt_cpu_stack_pv || + opt_zero_stack_pv ? "" : " None", opt_vcpu_pt_pv ? " vCPU-PT" : "", - opt_cpu_stack_pv ? " CPU-STACK" : ""); + opt_cpu_stack_pv ? " CPU-STACK" : "", + opt_zero_stack_pv ? " ZERO-STACK" : ""); #endif } @@ -1912,6 +1944,9 @@ void spec_ctrl_init_domain(struct domain *d) d->arch.cpu_stack = is_hardware_domain(d) ? opt_cpu_stack_hwdom : pv ? opt_cpu_stack_pv : opt_cpu_stack_hvm; + d->arch.zero_stack = is_hardware_domain(d) ? opt_zero_stack_hwdom + : pv ? opt_zero_stack_pv + : opt_zero_stack_hvm; } void __init init_speculation_mitigations(void) @@ -2221,6 +2256,12 @@ void __init init_speculation_mitigations(void) opt_cpu_stack_hwdom = 0; if ( opt_cpu_stack_hvm == -1 ) opt_cpu_stack_hvm = 0; + if ( opt_zero_stack_pv == -1 ) + opt_zero_stack_pv = 0; + if ( opt_zero_stack_hwdom == -1 ) + opt_zero_stack_hwdom = 0; + if ( opt_zero_stack_hvm == -1 ) + opt_zero_stack_hvm = 0; if ( opt_vcpu_pt_pv || opt_vcpu_pt_hvm ) warning_add( diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index c80ef2268e94..2aa53550e8e6 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -1792,6 +1792,7 @@ static void unknown_nmi_error(const struct cpu_user_regs *regs, static nmi_callback_t *__read_mostly nmi_callback; DEFINE_PER_CPU(unsigned int, nmi_count); +DEFINE_PER_CPU(unsigned int, slice_nmi_count); void do_nmi(const struct cpu_user_regs *regs) { @@ -1801,6 +1802,7 @@ void do_nmi(const struct cpu_user_regs *regs) bool handle_unknown = false; this_cpu(nmi_count)++; + this_cpu(slice_nmi_count)++; nmi_enter(); /* @@ -1919,6 +1921,8 @@ void asmlinkage do_device_not_available(struct cpu_user_regs *regs) void nocall sysenter_eflags_saved(void); +DEFINE_PER_CPU(unsigned int, slice_db_count); + void asmlinkage do_debug(struct cpu_user_regs *regs) { unsigned long dr6; @@ -1927,6 +1931,7 @@ void asmlinkage do_debug(struct cpu_user_regs *regs) /* Stash dr6 as early as possible. */ dr6 = read_debugreg(6); + this_cpu(slice_db_count)++; /* * At the time of writing (March 2018), on the subject of %dr6: *