From patchwork Fri Jan 10 18:40:46 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brendan Jackman
X-Patchwork-Id: 13935249
Date: Fri, 10 Jan 2025 18:40:46 +0000
In-Reply-To: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
References: <20250110-asi-rfc-v2-v2-0-8419288bc805@google.com>
X-Mailer: b4 0.15-dev
Message-ID: <20250110-asi-rfc-v2-v2-20-8419288bc805@google.com>
Subject: [PATCH RFC v2 20/29] mm: asi: Make TLB flushing correct under ASI
From: Brendan Jackman
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Richard Henderson,
 Matt Turner, Vineet Gupta, Russell King, Catalin Marinas, Will Deacon,
 Guo Ren, Brian Cain, Huacai Chen, WANG Xuerui, Geert Uytterhoeven,
 Michal Simek, Thomas Bogendoerfer, Dinh Nguyen, Jonas Bonn,
 Stefan Kristiansson, Stafford Horne, "James E.J. Bottomley", Helge Deller,
 Michael Ellerman, Nicholas Piggin, Christophe Leroy, Naveen N Rao,
 Madhavan Srinivasan, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Christian Borntraeger,
 Sven Schnelle, Yoshinori Sato, Rich Felker, John Paul Adrian Glaubitz,
 "David S. Miller", Andreas Larsson, Richard Weinberger, Anton Ivanov,
 Johannes Berg, Chris Zankel, Max Filippov, Arnd Bergmann, Andrew Morton,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Uladzislau Rezki, Christoph Hellwig,
 Masami Hiramatsu, Mathieu Desnoyers, Mike Rapoport,
 Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin,
 Jiri Olsa, Ian Rogers, Adrian Hunter, Dennis Zhou, Tejun Heo,
 Christoph Lameter, Sean Christopherson, Paolo Bonzini, Ard Biesheuvel,
 Josh Poimboeuf, Pawan Gupta
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
 linux-alpha@vger.kernel.org, linux-snps-arc@lists.infradead.org,
 linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
 linux-hexagon@vger.kernel.org, loongarch@lists.linux.dev,
 linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org,
 linux-openrisc@vger.kernel.org, linux-parisc@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
 sparclinux@vger.kernel.org, linux-um@lists.infradead.org,
 linux-arch@vger.kernel.org, linux-mm@kvack.org,
 linux-trace-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 kvm@vger.kernel.org, linux-efi@vger.kernel.org, Brendan Jackman

This is the absolute minimum change for TLB flushing to be correct under ASI.
There are two arguably orthogonal changes in here but they feel small
enough for a single commit.

.:: CR3 stabilization

As noted in the comment, ASI can destabilize CR3, but we can stabilize it
again by calling asi_exit(), which makes it safe to read CR3 and write it
back.

This is enough to be correct - we don't have to worry about invalidating
the other ASI address space (i.e. we don't need to invalidate the
restricted address space if we are currently unrestricted / vice versa)
because we currently never set the noflush bit in CR3 for ASI
transitions.

Even without using CR3's noflush bit there are trivial optimizations
still on the table here: where invpcid_flush_single_context is available
(i.e. with the INVPCID_SINGLE feature) we can use that in lieu of the
CR3 read/write, and avoid the extremely costly asi_exit().

.:: Invalidating kernel mappings

Before ASI, with KPTI off we always either disable PCID or use global
mappings for kernel memory. However, ASI disables global kernel mappings
regardless of those factors. So we need to invalidate other address
spaces to trigger a flush when we switch into them.

Note that there is currently a pointless write of
cpu_tlbstate.invalidate_other in the case of KPTI and !PCID. We've added
another case of that (ASI, !KPTI and !PCID). I think that's preferable
to expanding the conditional in flush_tlb_one_kernel().

Signed-off-by: Brendan Jackman
---
 arch/x86/mm/tlb.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce5598f96ea7a84dc0e8623022ab5bfbba401b48..07b1657bee8e4cf17452ea57c838823e76f482c0 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -231,7 +231,7 @@ static void clear_asid_other(void)
 	 * This is only expected to be set if we have disabled
 	 * kernel _PAGE_GLOBAL pages.
 	 */
-	if (!static_cpu_has(X86_FEATURE_PTI)) {
+	if (!static_cpu_has(X86_FEATURE_PTI) && !static_asi_enabled()) {
 		WARN_ON_ONCE(1);
 		return;
 	}
@@ -1040,7 +1040,6 @@ static void put_flush_tlb_info(void)
 noinstr u16 asi_pcid(struct asi *asi, u16 asid)
 {
 	return kern_pcid(asid) | ((asi->class_id + 1) << X86_CR3_ASI_PCID_BITS_SHIFT);
-	// return kern_pcid(asid) | ((asi->index + 1) << X86_CR3_ASI_PCID_BITS_SHIFT);
 }
 
 void asi_flush_tlb_range(struct asi *asi, void *addr, size_t len)
@@ -1192,15 +1191,19 @@ void flush_tlb_one_kernel(unsigned long addr)
 	 * use PCID if we also use global PTEs for the kernel mapping, and
 	 * INVLPG flushes global translations across all address spaces.
 	 *
-	 * If PTI is on, then the kernel is mapped with non-global PTEs, and
-	 * __flush_tlb_one_user() will flush the given address for the current
-	 * kernel address space and for its usermode counterpart, but it does
-	 * not flush it for other address spaces.
+	 * If PTI or ASI is on, then the kernel is mapped with non-global PTEs,
+	 * and __flush_tlb_one_user() will flush the given address for the
+	 * current kernel address space and, if PTI is on, for its usermode
+	 * counterpart, but it does not flush it for other address spaces.
 	 */
 	flush_tlb_one_user(addr);
 
-	if (!static_cpu_has(X86_FEATURE_PTI))
+	/* Nothing more to do if PTI and ASI are completely off. */
+	if (!static_cpu_has(X86_FEATURE_PTI) && !static_asi_enabled()) {
+		VM_WARN_ON_ONCE(static_cpu_has(X86_FEATURE_PCID) &&
+				!(__default_kernel_pte_mask & _PAGE_GLOBAL));
 		return;
+	}
 
 	/*
 	 * See above. We need to propagate the flush to all other address
@@ -1289,6 +1292,16 @@ STATIC_NOPV void native_flush_tlb_local(void)
 
 	invalidate_user_asid(this_cpu_read(cpu_tlbstate.loaded_mm_asid));
 
+	/*
+	 * Restricted ASI CR3 is unstable outside of critical section, so we
+	 * couldn't flush via a CR3 read/write. asi_exit() stabilizes it.
+	 * We don't expect any flushes in a critical section.
+	 */
+	if (WARN_ON(asi_in_critical_section()))
+		native_flush_tlb_global();
+	else
+		asi_exit();
+
 	/* If current->mm == NULL then the read_cr3() "borrows" an mm */
 	native_write_cr3(__native_read_cr3());
 }