From patchwork Thu Feb 13 16:13:52 2025
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13973610
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
    dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
    nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
    linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
    jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
    Rik van Riel, Manali Shukla
Subject: [PATCH v11 01/12] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional
Date: Thu, 13 Feb 2025 11:13:52 -0500
Message-ID: <20250213161423.449435-2-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>

Currently x86 uses CONFIG_MMU_GATHER_TABLE_FREE when using
paravirt, and not when running on bare metal.
There is no real good reason to do things differently for each setup.
Make them all the same.

Currently get_user_pages_fast synchronizes against page table
freeing in two different ways:
- on bare metal, by blocking IRQs, which block TLB flush IPIs
- on paravirt, with MMU_GATHER_RCU_TABLE_FREE

This is done because some paravirt TLB flush implementations
handle the TLB flush in the hypervisor, and will do the flush
even when the target CPU has interrupts disabled.

Always handle page table freeing with MMU_GATHER_RCU_TABLE_FREE.
Using RCU synchronization between page table freeing and
get_user_pages_fast() allows bare metal to also do TLB flushing
while interrupts are disabled.

Various places in the mm do still block IRQs or disable preemption
as an implicit way to block RCU frees.

That makes it safe to use INVLPGB on AMD CPUs.

Signed-off-by: Rik van Riel
Suggested-by: Peter Zijlstra
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/Kconfig           |  2 +-
 arch/x86/kernel/paravirt.c | 17 +----------------
 arch/x86/mm/pgtable.c      | 27 ++++-----------------------
 3 files changed, 6 insertions(+), 40 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6df7779ed6da..aeb07da762fc 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -278,7 +278,7 @@ config X86
 	select HAVE_PCI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
-	select MMU_GATHER_RCU_TABLE_FREE	if PARAVIRT
+	select MMU_GATHER_RCU_TABLE_FREE
 	select MMU_GATHER_MERGE_VMAS
 	select HAVE_POSIX_CPU_TIMERS_TASK_WORK
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 1ccaa3397a67..527f5605aa3e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,21 +59,6 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }
 
-#ifndef CONFIG_PT_RECLAIM
-static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct ptdesc *ptdesc = (struct ptdesc *)table;
-
-	pagetable_dtor(ptdesc);
-	tlb_remove_page(tlb, ptdesc_page(ptdesc));
-}
-#else
-static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	tlb_remove_table(tlb, table);
-}
-#endif
-
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
 
@@ -195,7 +180,7 @@ struct paravirt_patch_template pv_ops = {
 	.mmu.flush_tlb_kernel = native_flush_tlb_global,
 	.mmu.flush_tlb_one_user = native_flush_tlb_one_user,
 	.mmu.flush_tlb_multi = native_flush_tlb_multi,
-	.mmu.tlb_remove_table = native_tlb_remove_table,
+	.mmu.tlb_remove_table = tlb_remove_table,
 
 	.mmu.exit_mmap = paravirt_nop,
 	.mmu.notify_page_enc_status_changed = paravirt_nop,
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 1fef5ad32d5a..b1c1f72c1fd1 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -18,25 +18,6 @@ EXPORT_SYMBOL(physical_mask);
 #define PGTABLE_HIGHMEM 0
 #endif
 
-#ifndef CONFIG_PARAVIRT
-#ifndef CONFIG_PT_RECLAIM
-static inline
-void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	struct ptdesc *ptdesc = (struct ptdesc *)table;
-
-	pagetable_dtor(ptdesc);
-	tlb_remove_page(tlb, ptdesc_page(ptdesc));
-}
-#else
-static inline
-void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	tlb_remove_table(tlb, table);
-}
-#endif /* !CONFIG_PT_RECLAIM */
-#endif /* !CONFIG_PARAVIRT */
-
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
 pgtable_t pte_alloc_one(struct mm_struct *mm)
@@ -64,7 +45,7 @@ early_param("userpte", setup_userpte);
 void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
 {
 	paravirt_release_pte(page_to_pfn(pte));
-	paravirt_tlb_remove_table(tlb, page_ptdesc(pte));
+	tlb_remove_table(tlb, page_ptdesc(pte));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 2
@@ -78,21 +59,21 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
 #ifdef CONFIG_X86_PAE
 	tlb->need_flush_all = 1;
 #endif
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(pmd));
+	tlb_remove_table(tlb, virt_to_ptdesc(pmd));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 3
 void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 {
 	paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(pud));
+	tlb_remove_table(tlb, virt_to_ptdesc(pud));
 }
 
 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
-	paravirt_tlb_remove_table(tlb, virt_to_ptdesc(p4d));
+	tlb_remove_table(tlb, virt_to_ptdesc(p4d));
 }
 #endif /* CONFIG_PGTABLE_LEVELS > 4 */
 #endif /* CONFIG_PGTABLE_LEVELS > 3 */
From patchwork Thu Feb 13 16:13:53 2025
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13973609
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
    dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
    nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
    linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
    jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
    Rik van Riel, Manali Shukla
Subject: [PATCH v11 02/12] x86/mm: remove pv_ops.mmu.tlb_remove_table call
Date: Thu, 13 Feb 2025 11:13:53 -0500
Message-ID: <20250213161423.449435-3-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
Every pv_ops.mmu.tlb_remove_table call ends up calling tlb_remove_table.
Get rid of the indirection by simply calling tlb_remove_table directly,
and not going through the paravirt function pointers.
Signed-off-by: Rik van Riel
Suggested-by: Qi Zheng
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/hyperv/mmu.c                 | 1 -
 arch/x86/include/asm/paravirt.h       | 5 -----
 arch/x86/include/asm/paravirt_types.h | 2 --
 arch/x86/kernel/kvm.c                 | 1 -
 arch/x86/kernel/paravirt.c            | 1 -
 arch/x86/xen/mmu_pv.c                 | 1 -
 6 files changed, 11 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index cc8c3bd0e7c2..1f7c3082a36d 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -239,5 +239,4 @@ void hyperv_setup_mmu_ops(void)
 	pr_info("Using hypercall for remote TLB flush\n");
 	pv_ops.mmu.flush_tlb_multi = hyperv_flush_tlb_multi;
-	pv_ops.mmu.tlb_remove_table = tlb_remove_table;
 }
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 041aff51eb50..38a632a282d4 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -91,11 +91,6 @@ static inline void __flush_tlb_multi(const struct cpumask *cpumask,
 	PVOP_VCALL2(mmu.flush_tlb_multi, cpumask, info);
 }
 
-static inline void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
-{
-	PVOP_VCALL2(mmu.tlb_remove_table, tlb, table);
-}
-
 static inline void paravirt_arch_exit_mmap(struct mm_struct *mm)
 {
 	PVOP_VCALL1(mmu.exit_mmap, mm);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index fea56b04f436..e26633c00455 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -134,8 +134,6 @@ struct pv_mmu_ops {
 	void (*flush_tlb_multi)(const struct cpumask *cpus,
 				const struct flush_tlb_info *info);
 
-	void (*tlb_remove_table)(struct mmu_gather *tlb, void *table);
-
 	/* Hook for intercepting the destruction of an mm_struct. */
 	void (*exit_mmap)(struct mm_struct *mm);
 	void (*notify_page_enc_status_changed)(unsigned long pfn, int npages, bool enc);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 7a422a6c5983..3be9b3342c67 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -838,7 +838,6 @@ static void __init kvm_guest_init(void)
 #ifdef CONFIG_SMP
 	if (pv_tlb_flush_supported()) {
 		pv_ops.mmu.flush_tlb_multi = kvm_flush_tlb_multi;
-		pv_ops.mmu.tlb_remove_table = tlb_remove_table;
 		pr_info("KVM setup pv remote TLB flush\n");
 	}
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 527f5605aa3e..2aa251d0b308 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -180,7 +180,6 @@ struct paravirt_patch_template pv_ops = {
 	.mmu.flush_tlb_kernel = native_flush_tlb_global,
 	.mmu.flush_tlb_one_user = native_flush_tlb_one_user,
 	.mmu.flush_tlb_multi = native_flush_tlb_multi,
-	.mmu.tlb_remove_table = tlb_remove_table,
 
 	.mmu.exit_mmap = paravirt_nop,
 	.mmu.notify_page_enc_status_changed = paravirt_nop,
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 2c70cd35e72c..a0b371557125 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -2141,7 +2141,6 @@ static const typeof(pv_ops) xen_mmu_ops __initconst = {
 	.flush_tlb_kernel = xen_flush_tlb,
 	.flush_tlb_one_user = xen_flush_tlb_one_user,
 	.flush_tlb_multi = xen_flush_tlb_multi,
-	.tlb_remove_table = tlb_remove_table,
 
 	.pgd_alloc = xen_pgd_alloc,
 	.pgd_free = xen_pgd_free,
From patchwork Thu Feb 13 16:13:54 2025
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13973652
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
    dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
    nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
    linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
    jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
    Rik van Riel, Dave Hansen
Subject: [PATCH v11 03/12] x86/mm: consolidate full flush threshold decision
Date: Thu, 13 Feb 2025 11:13:54 -0500
Message-ID: <20250213161423.449435-4-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>

Reduce code duplication by consolidating the decision point for
whether to do individual invalidations or a full flush inside
get_flush_tlb_info.
Signed-off-by: Rik van Riel
Suggested-by: Dave Hansen
Tested-by: Michael Kelley
Acked-by: Dave Hansen
Reviewed-by: Borislav Petkov (AMD)
---
 arch/x86/mm/tlb.c | 41 +++++++++++++++++++----------------------
 1 file changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ffc25b348041..924ac2263725 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1009,6 +1009,15 @@ static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
 	info->initiating_cpu	= smp_processor_id();
 	info->trim_cpumask	= 0;
 
+	/*
+	 * If the number of flushes is so large that a full flush
+	 * would be faster, do a full flush.
+	 */
+	if ((end - start) >> stride_shift > tlb_single_page_flush_ceiling) {
+		info->start = 0;
+		info->end = TLB_FLUSH_ALL;
+	}
+
 	return info;
 }
 
@@ -1026,17 +1035,8 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				bool freed_tables)
 {
 	struct flush_tlb_info *info;
+	int cpu = get_cpu();
 	u64 new_tlb_gen;
-	int cpu;
-
-	cpu = get_cpu();
-
-	/* Should we flush just the requested range? */
-	if ((end == TLB_FLUSH_ALL) ||
-	    ((end - start) >> stride_shift) > tlb_single_page_flush_ceiling) {
-		start = 0;
-		end = TLB_FLUSH_ALL;
-	}
 
 	/* This is also a barrier that synchronizes with switch_mm(). */
 	new_tlb_gen = inc_mm_tlb_gen(mm);
@@ -1089,22 +1089,19 @@ static void do_kernel_range_flush(void *info)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-	/* Balance as user space task's flush, a bit conservative */
-	if (end == TLB_FLUSH_ALL ||
-	    (end - start) > tlb_single_page_flush_ceiling << PAGE_SHIFT) {
-		on_each_cpu(do_flush_tlb_all, NULL, 1);
-	} else {
-		struct flush_tlb_info *info;
+	struct flush_tlb_info *info;
 
-		preempt_disable();
-		info = get_flush_tlb_info(NULL, start, end, 0, false,
-					  TLB_GENERATION_INVALID);
+	guard(preempt)();
+	info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false,
+				  TLB_GENERATION_INVALID);
+
+	if (info->end == TLB_FLUSH_ALL)
+		on_each_cpu(do_flush_tlb_all, NULL, 1);
+	else
 		on_each_cpu(do_kernel_range_flush, info, 1);
 
-		put_flush_tlb_info();
-		preempt_enable();
-	}
+	put_flush_tlb_info();
 }
 
 /*
[10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 6FD091A1216 for ; Thu, 13 Feb 2025 16:21:52 +0000 (UTC) X-FDA: 83115437664.23.670DA5F Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by imf09.hostedemail.com (Postfix) with ESMTP id DBC60140007 for ; Thu, 13 Feb 2025 16:21:49 +0000 (UTC) Authentication-Results: imf09.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf09.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1739463709; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OY+og+lXtFxyGMzRlXc+ZRJtW63uV5BPolJ+Ud40Hvo=; b=2yHBRqOfENyPPVVhfdd4ytsaM+SOsCZftVBMVX22iEtKhOv0JORmBahrgE0EBz1cSEH4ix xTyPAfx2faYYSpAaeUsHynAgqK0/gkiwDUBEQyTeVgY6D79WNuBYFi93NsamY2ZHzdZTZn mQ2ysodEQ2rVG56IGevHHwN1z/qqHG4= ARC-Authentication-Results: i=1; imf09.hostedemail.com; dkim=none; dmarc=none; spf=pass (imf09.hostedemail.com: domain of riel@shelob.surriel.com designates 96.67.55.147 as permitted sender) smtp.mailfrom=riel@shelob.surriel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1739463709; a=rsa-sha256; cv=none; b=W8fPQ4jQPjrJwq2X+Ie5kbKEXhRNHgRZJWny607zJXBSPb+wcD675XuclalOMzHYck0nJj TzyohhB9ODOIcl9ppwkOCvWuqOOKWTA7FrSPQySDbMgp2An9oxN1S8DweKzdGSbF3oVOx8 AOAqzzyLx1bOuuywGziDHwIZULluBwk= Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1tibr7-000000003xx-0gKp; Thu, 13 Feb 2025 11:14:25 -0500 From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, 
zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Rik van Riel , Manali Shukla Subject: [PATCH v11 04/12] x86/mm: get INVLPGB count max from CPUID Date: Thu, 13 Feb 2025 11:13:55 -0500 Message-ID: <20250213161423.449435-5-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250213161423.449435-1-riel@surriel.com> References: <20250213161423.449435-1-riel@surriel.com> MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: DBC60140007 X-Stat-Signature: mx6g6aiohydgn6k6kca595k13s58poib X-HE-Tag: 1739463709-332680 X-HE-Meta: U2FsdGVkX1+mBNLHWbwlz8yhSG20Lrd/shZjCd3NS0GDBlGTxLX9ZMGAdPX1QEFzvnM5Q8WXu1qeHltbbpxclCqfjwaSqMcnzbIXoFlAs7chuB+VcgULv0dEKDBiGAZN2exe3p5JsXdHv3++wzUJ5m2dd5T02EdKHX93IK8nPttmoAfDYL9dRrJA8AbSWowFOkafzqYchi2agmZLY1pD3MyUgH/nBMZ9hWkOvVoc+qrZf0brx0hkwO9PbXanr08GCguqDBtYSU8s6u5js53pE7hl5FMCpov65V5xQoDwwFtjJexed/MEktxPjAbkxZGzZi+f62G9sDiG+o3bZHqKt6sH49i05A7fbUGaU1/b0hTkw3R0FCWCx42GHEb2ebwoFjQ67wvpD+z9lmQ+E6hmZjqK0KJl6hJAiqLyL2G/4PylVu4/zWKu+nMIdcOClWDjFxwbyTQSRgTI++RMJWt0+6ULAtd1WLz73CZR4tBrSZVswfKDrNkicC0VRT1l9sXialw/zPNn5pB5vMD8n6IFmLax3jSPgFEFH/q0rHNRKwqPSxQa3lUUKQY4w6zBhevhDeHJUGTajcOqmYv/TryNchlrVPWbULPQAx0ktW6Jcz6GlzW4L2+jCymFOs+ClZ8DP5BryemS/YTUYHh4rm8z1DJDueRyC7nVOgeS8Q/x+1KioGGKrNJmDJv70ZNzsYU9LEdIqHBfwDWv4VjDfx50sn+rqG1R76SW3UetjOYTrxKTRmgyMnLdG1WnsxTK1njhemAyvU0jGZSGpV3btp41cdURp7fa+/F0hYm/QStGxVzDKXHNuyJ4Iv5WAHwREBBVBV9c8igC4DjxiCiSYI+lt3rzvdxSe9l8MvopDJ77/OqdusTWq92/oj0cb/SDpbWMNqK5cxgoxWaJiDeO6mpGNZqO/MZmKlWcocL3eEr8YmJ25bHKeCTtRPYVDmg5D7fWpkEwyLEhOEPUZxsEmWa dAQe6mWt 
The CPU advertises the maximum number of pages that can be shot down
with one INVLPGB instruction in the CPUID data. Save that information
for later use.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
Acked-by: Dave Hansen
---
 arch/x86/Kconfig.cpu               | 5 +++++
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/tlbflush.h    | 7 +++++++
 arch/x86/kernel/cpu/amd.c          | 8 ++++++++
 4 files changed, 21 insertions(+)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 2a7279d80460..abe013a1b076 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -395,6 +395,10 @@ config X86_VMX_FEATURE_NAMES
 	def_bool y
 	depends on IA32_FEAT_CTL
 
+config X86_BROADCAST_TLB_FLUSH
+	def_bool y
+	depends on CPU_SUP_AMD && 64BIT
+
 menuconfig PROCESSOR_SELECT
 	bool "Supported processor vendors" if EXPERT
 	help
@@ -431,6 +435,7 @@ config CPU_SUP_CYRIX_32
 config CPU_SUP_AMD
 	default y
 	bool "Support AMD processors" if PROCESSOR_SELECT
+	select X86_BROADCAST_TLB_FLUSH
 	help
 	  This enables detection, tunings and quirks for AMD processors

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 508c0dad116b..b5c66b7465ba 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -338,6 +338,7 @@
 #define X86_FEATURE_CLZERO		(13*32+ 0) /* "clzero" CLZERO instruction */
 #define X86_FEATURE_IRPERF		(13*32+ 1) /* "irperf" Instructions Retired Count */
 #define X86_FEATURE_XSAVEERPTR		(13*32+ 2) /* "xsaveerptr" Always save/restore FP error pointers */
+#define X86_FEATURE_INVLPGB		(13*32+ 3) /* INVLPGB and TLBSYNC instructions supported */
 #define X86_FEATURE_RDPRU		(13*32+ 4) /* "rdpru" Read processor register at user level */
 #define X86_FEATURE_WBNOINVD		(13*32+ 9) /* "wbnoinvd" WBNOINVD instruction */
 #define X86_FEATURE_AMD_IBPB		(13*32+12) /* Indirect Branch Prediction Barrier */

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 3da645139748..e026a5cc388e 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -183,6 +183,13 @@ static inline void cr4_init_shadow(void)
 extern unsigned long mmu_cr4_features;
 extern u32 *trampoline_cr4_features;
 
+/* How many pages can we invalidate with one INVLPGB. */
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+extern u16 invlpgb_count_max;
+#else
+#define invlpgb_count_max 1
+#endif
+
 extern void initialize_tlbstate_and_flush(void);
 
 /*

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 54194f5995de..3e8180354303 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -29,6 +29,8 @@
 
 #include "cpu.h"
 
+u16 invlpgb_count_max __ro_after_init;
+
 static inline int rdmsrl_amd_safe(unsigned msr, unsigned long long *p)
 {
 	u32 gprs[8] = { 0 };
@@ -1139,6 +1141,12 @@ static void cpu_detect_tlb_amd(struct cpuinfo_x86 *c)
 
 	tlb_lli_2m[ENTRIES] = eax & mask;
 	tlb_lli_4m[ENTRIES] = tlb_lli_2m[ENTRIES] >> 1;
+
+	/* Max number of pages INVLPGB can invalidate in one shot */
+	if (boot_cpu_has(X86_FEATURE_INVLPGB)) {
+		cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
+		invlpgb_count_max = (edx & 0xffff) + 1;
+	}
 }

 static const struct cpu_dev amd_cpu_dev = {
From patchwork Thu Feb 13 16:13:56 2025
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
	dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
	nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
	linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
	jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
	Rik van Riel, Manali Shukla
Subject: [PATCH v11 05/12] x86/mm: add INVLPGB support code
Date: Thu, 13 Feb 2025 11:13:56 -0500
Message-ID: <20250213161423.449435-6-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
Add invlpgb.h with the helper functions and definitions needed to use
broadcast TLB invalidation on AMD EPYC 3 and newer CPUs.

Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
Acked-by: Dave Hansen
---
 arch/x86/include/asm/disabled-features.h |   8 +-
 arch/x86/include/asm/invlpgb.h           | 101 +++++++++++++++++++++++
 arch/x86/include/asm/tlbflush.h          |   1 +
 3 files changed, 109 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/include/asm/invlpgb.h

diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..625a89259968 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -129,6 +129,12 @@
 #define DISABLE_SEV_SNP		(1 << (X86_FEATURE_SEV_SNP & 31))
 #endif
 
+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+#define DISABLE_INVLPGB		0
+#else
+#define DISABLE_INVLPGB		(1 << (X86_FEATURE_INVLPGB & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -146,7 +152,7 @@
 #define DISABLED_MASK11	(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
			 DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK)
 #define DISABLED_MASK12	(DISABLE_FRED|DISABLE_LAM)
-#define DISABLED_MASK13	0
+#define DISABLED_MASK13	(DISABLE_INVLPGB)
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \

diff --git a/arch/x86/include/asm/invlpgb.h b/arch/x86/include/asm/invlpgb.h
new file mode 100644
index 000000000000..a1d5dedd5217
--- /dev/null
+++ b/arch/x86/include/asm/invlpgb.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_INVLPGB
+#define _ASM_X86_INVLPGB
+
+#include
+#include
+#include
+
+/*
+ * INVLPGB does broadcast TLB invalidation across all the CPUs in the system.
+ *
+ * The INVLPGB instruction is weakly ordered, and a batch of invalidations can
+ * be done in a parallel fashion.
+ *
+ * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
+ * this CPU have completed.
+ */
+static inline void __invlpgb(unsigned long asid, unsigned long pcid,
+			     unsigned long addr, u16 extra_count,
+			     bool pmd_stride, u8 flags)
+{
+	u32 edx = (pcid << 16) | asid;
+	u32 ecx = (pmd_stride << 31) | extra_count;
+	u64 rax = addr | flags;
+
+	/* The low bits in rax are for flags. Verify addr is clean. */
+	VM_WARN_ON_ONCE(addr & ~PAGE_MASK);
+
+	/* INVLPGB; supported in binutils >= 2.36. */
+	asm volatile(".byte 0x0f, 0x01, 0xfe" : : "a" (rax), "c" (ecx), "d" (edx));
+}
+
+/* Wait for INVLPGB originated by this CPU to complete. */
+static inline void tlbsync(void)
+{
+	cant_migrate();
+	/* TLBSYNC; supported in binutils >= 2.36. */
+	asm volatile(".byte 0x0f, 0x01, 0xff" ::: "memory");
+}
+
+/*
+ * INVLPGB can be targeted by virtual address, PCID, ASID, or any combination
+ * of the three. For example:
+ * - INVLPGB_VA | INVLPGB_INCLUDE_GLOBAL: invalidate all TLB entries at the address
+ * - INVLPGB_PCID:			  invalidate all TLB entries matching the PCID
+ *
+ * The first can be used to invalidate (kernel) mappings at a particular
+ * address across all processes.
+ *
+ * The second invalidates all TLB entries matching a PCID.
+ */
+#define INVLPGB_VA			BIT(0)
+#define INVLPGB_PCID			BIT(1)
+#define INVLPGB_ASID			BIT(2)
+#define INVLPGB_INCLUDE_GLOBAL		BIT(3)
+#define INVLPGB_FINAL_ONLY		BIT(4)
+#define INVLPGB_INCLUDE_NESTED		BIT(5)
+
+/* Flush all mappings for a given pcid and addr, not including globals. */
+static inline void invlpgb_flush_user(unsigned long pcid,
+				      unsigned long addr)
+{
+	__invlpgb(0, pcid, addr, 0, 0, INVLPGB_PCID | INVLPGB_VA);
+	tlbsync();
+}
+
+static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
+						unsigned long addr,
+						u16 nr,
+						bool pmd_stride)
+{
+	__invlpgb(0, pcid, addr, nr - 1, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
+}
+
+/* Flush all mappings for a given PCID, not including globals. */
+static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid)
+{
+	__invlpgb(0, pcid, 0, 0, 0, INVLPGB_PCID);
+}
+
+/* Flush all mappings, including globals, for all PCIDs. */
+static inline void invlpgb_flush_all(void)
+{
+	__invlpgb(0, 0, 0, 0, 0, INVLPGB_INCLUDE_GLOBAL);
+	tlbsync();
+}
+
+/* Flush addr, including globals, for all PCIDs. */
+static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr)
+{
+	__invlpgb(0, 0, addr, nr - 1, 0, INVLPGB_INCLUDE_GLOBAL);
+}
+
+/* Flush all mappings for all PCIDs except globals. */
+static inline void invlpgb_flush_all_nonglobals(void)
+{
+	__invlpgb(0, 0, 0, 0, 0, 0);
+	tlbsync();
+}
+
+#endif /* _ASM_X86_INVLPGB */

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index e026a5cc388e..bda7080dec83 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
From patchwork Thu Feb 13 16:13:57 2025
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
	dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
	nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
	linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
	jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
	Rik van Riel, Manali Shukla
Subject: [PATCH v11 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Date: Thu, 13 Feb 2025 11:13:57 -0500
Message-ID: <20250213161423.449435-7-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
Use broadcast TLB invalidation for kernel addresses when available.
This removes the need to send IPIs for kernel TLB flushes.

Signed-off-by: Rik van Riel
Reviewed-by: Nadav Amit
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/mm/tlb.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 924ac2263725..ce9df82754ce 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1077,6 +1077,28 @@ void flush_tlb_all(void)
 	on_each_cpu(do_flush_tlb_all, NULL, 1);
 }
 
+static bool broadcast_kernel_range_flush(struct flush_tlb_info *info)
+{
+	unsigned long addr;
+	unsigned long nr;
+
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		return false;
+
+	if (info->end == TLB_FLUSH_ALL) {
+		invlpgb_flush_all();
+		return true;
+	}
+
+	for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) {
+		nr = (info->end - addr) >> PAGE_SHIFT;
+		nr = clamp_val(nr, 1, invlpgb_count_max);
+		invlpgb_flush_addr_nosync(addr, nr);
+	}
+	tlbsync();
+	return true;
+}
+
 static void do_kernel_range_flush(void *info)
 {
 	struct flush_tlb_info *f = info;
@@ -1096,7 +1118,9 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 	info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false,
				  TLB_GENERATION_INVALID);
 
-	if (info->end == TLB_FLUSH_ALL)
+	if (broadcast_kernel_range_flush(info))
+		; /* Fall through. */
+	else if (info->end == TLB_FLUSH_ALL)
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
 	else
 		on_each_cpu(do_kernel_range_flush, info, 1);
From patchwork Thu Feb 13 16:13:58 2025
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
	dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
	nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
	linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
	jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
	Rik van Riel, Manali Shukla
Subject: [PATCH v11 07/12] x86/mm: use INVLPGB in flush_tlb_all
Date: Thu, 13 Feb 2025 11:13:58 -0500
Message-ID: <20250213161423.449435-8-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
The flush_tlb_all() function is not used a whole lot, but we might as
well use broadcast TLB flushing there, too.
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/mm/tlb.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ce9df82754ce..3c29ef25dce4 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1065,6 +1065,16 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 }
 
+static bool broadcast_flush_tlb_all(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		return false;
+
+	guard(preempt)();
+	invlpgb_flush_all();
+	return true;
+}
+
 static void do_flush_tlb_all(void *info)
 {
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
@@ -1073,6 +1083,8 @@ static void do_flush_tlb_all(void *info)
 
 void flush_tlb_all(void)
 {
+	if (broadcast_flush_tlb_all())
+		return;
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
 	on_each_cpu(do_flush_tlb_all, NULL, 1);
 }

From patchwork Thu Feb 13 16:13:59 2025
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
	dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
	nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
	linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
	jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
	Rik van Riel, Manali Shukla
Subject: [PATCH v11 08/12] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing
Date: Thu, 13 Feb 2025 11:13:59 -0500
Message-ID: <20250213161423.449435-9-riel@surriel.com>
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
In the page reclaim code, we only track the CPU(s) where the TLB needs to be flushed, rather than all the individual mappings that may be getting invalidated. Use broadcast TLB flushing when that is available. Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/mm/tlb.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 3c29ef25dce4..de3f6e4ed16d 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -1316,7 +1316,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case.
*/ - if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { + invlpgb_flush_all_nonglobals(); + } else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { flush_tlb_multi(&batch->cpumask, info); } else if (cpumask_test_cpu(cpu, &batch->cpumask)) { lockdep_assert_irqs_enabled(); From patchwork Thu Feb 13 16:14:00 2025 X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13973655
From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Rik van Riel , Manali Shukla Subject: [PATCH v11 09/12] x86/mm: enable broadcast TLB invalidation for multi-threaded processes Date: Thu, 13 Feb 2025 11:14:00 -0500 Message-ID: <20250213161423.449435-10-riel@surriel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To:
<20250213161423.449435-1-riel@surriel.com> References: <20250213161423.449435-1-riel@surriel.com> MIME-Version: 1.0
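The arch_tlbbatch_flush() hunk in patch 08 above reduces to a three-way choice: broadcast everything with INVLPGB when the CPU supports it, IPI the other CPUs in the batch mask otherwise, or fall back to a purely local flush. A minimal userspace sketch of that dispatch, with hypothetical names standing in for the kernel's cpumask and feature helpers:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the dispatch in arch_tlbbatch_flush(): broadcast (INVLPGB)
 * when supported, an IPI-based multi-CPU flush when any *other* CPU is in
 * the batch mask, else a cheap local flush. Names are illustrative, not
 * the kernel's; the cpumask is modeled as one unsigned long of bits. */
enum flush_kind { FLUSH_BROADCAST, FLUSH_MULTI, FLUSH_LOCAL, FLUSH_NONE };

static enum flush_kind pick_flush(bool has_invlpgb,
                                  unsigned long cpumask, int this_cpu)
{
    if (has_invlpgb)
        return FLUSH_BROADCAST;        /* invlpgb_flush_all_nonglobals() */
    if (cpumask & ~(1UL << this_cpu))  /* cpumask_any_but() < nr_cpu_ids */
        return FLUSH_MULTI;            /* flush_tlb_multi() */
    if (cpumask & (1UL << this_cpu))   /* cpumask_test_cpu() */
        return FLUSH_LOCAL;
    return FLUSH_NONE;                 /* no CPU ever touched this mm */
}
```

Note that the broadcast branch does not consult the cpumask at all: INVLPGB invalidates on every CPU, which is exactly why the batch cpumask tracking becomes less important on these machines.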
Use broadcast TLB invalidation, using the INVLPGB instruction, on AMD EPYC 3 and newer CPUs. In order to not exhaust PCID space, and keep TLB flushes local for single threaded processes, we only hand out broadcast ASIDs to processes active on 4 or more CPUs. Signed-off-by: Rik van Riel Reviewed-by: Nadav Amit Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/include/asm/mmu.h | 6 + arch/x86/include/asm/mmu_context.h | 14 ++ arch/x86/include/asm/tlbflush.h | 73 ++++++ arch/x86/mm/tlb.c | 341 ++++++++++++++++++++++++++++- 4 files changed, 422 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 3b496cdcb74b..d71cd599fec4 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -69,6 +69,12 @@ typedef struct { u16 pkey_allocation_map; s16 execute_only_pkey; #endif + +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + u16 global_asid; + bool asid_transition; +#endif + } mm_context_t; #define INIT_MM_CONTEXT(mm) \ diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 795fdd53bd0a..d670699d32c2 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -139,6 +139,8 @@ static inline void mm_reset_untag_mask(struct mm_struct *mm) #define enter_lazy_tlb enter_lazy_tlb extern void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk); +extern void destroy_context_free_global_asid(struct mm_struct *mm); + /* * Init a new mm. Used on mm copies, like at fork() * and on mm's that are brand-new, like at execve().
@@ -161,6 +163,14 @@ static inline int init_new_context(struct task_struct *tsk, mm->context.execute_only_pkey = -1; } #endif + +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { + mm->context.global_asid = 0; + mm->context.asid_transition = false; + } +#endif + mm_reset_untag_mask(mm); init_new_context_ldt(mm); return 0; @@ -170,6 +180,10 @@ static inline int init_new_context(struct task_struct *tsk, static inline void destroy_context(struct mm_struct *mm) { destroy_context_ldt(mm); +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) + destroy_context_free_global_asid(mm); +#endif } extern void switch_mm(struct mm_struct *prev, struct mm_struct *next, diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index bda7080dec83..3080cb8d21dc 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -6,6 +6,7 @@ #include #include +#include #include #include #include @@ -239,6 +240,78 @@ void flush_tlb_one_kernel(unsigned long addr); void flush_tlb_multi(const struct cpumask *cpumask, const struct flush_tlb_info *info); +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH +static inline bool is_dyn_asid(u16 asid) +{ + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + return true; + + return asid < TLB_NR_DYN_ASIDS; +} + +static inline bool is_global_asid(u16 asid) +{ + return !is_dyn_asid(asid); +} + +static inline bool in_asid_transition(struct mm_struct *mm) +{ + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + return false; + + return mm && READ_ONCE(mm->context.asid_transition); +} + +static inline u16 mm_global_asid(struct mm_struct *mm) +{ + u16 asid; + + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + return 0; + + asid = smp_load_acquire(&mm->context.global_asid); + + /* mm->context.global_asid is either 0, or a global ASID */ + VM_WARN_ON_ONCE(asid && is_dyn_asid(asid)); + + return asid; +} +#else +static inline bool is_dyn_asid(u16 asid) +{ + return 
true; +} + +static inline bool is_global_asid(u16 asid) +{ + return false; +} + +static inline bool in_asid_transition(struct mm_struct *mm) +{ + return false; +} + +static inline u16 mm_global_asid(struct mm_struct *mm) +{ + return 0; +} + +static inline bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid) +{ + return false; +} + +static inline void broadcast_tlb_flush(struct flush_tlb_info *info) +{ + VM_WARN_ON_ONCE(1); +} + +static inline void consider_global_asid(struct mm_struct *mm) +{ +} +#endif + #ifdef CONFIG_PARAVIRT #include #endif diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index de3f6e4ed16d..0ce0b71a5378 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -74,13 +74,15 @@ * use different names for each of them: * * ASID - [0, TLB_NR_DYN_ASIDS-1] - * the canonical identifier for an mm + * the canonical identifier for an mm, dynamically allocated on each CPU + * [TLB_NR_DYN_ASIDS, MAX_ASID_AVAILABLE-1] + * the canonical, global identifier for an mm, identical across all CPUs * - * kPCID - [1, TLB_NR_DYN_ASIDS] + * kPCID - [1, MAX_ASID_AVAILABLE] * the value we write into the PCID part of CR3; corresponds to the * ASID+1, because PCID 0 is special. * - * uPCID - [2048 + 1, 2048 + TLB_NR_DYN_ASIDS] + * uPCID - [2048 + 1, 2048 + MAX_ASID_AVAILABLE] * for KPTI each mm has two address spaces and thus needs two * PCID values, but we can still do with a single ASID denomination * for each mm. Corresponds to kPCID + 2048. @@ -225,6 +227,20 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen, return; } + /* + * TLB consistency for global ASIDs is maintained with broadcast TLB + * flushing. The TLB is never outdated, and does not need flushing. 
+ */ + if (static_cpu_has(X86_FEATURE_INVLPGB)) { + u16 global_asid = mm_global_asid(next); + + if (global_asid) { + *new_asid = global_asid; + *need_flush = false; + return; + } + } + if (this_cpu_read(cpu_tlbstate.invalidate_other)) clear_asid_other(); @@ -251,6 +267,269 @@ static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen, *need_flush = true; } +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH +/* + * Logic for broadcast TLB invalidation. + */ +static DEFINE_RAW_SPINLOCK(global_asid_lock); +static u16 last_global_asid = MAX_ASID_AVAILABLE; +static DECLARE_BITMAP(global_asid_used, MAX_ASID_AVAILABLE) = { 0 }; +static DECLARE_BITMAP(global_asid_freed, MAX_ASID_AVAILABLE) = { 0 }; +static int global_asid_available = MAX_ASID_AVAILABLE - TLB_NR_DYN_ASIDS - 1; + +static void reset_global_asid_space(void) +{ + lockdep_assert_held(&global_asid_lock); + + /* + * A global TLB flush guarantees that any stale entries from + * previously freed global ASIDs get flushed from the TLB + * everywhere, making these global ASIDs safe to reuse. + */ + invlpgb_flush_all_nonglobals(); + + /* + * Clear all the previously freed global ASIDs from the + * broadcast_asid_used bitmap, now that the global TLB flush + * has made them actually available for re-use. + */ + bitmap_andnot(global_asid_used, global_asid_used, + global_asid_freed, MAX_ASID_AVAILABLE); + bitmap_clear(global_asid_freed, 0, MAX_ASID_AVAILABLE); + + /* + * ASIDs 0-TLB_NR_DYN_ASIDS are used for CPU-local ASID + * assignments, for tasks doing IPI based TLB shootdowns. + * Restart the search from the start of the global ASID space. + */ + last_global_asid = TLB_NR_DYN_ASIDS; +} + +static u16 get_global_asid(void) +{ + + u16 asid; + + lockdep_assert_held(&global_asid_lock); + + /* The previous allocated ASID is at the top of the address space. 
*/ + if (last_global_asid >= MAX_ASID_AVAILABLE - 1) + reset_global_asid_space(); + + asid = find_next_zero_bit(global_asid_used, MAX_ASID_AVAILABLE, last_global_asid); + + if (asid >= MAX_ASID_AVAILABLE) { + /* This should never happen. */ + VM_WARN_ONCE(1, "Unable to allocate global ASID despite %d available\n", + global_asid_available); + return 0; + } + + /* Claim this global ASID. */ + __set_bit(asid, global_asid_used); + last_global_asid = asid; + global_asid_available--; + return asid; +} + +/* + * Returns true if the mm is transitioning from a CPU-local ASID to a global + * (INVLPGB) ASID, or the other way around. + */ +static bool needs_global_asid_reload(struct mm_struct *next, u16 prev_asid) +{ + u16 global_asid = mm_global_asid(next); + + /* Process is transitioning to a global ASID */ + if (global_asid && prev_asid != global_asid) + return true; + + /* Transition from global->local ASID does not currently happen. */ + if (!global_asid && is_global_asid(prev_asid)) + return true; + + return false; +} + +void destroy_context_free_global_asid(struct mm_struct *mm) +{ + if (!mm->context.global_asid) + return; + + guard(raw_spinlock_irqsave)(&global_asid_lock); + + /* The global ASID can be re-used only after flush at wrap-around. */ + __set_bit(mm->context.global_asid, global_asid_freed); + + mm->context.global_asid = 0; + global_asid_available++; +} + +/* + * Check whether a process is currently active on more than "threshold" CPUs. + * This is a cheap estimation on whether or not it may make sense to assign + * a global ASID to this process, and use broadcast TLB invalidation. + */ +static bool mm_active_cpus_exceeds(struct mm_struct *mm, int threshold) +{ + int count = 0; + int cpu; + + /* This quick check should eliminate most single threaded programs. */ + if (cpumask_weight(mm_cpumask(mm)) <= threshold) + return false; + + /* Slower check to make sure. 
*/ + for_each_cpu(cpu, mm_cpumask(mm)) { + /* Skip the CPUs that aren't really running this process. */ + if (per_cpu(cpu_tlbstate.loaded_mm, cpu) != mm) + continue; + + if (per_cpu(cpu_tlbstate_shared.is_lazy, cpu)) + continue; + + if (++count > threshold) + return true; + } + return false; +} + +/* + * Assign a global ASID to the current process, protecting against + * races between multiple threads in the process. + */ +static void use_global_asid(struct mm_struct *mm) +{ + u16 asid; + + guard(raw_spinlock_irqsave)(&global_asid_lock); + + /* This process is already using broadcast TLB invalidation. */ + if (mm->context.global_asid) + return; + + /* The last global ASID was consumed while waiting for the lock. */ + if (!global_asid_available) { + VM_WARN_ONCE(1, "Ran out of global ASIDs\n"); + return; + } + + asid = get_global_asid(); + if (!asid) + return; + + /* + * Notably flush_tlb_mm_range() -> broadcast_tlb_flush() -> + * finish_asid_transition() needs to observe asid_transition = true + * once it observes global_asid. + */ + mm->context.asid_transition = true; + smp_store_release(&mm->context.global_asid, asid); +} + +/* + * x86 has 4k ASIDs (2k when compiled with KPTI), but the largest + * x86 systems have over 8k CPUs. Because of this potential ASID + * shortage, global ASIDs are handed out to processes that have + * frequent TLB flushes and are active on 4 or more CPUs simultaneously. + */ +static void consider_global_asid(struct mm_struct *mm) +{ + if (!static_cpu_has(X86_FEATURE_INVLPGB)) + return; + + /* Check every once in a while. */ + if ((current->pid & 0x1f) != (jiffies & 0x1f)) + return; + + if (!READ_ONCE(global_asid_available)) + return; + + /* + * Assign a global ASID if the process is active on + * 4 or more CPUs simultaneously. 
+ */ + if (mm_active_cpus_exceeds(mm, 3)) + use_global_asid(mm); +} + +static void finish_asid_transition(struct flush_tlb_info *info) +{ + struct mm_struct *mm = info->mm; + int bc_asid = mm_global_asid(mm); + int cpu; + + if (!READ_ONCE(mm->context.asid_transition)) + return; + + for_each_cpu(cpu, mm_cpumask(mm)) { + /* + * The remote CPU is context switching. Wait for that to + * finish, to catch the unlikely case of it switching to + * the target mm with an out of date ASID. + */ + while (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) == LOADED_MM_SWITCHING) + cpu_relax(); + + if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm, cpu)) != mm) + continue; + + /* + * If at least one CPU is not using the global ASID yet, + * send a TLB flush IPI. The IPI should cause stragglers + * to transition soon. + * + * This can race with the CPU switching to another task; + * that results in a (harmless) extra IPI. + */ + if (READ_ONCE(per_cpu(cpu_tlbstate.loaded_mm_asid, cpu)) != bc_asid) { + flush_tlb_multi(mm_cpumask(info->mm), info); + return; + } + } + + /* All the CPUs running this process are using the global ASID. */ + WRITE_ONCE(mm->context.asid_transition, false); +} + +static void broadcast_tlb_flush(struct flush_tlb_info *info) +{ + bool pmd = info->stride_shift == PMD_SHIFT; + unsigned long asid = info->mm->context.global_asid; + unsigned long addr = info->start; + + /* + * TLB flushes with INVLPGB are kicked off asynchronously. + * The inc_mm_tlb_gen() guarantees page table updates are done + * before these TLB flushes happen. + */ + if (info->end == TLB_FLUSH_ALL) { + invlpgb_flush_single_pcid_nosync(kern_pcid(asid)); + /* Do any CPUs supporting INVLPGB need PTI? 
*/ + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_single_pcid_nosync(user_pcid(asid)); + } else do { + unsigned long nr = 1; + + if (info->stride_shift <= PMD_SHIFT) { + nr = (info->end - addr) >> info->stride_shift; + nr = clamp_val(nr, 1, invlpgb_count_max); + } + + invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd); + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd); + + addr += nr << info->stride_shift; + } while (addr < info->end); + + finish_asid_transition(info); + + /* Wait for the INVLPGBs kicked off above to finish. */ + tlbsync(); +} +#endif /* CONFIG_X86_BROADCAST_TLB_FLUSH */ + /* * Given an ASID, flush the corresponding user ASID. We can delay this * until the next time we switch to it. @@ -556,8 +835,9 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ if (prev == next) { /* Not actually switching mm's */ - VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != - next->context.ctx_id); + VM_WARN_ON(is_dyn_asid(prev_asid) && + this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != + next->context.ctx_id); /* * If this races with another thread that enables lam, 'new_lam' @@ -573,6 +853,23 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, !cpumask_test_cpu(cpu, mm_cpumask(next)))) cpumask_set_cpu(cpu, mm_cpumask(next)); + /* + * Check if the current mm is transitioning to a new ASID. + */ + if (needs_global_asid_reload(next, prev_asid)) { + next_tlb_gen = atomic64_read(&next->context.tlb_gen); + + choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush); + goto reload_tlb; + } + + /* + * Broadcast TLB invalidation keeps this PCID up to date + * all the time. 
+ */ + if (is_global_asid(prev_asid)) + return; + /* * If the CPU is not in lazy TLB mode, we are just switching * from one thread in a process to another thread in the same @@ -606,6 +903,13 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ cond_mitigation(tsk); + /* + * Let nmi_uaccess_okay() and finish_asid_transition() + * know that we're changing CR3. + */ + this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING); + barrier(); + /* * Leave this CPU in prev's mm_cpumask. Atomic writes to * mm_cpumask can be expensive under contention. The CPU @@ -620,14 +924,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, next_tlb_gen = atomic64_read(&next->context.tlb_gen); choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush); - - /* Let nmi_uaccess_okay() know that we're changing CR3. */ - this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING); - barrier(); } +reload_tlb: new_lam = mm_lam_cr3_mask(next); if (need_flush) { + VM_WARN_ON_ONCE(is_global_asid(new_asid)); this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); load_new_mm_cr3(next->pgd, new_asid, new_lam, true); @@ -746,7 +1048,7 @@ static void flush_tlb_func(void *info) const struct flush_tlb_info *f = info; struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid); - u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen); + u64 local_tlb_gen; bool local = smp_processor_id() == f->initiating_cpu; unsigned long nr_invalidate = 0; u64 mm_tlb_gen; @@ -769,6 +1071,16 @@ static void flush_tlb_func(void *info) if (unlikely(loaded_mm == &init_mm)) return; + /* Reload the ASID if transitioning into or out of a global ASID */ + if (needs_global_asid_reload(loaded_mm, loaded_mm_asid)) { + switch_mm_irqs_off(NULL, loaded_mm, NULL); + loaded_mm_asid = 
this_cpu_read(cpu_tlbstate.loaded_mm_asid); + } + + /* Broadcast ASIDs are always kept up to date with INVLPGB. */ + if (is_global_asid(loaded_mm_asid)) + return; + VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].ctx_id) != loaded_mm->context.ctx_id); @@ -786,6 +1098,8 @@ static void flush_tlb_func(void *info) return; } + local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen); + if (unlikely(f->new_tlb_gen != TLB_GENERATION_INVALID && f->new_tlb_gen <= local_tlb_gen)) { /* @@ -953,7 +1267,7 @@ STATIC_NOPV void native_flush_tlb_multi(const struct cpumask *cpumask, * up on the new contents of what used to be page tables, while * doing a speculative memory access. */ - if (info->freed_tables) + if (info->freed_tables || in_asid_transition(info->mm)) on_each_cpu_mask(cpumask, flush_tlb_func, (void *)info, true); else on_each_cpu_cond_mask(should_flush_tlb, flush_tlb_func, @@ -1049,9 +1363,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start, * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. 
*/ - if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { + if (mm_global_asid(mm)) { + broadcast_tlb_flush(info); + } else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { info->trim_cpumask = should_trim_cpumask(mm); flush_tlb_multi(mm_cpumask(mm), info); + consider_global_asid(mm); } else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) { lockdep_assert_irqs_enabled(); local_irq_disable(); From patchwork Thu Feb 13 16:14:01 2025 X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13973612
From: Rik van Riel To: x86@kernel.org Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com, jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com, Rik van Riel , Manali Shukla Subject: [PATCH v11 10/12] x86/mm: do targeted broadcast flushing from tlbbatch code Date: Thu, 13 Feb 2025 11:14:01 -0500 Message-ID: <20250213161423.449435-11-riel@surriel.com> X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250213161423.449435-1-riel@surriel.com> References: <20250213161423.449435-1-riel@surriel.com> MIME-Version: 1.0
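Patch 09's get_global_asid()/reset_global_asid_space() pair above amounts to a bump allocator over a bitmap: allocation scans upward from the last handout, and freed ASIDs only become reusable after a global flush at wrap-around guarantees no stale TLB entries remain anywhere. A toy userspace model of that lifecycle (array sizes, names, and the char-array "bitmaps" are illustrative, not the kernel's):

```c
#include <assert.h>

/* Model of the global ASID allocator: ASIDs below NR_DYN stay reserved
 * for per-CPU dynamic use; a reset (standing in for
 * invlpgb_flush_all_nonglobals()) recycles freed ASIDs in bulk. */
#define NR_DYN   8
#define NR_ASIDS 32

static unsigned char used[NR_ASIDS], freed[NR_ASIDS];
static int last_asid = NR_ASIDS; /* force a reset on first allocation */

static void reset_asid_space(void)
{
    /* After the (simulated) global flush, previously freed ASIDs can no
     * longer have stale TLB entries, so clear them from the used map. */
    for (int i = 0; i < NR_ASIDS; i++)
        if (freed[i]) used[i] = freed[i] = 0;
    last_asid = NR_DYN; /* restart the search above the dynamic range */
}

static int get_asid(void)
{
    if (last_asid >= NR_ASIDS - 1)
        reset_asid_space();
    for (int a = last_asid; a < NR_ASIDS; a++)
        if (!used[a]) { used[a] = 1; last_asid = a; return a; }
    return 0; /* exhausted: caller falls back to IPI-based flushing */
}

static void put_asid(int a)
{
    freed[a] = 1; /* reusable only after the next reset's global flush */
}
```

The key property this models is that put_asid() does not make an ASID immediately available again; reuse is batched behind one broadcast flush, keeping the common allocation path lock-cheap.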
Instead of doing a system-wide TLB flush from arch_tlbbatch_flush, queue up asynchronous, targeted flushes from arch_tlbbatch_add_pending. This also allows us to avoid adding the CPUs of processes using broadcast flushing to the batch->cpumask, and will hopefully further reduce TLB flushing from the reclaim and compaction paths. Signed-off-by: Rik van Riel Tested-by: Manali Shukla Tested-by: Brendan Jackman Tested-by: Michael Kelley --- arch/x86/include/asm/invlpgb.h | 20 ++++----- arch/x86/include/asm/tlbflush.h | 19 ++++---- arch/x86/mm/tlb.c | 79 +++++++++++++++++++++++++++++++-- 3 files changed, 96 insertions(+), 22 deletions(-) diff --git a/arch/x86/include/asm/invlpgb.h b/arch/x86/include/asm/invlpgb.h index a1d5dedd5217..43c331507cc0 100644 --- a/arch/x86/include/asm/invlpgb.h +++ b/arch/x86/include/asm/invlpgb.h @@ -31,7 +31,7 @@ static inline void __invlpgb(unsigned long asid, unsigned long pcid, } /* Wait for INVLPGB originated by this CPU to complete. */ -static inline void tlbsync(void) +static inline void __tlbsync(void) { cant_migrate(); /* TLBSYNC: supported in binutils >= 0.36. */ @@ -61,19 +61,19 @@ static inline void invlpgb_flush_user(unsigned long pcid, unsigned long addr) { __invlpgb(0, pcid, addr, 0, 0, INVLPGB_PCID | INVLPGB_VA); - tlbsync(); + __tlbsync(); } -static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, - unsigned long addr, - u16 nr, - bool pmd_stride) +static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, + bool pmd_stride) { __invlpgb(0, pcid, addr, nr - 1, pmd_stride, INVLPGB_PCID | INVLPGB_VA); } /* Flush all mappings for a given PCID, not including globals.
*/ -static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +static inline void __invlpgb_flush_single_pcid_nosync(unsigned long pcid) { __invlpgb(0, pcid, 0, 0, 0, INVLPGB_PCID); } @@ -82,11 +82,11 @@ static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) static inline void invlpgb_flush_all(void) { __invlpgb(0, 0, 0, 0, 0, INVLPGB_INCLUDE_GLOBAL); - tlbsync(); + __tlbsync(); } /* Flush addr, including globals, for all PCIDs. */ -static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +static inline void __invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) { __invlpgb(0, 0, addr, nr - 1, 0, INVLPGB_INCLUDE_GLOBAL); } @@ -95,7 +95,7 @@ static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) static inline void invlpgb_flush_all_nonglobals(void) { __invlpgb(0, 0, 0, 0, 0, 0); - tlbsync(); + __tlbsync(); } #endif /* _ASM_X86_INVLPGB */ diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 3080cb8d21dc..27ba17603e0b 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -106,6 +106,9 @@ struct tlb_state { * need to be invalidated. 
*/ bool invalidate_other; +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH + bool need_tlbsync; +#endif #ifdef CONFIG_ADDRESS_MASKING /* @@ -310,6 +313,10 @@ static inline void broadcast_tlb_flush(struct flush_tlb_info *info) static inline void consider_global_asid(struct mm_struct *mm) { } + +static inline void tlbsync(void) +{ +} #endif #ifdef CONFIG_PARAVIRT @@ -359,21 +366,15 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm) return atomic64_inc_return(&mm->context.tlb_gen); } -static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, - struct mm_struct *mm, - unsigned long uaddr) -{ - inc_mm_tlb_gen(mm); - cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); - mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); -} - static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm) { flush_tlb_mm(mm); } extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch); +extern void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm, + unsigned long uaddr); static inline bool pte_flags_need_flush(unsigned long oldflags, unsigned long newflags, diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 0ce0b71a5378..8880bc7456ed 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -492,6 +492,37 @@ static void finish_asid_transition(struct flush_tlb_info *info) WRITE_ONCE(mm->context.asid_transition, false); } +static inline void tlbsync(void) +{ + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + return; + __tlbsync(); + this_cpu_write(cpu_tlbstate.need_tlbsync, false); +} + +static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid, + unsigned long addr, + u16 nr, bool pmd_stride) +{ + __invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride); + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + this_cpu_write(cpu_tlbstate.need_tlbsync, true); +} + +static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid) +{ + 
__invlpgb_flush_single_pcid_nosync(pcid); + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + this_cpu_write(cpu_tlbstate.need_tlbsync, true); +} + +static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr) +{ + __invlpgb_flush_addr_nosync(addr, nr); + if (!this_cpu_read(cpu_tlbstate.need_tlbsync)) + this_cpu_write(cpu_tlbstate.need_tlbsync, true); +} + static void broadcast_tlb_flush(struct flush_tlb_info *info) { bool pmd = info->stride_shift == PMD_SHIFT; @@ -791,6 +822,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, if (IS_ENABLED(CONFIG_PROVE_LOCKING)) WARN_ON_ONCE(!irqs_disabled()); + tlbsync(); + /* * Verify that CR3 is what we think it is. This will catch * hypothetical buggy code that directly switches to swapper_pg_dir @@ -970,6 +1003,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, */ void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) { + tlbsync(); + if (this_cpu_read(cpu_tlbstate.loaded_mm) == &init_mm) return; @@ -1633,9 +1668,7 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. */ - if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { - invlpgb_flush_all_nonglobals(); - } else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { + if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { flush_tlb_multi(&batch->cpumask, info); } else if (cpumask_test_cpu(cpu, &batch->cpumask)) { lockdep_assert_irqs_enabled(); @@ -1644,12 +1677,52 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch) local_irq_enable(); } + /* + * If we issued (asynchronous) INVLPGB flushes, wait for them here. + * The cpumask above contains only CPUs that were running tasks + * not using broadcast TLB flushing. 
+ */ + tlbsync(); + cpumask_clear(&batch->cpumask); put_flush_tlb_info(); put_cpu(); } +void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch, + struct mm_struct *mm, + unsigned long uaddr) +{ + u16 asid = mm_global_asid(mm); + + if (asid) { + invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false); + /* Do any CPUs supporting INVLPGB need PTI? */ + if (static_cpu_has(X86_FEATURE_PTI)) + invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false); + + /* + * Some CPUs might still be using a local ASID for this + * process, and require IPIs, while others are using the + * global ASID. + * + * In this corner case we need to do both the broadcast + * TLB invalidation, and send IPIs. The IPIs will help + * stragglers transition to the broadcast ASID. + */ + if (in_asid_transition(mm)) + asid = 0; + } + + if (!asid) { + inc_mm_tlb_gen(mm); + cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm)); + } + + mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL); +} + /* * Blindly accessing user memory from NMI context can be dangerous * if we're in the middle of switching the current user task or From patchwork Thu Feb 13 16:14:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rik van Riel X-Patchwork-Id: 13973611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B678DC021A0 for ; Thu, 13 Feb 2025 16:18:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5807D6B0082; Thu, 13 Feb 2025 11:18:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 52FB66B0083; Thu, 13 Feb 2025 11:18:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3F7366B0089; Thu, 13 Feb 2025 11:18:44 -0500 (EST) X-Delivered-To: 
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, bp@alien8.de, peterz@infradead.org,
 dave.hansen@linux.intel.com, zhengqi.arch@bytedance.com,
 nadav.amit@gmail.com, thomas.lendacky@amd.com, kernel-team@meta.com,
 linux-mm@kvack.org, akpm@linux-foundation.org, jackmanb@google.com,
 jannh@google.com, mhklinux@outlook.com, andrew.cooper3@citrix.com,
 Rik van Riel, Manali Shukla
Subject: [PATCH v11 11/12] x86/mm: enable AMD translation cache extensions
Date: Thu, 13 Feb 2025 11:14:02 -0500
Message-ID: <20250213161423.449435-12-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
MIME-Version: 1.0
With AMD TCE (translation cache extensions) only the intermediate
mappings that cover the address range zapped by INVLPG / INVLPGB get
invalidated, rather than all intermediate mappings getting zapped at
every TLB invalidation.

This can help reduce the TLB miss rate, by keeping more intermediate
mappings in the cache.

From the AMD manual:

Translation Cache Extension (TCE) Bit. Bit 15, read/write. Setting this
bit to 1 changes how the INVLPG, INVLPGB, and INVPCID instructions
operate on TLB entries. When this bit is 0, these instructions remove
the target PTE from the TLB as well as all upper-level table entries
that are cached in the TLB, whether or not they are associated with the
target PTE. When this bit is set, these instructions will remove the
target PTE and only those upper-level entries that lead to the target
PTE in the page table hierarchy, leaving unrelated upper-level entries
intact.
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/msr-index.h       | 2 ++
 arch/x86/kernel/cpu/amd.c              | 4 ++++
 tools/arch/x86/include/asm/msr-index.h | 2 ++
 3 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9a71880eec07..a7ea9720ba3c 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -25,6 +25,7 @@
 #define _EFER_SVME		12 /* Enable virtualization */
 #define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
+#define _EFER_TCE		15 /* Enable Translation Cache Extensions */
 #define _EFER_AUTOIBRS		21 /* Enable Automatic IBRS */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -34,6 +35,7 @@
 #define EFER_SVME		(1<<_EFER_SVME)
 #define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
+#define EFER_TCE		(1<<_EFER_TCE)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
 /*
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 3e8180354303..38f454671c88 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1075,6 +1075,10 @@ static void init_amd(struct cpuinfo_x86 *c)
 
 	/* AMD CPUs don't need fencing after x2APIC/TSC_DEADLINE MSR writes. */
 	clear_cpu_cap(c, X86_FEATURE_APIC_MSRS_FENCE);
+
+	/* Enable Translation Cache Extension */
+	if (cpu_feature_enabled(X86_FEATURE_TCE))
+		msr_set_bit(MSR_EFER, _EFER_TCE);
 }
 
 #ifdef CONFIG_X86_32
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..dc1c1057f26e 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -25,6 +25,7 @@
 #define _EFER_SVME		12 /* Enable virtualization */
 #define _EFER_LMSLE		13 /* Long Mode Segment Limit Enable */
 #define _EFER_FFXSR		14 /* Enable Fast FXSAVE/FXRSTOR */
+#define _EFER_TCE		15 /* Enable Translation Cache Extensions */
 #define _EFER_AUTOIBRS		21 /* Enable Automatic IBRS */
 
 #define EFER_SCE		(1<<_EFER_SCE)
@@ -34,6 +35,7 @@
 #define EFER_SVME		(1<<_EFER_SVME)
 #define EFER_LMSLE		(1<<_EFER_LMSLE)
 #define EFER_FFXSR		(1<<_EFER_FFXSR)
+#define EFER_TCE		(1<<_EFER_TCE)
 #define EFER_AUTOIBRS		(1<<_EFER_AUTOIBRS)
 
 /*

From patchwork Thu Feb 13 16:14:03 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13973614
From: Rik van Riel
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org,
 bp@alien8.de, peterz@infradead.org, dave.hansen@linux.intel.com,
 zhengqi.arch@bytedance.com, nadav.amit@gmail.com, thomas.lendacky@amd.com,
 kernel-team@meta.com, linux-mm@kvack.org, akpm@linux-foundation.org,
 jackmanb@google.com, jannh@google.com, mhklinux@outlook.com,
 andrew.cooper3@citrix.com, Rik van Riel, Manali Shukla
Subject: [PATCH v11 12/12] x86/mm: only invalidate final translations with INVLPGB
Date: Thu, 13 Feb 2025 11:14:03 -0500
Message-ID: <20250213161423.449435-13-riel@surriel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250213161423.449435-1-riel@surriel.com>
References: <20250213161423.449435-1-riel@surriel.com>
MIME-Version: 1.0
Use the INVLPGB_FINAL_ONLY flag when invalidating mappings with
INVLPGB. This way only leaf mappings get removed from the TLB, leaving
intermediate translations cached.

On the (rare) occasions where we free page tables we do a full flush,
ensuring intermediate translations get flushed from the TLB.
Signed-off-by: Rik van Riel
Tested-by: Manali Shukla
Tested-by: Brendan Jackman
Tested-by: Michael Kelley
---
 arch/x86/include/asm/invlpgb.h | 10 ++++++++--
 arch/x86/mm/tlb.c              | 13 +++++++------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/invlpgb.h b/arch/x86/include/asm/invlpgb.h
index 43c331507cc0..220aba708b72 100644
--- a/arch/x86/include/asm/invlpgb.h
+++ b/arch/x86/include/asm/invlpgb.h
@@ -67,9 +67,15 @@ static inline void invlpgb_flush_user(unsigned long pcid,
 static inline void __invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						  unsigned long addr,
 						  u16 nr,
-						  bool pmd_stride)
+						  bool pmd_stride,
+						  bool freed_tables)
 {
-	__invlpgb(0, pcid, addr, nr - 1, pmd_stride, INVLPGB_PCID | INVLPGB_VA);
+	u8 flags = INVLPGB_PCID | INVLPGB_VA;
+
+	if (!freed_tables)
+		flags |= INVLPGB_FINAL_ONLY;
+
+	__invlpgb(0, pcid, addr, nr - 1, pmd_stride, flags);
 }
 
 /* Flush all mappings for a given PCID, not including globals. */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 8880bc7456ed..f09049207b78 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -502,9 +502,10 @@ static inline void tlbsync(void)
 
 static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
 						unsigned long addr,
-						u16 nr, bool pmd_stride)
+						u16 nr, bool pmd_stride,
+						bool freed_tables)
 {
-	__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride);
+	__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride, freed_tables);
 	if (!this_cpu_read(cpu_tlbstate.need_tlbsync))
 		this_cpu_write(cpu_tlbstate.need_tlbsync, true);
 }
@@ -547,9 +548,9 @@ static void broadcast_tlb_flush(struct flush_tlb_info *info)
 			nr = clamp_val(nr, 1, invlpgb_count_max);
 		}
 
-		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd);
+		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd, info->freed_tables);
 		if (static_cpu_has(X86_FEATURE_PTI))
-			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd);
+			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd, info->freed_tables);
 
 		addr += nr << info->stride_shift;
 	} while (addr < info->end);
@@ -1697,10 +1698,10 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
 	u16 asid = mm_global_asid(mm);
 
 	if (asid) {
-		invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false);
+		invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false, false);
 		/* Do any CPUs supporting INVLPGB need PTI? */
 		if (static_cpu_has(X86_FEATURE_PTI))
-			invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false);
+			invlpgb_flush_user_nr_nosync(user_pcid(asid), uaddr, 1, false, false);
 
 		/*
 		 * Some CPUs might still be using a local ASID for this