diff mbox series

[v13,06/14] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing

Message ID 20250223194943.3518952-7-riel@surriel.com (mailing list archive)
State New
Headers show
Series AMD broadcast TLB invalidation | expand

Commit Message

Rik van Riel Feb. 23, 2025, 7:48 p.m. UTC
In the page reclaim code, we only track the CPU(s) where the TLB needs
to be flushed, rather than all the individual mappings that may be getting
invalidated.

Use broadcast TLB flushing when that is available.

This is a temporary hack to ensure that the PCID context for
tasks in the next patch gets properly flushed from the page
reclaim code, because the IPI based flushing in arch_tlbbatch_flush
only flushes the currently loaded TLB context on each CPU.

Patch 10 replaces this with the actual mechanism used to do
broadcast TLB flushing from the page reclaim code.

Signed-off-by: Rik van Riel <riel@surriel.com>
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Tested-by: Brendan Jackman <jackmanb@google.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
 arch/x86/mm/tlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Borislav Petkov Feb. 24, 2025, 1:27 p.m. UTC | #1
On Sun, Feb 23, 2025 at 02:48:56PM -0500, Rik van Riel wrote:
> In the page reclaim code, we only track the CPU(s) where the TLB needs
> to be flushed, rather than all the individual mappings that may be getting
> invalidated.
> 
> Use broadcast TLB flushing when that is available.
> 
> This is a temporary hack to ensure that the PCID context for
> tasks in the next patch gets properly flushed from the page

There's no "next patch" in git - just merge this three-liner with the next
one.

> reclaim code, because the IPI based flushing in arch_tlbbatch_flush
> only flushes the currently loaded TLB context on each CPU.
> 
> Patch 10 replaces this with the actual mechanism used to do

Same. No "patch 10" in git.

> broadcast TLB flushing from the page reclaim code.
> 
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Tested-by: Manali Shukla <Manali.Shukla@amd.com>
> Tested-by: Brendan Jackman <jackmanb@google.com>
> Tested-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  arch/x86/mm/tlb.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 2d7ed0fda61f..16839651f67f 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -1326,7 +1326,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  	 * a local TLB flush is needed. Optimize this use-case by calling
>  	 * flush_tlb_func_local() directly in this case.
>  	 */
> -	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
> +	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
> +		invlpgb_flush_all_nonglobals();

I'm confused. The docs say rAX needs to be 0x6 to do "Invalidate all TLB
entries that match {ASID, PCID} excluding Global". But you're calling INVLPGB
with rAX==0 and the APM doesn't say what this does.

I'm guessing you want this to mean invalidate all non-globals for any ASID and
PCID. So I muss be missing the place in the docs where it says so...

Hmmm?
diff mbox series

Patch

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2d7ed0fda61f..16839651f67f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1326,7 +1326,9 @@  void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	 * a local TLB flush is needed. Optimize this use-case by calling
 	 * flush_tlb_func_local() directly in this case.
 	 */
-	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		invlpgb_flush_all_nonglobals();
+	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
 		flush_tlb_multi(&batch->cpumask, info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();