[v11,08/12] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing

Message ID 20250213161423.449435-9-riel@surriel.com (mailing list archive)
State New
Series AMD broadcast TLB invalidation

Commit Message

Rik van Riel Feb. 13, 2025, 4:13 p.m. UTC
In the page reclaim code, we only track the CPU(s) where the TLB needs
to be flushed, rather than all the individual mappings that may be getting
invalidated.

Use broadcast TLB flushing when that is available.

Signed-off-by: Rik van Riel <riel@surriel.com>
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Tested-by: Brendan Jackman <jackmanb@google.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
 arch/x86/mm/tlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Dave Hansen Feb. 14, 2025, 6:51 p.m. UTC | #1
On 2/13/25 08:13, Rik van Riel wrote:
> In the page reclaim code, we only track the CPU(s) where the TLB needs
> to be flushed, rather than all the individual mappings that may be getting
> invalidated.
> 
> Use broadcast TLB flushing when that is available.

The changelog here is a little light. This patch is doing a *ton* of stuff.

The existing code has two cases where it is doing a full TLB flush, not
a ranged flush.

	1. An actual IPI to some CPUs in batch->cpumask
	2. A local flush, no IPI

The change here eliminates both of those options, even the "common case"
which is not sending an IPI at all. So this replaces a CPU-local (aka. 1
logical CPU) TLB flush with a broadcast to the *ENTIRE* system. That's a
really really big change to not be noted. It's not something that's an
obvious win to me.

> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 3c29ef25dce4..de3f6e4ed16d 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -1316,7 +1316,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  	 * a local TLB flush is needed. Optimize this use-case by calling
>  	 * flush_tlb_func_local() directly in this case.
>  	 */
> -	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
> +	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
> +		invlpgb_flush_all_nonglobals();
> +	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
>  		flush_tlb_multi(&batch->cpumask, info);
>  	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
>  		lockdep_assert_irqs_enabled();

The structure of the code is also a bit off to me. I'd kinda prefer that
we stick to the pattern of (logically):

	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
		invlpgb_...();
	} else {
		on_each_cpu*();
	}

This patch is going a couple of functions up in the call chain above the
on_each_cpu()'s.

It would be more consistent with the previous modifications in this
series if the X86_FEATURE_INVLPGB check was in native_flush_tlb_multi(),
instead.

Would that make sense here? It would also preserve the "common case"
optimization that's in arch_tlbbatch_flush().
Rik van Riel Feb. 18, 2025, 7:31 p.m. UTC | #2
On Fri, 2025-02-14 at 10:51 -0800, Dave Hansen wrote:
> 
> Would that make sense here? It would also preserve the "common case"
> optimization that's in arch_tlbbatch_flush().
> 
What we do in this patch goes out the window in patch
10. This is just a temporary stage along the way to
what we have at the end of the series, needed to make
sure we don't break bisect partway through the series.

I'm not sure we should be doing much with this patch.

I could fold it into the next patch, but that would
make things harder to read.
Dave Hansen Feb. 18, 2025, 7:46 p.m. UTC | #3
On 2/18/25 11:31, Rik van Riel wrote:
> On Fri, 2025-02-14 at 10:51 -0800, Dave Hansen wrote:
>> Would that make sense here? It would also preserve the "common case"
>> optimization that's in arch_tlbbatch_flush().
>>
> What we do in this patch goes out the window in patch
> 10. This is just a temporary stage along the way to
> what we have at the end of the series, needed to make
> sure we don't break bisect partway through the series.
> 
> I'm not sure we should be doing much with this patch.
> 
> I could fold it into the next patch, but that would
> make things harder to read.

Ahh, that makes sense. If it's going out the window, could it be
temporarily commented, please? Add the comment here and remove it in
patch 10 (or whatever).
Rik van Riel Feb. 18, 2025, 8:06 p.m. UTC | #4
On Tue, 2025-02-18 at 11:46 -0800, Dave Hansen wrote:
> 
> Ahh, that makes sense. If it's going out the window, could it be
> temporarily commented, please? Add the comment here and remove it in
> patch 10 (or whatever).
> 
I added an explanation in the commit message.

You are right that it wasn't totally clear.
Patch

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 3c29ef25dce4..de3f6e4ed16d 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1316,7 +1316,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	 * a local TLB flush is needed. Optimize this use-case by calling
 	 * flush_tlb_func_local() directly in this case.
 	 */
-	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		invlpgb_flush_all_nonglobals();
+	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
 		flush_tlb_multi(&batch->cpumask, info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();