[RFC,v2,00/11] AMD broadcast TLB invalidation

Message ID

20241223025751.3268975-1-riel@surriel.com (mailing list archive)

Headers

Return-Path: <owner-linux-mm@kvack.org>
From: Rik van Riel <riel@surriel.com>
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org,
	kernel-team@meta.com,
	dave.hansen@linux.intel.com,
	luto@kernel.org,
	peterz@infradead.org,
	tglx@linutronix.de,
	mingo@redhat.com,
	bp@alien8.de,
	hpa@zytor.com,
	akpm@linux-foundation.org,
	linux-mm@kvack.org
Subject: [RFC PATCH v2 00/11] AMD broadcast TLB invalidation
Date: Sun, 22 Dec 2024 21:55:06 -0500
Message-ID: <20241223025751.3268975-1-riel@surriel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk
Series

AMD broadcast TLB invalidation

[RFC,v2,00/11] AMD broadcast TLB invalidation
[01/11] x86/mm: make MMU_GATHER_RCU_TABLE_FREE unconditional
[02/11] x86/mm: add X86_FEATURE_INVLPGB definition.
[03/11] x86/mm: get INVLPGB count max from CPUID
[04/11] x86/mm: add INVLPGB support code
[05/11] x86/mm: use INVLPGB for kernel TLB flushes
[06/11] x86/tlb: use INVLPGB in flush_tlb_all
[07/11] x86/mm: use broadcast TLB flushing for page reclaim TLB flushing
[08/11] x86/mm: enable broadcast TLB invalidation for multi-threaded processes
[09/11] x86,tlb: do targeted broadcast flushing from tlbbatch code
[10/11] x86/mm: enable AMD translation cache extensions
[11/11] x86/mm: only invalidate final translations with INVLPGB

Message

Rik van Riel Dec. 23, 2024, 2:55 a.m. UTC

Add support for broadcast TLB invalidation using AMD's INVLPGB instruction.

This allows the kernel to invalidate TLB entries on remote CPUs without
needing to send IPIs, without having to wait for remote CPUs to handle
those interrupts, and with less interruption to what was running on
those CPUs.
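
For readers unfamiliar with the instruction, here is a minimal sketch of how
the register operands for INVLPGB could be packed, based on my reading of the
AMD architecture manual rather than on the code in this series. The struct
name, the function name, and the flag macro names are hypothetical and chosen
for illustration. The sketch only builds the operands; actually issuing
INVLPGB and TLBSYNC requires ring 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Flag bits carried in the low bits of rAX (illustrative names).
 * Assumption: layout as described in the AMD manual for INVLPGB.
 */
#define INVLPGB_VA             (1ull << 0) /* virtual address is valid */
#define INVLPGB_PCID           (1ull << 1) /* PCID field is valid */
#define INVLPGB_ASID           (1ull << 2) /* ASID field is valid */
#define INVLPGB_INCLUDE_GLOBAL (1ull << 3) /* also hit global entries */
#define INVLPGB_FINAL_ONLY     (1ull << 4) /* final translations only */

/* Hypothetical container for the three register operands. */
struct invlpgb_operands {
	uint64_t rax; /* page-aligned address | flags */
	uint32_t ecx; /* bit 31: 2M stride, low bits: extra page count */
	uint32_t edx; /* bits 27:16: PCID, bits 15:0: ASID */
};

/* Pack the operands; nothing is executed here. */
struct invlpgb_operands invlpgb_pack(uint16_t asid, uint16_t pcid,
				     uint64_t addr, uint16_t extra_count,
				     bool pmd_stride, uint64_t flags)
{
	struct invlpgb_operands ops;

	ops.rax = (addr & ~0xfffull) | flags;
	ops.ecx = ((uint32_t)pmd_stride << 31) | extra_count;
	ops.edx = ((uint32_t)(pcid & 0xfff) << 16) | asid;
	return ops;
}

/*
 * In kernel context the operands would then be fed to the instruction,
 * followed by TLBSYNC to wait for remote completion, roughly:
 *
 *   asm volatile(".byte 0x0f, 0x01, 0xfe" :: "a"(rax), "c"(ecx), "d"(edx));
 *   asm volatile(".byte 0x0f, 0x01, 0xff" ::: "memory");
 */
```

No IPI appears anywhere in that sequence, which is the point of the
paragraph above: the hardware broadcasts the invalidation, and the issuing
CPU only waits in TLBSYNC.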

Because x86 PCID space is limited, and there are some very large
systems out there, broadcast TLB invalidation is only used for
processes that are active on 3 or more CPUs, with the threshold
being gradually increased the more the PCID space gets exhausted.
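
To illustrate the policy described above, here is a minimal sketch of one
way such a threshold could scale. The cover letter only says that the
threshold starts at 3 CPUs and rises as the PCID space fills up; the linear
scaling, the constant MAX_EXTRA_THRESHOLD, and both function names below are
assumptions made for this example, not the logic used in the patches.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* From the cover letter: broadcast is used at 3 or more CPUs. */
#define BASE_CPU_THRESHOLD  3u
/* Hypothetical: how far the threshold may rise as IDs run out. */
#define MAX_EXTRA_THRESHOLD 32u

/*
 * Return the minimum number of CPUs a process must be active on
 * before it is given a broadcast ID. Assumed policy: grow linearly
 * with the fraction of the ID space already in use.
 */
unsigned int broadcast_cpu_threshold(unsigned int ids_in_use,
				     unsigned int ids_total)
{
	if (ids_total == 0 || ids_in_use >= ids_total)
		return UINT_MAX; /* nothing left to hand out */

	return BASE_CPU_THRESHOLD +
	       (unsigned int)(((uint64_t)ids_in_use * MAX_EXTRA_THRESHOLD) /
			      ids_total);
}

/* Decide whether a process qualifies for broadcast invalidation. */
bool should_use_broadcast(unsigned int active_cpus,
			  unsigned int ids_in_use,
			  unsigned int ids_total)
{
	if (ids_total == 0 || ids_in_use >= ids_total)
		return false;

	return active_cpus >= broadcast_cpu_threshold(ids_in_use, ids_total);
}
```

With a 12-bit PCID space (4096 values) and these assumed constants, an empty
ID space gives a threshold of 3 CPUs, and a half-full one gives 19.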

Combined with the removal of unnecessary lru_add_drain calls
(see https://lkml.org/lkml/2024/12/19/1388) this results in a
nice performance boost for the will-it-scale tlb_flush2_threads
test on an AMD Milan system with 36 cores:

- vanilla kernel:           527k loops/second
- lru_add_drain removal:    731k loops/second
- only INVLPGB:             527k loops/second
- lru_add_drain + INVLPGB: 1157k loops/second

Profiling with only the INVLPGB changes showed that while
TLB invalidation went down from 40% of the total CPU
time to only around 4% of CPU time, the contention
simply moved to the LRU lock.

Fixing both at the same time roughly doubles the number
of iterations per second in this case (527k to 1157k).

v2:
- Apply suggestions by Peter and Borislav (thank you!)
- Fix a bug in arch_tlbbatch_flush, which needs to both issue
  the TLBSYNC and flush the CPUs that are in the cpumask.
- Some updates to comments and changelogs based on questions.