From patchwork Wed Dec 18 16:56:04 2024
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 13913948
Date: Wed, 18 Dec 2024 11:56:04 -0500
From: Rik van Riel
To: Andrew Morton
Cc: "Huang, Ying", Chris Li, Ryan Roberts, David Hildenbrand,
 "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, kernel-team@meta.com
Subject: [PATCH] mm: add maybe_lru_add_drain() that only drains when threshold is exceeded
Message-ID: <20241218115604.7e56bedb@fangorn>
The lru_add_drain() call in zap_page_range_single() always takes some
locks, and will drain the buffers even when there is only a single
page pending.

We probably don't need to do that, since we already deal fine with
zap_page_range encountering pages that are still in the buffers of
other CPUs.
On an AMD Milan CPU, performance of the will-it-scale
tlb_flush2_threads test with 36 threads (one per core) increases
from 526k to 730k loops per second. The overhead in this case was
on the lruvec locks, taking the lock to flush a single page.

There may be other spots where this variant could be appropriate.

Signed-off-by: Rik van Riel
---
 include/linux/swap.h |  1 +
 mm/memory.c          |  2 +-
 mm/swap.c            | 18 ++++++++++++++++++
 mm/swap_state.c      |  2 +-
 4 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index dd5ac833150d..a2f06317bd4b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -391,6 +391,7 @@ static inline void lru_cache_enable(void)
 }
 
 extern void lru_cache_disable(void);
+extern void maybe_lru_add_drain(void);
 extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_cpu_zone(struct zone *zone);
diff --git a/mm/memory.c b/mm/memory.c
index 2635f7bceab5..1767c65b93ad 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1919,7 +1919,7 @@ void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
 	struct mmu_notifier_range range;
 	struct mmu_gather tlb;
 
-	lru_add_drain();
+	maybe_lru_add_drain();
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma->vm_mm,
 				address, end);
 	hugetlb_zap_begin(vma, &range.start, &range.end);
diff --git a/mm/swap.c b/mm/swap.c
index 9caf6b017cf0..001664a652ff 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -777,6 +777,24 @@ void lru_add_drain(void)
 	mlock_drain_local();
 }
 
+static bool should_lru_add_drain(void)
+{
+	struct cpu_fbatches *fbatches = this_cpu_ptr(&cpu_fbatches);
+	int pending = folio_batch_count(&fbatches->lru_add);
+	pending += folio_batch_count(&fbatches->lru_deactivate);
+	pending += folio_batch_count(&fbatches->lru_deactivate_file);
+	pending += folio_batch_count(&fbatches->lru_lazyfree);
+
+	/* Don't bother draining unless we have several pages pending. */
+	return pending > SWAP_CLUSTER_MAX;
+}
+
+void maybe_lru_add_drain(void)
+{
+	if (should_lru_add_drain())
+		lru_add_drain();
+}
+
 /*
  * It's called from per-cpu workqueue context in SMP case so
  * lru_add_drain_cpu and invalidate_bh_lrus_cpu should run on
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3a0cf965f32b..1ae4cd7b041e 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -317,7 +317,7 @@ void free_pages_and_swap_cache(struct encoded_page **pages, int nr)
 	struct folio_batch folios;
 	unsigned int refs[PAGEVEC_SIZE];
 
-	lru_add_drain();
+	maybe_lru_add_drain();
 	folio_batch_init(&folios);
 	for (int i = 0; i < nr; i++) {
 		struct folio *folio = page_folio(encoded_page_ptr(pages[i]));
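For reference, the threshold logic in should_lru_add_drain() can be
modeled in plain userspace C. This is only an illustrative sketch, not
kernel code: the struct and function names below are stand-ins
invented to mirror the patch (the real cpu_fbatches holds struct
folio_batch fields and is accessed via this_cpu_ptr()); only the
pending-count arithmetic is the same.

```c
#include <assert.h>
#include <stdbool.h>

/* Value of SWAP_CLUSTER_MAX in current kernels (include/linux/swap.h). */
#define SWAP_CLUSTER_MAX 32

/* Stand-in for one CPU's folio batches; only the counts are modeled. */
struct cpu_fbatches_model {
	int lru_add;
	int lru_deactivate;
	int lru_deactivate_file;
	int lru_lazyfree;
};

/* Mirrors should_lru_add_drain(): sum the pending pages across the
 * per-CPU batches and drain only when the total exceeds the threshold,
 * so a zap with one straggling page skips the lruvec locks entirely. */
static bool should_drain(const struct cpu_fbatches_model *fb)
{
	int pending = fb->lru_add + fb->lru_deactivate +
		      fb->lru_deactivate_file + fb->lru_lazyfree;

	return pending > SWAP_CLUSTER_MAX;
}
```

A single pending page (the case the benchmark above hammers) falls
well under the threshold, so maybe_lru_add_drain() becomes a cheap
per-CPU read instead of a lock acquisition.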