From patchwork Mon Apr 7 18:01:53 2025
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 14041470
From: Johannes Weiner
To: Andrew Morton
Cc: Vlastimil Babka, Brendan Jackman, Mel Gorman, Carlos Song,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    kernel test robot, stable@vger.kernel.org
Subject: [PATCH 1/2] mm: page_alloc: speed up fallbacks in rmqueue_bulk()
Date: Mon, 7 Apr 2025 14:01:53 -0400
Message-ID: <20250407180154.63348-1-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.49.0
The test robot identified c2f6ea38fc1b ("mm: page_alloc: don't steal
single pages from biggest buddy") as the root cause of a 56.4%
regression in vm-scalability::lru-file-mmap-read.

Carlos reports an earlier patch, c0cd6f557b90 ("mm: page_alloc: fix
freelist movement during block conversion"), as the root cause for a
regression in worst-case zone->lock+irqoff hold times.

Both of these patches modify the page allocator's fallback path to be
less greedy in an effort to stave off fragmentation. The flip side of
this is that fallbacks are also less productive each time around,
which means the fallback search can run much more frequently.

Carlos' traces point to rmqueue_bulk() specifically, which tries to
refill the percpu cache by allocating a large batch of pages in a
loop. It highlights how once the native freelists are exhausted, the
fallback code first scans orders top-down for whole blocks to claim,
then falls back to a bottom-up search for the smallest buddy to
steal. For the next batch page, it goes through the same thing again.

This can be made more efficient. Since rmqueue_bulk() holds the
zone->lock over the entire batch, the freelists are not subject to
outside changes; when the search for a block to claim has already
failed, there is no point in trying again for the next page.

Modify __rmqueue() to remember the last successful fallback mode, and
restart directly from there on the next rmqueue_bulk() iteration.
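
To illustrate the approach outside of the allocator, here is a
minimal, self-contained sketch (toy names and counters, not kernel
code): one attempt function remembers the cheapest source that last
succeeded, and the batch loop carries that state across iterations
the same way rmqueue_bulk() carries its rmqueue_mode.

#include <stdio.h>

enum source { SRC_NATIVE, SRC_CMA, SRC_CLAIM, SRC_STEAL, NR_SOURCES };

/* Items left in each toy "freelist" */
static int stock[NR_SOURCES] = { 2, 0, 1, 5 };

static int take_from(enum source s)
{
        if (stock[s] > 0) {
                stock[s]--;
                return 1;
        }
        return 0;
}

/*
 * One allocation attempt. *mode remembers where we last found
 * something; with the lock held for the whole batch, a source that
 * failed once cannot refill behind our back, so it is never rescanned.
 */
static int take_one(enum source *mode)
{
        switch (*mode) {
        case SRC_NATIVE:
                if (take_from(SRC_NATIVE))
                        return 1;
                /* fall through */
        case SRC_CMA:
                if (take_from(SRC_CMA)) {
                        *mode = SRC_CMA;
                        return 1;
                }
                /* fall through */
        case SRC_CLAIM:
                if (take_from(SRC_CLAIM)) {
                        /* in the allocator, claiming refills the native lists */
                        *mode = SRC_NATIVE;
                        return 1;
                }
                /* fall through */
        case SRC_STEAL:
                if (take_from(SRC_STEAL)) {
                        *mode = SRC_STEAL;
                        return 1;
                }
                /* fall through */
        default:
                return 0;
        }
}

int main(void)
{
        enum source mode = SRC_NATIVE;  /* persists across the batch */
        int i, got = 0;

        for (i = 0; i < 8; i++)         /* the rmqueue_bulk()-style loop */
                got += take_one(&mode);

        printf("allocated %d of 8\n", got);
        return 0;
}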
Oliver confirms that this improves beyond the regression that the
test robot reported against c2f6ea38fc1b:

commit:
  f3b92176f4 ("tools/selftests: add guard region test for /proc/$pid/pagemap")
  c2f6ea38fc ("mm: page_alloc: don't steal single pages from biggest buddy")
  acc4d5ff0b ("Merge tag 'net-6.15-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
  2c847f27c3 ("mm: page_alloc: speed up fallbacks in rmqueue_bulk()")   <--- your patch

f3b92176f4f7100f c2f6ea38fc1b640aa7a2e155cc1 acc4d5ff0b61eb1715c498b6536 2c847f27c37da65a93d23c237c5
---------------- --------------------------- --------------------------- ---------------------------
       %stddev     %change        %stddev     %change        %stddev     %change        %stddev
           \          |               \          |               \          |               \
  25525364 ± 3%    -56.4%   11135467          -57.8%   10779336          +31.6%   33581409   vm-scalability.throughput

Carlos confirms that worst-case times are almost fully recovered
compared to before the earlier culprit patch:

  2dd482ba627d (before freelist hygiene):    1ms
  c0cd6f557b90 (after freelist hygiene):    90ms
  next-20250319 (steal smallest buddy):    280ms
  this patch:                                8ms

Reported-by: kernel test robot
Reported-by: Carlos Song
Tested-by: kernel test robot
Fixes: c0cd6f557b90 ("mm: page_alloc: fix freelist movement during block conversion")
Fixes: c2f6ea38fc1b ("mm: page_alloc: don't steal single pages from biggest buddy")
Closes: https://lore.kernel.org/oe-lkp/202503271547.fc08b188-lkp@intel.com
Cc: stable@vger.kernel.org # 6.10+
Signed-off-by: Johannes Weiner
Acked-by: Johannes Weiner
Reviewed-by: Brendan Jackman
Signed-off-by: Brendan Jackman
Acked-by: Zi Yan
Tested-by: Carlos Song
Reviewed-by: Vlastimil Babka
Tested-by: Shivank Garg
---
 mm/page_alloc.c | 100 +++++++++++++++++++++++++++++++++++-------------
 1 file changed, 74 insertions(+), 26 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f51aa6051a99..03b0d45ed45a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2194,11 +2194,11 @@ try_to_claim_block(struct zone *zone, struct page *page,
  * The use of signed ints for order and current_order is a deliberate
  * deviation from the rest of this file, to make the for loop
  * condition simpler.
- *
- * Return the stolen page, or NULL if none can be found.
  */
+
+/* Try to claim a whole foreign block, take a page, expand the remainder */
 static __always_inline struct page *
-__rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
+__rmqueue_claim(struct zone *zone, int order, int start_migratetype,
                                                 unsigned int alloc_flags)
 {
         struct free_area *area;
@@ -2236,14 +2236,26 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
                 page = try_to_claim_block(zone, page, current_order, order,
                                           start_migratetype, fallback_mt,
                                           alloc_flags);
-                if (page)
-                        goto got_one;
+                if (page) {
+                        trace_mm_page_alloc_extfrag(page, order, current_order,
+                                                    start_migratetype, fallback_mt);
+                        return page;
+                }
         }
 
-        if (alloc_flags & ALLOC_NOFRAGMENT)
-                return NULL;
+        return NULL;
+}
+
+/* Try to steal a single page from a foreign block */
+static __always_inline struct page *
+__rmqueue_steal(struct zone *zone, int order, int start_migratetype)
+{
+        struct free_area *area;
+        int current_order;
+        struct page *page;
+        int fallback_mt;
+        bool claim_block;
 
-        /* No luck claiming pageblock. Find the smallest fallback page */
         for (current_order = order; current_order < NR_PAGE_ORDERS; current_order++) {
                 area = &(zone->free_area[current_order]);
                 fallback_mt = find_suitable_fallback(area, current_order,
@@ -2253,25 +2265,28 @@ __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
                 page = get_page_from_free_area(area, fallback_mt);
                 page_del_and_expand(zone, page, order, current_order,
                                     fallback_mt);
-                goto got_one;
+                trace_mm_page_alloc_extfrag(page, order, current_order,
+                                            start_migratetype, fallback_mt);
+                return page;
         }
 
         return NULL;
-
-got_one:
-        trace_mm_page_alloc_extfrag(page, order, current_order,
-                                    start_migratetype, fallback_mt);
-
-        return page;
 }
 
+enum rmqueue_mode {
+        RMQUEUE_NORMAL,
+        RMQUEUE_CMA,
+        RMQUEUE_CLAIM,
+        RMQUEUE_STEAL,
+};
+
 /*
  * Do the hard work of removing an element from the buddy allocator.
  * Call me with the zone->lock already held.
  */
 static __always_inline struct page *
 __rmqueue(struct zone *zone, unsigned int order, int migratetype,
-                                                unsigned int alloc_flags)
+          unsigned int alloc_flags, enum rmqueue_mode *mode)
 {
         struct page *page;
@@ -2290,16 +2305,47 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
                 }
         }
 
-        page = __rmqueue_smallest(zone, order, migratetype);
-        if (unlikely(!page)) {
-                if (alloc_flags & ALLOC_CMA)
+        /*
+         * Try the different freelists, native then foreign.
+         *
+         * The fallback logic is expensive and rmqueue_bulk() calls in
+         * a loop with the zone->lock held, meaning the freelists are
+         * not subject to any outside changes. Remember in *mode where
+         * we found pay dirt, to save us the search on the next call.
+         */
+        switch (*mode) {
+        case RMQUEUE_NORMAL:
+                page = __rmqueue_smallest(zone, order, migratetype);
+                if (page)
+                        return page;
+                fallthrough;
+        case RMQUEUE_CMA:
+                if (alloc_flags & ALLOC_CMA) {
                         page = __rmqueue_cma_fallback(zone, order);
-
-                if (!page)
-                        page = __rmqueue_fallback(zone, order, migratetype,
-                                                  alloc_flags);
+                        if (page) {
+                                *mode = RMQUEUE_CMA;
+                                return page;
+                        }
+                }
+                fallthrough;
+        case RMQUEUE_CLAIM:
+                page = __rmqueue_claim(zone, order, migratetype, alloc_flags);
+                if (page) {
+                        /* Replenished native freelist, back to normal mode */
+                        *mode = RMQUEUE_NORMAL;
+                        return page;
+                }
+                fallthrough;
+        case RMQUEUE_STEAL:
+                if (!(alloc_flags & ALLOC_NOFRAGMENT)) {
+                        page = __rmqueue_steal(zone, order, migratetype);
+                        if (page) {
+                                *mode = RMQUEUE_STEAL;
+                                return page;
+                        }
+                }
         }
-        return page;
+        return NULL;
 }
 
 /*
@@ -2311,6 +2357,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
                         unsigned long count, struct list_head *list,
                         int migratetype, unsigned int alloc_flags)
 {
+        enum rmqueue_mode rmqm = RMQUEUE_NORMAL;
         unsigned long flags;
         int i;
 
@@ -2321,7 +2368,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
         }
         for (i = 0; i < count; ++i) {
                 struct page *page = __rmqueue(zone, order, migratetype,
-                                                                alloc_flags);
+                                              alloc_flags, &rmqm);
                 if (unlikely(page == NULL))
                         break;
 
@@ -2934,6 +2981,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 {
         struct page *page;
         unsigned long flags;
+        enum rmqueue_mode rmqm = RMQUEUE_NORMAL;
 
         do {
                 page = NULL;
@@ -2945,7 +2993,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
                 if (alloc_flags & ALLOC_HIGHATOMIC)
                         page = __rmqueue_smallest(zone, order, MIGRATE_HIGHATOMIC);
                 if (!page) {
-                        page = __rmqueue(zone, order, migratetype, alloc_flags);
+                        page = __rmqueue(zone, order, migratetype, alloc_flags, &rmqm);
 
                         /*
                          * If the allocation fails, allow OOM handling and
From patchwork Mon Apr 7 18:01:54 2025
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 14041471
From: Johannes Weiner
To: Andrew Morton
Cc: Vlastimil Babka, Brendan Jackman, Mel Gorman, Carlos Song,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] mm: page_alloc: tighten up find_suitable_fallback()
Date: Mon, 7 Apr 2025 14:01:54 -0400
Message-ID: <20250407180154.63348-2-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250407180154.63348-1-hannes@cmpxchg.org>
References: <20250407180154.63348-1-hannes@cmpxchg.org>
find_suitable_fallback() is not as efficient as it could be, and
somewhat difficult to follow.

1. should_try_claim_block() is a loop invariant. There is no point in
   checking fallback areas if the caller is interested in claimable
   blocks but the order and the migratetype don't allow for that.

2. __rmqueue_steal() doesn't care about claimability, so it shouldn't
   have to run those tests.

Different callers want different things from this helper:

1. __compact_finished() scans orders up until it finds a claimable block
2. __rmqueue_claim() scans orders down as long as blocks are claimable
3. __rmqueue_steal() doesn't care about claimability at all

Move should_try_claim_block() out of the loop. Only test it for the
two callers who care in the first place. Distinguish "no blocks" from
"order + mt are not claimable" in the return value; __rmqueue_claim()
can stop once order becomes unclaimable, __compact_finished() can
keep advancing until order becomes claimable.
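
To make the new contract concrete, here is an illustrative,
self-contained sketch of how the two return-value consumers differ
(toy cutoffs, not the kernel prototypes): -1 means "no blocks in this
order", -2 means "this order/migratetype may not claim", and >= 0 is
a usable fallback migratetype.

#include <stdio.h>

#define NO_BLOCKS      -1      /* nothing free at this order */
#define NOT_CLAIMABLE  -2      /* order/migratetype may not claim a block */

/* Toy stand-in for find_suitable_fallback(); the cutoffs are made up. */
static int suitable_fallback(int order, int claimable)
{
        if (claimable && order < 3)
                return NOT_CLAIMABLE;
        if (order == 5)
                return NO_BLOCKS;
        return 0;               /* some fallback migratetype */
}

int main(void)
{
        int order, mt;

        /* __rmqueue_claim() style: scan down, stop once claiming is off */
        for (order = 9; order >= 0; order--) {
                mt = suitable_fallback(order, 1);
                if (mt == NO_BLOCKS)
                        continue;       /* try the next lower order */
                if (mt == NOT_CLAIMABLE)
                        break;          /* lower orders cannot claim either */
                printf("claim a block at order %d, mt %d\n", order, mt);
                break;                  /* would take the page and return */
        }

        /* __compact_finished() style: scan up until an order can claim */
        for (order = 0; order <= 9; order++) {
                if (suitable_fallback(order, 1) >= 0) {
                        printf("claimable fallback found at order %d\n", order);
                        break;
                }
        }

        return 0;
}

__rmqueue_steal() passes claimable=false, so it only ever sees -1 or
a valid migratetype.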
Before:

 Performance counter stats for './run case-lru-file-mmap-read' (5 runs):

          85,294.85 msec task-clock          #    5.644 CPUs utilized      ( +- 0.32% )
             15,968      context-switches    #  187.209 /sec               ( +- 3.81% )
                153      cpu-migrations      #    1.794 /sec               ( +- 3.29% )
            801,808      page-faults         #    9.400 K/sec              ( +- 0.10% )
    733,358,331,786      instructions        #    1.87  insn per cycle     ( +- 0.20% )  (64.94%)
    392,622,904,199      cycles              #    4.603 GHz                ( +- 0.31% )  (64.84%)
    148,563,488,531      branches            #    1.742 G/sec              ( +- 0.18% )  (63.86%)
        152,143,228      branch-misses       #    0.10% of all branches    ( +- 1.19% )  (62.82%)

            15.1128 +- 0.0637 seconds time elapsed  ( +- 0.42% )

After:

 Performance counter stats for './run case-lru-file-mmap-read' (5 runs):

          84,380.21 msec task-clock          #    5.664 CPUs utilized      ( +- 0.21% )
             16,656      context-switches    #  197.392 /sec               ( +- 3.27% )
                151      cpu-migrations      #    1.790 /sec               ( +- 3.28% )
            801,703      page-faults         #    9.501 K/sec              ( +- 0.09% )
    731,914,183,060      instructions        #    1.88  insn per cycle     ( +- 0.38% )  (64.90%)
    388,673,535,116      cycles              #    4.606 GHz                ( +- 0.24% )  (65.06%)
    148,251,482,143      branches            #    1.757 G/sec              ( +- 0.37% )  (63.92%)
        149,766,550      branch-misses       #    0.10% of all branches    ( +- 1.22% )  (62.88%)

            14.8968 +- 0.0486 seconds time elapsed  ( +- 0.33% )

Signed-off-by: Johannes Weiner
Reviewed-by: Vlastimil Babka
Tested-by: Shivank Garg
Reviewed-by: Brendan Jackman
Signed-off-by: Brendan Jackman
Acked-by: Johannes Weiner
---
 mm/compaction.c |  4 +---
 mm/internal.h   |  2 +-
 mm/page_alloc.c | 31 +++++++++++++------------------
 3 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 139f00c0308a..7462a02802a5 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2348,7 +2348,6 @@ static enum compact_result __compact_finished(struct compact_control *cc)
         ret = COMPACT_NO_SUITABLE_PAGE;
         for (order = cc->order; order < NR_PAGE_ORDERS; order++) {
                 struct free_area *area = &cc->zone->free_area[order];
-                bool claim_block;
 
                 /* Job done if page is free of the right migratetype */
                 if (!free_area_empty(area, migratetype))
@@ -2364,8 +2363,7 @@ static enum compact_result __compact_finished(struct compact_control *cc)
                  * Job done if allocation would steal freepages from
                  * other migratetype buddy lists.
                  */
-                if (find_suitable_fallback(area, order, migratetype,
-                                           true, &claim_block) != -1)
+                if (find_suitable_fallback(area, order, migratetype, true) >= 0)
                         /*
                          * Movable pages are OK in any pageblock. If we are
                          * stealing for a non-movable allocation, make sure
diff --git a/mm/internal.h b/mm/internal.h
index 50c2f590b2d0..55384b9971c3 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -915,7 +915,7 @@ static inline void init_cma_pageblock(struct page *page)
 
 int find_suitable_fallback(struct free_area *area, unsigned int order,
-                           int migratetype, bool claim_only, bool *claim_block);
+                           int migratetype, bool claimable);
 
 static inline bool free_area_empty(struct free_area *area, int migratetype)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 03b0d45ed45a..1522e3a29b16 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2077,31 +2077,25 @@ static bool should_try_claim_block(unsigned int order, int start_mt)
 
 /*
  * Check whether there is a suitable fallback freepage with requested order.
- * Sets *claim_block to instruct the caller whether it should convert a whole
- * pageblock to the returned migratetype.
- * If only_claim is true, this function returns fallback_mt only if
+ * If claimable is true, this function returns fallback_mt only if
  * we would do this whole-block claiming. This would help to reduce
  * fragmentation due to mixed migratetype pages in one pageblock.
  */
 int find_suitable_fallback(struct free_area *area, unsigned int order,
-                           int migratetype, bool only_claim, bool *claim_block)
+                           int migratetype, bool claimable)
 {
         int i;
-        int fallback_mt;
+
+        if (claimable && !should_try_claim_block(order, migratetype))
+                return -2;
 
         if (area->nr_free == 0)
                 return -1;
 
-        *claim_block = false;
         for (i = 0; i < MIGRATE_PCPTYPES - 1 ; i++) {
-                fallback_mt = fallbacks[migratetype][i];
-                if (free_area_empty(area, fallback_mt))
-                        continue;
+                int fallback_mt = fallbacks[migratetype][i];
 
-                if (should_try_claim_block(order, migratetype))
-                        *claim_block = true;
-
-                if (*claim_block || !only_claim)
+                if (!free_area_empty(area, fallback_mt))
                         return fallback_mt;
         }
 
@@ -2206,7 +2200,6 @@ __rmqueue_claim(struct zone *zone, int order, int start_migratetype,
         int min_order = order;
         struct page *page;
         int fallback_mt;
-        bool claim_block;
 
         /*
          * Do not steal pages from freelists belonging to other pageblocks
@@ -2225,11 +2218,14 @@ __rmqueue_claim(struct zone *zone, int order, int start_migratetype,
                         --current_order) {
                 area = &(zone->free_area[current_order]);
                 fallback_mt = find_suitable_fallback(area, current_order,
-                                start_migratetype, false, &claim_block);
+                                                     start_migratetype, true);
+
+                /* No block in that order */
                 if (fallback_mt == -1)
                         continue;
 
-                if (!claim_block)
+                /* Advanced into orders too low to claim, abort */
+                if (fallback_mt == -2)
                         break;
 
                 page = get_page_from_free_area(area, fallback_mt);
@@ -2254,12 +2250,11 @@ __rmqueue_steal(struct zone *zone, int order, int start_migratetype)
         int current_order;
         struct page *page;
         int fallback_mt;
-        bool claim_block;
 
         for (current_order = order; current_order < NR_PAGE_ORDERS; current_order++) {
                 area = &(zone->free_area[current_order]);
                 fallback_mt = find_suitable_fallback(area, current_order,
-                                start_migratetype, false, &claim_block);
+                                                     start_migratetype, false);
                 if (fallback_mt == -1)
                         continue;