From patchwork Mon Sep 11 19:41:42 2023
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13379606
From: Johannes Weiner
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Miaohe Lin, Kefeng Wang, Zi Yan,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/6] mm: page_alloc: remove pcppage migratetype caching
Date: Mon, 11 Sep 2023 15:41:42 -0400
Message-ID: <20230911195023.247694-2-hannes@cmpxchg.org>
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>

The idea behind the cache is to save get_pageblock_migratetype()
lookups during bulk freeing. A microbenchmark suggests this isn't
helping, though. The pcp migratetype can get stale, which means that
bulk freeing has an extra branch to check if the pageblock was
isolated while on the pcp. While the variance overlaps, the cache
write and the branch seem to make this a net negative.
The following test allocates and frees batches of 10,000 pages (~3x
the pcp high marks to trigger flushing):

Before:
          8,668.48 msec task-clock        #   99.735 CPUs utilized    ( +-  2.90% )
                19      context-switches  #    4.341 /sec             ( +-  3.24% )
                 0      cpu-migrations    #    0.000 /sec
            17,440      page-faults       #    3.984 K/sec            ( +-  2.90% )
    41,758,692,473      cycles            #    9.541 GHz              ( +-  2.90% )
   126,201,294,231      instructions      #    5.98  insn per cycle   ( +-  2.90% )
    25,348,098,335      branches          #    5.791 G/sec            ( +-  2.90% )
        33,436,921      branch-misses     #    0.26% of all branches  ( +-  2.90% )

         0.0869148 +- 0.0000302 seconds time elapsed  ( +-  0.03% )

After:
          8,444.81 msec task-clock        #   99.726 CPUs utilized    ( +-  2.90% )
                22      context-switches  #    5.160 /sec             ( +-  3.23% )
                 0      cpu-migrations    #    0.000 /sec
            17,443      page-faults       #    4.091 K/sec            ( +-  2.90% )
    40,616,738,355      cycles            #    9.527 GHz              ( +-  2.90% )
   126,383,351,792      instructions      #    6.16  insn per cycle   ( +-  2.90% )
    25,224,985,153      branches          #    5.917 G/sec            ( +-  2.90% )
        32,236,793      branch-misses     #    0.25% of all branches  ( +-  2.90% )

         0.0846799 +- 0.0000412 seconds time elapsed  ( +-  0.05% )

A side effect is that this also ensures that pages whose pageblock
gets stolen while on the pcplist end up on the right freelist and we
don't perform potentially type-incompatible buddy merges (or skip
merges when we shouldn't), which is likely beneficial to long-term
fragmentation management, although the effects would be harder to
measure. Settle for simpler and faster code as justification here.
Signed-off-by: Johannes Weiner
Acked-by: Zi Yan
Reviewed-by: Vlastimil Babka
Acked-by: Mel Gorman
---
 mm/page_alloc.c | 61 ++++++++++++------------------------------------
 1 file changed, 14 insertions(+), 47 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 95546f376302..e3f1c777feed 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,24 +204,6 @@ EXPORT_SYMBOL(node_states);
 
 gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
 
-/*
- * A cached value of the page's pageblock's migratetype, used when the page is
- * put on a pcplist. Used to avoid the pageblock migratetype lookup when
- * freeing from pcplists in most cases, at the cost of possibly becoming stale.
- * Also the migratetype set in the page does not necessarily match the pcplist
- * index, e.g. page might have MIGRATE_CMA set but be on a pcplist with any
- * other index - this ensures that it will be put on the correct CMA freelist.
- */
-static inline int get_pcppage_migratetype(struct page *page)
-{
-	return page->index;
-}
-
-static inline void set_pcppage_migratetype(struct page *page, int migratetype)
-{
-	page->index = migratetype;
-}
-
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
 unsigned int pageblock_order __read_mostly;
 #endif
@@ -1186,7 +1168,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 {
 	unsigned long flags;
 	unsigned int order;
-	bool isolated_pageblocks;
 	struct page *page;
 
 	/*
@@ -1199,7 +1180,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		pindex = pindex - 1;
 
 	spin_lock_irqsave(&zone->lock, flags);
-	isolated_pageblocks = has_isolate_pageblock(zone);
 
 	while (count > 0) {
 		struct list_head *list;
@@ -1215,10 +1195,12 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		order = pindex_to_order(pindex);
 		nr_pages = 1 << order;
 		do {
+			unsigned long pfn;
 			int mt;
 
 			page = list_last_entry(list, struct page, pcp_list);
-			mt = get_pcppage_migratetype(page);
+			pfn = page_to_pfn(page);
+			mt = get_pfnblock_migratetype(page, pfn);
 
 			/* must delete to avoid corrupting pcp list */
 			list_del(&page->pcp_list);
@@ -1227,11 +1209,8 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 			/* MIGRATE_ISOLATE page should not go to pcplists */
 			VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
 
-			/* Pageblock could have been isolated meanwhile */
-			if (unlikely(isolated_pageblocks))
-				mt = get_pageblock_migratetype(page);
 
-			__free_one_page(page, page_to_pfn(page), zone, order, mt, FPI_NONE);
+			__free_one_page(page, pfn, zone, order, mt, FPI_NONE);
 			trace_mm_page_pcpu_drain(page, order, mt);
 		} while (count > 0 && !list_empty(list));
 	}
@@ -1577,7 +1556,6 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
 			continue;
 		del_page_from_free_list(page, zone, current_order);
 		expand(zone, page, order, current_order, migratetype);
-		set_pcppage_migratetype(page, migratetype);
 		trace_mm_page_alloc_zone_locked(page, order, migratetype,
 				pcp_allowed_order(order) &&
 				migratetype < MIGRATE_PCPTYPES);
@@ -2145,7 +2123,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 		 * pages are ordered properly.
 		 */
 		list_add_tail(&page->pcp_list, list);
-		if (is_migrate_cma(get_pcppage_migratetype(page)))
+		if (is_migrate_cma(get_pageblock_migratetype(page)))
 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
 					      -(1 << order));
 	}
@@ -2304,19 +2282,6 @@ void drain_all_pages(struct zone *zone)
 	__drain_all_pages(zone, false);
 }
 
-static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
-				    unsigned int order)
-{
-	int migratetype;
-
-	if (!free_pages_prepare(page, order, FPI_NONE))
-		return false;
-
-	migratetype = get_pfnblock_migratetype(page, pfn);
-	set_pcppage_migratetype(page, migratetype);
-	return true;
-}
-
 static int nr_pcp_free(struct per_cpu_pages *pcp, int high, bool free_high)
 {
 	int min_nr_free, max_nr_free;
@@ -2402,7 +2367,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	unsigned long pfn = page_to_pfn(page);
 	int migratetype, pcpmigratetype;
 
-	if (!free_unref_page_prepare(page, pfn, order))
+	if (!free_pages_prepare(page, order, FPI_NONE))
 		return;
 
 	/*
@@ -2412,7 +2377,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	 * get those areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
-	migratetype = pcpmigratetype = get_pcppage_migratetype(page);
+	migratetype = pcpmigratetype = get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
@@ -2448,7 +2413,8 @@ void free_unref_page_list(struct list_head *list)
 	/* Prepare pages for freeing */
 	list_for_each_entry_safe(page, next, list, lru) {
 		unsigned long pfn = page_to_pfn(page);
-		if (!free_unref_page_prepare(page, pfn, 0)) {
+
+		if (!free_pages_prepare(page, 0, FPI_NONE)) {
 			list_del(&page->lru);
 			continue;
 		}
@@ -2457,7 +2423,7 @@ void free_unref_page_list(struct list_head *list)
 		 * Free isolated pages directly to the allocator, see
 		 * comment in free_unref_page.
 		 */
-		migratetype = get_pcppage_migratetype(page);
+		migratetype = get_pfnblock_migratetype(page, pfn);
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			list_del(&page->lru);
 			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
@@ -2466,10 +2432,11 @@ void free_unref_page_list(struct list_head *list)
 	}
 
 	list_for_each_entry_safe(page, next, list, lru) {
+		unsigned long pfn = page_to_pfn(page);
 		struct zone *zone = page_zone(page);
 
 		list_del(&page->lru);
-		migratetype = get_pcppage_migratetype(page);
+		migratetype = get_pfnblock_migratetype(page, pfn);
 
 		/*
 		 * Either different zone requiring a different pcp lock or
@@ -2492,7 +2459,7 @@ void free_unref_page_list(struct list_head *list)
 			pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 			if (unlikely(!pcp)) {
 				pcp_trylock_finish(UP_flags);
-				free_one_page(zone, page, page_to_pfn(page),
+				free_one_page(zone, page, pfn,
 					      0, migratetype, FPI_NONE);
 				locked_zone = NULL;
 				continue;
@@ -2661,7 +2628,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 			}
 		}
 		__mod_zone_freepage_state(zone, -(1 << order),
-					  get_pcppage_migratetype(page));
+					  get_pageblock_migratetype(page));
 		spin_unlock_irqrestore(&zone->lock, flags);
 	} while (check_new_pages(page, order));