From patchwork Wed Mar 20 18:02:06 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13598061
From: Johannes Weiner
To: Andrew Morton
Cc: Vlastimil Babka, Mel Gorman, Zi Yan, "Huang, Ying", David Hildenbrand,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 01/10] mm: page_alloc: remove pcppage migratetype caching
Date: Wed, 20 Mar 2024 14:02:06 -0400
Message-ID: <20240320180429.678181-2-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240320180429.678181-1-hannes@cmpxchg.org>
References: <20240320180429.678181-1-hannes@cmpxchg.org>

The idea behind the cache is to save get_pageblock_migratetype()
lookups during bulk freeing. A microbenchmark suggests this isn't
helping, though. The pcp migratetype can get stale, which means that
bulk freeing has an extra branch to check if the pageblock was
isolated while on the pcp.

While the variance overlaps, the cache write and the branch seem to
make this a net negative.

The following test allocates and frees batches of 10,000 pages (~3x
the pcp high marks to trigger flushing):

Before:
          8,668.48 msec task-clock        #   99.735 CPUs utilized    ( +- 2.90% )
                19      context-switches  #    4.341 /sec             ( +- 3.24% )
                 0      cpu-migrations    #    0.000 /sec
            17,440      page-faults       #    3.984 K/sec            ( +- 2.90% )
    41,758,692,473      cycles            #    9.541 GHz              ( +- 2.90% )
   126,201,294,231      instructions      #    5.98  insn per cycle   ( +- 2.90% )
    25,348,098,335      branches          #    5.791 G/sec            ( +- 2.90% )
        33,436,921      branch-misses     #    0.26% of all branches  ( +- 2.90% )

         0.0869148 +- 0.0000302 seconds time elapsed  ( +- 0.03% )

After:
          8,444.81 msec task-clock        #   99.726 CPUs utilized    ( +- 2.90% )
                22      context-switches  #    5.160 /sec             ( +- 3.23% )
                 0      cpu-migrations    #    0.000 /sec
            17,443      page-faults       #    4.091 K/sec            ( +- 2.90% )
    40,616,738,355      cycles            #    9.527 GHz              ( +- 2.90% )
   126,383,351,792      instructions      #    6.16  insn per cycle   ( +- 2.90% )
    25,224,985,153      branches          #    5.917 G/sec            ( +- 2.90% )
        32,236,793      branch-misses     #    0.25% of all branches  ( +- 2.90% )

         0.0846799 +- 0.0000412 seconds time elapsed  ( +- 0.05% )

A side effect is that this also ensures that pages whose pageblock
gets stolen while on the pcplist end up on the right freelist and we
don't perform potentially type-incompatible buddy merges (or skip
merges when we shouldn't), which is likely beneficial to long-term
fragmentation management, although the effects would be harder to
measure. Settle for simpler and faster code as justification here.

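The stale-cache hazard described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code: the names toy_mt, block_mt, cached_mt, lookup_mt,
pcp_add_cached, flush_cached, flush_lookup and demo_stale_cache are all
hypothetical, and the four-page "pageblock" is an arbitrary toy size.
It shows why a type snapshotted when a page enters the pcplist can
disagree with the pageblock's type by the time the page is flushed,
while a lookup at flush time cannot.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel migratetypes; not the real enum. */
enum toy_mt { MT_UNMOVABLE, MT_MOVABLE };

#define PAGES_PER_BLOCK	4	/* toy pageblock size (assumption) */
#define NR_BLOCKS	2
#define NR_PAGES	(NR_BLOCKS * PAGES_PER_BLOCK)

/* Models the per-pageblock migratetype record. */
enum toy_mt block_mt[NR_BLOCKS];

/* Models the per-page cache that the patch removes (page->index). */
enum toy_mt cached_mt[NR_PAGES];

/* Fresh lookup, analogous in spirit to get_pfnblock_migratetype(). */
enum toy_mt lookup_mt(unsigned long pfn)
{
	assert(pfn < NR_PAGES);
	return block_mt[pfn / PAGES_PER_BLOCK];
}

/* Old scheme: snapshot the type when the page enters the pcplist. */
void pcp_add_cached(unsigned long pfn)
{
	cached_mt[pfn] = lookup_mt(pfn);
}

/* Old scheme: the flush trusts the snapshot. */
enum toy_mt flush_cached(unsigned long pfn)
{
	assert(pfn < NR_PAGES);
	return cached_mt[pfn];
}

/* New scheme: the flush asks the pageblock again. */
enum toy_mt flush_lookup(unsigned long pfn)
{
	return lookup_mt(pfn);
}

/* Returns 1 if the snapshot and the fresh lookup disagree at flush time. */
int demo_stale_cache(void)
{
	unsigned long pfn = 5;		/* lives in toy block 1 */

	block_mt[1] = MT_MOVABLE;
	pcp_add_cached(pfn);		/* page parked on the pcplist */
	block_mt[1] = MT_UNMOVABLE;	/* block changes type meanwhile */

	return flush_cached(pfn) != flush_lookup(pfn);
}
```

Calling demo_stale_cache() from a small main() returns 1 in this model:
the snapshot still says MT_MOVABLE while the block is now
MT_UNMOVABLE, which is the situation in which the old code could put a
page on the wrong freelist.
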
v2:
- remove erroneous leftover VM_BUG_ON in pcp bulk freeing (Mike)

Acked-by: Zi Yan
Reviewed-by: Vlastimil Babka
Acked-by: Mel Gorman
Tested-by: "Huang, Ying"
Signed-off-by: Johannes Weiner
---
 mm/page_alloc.c | 66 +++++++++++--------------------------------------
 1 file changed, 14 insertions(+), 52 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4491d0240bc6..60a632b7c9f6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -206,24 +206,6 @@ EXPORT_SYMBOL(node_states);
 
 gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
 
-/*
- * A cached value of the page's pageblock's migratetype, used when the page is
- * put on a pcplist. Used to avoid the pageblock migratetype lookup when
- * freeing from pcplists in most cases, at the cost of possibly becoming stale.
- * Also the migratetype set in the page does not necessarily match the pcplist
- * index, e.g. page might have MIGRATE_CMA set but be on a pcplist with any
- * other index - this ensures that it will be put on the correct CMA freelist.
- */
-static inline int get_pcppage_migratetype(struct page *page)
-{
-	return page->index;
-}
-
-static inline void set_pcppage_migratetype(struct page *page, int migratetype)
-{
-	page->index = migratetype;
-}
-
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
 unsigned int pageblock_order __read_mostly;
 #endif
@@ -1191,7 +1173,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 {
 	unsigned long flags;
 	unsigned int order;
-	bool isolated_pageblocks;
 	struct page *page;
 
 	/*
@@ -1204,7 +1185,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	pindex = pindex - 1;
 
 	spin_lock_irqsave(&zone->lock, flags);
-	isolated_pageblocks = has_isolate_pageblock(zone);
 
 	while (count > 0) {
 		struct list_head *list;
@@ -1220,23 +1200,19 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		order = pindex_to_order(pindex);
 		nr_pages = 1 << order;
 		do {
+			unsigned long pfn;
 			int mt;
 
 			page = list_last_entry(list, struct page, pcp_list);
-			mt = get_pcppage_migratetype(page);
+			pfn = page_to_pfn(page);
+			mt = get_pfnblock_migratetype(page, pfn);
 
 			/* must delete to avoid corrupting pcp list */
 			list_del(&page->pcp_list);
 			count -= nr_pages;
 			pcp->count -= nr_pages;
 
-			/* MIGRATE_ISOLATE page should not go to pcplists */
-			VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
-			/* Pageblock could have been isolated meanwhile */
-			if (unlikely(isolated_pageblocks))
-				mt = get_pageblock_migratetype(page);
-
-			__free_one_page(page, page_to_pfn(page), zone, order, mt, FPI_NONE);
+			__free_one_page(page, pfn, zone, order, mt, FPI_NONE);
 			trace_mm_page_pcpu_drain(page, order, mt);
 		} while (count > 0 && !list_empty(list));
 	}
@@ -1575,7 +1551,6 @@ struct page *__rmqueue_smallest(struct zone *zone, unsigned int order,
 			continue;
 		del_page_from_free_list(page, zone, current_order);
 		expand(zone, page, order, current_order, migratetype);
-		set_pcppage_migratetype(page, migratetype);
 		trace_mm_page_alloc_zone_locked(page, order, migratetype,
 				pcp_allowed_order(order) &&
 				migratetype < MIGRATE_PCPTYPES);
@@ -2182,7 +2157,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
 		 * pages are ordered properly.
 		 */
 		list_add_tail(&page->pcp_list, list);
-		if (is_migrate_cma(get_pcppage_migratetype(page)))
+		if (is_migrate_cma(get_pageblock_migratetype(page)))
 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
 					      -(1 << order));
 	}
@@ -2378,19 +2353,6 @@ void drain_all_pages(struct zone *zone)
 	__drain_all_pages(zone, false);
 }
 
-static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
-							unsigned int order)
-{
-	int migratetype;
-
-	if (!free_pages_prepare(page, order))
-		return false;
-
-	migratetype = get_pfnblock_migratetype(page, pfn);
-	set_pcppage_migratetype(page, migratetype);
-	return true;
-}
-
 static int nr_pcp_free(struct per_cpu_pages *pcp, int batch, int high, bool free_high)
 {
 	int min_nr_free, max_nr_free;
@@ -2523,7 +2485,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	unsigned long pfn = page_to_pfn(page);
 	int migratetype, pcpmigratetype;
 
-	if (!free_unref_page_prepare(page, pfn, order))
+	if (!free_pages_prepare(page, order))
 		return;
 
 	/*
@@ -2533,7 +2495,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	 * get those areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
-	migratetype = pcpmigratetype = get_pcppage_migratetype(page);
+	migratetype = pcpmigratetype = get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
@@ -2572,14 +2534,14 @@ void free_unref_folios(struct folio_batch *folios)
 
 		if (order > 0 && folio_test_large_rmappable(folio))
 			folio_undo_large_rmappable(folio);
-		if (!free_unref_page_prepare(&folio->page, pfn, order))
+		if (!free_pages_prepare(&folio->page, order))
 			continue;
 
 		/*
 		 * Free isolated folios and orders not handled on the PCP
 		 * directly to the allocator, see comment in free_unref_page.
 		 */
-		migratetype = get_pcppage_migratetype(&folio->page);
+		migratetype = get_pfnblock_migratetype(&folio->page, pfn);
 		if (!pcp_allowed_order(order) ||
 		    is_migrate_isolate(migratetype)) {
 			free_one_page(folio_zone(folio), &folio->page, pfn,
@@ -2596,10 +2558,11 @@ void free_unref_folios(struct folio_batch *folios)
 	for (i = 0; i < folios->nr; i++) {
 		struct folio *folio = folios->folios[i];
 		struct zone *zone = folio_zone(folio);
+		unsigned long pfn = folio_pfn(folio);
 		unsigned int order = (unsigned long)folio->private;
 
 		folio->private = NULL;
-		migratetype = get_pcppage_migratetype(&folio->page);
+		migratetype = get_pfnblock_migratetype(&folio->page, pfn);
 
 		/* Different zone requires a different pcp lock */
 		if (zone != locked_zone) {
@@ -2616,9 +2579,8 @@ void free_unref_folios(struct folio_batch *folios)
 			pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 			if (unlikely(!pcp)) {
 				pcp_trylock_finish(UP_flags);
-				free_one_page(zone, &folio->page,
-					      folio_pfn(folio), order,
-					      migratetype, FPI_NONE);
+				free_one_page(zone, &folio->page, pfn,
+					      order, migratetype, FPI_NONE);
 				locked_zone = NULL;
 				continue;
 			}
@@ -2787,7 +2749,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zone, struct zone *zone,
 			}
 		}
 		__mod_zone_freepage_state(zone, -(1 << order),
-					  get_pcppage_migratetype(page));
+					  get_pageblock_migratetype(page));
 		spin_unlock_irqrestore(&zone->lock, flags);
 	} while (check_new_pages(page, order));
 