From patchwork Fri Oct 19 04:33:03 2018
X-Patchwork-Submitter: Wei Yang
X-Patchwork-Id: 10648651
Date: Fri, 19 Oct 2018 04:33:03 +0000
From: Wei Yang <richard.weiyang@gmail.com>
To: willy@infradead.org, mhocko@suse.com, mgorman@techsingularity.net
Cc: richard.weiyang@gmail.com, linux-mm@kvack.org, akpm@linux-foundation.org
Subject: [RFC] put page to pcp->lists[] tail if it is not on the same node
Message-ID: <20181019043303.s5axhjfb2v2lzsr3@master>

Masters,

While reading the code, I came up with this idea.
If we add some NUMA-node intelligence to pcp->lists[], we may get better
performance. The idea is simple: put pages that belong to other nodes at
the tail of pcp->lists[], because we allocate from the head and free back
to the buddy allocator from the tail.

Since my desktop has only one NUMA node, I couldn't test the intended
effect. I just ran a kernel build test to check whether it would degrade
the current kernel. The result does not look bad.

make -j4 bzImage

base-line:
real	6m15.947s
user	21m14.481s
sys	2m34.407s

real	6m16.089s
user	21m18.295s
sys	2m35.551s

real	6m16.239s
user	21m17.590s
sys	2m35.252s

patched:
real	6m14.558s
user	21m18.374s
sys	2m33.143s

real	6m14.606s
user	21m14.969s
sys	2m32.039s

real	6m15.264s
user	21m16.698s
sys	2m33.024s

Sorry for sending this without a solid justification; I hope it does not
make you uncomfortable. I would be very glad if you could suggest some
verification I should do.
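To make the head/tail behaviour concrete, here is a minimal,
self-contained userspace sketch of the list discipline. This is my own
mock-up: struct page, list_add() and list_add_tail() below are simplified
stand-ins for the kernel's, not the real implementations.

/*
 * Userspace mock-up of the pcp list discipline (assumption: this only
 * mirrors the kernel's list_add()/list_add_tail() semantics).
 */
#include <stdio.h>

struct page {
	int nid;			/* node the page belongs to */
	struct page *prev, *next;
};

/* circular list head; allocation takes head.next, the most recent head insert */
static struct page head = { .nid = -1, .prev = &head, .next = &head };

static void list_add(struct page *p)		/* insert right after head */
{
	p->next = head.next; p->prev = &head;
	head.next->prev = p; head.next = p;
}

static void list_add_tail(struct page *p)	/* insert right before head */
{
	p->prev = head.prev; p->next = &head;
	head.prev->next = p; head.prev = p;
}

int main(void)
{
	int this_node = 0;
	struct page pages[4] = { {.nid = 0}, {.nid = 1}, {.nid = 0}, {.nid = 1} };

	/* free path: local pages go to the head, remote pages to the tail */
	for (int i = 0; i < 4; i++) {
		if (pages[i].nid == this_node)
			list_add(&pages[i]);
		else
			list_add_tail(&pages[i]);
	}

	/* alloc path: always take from the head, so local pages come first */
	for (struct page *p = head.next; p != &head; p = p->next)
		printf("allocate page from node %d\n", p->nid);
	return 0;
}

Running this prints the node-0 pages before the node-1 pages: allocation
from the head prefers recently freed local pages, while remote pages age
toward the tail, which is the end that bulk freeing drains back to the
buddy allocator.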
Below is my testing patch; I look forward to your comments.

From 2f9a99521068dfe7ec98ea39f73649226d9a837b Mon Sep 17 00:00:00 2001
From: Wei Yang <richard.weiyang@gmail.com>
Date: Fri, 19 Oct 2018 11:37:09 +0800
Subject: [PATCH] mm: put page to pcp->lists[] tail if it is not on the same node

pcp->lists[] is used to allocate and free order-0 pages, so a list on a
CPU of Node A could contain pages of Node B. If we put pages from the
same node at the list head and pages from other nodes at the list tail,
we increase the chance of allocating a page from the local node and of
freeing a page that belongs to a remote node.

On a 64-bit machine, the size of struct per_cpu_pages does not increase
because of alignment. The newly added field *node* fits in the same
cache line as *count*, which minimizes the performance impact.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        | 30 +++++++++++++++++++++---------
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5138efde11ae..27ce071bc99c 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -272,6 +272,7 @@ enum zone_watermarks {
 #define high_wmark_pages(z) (z->watermark[WMARK_HIGH])
 
 struct per_cpu_pages {
+	int node;		/* node id of this cpu */
 	int count;		/* number of pages in the list */
 	int high;		/* high watermark, emptying needed */
 	int batch;		/* chunk size for buddy add/remove */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a398eafbae46..c7a27e461602 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2741,6 +2741,7 @@ static bool free_unref_page_prepare(struct page *page, unsigned long pfn)
 static void free_unref_page_commit(struct page *page, unsigned long pfn)
 {
 	struct zone *zone = page_zone(page);
+	int page_node = page_to_nid(page);
 	struct per_cpu_pages *pcp;
 	int migratetype;
 
@@ -2763,7 +2764,14 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
 	}
 
 	pcp = &this_cpu_ptr(zone->pageset)->pcp;
-	list_add(&page->lru, &pcp->lists[migratetype]);
+	/*
+	 * If the page has the same node_id as this cpu, put the page at head.
+	 * Otherwise, put at the end.
+	 */
+	if (page_node == pcp->node)
+		list_add(&page->lru, &pcp->lists[migratetype]);
+	else
+		list_add_tail(&page->lru, &pcp->lists[migratetype]);
 	pcp->count++;
 	if (pcp->count >= pcp->high) {
 		unsigned long batch = READ_ONCE(pcp->batch);
@@ -5615,7 +5623,7 @@ static int zone_batchsize(struct zone *zone)
  * exist).
  */
 static void pageset_update(struct per_cpu_pages *pcp, unsigned long high,
-		unsigned long batch)
+		unsigned long batch, int node_id)
 {
 	/* start with a fail safe value for batch */
 	pcp->batch = 1;
@@ -5626,12 +5634,14 @@ static void pageset_update(struct per_cpu_pages *pcp, unsigned long high,
 	smp_wmb();
 
 	pcp->batch = batch;
+	pcp->node = node_id;
 }
 
 /* a companion to pageset_set_high() */
-static void pageset_set_batch(struct per_cpu_pageset *p, unsigned long batch)
+static void pageset_set_batch(struct per_cpu_pageset *p, unsigned long batch,
+	int node_id)
 {
-	pageset_update(&p->pcp, 6 * batch, max(1UL, 1 * batch));
+	pageset_update(&p->pcp, 6 * batch, max(1UL, 1 * batch), node_id);
 }
 
 static void pageset_init(struct per_cpu_pageset *p)
@@ -5650,7 +5660,7 @@ static void pageset_init(struct per_cpu_pageset *p)
 static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
 {
 	pageset_init(p);
-	pageset_set_batch(p, batch);
+	pageset_set_batch(p, batch, 0);
 }
 
 /*
@@ -5658,13 +5668,13 @@ static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
  * to the value high for the pageset p.
  */
 static void pageset_set_high(struct per_cpu_pageset *p,
-				unsigned long high)
+				unsigned long high, int node_id)
 {
 	unsigned long batch = max(1UL, high / 4);
 	if ((high / 4) > (PAGE_SHIFT * 8))
 		batch = PAGE_SHIFT * 8;
 
-	pageset_update(&p->pcp, high, batch);
+	pageset_update(&p->pcp, high, batch, node_id);
 }
 
 static void pageset_set_high_and_batch(struct zone *zone,
@@ -5673,9 +5683,11 @@ static void pageset_set_high_and_batch(struct zone *zone,
 	if (percpu_pagelist_fraction)
 		pageset_set_high(pcp,
 			(zone->managed_pages /
-				percpu_pagelist_fraction));
+				percpu_pagelist_fraction),
+			zone->zone_pgdat->node_id);
 	else
-		pageset_set_batch(pcp, zone_batchsize(zone));
+		pageset_set_batch(pcp, zone_batchsize(zone),
+			zone->zone_pgdat->node_id);
 }
 
 static void __meminit zone_pageset_init(struct zone *zone, int cpu)
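As a side note on the commit-message claim about struct size: a quick
userspace check along these lines supports the layout argument. The
mirrored struct below is an assumption that reproduces only the leading
integer fields and a pointer-aligned member of struct per_cpu_pages, and
the 64-byte cache line size is assumed, not queried.

/*
 * Sketch: adding "int node" before "int count" should not grow the
 * struct on 64-bit, and both fields should share a cache line.
 */
#include <stdio.h>
#include <stddef.h>

struct per_cpu_pages_old {
	int count;
	int high;
	int batch;
	void *lists;		/* stand-in for struct list_head lists[N] */
};

struct per_cpu_pages_new {
	int node;		/* the newly added field */
	int count;
	int high;
	int batch;
	void *lists;
};

int main(void)
{
	printf("old size: %zu, new size: %zu\n",
	       sizeof(struct per_cpu_pages_old),
	       sizeof(struct per_cpu_pages_new));
	printf("node and count share a 64-byte cache line: %s\n",
	       offsetof(struct per_cpu_pages_new, node) / 64 ==
	       offsetof(struct per_cpu_pages_new, count) / 64 ?
	       "yes" : "no");
	return 0;
}

With three leading ints, the compiler already pads the struct up to the
pointer alignment, so the fourth int fills existing padding instead of
growing the struct, which is why both sizes come out equal.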