From patchwork Fri May  4 04:30:46 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joonsoo Kim
X-Patchwork-Id: 10379867
Received:
	from localhost.localdomain ([124.56.155.17]) by smtp.gmail.com with ESMTPSA id v23sm27430777pfe.166.2018.05.03.21.30.52 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 03 May 2018 21:30:54 -0700 (PDT)
From: js1304@gmail.com
X-Google-Original-From: iamjoonsoo.kim@lge.com
To: Andrew Morton
Cc: Mel Gorman, Michal Hocko, Vlastimil Babka, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Johannes Weiner, Minchan Kim, Ye Xiaolong,
 Joonsoo Kim
Subject: [PATCH] mm/page_alloc: use ac->high_zoneidx for classzone_idx
Date: Fri, 4 May 2018 13:30:46 +0900
Message-Id: <1525408246-14768-1-git-send-email-iamjoonsoo.kim@lge.com>
X-Mailer: git-send-email 2.7.4

From: Joonsoo Kim

Currently, we use the zone index of preferred_zone, which represents
the best matching zone for the allocation, as classzone_idx. This causes
a problem on NUMA systems with ZONE_MOVABLE.

On a NUMA system, each node can have a different set of populated
zones. For example, node 0 could have DMA/DMA32/NORMAL/MOVABLE zones and
node 1 could have only a NORMAL zone. In this setup, an allocation
request initiated on node 0 and one initiated on node 1 would have
different classzone_idx values, 3 and 2 respectively, since their
preferred_zones are different. If each request is handled only by its
own node, there is no problem. However, if requests are sometimes
handled by the remote node, a problem arises.

In the following setup, an allocation initiated on node 1 takes
precedence over an allocation initiated on node 0 when the former is
processed on node 0 because node 1 does not have enough memory. The two
requests see different lowmem reserves due to their different
classzone_idx, and thus their watermark bars also differ.

root@ubuntu:/sys/devices/system/memory# cat /proc/zoneinfo
Node 0, zone      DMA
  per-node stats
...
  pages free     3965
        min      5
        low      8
        high     11
        spanned  4095
        present  3998
        managed  3977
        protection: (0, 2961, 4928, 5440)
...
Node 0, zone    DMA32
  pages free     757955
        min      1129
        low      1887
        high     2645
        spanned  1044480
        present  782303
        managed  758116
        protection: (0, 0, 1967, 2479)
...
Node 0, zone   Normal
  pages free     459806
        min      750
        low      1253
        high     1756
        spanned  524288
        present  524288
        managed  503620
        protection: (0, 0, 0, 4096)
...
Node 0, zone  Movable
  pages free     130759
        min      195
        low      326
        high     457
        spanned  1966079
        present  131072
        managed  131072
        protection: (0, 0, 0, 0)
...
Node 1, zone      DMA
  pages free     0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        protection: (0, 0, 1006, 1006)
Node 1, zone    DMA32
  pages free     0
        min      0
        low      0
        high     0
        spanned  0
        present  0
        managed  0
        protection: (0, 0, 1006, 1006)
Node 1, zone   Normal
  per-node stats
...
  pages free     233277
        min      383
        low      640
        high     897
        spanned  262144
        present  262144
        managed  257744
        protection: (0, 0, 0, 0)
...
Node 1, zone  Movable
  pages free     0
        min      0
        low      0
        high     0
        spanned  262144
        present  0
        managed  0
        protection: (0, 0, 0, 0)

min watermark for NORMAL zone on node 0
allocation initiated on node 0: 750 + 4096 = 4846
allocation initiated on node 1: 750 + 0 = 750

This watermark difference can cause too many numa_miss allocations in
some situations, and performance can then degrade.

Recently, there was a regression report about this problem with the CMA
patches, since those patches place CMA memory in ZONE_MOVABLE. I
confirmed that the problem disappears with this fix, which uses
high_zoneidx for classzone_idx.

http://lkml.kernel.org/r/20180102063528.GG30397@yexl-desktop

Using high_zoneidx for classzone_idx is more consistent than the
previous approach because the system's memory layout does not affect
it. With this patch, both classzone_idx values in the above example
will be 3, so both requests will have the same min watermark.
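
As a rough illustration of the watermark arithmetic in this example, the
following C sketch adds a zone's min watermark to the lowmem reserve
("protection") entry selected by classzone_idx. The names zone_sketch,
effective_min_wmark and node0_normal are hypothetical and exist only for
this illustration; the real check in the page allocator also accounts for
allocation flags and order, which are ignored here.

```c
#include <stdio.h>

/* Zone indices as used in the example: DMA, DMA32, NORMAL, MOVABLE. */
enum { IDX_DMA, IDX_DMA32, IDX_NORMAL, IDX_MOVABLE, NR_ZONES_SKETCH };

/* Hypothetical, heavily simplified stand-in for a zone's watermark data. */
struct zone_sketch {
	unsigned long min_wmark;                        /* "min" in /proc/zoneinfo */
	unsigned long lowmem_reserve[NR_ZONES_SKETCH];  /* "protection" row */
};

/* Node 0, zone Normal, with the values quoted in the zoneinfo output. */
static const struct zone_sketch node0_normal = {
	.min_wmark = 750,
	.lowmem_reserve = { 0, 0, 0, 4096 },
};

/*
 * Number of free pages the zone must keep for a request with the given
 * classzone_idx: the min watermark plus the reserve for that class.
 */
static unsigned long effective_min_wmark(const struct zone_sketch *z,
					 int classzone_idx)
{
	return z->min_wmark + z->lowmem_reserve[classzone_idx];
}
```

With these values, effective_min_wmark(&node0_normal, IDX_MOVABLE) gives
the bar seen by a request with classzone_idx 3, and
effective_min_wmark(&node0_normal, IDX_NORMAL) gives the lower bar seen
by a request with classzone_idx 2.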
allocation initiated on node 0: 750 + 4096 = 4846
allocation initiated on node 1: 750 + 4096 = 4846

Reported-by: Ye Xiaolong
Tested-by: Ye Xiaolong
Signed-off-by: Joonsoo Kim
---
 mm/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/internal.h b/mm/internal.h
index 228dd66..e1d7376 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -123,7 +123,7 @@ struct alloc_context {
 	bool spread_dirty_pages;
 };
 
-#define ac_classzone_idx(ac) zonelist_zone_idx(ac->preferred_zoneref)
+#define ac_classzone_idx(ac) (ac->high_zoneidx)
 
 /*
  * Locate the struct page for both the matching buddy in our