From patchwork Fri Jan 28 05:59:40 2022
X-Patchwork-Submitter: "Huang, Ying"
X-Patchwork-Id: 12727952
From: Huang Ying
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
    Baolin Wang, Dave Hansen, Zi Yan, Oscar Salvador, Yang Shi,
    zhongjiang-ali, Xunlei Pang
Subject: [PATCH] mm,migrate: fix establishing demotion target
Date: Fri, 28 Jan 2022 13:59:40 +0800
Message-Id: <20220128055940.1792614-1-ying.huang@intel.com>

In commit ac16ec835314 ("mm: migrate: support multiple target nodes
demotion"), after the first demotion target node is found, we will
continue to check the next candidate obtained via
find_next_best_node().  This is to find all demotion target nodes with
the same NUMA distance.
But one side effect of find_next_best_node() is that the candidate node
it returns is set in the "used" nodemask.  So even if the candidate
node fails the subsequent NUMA distance check, it can no longer be used
as a demotion target for the following nodes.

For example, for a system as follows,

node distances:
node   0   1   2   3
  0:  10  21  17  28
  1:  21  10  28  17
  2:  17  28  10  28
  3:  28  17  28  10

when we establish the demotion target for node 0, in the first round
node 2 is added to the demotion target node set.  Then in the second
round, node 3 is checked and rejected because distance(0, 3) >
distance(0, 2).  But node 3 is set in the "used" nodemask too.  When we
establish the demotion target for node 1, there is no available node.
This is wrong: node 3 should be set as the demotion target of node 1.

To fix this, if the candidate node fails the distance check, it is
cleared in the "used" nodemask so that it can be used for the following
nodes.

The bug can be reproduced, and is fixed with this patch, on a 2-socket
server machine with DRAM and PMEM.

Fixes: ac16ec835314 ("mm: migrate: support multiple target nodes demotion")
Signed-off-by: "Huang, Ying"
Cc: Baolin Wang
Cc: Dave Hansen
Cc: Zi Yan
Cc: Oscar Salvador
Cc: Yang Shi
Cc: Baolin Wang
Cc: zhongjiang-ali
Cc: Xunlei Pang
Reviewed-by: Baolin Wang
---
 mm/migrate.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index c7da064b4781..e8a6933af68d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -3082,18 +3082,21 @@ static int establish_migrate_target(int node, nodemask_t *used,
 	if (best_distance != -1) {
 		val = node_distance(node, migration_target);
 		if (val > best_distance)
-			return NUMA_NO_NODE;
+			goto out_clear;
 	}
 
 	index = nd->nr;
 	if (WARN_ONCE(index >= DEMOTION_TARGET_NODES,
 		      "Exceeds maximum demotion target nodes\n"))
-		return NUMA_NO_NODE;
+		goto out_clear;
 
 	nd->nodes[index] = migration_target;
 	nd->nr++;
 
 	return migration_target;
+out_clear:
+	node_clear(migration_target, *used);
+	return NUMA_NO_NODE;
 }
 
 /*