From patchwork Wed May 26 05:01:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang <feng.tang@intel.com>
X-Patchwork-Id: 12280631
From: Feng Tang <feng.tang@intel.com>
To: linux-mm@kvack.org, Andrew Morton, Michal Hocko, David Rientjes,
    Dave Hansen, Ben Widawsky
Cc: linux-kernel@vger.kernel.org, Andrea Arcangeli, Mel Gorman,
    Mike Kravetz, Randy Dunlap, Vlastimil Babka, Andi Kleen,
    Dan Williams, ying.huang@intel.com, Feng Tang <feng.tang@intel.com>
Subject: [PATCH v1 3/4] mm/mempolicy: don't handle MPOL_LOCAL like a fake
    MPOL_PREFERRED policy
Date: Wed, 26 May 2021 13:01:41 +0800
Message-Id: <1622005302-23027-4-git-send-email-feng.tang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1622005302-23027-1-git-send-email-feng.tang@intel.com>
References: <1622005302-23027-1-git-send-email-feng.tang@intel.com>

MPOL_LOCAL has been set up as a real policy, but it is still handled
like a fake MPOL_PREFERRED policy with the internal MPOL_F_LOCAL flag
bit set, so many places have to distinguish the real 'prefer' policy
from the 'local' one, which is quite confusing.

In the current code, there are four cases where MPOL_LOCAL is used:

* the user specifies the 'local' policy
* the user specifies the 'prefer' policy, but with an empty nodemask
* the system 'default' policy is used
* the 'prefer' policy has a valid 'preferred' node and the
  MPOL_F_STATIC_NODES flag is set; when it is rebound to a nodemask
  which doesn't contain the 'preferred' node, the MPOL_F_LOCAL bit is
  added and the policy performs as 'local'.  If it is later rebound
  again with a valid nodemask, the policy is restored back to 'prefer'.

For the first three cases, make 'local' a real policy instead of a
fake 'prefer' one; this reduces confusion when reading the code.  The
next (optional) patch will kill the MPOL_F_LOCAL bit.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Michal Hocko
---
 mm/mempolicy.c | 66 ++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 27 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 2830bb8..d97839d 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -121,8 +121,7 @@ enum zone_type policy_zone = 0;
  */
 static struct mempolicy default_policy = {
 	.refcnt = ATOMIC_INIT(1), /* never free it */
-	.mode = MPOL_PREFERRED,
-	.flags = MPOL_F_LOCAL,
+	.mode = MPOL_LOCAL,
 };
 
 static struct mempolicy preferred_node_policy[MAX_NUMNODES];
@@ -200,12 +199,9 @@ static int mpol_new_interleave(struct mempolicy *pol, const nodemask_t *nodes)
 
 static int mpol_new_preferred(struct mempolicy *pol, const nodemask_t *nodes)
 {
-	if (!nodes)
-		pol->flags |= MPOL_F_LOCAL;	/* local allocation */
-	else if (nodes_empty(*nodes))
-		return -EINVAL;			/* no allowed nodes */
-	else
-		pol->v.preferred_node = first_node(*nodes);
+	if (nodes_empty(*nodes))
+		return -EINVAL;
+	pol->v.preferred_node = first_node(*nodes);
 	return 0;
 }
 
@@ -217,6 +213,11 @@ static int mpol_new_bind(struct mempolicy *pol, const nodemask_t *nodes)
 	return 0;
 }
 
+static int mpol_new_local(struct mempolicy *pol, const nodemask_t *nodes)
+{
+	return 0;
+}
+
 /*
  * mpol_set_nodemask is called after mpol_new() to set up the nodemask, if
  * any, for the new policy.  mpol_new() has already validated the nodes
@@ -239,25 +240,19 @@ static int mpol_set_nodemask(struct mempolicy *pol,
 		  cpuset_current_mems_allowed, node_states[N_MEMORY]);
 
 	VM_BUG_ON(!nodes);
-	if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
-		nodes = NULL;	/* explicit local allocation */
-	else {
-		if (pol->flags & MPOL_F_RELATIVE_NODES)
-			mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
-		else
-			nodes_and(nsc->mask2, *nodes, nsc->mask1);
 
-		if (mpol_store_user_nodemask(pol))
-			pol->w.user_nodemask = *nodes;
-		else
-			pol->w.cpuset_mems_allowed =
-						cpuset_current_mems_allowed;
-	}
+	if (pol->flags & MPOL_F_RELATIVE_NODES)
+		mpol_relative_nodemask(&nsc->mask2, nodes, &nsc->mask1);
+	else
+		nodes_and(nsc->mask2, *nodes, nsc->mask1);
 
-	if (nodes)
-		ret = mpol_ops[pol->mode].create(pol, &nsc->mask2);
+	if (mpol_store_user_nodemask(pol))
+		pol->w.user_nodemask = *nodes;
 	else
-		ret = mpol_ops[pol->mode].create(pol, NULL);
+		pol->w.cpuset_mems_allowed =
+					cpuset_current_mems_allowed;
+
+	ret = mpol_ops[pol->mode].create(pol, &nsc->mask2);
 
 	return ret;
 }
@@ -290,13 +285,14 @@ static struct mempolicy *mpol_new(unsigned short mode, unsigned short flags,
 			if (((flags & MPOL_F_STATIC_NODES) ||
 			     (flags & MPOL_F_RELATIVE_NODES)))
 				return ERR_PTR(-EINVAL);
+
+			mode = MPOL_LOCAL;
 		}
 	} else if (mode == MPOL_LOCAL) {
 		if (!nodes_empty(*nodes) ||
 		    (flags & MPOL_F_STATIC_NODES) ||
 		    (flags & MPOL_F_RELATIVE_NODES))
 			return ERR_PTR(-EINVAL);
-		mode = MPOL_PREFERRED;
 	} else if (nodes_empty(*nodes))
 		return ERR_PTR(-EINVAL);
 	policy = kmem_cache_alloc(policy_cache, GFP_KERNEL);
@@ -427,6 +423,10 @@ static const struct mempolicy_operations mpol_ops[MPOL_MAX] = {
 		.create = mpol_new_bind,
 		.rebind = mpol_rebind_nodemask,
 	},
+	[MPOL_LOCAL] = {
+		.create = mpol_new_local,
+		.rebind = mpol_rebind_default,
+	},
 };
 
 static int migrate_page_add(struct page *page, struct list_head *pagelist,
@@ -1965,6 +1965,8 @@ unsigned int mempolicy_slab_node(void)
 							&policy->v.nodes);
 		return z->zone ? zone_to_nid(z->zone) : node;
 	}
+	case MPOL_LOCAL:
+		return node;
 
 	default:
 		BUG();
@@ -2089,6 +2091,11 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
 		*mask = mempolicy->v.nodes;
 		break;
 
+	case MPOL_LOCAL:
+		nid = numa_node_id();
+		init_nodemask_of_node(mask, nid);
+		break;
+
 	default:
 		BUG();
 	}
@@ -2333,6 +2340,8 @@ bool __mpol_equal(struct mempolicy *a, struct mempolicy *b)
 		if (a->flags & MPOL_F_LOCAL)
 			return true;
 		return a->v.preferred_node == b->v.preferred_node;
+	case MPOL_LOCAL:
+		return true;
 	default:
 		BUG();
 		return false;
@@ -2476,6 +2485,10 @@ int mpol_misplaced(struct page *page, struct vm_area_struct *vma, unsigned long
 		polnid = pol->v.preferred_node;
 		break;
 
+	case MPOL_LOCAL:
+		polnid = numa_node_id();
+		break;
+
 	case MPOL_BIND:
 		/* Optimize placement among multiple nodes via NUMA balancing */
 		if (pol->flags & MPOL_F_MORON) {
@@ -2920,7 +2933,6 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
 		 */
 		if (nodelist)
 			goto out;
-		mode = MPOL_PREFERRED;
 		break;
 	case MPOL_DEFAULT:
 		/*
@@ -2964,7 +2976,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
 	else if (nodelist)
 		new->v.preferred_node = first_node(nodes);
 	else
-		new->flags |= MPOL_F_LOCAL;
+		new->mode = MPOL_LOCAL;
 
 	/*
 	 * Save nodes for contextualization: this will be used to "clone"
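
For readers who want to poke at the first three cases from userspace,
they map directly onto set_mempolicy(2).  Below is a minimal, untested
sketch, not part of this patch: the MPOL_* constants are copied from
include/uapi/linux/mempolicy.h, and the raw syscall is used so libnuma
is not needed.

/* mpol_local_demo.c - hypothetical demo, not part of the patch.
 *
 * Exercises the first three 'local' cases from the changelog above
 * via the raw set_mempolicy(2) syscall (MPOL_LOCAL is accepted by
 * the syscall since Linux 3.8).
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#define MPOL_DEFAULT	0
#define MPOL_PREFERRED	1
#define MPOL_LOCAL	4

static long mpol_set(int mode, const unsigned long *nmask,
		     unsigned long maxnode)
{
	return syscall(SYS_set_mempolicy, mode, nmask, maxnode);
}

int main(void)
{
	/* case 1: the user specifies 'local' directly */
	if (mpol_set(MPOL_LOCAL, NULL, 0))
		perror("MPOL_LOCAL");

	/*
	 * case 2: 'prefer' with an empty nodemask; before this patch
	 * the kernel kept mode == MPOL_PREFERRED and set the internal
	 * MPOL_F_LOCAL bit, after it this becomes a real MPOL_LOCAL
	 * policy in mpol_new().
	 */
	if (mpol_set(MPOL_PREFERRED, NULL, 0))
		perror("MPOL_PREFERRED, empty nodemask");

	/* case 3: back to the system 'default' policy */
	if (mpol_set(MPOL_DEFAULT, NULL, 0))
		perror("MPOL_DEFAULT");

	return 0;
}

All three calls should keep allocating from the local node exactly as
before; the patch only changes how the kernel represents the policy
internally, not the user-visible semantics.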