Message ID | 20190605060229.GA9468@bharath12345-Inspiron-5559 (mailing list archive)
---|---
State | New, archived
Series | mm: Remove VM_BUG_ON in __alloc_pages_node
On Wed 05-06-19 11:32:29, Bharath Vedartham wrote:
> In __alloc_pages_node, there is a VM_BUG_ON on the condition (nid < 0 ||
> nid >= MAX_NUMNODES). Remove this VM_BUG_ON; instead, emit a VM_WARN_ON
> if the condition fails, and fail the allocation when an invalid NUMA
> node id is passed to __alloc_pages_node.

What is the motivation of the patch? VM_BUG_ON is not enabled by default
and your patch adds a branch to a really hot path. Why is this an
improvement for something that shouldn't happen in the first place?

> The check (nid < 0 || nid >= MAX_NUMNODES) also considers NUMA_NO_NODE
> as an invalid nid, but the caller of __alloc_pages_node is assumed to
> have checked for the case where nid == NUMA_NO_NODE.
>
> Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
> ---
>  include/linux/gfp.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 5f5e25f..075bdaf 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -480,7 +480,11 @@ __alloc_pages(gfp_t gfp_mask, unsigned int order, int preferred_nid)
>  static inline struct page *
>  __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
>  {
> -	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> +	if (nid < 0 || nid >= MAX_NUMNODES) {
> +		VM_WARN_ON(nid < 0 || nid >= MAX_NUMNODES);
> +		return NULL;
> +	}
> +
>  	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
>
>  	return __alloc_pages(gfp_mask, order, nid);
> --
> 2.7.4
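[For context on the "not enabled by default" point: VM_BUG_ON and VM_WARN_ON only expand to real runtime checks when CONFIG_DEBUG_VM is set; otherwise they compile away. A simplified sketch of the definitions follows — see include/linux/mmdebug.h for the real ones, which differ in detail:

#ifdef CONFIG_DEBUG_VM
#define VM_BUG_ON(cond)		BUG_ON(cond)		/* panic if cond is true */
#define VM_WARN_ON(cond)	(void)WARN_ON(cond)	/* warn if cond is true */
#else
/* Compile-time type checking only; no code is generated for cond. */
#define VM_BUG_ON(cond)		BUILD_BUG_ON_INVALID(cond)
#define VM_WARN_ON(cond)	BUILD_BUG_ON_INVALID(cond)
#endif

The proposed `if (nid < 0 || nid >= MAX_NUMNODES) ... return NULL;` is an ordinary branch, so unlike the VM_* macros it is always compiled in — which is what is meant above by adding a branch to a really hot path.]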
[Not replying inline as my mail is bouncing back]

This patch is based on reading the code rather than a kernel crash. My
thought process was that if an invalid node id was passed to
__alloc_pages_node, it would be better to add a VM_WARN_ON and fail the
allocation rather than crashing the kernel.

I feel it would be better to fail the allocation early in the hot path
if an invalid node id is passed. This is irrespective of whether the
VM_[BUG|WARN]_*s are enabled or not. I do not see any checks in the hot
path for the node id, which in turn may cause NODE_DATA(nid) to fail to
get the pglist_data pointer for the node id.

We can optimise the branch by wrapping it around in unlikely(), if
performance is the issue?

What are your thoughts on this?

Thank you
Bharath
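[A sketch of the unlikely()-annotated variant being suggested here — no such v2 was posted in this thread; this is only an illustration:

static inline struct page *
__alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
{
	/* Hint to the compiler that the bad-nid path is cold. */
	if (unlikely(nid < 0 || nid >= MAX_NUMNODES)) {
		VM_WARN_ON(1);
		return NULL;
	}

	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));

	return __alloc_pages(gfp_mask, order, nid);
}

As the follow-up notes, unlikely() only influences code layout and branch-prediction hints; the comparison itself still executes on every allocation.]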
On Wed 05-06-19 18:37:28, Bharath Vedartham wrote:
> [Not replying inline as my mail is bouncing back]
>
> This patch is based on reading the code rather than a kernel crash. My
> thought process was that if an invalid node id was passed to
> __alloc_pages_node, it would be better to add a VM_WARN_ON and fail the
> allocation rather than crashing the kernel.

This makes some sense to me because BUG_ONs are usually a wrong way to
handle wrong usage of the API. On the other hand VM_BUG_ON is special in
that it is not enabled in production builds, although some distributions
enable it by default IIRC.

> I feel it would be better to fail the allocation early in the hot path
> if an invalid node id is passed. This is irrespective of whether the
> VM_[BUG|WARN]_*s are enabled or not. I do not see any checks in the hot
> path for the node id, which in turn may cause NODE_DATA(nid) to fail to
> get the pglist_data pointer for the node id.
> We can optimise the branch by wrapping it around in unlikely(), if
> performance is the issue?

unlikely will just move the return NULL outside of the main code flow.
The check will still be done.

> What are your thoughts on this?

I don't know. I would leave the code as it is now or remove the
VM_BUG_ON. I do not remember it catching any real issues in the past.
IMO the reason why a lot of failures must not have occurred in the past
might be because the callers use helpers like cpu_to_node or have their
own checks for nid.
If one day we do get a caller which passes an invalid node id without
VM_BUG_ON enabled, it might get weird.
On Wed 05-06-19 21:25:01, Bharath Vedartham wrote:
> IMO the reason why a lot of failures must not have occurred in the past
> might be because the callers use helpers like cpu_to_node or have their
> own checks for nid.
> If one day we do get a caller which passes an invalid node id without
> VM_BUG_ON enabled, it might get weird.

It will blow up on a NULL NODE_DATA and it will be quite obvious what
that was, so I wouldn't lose any sleep over it. I do not think we have
any directly user controllable way to provide a completely ad-hoc NUMA
node for an allocation.
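[The "blow up on a NULL NODE_DATA" refers to the fact that on NUMA builds NODE_DATA is typically a bare array lookup. A simplified sketch modelled on arch/x86/include/asm/mmzone_64.h — other architectures differ:

extern struct pglist_data *node_data[MAX_NUMNODES];
#define NODE_DATA(nid)	(node_data[(nid)])

/*
 * An out-of-range nid therefore reads past the array (or hits a
 * never-initialised NULL slot), and the allocator's first dereference
 * of the pglist_data pointer oopses with a stack trace pointing
 * straight at the bogus caller.
 */
]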
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 5f5e25f..075bdaf 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -480,7 +480,11 @@ __alloc_pages(gfp_t gfp_mask, unsigned int order, int preferred_nid)
 static inline struct page *
 __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
 {
-	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
+	if (nid < 0 || nid >= MAX_NUMNODES) {
+		VM_WARN_ON(nid < 0 || nid >= MAX_NUMNODES);
+		return NULL;
+	}
+
 	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
 
 	return __alloc_pages(gfp_mask, order, nid);
In __alloc_pages_node, there is a VM_BUG_ON on the condition (nid < 0 ||
nid >= MAX_NUMNODES). Remove this VM_BUG_ON; instead, emit a VM_WARN_ON
if the condition fails, and fail the allocation when an invalid NUMA
node id is passed to __alloc_pages_node.

The check (nid < 0 || nid >= MAX_NUMNODES) also considers NUMA_NO_NODE
as an invalid nid, but the caller of __alloc_pages_node is assumed to
have checked for the case where nid == NUMA_NO_NODE.

Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
---
 include/linux/gfp.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
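[For reference on the "caller is assumed to have checked" convention: the main wrapper around __alloc_pages_node in include/linux/gfp.h of that era resolved NUMA_NO_NODE before calling it, roughly as below — details may differ across kernel versions:

static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
					    unsigned int order)
{
	/* Callers may pass NUMA_NO_NODE; resolve it to a real node here. */
	if (nid == NUMA_NO_NODE)
		nid = numa_mem_id();

	return __alloc_pages_node(nid, gfp_mask, order);
}
]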