diff mbox series

maple_tree: use GFP_KERNEL on mas_node_count

Message ID 20230907033914epcms1p61c5eed4d34d5c4212436c201f33292b3@epcms1p6 (mailing list archive)
State New
Headers show
Series maple_tree: use GFP_KERNEL on mas_node_count | expand

Commit Message

Jaeseon Sim Sept. 7, 2023, 3:39 a.m. UTC
Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
in order to allow memory reclaim.

Currently, fork errors occur on low free memory as follows:

 Zygote  : Failed to fork child process: Out of memory (12)

-ENOMEM was returned as following path:

 mas_node_count
 mas_expected_entries
 dup_mmap
 dup_mm
 copy_mm
 copy_process

Signed-off-by: jason.sim <jason.sim@samsung.com>
---
 lib/maple_tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Matthew Wilcox Sept. 7, 2023, 3:49 a.m. UTC | #1
On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> in order to allow memory reclaim.

What testing did you do of this patch?  In particular, did you try it
with lockdep enabled?
Peng Zhang Sept. 7, 2023, 4:02 a.m. UTC | #2
在 2023/9/7 11:49, Matthew Wilcox 写道:
> On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
>> Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
>> in order to allow memory reclaim.
There are many paths that call maple tree's mas_node_count(). Some paths
cannot reclaim memory.
> 
> What testing did you do of this patch?  In particular, did you try it
> with lockdep enabled?
>
Matthew Wilcox Sept. 7, 2023, 4:10 a.m. UTC | #3
On Thu, Sep 07, 2023 at 12:02:02PM +0800, Peng Zhang wrote:
> 
> 
> 在 2023/9/7 11:49, Matthew Wilcox 写道:
> > On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> > > Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> > > in order to allow memory reclaim.
> There are many paths that call maple tree's mas_node_count(). Some paths
> cannot reclaim memory.

Right ... but we should be handling the ENOMEM inside the maple tree and
allocating some nodes with GFP_KERNEL instead of failing fork().

> > What testing did you do of this patch?  In particular, did you try it
> > with lockdep enabled?
Jaeseon Sim Sept. 7, 2023, 4:41 a.m. UTC | #4
> On Thu, Sep 07, 2023 at 12:02:02PM +0800, Peng Zhang wrote:
> > 
> > 
> > 在 2023/9/7 11:49, Matthew Wilcox 写道:
> > > On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> > > > Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> > > > in order to allow memory reclaim.
> > There are many paths that call maple tree's mas_node_count(). Some paths
> > cannot reclaim memory.
> 
> Right ... but we should be handling the ENOMEM inside the maple tree and
> allocating some nodes with GFP_KERNEL instead of failing fork().
> 
> > > What testing did you do of this patch?  In particular, did you try it
> > > with lockdep enabled?
I did power on/off test with this patch.
I did not try it with lockdep enabled.
Liam R. Howlett Sept. 7, 2023, 2:49 p.m. UTC | #5
* Jaeseon Sim <jason.sim@samsung.com> [230907 00:41]:
> > On Thu, Sep 07, 2023 at 12:02:02PM +0800, Peng Zhang wrote:
> > > 
> > > 
> > > 在 2023/9/7 11:49, Matthew Wilcox 写道:
> > > > On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> > > > > Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> > > > > in order to allow memory reclaim.
> > > There are many paths that call maple tree's mas_node_count(). Some paths
> > > cannot reclaim memory.
> > 
> > Right ... but we should be handling the ENOMEM inside the maple tree and
> > allocating some nodes with GFP_KERNEL instead of failing fork().
> > 
> > > > What testing did you do of this patch?  In particular, did you try it
> > > > with lockdep enabled?
> I did power on/off test with this patch.
> I did not try it with lockdep enabled.

To accomplish the same result, but with a much smaller scope that will
work with lockdep, I would suggest changing mas_expected_entries() to
use mas_node_count_gfp() (which already exists) and pass in GFP_KERNEL.

Since fork is the only current user of mas_expected_entries(), this
won't break other users and we can deal with changing it for others if
it is needed.

If we do go this route, please add a note in the documentation about
using GFP_KERNEL.

Willy, does that work for you?

Thanks,
Liam
Jaeseon Sim Sept. 14, 2023, 2:49 a.m. UTC | #6
> * Jaeseon Sim <jason.sim@samsung.com> [230907 00:41]:
> > > On Thu, Sep 07, 2023 at 12:02:02PM +0800, Peng Zhang wrote:
> > > > 
> > > > 
> > > > 在 2023/9/7 11:49, Matthew Wilcox 写道:
> > > > > On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> > > > > > Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> > > > > > in order to allow memory reclaim.
> > > > There are many paths that call maple tree's mas_node_count(). Some paths
> > > > cannot reclaim memory.
> > > 
> > > Right ... but we should be handling the ENOMEM inside the maple tree and
> > > allocating some nodes with GFP_KERNEL instead of failing fork().
> > > 
> > > > > What testing did you do of this patch?  In particular, did you try it
> > > > > with lockdep enabled?
> > I did power on/off test with this patch.
> > I did not try it with lockdep enabled.
> 
> To accomplish the same result, but with a much smaller scope that will
> work with lockdep, I would suggest changing mas_expected_entries() to
> use mas_node_count_gfp() (which already exists) and pass in GFP_KERNEL.
> 
> Since fork is the only current user of mas_expected_entries(), this
> won't break other users and we can deal with changing it for others if
> it is needed.
> 
> If we do go this route, please add a note in the documentation about
> using GFP_KERNEL.
> 
> Willy, does that work for you?
> 
> Thanks,
> Liam

Dear Liam,
Can I ask you the reason why mas_node_count is using GFP_NOWAIT | __GFP_NOWARN?
I wonder if other callers for mas_node_count might have similar issue.

From your opinion, I'll post v2 patch as follows

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index ee1ff0c59fd7..b0229271c24e 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5574,7 +5574,7 @@ int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries)
 	/* Internal nodes */
 	nr_nodes += DIV_ROUND_UP(nr_nodes, nonleaf_cap);
 	/* Add working room for split (2 nodes) + new parents */
-	mas_node_count(mas, nr_nodes + 3);
+	mas_node_count_gfp(mas, nr_nodes + 3, GFP_KERNEL);
 
 	/* Detect if allocations run out */
 	mas->mas_flags |= MA_STATE_PREALLOC;
Liam R. Howlett Sept. 14, 2023, 3:31 p.m. UTC | #7
* Jaeseon Sim <jason.sim@samsung.com> [230913 22:49]:
> > * Jaeseon Sim <jason.sim@samsung.com> [230907 00:41]:
> > > > On Thu, Sep 07, 2023 at 12:02:02PM +0800, Peng Zhang wrote:
> > > > > 
> > > > > 
> > > > > 在 2023/9/7 11:49, Matthew Wilcox 写道:
> > > > > > On Thu, Sep 07, 2023 at 12:39:14PM +0900, 심재선 wrote:
> > > > > > > Use GFP_KERNEL on mas_node_count instead of GFP_NOWAIT | __GFP_NOWARN
> > > > > > > in order to allow memory reclaim.
> > > > > There are many paths that call maple tree's mas_node_count(). Some paths
> > > > > cannot reclaim memory.
> > > > 
> > > > Right ... but we should be handling the ENOMEM inside the maple tree and
> > > > allocating some nodes with GFP_KERNEL instead of failing fork().
> > > > 
> > > > > > What testing did you do of this patch?  In particular, did you try it
> > > > > > with lockdep enabled?
> > > I did power on/off test with this patch.
> > > I did not try it with lockdep enabled.
> > 
> > To accomplish the same result, but with a much smaller scope that will
> > work with lockdep, I would suggest changing mas_expected_entries() to
> > use mas_node_count_gfp() (which already exists) and pass in GFP_KERNEL.
> > 
> > Since fork is the only current user of mas_expected_entries(), this
> > won't break other users and we can deal with changing it for others if
> > it is needed.
> > 
> > If we do go this route, please add a note in the documentation about
> > using GFP_KERNEL.
> > 
> > Willy, does that work for you?
> > 
> > Thanks,
> > Liam
> 
> Dear Liam,
> Can I ask you the reason why mas_node_count is using GFP_NOWAIT | __GFP_NOWARN?

Must users in the VMA space have complicated locking schemes which
require no sleeping during a store operation.  Most operations will drop
the lock and re-try with GFP_KERNEL when using the internal lock (see
mas_nomem()).

> I wonder if other callers for mas_node_count might have similar issue.

The external callers who need GFP_KERNEL are either using
mas_store_gfp() or mas_prealloc to set up a store prior to taking a
series of other locks.

During a mas_prealloc() or mas_expected_entries() call, we set the
MA_STATE_PREALLOC flag to indicate that there are nodes preallocated.
This is to catch users who call mas_node_count() and require increased
allocations when allocations should not be taken.  You can see this flag
directly below the line you modified.

> 
> From your opinion, I'll post v2 patch as follows

Thanks.  Please test with lockdep but I don't see a nesting lock issue
with fork and this change.

> 
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index ee1ff0c59fd7..b0229271c24e 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5574,7 +5574,7 @@ int mas_expected_entries(struct ma_state *mas, unsigned long nr_entries)
>  	/* Internal nodes */
>  	nr_nodes += DIV_ROUND_UP(nr_nodes, nonleaf_cap);
>  	/* Add working room for split (2 nodes) + new parents */
> -	mas_node_count(mas, nr_nodes + 3);
> +	mas_node_count_gfp(mas, nr_nodes + 3, GFP_KERNEL);
>  
>  	/* Detect if allocations run out */
>  	mas->mas_flags |= MA_STATE_PREALLOC;
> -- 
> 2.17.1
> 
> Thanks
> Jaeseon
diff mbox series

Patch

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index ee1ff0c59fd7..076798f83baa 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1336,11 +1336,11 @@  static void mas_node_count_gfp(struct ma_state *mas, int count, gfp_t gfp)
  * @mas: The maple state
  * @count: The number of nodes needed
  *
- * Note: Uses GFP_NOWAIT | __GFP_NOWARN for gfp flags.
+ * Note: Uses GFP_KERNEL for gfp flags.
  */
 static void mas_node_count(struct ma_state *mas, int count)
 {
-       return mas_node_count_gfp(mas, count, GFP_NOWAIT | __GFP_NOWARN);
+       return mas_node_count_gfp(mas, count, GFP_KERNEL);
 }
 
 /*