diff mbox series

[RFC,1/3] mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc

Message ID 20211018114712.9802-2-mhocko@kernel.org (mailing list archive)
State New
Series extend vmalloc support for constrained allocations

Commit Message

Michal Hocko Oct. 18, 2021, 11:47 a.m. UTC
From: Michal Hocko <mhocko@suse.com>

vmalloc historically hasn't supported GFP_NO{FS,IO} requests because
page table allocations do not support an externally provided gfp mask
and perform GFP_KERNEL-like allocations.

For a few years now we have had scope APIs
(memalloc_no{fs,io}_{save,restore}) to enforce NOFS and NOIO constraints
implicitly on all allocators within the scope. There was a hope that
those scopes would be defined on a higher level, where the reclaim
recursion boundary starts/stops (e.g. when a lock required during
memory reclaim is taken etc.). It seems that not all NOFS/NOIO users
have adopted this approach and instead they have taken a workaround
approach to wrap a single [k]vmalloc allocation by a scope API.

These workarounds do not serve the purpose of a better reclaim recursion
documentation and reduction of explicit GFP_NO{FS,IO} usage so let's
just provide them with the semantic they are asking for without a need
for workarounds.

Add support for GFP_NOFS and GFP_NOIO to vmalloc directly. All internal
allocations already comply with the given gfp_mask. The only current
exception is vmap_pages_range which maps kernel page tables. Infer the
proper scope API based on the given gfp mask.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmalloc.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)
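
The inference described in the commit message can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not
the kernel implementation: the SKETCH_* bit values, the sketch_task_flags
variable (a stand-in for the per-task flags word) and every sketch_*
helper are hypothetical names made up for this example.

```c
#include <assert.h>

/* Hypothetical stand-ins for the __GFP_IO and __GFP_FS bits. */
#define SKETCH_GFP_IO 0x1u
#define SKETCH_GFP_FS 0x2u

/* Hypothetical stand-ins for the per-task NOFS/NOIO scope bits. */
#define SKETCH_PF_NOFS 0x1u
#define SKETCH_PF_NOIO 0x2u

enum sketch_scope { SCOPE_NONE, SCOPE_NOFS, SCOPE_NOIO };

/* Stand-in for the flags word of the current task. */
static unsigned int sketch_task_flags;

/* Map a gfp-style mask to the scope the mapping step has to run under. */
static enum sketch_scope sketch_scope_for(unsigned int gfp_mask)
{
	unsigned int bits = gfp_mask & (SKETCH_GFP_FS | SKETCH_GFP_IO);

	if (bits == SKETCH_GFP_IO)
		return SCOPE_NOFS;	/* IO allowed, FS not: NOFS request */
	if (bits == 0)
		return SCOPE_NOIO;	/* neither allowed: NOIO request */
	return SCOPE_NONE;		/* FS allowed: no scope needed */
}

/* Enter a scope; return the previous state of the bit for the restore. */
static unsigned int sketch_scope_save(unsigned int pf)
{
	unsigned int old = sketch_task_flags & pf;

	sketch_task_flags |= pf;
	return old;
}

/* Leave a scope by putting the saved state of the bit back. */
static void sketch_scope_restore(unsigned int pf, unsigned int old)
{
	sketch_task_flags = (sketch_task_flags & ~pf) | old;
}

/* Run map() under the scope implied by gfp_mask, then undo the scope. */
static int sketch_map_scoped(unsigned int gfp_mask, int (*map)(void))
{
	enum sketch_scope scope = sketch_scope_for(gfp_mask);
	unsigned int pf = 0, old = 0;
	int ret;

	if (scope == SCOPE_NOFS)
		pf = SKETCH_PF_NOFS;
	else if (scope == SCOPE_NOIO)
		pf = SKETCH_PF_NOIO;

	if (pf)
		old = sketch_scope_save(pf);
	ret = map();
	if (pf)
		sketch_scope_restore(pf, old);
	return ret;
}

/* Example callback: report which scope bits are visible while mapping. */
static int sketch_report_flags(void)
{
	return (int)sketch_task_flags;
}
```

Calling sketch_map_scoped(SKETCH_GFP_IO, sketch_report_flags) runs the
callback with only the NOFS bit set and leaves sketch_task_flags as it
was before the call. Saving the previous state of the bit, rather than
clearing it unconditionally on exit, is what keeps an enclosing scope
intact when scopes nest.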

Comments

NeilBrown Oct. 19, 2021, 12:44 a.m. UTC | #1
On Mon, 18 Oct 2021, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> vmalloc historically hasn't supported GFP_NO{FS,IO} requests because
> page table allocations do not support an externally provided gfp mask
> and perform GFP_KERNEL-like allocations.
> 
> For a few years now we have had scope APIs
> (memalloc_no{fs,io}_{save,restore}) to enforce NOFS and NOIO constraints
> implicitly on all allocators within the scope. There was a hope that
> those scopes would be defined on a higher level, where the reclaim
> recursion boundary starts/stops (e.g. when a lock required during
> memory reclaim is taken etc.). It seems that not all NOFS/NOIO users
> have adopted this approach and instead they have taken a workaround
> approach to wrap a single [k]vmalloc allocation by a scope API.
> 
> These workarounds do not serve the purpose of a better reclaim recursion
> documentation and reduction of explicit GFP_NO{FS,IO} usage so let's
> just provide them with the semantic they are asking for without a need
> for workarounds.
> 
> Add support for GFP_NOFS and GFP_NOIO to vmalloc directly. All internal
> allocations already comply with the given gfp_mask. The only current
> exception is vmap_pages_range which maps kernel page tables. Infer the
> proper scope API based on the given gfp mask.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/vmalloc.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d77830ff604c..7455c89598d3 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2889,6 +2889,8 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  	unsigned long array_size;
>  	unsigned int nr_small_pages = size >> PAGE_SHIFT;
>  	unsigned int page_order;
> +	unsigned int flags;
> +	int ret;
>  
>  	array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
>  	gfp_mask |= __GFP_NOWARN;
> @@ -2930,8 +2932,24 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  		goto fail;
>  	}
>  
> -	if (vmap_pages_range(addr, addr + size, prot, area->pages,
> -			page_shift) < 0) {
> +	/*
> +	 * page tables allocations ignore external gfp mask, enforce it
> +	 * by the scope API
> +	 */
> +	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> +		flags = memalloc_nofs_save();
> +	else if (!(gfp_mask & (__GFP_FS | __GFP_IO)))

I would *much* rather this were written

        else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)

so that the comparison with the previous test is more obvious.  Ditto
for similar code below.
It could even be

   switch (gfp_mask & (__GFP_FS | __GFP_IO)) {
   case __GFP_IO:  flags = memalloc_nofs_save(); break;
   case 0:         flags = memalloc_noio_save(); break;
   }

But I'm not completely convinced that is an improvement.

In terms of functionality this looks good.
Thanks,
NeilBrown


> +		flags = memalloc_noio_save();
> +
> +	ret = vmap_pages_range(addr, addr + size, prot, area->pages,
> +			page_shift);
> +
> +	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> +		memalloc_nofs_restore(flags);
> +	else if (!(gfp_mask & (__GFP_FS | __GFP_IO)))
> +		memalloc_noio_restore(flags);
> +
> +	if (ret < 0) {
>  		warn_alloc(gfp_mask, NULL,
>  			"vmalloc error: size %lu, failed to map pages",
>  			area->nr_pages * PAGE_SIZE);
> -- 
> 2.30.2
> 
>
Michal Hocko Oct. 19, 2021, 6:59 a.m. UTC | #2
On Tue 19-10-21 11:44:01, Neil Brown wrote:
> On Mon, 18 Oct 2021, Michal Hocko wrote:
[...]
> > @@ -2930,8 +2932,24 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> >  		goto fail;
> >  	}
> >  
> > -	if (vmap_pages_range(addr, addr + size, prot, area->pages,
> > -			page_shift) < 0) {
> > +	/*
> > +	 * page tables allocations ignore external gfp mask, enforce it
> > +	 * by the scope API
> > +	 */
> > +	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
> > +		flags = memalloc_nofs_save();
> > +	else if (!(gfp_mask & (__GFP_FS | __GFP_IO)))
> 
> I would *much* rather this were written
> 
>         else if ((gfp_mask & (__GFP_FS | __GFP_IO)) == 0)

Sure, this looks better indeed.

> so that the comparison with the previous test is more obvious.  Ditto
> for similar code below.
> It could even be
> 
>    switch (gfp_mask & (__GFP_FS | __GFP_IO)) {
>    case __GFP_IO:  flags = memalloc_nofs_save(); break;
>    case 0:         flags = memalloc_noio_save(); break;
>    }
> 
> But I'm not completely convinced that is an improvement.

I am not a great fan of this though.

> In terms of functionality this looks good.

Thanks for the review!

Patch

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d77830ff604c..7455c89598d3 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2889,6 +2889,8 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	unsigned long array_size;
 	unsigned int nr_small_pages = size >> PAGE_SHIFT;
 	unsigned int page_order;
+	unsigned int flags;
+	int ret;
 
 	array_size = (unsigned long)nr_small_pages * sizeof(struct page *);
 	gfp_mask |= __GFP_NOWARN;
@@ -2930,8 +2932,24 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		goto fail;
 	}
 
-	if (vmap_pages_range(addr, addr + size, prot, area->pages,
-			page_shift) < 0) {
+	/*
+	 * page tables allocations ignore external gfp mask, enforce it
+	 * by the scope API
+	 */
+	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+		flags = memalloc_nofs_save();
+	else if (!(gfp_mask & (__GFP_FS | __GFP_IO)))
+		flags = memalloc_noio_save();
+
+	ret = vmap_pages_range(addr, addr + size, prot, area->pages,
+			page_shift);
+
+	if ((gfp_mask & (__GFP_FS | __GFP_IO)) == __GFP_IO)
+		memalloc_nofs_restore(flags);
+	else if (!(gfp_mask & (__GFP_FS | __GFP_IO)))
+		memalloc_noio_restore(flags);
+
+	if (ret < 0) {
 		warn_alloc(gfp_mask, NULL,
 			"vmalloc error: size %lu, failed to map pages",
 			area->nr_pages * PAGE_SIZE);