diff mbox series

[3/4] mm/vmalloc: be more explicit about supported gfp flags.

Message ID 20211025150223.13621-4-mhocko@kernel.org (mailing list archive)
State New
Headers show
Series extend vmalloc support for constrained allocations | expand

Commit Message

Michal Hocko Oct. 25, 2021, 3:02 p.m. UTC
From: Michal Hocko <mhocko@suse.com>

The core of the vmalloc allocator __vmalloc_area_node doesn't say
anything about gfp mask argument. Not all gfp flags are supported
though. Be more explicit about constrains.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/vmalloc.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

NeilBrown Oct. 25, 2021, 11:26 p.m. UTC | #1
On Tue, 26 Oct 2021, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> The core of the vmalloc allocator __vmalloc_area_node doesn't say
> anything about gfp mask argument. Not all gfp flags are supported
> though. Be more explicit about constrains.
> 
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/vmalloc.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 602649919a9d..2199d821c981 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>   * @caller:		  caller's return address
>   *
>   * Allocate enough pages to cover @size from the page level
> - * allocator with @gfp_mask flags.  Map them into contiguous
> - * kernel virtual space, using a pagetable protection of @prot.
> + * allocator with @gfp_mask flags. Please note that the full set of gfp
> + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not

In what sense is GFP_KERNEL "preferred"??
The choice of GFP_NOFS, when necessary, isn't based on preference but
on need.

I understand that you would prefer no one ever used GFP_NOFs ever - just
use the scope API.  I even agree.  But this is not the place to make
that case. 

> + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka

I don't think "aka" is the right thing to use here.  It is short for
"also known as" and there is nothing that is being known as something
else.
It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
"i.e." is short for the Latin "id est" which means "that is" and
normally introduces an alternate description (whereas aka introduces an
alternate name).


> + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).

Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.

> + * __GFP_NOWARN can be used to suppress error messages about failures.

Surely "NOWARN" suppresses warning messages, not error messages ....

Thanks,
NeilBrown


> + * 
> + * Map them into contiguous kernel virtual space, using a pagetable
> + * protection of @prot.
>   *
>   * Return: the address of the area or %NULL on failure
>   */
> -- 
> 2.30.2
> 
>
Michal Hocko Oct. 26, 2021, 7:10 a.m. UTC | #2
On Tue 26-10-21 10:26:06, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > anything about gfp mask argument. Not all gfp flags are supported
> > though. Be more explicit about constrains.
> > 
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  mm/vmalloc.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 602649919a9d..2199d821c981 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> >   * @caller:		  caller's return address
> >   *
> >   * Allocate enough pages to cover @size from the page level
> > - * allocator with @gfp_mask flags.  Map them into contiguous
> > - * kernel virtual space, using a pagetable protection of @prot.
> > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> 
> In what sense is GFP_KERNEL "preferred"??
> The choice of GFP_NOFS, when necessary, isn't based on preference but
> on need.
> 
> I understand that you would prefer no one ever used GFP_NOFs ever - just
> use the scope API.  I even agree.  But this is not the place to make
> that case. 

Any suggestion for a better wording?

> > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> 
> I don't think "aka" is the right thing to use here.  It is short for
> "also known as" and there is nothing that is being known as something
> else.
> It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> "i.e." is short for the Latin "id est" which means "that is" and
> normally introduces an alternate description (whereas aka introduces an
> alternate name).

OK
 
> > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> 
> Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.

Because they cannot be passed to the page table allocator. In both cases
the allocation would fail when system is short on memory. GFP_KERNEL
used for ptes implicitly doesn't behave that way.

> 
> > + * __GFP_NOWARN can be used to suppress error messages about failures.
> 
> Surely "NOWARN" suppresses warning messages, not error messages ....

I am not sure I follow. NOWARN means "do not warn" independently on the
log level chosen for the message. Is an allocation failure an error
message? Is the "vmalloc error: size %lu, failed to map pages" an error
message?

Anyway I will go with "__GFP_NOWARN can be used to suppress failure messages"

Is that better?
NeilBrown Oct. 26, 2021, 10:43 a.m. UTC | #3
On Tue, 26 Oct 2021, Michal Hocko wrote:
> On Tue 26-10-21 10:26:06, Neil Brown wrote:
> > On Tue, 26 Oct 2021, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > > anything about gfp mask argument. Not all gfp flags are supported
> > > though. Be more explicit about constrains.
> > > 
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >  mm/vmalloc.c | 12 ++++++++++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 602649919a9d..2199d821c981 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > >   * @caller:		  caller's return address
> > >   *
> > >   * Allocate enough pages to cover @size from the page level
> > > - * allocator with @gfp_mask flags.  Map them into contiguous
> > > - * kernel virtual space, using a pagetable protection of @prot.
> > > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> > 
> > In what sense is GFP_KERNEL "preferred"??
> > The choice of GFP_NOFS, when necessary, isn't based on preference but
> > on need.
> > 
> > I understand that you would prefer no one ever used GFP_NOFs ever - just
> > use the scope API.  I even agree.  But this is not the place to make
> > that case. 
> 
> Any suggestion for a better wording?

 "GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all supported".

> 
> > > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> > 
> > I don't think "aka" is the right thing to use here.  It is short for
> > "also known as" and there is nothing that is being known as something
> > else.
> > It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> > "i.e." is short for the Latin "id est" which means "that is" and
> > normally introduces an alternate description (whereas aka introduces an
> > alternate name).
> 
> OK
>  
> > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> > 
> > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.
> 
> Because they cannot be passed to the page table allocator. In both cases
> the allocation would fail when system is short on memory. GFP_KERNEL
> used for ptes implicitly doesn't behave that way.

Could you please point me to the particular allocation which uses
GFP_KERNEL rather than the flags passed to __vmalloc_node()?  I cannot
find it.

> 
> > 
> > > + * __GFP_NOWARN can be used to suppress error messages about failures.
> > 
> > Surely "NOWARN" suppresses warning messages, not error messages ....
> 
> I am not sure I follow. NOWARN means "do not warn" independently on the
> log level chosen for the message. Is an allocation failure an error
> message? Is the "vmalloc error: size %lu, failed to map pages" an error
> message?

If guess working with a C compiler has trained me to think that
"warnings" are different from "errors".

> 
> Anyway I will go with "__GFP_NOWARN can be used to suppress failure messages"
> 
> Is that better?

Yes, that's an excellent solution!  Thanks.

NeilBrown


> -- 
> Michal Hocko
> SUSE Labs
> 
>
Michal Hocko Oct. 26, 2021, 12:20 p.m. UTC | #4
On Tue 26-10-21 21:43:17, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
> > On Tue 26-10-21 10:26:06, Neil Brown wrote:
> > > On Tue, 26 Oct 2021, Michal Hocko wrote:
> > > > From: Michal Hocko <mhocko@suse.com>
> > > > 
> > > > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > > > anything about gfp mask argument. Not all gfp flags are supported
> > > > though. Be more explicit about constrains.
> > > > 
> > > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > > ---
> > > >  mm/vmalloc.c | 12 ++++++++++--
> > > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index 602649919a9d..2199d821c981 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > > >   * @caller:		  caller's return address
> > > >   *
> > > >   * Allocate enough pages to cover @size from the page level
> > > > - * allocator with @gfp_mask flags.  Map them into contiguous
> > > > - * kernel virtual space, using a pagetable protection of @prot.
> > > > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> > > 
> > > In what sense is GFP_KERNEL "preferred"??
> > > The choice of GFP_NOFS, when necessary, isn't based on preference but
> > > on need.
> > > 
> > > I understand that you would prefer no one ever used GFP_NOFs ever - just
> > > use the scope API.  I even agree.  But this is not the place to make
> > > that case. 
> > 
> > Any suggestion for a better wording?
> 
>  "GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all supported".

OK. Check the incremental update at the end of the email

> > > > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> > > 
> > > I don't think "aka" is the right thing to use here.  It is short for
> > > "also known as" and there is nothing that is being known as something
> > > else.
> > > It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> > > "i.e." is short for the Latin "id est" which means "that is" and
> > > normally introduces an alternate description (whereas aka introduces an
> > > alternate name).
> > 
> > OK
> >  
> > > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> > > 
> > > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.
> > 
> > Because they cannot be passed to the page table allocator. In both cases
> > the allocation would fail when system is short on memory. GFP_KERNEL
> > used for ptes implicitly doesn't behave that way.
> 
> Could you please point me to the particular allocation which uses
> GFP_KERNEL rather than the flags passed to __vmalloc_node()?  I cannot
> find it.
> 

It is dug 
__vmalloc_area_node
  vmap_pages_range
    vmap_pages_range_noflush
      vmap_range_noflush || vmap_small_pages_range_noflush
        vmap_p4d_range
	  p4d_alloc_track
	    __p4d_alloc
	      p4d_alloc_one
	        get_zeroed_page(GFP_KERNEL_ACCOUNT)

the same applies for all other levels of page tables.

This is what I have currently
commit ae7fc6c2ef6949a76d697fc61bb350197dfca330
Author: Michal Hocko <mhocko@suse.com>
Date:   Tue Oct 26 14:16:32 2021 +0200

    fold me "mm/vmalloc: be more explicit about supported gfp flags."

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 2ddaa9410aee..82a07b04317e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2981,12 +2981,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
  *
  * Allocate enough pages to cover @size from the page level
  * allocator with @gfp_mask flags. Please note that the full set of gfp
- * flags are not supported. GFP_KERNEL would be a preferred allocation mode
- * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
- * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
- * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
- * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
- * __GFP_NOWARN can be used to suppress error messages about failures.
+ * flags are not supported. GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all
+ * supported.
+ * Zone modifiers are not supported. From the reclaim modifiers
+ * __GFP_DIRECT_RECLAIM is required (aka GFP_NOWAIT is not supported)
+ * and only __GFP_NOFAIL is supported (i.e. __GFP_NORETRY and 
+ * __GFP_RETRY_MAYFAIL are not supported).
+ *
+ * __GFP_NOWARN can be used to suppress failures messages.
  * 
  * Map them into contiguous kernel virtual space, using a pagetable
  * protection of @prot.
diff mbox series

Patch

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 602649919a9d..2199d821c981 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2980,8 +2980,16 @@  static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
  * @caller:		  caller's return address
  *
  * Allocate enough pages to cover @size from the page level
- * allocator with @gfp_mask flags.  Map them into contiguous
- * kernel virtual space, using a pagetable protection of @prot.
+ * allocator with @gfp_mask flags. Please note that the full set of gfp
+ * flags are not supported. GFP_KERNEL would be a preferred allocation mode
+ * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
+ * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
+ * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
+ * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
+ * __GFP_NOWARN can be used to suppress error messages about failures.
+ * 
+ * Map them into contiguous kernel virtual space, using a pagetable
+ * protection of @prot.
  *
  * Return: the address of the area or %NULL on failure
  */